Footy stats will also give me an opportunity to play around with R, my favorite statistical program.
This first post will look at Contested Possessions.
According to The Age newspaper, a contested possession is:
Champion Data defines a contested possession as a statistic ''credited to a player who wins the ball from a disputed situation''. A player's contested possession tally is the sum of his hard-ball gets, loose-ball gets, contested marks, free kicks for, gathers from hitouts and ''contested knock-ons'', a new variation introduced this year. (http://www.theage.com.au/afl/afl-news/its-easier-than-ever-to-get-a-contested-possession--stats-a-fact-20110608-1ft72.html )
A contested possession is one of those actions that wins games, so I thought it would be interesting to look at contested possessions created by all teams for the 2011 home and away series.
In this chart, I've displayed the information using a boxplot. The good thing about a boxplot is that it packs in a lot of information. It describes the distribution of a continuous variable by plotting:
- minimum (excluding any outliers, which are drawn as separate points)
- 25th percentile
- median (50th percentile)
- 75th percentile
- maximum (again excluding any outliers).
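To make the list above concrete, here is a minimal sketch using the six Adelaide values listed further down in this post. It relies on base R's boxplot.stats(), which returns the five numbers that boxplot() draws. Strictly speaking R uses "hinges" rather than exact percentiles for the box edges, so I've included quantile() for comparison. The variable names are just my choice for illustration.

```r
# Six Adelaide contested possession counts, taken from the data table in this post
x <- c(145, 140, 119, 121, 136, 140)

# The five numbers boxplot() draws:
# lower whisker, lower hinge, median, upper hinge, upper whisker
box_numbers <- boxplot.stats(x)$stats
print(box_numbers)

# The percentile version, for comparison with the hinges
percentiles <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
print(percentiles)
```

On a sample this small the hinges and the percentiles can differ a little. With a full season of games per team the difference should be too small to notice on the chart.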
As the label on the horizontal axis indicates, the boxplots for each team are in 2011 ladder order.
Contested possessions are not the only action on the field that wins games, and that no doubt explains why the median number of contested possessions doesn't decrease from left to right, as ladder position gets worse.
The interesting observation though is St Kilda. It created far fewer contested possessions than all other clubs, yet didn't end up in the "back half". We'll look at that another time.
Other interesting observations include:
- Western Bulldogs are not at all consistent - look at the range in the number of contested possessions they achieved.
- Some clubs had outliers - performances either very low or high compared to the normal range.
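For anyone wondering how a boxplot decides what counts as an outlier, here is a rough sketch. By default, R flags any point more than 1.5 times the height of the box beyond the box edges. The numbers below are made up purely for illustration; they are not any club's real figures.

```r
# Hypothetical per-game counts for one club (made-up numbers, not real data)
cp <- c(131, 135, 138, 140, 142, 144, 147, 150, 98)

# boxplot.stats() applies the default 1.5 x box-height rule (coef = 1.5)
bs <- boxplot.stats(cp)

outliers <- bs$out      # points drawn individually on the chart
whiskers <- bs$stats    # whiskers stop at the most extreme non-outlier values
print(outliers)
print(whiskers)
```

Here the one very low game is flagged as an outlier, and the lower whisker stops at the next lowest value instead.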
The code to produce the boxplot chart is:
afl <- `2011.By.Round.CPcsv`
byafl <- with(afl, reorder(V2, V1))
par(mar = c(6.4, 4.1, 2.7, 2.1))
boxplot(V3 ~ byafl, data = afl, main = "Contested Possessions 2011", xlab = "Teams - in Ladder Order", ylab = "Contested Possessions Per Game", las = 2)
Stepping through the code:
- Rename the data. I saved the original file on my computer with a long descriptive name, but when I'm working in R, I don't want to be typing long names repeatedly. afl is the data frame with the contested possessions data. Its first few rows are:
V1 V2 V3
14 Ad 145
14 Ad 140
14 Ad 119
14 Ad 121
14 Ad 136
14 Ad 140
Variable V1 is the ladder position of the team
Variable V2 is a two-letter indicator of the team; in this case the team is Adelaide
Variable V3 is the number of contested possessions.
- Create a new object with data sorted in ladder order. The default order otherwise would be alphabetical.
- The with() function evaluates an R expression in an environment constructed from data. It's used as an alternative to attach(). The format is with(data, expr).
So in this case I am evaluating the reorder() function with the afl dataframe.
The reorder() function reorders levels of a factor. The "default" method treats its first argument as a categorical variable, and reorders its levels based on the values of a second variable, usually numeric.
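So in this case, we are reordering V2 - the team indicator - into the order given by V1 - the ladder position. By default reorder() sorts by the mean of the second variable within each level, and since V1 is the same for every row of a team, that mean is simply the team's ladder position.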
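- The next line sets the plot margins (bottom, left, top, right). The bottom margin is larger than the default, which shifts the chart up in the plotting window so there is room to display the team codes vertically.
- The 4th and final line produces the chart. The las = 2 argument turns the axis labels perpendicular to the axes, which is what displays the team codes vertically.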
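To see what the with() and reorder() step actually does, here is a small sketch you can run without my data file. The "Ad" rows come from the table above; "Xx" and "Yy" are made-up teams with made-up numbers, there only so there is something to reorder.

```r
# Toy data: "Ad" rows are from the table above.
# "Xx" and "Yy" are hypothetical teams with made-up values.
afl <- data.frame(
  V1 = c(14, 14, 14, 1, 1, 1, 9, 9, 9),
  V2 = c("Ad", "Ad", "Ad", "Xx", "Xx", "Xx", "Yy", "Yy", "Yy"),
  V3 = c(145, 140, 119, 150, 160, 155, 130, 128, 141)
)

# Without reorder(), the team levels come out alphabetically
alphabetical <- levels(factor(afl$V2))

# reorder() sorts the levels of V2 by the mean of V1 within each team,
# i.e. by ladder position
byafl <- with(afl, reorder(V2, V1))
ladder_order <- levels(byafl)

print(alphabetical)
print(ladder_order)
```

Passing byafl to boxplot(V3 ~ byafl, data = afl) then draws the boxes in ladder order rather than alphabetical order.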
As interesting as this chart is, there may be a better chart to use. Next time I am going to plot the difference ratios for each team. The chart above plots the number of contested possessions without regard to the number of contested possessions achieved by the opposing side. Comparing the contested possessions achieved in relation to the opposing side may tell a different story.