Saturday, March 10, 2012

Contested Possessions

A few weeks ago I watched the film "Moneyball", and it inspired me to see what I could do with AFL statistics (AFL is the Australian Football League, the top competition in Australian rules football). 2012 is going to be the year I become involved with footy: my partner and I have joined a club for the first time, and will be going to some games this year. For the record, we've joined the Western Bulldogs, as we live in the "west".

Footy stats will also give me an opportunity to play around with R, my favorite statistical program.

This first post will look at Contested Possessions.

The Age newspaper explains the term as follows:

Champion Data defines a contested possession as a statistic "credited to a player who wins the ball from a disputed situation". A player's contested possession tally is the sum of his hard-ball gets, loose-ball gets, contested marks, free kicks for, gathers from hitouts and "contested knock-ons", a new variation introduced this year.

A contested possession is one of those actions that wins games, so I thought it would be interesting to look at the contested possessions created by all teams in the 2011 home and away season.

In this chart, I've displayed the information using a boxplot. The good thing about a boxplot is that it displays a lot of information. It describes the distribution of a continuous variable by plotting:

  • minimum (the lowest value that is not an outlier)
  • 25th percentile
  • median (50th percentile)
  • 75th percentile
  • maximum (the highest value that is not an outlier).
In other words, a box plot gives you an indication of how consistent a team is, in this case, in creating contested possessions.
That's a lot more information than you'd get if you just used, say, the average.
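
To make that concrete, here is a small sketch using the six Adelaide rows that appear in the data sample later in this post. It only illustrates how the five numbers are computed; six games are not a season. Note that R's boxplot draws the box at the "hinges" returned by fivenum(), which can differ slightly from the percentiles that quantile() reports.

```r
# Contested possessions from the six sample rows shown in this post (team Ad).
cp <- c(145, 140, 119, 121, 136, 140)

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(cp)
print(five)

# boxplot.stats() returns the values that boxplot() actually draws:
# lower whisker, lower hinge, median, upper hinge, upper whisker.
drawn <- boxplot.stats(cp)$stats
print(drawn)

# The average on its own says nothing about the spread.
print(mean(cp))
```

When a sample has no outliers, as here, the two results agree.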

As the label on the horizontal axis indicates, the boxplots for each team are in 2011 ladder order.

Contested possessions are not the only action on the field that wins games, and that no doubt explains why the median number of contested possessions doesn't decrease from left to right, as ladder position gets worse.

The interesting observation, though, is St Kilda. It created far fewer contested possessions than all other clubs, yet didn't end up in the "back half" of the ladder. We'll look at that another time.

Other interesting observations include:

  • Western Bulldogs are not at all consistent - look at the range in the number of contested possessions they achieved.
  • Some clubs had outliers - performances either very low or very high compared to the normal range.
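
For anyone wondering how R decides what counts as an outlier: by default, boxplot() flags any point that lies more than 1.5 times the height of the box beyond the box edges. The sketch below shows the rule with boxplot.stats(). The numbers are invented for illustration and are not real AFL figures.

```r
# Invented per-game totals for one hypothetical team (NOT real AFL data).
cp <- c(120, 125, 130, 132, 135, 138, 140, 180)

stats <- boxplot.stats(cp)                      # default coef = 1.5
box_height <- stats$stats[4] - stats$stats[2]   # upper hinge minus lower hinge

# Points outside these fences are drawn as separate outlier dots.
lower_fence <- stats$stats[2] - 1.5 * box_height
upper_fence <- stats$stats[4] + 1.5 * box_height

print(c(lower_fence, upper_fence))
print(stats$out)    # the values flagged as outliers
print(range(cp))    # the full range, outliers included
```

The whiskers stop at the most extreme values inside the fences, which is why a team's whiskers can be shorter than its full range.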

The code to produce this chart is:

afl <- `2011.By.Round.CPcsv`           # shorter name for the imported data frame
byafl <- with(afl, reorder(V2, V1))    # team codes, ordered by ladder position
par(mar = c(6.4, 4.1, 2.7, 2.1))       # margins: bottom, left, top, right
boxplot(V3 ~ byafl, data = afl, main = "Contested Possessions 2011", xlab = "Teams - in Ladder Order", ylab = "Contested Possessions Per Game", las = 2)

Stepping through the code:

  • Give the data a shorter name. I saved the original file on my computer with a long descriptive name, but when I'm working in R, I don't want to type long names repeatedly. afl is the data frame with the contested possessions data.
         The structure of the data frame is shown below:

               V1 V2  V3
               14 Ad 145
               14 Ad 140
               14 Ad 119
               14 Ad 121
               14 Ad 136
               14 Ad 140

          Variable V1 is the ladder position of the team.
          Variable V2 is a two-letter indicator of the team; in this case the team is Adelaide.
          Variable V3 is the number of contested possessions in a game.

  • Create a new object with data sorted in ladder order. The default order otherwise would be alphabetical.

    • The with() function evaluates an R expression in an environment constructed from data. It's used as an alternative to attach(). The format is
                    with(data, expression)

                    So in this case I am evaluating the reorder() function with the afl data frame.

                    The reorder() function reorders the levels of a factor.

                    The "default" method treats its first argument as a categorical variable, and reorders
                    its levels based on the values of a second variable, usually numeric.

                    So in this case, we are reordering variable V2 - the team indicator - into the order given by
                    variable V1 - ladder position. Every row for a team carries the same ladder position, so the
                    teams end up in ladder order.
  • The next line enlarges the bottom margin of the plotting window and trims the top margin, which shifts the chart up so there is room to display the team codes vertically.
  • The 4th and final line produces the chart.
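
The reordering step is the least obvious part, so here is a minimal sketch of it on a toy data frame. The team codes ("Aa", "Bb", "Cc") and all the values are invented; only the column layout (V1, V2, V3) follows the data described above.

```r
# Toy data only: team codes and values are invented for illustration.
toy <- data.frame(
  V1 = c(2, 2, 1, 1, 3, 3),                     # ladder position
  V2 = c("Aa", "Aa", "Bb", "Bb", "Cc", "Cc"),   # team code
  V3 = c(140, 135, 150, 148, 120, 131)          # contested possessions
)

# Without reorder(), factor levels come out in alphabetical order.
alphabetical <- levels(factor(toy$V2))
print(alphabetical)

# reorder() sorts the levels of V2 by the mean of V1 within each team.
by_ladder <- with(toy, reorder(V2, V1))
print(levels(by_ladder))
```

Passing by_ladder on the right-hand side of the boxplot formula, as in V3 ~ by_ladder, draws the boxes in that level order.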

As interesting as this chart is, there may be a better chart to use. Next time I am going to plot the difference ratios for each team. The chart above plots the number of contested possessions without regard to the number achieved by the opposing side. Comparing a team's contested possessions with those of the opposing side may tell a different story.
