Equestrian Data Science: Ranking Eventers in 2016

Coming up with a list of the top eventers based on their performance in 2016 is hard. The sport of three-day eventing is complex and multi-faceted, and the decisions we make about which factors to consider make a significant difference to the final result of any evaluation process. Because of this complexity, and because there is bound to be strong disagreement about who ends up on a list of this kind, it is rare to see anything like this published. Still, I believe that this exercise has value, particularly for fans like myself who find rankings a useful way of understanding the sport.

Note that the ranking that I have produced is the result of a lot of thinking and expert consultation. It is also a work in progress. I have tried to document some of the theory and methods underlying the list(s), but if you want to bypass this discussion, feel free to skip over these sections and see the lists themselves.

Guiding Principles

All ranking schemes involve subjective judgement. They involve establishing criteria on the basis of values. Since values differ from individual to individual, disagreement is bound to happen and conflicting lists are bound to appear. But there are two guiding principles that I believe should apply to all rankings:

(1) Look to the data
Human beings are great at making decisions and at coming up with justifications after the fact. We all have biases, and we are all terrible at overcoming them. By limiting ourselves to measurable qualities and available data, we can lessen the impact of irrelevant and inconsistently applied preferences.

(2) Be transparent
Being data-driven in our decision-making processes doesn’t mean being objective. Decisions have to be made about the kinds of data to include, the ways in which that data is transformed, and the analytical tools that are applied. This is not a bad thing. Not only are these decisions necessary, they are also important because it is here that data becomes meaningful. Here, I argue that making the ‘right’ decisions is less important than making your decisions explicit.

Method

Inclusion Criteria

Who should be considered for inclusion in a list of top eventers world-wide? Here is a list of criteria that I believe any eventer needs to satisfy in order to be considered among the top in the sport. This is where values and judgement come in, and there is bound to be some disagreement. So it goes.

CCI only
There are several significant differences between CCI and CIC events. The demands that each of these event types places on horse and rider are so different that, for all intents and purposes, they should be considered different sports entirely. Compared to CIC events, CCIs are characterized by longer cross country courses, have stricter vetting requirements, and include show jumping as the final of the three phases. CIC competitions are developmental. The most elite riders in the world must be able to compete, complete, and excel in CCI events. For this reason, I have chosen only to include CCI riders in the list.

3* and 4* only
This list is meant to include the best of the best. What this means is only including riders who have successfully competed at either 3 star or 4 star levels. Why not just include riders who have competed at the 4 star level and exclude 3 star results? The fact that there are only six 4 star events means that we don’t have a whole lot of data from year to year. The decision to include 3 star data also makes sense in light of recent decisions to downgrade Olympic and World Equestrian Games events to the 3 star level.

At least two competitions
There is a difference between CCI 3*/4* pairs and pairs that have merely competed at that level. In order to be considered in the list, a horse and rider combination must have completed a minimum of two CCI events at either the 3 star or 4 star level.

100% event completion rate
As recent Olympic history has underscored, the most important quality of an elite rider is the ability to consistently complete events at the highest level. Consistency is key. So I have only included riders in the list that successfully completed every CCI event they entered in 2016.
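The four criteria above can be read as a filter over a table of results. The following is only a rough sketch of that reading, not the code used to build the lists: the record layout, the sample rows, and the function name eligible_pairs are all hypothetical, and the completion rule is assumed to apply per horse and rider pair.

```python
from collections import defaultdict

# Hypothetical result records: (rider, horse, event_type, stars, completed).
# These rows are made up for illustration; they are not FEI data.
results = [
    ("Rider A", "Horse 1", "CCI", 4, True),
    ("Rider A", "Horse 1", "CCI", 3, True),
    ("Rider B", "Horse 2", "CCI", 3, True),   # only one CCI start
    ("Rider B", "Horse 2", "CIC", 3, True),   # CIC results are ignored
    ("Rider C", "Horse 3", "CCI", 3, True),
    ("Rider C", "Horse 3", "CCI", 4, False),  # one event not completed
]

def eligible_pairs(results, min_events=2):
    """Return the (rider, horse) pairs that satisfy all four criteria."""
    starts = defaultdict(list)
    for rider, horse, event_type, stars, completed in results:
        # CCI only, and 3* and 4* only.
        if event_type == "CCI" and stars in (3, 4):
            starts[(rider, horse)].append(completed)
    # At least `min_events` CCI events, every one of them completed.
    return {
        pair
        for pair, flags in starts.items()
        if len(flags) >= min_events and all(flags)
    }

eligible = eligible_pairs(results)
print(eligible)
```

In this made-up sample only Rider A and Horse 1 survive: Rider B has a single CCI result, and Rider C failed to complete one event.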

Statistical Methods

Once we have established a pool of eligible pairs, what is the best way to rank them? Do we simply take an average of their final scores? How do we account for the fact that some pairs excel in dressage while others shine on cross country or in show jumping? How do we account for the fact that judging differs from event to event, and for differences in terrain, weather, and course design? From a statistical perspective, we know that some events are ‘easier’ than others. How do we fairly compare the relative performance of horses and riders competing under different sets of conditions, even at the same level?

One way of overcoming differences is through a statistical process called standardization. A z-score is the difference between the number of points that a pair earned and the average number of points earned by all competitors at the same event, expressed in standard deviation units. A score of 0 means that a pair is average. Because eventing scores are penalty points, lower is better: a negative z-score means the pair is above average, and a positive score means that it is below. By converting points into z-scores, we are able to account for various differences from event to event. By comparing average final z-scores, we can more easily and reliably compare horse and rider combinations on an even playing field.
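To make the standardization step concrete, here is a small sketch of how z-scores could be computed for a single event. The scores are invented, the function name z_scores is hypothetical, and the use of the population standard deviation (rather than the sample version) is an assumption on my part, since either could reasonably be used.

```python
from statistics import mean, pstdev

def z_scores(final_scores):
    """Convert one event's final scores into z-scores.

    `final_scores` maps each pair to its final penalty score at that event.
    Assumes at least two distinct scores, so the standard deviation is not zero.
    """
    values = list(final_scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return {pair: (score - mu) / sigma for pair, score in final_scores.items()}

# Invented final scores for a three-pair event (lower is better).
event = {"Pair A": 40.0, "Pair B": 50.0, "Pair C": 60.0}
z = z_scores(event)

for pair, value in z.items():
    print(pair, round(value, 2))
```

Pair B sits exactly on the event average and gets a z-score of 0, Pair A gets a negative z-score because it earned fewer penalty points than average, and Pair C gets a positive one.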

Once we have standardized final scores, we can sort pairs according to their average z-score and take the top 10. Voilà! We have a list of top riders. Here are the results, along with a little more useful information about their performance at 3* and 4* levels.
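The sorting step might look something like the sketch below. Again, this is an illustration under assumptions rather than the actual analysis: the rows are invented, rank_pairs is a hypothetical name, and pairs are sorted in ascending order because the lowest average z-score is the best.

```python
from collections import defaultdict
from statistics import mean

# Invented (pair, z-score) rows, one row per completed event.
rows = [
    ("Pair A", -1.5), ("Pair A", -0.5),
    ("Pair B", -1.2), ("Pair B", -1.0),
    ("Pair C", 0.3), ("Pair C", -0.1),
]

def rank_pairs(rows, top_n=10):
    """Average each pair's z-scores and return the best `top_n` pairs."""
    by_pair = defaultdict(list)
    for pair, z in rows:
        by_pair[pair].append(z)
    averages = {pair: mean(zs) for pair, zs in by_pair.items()}
    # Lowest average z-score first, since fewer penalty points is better.
    return sorted(averages.items(), key=lambda item: item[1])[:top_n]

ranking = rank_pairs(rows, top_n=2)
for place, (pair, avg_z) in enumerate(ranking, start=1):
    print(place, pair, round(avg_z, 2))
```

Here Pair B edges out Pair A even though Pair A had the single best result, which reflects the emphasis on consistency across events.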

The Results (worldwide)

  1. Michael Jung & Fischerrocana FST (GER)
  2. Maxime Livio & Qalao des Mers (FRA)
  3. Hazel Shannon & Clifford (AUS)
  4. Oliver Townend & ODT Ghareeb (GBR)
  5. Jonelle Price & Classic Moet (NZL)
  6. Andrew Nicholson & Teseo (NZL)
  7. Hannah Sue Burnett & Under Suspection (USA)
  8. Nicola Wilson & Annie Clover (GBR)
  9. Andreas Dibowski & FRH Butts Avedon (GER)
  10. Oliver Townend & Lanfranco (GBR)

The Results (USA)

If we apply the same criteria above, but only consider American CCI 3*/4* riders in 2016, we get the following list:

  1. Hannah Sue Burnett & Under Suspection
  2. Hannah Sue Burnett & Harbour Pilot
  3. Boyd Martin & Welcome Shadow
  4. Buck Davidson & Copper Beach
  5. Elisa Wallace & Simply Priceless
  6. Lauren Kieffer & Landmark’s Monte Carlo
  7. Lillian Heard & LCC Barnaby
  8. Kurt Martin & Delux Z
  9. Phillip Dutton & Fernhill Fugitive
  10. Sharon White & Cooley on Show

Some may find it odd that Phillip Dutton & Mighty Nice didn’t make either top 10 list, in spite of winning a bronze medal at the 2016 Olympic Games in Rio, Brazil. The reason for this is that the FEI dataset that I have used intentionally excludes Olympic results because they are kind of strange… a horse of a different color, so to speak. Not including the Olympics, this pair only competed at one CCI event in 2016: the Rolex Kentucky Three Day Event, where they finished 4th with a final score of 57.8, which converts to a z-score of -1.11. Based on this score, the pair would rank first nationally and fifth in the world. But this is only one CCI event, and so I could not include them in the lists based on the criteria I established above.


Originally posted to horseHubby.com