Wednesday, May 25, 2011

For science!


This is my mad scientist expression, complete with my evil sidekick

Today is a red-letter day for geeky eventing analysts everywhere, even though I'm pretty sure I'm actually the only one.

I have finished the data entry for my riders-with-multiple-horses project. All 40,000+ rows of data are done, putting every ride in the US in 2010 into one spreadsheet. I have never been so happy to see the end of a list of events. So far I've learned that we have some very creative namers out there. I want to know the story behind the horse named Silent Donut. There has to be a good story behind a name like that. I also liked Dad's Empty Pockets and My Tuition.

Now that I have my database, it's time for the actual data mining to start. The question is whether or not riding multiple horses makes a rider's fall or horse's fall more likely. Of course, after spending months building this database, I'll probably check out some other trends as well. I don't have a lot of data fields, just what's posted on the USEA's official results:

Horse name
Rider name
Dressage score
XC score
Jumping score
Final placing
Number of horses ridden by the rider at the trial
Horse trial name
Level
Area
Month
Year

I want to look at this question of multiple horses a couple of different ways. Some are really simple, some are a bit more complex. I don't think just one view is going to answer the question:

1. Do riders with more than one horse have more horse falls/rider falls and DNCs (did not complete) than riders with just one horse? This is the most literal way of looking at the question, but it will probably be skewed by riders in the lower levels who are just learning the game.

2. Is there a trend in horse falls/rider falls and DNCs as the number of horses ridden increases? This will take out all of the rides where there was just one horse and look at the trend as the number of horses increases. This should give a clear picture of whether or not increasing the number of horses actually has an effect.

3. What is the percentage of horse falls/rider falls for our riders with the highest average number of horses per trial? Is there a trend within their own riding as the number of horses increases? This will probably be the top twenty examples, compared back to the overall average. These are the extreme cases that can show what is actually happening when up to 12 horses are being ridden at a trial by one person.
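
As a rough sketch of how the first two views could be computed, here is a small Python example. It is only an illustration under assumptions: the field names `horses_at_trial` and `outcome`, the outcome codes `RF`, `HF`, and `DNC`, and the sample rows are all made up for the example. The real spreadsheet only has the fields listed above, so the outcome would have to be derived from the posted scores.

```python
from collections import defaultdict

# Hypothetical outcome codes (rider fall, horse fall, did not complete).
# The real spreadsheet may record these differently.
FALL_OR_DNC = {"RF", "HF", "DNC"}


def incident_rate_by_horse_count(rides):
    """View 2: share of rides ending in a fall or DNC, per number of horses."""
    totals = defaultdict(int)
    incidents = defaultdict(int)
    for ride in rides:
        n = ride["horses_at_trial"]
        totals[n] += 1
        if ride["outcome"] in FALL_OR_DNC:
            incidents[n] += 1
    return {n: incidents[n] / totals[n] for n in sorted(totals)}


def single_vs_multiple(rides):
    """View 1: one-horse riders compared with riders on two or more horses."""
    groups = {"single": [0, 0], "multiple": [0, 0]}  # [incidents, total rides]
    for ride in rides:
        key = "single" if ride["horses_at_trial"] == 1 else "multiple"
        groups[key][1] += 1
        if ride["outcome"] in FALL_OR_DNC:
            groups[key][0] += 1
    return {k: (inc / tot if tot else None) for k, (inc, tot) in groups.items()}


# Made-up rows purely for illustration, not real results.
sample = [
    {"rider": "A", "horses_at_trial": 1, "outcome": "OK"},
    {"rider": "B", "horses_at_trial": 1, "outcome": "RF"},
    {"rider": "C", "horses_at_trial": 2, "outcome": "OK"},
    {"rider": "C", "horses_at_trial": 2, "outcome": "DNC"},
    {"rider": "D", "horses_at_trial": 3, "outcome": "OK"},
    {"rider": "D", "horses_at_trial": 3, "outcome": "OK"},
    {"rider": "D", "horses_at_trial": 3, "outcome": "HF"},
]

rates = incident_rate_by_horse_count(sample)
comparison = single_vs_multiple(sample)
```

Dropping the key `1` from `rates` gives the trend described in view 2, and the same per-ride counting could be restricted to the riders with the highest average number of horses for view 3.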

Does anyone think I'm missing a question or a view? Also, should I use rider names or not? This is all public information to begin with, anyone can go get this, but it feels weird to use actual names after so many years of never, ever leaving identifying information in an analysis.

I'll also do some descriptive statistics, taking a look at what the database shows, but most of it is stuff the USEA publishes already, anyway. I'm already kicking around the idea of adding past years so that I can do year-over-year trending, but we'll see if I can convince myself to do it. For science!

1 comment:

  1. I am excited to learn what the results were! That is quite a project! I may have to name my next horse Donut.

    As for names...maybe for your personal results to make referencing easier, but names won't change the data itself so including them in any published results wouldn't be necessary IMO.
