The folks at Nordic Commentary Project were kind enough to point me towards an interesting book that raises the issue of ‘home snow advantage’. I actually haven’t had a chance to read the book yet, but the question of a home snow advantage is certainly an interesting one and it’s one that I haven’t talked about.
The idea that skiers tend to perform better on their ‘home snow’ is an intuitive one, and there is likely something to it, regardless of whatever we may or may not be able to show statistically. But showing it with data is going to be tricky, and involve honest-to-goodness modeling, rather than just a cleverly made graph.
For starters, I’m only going to address distance skiing in this post. This turned out to be complicated enough that I wanted to report back before I spent another few days slogging through a version for sprinting. Next, we need to think carefully about how to model this:
- I only have data at the scale of nations for both location of race and the nationality of the skier. So ‘home field advantage’ is going to be defined as racing in the nation you are from. Skiers changing nationalities happens rarely enough that I’m going to simply wave my hands and declare it to be not a problem. Feel free to disagree.
- I’m only going to consider World Cup races, not OWG or WSC. It’s possible I’m being overly cautious, but I worry that those major events will often be the primary focus of athletes, and this will influence their performance in a way that’s unrelated to where these races happen to be held. I’ve also restricted this to seasons from 2002-2003 onward.
- Not everyone has the opportunity to race both ‘home’ and ‘away’. Obviously, only those athletes from nations where WC races are held have this option at all, and even then, not all of them actually do. So I’m only considering those athletes that have raced both home and away at least three times each.
- What’s our response variable? Namely, how do we measure differences in performance for a skier at home versus away? What I settled on was to take the standardized percent back from the median skier and then adjust it so that it measures performance relative to that specific athlete, at that point in their career. So I am controlling for differences in performance across athletes and also differences within each athlete over time.
- What are our explanatory variables? I’ve included a Home vs. Away variable, of course, and also gender, nationality and a variable that roughly accounts for some of the wider discrepancies in field strength.
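To make the bookkeeping in the list above a bit more concrete, here is a minimal Python sketch. It is not the code behind this analysis. The record layout (dicts with keys such as 'athlete', 'nation', 'host_nation', 'season', 'score'), the function names, and especially the centering step (subtracting each athlete's own season average) are assumptions made purely for illustration. The standardization of percent back is left out because its details aren't described here.

```python
from collections import defaultdict
from statistics import mean, median


def percent_back_from_median(times):
    """Percent behind (positive) or ahead of (negative) the median time in one race."""
    med = median(times)
    return [100.0 * (t - med) / med for t in times]


def eligible_athletes(races, min_each=3):
    """Return athletes with at least `min_each` home AND `min_each` away starts.

    `races` is an iterable of dicts with the hypothetical keys
    'athlete', 'nation' and 'host_nation'. A race counts as 'home'
    when the skier's nation matches the host nation.
    """
    counts = defaultdict(lambda: {"home": 0, "away": 0})
    for r in races:
        key = "home" if r["nation"] == r["host_nation"] else "away"
        counts[r["athlete"]][key] += 1
    return {a for a, c in counts.items()
            if c["home"] >= min_each and c["away"] >= min_each}


def center_within_athlete_season(rows):
    """Express each score relative to the athlete's own season average.

    This is an ASSUMED stand-in for 'relative to that specific athlete,
    at that point in their career'. `rows` are dicts with the
    hypothetical keys 'athlete', 'season' and 'score'.
    """
    groups = defaultdict(list)
    for r in rows:
        groups[(r["athlete"], r["season"])].append(r["score"])
    means = {k: mean(v) for k, v in groups.items()}
    return [dict(r, relative=r["score"] - means[(r["athlete"], r["season"])])
            for r in rows]
```

Centering within athlete and season is only one of several reasonable ways to control for differences between athletes and within an athlete over time; a rolling average over recent races would be another.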
I used some fancy modeling techniques that allow me to estimate the Home vs Away effect separately for each athlete and also for each nation. Keep in mind that lots of athletes (and whole nations!) are being tossed out completely because so many folks haven’t actually raced both home and away. In particular, I made the (possibly contentious) decision not to consider Canadian WCs as ‘home’ races for Americans, so the US doesn’t appear in this analysis at all.
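The technique isn't named here, but estimating an effect for each athlete and for each nation at the same time is commonly done with some form of partial pooling, where noisy per-athlete estimates get pulled toward their nation's average. The sketch below shows only that shrinkage idea. The function name and both variance parameters are hypothetical, with made-up default values, and it is a stand-in for illustration rather than the model used in this post.

```python
def shrink_toward_group(raw_diff, n_home, n_away, group_mean,
                        resid_var=1.0, between_var=0.05):
    """Partial-pooling estimate of one athlete's home-minus-away difference.

    raw_diff    : the athlete's mean(home) - mean(away) on the response scale
    group_mean  : the nation-level average difference
    resid_var   : assumed race-to-race variance (hypothetical value)
    between_var : assumed variance of true athlete effects (hypothetical value)
    """
    # Sampling variance of a difference between two independent means.
    sampling_var = resid_var * (1.0 / n_home + 1.0 / n_away)
    # Weight on the athlete's own data: larger with many races, smaller with few.
    weight = between_var / (between_var + sampling_var)
    return weight * raw_diff + (1.0 - weight) * group_mean


# Same raw gap, very different amounts of data behind it.
few_races = shrink_toward_group(-0.5, n_home=3, n_away=7, group_mean=0.0)
many_races = shrink_toward_group(-0.5, n_home=12, n_away=49, group_mean=0.0)
```

With these assumed variances, the athlete with 3 home races is pulled much closer to the nation average than the one with 12, which is the general reason estimates built on a handful of home races end up small and uncertain.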
Once I felt like I had a model that fit as well as I was going to get, I extracted information on a few athletes that had the largest Home vs Away splits in each direction. That information appears in the following table:
Name | Gender | Nation | #Home | #Away | Home Snow Adv |
GREY George | Men | CAN | 7 | 48 | -0.154 |
MAE Jaak | Men | EST | 9 | 70 | -0.145 |
NORTHUG Petter | Men | NOR | 8 | 35 | -0.114 |
DOMEIJ Sofia | Women | SWE | 6 | 9 | -0.108 |
PERSSON Fredrik | Men | SWE | 4 | 6 | -0.096 |
OLSSON Johan | Men | SWE | 12 | 49 | 0.164 |
ANDERSEN Remi | Men | NOR | 3 | 7 | 0.175 |
EILIFSEN Morten | Men | NOR | 4 | 11 | 0.189 |
JEFFRIES Chris | Men | CAN | 5 | 17 | 0.246 |
SVARTEDAL Jens Arne | Men | NOR | 8 | 44 | 0.279 |
Negative values represent a skier performing better at home and positive values represent a skier performing better away. The #Home and #Away columns contain the number of home and away races each skier has done during this time period.
There are many important things to say about this before you jump to any conclusions. First, none of the home snow (dis)advantages for individual athletes were anywhere close to being statistically significant. As you can see from the table, the sample sizes are generally quite low, particularly for ‘home’ races, which is a major factor. Second, the estimated effects themselves are quite small, even for the most extreme cases that I’ve shown above. The Home Snow Advantage column doesn’t really have a sensible unit, but to give you a sense of scale, my calculations suggest that a value of -0.15 roughly represents an improvement of ~2 places at home races, on average. I conclude from this that at the level of an individual athlete, a small number of skiers probably do see an effect, but the magnitude of this effect is dwarfed by all the other sources of variation inherent in ski racing. Specifically, given the relatively small number of times many athletes race at home, it is nearly impossible to separate a ‘real’ difference from someone having a few bad races at home just from bad luck.
Now, if you go back and look directly at some of these skiers’ results, you are likely to see an apparent difference. Jaak Mae is a good example. If you run through his raw results, it does seem like he’s skied better in Estonia. The issue is that the number of times he’s done this is small enough, and the natural variation in skier performance is so high, that it’s tough to rule out it happening just by chance.
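One way to make "tough to rule out it happening just by chance" concrete is a permutation test: shuffle the home/away labels many times and count how often a gap at least as large as the observed one shows up. This is a minimal sketch with a hypothetical function name. It is not the test used in this analysis, and the inputs would need to be a skier's actual home and away results.

```python
import random
from statistics import mean


def permutation_p_value(home, away, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the difference in mean performance."""
    rng = random.Random(seed)
    observed = abs(mean(home) - mean(away))
    pooled = list(home) + list(away)
    n_home = len(home)
    hits = 0
    for _ in range(n_perm):
        # Reassign the 'home' label at random to n_home of the races.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_home]) - mean(pooled[n_home:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

With only a handful of home races and a lot of race-to-race variation, many random relabelings can produce a gap as large as the real one, so the p-value stays large even when the raw results look suggestive.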
So. This particular collection of skiers has displayed a tendency, relative to their peers, to ski better (or worse) at home than away. But if you dug into each skier’s races one by one, I suspect that you’d find explanations for this trend that, while correlated with home vs away races, are unrelated to what we normally think of as ‘home snow advantage’.
The estimates by nationality are on slightly better footing. Not surprisingly this is largely due to an increase in the effective sample size that comes with aggregating across athletes from a particular nation. After all my restrictions on the data, there were only 12 nations present in the analysis. The following graph shows the estimated home vs away effect for each nation, men and women combined:
Again, negative values represent improved performances at home, positive values represent improved performances away. The effect sizes at the extremes are slightly larger, but still not big. Once again, we’re probably only talking about an average difference of 3-4 places in the most extreme examples. However, we have gained a fair bit of statistical confidence in several cases, as indicated by the blue confidence intervals drifting away from zero.
Specifically, we can be fairly sure that, on average, Estonians and Finns tend to race slightly better at home, while Swedes, Russians and the French tend to race slightly worse at home. In the case of the French, I suspect that sample sizes are still at play, as the number of WC races in France has been quite small, so I’m a little leery of reading too much into that effect.
I want to emphasize once again that the average effect being estimated here is very small in terms of the difference you’d see in an actual race. The relative differences between nations are more interesting, really, than the estimated effects themselves. So it’s interesting to me that Estonia and Finland display more of a home field advantage than Norway, Sweden and Russia, but I wouldn’t read much into whether your favorite nation has an estimated effect that happens to be positive or negative. (I’m looking at you, Norway!)
Also of note is that despite omitting Olympic races and including a variable that attempts to control for field strength, Canada has the third largest home field advantage. That field strength variable was a fairly crude one, so it’s possible that if I devised a more sophisticated way of measuring field strength and re-ran the analysis, Canada would slip down this list a bit.
Like every other fairly complicated analysis I’ve done on this site, there are surely things I haven’t controlled for. Many of these are things that I just don’t have data on. A big one that puzzled me for a while was the fact that WC races in different nations tend to happen at roughly the same times of year. So it’s possible that some of these effects are being confounded with the fact that, say, WC races in Norway tend to fall towards the end of the season when athletes may be more tired. If I had to guess, I’d venture that it isn’t a huge deal, but I have to leave something for me to do as a follow-up, right?