
Is There A Home Snow Advantage?

The folks at Nordic Commentary Project were kind enough to point me towards an interesting book that raises the issue of ‘home snow advantage’. I actually haven’t had a chance to read the book yet, but the question of a home snow advantage is certainly an interesting one and it’s one that I haven’t talked about.

The idea that skiers tend to perform better on their ‘home snow’ is an intuitive one, and there is likely something to it, regardless of whatever we may or may not be able to show statistically. But showing it with data is going to be tricky, and involve honest-to-goodness modeling, rather than just a cleverly made graph.

For starters, I’m only going to address distance skiing in this post. This turned out to be complicated enough that I wanted to report back before I spent another few days slogging through a version for sprinting. Next, we need to think carefully about how to model this:

  • I only have data at the scale of nations for both location of race and the nationality of the skier. So ‘home field advantage’ is going to be defined as racing in the nation you are from. Skiers changing nationalities happens rarely enough that I’m going to simply wave my hands and declare it to be not a problem. Feel free to disagree.
  • I’m only going to consider World Cup races, not OWG or WSC. It’s possible I’m being overly cautious, but I worry that those major events will often be the primary focus of athletes, and this will influence their performance in a way that’s unrelated to where these races happen to be held. I’ve also restricted this to seasons from 2002-2003 onward.
  • Not everyone has the opportunity to race both ‘home’ and ‘away’. Obviously, only those athletes from nations where WC races are held have this option at all, and even then, not all of them actually do. So I’m only considering those athletes that have raced both home and away at least three times each.
  • What’s our response variable? Namely, how do we measure differences in performance for a skier at home versus away? What I settled on was to take the standardized percent back from the median skier and then adjusted it so that it measures the performance relative to that specific athlete, at that point in their career. So I am controlling for differences in performance across athletes and also differences within each athlete over time.
  • What are our explanatory variables? I’ve included a Home vs. Away variable, of course, and also gender, nationality and a variable that roughly accounts for some of the wider discrepancies in field strength.
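To make the response variable a bit more concrete, here is a minimal Python sketch of the idea. The function names are hypothetical, and two details are assumptions made purely for illustration: scaling by the within-race standard deviation, and using a plain average of an athlete's recent results as their baseline. This is not the actual code behind the analysis.

```python
from statistics import mean, median, stdev


def percent_back_from_median(times):
    """Percent behind the median finisher in one race (negative = faster)."""
    med = median(times)
    return [100.0 * (t - med) / med for t in times]


def scale_by_spread(percent_back):
    """Divide by the race's standard deviation so races are comparable.

    Assumption: the post does not say exactly how the standardization
    is done; this is one simple option that keeps the median skier at 0.
    """
    spread = stdev(percent_back)
    return [p / spread for p in percent_back]


def relative_to_athlete(scores):
    """Subtract the athlete's own average over some window of races.

    Assumption: a plain mean stands in for 'that athlete, at that point
    in their career'; pass in only the races from the relevant period.
    """
    baseline = mean(scores)
    return [s - baseline for s in scores]
```

Feeding one race's finishing times through the first two functions, and then one athlete's scores from a single period through the third, gives a number where negative means "better than this skier usually does", which is the quantity a Home vs. Away variable could then try to explain.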

I used some fancy modeling techniques that allow me to estimate the Home vs Away effect separately for each athlete and also for each nation. Keep in mind that lots of athletes (and whole nations!) are being tossed out completely because so many folks haven’t actually raced both home and away. In particular, I made the (possibly contentious) decision not to consider Canadian WCs as ‘home’ races for Americans, so the US doesn’t appear in this analysis at all.

Once I felt like I had a model that fit as well as I was going to get, I extracted information on a few athletes that had the largest Home vs Away splits in each direction. That information appears in the following table: Continue reading ›


Week In Review: Friday Apr 8th

Gee, I wonder what the big news in XC skiing this week could be? It sounds from the translated reports from Norway and Estonia that I’ve read that Andrus Veerpalu tested positive (both A and B samples) for human growth hormone, but is denying any wrongdoing. Naturally, this means I started the week off with:

  • A new post revisiting my older one on Veerpalu. A common accusation against Veerpalu (and other suspected dopers) is that they have an unusual ability to show up at major events and ski much faster than they “normally” do. What I hope people take away from these two posts is that (a) our intuitive sense for “unusual” results does not always match the data, and (b) the answer you get will depend strongly on how you measure performance. The resulting situation is pretty ambiguous, which is why I would never recommend this line of reasoning as a serious accusation against someone.
  • A look at what we can learn about pacing from split times.
  • The first of several posts looking back at the careers of skiers who have decided to retire. This week was Pirjo Muranen’s turn.

Finally, I’m going to take this opportunity to expand briefly on something I tweeted about. I read that one of the statements made at Veerpalu’s press conference in his defense was that he had passed more than 100 drug tests in the past. Although sports fans are becoming more educated about the statistics of drug testing, some confusion still remains. Without getting into the technical details, here’s the basic story.

Drug tests can make two types of mistakes: false positives, where we incorrectly label a clean athlete as a doper, and false negatives, where we incorrectly label a doped athlete as clean. In general, any testing scheme will involve a trade-off between these two types of errors. If you tweak your methodology in order to reduce the number of false positives, you will inevitably increase the number of false negatives. This trade-off cannot be outwitted! I often hear people suggest that if we combine two or three tests, or engage in some other complicated scheme, we can reduce both types of error at the same time. Some testing procedures will be better than others in terms of both types of error, but whatever complicated combination of procedures and tests you invent, the end result will always amount to a single, big test that is itself subject to this very trade-off.

Drug testing in sports, for obvious reasons, is often calibrated in such a way that false positives are considered far worse than false negatives. Specifically, tests are often constructed in order to be very careful to avoid falsely accusing an athlete of doping. Sadly, this means that negative results are simply less informative, in that they are much less likely to actually mean the person is clean.

My general conclusion as a stat guy is that I’m much more likely to believe that a single positive result is accurate than I am a single (or even many) negative results. A string of negative results doesn’t receive zero weight in my book, but they don’t receive much. (This is ignoring extra-statistical issues like mishandling samples at the lab, corrupt labs, or other similar human factors.)
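For anyone who wants to see the arithmetic behind that intuition, here is a small Python sketch using Bayes' rule. Everything numeric in it is made up for illustration: the prior, the sensitivity (the chance a doper tests positive) and the specificity (the chance a clean athlete tests negative) are hypothetical values, not figures for any real test. The function also treats repeated tests as independent, which is a strong assumption that real testing may not satisfy.

```python
def posterior_doping(prior, sensitivity, specificity,
                     n_negative=0, n_positive=0):
    """Probability of doping after a set of test results, via Bayes' rule.

    Assumes every test is independent with the same error rates.
    """
    # Likelihood of the observed results if the athlete is doping.
    like_doped = (sensitivity ** n_positive) * ((1.0 - sensitivity) ** n_negative)
    # Likelihood of the same results if the athlete is clean.
    like_clean = ((1.0 - specificity) ** n_positive) * (specificity ** n_negative)
    numerator = prior * like_doped
    return numerator / (numerator + (1.0 - prior) * like_clean)


# Hypothetical numbers: a test tuned to almost never accuse a clean athlete,
# at the cost of missing most dopers.
PRIOR, SENSITIVITY, SPECIFICITY = 0.10, 0.10, 0.999

after_one_negative = posterior_doping(PRIOR, SENSITIVITY, SPECIFICITY, n_negative=1)
after_one_positive = posterior_doping(PRIOR, SENSITIVITY, SPECIFICITY, n_positive=1)
```

With error rates skewed this way, a single negative barely moves the probability away from the prior, while a single positive moves it a long way. How much a long string of negatives should count depends heavily on the independence assumption, so that part of the sketch deserves the most suspicion.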


Career Retrospective: Pirjo Muranen

Finnish sprinter Pirjo Muranen is one of several skiers hanging up their skis for good this year. I’ll be devoting a post to each of them over the next few Fridays, starting with Muranen.

Muranen was certainly a successful skier, though not an overpowering one. She has an individual Gold and Bronze from the World Championships in 2001 and 2009, but all of her other Olympic and World Championship medals are from relays. Also a solid distance skier, she had a best result of 4th in a 10km mass start in 2009, along with numerous top tens. Additionally, Muranen has participated in the Tour de Ski three times, improving from 22nd to 15th to 12th in the overall standings.

Let’s start with her WC, WSC and OWG distance results:

Recall that I’m using standardized percent back from the median skier here, and in general values below 0 are good, values below -1 are excellent and values below -2 are unbelievable.

Muranen’s extraordinary race from this season was her 6th place in the waxing-plagued 10km classic at World Championships. Other than that, her best races tended to be around -1 to -1.5. Obviously, Muranen experienced some difficulties leading up to the 2006 season. I haven’t followed her closely, so I’m not sure what caused this, but she sure rebounded from 2007 forward.

Here’s the same graph broken down by technique: Continue reading ›


Should You Start Fast Or Slow?

Pacing is a frequent topic of conversation in skiing, or really any endurance sport. Typically, the refrain is ‘don’t start too fast‘. In fact, I feel like I hardly ever hear people recommending that one should start a race harder. It must happen occasionally (and I’ll share a story about this later), but I suspect endurance athletes have a general bias towards worrying about going too hard early in a race rather than too easy.

If someone tells me that I shouldn’t start a race ‘too fast’, my first question is ‘Too fast compared to what?’ There are only two things skiers could compare themselves to: (1) the maximum speed that you, personally, could sustain during an entire race, and (2) the speed your competitors are travelling at.

Split times from races give you direct information about (2), which in turn gives us only indirect information about (1). Think for a moment about how runners can learn to tell how fast they’re going based on how they feel. Runners can run on a fixed course (e.g. track, fixed road or trail course) and glance at their watch every lap or every mile and get instant feedback on the connection between how they feel and how fast they’re running. Even with the aid of GPS devices, skiers don’t have a concrete, objective measure of speed to compare themselves to that’s independent of weather, snow conditions, wax, etc. They can’t look at their watch, see that they skied 1km in 2:47 and have that mean something to them. Obviously, skiers do develop a sense for pacing, and probably a pretty good one, too. It’s just harder and they have to do it mostly by feel.

Let’s think carefully about how we’d show with data that someone started ‘too hard’ in an interval start race. The first thing we’d look for is their split times slowing down during the later stages of the race, right? But how do we know for sure that this race tactic would be slower compared to the alternate universe in which the skier started at a slower pace? It’s possible that the time they lost late in the race is balanced out, or even outweighed, by the time they gained early in the race. Ultimately, it’s tough to know for sure without a time machine that lets us go back and repeat the race using different tactics. Runners (particularly track runners) can experiment with these sorts of tactics with a fairly high level of precision, but skiers have to use whatever innate sense they develop over time.

Obviously, I’m not saying that it’s impossible to tell when you’ve started too hard. I’ve bonked in races myself, you know. My point is that while the extreme cases of poor pacing in either direction are easy to spot, there’s probably a large grey area that’s difficult to navigate with any certainty, at least in skiing.

Let’s get back to the old adage, ‘Don’t start too hard’. My point with this post is that this advice only makes sense compared to yourself (1), not your competitors (2). As I said above, how fast you’re going compared to everyone else is surely correlated with how fast you can go relative to your own potential, but if you start too hard in a race against Petter Northug and bonk, it isn’t Northug that caused your body to shut down, it’s your own limitations.

In fact, we can easily flip this around and say that compared to everyone else, you can never start too fast. What do I mean by that? Well, one of the first things that has jumped out at me while digging through the split times from the past WC season is how closely correlated early splits are with final performance, particularly in interval start races: Continue reading ›


Revisiting The Mysteries Of Andrus Veerpalu

With somewhat cryptic reports appearing that Andrus Veerpalu failed a drug test back in January (he retired just before World Championships this year), it’s probably a good idea to revisit my old post on the subject. In that post I considered the conventional wisdom about Veerpalu that he had an uncanny ability to pop outstanding races in big championship events. When I looked at all of his results as measured simply by rank, my basic conclusion was that he sometimes seemed to do this, but the difference wasn’t huge, and I could find some examples of skiers that were more extreme instances of people over-performing at major championships. In the end, I thought his record was ambiguous, but I could see why people would say this about him.

In this post, I think I’m going to end up backing off of my skepticism a little bit, but not because of an unconfirmed drug positive. Rather, if we acknowledge that Veerpalu was an extreme classic specialist, so much so that over his entire career he’s done 98 classic races and only 16 freestyle ones (and 19 pursuits in various formats), and use a somewhat more sophisticated measure of performance, things begin to look a little different. If we focus in on just his classic races, plotting them by standardized percent back from the median skier we get the following: Continue reading ›


Week In Review: Friday Apr 1st

No joke! Seriously, though, I thought about writing an April Fool’s post and then realized that I kind of hate April Fool’s posts. Probably because I’m an über-boring statistics geek who doesn’t know how to have fun at all. Or something. Winter’s ending elsewhere, though where I live it never really began. Spring here means I impatiently wait for the temps to start climbing above 60F, cause otherwise I won’t touch my road bike. Yeah, I’m a fair weather biker and proud of it. Of course, the fact that it’s rained pretty much every day this week hasn’t helped. Grrr.

Anyhoo, thanks again to Skaði Nordic for sponsoring this week’s Week In Review:


Do More Consistent Skiers Ski Faster?

In a word, no. But the relationship between consistency and speed is a little subtle. To look at this question let’s take the distance results from major international competitions (OWG, WSC or World Cups) and restrict ourselves to those times when an athlete did at least ten races in a particular season. Then for each season we’ll calculate how variable their results were and also the average of their best five races. Continue reading ›
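The per-season summary described above can be sketched in a few lines of Python. The function name and the input layout are hypothetical, and using the standard deviation as the measure of "how variable" is an assumption, since the excerpt doesn't say which measure is used. Scores are assumed to be on a scale where lower means faster.

```python
from statistics import mean, stdev


def season_summary(results, min_races=10, n_best=5):
    """Summarise each athlete-season as (variability, average of best races).

    results maps (athlete, season) to a list of race scores, lower = faster.
    Athlete-seasons with fewer than min_races results are dropped.
    """
    summary = {}
    for key, scores in results.items():
        if len(scores) < min_races:
            continue
        best = sorted(scores)[:n_best]  # the n_best lowest (fastest) scores
        # Assumption: standard deviation stands in for "how variable".
        summary[key] = (stdev(scores), mean(best))
    return summary
```

Plotting the first number against the second for every athlete-season is then one way to look at whether the more consistent skiers are also the faster ones.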
