
Mid-Season Recap: North American Distance

With the first period of WC racing in the books and the next one just around the corner, I’m going to look back at the results of the North American skiers so far this season.  The theme of this post is context and expectations.

One of the reasons that I started this blog was that I felt as though the tools we use to measure performance in skiing don’t always match up with our expectations, and this can lead to a lot of needless frustration.  The general paradigm for measuring performance in XC skiing is to examine only an athlete’s best few races over some time period.  FIS averages your best five races over the previous calendar year for their point lists; EISA picks out each skier’s best two classic and freestyle races (if I recall correctly), and so on.

We may not be explicitly aware of it, but this methodology stems from an important observation: once in a while good skiers can have truly terrible races.  Averaging is particularly sensitive to extremely small or large values, so one terrible FIS point race could have a huge effect if we were to look at all of someone’s results.  This is an excellent observation, but I don’t think people are necessarily aware of the consequences.

Performance measures that look only at an athlete’s best few races are a terrible indicator of how well they typically ski.  Instead, it’s measuring how well they have skied at their very best.  It’s an estimate of an upper limit, not a “typical value”.
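To make the distinction concrete, here is a minimal sketch comparing a best-five average with the median of every race.  The season of FIS points below is made up for illustration, and `best_n_average` is a hypothetical helper, not FIS’s actual procedure, which has more rules than this.

```python
from statistics import mean, median

def best_n_average(points, n=5):
    """Average of the n lowest values (lower FIS points are better)."""
    return mean(sorted(points)[:n])

# Hypothetical season of FIS points for one skier (made-up numbers),
# including one truly terrible race at the end.
season = [42.0, 45.0, 47.0, 50.0, 51.0, 60.0, 65.0, 70.0, 80.0, 150.0]

best_five = best_n_average(season)  # estimate of the skier at their best
typical = median(season)            # estimate of a typical race
overall_mean = mean(season)         # dragged upward by the one bad race

print(best_five, typical, overall_mean)
```

With these numbers the best-five average sits well below the median, and the plain mean sits well above it because of the single bad race.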

This, combined with short memories, leads to some strange tendencies, I think.  When we think about how good a skier is, we tend to think about their best results: Kris Freeman’s 4th place finishes, Kikkan Randall’s sprint podiums, etc.  But those aren’t necessarily indicative of how well they ski from week to week.

With all that in mind, let’s look at some graphs.

Note the slightly unusual scale on the y axis.  I’ve been tinkering with this for a while now, and I’ve grown to like using percent back from the median skier.  Negative values are good (ahead of the median skier); positive values are bad (behind the median skier).  It turns out that it mostly (but not completely!) eliminates discrepancies between mass start and interval start races, it’s easy to interpret, and there are some technical mathematical reasons why I like it as well.
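As a rough sketch of the idea, assuming percent back is computed from finishing times as the difference from the median time divided by the median time (the function name and the race times below are made up for illustration):

```python
from statistics import median

def percent_back_from_median(times):
    """Percent behind (+) or ahead of (-) the median finishing time."""
    med = median(times)
    return [100.0 * (t - med) / med for t in times]

# Hypothetical finishing times in seconds for a five-skier race
times = [1500.0, 1530.0, 1560.0, 1590.0, 1650.0]
pb = percent_back_from_median(times)

for t, p in zip(times, pb):
    print(f"{t:7.1f} s  {p:+.2f}%")
```

The middle skier lands at exactly 0%, the faster skiers come out negative, and the slower skiers come out positive.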

The red dots are this season’s results.  I’ve highlighted last year’s Olympic results in blue, which I’ll return to in a moment.  I will also point out the one instance here where percent back from the median doesn’t capture everything as we’d like: Devon Kershaw’s 5th place in the Olympic 50k last year.  His percent back from the median skier isn’t all that low, for a 5th place finish.  This suggests that (a) the pace might have been fairly slow and (b) as with all 50k’s, the field was likely somewhat reduced. Continue reading ›


Technique Preferences: Italy

My post on the differing performance of Japanese skiers by technique (classic vs. freestyle) got a lot of positive responses and a few requests that I use the same methods on some other countries.  First up is Italy.

I’ve tweaked and refined my model a fair bit, hopefully for the better.  The basic idea is the same: using a hierarchical linear model to estimate differences in performance in skating and classic races (I’m omitting pursuits of all varieties).  There are some technical things I’ve changed to be able to accommodate changes over time.  Mostly this means making some adjustments for the occasional small sample sizes you find from season to season.  This allows me to provide an estimate even in seasons where a skier did races of only one technique, although naturally those estimates come with a bit of a grain of salt.

In the results by athlete for their entire career, I’m going to display information on only those athletes that did a minimum number of races of each technique (2), for space and clarity reasons.

The final big change is in the distance category.  There are some technical reasons why FIS points are somewhat of a nuisance to use as a response variable in models like these, so I’m using something else: percent back from the median skier.  I’ll save a more detailed description for why I’m doing this and how this measure is useful for another post.  Here all we need to know is that 0% back represents the median (or middle) WC skier.  Negative values mean you’re faster and positive values mean you’re slower.

Going into this, our intuitive notion is that the Italians have been generally better at skating.  And that does turn out to be the case.  But some other fascinating stuff pops up as well. Continue reading ›


Top 10 Qualification vs. Making The Final

This is a follow-up post to an earlier entry where I noted an off-hand comment by US coach Chris Grover that when Kikkan Randall qualifies in the top ten in a sprint race, it is a good sign that she has a good chance of making it to the finals.

I looked at some data and concluded that it mostly supported his claim, but that the sample sizes were small enough to not mean much, from a statistical point of view.  It was suggested in the comments that I look at this overall, for all skiers and see what, if any, relationship there might be.

Generally speaking, I think we shouldn’t be terribly surprised about a correlation between one’s performance in qualification and one’s performance in the heats.  It just makes sense that, on average, the faster sprinters will do better in qualification.

What we’re interested in here is in quantifying this somewhat arbitrary, but interesting, cutoff of finishing in the top ten in qualification.  First up is a modified version of the plots I made for Kikkan Randall and Andrew Newell in the previous post: Continue reading ›


Merry Christmas!

The Statistical Skier is going to take a much needed holiday break to visit the Statistical In-Laws and hopefully do some skiing, but not much statistics.  Regular posting should resume on Monday.

Enjoy your holidays!


La Clusaz Recap: North American Women

Just as I did yesterday, let’s take a closer look at the results from Saturday’s World Cup mass start race in La Clusaz, France, this time focusing on the North Americans.  As I discussed yesterday, we’re going to use percent-back difference plots from mass start or pursuit races to evaluate each skier’s race, since the FIS points awarded in mass start races can be a bit misleading.

Starting off with the Americans, Morgan Arritola:

Both Arritola and Stephen seemed to have excellent races if you go by their place (18th, 19th) but pretty terrible races using FIS points.  Clearly, the FIS points are misleading in this case.  While the entire range of dots from Saturday’s race (blue) doesn’t seem markedly different than last season, what interests me is the median (red), which was almost 5 percentage points better in that race than what Arritola typically managed last season (or ever, quite frankly).  And if you think that’s a big jump compared to last season:

Granted, this is looking only at mass starts and pursuits, which happened to coincide with some truly horrible races for Liz Stephen last season.  But Saturday’s race still looks about as good, or slightly better than what she was managing the previous two seasons.

Men’s post to follow later today…


La Clusaz Recap: North American Men

Picking up from my post yesterday, we’ll move on to Kris Freeman:

Freeman told FasterSkier that blood sugar issues weren’t a problem, but that he simply didn’t have a great race on Saturday.  Well, it’s quite clear from this percent-back difference plot what happens when he does have blood sugar problems, as he did last season (this graph only includes mass start and pursuit races, which is why there’s only that one bad Olympic race for Freeman last season).

But if you move back earlier in his career, I don’t think this race looks all that bad in comparison.  Of course, it may be bad from the perspective of how fit he is right now, but compared to how he’s fared in these sorts of races in the past, I’d actually call that one fairly solid.  And of course, he’s always been a bit more of a threat in the 15k events.

How about the Canadians?  First up, Devon Kershaw: Continue reading ›


La Clusaz Recap: Distance Mass Start

Saturday was the first long mass start race of the World Cup season in La Clusaz, France, and along with mass start races come difficulties in measuring performance in a sensible manner.  The women’s field didn’t help matters by having the top three ski away from the field.

So you end up with some odd stuff, like Elizabeth Stephen and Morgan Arritola seemingly having some of the better races of their careers (18th, 19th) with FIS points near 100, which would only just barely put them in the top ten in last weekend’s Nor-Am race in Canada.  As I’ve pointed out before, over the course of a season or an entire career, these sorts of things tend to even out and not be such a problem.  But in situations like this, where we’re looking at a single race and we want to know what it means, it’s a problem.

One way we can get around this is by comparing a skier directly to the other skiers in this particular race, using a percent-back difference plot looking only at mass start and pursuit races.  The fact that the entire field’s FIS points may be inflated (or deflated) artificially won’t matter, since we’ll only be looking at the relative differences between a particular skier and the field.  Sometimes we get a different answer, sometimes not.  As an example, let’s look at a handful of skiers who appeared to have particularly good or bad days on Saturday.
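For the curious, the calculation behind such a plot might look roughly like the sketch below.  It assumes percent back is measured from the winner’s time and that each dot is the difference between one skier’s percent back and a competitor’s; the skier labels, times, and function names are all made up for illustration.

```python
from statistics import median

def percent_back(times):
    """Percent behind the winner for every skier in one race."""
    winner = min(times.values())
    return {name: 100.0 * (t - winner) / winner for name, t in times.items()}

def percent_back_differences(times, skier):
    """Differences in percent back between one skier and each competitor.

    Negative values mean the skier finished ahead of that competitor.
    """
    pb = percent_back(times)
    return [pb[skier] - value for name, value in pb.items() if name != skier]

# Hypothetical mass start race, times in seconds
race = {"A": 2000.0, "B": 2020.0, "C": 2050.0, "D": 2100.0}

diffs = percent_back_differences(race, "C")
race_median = median(diffs)
print(diffs, race_median)
```

Because every value is a difference within the same race, a uniformly fast or slow field shifts everyone together and largely drops out of the comparison.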

First up, Curdin Perl (SUI):

Compare this with how his race looks in my snapshot graph.  In that version, Perl had a career race, as the black bar represents the middle 50% of his results over the past several years.  While the percent-back difference plot above suggests a good race, it doesn’t look mind-blowingly good.  The difference in percent-back between Perl and the other top 40 skiers or so is modestly better than last season.

Another skier who seemed to have an unusually good race was Sweden’s Daniel Rickardsson (who, I believe, is also better known for his classic skiing, but I haven’t checked that): Continue reading ›
