
Battle Of The Sexes

A strange discussion has suddenly broken out over at FasterSkier in response to some comments that Marit Bjørgen made about how she’d like to be able to race the 50km distance just like the men do.

A reader sent me a note asking if I’d specifically address this statement from the comments:

Looking at the times of sprint races (prologues) when men and women have gone the same distance, in almost identical conditions, you will see that the fastest woman is normally a similar speed to the last man. This would leave only the last man in the field fighting for his manhood, and trying not to get chicked.
It is similar all the way up the distances if you look at the split speeds. Look at the times the women come through 1, 3 then 5km (whatever the split points are on that day) then look at the men’s times. Often the first woman is about the same as the last man.

As far as I can tell, the commenter xclad’s first assertion is correct. Taking major international sprint races (WC, TDS, OWG and WSC), I found 42 instances over the past 7-8 years where the men and women appeared to use the same course for a sprint competition. In these cases, the fastest woman in qualification is in fact pretty close to the slowest man in qualification. The best any woman fared, in the races I looked at, was to have only ~73% of the men ahead of her in qualification speed. Nearly half the time, the best woman would have been last in the men’s qualification.
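For anyone who wants to replicate this sort of check, the core calculation is simple. Here is a minimal sketch in Python; the field sizes, times and column names are invented for illustration, not actual race data:

    import pandas as pd

    # Hypothetical qualification results for one race where men and women
    # skied the same sprint course; times are in seconds.
    qual = pd.DataFrame({
        "sex":  ["M"] * 5 + ["F"] * 5,
        "time": [185.0, 187.2, 189.5, 192.0, 196.3,
                 195.8, 198.1, 200.4, 203.0, 207.5],
    })

    fastest_woman = qual.loc[qual["sex"] == "F", "time"].min()
    men = qual.loc[qual["sex"] == "M", "time"]

    # Share of the men's field that qualified faster than the fastest woman.
    pct_men_ahead = (men < fastest_woman).mean()
    print(f"{pct_men_ahead:.0%} of the men were ahead of the fastest woman")

Repeating that for each race where the courses matched gives the sort of distribution I’m describing.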

As for xclad’s comment about split times in distance events, I can’t say. I don’t have access to split times for every race. But given the disparity in the sprint times, it seems at least plausible.

I do have one caveat to add. I wrote a post a week or two ago describing differences in the times from round to round in men’s and women’s sprints. Specifically, for the women the qualification round is often the slowest round of the day, while for the men the opposite is often true. This means that comparing women’s qualification times to the men’s will often be comparing the women’s slowest times on that course for the day to the men’s fastest. (Not always, of course; there’s a fair bit of variation from race to race.)

Now, what I found in that previous post was that the median time for the women generally improved by around 5 seconds between qualification and the finals. So the question is, how much of a difference would those five seconds make in comparing the qualification times? Well, if I just artificially subtract five seconds from the best woman’s time, that would still have put her in the top half of the men’s qualification field only once.

As for my own personal feelings, I also think it’s quite silly that the men and women ski different distances. I’d love to see them do the same distances as part of my general desire for FIS to greatly simplify their collection of race formats. It may be true that a 50km for women would be a different sort of physiological test than it is for the men, but that’s fine. It’s not like the women’s 30km is really the physiological equivalent of the men’s 50km as it is, I think.


Checking In On The Scando Cup

As always, it’s nice to see plenty of North Americans jaunting around Europe getting in some good racing. I returned from my travels this weekend to learn that some Americans had some strong results at the Scandinavian Cups in Estonia. Let’s take a bit of a closer look at the distance results, shall we?

First off, Reese Hanneman was the only American to race the 15km classic race. As best as I can tell, there were precisely two athletes in that race that he’d faced before, both of them at World Juniors last season. (In both cases he still lost to them, but by much, much less.) So I’m not even going to touch that one. Too little data.

Even for the women, the number of athletes that have faced each other before is fairly small, so keep that in mind as I run through this. Let’s start with Ida Sargent, who finished 21st, earning her 67.77 FIS points, which is quite good for her:

Excellent! Lower points are always a good thing. However, what happens when we look directly at how she skied that day compared to her performances against those specific skiers in the past?


Rybinsk: Sprint Recap

I only have the raw results this time around, so nothing fancy. Instead, we’ll just pick out some people who looked like they had some unusually good races again.

There’s a lot less to say here, since it’s a lot harder to measure sprinting performance in ways that account for the strength of the field. A few things do stand out to me, though.

Martin Jaeger’s race seems genuinely good, even with the weak field. He’s been in the top ten before, but that was two years ago, and only once. On the other hand, Peeter Kummel’s result seems somewhat less impressive than it does at first glance. Unlike Jaeger, he’s had top tens before on a fairly regular basis. That’s certainly not a typical result for him, but it’s not unusual for him to pop up in the top 10/15 once or twice a season, is what I’m saying.


Rybinsk: Pursuit Recap

A bit delayed, since I spent a goodly portion of Saturday running, and most of the entire weekend without internet. With the small fields it can be a little tough to interpret results from these events, to be sure. Particularly when the distance race was a pursuit, which tends to lead to artificially low FIS points for many of the leaders. And of course in sprinting, we don’t really have any way of measuring the strength of the field beyond the qualification round.

So let’s look at a few folks who had some seemingly stronger than usual races, starting with the pursuit:

I’m using the percent behind the median skier, and I’ve only plotted major international races (WC, OWG, WSC, TDS). The red is obviously the Rybinsk result. For the three men, this makes their results look somewhat less impressive. Not bad, of course, but probably not quite as good as they might have seemed from just glancing at the results.
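To be explicit about the measure, “percent behind the median skier” is just each racer’s time expressed relative to the median time in that field. A minimal sketch in Python, with invented times:

    import pandas as pd

    # Hypothetical finishing times (seconds) for one distance race.
    times = pd.Series([2105.4, 2111.0, 2118.7, 2125.2, 2140.9, 2156.3])

    median_time = times.median()

    # Percent behind the median skier: 0% is exactly the median,
    # negative values are faster than the median.
    pct_behind_median = 100 * (times - median_time) / median_time
    print(pct_behind_median.round(2))

One nice property of anchoring on the median rather than the winner is that a single unusually fast winner doesn’t distort everyone else’s number.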

Valentina Novikova is a rather different case, being a somewhat less seasoned racer than the others. She hasn’t had the opportunity to do many distance World Cups this season, but her strong result in Rybinsk does actually seem fairly good compared to her results last year.

Of course, there’s always the worry that the pursuit format is clouding the issue somewhat, so here’s a version of the same graph, only showing pursuits and mass starts:


Rybinsk: Attendance

I feel like a grade school teacher, checking off names and such, but let’s see how many folks actually showed up in Rybinsk compared to past years:

I’ve plotted the number of racers and nations present in the sprint and distance events in Rybinsk, along with the corresponding averages for all other World Cup events (not including major championships like Olympics, World Champs, or big events like the Tour de Ski) in black.
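Tabulating those counts from raw result lists is straightforward; here is a minimal sketch in Python with made-up rows. In practice you would also carry along the season, venue and event type so that the averages can be computed per group:

    import pandas as pd

    # Hypothetical result rows: one row per racer per event.
    results = pd.DataFrame({
        "event_id": [1, 1, 1, 2, 2, 2, 2],
        "name":     ["A", "B", "C", "D", "E", "F", "G"],
        "nation":   ["RUS", "RUS", "FIN", "RUS", "GER", "FIN", "ITA"],
    })

    # Count racers and distinct nations for each event.
    attendance = results.groupby("event_id").agg(
        racers=("name", "size"),
        nations=("nation", "nunique"),
    )
    print(attendance)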

At first glance, we see that the men’s fields were slightly larger this year in Rybinsk, although we might then note that the men’s fields have been slightly larger across the board this season as well. The number of nations present has been somewhat more stable. This highlights just how much depth is lost in World Cup fields from the omission of only two nations (Norway and Sweden).

I also find it interesting that the number of racers was mostly a bit below average through 2008-2009, but it has only been this year and last that attendance has really plummeted in some cases.


Editorial: Ranking Lists – What Are They Good For?

After some thought, I feel compelled to expand a bit on my post regarding the US National Ranking List (NRL). Actually, my thoughts on the subject apply to points lists and athlete ranking in XC skiing more generally.

Ranking athletes via some numerical system can be both fun and necessary. It’s fun for skiing fans like myself who enjoy numbers and working with data. It’s sometimes necessary as a tool to inject some objectivity into distinguishing between high and low performing skiers. This tool can be used for many ends: seeding racers for start lists, selecting athletes for national teams, selecting athletes to participate in major events, identifying promising young athletes, etc.

The US NRL recently came in for some (to my ears, mild) criticism. Specifically, since it is based on an athlete’s best four races over the previous 12 months, it is possible for skiers to remain ranked very high even though their performance this season has been underwhelming. The NRL has been a subject of dispute in the past, and I’m sure it will be criticized again no matter how it may be altered or improved upon.

The specific “problem” in this case is rooted in the fact that ranking athletes by the average of their best four (or in FIS’s case, five) races over an entire calendar year makes it easy to ascend the ranking list rather quickly, but coming back down happens very slowly as races move outside the one year window. Two excellent races can rocket you up the rankings, but then they stick around for an entire year no matter how badly you ski from that point on.
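To make that mechanism concrete, here is a toy version of the calculation in Python. The dates and FIS points are invented, and the real NRL has more wrinkles than this, but the windowing behavior is the point:

    import pandas as pd

    # Hypothetical race history for one athlete: date and FIS points per race.
    races = pd.DataFrame({
        "date":   pd.to_datetime(["2010-02-15", "2010-03-01", "2010-12-10",
                                  "2011-01-05", "2011-01-20"]),
        "points": [45.0, 48.0, 95.0, 102.0, 88.0],
    })

    as_of = pd.Timestamp("2011-02-01")
    window = races[races["date"] > as_of - pd.DateOffset(years=1)]

    # Ranking value: average of the best (lowest) four point results in the window.
    # The two strong races from last February/March are still inside the window,
    # so they keep this skier's average low despite a weak current season.
    rank_value = window["points"].nsmallest(4).mean()
    print(rank_value)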

There are tons of (fairly simple) ways to fix this issue, but I’d rather people focus on a slightly different topic than the technicalities of points and ranking lists. Instead, I’d like to discuss what it means to use data to help one make a decision and how this pertains to using data in the world of XC skiing.

There is an unfortunate tendency for people to expect data to literally tell them the answer to a question, unequivocally. I think people have a mental model of science that goes something like this:

  1. Scientist asks question.
  2. Scientist collects data.
  3. Data tells scientist what the answer to her question is.

While not entirely false, this is badly misleading. The truth is something closer to this:

  1. Scientist asks question.
  2. Scientist collects data.
  3. Scientist uses her judgement on how to model the data.
  4. Scientist uses her judgement on how to interpret the results of her model.
  5. Scientist uses her judgement on how well (if at all) any of this answers her original question.

Obviously, I’m being very simplistic here to make a point. To be sure, data can be an enormous help in answering questions. But it does not replace human judgement; it augments it.

How does this relate to the NRL? When I hear cries for more purely “objective” criteria in XC skiing, I often get the sense that people want the data to make decisions for them. I think this is both lazy and dangerous. No ranking list, no matter how well designed, will perfectly capture the notion of skiing ability.

Even so, the NRL isn’t “wrong”. It’s doing exactly what it was designed to do: rank people based on the average of their best four races over the past year. The “wrongness” stems from the perceived disconnect between the ranking list and what we want the list to tell us. It would be more productive, I think, for people to think about what it is they want the NRL to measure.

It’s unreasonable to expect the NRL, as currently implemented, to reliably identify skiers who are performing the best right now. There will surely be some significant overlap between skiers ranked well on the NRL and skiers racing well right now, but we shouldn’t necessarily expect this given how the NRL and similar tools are designed. Also, despite our fondest wishes, it’s a stretch to expect these lists to accurately predict who will ski the fastest in the future. Generally speaking the person ranked 5th will usually ski faster than the person ranked 50th over the next few races, but how meaningful are the differences between 5th and 6th? 5th and 8th?

The former problem is easily addressed from a technical standpoint by simply looking at races over a shorter time window, or by down-weighting older races. You needn’t alter the NRL itself if all you want is a tool for athlete team selection. [1. Using shorter time windows involves a sticky trade-off in that it then becomes crucial for athletes to attend every top NRL race. Every athlete would need to get four distance and four sprint races in by early January. If everything goes well, this may not be a problem. But weather can interfere, causing races to be canceled, so it would be risky to skip any. There were essentially 7 distance races and 6 sprints available to domestic skiers this season through US Nationals (although two of those “sprints” were only qualifiers). That doesn’t leave a lot of margin for error.]
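As one illustration of the down-weighting idea, you could shrink each race’s influence according to its age before averaging the best results. This is purely a sketch; the half-life and weighting scheme are invented, not anything any federation actually uses:

    import numpy as np
    import pandas as pd

    # Same hypothetical race history as in the earlier sketch.
    races = pd.DataFrame({
        "date":   pd.to_datetime(["2010-02-15", "2010-03-01", "2010-12-10",
                                  "2011-01-05", "2011-01-20"]),
        "points": [45.0, 48.0, 95.0, 102.0, 88.0],
    })

    as_of = pd.Timestamp("2011-02-01")
    age_days = (as_of - races["date"]).dt.days

    # Exponentially down-weight older races (half-life of ~120 days).
    weights = np.exp(-np.log(2) * age_days / 120)

    # Weighted average of the four best (lowest) point results: old races still
    # count, but a year-old result carries much less weight than last week's.
    best = races.nsmallest(4, "points")
    rank_value = np.average(best["points"], weights=weights.loc[best.index])
    print(round(rank_value, 1))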

The latter problem is just inherent to the sport. The entire reason I include error bars in my athlete rankings for World Cup skiers is to point out how much precision averaging each skier’s best N races can really achieve (not as much as we’d like). There isn’t really a ‘solution’; ski racing is variable, so predicting who’s going to ski fast next month (or next week, or next year) will always be a sizable gamble. No amount of statistical trickery will fix that.
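One way to get a feel for that variability is to bootstrap the “average of best N” statistic from a single skier’s results. This isn’t necessarily how my error bars are computed; it’s just a sketch with invented percent-back numbers:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical percent-back results for one skier over a season.
    results = np.array([3.1, 4.5, 2.8, 6.0, 5.2, 3.9, 7.4, 4.1])

    # Resample the season many times and recompute the best-5 average
    # to see how much it wobbles.
    boot = [np.sort(rng.choice(results, size=results.size, replace=True))[:5].mean()
            for _ in range(2000)]
    low, high = np.percentile(boot, [2.5, 97.5])

    print(f"best-5 average: {np.sort(results)[:5].mean():.2f} "
          f"(roughly {low:.2f} to {high:.2f})")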

I draw two conclusions from this. First, attempts to design perfectly objective team selection criteria using point systems and ranking lists will fail, and will fail spectacularly. If you show me the most advanced, sophisticated ranking system in existence (e.g. Elo), I’ll show you ways in which it could potentially fail and probably does in practice. Ranking lists should be a guide for subjective, human-based decision making, not a rulebook for an automaton. The goal should be to design a system that better augments human judgement, not one that is intended to completely supplant it.

Second, the flaws in the points lists and ranking systems we do use should be evaluated in terms of how likely they are to lead us to make wrong decisions. For example, the NRL was probably “wrong” to rank Caitlin Compton 3rd for the purposes of selecting a World Championships team. Ultimately, though, she wasn’t selected for the team, so I’m left wondering what the big deal is. If we can identify and fix a perceived flaw in the NRL, fine. But perhaps we should only flip out about it to the extent that it leads to poor decisions. [2. Obviously, this last point depends on what we’re using the list for. We can’t have race organizers using their “discretion” to seed athletes differently if they think the current ranking list is wrong.]

Going forward, it seems most productive to me to (a) decide what you want to measure, as precisely as you can, (b) develop the best method you can for measuring that quantity and then (c) stipulate that it will be used only as a guide for decision making, not followed to the letter.

In that light, the process that John Farra described in the above linked FasterSkier article seems rather sensible to me. I was particularly struck by his mentioning that they started with the specific events at WSC and worked backwards from there. One area of improvement might be to look at better ways of organizing, displaying and analyzing data that highlight specific strengths/weaknesses in particular techniques or events, since the NRL currently doesn’t help much with this. But generally, I applaud the basic philosophy: data as tool and guide, not as a rulebook.


Race Snapshot: IBU WC 7 Men/Women Pursuit

More strong racing from Lowell Bailey and Sara Studebaker.

