Statistical Skier : Data based analysis and commentary on nordic skiing

Arrivederci!

First Arianna Follis retires, and now we learn that Marianna Longa is heading out the door (again!). This strikes me as quite a blow to the Italian women’s team. Before I wrote this post, if you had asked me to list some other top Italian women besides Follis and Longa off the top of my head, I’d probably have been able to name Magda Genuin, Antonella Confortola and maybe one or two others if I’d really thought about it.

Going back through some data, it does seem like the Italian women are really going to be struggling next year, particularly in putting together a relay team. I’ve put together some graphs of the remaining contenders, though I’m by no means an expert in up-and-coming Italian women, so I apologize if I’ve omitted someone important.

First we have some “older” ladies:

Genuin is a talented sprinter, with several WC level podiums and numerous top tens. She’s not a spectacular distance skier, though, as her best results just break the 50 FIS point level. Confortola is a stronger distance skier, but she’s only cracked the top ten four times in a WC level race, and 36 is fairly old for WC skiers. Also, her results have been trending in the wrong direction lately.

Then we have some folks in their mid to late twenties: Continue reading ›

Tagged development, italy, retirement, women

2011 05 25 Joran Cross Country Comments (0) Permalink

Giro d’Italia: Rest Day 2

We’re through Stage 15 of the Giro, so here are some updated versions of the accompanying graphs. I made these “high quality” (i.e. PDFs) in case anyone wants to repost them elsewhere. Here’s the overall GC picture:

And then we have the same data broken down by team: Continue reading ›

Tagged cycling, Fun Stuff, giro d'italia, graphs

2011 05 23 Joran Cycling, Fun Stuff Comments (0) Permalink

Career Retrospective: Petra Majdic

What can you say about Petra Majdic? The Slovenian was delightfully hard to miss on the ski trails. Known as a classic sprint specialist, she was no slouch at distance events either:

From 2002 forward, that is some very consistent skiing, year in, year out. I realize that standardized percent behind the median skier may be a little hard for you to get a sense for, so here’s a different way of looking at it. Majdic’s median distance result from 2002 on was basically 11th. She slipped a bit in 2004 and 2005 (median of 17th in those seasons), but every other season her median was ~11th or better.

You might notice the sudden appearance of higher quality races beginning in 2006. My first thought was that this corresponded (roughly) with the start of the Tour de Ski, an event that Majdic has excelled at. Additionally, the unusual formats and smaller fields can sometimes produce unusually extreme differences compared to the median skier. But only 1/3 of those races below -1.5 after 2006 came from the Tour, so she probably stepped up her distance skiing generally. Note, though, that the presence of several much stronger results only slightly reduces her median results in any given season. I point this out only to remind you once again of the difference between gauging performance based on someone’s best results versus their ‘average’ result.
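To make the best-versus-median point concrete, here is a minimal sketch. The finishing places are hypothetical, chosen only for illustration; they are not Majdic’s actual results.

```python
from statistics import median

# Hypothetical finishing places for one season (illustrative only,
# not taken from any real skier's record).
typical_results = [8, 10, 11, 12, 13, 14, 15, 17]

# The same season with three standout races added.
standout_results = [1, 2, 3]
full_season = typical_results + standout_results

best_before, best_after = min(typical_results), min(full_season)
median_before, median_after = median(typical_results), median(full_season)

# The best result changes dramatically; the median barely moves.
print("best:  ", best_before, "->", best_after)
print("median:", median_before, "->", median_after)
```

A handful of great days changes the best result a lot, while the median stays close to where it was.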

That said, the Tour has been friendly to Majdic’s distance racing. Half of her distance podium results have come from Tour stages. And she surely is stronger in classic events: Continue reading ›

Tagged career retrospective, petra majdic, retirement, Sprint, women, World Cup

2011 05 20 Joran Career Retrospective, Cross Country Comments (0) Permalink

How To Win A Medal (Or Come Kinda Close)

Another interesting data-centric post appeared over at NCCSEF, and when it comes to data, I just can’t help but comment.

This time we’ve got some slides that seem to be trying to draw a relationship between WJC results and winning a medal at (I believe) the Vancouver Olympic Games. We’re only shown the (partial) results for six skiers, so I’m not sure what exactly the lesson is supposed to be.

We seem to be mixing sprint and distance results together as an indicator for future success. That seems strange to me, but I’m certainly not an expert in that sort of thing. We’ve also selected a curiously successful subset of Olympic medalists to examine. Absent is Pietro Piller Cottrer, whose best (and only) result at WJC was 32nd (admittedly, a long time ago). Also missing is Aino-Kaisa Saarinen, whose WJC results were 15th and 23rd. How about Tobias Angerer (WJC: 18th, 26th, 28th)? On the other hand, we are shown Marcus Hellner, whose WJC results were good but not spectacular: 15th and 21st.

The further information provided at the bottom regarding time to an athlete’s first podium also contains mostly skiers who achieved this feat fairly young, but then also two who did not (Gaillard and Rickardsson).

What am I to learn from this? That the right path is to podium at WJC (Northug), except when it isn’t (Bjørgen, Haag)? That the right path is to be successful early in your 20s on the WC (Northug, Harvey), except when it isn’t (Gaillard, Rickardsson)?

When I read stuff like this, I’m left feeling mostly confused, like I’ve been presented with a bunch of data, but no one has gone to the trouble of transforming this data into information. The reader is left alone, drifting in a sea of numbers, wondering what exactly the author’s point was.

I’m absolutely not going to argue with the idea that skiers who show considerable promise early on are more likely to develop into successful WC skiers. Indeed, I’m less interested in the nuts and bolts of what results mean at a given age than I am in effective and clear presentation of data.

I’ve written about connections between WJC results and medals on another occasion, and I tried to emphasize the fact that when you look at all the data, there’s certainly a connection, but the different paths that skiers take toward success can vary so much that it’s difficult to create many useful generalizations just from the data.

But let’s revisit this idea with a few simple approaches and see if we can organize the data in a way that’s informative (and maybe interesting too!). First, I’m going to broaden the scope from medals to top ten results at either Olympics or World Championships. The problem with looking only at medalists is that there are just too few of them. Much can be learned by imitating a single good skier, but there’s always the danger that what worked for them only worked because of something unique about them, rather than having stumbled across some universal truth of skiing.

Let’s tackle the connection between WJC results and whether or not someone achieves a top ten result at the Olympics or World Championships. I fit a simple model (actually, not so simple; no OLS regressions here!) and plotted the model’s predictions for the probability of a top ten result at a major championship based on that athlete’s best result at WJC (sprint or distance): Continue reading ›
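For readers who want a feel for this kind of model, here is a rough sketch using a plain logistic regression as a stand-in. This is an assumption: the post does not say which model was actually fit, only that it was not OLS. The data below are entirely hypothetical, and the names `fit_logistic` and `predict` are made up for illustration.

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit p(y=1 | x) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def predict(b0, b1, x):
    """Predicted probability of a top ten result for predictor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Hypothetical data: each skier's best WJC place, and whether they later
# achieved a top ten at a major championship (1) or not (0).
best_wjc_place = [1, 2, 3, 5, 8, 10, 12, 15, 20, 25, 30, 35, 40, 50, 60]
top_ten_later = [1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

# Rescale places (divide by 10) so the gradient steps stay well behaved.
scaled = [place / 10 for place in best_wjc_place]
b0, b1 = fit_logistic(scaled, top_ten_later)

for place in (1, 10, 30, 50):
    print(place, round(predict(b0, b1, place / 10), 2))
```

The output is a probability for each WJC place rather than a yes/no rule, which fits the idea that a connection exists but individual paths vary a lot.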

Tagged Analysis, medals, podium, prediction, technical

2011 05 18 Joran Uncategorized Comments (0) Permalink

Giro d’Italia – Rest Day 1

It’s not skiing, I know, but I do like to entertain myself by making pretty graphs. So like last summer I’ll be doing some graphical displays for some big stage races. I won’t provide any discussion or analysis, just the graphs for you to enjoy, pass along or share as you please.

I do try to take requests, though, so if there’s a particular graph or analysis you’d like to see, let me know in the comments and I’ll do my best to oblige. Keep in mind, though, that I don’t have access to anything beyond the simple results from each stage.

Here’s a look at the whole field:

Each line is a single rider, and the times correspond to the overall, or GC, category after the completion of each stage. Lines heading down are the good riders, and vice versa. I included “results” for the neutralized Stage 4, so all those lines remain constant since nothing changed. (Indeed, I simply extended every line, even if that person did not continue past Stage 3.) A higher quality (PDF) version can be found here.
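As a rough sketch of how lines like these can be built from nothing but stage results: accumulate each rider’s stage times, then measure everyone against the leader after each stage. The rider names and times are hypothetical, and measuring against the leader is an assumption; the graph itself may use a different reference.

```python
from itertools import accumulate

# Hypothetical stage times in seconds. The last stage is treated as
# neutralized: zero time for everyone, so no gap changes.
stage_times = {
    "rider_a": [3600, 7200, 5400, 0],
    "rider_b": [3610, 7190, 5460, 0],
    "rider_c": [3650, 7300, 5400, 0],
}

def gc_gaps(times_by_rider):
    """Return each rider's cumulative gap to the GC leader after every stage."""
    cumulative = {r: list(accumulate(t)) for r, t in times_by_rider.items()}
    n_stages = len(next(iter(cumulative.values())))
    leader = [min(c[i] for c in cumulative.values()) for i in range(n_stages)]
    return {r: [c[i] - leader[i] for i in range(n_stages)]
            for r, c in cumulative.items()}

gaps = gc_gaps(stage_times)
for rider, gap in gaps.items():
    print(rider, gap)
```

Because the neutralized stage adds the same time to everyone, each rider’s line simply stays flat across it.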

The following graph is roughly the same concept, but broken down by team (click for larger version): Continue reading ›

Tagged cycling, giro d'italia

2011 05 16 Joran Cycling, Fun Stuff Comments (0) Permalink

Are My Lists Of Improved Skiers Correlated With Age?

A commenter asked if there was a connection between my lists of improved and un-improved skiers and age. Easy enough to check, so let’s go to the tape. Here are the distance skiers:

The y-axis (Improvement Index) has no units, so don’t try to interpret it. It’s just the mash-up of several ways of measuring changes in performance I used to rank the skiers. Lower values are bad (un-improvement); higher values are good (improvement). The relationship is statistically significant for both men and women, although as you can see from the scatterplot, the practical lessons we can draw from it are limited.

Older skiers are, on average, more likely to see a year-over-year decline, which isn’t terribly surprising. Using this measure, it’s difficult to interpret the magnitude of the decline, since my “Improvement Index” doesn’t have any sensible units. So it’s not really clear what a decline of 10 on this index means, practically speaking.
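One way to see why a real trend can still have limited practical value is to look at the correlation and the share of variance it explains. This is a minimal sketch on hypothetical ages and index values, not the data behind the graphs, and `pearson_r` is just an illustrative helper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical skiers: age and a unitless improvement index.
ages = [21, 23, 25, 27, 29, 31, 33, 35]
improvement_index = [6, -4, 9, -2, 3, -8, 1, -7]

r = pearson_r(ages, improvement_index)
print("r   =", round(r, 2))
print("r^2 =", round(r * r, 2))  # share of variance in the index explained by age
```

A negative r says older skiers tend to decline, but when r squared is well below 1, age alone tells you little about any individual skier.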

The picture for sprinting is quite different:

Tagged Age, Distance, men, most improved, most unimproved, Sprint, women, World Cup

2011 05 13 Joran Analysis, Cross Country Comments (1) Permalink

Most Un-Improved: Sprint 2011

The final installment in the most improved/un-improved series. This time it’s the sprinters who saw the biggest drop-off in performance this season over last. Again, you can consult the first in the series for a detailed description of my methodology. Basically, though, the following aspects are included:

A skier’s median result
The average of the skier’s best few results
Changes from season to season are adjusted to be relative to their specific performance the previous season; so a drop from 5th to 10th is roughly equivalent to a drop from 10th to 20th.
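The three ingredients above can be sketched as follows. This is only an illustration: the exact formulas are not given here, so the ratio used for the relative change and the choice of three “best few” results are assumptions, and the function names are made up.

```python
from statistics import mean, median

def season_summary(places, n_best=3):
    """Median result and average of the n_best best results for one season."""
    ordered = sorted(places)
    return {"median": median(ordered), "best_avg": mean(ordered[:n_best])}

def relative_change(previous, current):
    """Change expressed relative to last season's level (assumed to be a ratio).

    A value above 1 means a drop-off, below 1 an improvement.
    """
    return current / previous

# A drop from 5th to 10th counts the same as a drop from 10th to 20th.
print(relative_change(5, 10), relative_change(10, 20))

# Hypothetical season of sprint results.
print(season_summary([12, 3, 7, 20, 5]))
```

Using a ratio rather than a raw difference is what makes a five-place slide near the front count as much as a ten-place slide further back.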

Here are graphs that display the 12 men and women with the biggest overall drop-off, in order from left to right, top to bottom (click through for full versions): Continue reading ›

Tagged most unimproved, Sprint, World Cup

2011 05 11 Joran Cross Country Comments (1) Permalink
