Opta Spotlight: Investigating the "curse" of the MLS Supporters' Shield


The Opta Spotlight series continues at MLSsoccer.com as Opta's data scientists provide a detailed look at performance and apply their world-renowned soccer analysis to MLS.
“We're really proud of the work we do with clubs and leagues all over the world and we look forward to bringing the same top quality insight to MLS fans,” said Opta EVP Angus McNab.
Opta has been the official data provider of MLS for six years running. For more of Opta’s MLS analysis, follow @OptaJack on Twitter and visit the OptaPro blog.

With only three matches remaining, FC Dallas look set to win the club's first Supporters' Shield title, claimed annually by the best team during the regular season. This theoretically gives them the best chance of winning MLS Cup, as they’ll have home-field advantage throughout the playoffs, and if we buy the old adage “the table never lies,” they should have the best team on the field as well. It seems a potent combination.

The problem from a Dallas fan's point of view is that things have rarely gone smoothly for Shield winners.

Shield winners have only won six of the 20 MLS Cups to date, and in the past 12 seasons Shield winners have reached MLS Cup only twice. This poses the question, "Are the best teams in the regular season just unable to perform in the playoffs, or is the Supporters' Shield a poor indicator of the best team in the regular season?"

In this decade only the LA Galaxy in 2011 have won the Shield/MLS Cup double, and in fact that Galaxy team are the only regular-season conference champions to have even made the Cup final itself.

One of the best ways of disentangling luck and actual performance levels is using Expected Goals (xG). xG measures chance quality: how likely a particular shot is to be scored based on distance to the goal, angle to the goal, whether or not it was a header, whether or not it was assisted and a variety of other factors. It is actually a better indicator of future goals than goals themselves. Intuitively, Expected Goals is a measure that corresponds to how we think about whether a team has outplayed an opponent – which side created higher quality chances? – and in the data it has proven to be a useful way of predicting future performances.

A shot’s xG value is the likelihood of that shot being scored. So by simulating every single shot hundreds of times we can find the most common outcome of each game. Extend the idea further and, by simulating the entire season, we can find each team's median position in the table and median points total for any year for which we have adequate data.
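To make the idea concrete, here is a minimal sketch of simulating a single match from per-shot xG values. It assumes each shot is an independent coin flip weighted by its xG, which is a simplification and not necessarily how Opta's own simulation works. The function name `simulate_match` and the shot values are hypothetical, chosen purely for illustration.

```python
import random
from collections import Counter


def simulate_match(home_xg, away_xg, n_sims=1000, seed=0):
    """Replay one match n_sims times from per-shot xG values.

    Assumption: every shot is an independent Bernoulli trial whose
    success probability equals its xG value.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    outcomes = Counter()
    for _ in range(n_sims):
        # A shot is "scored" when a uniform draw falls below its xG.
        home_goals = sum(rng.random() < p for p in home_xg)
        away_goals = sum(rng.random() < p for p in away_xg)
        if home_goals > away_goals:
            outcomes["home_win"] += 1
        elif away_goals > home_goals:
            outcomes["away_win"] += 1
        else:
            outcomes["draw"] += 1
    return outcomes


# Hypothetical shot lists (illustrative values, not real match data).
home_shots = [0.35, 0.12, 0.08, 0.40, 0.05]
away_shots = [0.07, 0.10, 0.03]

result = simulate_match(home_shots, away_shots)
most_common_outcome = result.most_common(1)[0][0]
print(result, most_common_outcome)
```

The counter holds how often each result occurred across the simulated replays, and the most frequent entry is the "most common outcome" described above.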

This may seem complicated. But essentially all this does is look at the quality of chances each team created and conceded throughout the course of the season, and from these chances, find their most likely finishing position.
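For readers who want to see the mechanics, the season-level version might be sketched as follows. This is an illustration under stated assumptions, not the actual Opta model: shots are treated as independent, points follow the usual three for a win and one for a draw, and ties in the table are broken alphabetically rather than by real MLS tiebreakers. The name `simulate_season` and the toy fixtures are hypothetical.

```python
import random
from statistics import median


def simulate_season(matches, n_sims=500, seed=0):
    """Return {team: (median_points, median_position)}.

    matches: list of (home, away, home_shot_xgs, away_shot_xgs).
    Assumption: each shot is an independent Bernoulli trial.
    """
    rng = random.Random(seed)
    teams = sorted({team for m in matches for team in m[:2]})
    points_runs = {t: [] for t in teams}
    position_runs = {t: [] for t in teams}

    for _ in range(n_sims):
        pts = dict.fromkeys(teams, 0)
        for home, away, home_xg, away_xg in matches:
            home_goals = sum(rng.random() < p for p in home_xg)
            away_goals = sum(rng.random() < p for p in away_xg)
            if home_goals > away_goals:
                pts[home] += 3
            elif away_goals > home_goals:
                pts[away] += 3
            else:
                pts[home] += 1
                pts[away] += 1
        # Simplified ranking: points first, then team name as tiebreaker.
        table = sorted(teams, key=lambda t: (-pts[t], t))
        for position, team in enumerate(table, start=1):
            points_runs[team].append(pts[team])
            position_runs[team].append(position)

    return {
        t: (median(points_runs[t]), median(position_runs[t])) for t in teams
    }


# Hypothetical three-team league (illustrative xG values only).
toy_matches = [
    ("A", "B", [0.40, 0.20], [0.10]),
    ("B", "A", [0.15], [0.30, 0.25]),
    ("A", "C", [0.50], [0.05, 0.05]),
    ("C", "A", [0.10], [0.35]),
    ("B", "C", [0.20, 0.10], [0.20]),
    ("C", "B", [0.15], [0.15, 0.10]),
]
summary = simulate_season(toy_matches)
print(summary)
```

Taking the median across simulated seasons, rather than the mean, keeps a handful of freak seasons from distorting a team's typical finish.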

Using xG, I went back and looked at the past five MLS seasons to find the ‘best’ regular-season teams and see if they actually do any better in the playoffs than the Shield winner. Simulating the past half-decade yields the following results:

Already we do a better job of predicting postseason outcomes than merely looking at the regular-season table. The MLS Cup winners in the past three seasons were all the ‘best’ team in their conference in the regular season by this method. In addition, LA consistently finished high in these simulations, which better reflects the fact they’ve won three of the past five MLS Cups.

There is still a problem with this method though – one that is often brought up as a critique of equating the Supporters' Shield winner to the best team in the regular season – and it is this: Strength of schedule is uneven, which is a natural consequence of MLS's conference structure and unbalanced schedule. On the plus side, this allows for more rivalry games and inter-conference play, but it also makes it harder to directly compare teams’ point totals at the end of the season.

To remove this potential source of bias, we can create a pseudo-balanced schedule by looking only at a subset of games. In the 2011 season each team played a perfectly balanced schedule (playing each other team once at home and once away) so there is no need to re-balance that schedule. However, from 2012 onwards each team played the teams in the other conference only once, and each team in their own conference once at home and once away, with a few extra intra-conference games.

This means we can create a balanced conference schedule by only looking at the games each team played against others in their own conference (if a team played the same fixture twice in a season, the one that occurred later in the season was kept), which lowers our sample from a 34-game season to an 18-game season.
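The filtering rule described above could be sketched like this. It is a hedged illustration: the field names (`date`, `home`, `away`), the function name `balanced_conference_schedule`, and the placeholder teams and dates are all assumptions made for the example, not the format of the real data.

```python
from datetime import date


def balanced_conference_schedule(games, conference):
    """Keep one home and one away game per same-conference pairing.

    games: list of dicts with 'date', 'home' and 'away' keys.
    conference: dict mapping team name -> conference name.
    If the same fixture (same home side, same away side) was played
    more than once, only the latest occurrence is kept.
    """
    latest = {}
    for game in games:
        # Drop games between teams from different conferences.
        if conference[game["home"]] != conference[game["away"]]:
            continue
        fixture = (game["home"], game["away"])
        if fixture not in latest or game["date"] > latest[fixture]["date"]:
            latest[fixture] = game
    return sorted(latest.values(), key=lambda g: g["date"])


# Placeholder teams and dates, for illustration only.
conference = {"West A": "West", "West B": "West", "East A": "East"}
games = [
    {"date": date(2015, 3, 7), "home": "West A", "away": "West B"},
    {"date": date(2015, 4, 4), "home": "West A", "away": "East A"},
    {"date": date(2015, 5, 2), "home": "West B", "away": "West A"},
    {"date": date(2015, 8, 1), "home": "West A", "away": "West B"},
]

balanced = balanced_conference_schedule(games, conference)
for game in balanced:
    print(game["date"], game["home"], "v", game["away"])
```

In this toy input the cross-conference game is dropped and the March meeting is replaced by the August repeat of the same fixture, leaving one home and one away game between the two Western teams.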

This doesn’t allow us to crown a simulated Shield winner, since teams in different conferences no longer play each other, but it gives us two simulated regular-season conference winners. Only looking at this subset creates some issues as we aren’t looking at the whole season, but it allows us to compare teams playing the same fixtures.

In this simulated, pseudo-balanced schedule, the MLS Cup champion ‘won’ their conference in two out of four seasons. It's not perfect, but remember that in the actual table the regular-season conference winner hasn't even appeared in MLS Cup during these four seasons.

What does this tell us? Well, while the Shield winner may have a lousy record in the playoffs, it doesn’t mean there is no link between playing well in the regular season and winning MLS Cup. If we look at xG, the teams that perform best in their conference often come out as MLS Cup winners. If we look at points? Nope.

So how excited should FC Dallas be about their overwhelmingly likely Shield win? Well, any time you win a title, you pop champagne. That's the rule. But there is good news for FC Dallas fans beyond that, since Oscar Pareja's squad also look pretty good using the simulated xG method. If we simulate every game in MLS so far this season, Dallas have the best record in the Western Conference – while TFC, who lead the East, come out on top overall.

Thus the table is telling a relatively accurate tale. Dallas will surely hope it has a different ending, however, than in recent seasons.

Sam Gregory works as a Data Scientist with Opta and OptaPro. Originally from Canada, Sam moved to London last year but still keeps a close eye on MLS and hopes Cyle Larin will never stop scoring.