One out of every five shots taken from within 10 yards of the net has resulted in a goal in Major League Soccer since the start of the 2012 season, an average that has remained relatively robust through league scoring droughts as well as monsoons.
Conversely, shots from beyond 10 yards have resulted in a goal just once every 15 attempts. It is quite clear that not all attempts are equal, despite the box score tallying them in the same fashion. "Outshooting" your opponent takes a back seat to generating more high-quality opportunities when it comes to winning games.
Distance from goal is a major factor when determining the likelihood of a shot being scored, but it's obviously not the only factor. Attributes such as whether the shot resulted from a header or a direct free kick can positively or negatively influence this measurement. Headers, for example, decrease shot efficiency no matter their location. Set pieces, whether struck directly at goal or worked indirectly, generate chances that are about twice as good as their open-play equivalents.
By evaluating many of the qualifiers that Opta collects on a per-shot basis (and deriving a few more), a predictive shooting model can be constructed using a simple logistic regression. By attaching a probability to every shot taken in the last two seasons, we can get a glimpse into the types of shots that are being generated league-wide.
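A minimal sketch of the kind of model described above, using scikit-learn. The feature names here (distance, header flag, set-piece flag) are hypothetical stand-ins for the Opta qualifiers, and the data is synthetic; the real training set and coefficients are not public.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Hypothetical per-shot features (stand-ins for Opta qualifiers):
X = np.column_stack([
    rng.uniform(3, 35, n),    # distance from goal, in yards
    rng.integers(0, 2, n),    # 1 if the shot was a header
    rng.integers(0, 2, n),    # 1 if the shot came from a set piece
])

# Synthetic goal/no-goal outcomes so the example is runnable:
# scoring probability decays with distance.
y = (rng.random(n) < 1 / (1 + np.exp(0.15 * X[:, 0] - 1.0))).astype(int)

# Fit the logistic regression and attach a probability to a new shot.
model = LogisticRegression().fit(X, y)

# Modeled chance that a 12-yard, non-headed, open-play shot is scored.
p = model.predict_proba([[12, 0, 0]])[0, 1]
```

The same `predict_proba` call, applied to every shot in the data set, is what lets each attempt carry its own scoring probability rather than counting equally in a box score.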
The truth of the matter is this: Players are pretty terrible at selecting shots. Fully 85 percent of shots taken so far in 2013 were projected to have less than a 10 percent chance of resulting in a goal. Those shots account for only 44 percent of the goals scored this season, and the majority of these "bad shots" are speculative drives from considerable distances.
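The 85 percent and 44 percent figures come from an aggregation like the following, shown here on a few made-up shots rather than the real data: take each shot's modeled probability, bucket everything below 10 percent as a "bad shot," and compare the bucket's share of attempts with its share of goals.

```python
# Each shot carries its modeled scoring probability and its outcome.
# These five records are illustrative, not real MLS data.
shots = [
    {"p": 0.04, "goal": 0},
    {"p": 0.08, "goal": 1},
    {"p": 0.22, "goal": 1},
    {"p": 0.03, "goal": 0},
    {"p": 0.31, "goal": 0},
]

# "Bad" shots: modeled chance under 10 percent.
bad = [s for s in shots if s["p"] < 0.10]

share_of_shots = len(bad) / len(shots)                  # 3 of 5 -> 0.6
share_of_goals = (sum(s["goal"] for s in bad)
                  / sum(s["goal"] for s in shots))      # 1 of 2 -> 0.5
```

On the real per-shot table, the same two ratios come out to roughly 0.85 and 0.44.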
A slight caveat, though, is that these shots aren't always "bad" in an absolute sense. Sometimes, shots of this nature are the only realistic option in a particular situation, and other times they do, in fact, lead to a favorable deflection or save for a dogged second attacker in the box. And honestly, when you are chasing a game, it's in your best interest to speculate. But, no matter if these shots have some sort of favorable justification, they simply don't have a good chance of directly resulting in a goal.
Naturally, we become curious about which teams are attempting more of these "good" shots on a regular basis.
| Team | Good shots per game | Percentage of good shots |
| --- | --- | --- |
| Real Salt Lake | 2.17 | 16.5% |
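The two columns above can be computed directly from the per-shot probabilities. The sketch below assumes "good" means a modeled chance of at least 10 percent (the threshold implied earlier); the team names, probabilities, and game counts are made up for illustration.

```python
GOOD_THRESHOLD = 0.10  # assumed cutoff for a "good" shot

# team -> list of modeled scoring probabilities, one per shot (made up)
shot_probs = {
    "Team A": [0.04, 0.12, 0.25, 0.06, 0.18],
    "Team B": [0.03, 0.05, 0.09, 0.30],
}
games_played = {"Team A": 2, "Team B": 2}

# team -> (good shots per game, percentage of shots that are good)
good_rates = {}
for team, probs in shot_probs.items():
    good = sum(1 for p in probs if p >= GOOD_THRESHOLD)
    good_rates[team] = (good / games_played[team],
                        100 * good / len(probs))
```

Ranking teams by either column of `good_rates` reproduces a table of the form shown above.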
New York find themselves at the top of the table, propped up by Tim Cahill and the sheer volume of solid headed chances that the Australian has generated. Remove headers, and New York drop drastically down the table.
A high ranking for the Philadelphia Union is a bit of a surprise given Jack McInerney's summertime scoring drought. But, upon closer investigation, the volume and quality of McInerney's chances haven't fluctuated – it's his ability to convert that has regressed (perhaps to a more sustainable rate).
Perhaps the most interesting insight is the ranking of Sporting Kansas City, who take good shots about half as often as their rivals at the top of the Eastern Conference. But this seems to be a trademark of SKC: they also attempted a relatively large volume of poor shots during their successful run in 2012.
Sporting are finding a way to convert low-efficiency shots by exploiting methods not captured by this shooting model. My guess is that SKC's pressing midfield helps jar the ball loose into situations that are more dangerous than other events that may occur in similar areas of the field.
This illustrates both the face value of this shooting model and the value of approaching it with an adequate level of skepticism. Without the skepticism, we would wrongly assume that Sporting Kansas City's attack mimics those of Columbus and D.C. United. On the other hand, by recognizing which factors the model does control for, we can flesh out New York's lopsided dependence on headers from set pieces.
Analytics is more than numbers in a table. It is the process of discovering patterns, filtering them, and then communicating them effectively. Without that filter, we remain susceptible to interpreting noise in the data as truth.