Results like these often convince people that preseason forecasts are basically useless. As you've probably been told repeatedly by various announcers and baseball scribes, the game is played on the field, not on a spreadsheet. However, I thought it would be instructive to look back at the forecasts from the beginning of the season and see how well they anticipated team performance.
To do this, however, we're not going to compare the projected standings to the actual standings, because a team's record is essentially a function of two things: how many hits, walks and other positive events a team creates relative to how many they give up, and the timing of when those events occur. The first one is what projection systems specialize in forecasting, but they really have no way of knowing which teams will tend to bunch their hits together, or distribute their runs in such a way as to win a bunch of close contests.
The timing aspect of a win-loss record is basically random, and since there's no real way to project it in advance, we don't want to judge a projection on results that randomness has influenced. Instead, we're better off looking at just the quantity and value of the baserunners a team produced over and above what it gave up, and evaluating the preseason forecasts based on how well they match what a team's expected record would be without the timing effects that can skew runs and wins. After all, that is really what the forecasts are trying to measure.
At FanGraphs, we publish the seasonal data from a model called BaseRuns, which takes all of the events a team creates and allows and turns them into an expected runs scored and runs allowed total. Based on those numbers, we can come up with an expected winning percentage that doesn't factor the timing of events into the results. So that's what we'll use to measure the team forecasts.
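For readers who want to see the arithmetic, here is a minimal sketch of how expected run totals can be turned into an expected winning percentage. It assumes the classic Pythagorean expectation with an exponent of 2. The text above does not say which conversion FanGraphs applies to its BaseRuns totals, so the formula choice, the function name and the sample run totals are all illustrative assumptions.

```python
def expected_win_pct(runs_scored: float, runs_allowed: float,
                     exponent: float = 2.0) -> float:
    """Pythagorean expectation: RS^x / (RS^x + RA^x).

    Assumption: exponent 2 is the classic textbook value, not
    necessarily the one FanGraphs pairs with BaseRuns.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical run totals, not taken from any real team.
print(round(expected_win_pct(700, 650), 3))  # 0.537
```

Because the input is expected runs rather than the actual win-loss record, the order in which those runs were scored never enters the calculation, which is the point of the exercise.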
Here are the top five teams that have outperformed their preseason expected winning percentages:
The A's and Angels look like maybe the two best teams in baseball right now, so it might not be easy to remember that both teams looked like just solid contenders rather than juggernauts back at the start of the year. The Orioles are indeed surprise contenders, as they were forecast to finish last in the AL East but are playing like a real playoff team. It's worth noting, however, that the AL East is a great example of how the timing of events can change the outcome; the Rays actually have a slightly higher expected winning percentage than the Orioles, yet are 10 1/2 games behind them in the standings. Timing can matter an awful lot.
The Mariners and Astros round out the top five, meaning that, interestingly, four of the top five overachievers come from the American League West. What could be causing so many teams in one division to do better than expected? Well, let's go to the underachievers:
Hello, Texas. The FanGraphs preseason forecasts had the Rangers as the sixth-best team in baseball; instead, they've easily been the worst, and it isn't even really close. The 2014 Rangers aren't just the biggest flop of the season; they're one of the biggest flops in recent history. While the A's, Angels and Mariners are all having strong seasons on their own merit, each has certainly benefited from the collapse of a team that was supposed to be a legitimate contender.
The Red Sox, Diamondbacks, Rockies, and Yankees are also in the underachiever group, but only Boston's struggles come close to matching the Rangers' failure. In each of these cases, injuries played a significant role in the team's failure to live up to expectations, and injuries will likely never be something that a computer model -- or even human beings for that matter -- can predict with any success. If a team loses several of its best players for significant periods of time, the forecasts are going to be off, and there's not much anyone can do about that. It's just part of the game.
However, notice that even when we look at the Yankees, we're not really dealing with a huge variance from the preseason forecast. The difference between the Yankees' forecast winning percentage and their BaseRuns expected winning percentage amounts to five wins over the course of the year, and they're one of the teams that the projections missed on the most. In reality, for most of the teams in baseball, the preseason forecasts actually match up pretty well with their timing-independent results.
Seventeen of the 30 teams in MLB have an expected winning percentage within 2.5 percentage points of their preseason forecasts -- which amounts to plus or minus four wins over the course of a full season -- including some teams who might look like surprises or disappointments based on their current records. For instance, the current narrative is that the Detroit Tigers have fallen apart and it's time to panic in Detroit, but their BaseRuns expected winning percentage (54.8%) is almost an exact match for the forecasting system's preseason expectation (54.1%). Their recent tailspin has basically brought them back into line with what the projections thought they would do before the year began.
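The conversion from percentage points to wins is simple enough to check by hand, but a short sketch makes it explicit. It assumes a 162-game season; the function name is just for illustration.

```python
GAMES_PER_SEASON = 162  # length of a full MLB regular season


def points_to_wins(pct_points: float, games: int = GAMES_PER_SEASON) -> float:
    """Convert a gap in winning percentage (in points) to wins over `games`."""
    return pct_points / 100 * games


# The 2.5-point band used above works out to about four wins.
print(round(points_to_wins(2.5), 2))  # 4.05

# Detroit: 54.8% by BaseRuns vs. 54.1% forecast, about one win over a full year.
print(round(points_to_wins(54.8 - 54.1), 1))  # 1.1
```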
What about the Royals, who have capitalized on the Tigers' slump to take over the lead in the AL Central? Believe it or not, they're actually underachieving relative to preseason expectations once you strip the timing of events out of the picture. The preseason estimate had the Royals as the most average team in baseball, expecting them to win exactly half their games. To this point of the season, their expected winning percentage by BaseRuns is just 48.7%, and the fact that they've won 54.2% of their games is almost entirely attributable to their clutch performance. Just going by the total expected runs scored and allowed, the Royals have been even more mediocre than expected, but they've bunched their hits together in order to win more games than their overall line would suggest.
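One way to see the Royals' season is to split the gap between forecast and actual record into two pieces: the part the projection got wrong, and the part that comes from timing. The sketch below does that with the three percentages quoted above. The function and variable names are hypothetical labels for this example, not FanGraphs terminology.

```python
def split_record(forecast: float, expected: float, actual: float):
    """Split (actual - forecast) into forecast miss and timing effect.

    All inputs are winning percentages in points (e.g. 50.0 for .500).
    """
    forecast_miss = expected - forecast  # what the projection got wrong
    timing_effect = actual - expected    # sequencing of events, i.e. "clutch"
    return forecast_miss, timing_effect


# Royals: 50.0% forecast, 48.7% by BaseRuns, 54.2% actual.
miss, timing = split_record(50.0, 48.7, 54.2)
print(round(miss, 1), round(timing, 1))  # -1.3 5.5
```

Only the first number is a fair grade of the forecast. The second is the piece that no projection system claims to predict.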
Overall, the season's results to date confirm that preseason forecasts are imperfect, and the divergent paths of the A's and Rangers are a reminder of how differently seasons can go for teams that look similar in March. However, the data also shows that the forecasts do a pretty decent job overall, as one standard deviation -- a statistical measure of how widely results are spread -- is just 4.8 percentage points. For most teams, the preseason forecasts are going to be an okay guide to what the season will bring. There are always going to be teams that play far better or worse than the numbers suggest, but outside of the AL West, things have actually been pretty normal.