Defending the WAR

Well, it’s getting to be award voting season again, which means that it’s probably time to start hearing about WAR again. Over the past two years, particularly in the AL MVP race, WAR became a lightning rod topic. It was especially notable in 2012, when Miguel Cabrera became the first Triple Crown winner since the Johnson Administration, even as Mike Trout far outpaced him in WAR. People knew how rare a Triple Crown was, but what exactly is this WAR? And why did so many columnists make such bad puns in their headlines about it?

Let me put some cards on the table. I’m a pro-WAR partisan. I think it’s a fantastic stat, but I know that not everyone feels the same. When I read and hear discussions of WAR, the reaction I most often encounter is confusion. WAR isn’t an intuitive stat. Batting average, for all its flaws, has been around forever, everyone knows what’s good and what’s not, and you can calculate it with about eight keystrokes on a basic solar-powered calculator. But WAR, while a bit complex, isn’t impossible to figure out.

Here’s what I propose. I’d like to lay out the basics of what WAR is. Not with all the crazy math. Just the ideas behind it. The math itself is just a reflection of those ideas. If you buy the theory, the rest is just a bunch of button pushing. Reader, you can make up your own mind from there. Sound fair?


Giancarlo Stanton, goat herder

There’s an old saw about how to know whether your job is valuable. What would happen if you didn’t show up for work? That’s where WAR starts. What would have happened if Giancarlo Stanton had, on the last day of spring training this past March, said to his fellow Marlins, “You know what, guys … just not feeling it this year. I’m going to herd goats instead.” The Marlins, needing a right fielder, would do the logical next best thing and replace him with someone.

(Since Stanton’s recent injury, Ed Lucas, Reed Johnson, Jordany Valdespin and Garrett Jones have taken turns in right field.) The same goes for any other player. Teams would fill in with some guy from the bench or from their Triple-A team or from the waiver wire. These fill-ins form a gigantic blob that we can refer to as “replacement players.” Now, Lucas, Johnson, Valdespin and Jones would not leave the Marlins with zero production. Between the lot of them, it certainly wouldn’t be Stanton-esque, but it would be something. The difference between what Stanton actually did and that replacement level of production is what we want to figure out.

In a vacuum …

We don’t actually compare Stanton to the players who really replaced him. If Stanton had gone off on his goat-herding expedition but the Marlins just happened to have Roberto Clemente in their farm system to take over in right field, we wouldn’t penalize Stanton for that, nor would we give him extra credit because the only right fielder that the Marlins were actually able to scratch up was … well, Reed Johnson. People criticize WAR for this fact, and it’s a fair critique. We’re specifically trying to compare players to a common baseline. In this case, it’s all substitute right fielders (because fourth and fifth outfielders all get some playing time) in MLB mushed together in one big blob. There are different versions of WAR, and they mathematically make that adjustment in different ways, but that’s the basic idea.

We look specifically at substitute right fielders because we know there are differences overall among the positions in terms of what sort of production we can expect from each. If you need a replacement first baseman, it’s easier to find one who won’t embarrass himself at the plate than it is to find a spare shortstop who can do the same. Shortstop is really hard to play. So, we always adjust our expectations for what “replacement level” would be based on the position where the player normally hangs out.

Nickels, dimes, strikeouts, and doubles

Now, let’s go back and think about everything that Giancarlo Stanton did in 2014, starting on Opening Day. And let’s open up a bank account for him where we will put his contributions. If we go back to that day (a 10-1 win for the Marlins over the Rockies), we find that Stanton went 2 for 5.

His day, in a nutshell:

1st inning – strikeout swinging

4th inning – flyball to right field

5th inning – with the bases loaded, an infield single, scoring a run

6th inning – groundout to second

8th inning – double to right field, scoring a run

Let’s look at that strikeout and ask “What’s that worth?” Well, the currency of baseball is runs, so we need to figure out how many runs a strikeout is worth. It may seem like a silly question, but it’s one that we can actually answer. How? A strikeout, obviously, puts an out on the board and really doesn’t do anything else that’s helpful. If there are runners on base, they likely have to stay put. In Stanton’s case, his strikeout came with the bases empty and two outs. In 2014, with the bases empty and two outs, the average MLB team faced with that situation scored about 0.09 runs over the rest of the inning. Yes, there are no hundredths of a run in baseball, but that’s the average. So, Stanton’s strikeout took the Marlins from a situation where they might have expected 0.09 runs (on average) to a situation where they are out of the inning. That particular strikeout was worth -0.09 runs.
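For readers who like to see the button pushing, here is a minimal sketch of that arithmetic. The table holds only the single 2014 figure quoted above (bases empty, two outs); the names `RUN_EXPECTANCY` and `run_value` are made up for illustration, and a real table would cover every combination of runners and outs.

```python
# Expected runs over the rest of the inning, keyed by (runners, outs).
# Only the one state quoted in the article is filled in.
RUN_EXPECTANCY = {("empty", 2): 0.09}

def run_value(before, after=None, runs_scored=0.0):
    """Change in expected runs caused by a single play.

    after=None means the play ended the inning, so no runs are expected.
    """
    re_before = RUN_EXPECTANCY[before]
    re_after = 0.0 if after is None else RUN_EXPECTANCY[after]
    return runs_scored + re_after - re_before

# Stanton's first-inning strikeout: bases empty, two outs, inning over.
strikeout_value = run_value(("empty", 2))
print(round(strikeout_value, 2))  # -0.09
```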

But here’s the thing. We want to give Stanton credit for only the things he controls. He doesn’t get to pick when he comes up to the plate (the lineup does) or what his teammates did before him. We know that Stanton himself struck out, but WAR doesn’t penalize or give him extra credit based on what the two guys up before him (Stanton hit third that day, like always) did in their at-bats. Instead, we look at the lost value from all strikeouts league-wide. Again, there are different ways to do this mathematically, but every time that Stanton strikes out, we debit him roughly a quarter of a run. That debit is applied to his bank account.

For a more positive example, let’s look at Stanton’s eighth-inning RBI double. A double is no doubt a good thing for the offense. It put Stanton on second, with a better chance to score than when he was just a batter. It brought Christian Yelich in to score from second. It didn’t produce an out. All told, pretty good deal and a nice credit for Stanton’s bank account. But again, how much credit? The fact that Stanton drove in Yelich is nice, but Stanton had nothing to do with Yelich getting to second in the first place (Yelich had doubled earlier in the inning).

We get around this problem in much the same way as we did with the strikeout. We know that Stanton hit the double. And doubles are nice because if you hit one, any runners that are on base tend to score, plus the batter ends up in a nice spot himself and has increased his own chances of scoring a run. But what if Stanton played on a team that had runners on base all the time? What if he played on a team that never had runners on? His double is less valuable in the absolute sense for the second team, but should we penalize him for lousy teammates? WAR says no. Again, we look at the effect that all of the doubles hit league-wide had (some had runners on, some did not) and we give Stanton that much credit, usually about three-quarters of a run for a double.
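The bank-account idea can be sketched in a few lines. This is only an illustration, using the two rough league-wide values mentioned so far (about a quarter of a run debited per strikeout, about three-quarters of a run credited per double); the names `RUN_VALUES` and `account_balance` are invented here, and the real versions of WAR use more precise weights for every kind of event.

```python
# Rough context-neutral run values quoted in the article.
RUN_VALUES = {
    "strikeout": -0.25,  # debit of roughly a quarter of a run
    "double": 0.75,      # credit of roughly three-quarters of a run
}

def account_balance(events):
    """Add up league-average run values, ignoring who was on base."""
    return sum(RUN_VALUES[event] for event in events)

# Just the two Opening Day plays discussed so far.
balance = account_balance(["strikeout", "double"])
print(balance)  # 0.5
```

Notice that the function never asks how many runners were on or how many outs there were. That is the whole point: the same double earns the same credit on any team.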

But wait, there’s more!

We’ve covered what Giancarlo Stanton did at the plate on Opening Day, but what about the other things he did? He also ran the bases. How much value is in that?

Let’s go back to that fifth-inning single. The next batter up, Casey McGehee, hit a double to left field. Because McGehee doubled, we assume that Stanton will make it to third. But Stanton kept going and scored. You could make the case that by doing that, Stanton actually “stole” an extra base. In that particular case, there was one out. Had Stanton stayed at third, the Marlins would have been expected to score 1.27 runs. With a runner on second and one out, the Marlins could expect 0.62 runs, plus Stanton’s already scored run (a total of 1.62 runs). By taking the extra base, Stanton brought the Marlins an extra 0.35 runs in that particular situation. Usually, we compare Stanton’s exploits to the league average runner. Given that there was one out, we might look to see what percentage of runners stay at third, what percentage make it home, and what percentage get thrown out at home. From that, we can figure out how much extra value the “average” runner brings in this situation. Anything over and above that gets deposited in Giancarlo’s piggy bank. We can also do the same for stolen bases, tagging up on fly balls, advancing on groundouts, and anything else where players can take an extra base.
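Here is a rough sketch of that baserunning arithmetic. The 1.27 and 0.62 figures come straight from the paragraph above. The league-wide percentages and the run expectancy after a runner is thrown out at home are NOT from the article or from real data; they are placeholder numbers, invented purely to show how the comparison to an average runner would work.

```python
# Run-expectancy figures quoted in the article (one out).
RE_HELD_AT_THIRD = 1.27        # Stanton stops at third
RE_AFTER_SCORING = 1.0 + 0.62  # his run counts, plus a runner left on second

def average_runner_value(outcomes):
    """Expected runs for a typical runner.

    outcomes is a list of (probability, expected_runs) pairs.
    """
    return sum(prob * runs for prob, runs in outcomes)

# Value of scoring compared with simply holding at third.
stanton_vs_holding = RE_AFTER_SCORING - RE_HELD_AT_THIRD
print(round(stanton_vs_holding, 2))  # 0.35

# HYPOTHETICAL league-wide outcome mix: placeholder numbers only.
league_outcomes = [
    (0.50, RE_HELD_AT_THIRD),  # assumed share of runners who hold at third
    (0.45, RE_AFTER_SCORING),  # assumed share who score
    (0.05, 0.90),              # assumed share thrown out, with made-up run value
]
baseline = average_runner_value(league_outcomes)

# Credit for Stanton: what he produced minus what an average runner produces.
credit_vs_average = RE_AFTER_SCORING - baseline
print(round(credit_vs_average, 3))
```

With different (real) percentages the credit would change, but the shape of the calculation would stay the same.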

In addition to his exploits on the bases, Stanton also caught a fly ball in the eighth inning. Sure, lots of fly balls to right field get caught, but not all of them. And in theory, the good fielders catch more of them than the bad ones. Stanton, after all, fielded a single to right off the bat of Justin Morneau in the fourth inning and a single by Michael Cuddyer in the eighth. Why didn’t he catch those?

Let’s be fair to Mr. Stanton. The singles might have been line drives that no human being could have caught. Or they might have been hit a foot to the left of where he had been standing, where any idiot with a glove should have gotten the job done. The best defensive metrics try to control for this. If we know that a fly ball was in the air and we know roughly where it landed, we can at least get some idea of how difficult a catch it was. The more information we have, the better we can make that determination, and sometimes, we have to go with a “best guess.” There have been critiques of the defensive metrics that are used in these cases. Many of those critiques are valid. WAR isn’t perfect. It’s the best that we can do with the best available information.

Suppose that fly ball that Stanton caught was a ball that your average right fielder would have a 50/50 chance of getting to. If he catches it, it’s obviously an out. If it drops, it’s likely to be a double. While hanging in the air, it’s worth half a double and half an out. A double is worth roughly three-quarters of a run to the offense, and an out in play is worth roughly a quarter of a run to the defense, so the ball in the air is worth about a quarter of a run to the offense. By catching the ball, Stanton turns a quarter run for the offense into a quarter run for the defense, a difference of half a run total. That credit goes in Stanton’s bank account.
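The fielding arithmetic above can be written out as a short sketch. It uses only the rough values from this paragraph (a double at about three-quarters of a run for the offense, an out at about a quarter of a run for the defense) and assumes, as the paragraph does, that a dropped ball becomes a double. The function names are invented for illustration.

```python
DOUBLE_VALUE = 0.75  # runs for the offense if the ball drops
OUT_VALUE = -0.25    # runs for the offense if the ball is caught

def ball_in_air_value(catch_probability):
    """Expected run value to the offense while the ball is still in the air."""
    return ((1 - catch_probability) * DOUBLE_VALUE
            + catch_probability * OUT_VALUE)

def catch_credit(catch_probability):
    """Runs the fielder saves by turning the ball in the air into an out."""
    return ball_in_air_value(catch_probability) - OUT_VALUE

print(ball_in_air_value(0.5))  # 0.25
print(catch_credit(0.5))       # 0.5
```

The same functions show why routine plays earn little: a ball that is caught 100 percent of the time is already worth an out while in the air, so `catch_credit(1.0)` comes out to zero.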

Lather, rinse, repeat and divide by 10

We can do this for all of Stanton’s games in 2014. We look at how much value he put up on offense, defense, and on the bases with all the things that he did (and didn’t do). That’s his raw production. Now, remember that idea of “replacement” level, the level at which the bench/scrub/fringe guys play? We’re going to compare him to that.

Giancarlo came to the plate 638 times in 2014. What would have happened if the Marlins had given those 638 plate appearances to your average scrub right fielder? They would have gotten something, but not as much. Stanton also played 1,262 (and a third!) innings in right field for the Marlins. What would have happened if some bench player had taken those reps? We have an idea of how much value that is, based on the performance of substitute right fielders league-wide.

Mix it all together, and you get a comprehensive look at how many more runs Stanton was worth to the Marlins than a replacement-level player. Usually, the number of runs is converted to be expressed in terms of wins that a player added to the team. The general rule of thumb is that somewhere between nine and 10 runs equals a win. (It mostly depends on how many runs are scored in general around the league that year. When everyone is winning games 2-1, a run is very valuable. If there are a bunch of 12-10 games, one run isn’t quite as big a deal anymore.)
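As a sketch of that last step, here is the subtraction and division in code. The input totals below are HYPOTHETICAL round numbers chosen to make the arithmetic easy; they are not Stanton’s actual 2014 figures. The default of 9.5 runs per win is simply the midpoint of the nine-to-10 rule of thumb, and the function names are invented for illustration.

```python
def runs_above_replacement(player_runs, replacement_runs):
    """How many more runs the player produced than a replacement would have
    in the same playing time (both measured on the same scale)."""
    return player_runs - replacement_runs

def runs_to_wins(runs, runs_per_win=9.5):
    """Convert runs to wins; runs_per_win shifts with the scoring environment."""
    return runs / runs_per_win

# HYPOTHETICAL totals, for illustration only.
rar = runs_above_replacement(player_runs=30.0, replacement_runs=-8.0)
war = runs_to_wins(rar)
print(rar, war)  # 38.0 4.0
```

Passing a lower `runs_per_win` (a low-scoring year) makes the same runs worth more wins, and a higher one makes them worth fewer.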

So, we’ve converted the number of runs above replacement level to the number of wins above replacement. WAR.


WAR is the answer to a very specific question. It might not be your question. It seeks to be a comprehensive (offense, defense, and baserunning) measure of a player’s value. It compares a player not to zero production, but to what backups in general at his position do. It tries to strip away the context around what actually happened (what was Stanton actually worth to the Marlins?) and focus on a common baseline (what would Stanton have been worth if placed on an average team?). It has metrics that compose it that are fairly reliable (mostly the offensive ones), and those that are less reliable (outfield defense, in particular). It has no idea whether a player is a good teammate or what he does outside the white lines. It’s not perfect and some people aren’t comfortable with those imperfections. I’d argue that while perfect would be nice, WAR is at least better than just about any of the commonly used stats. The not-so-satisfying thing is that, because of those uncertainties, if we want to say that Stanton was definitively better than Andrew McCutchen this year, WAR can’t properly tip the balance either way.

The other objection that I often hear is that people don’t like the way that WAR defines “value.” It’s too abstract for some folks, which is everyone’s own prerogative. However, I would simply suggest that if you have been reading this article and thinking that it makes sense, then you might actually like WAR. But, reader, this is the point where I will leave you to your own decision.