Another way of measuring a manager
If you talk to a professional baseball player about his lived experience, you’re guaranteed to hear a certain phrase within the first five minutes. Maybe even more guaranteed than hearing phrases like “throw a fastball”, “swing the bat”, or “comport with the platypus.” You’ll hear about “the grind.” By the time a baseball season reaches August, it's hot, he's tired, he's been living out of a suitcase for four months. Every night he has to play a game that requires intense concentration and lasts for three hours. Yeah, I know, it's hard to feel pity for a guy making $10 million per year, but these guys are human and those are rough working conditions, no matter how you slice it.
A couple of weeks ago, I asked whether we could measure the contributions that a manager makes behind the scenes. In that article, I found some evidence that certain managers appeared to be better than others at helping their players bounce back after a loss. But what about "the grind"? One of a manager's jobs is to manage the 25 personalities on the squad and to keep them in tip-top playing shape. It's why managers are so tuned in to giving players a day off here and there to keep them from burning out. Plus, a manager is nominally in charge of making sure that the clubhouse is a fun place to be and that the grind doesn't feel so grindy.
We know that over the course of a season, plate discipline erodes as hitters presumably get more tired. It's hard to get enough sleep out on the road and after a while even a fun job becomes just another job. Well, if players lose plate discipline, they are probably giving away extra strikes, and that can have a significant effect on a team's chances. Maybe there are some managers who are just better at handling the grind, and at minimizing the penalty to be paid for fatigue and boredom at the end of a long season.
Warning! Gory Mathematical Details Ahead!
I took all events from 2009-2013 (2014 isn't available yet) and coded each pitch for several outcomes: whether the batter swung; if he did, whether he made contact; if he didn't, whether it was a called ball or strike; and overall, whether the pitch produced a strike or not. (My standard definition of "plate discipline" is "strikes are bad." A zero- or one-strike foul ball is a bad outcome, but a two-strike foul is just peachy.) For each of those outcomes, I gathered the rates at which both the pitcher and the batter in a given at-bat saw it, and from those created a control variable for how likely a given pitch was to be swung at/contacted/taken for a ball/etc.
Using that, I ran a binary logistic regression on all relevant pitches. To test whether "the grind" really does affect plate discipline, I entered the number of calendar days that had passed since a team's first game as a predictor. The effect might not actually be linear, but if an effect is there, we should at least see some evidence. And we do. The time elapsed since Opening Day was a significant predictor of hitters swinging more, making less contact, and having more strikes recorded against them.
I decided to focus on just the strike-inducing measure and I proceeded to enter managers (as fixed categorical predictors) into the model along with the interaction term of elapsed time by manager identity. We know that overall, as time elapses, the slope of the line describing plate discipline (or strike-avoiding) trends downward, even after we control for the overall talent on the pitcher's mound and in the batter's box. To put it somewhat inelegantly, if we might expect a hitter to avoid a strike 50 percent of the time on the season overall, the expected rate might be 51 percent at the beginning of the season and 49 percent by the end. We can plot what that slope looks like in general, and for each of the managers in the sample.
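For readers who like to see the moving parts, here is a minimal sketch in Python of what a model like that looks like once it's been fit. It is not the actual regression. The function names, the 180-day season length, and the slope value are all assumptions for illustration. The slope is simply backed out of the hypothetical 51-percent-to-49-percent example above, not taken from the fitted model.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def p_avoid_strike(baseline_p, days, league_slope,
                   manager_offset=0.0, manager_slope=0.0):
    """Predicted chance of avoiding a strike on one pitch.

    baseline_p     -- expectation from the batter/pitcher control variable
    days           -- calendar days since the team's first game
    league_slope   -- overall change in log-odds per day (the grind)
    manager_offset -- manager main effect (hypothetical, defaults to none)
    manager_slope  -- manager-by-time interaction, added to the league slope
    """
    eta = logit(baseline_p) + manager_offset + (league_slope + manager_slope) * days
    return inv_logit(eta)

# Assumption: roughly 180 calendar days in a season.
SEASON_DAYS = 180
# Illustrative slope that takes a hitter from 51 percent to 49 percent.
LEAGUE_SLOPE = (logit(0.49) - logit(0.51)) / SEASON_DAYS

for day in (0, 90, 180):
    average = p_avoid_strike(0.51, day, LEAGUE_SLOPE)
    # A manager whose interaction term exactly cancels the league slope.
    holds_line = p_avoid_strike(0.51, day, LEAGUE_SLOPE, manager_slope=-LEAGUE_SLOPE)
    print(f"day {day:3d}: average {average:.4f}, holds the line {holds_line:.4f}")
```

The point of the interaction term is the `manager_slope` piece: a manager with a positive value there flattens or even reverses the league-wide decline, while a negative value makes the slide steeper.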
Ideally, we could find a manager (or 12) who could actually reverse the downward trend. That is, over a season, we would actually see that the players who played under him performed a little better as the year went on. While everyone else was trending downward, his players would be trending upward. At the very least, he would hold the line. It turns out that there are a few of those managers. Again, setting aside the managers who were interims or who only managed for one year in the sample, we see that Mike Matheny actually turns out to be pretty good at this, along with Cito Gaston, Bud Black, Davey Johnson, and Terry Francona.
It's a little complex to explain the effect fully, but let's take a look at a rough estimate of the effect size. Let's consider a pitch that, based on what we know about the pitcher and batter, has a 50 percent chance of producing a strike. On Opening Day, we would assume that for all managers, the chance of avoiding a strike is 50 percent. But now, 90 days into the season (roughly the mid-point), we predict that for some managers (Matheny, for example), the chances of avoiding a strike have gone up a tiny bit and for some, they have gone down a bit (Mike Quade, Cecil Cooper, and Lou Piniella all ranked low on this measure). At 90 days, we would assume that Matheny's hitter (a member of the Cardinals) has a 50.65 percent chance of avoiding a strike. Quade's Cub has a 48.67 percent chance of avoiding a strike. It's the sort of thing that wouldn't be visible to the naked eye. But let's look at Matheny's advantage at this point in the season, relative to a manager who was simply able to hold the line. If his Cardinals avoid a strike on an extra 0.65 percent of pitches, and did so at that rate over the course of a season, it would be roughly an extra 158 pitches (.0065 * 162 games * 150 pitches per game = 157.95) that they avoided a strike on. Again, we assume that on Opening Day, Matheny has no advantage, but by game 162, the advantage has grown beyond where it stood at the mid-point of the season. So, the midpoint is probably a good place to start when trying to work out the full season effect.
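That back-of-the-envelope calculation is easy to reproduce. The sketch below, in Python, uses only the numbers quoted above (the 90-day rates, 162 games, 150 pitches per game). The function name is made up for illustration, and applying the same arithmetic to Quade is my own extension of the Matheny example, not a separate result.

```python
GAMES_PER_SEASON = 162
PITCHES_PER_GAME = 150  # per-game figure used in the estimate above

def extra_strikes_avoided(rate_at_midpoint, hold_the_line_rate=0.50,
                          games=GAMES_PER_SEASON,
                          pitches_per_game=PITCHES_PER_GAME):
    """Season-long strikes avoided relative to a hold-the-line manager.

    Treats the mid-season rate as the average for the whole year,
    which is the simplifying assumption made in the text.
    """
    return (rate_at_midpoint - hold_the_line_rate) * games * pitches_per_game

matheny = extra_strikes_avoided(0.5065)  # Matheny's hitters at 90 days
quade = extra_strikes_avoided(0.4867)    # Quade's hitters at 90 days

print(f"Matheny: {matheny:+.2f} strikes avoided")
print(f"Quade:   {quade:+.2f} strikes avoided")
```

Run that and Matheny comes out at about +158 pitches, while the same arithmetic puts Quade at roughly -323, which gives a sense of how wide the spread between the top and bottom of the list might be.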
In our catcher-framing metrics here at BP, we assume that the value of a strike turned into a ball is roughly 0.14 runs. In this case, we don't know that the strikes that are being avoided are becoming balls. Some of them might become two-strike fouls (which don't change the count, obviously) and some are put in play. But even if we assume that avoiding a strike is worth .10 runs, Matheny, just by virtue of the fact that he manages to keep the Grind from wearing down his players (and indeed, helps them to thrive!) will clear roughly 16 runs of value over a manager who simply arrests the Grind and holds it there. And of the 57 managers in the sample, only 21 of them were able to do that.
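Since the run value per avoided strike is the shakiest input here, it's worth seeing how much the answer moves with it. This is a small Python sketch using the figures above. The names are hypothetical, and the 0.14 line is shown only as a reference point borrowed from the framing work, not as a claim that it applies here.

```python
# Strikes avoided over a season, from the mid-season estimate above.
STRIKES_AVOIDED = 0.0065 * 162 * 150  # about 157.95

def runs_added(strikes_avoided, runs_per_strike):
    """Convert avoided strikes into runs at a given per-strike value."""
    return strikes_avoided * runs_per_strike

conservative = runs_added(STRIKES_AVOIDED, 0.10)     # the discounted value used above
framing_value = runs_added(STRIKES_AVOIDED, 0.14)    # full strike-to-ball value

print(f"At 0.10 runs per strike: {conservative:.1f} runs")
print(f"At 0.14 runs per strike: {framing_value:.1f} runs")
```

At .10 runs per strike that works out to about 15.8 runs, which is where the "roughly 16" comes from. If every avoided strike really did become a ball, the same pitches would be worth about 22 runs, so the .10 figure is the more cautious of the two.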
Now, there's the question of whether this is a stable effect or not. If this really is a manager "talent" we ought to see some good year-to-year correlation. In the analyses above, I had lumped all five years together for all managers (that is, I looked at data points from Bruce Bochy from 2009-2013 as one undifferentiated mass). I ran new analyses in which I looked at the slope of the "Grind" line within each individual year. Using data from managers who managed in at least four of the years under study, I looked at an AR(1) intra-class correlation of that slope. (For the uninitiated, it's like a year-to-year correlation, only you can enter more than two years.) The result was .73, which indicates a high level of year-to-year correlation. This looks like a fairly stable skill from one year to the next.
The Magical Rainbow Colored Clown Wig
The implications here are pretty big. We know that in general, the Grind has a deleterious effect on hitters, but that some managers show repeatable performance (dare we say, a "skill"?) at keeping that effect away, and in some cases, reversing it. We see that at the top end, a reasonable estimate of the effect size is 16 runs or so, depending on where the baseline really is. A merely good manager might clear 10 runs or so, but that's one win. And this only covers the skill a manager might have in fighting The Grind's effect on his hitters. Imagine if we found a similar effect for a manager's influence on his pitchers.
Some time ago, I found that a perfectly optimized Sabermetric manager might be expected to produce a win and a half over a garden variety manager by virtue of his tactical superiority, and it would probably take a perfect set of circumstances and breaking a few rules in the process. Here, we've found an effect roughly equal to that, one which is related to, but not exhaustively descriptive of, the manager's "other" job. For a long time, Sabermetricians have focused on how to make managers better tacticians, because that stuff was easily observable. What if the most powerful tool that a manager has at his disposal isn't the lineup card or the bullpen phone, but a rainbow colored clown wig that he wears behind the scenes to keep things loose? What if The Grind is the most powerful force in baseball, and we know nothing about it?
There's a certain chicken-and-egg problem that we must first deal with. One could make the case that perhaps what we're seeing is that as the season wears on for some teams, they become involved in a pennant race, and that's really what keeps them from trailing off. I can't fully rule that one out. There's also the question of whether the manager should take all the credit (or blame) for these effects. Maybe it's the hitting coach who wears the rainbow wig. Maybe it's an organizational thing. We know that there's an effect, but there's a black box around it right now. How can we encourage managers to set things up to better fight The Grind? There's more work to do here and it's pretty clear that the possible benefits are enormous.