Was the Matt Cain Conundrum really a conundrum?

June 13, 2014

Just yesterday, my column relied heavily upon the work of young Lewie Pollis.

Well, I’m back at it again today. But don’t blame Pollis. Blame me, and take some comfort in the fact that we won’t have Pollis to rely upon any longer; he’s leaving the wonderful world of Internet Baseball Writing. Having just graduated from Brown, Lewie’s beginning an internship with the Cincinnati Reds and won’t be gracing us with any more of his wisdom. Which is a bummer, but I try to be grateful for whatever gifts I’m given.

And Lewie’s latest (and last) column is quite a gift. It’s subtitled “DIPS, Random Variation, and the Matt Cain Quandary.” You might already know exactly why those words go together like Jon Miller and a transistor radio in June; if so, feel free to skip these next six grafs ...

DIPS (or “DIPS theory”) is a short-hand way of pointing out that major-league pitchers, good or bad, generally will allow roughly the same batting average on balls in play. These days, the figure is in the .290-.300 range. If a pitcher gives up a significantly lower batting average on balls in play (BABiP), we usually assume he was lucky. And that he’ll revert (or regress) to the usual range, given enough innings.

In a related line of inquiry, someone noticed that pitchers, good or bad, generally will allow roughly the same percentage of home runs per fly ball. These days, that figure is right around 10 percent. If a pitcher gives up a significantly lower percentage of home runs on his fly balls allowed, we usually assume that he was lucky. And that he’ll revert to the usual range, given enough innings.

Matt Cain has pitched a ton of innings, and has not reverted. In his career, Cain’s given up a .263 batting average on balls in play, and only 7.4 percent of his fly balls allowed have carried over the fence. About the latter, you might be thinking PARK EFFECTS and you might be right ... but they don’t explain everything. Madison Bumgarner’s at 8.8 percent, Tim Lincecum at 9.3 percent; whilst pitching for the Giants, Barry Zito gave up homers on 9 percent of his flies. The Giants’ ballpark definitely helps the pitchers avoid home runs, but doesn’t account for Cain’s 7.4 percent.

If you remove Cain’s “good luck” on balls in play, his ERA should be 3.71. If you remove Cain’s “good luck” on balls in play and his “good luck” on fly balls that didn’t fly over the fence – and I have to admit, there might be some relationship between these things I haven’t even begun to understand – then his career ERA should be 4.16.

His career ERA is actually 3.37, and the qualitative difference between 4.16 and 3.37 is massive. It’s the difference between a really good pitcher and a potential Hall of Famer; between Bronson Arroyo and Cole Hamels.

Hello, Conundrum!

I hope you’re all reading again. I did count that last bit as a whole paragraph.

For a number of years, those of us who dream of electric sheep figured that Matt Cain’s ERA was too good to be true, that he couldn’t keep doing it. But he did keep doing it. Beginning in 2009, Cain’s ERAs were 2.89, 3.14, 2.88, and 2.79. And ’round about 2011 or 2012 ... well, here’s Pollis again:

When Matt Cain signed his $127.5 million contract extension in April 2012, Colin Wyers memorably tweeted: “If your response to the Matt Cain extension involves xFIP I'll be by later to pour coffee on your keyboard.” As someone who was still relatively new to the field of sabermetrics, seeing that was a watershed moment in that it signaled a substantial retreat from our previous collective understanding that DIPS statistics were unambiguously superior to ERA as measurements of pitching ability.

In the two years since Wyers’ excoriation of those who still judged Cain by his peripherals despite his history of outperforming them, I’ve noticed that the best analysts in our midst have continued to trend toward nuance in their discussions of DIPS theory, looking for qualities and characteristics that explain the outliers both in the stats and from personal observations. This ever-increasing blend of sabermetrics and scouting is unambiguously good for bettering our understanding of baseball.

Two things about that.

First, in 2013, Cain’s ERA matched almost exactly his DIPS. So yeah, score one point for natural reversion.

And second, it does seem utterly possible that we were simply too quick, maybe a little too eager, to cave in to the DIPS skeptics.

When we used to have those endless debates about clutch hitting, the Luddites took great pleasure in saying, “Look, we found all these guys who hit really well in clutch situations.” To which the natural (and obvious) response was, “Well, of course you found some guys. Given a big enough population – say, all the major-league hitters – simple probability theory would predict that x of them will hit really well in clutch situations.”

And whaddayaknow, the number of great clutch hitters just happened to be almost exactly x.
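If you want a rough feel for how x comes about, here's a minimal sketch in Python. Everything in it is an assumption for illustration: a hypothetical population of 400 hitters, each with a coin-flip chance of looking "clutch" in any given season, no real skill involved, and a made-up function name. It isn't taken from any of the actual clutch-hitting studies.

```python
def expected_lucky_streaks(population: int, seasons: int, p_good: float = 0.5) -> float:
    """Expected number of players who look good in every one of `seasons`
    seasons by chance alone, treating each season as an independent trial."""
    return population * p_good ** seasons


# Hypothetical inputs: 400 hitters, 5 straight "clutch" seasons, pure coin flips.
x = expected_lucky_streaks(400, 5)
print(x)  # 12.5
```

So under those made-up assumptions you'd expect a dozen or so "proven clutch hitters" even if nobody had any clutch ability at all.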

Well, you can do the same thing with DIPS. I just didn’t think anyone had done it. Turns out someone did: FanGraphs’ Noah Isaacs, a few years ago. Now Pollis has built on Isaacs’ work. While it’s true that Cain essentially overachieved, relative to his DIPS, in seven straight seasons, it’s also true that even seven straight seasons means only that Cain is 81 percent likely to be a fundamental Overachiever ... and thus has a 19 percent chance of actually being a Normal.
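For the curious, here's the general shape of that kind of calculation, sketched in Python. To be clear, this is not Pollis's model, and it won't reproduce his 81 percent. It's a bare-bones Bayes' rule example, and the prior share of true Overachievers, the per-season odds, and the function name are all assumptions I've picked purely for illustration.

```python
def posterior_overachiever(prior: float, p_beat_if_skilled: float,
                           seasons: int, p_beat_if_normal: float = 0.5) -> float:
    """Bayes' rule: probability a pitcher is a true Overachiever, given that
    he beat his DIPS in every one of `seasons` independent seasons."""
    like_skilled = p_beat_if_skilled ** seasons  # P(streak | Overachiever)
    like_normal = p_beat_if_normal ** seasons    # P(streak | Normal)
    numerator = prior * like_skilled
    return numerator / (numerator + (1 - prior) * like_normal)


# Hypothetical inputs: 5 percent of pitchers are true Overachievers, and
# they beat their DIPS 75 percent of the time; Normals are coin flips.
p = posterior_overachiever(prior=0.05, p_beat_if_skilled=0.75, seasons=7)
print(round(p, 3))
```

The point is just that the answer depends heavily on how rare you think real Overachievers are to begin with. With a streak this long the Normals don't drop out of the picture; there are simply a lot more of them.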

Granted, Pollis admits that he’s employed “an overly simplistic model.” He’s open to the possibility that Cain and other pitchers might really have something that allows them to consistently beat their DIPS, and he suggests more study. As I do. But here’s where I think he really nails it: “A good analyst will never dismiss the value of a scout or assume that a model that describes some aspect of the game cannot be improved upon. But if our quest to explain baseball takes us so far into the rabbit hole that we mistake ordinary random variation for causal trends, that’s not nuance—that’s overfitting to the data.”

Good luck, Lewie. We hardly knew ye, but we’ll miss ye.