Someone takes a swing at measuring ‘chemistry’ in MLB — and whiffs
I know I’m late to the party with this one, but I’m not sure enough has yet been made of ESPN The Magazine’s publication of projections incorporating team chemistry based on “a fancy algorithm.”
Yes, really. It’s “a proprietary team-chemistry regression model” that “combines three factors — clubhouse demographics, trait isolation and stratification of performance to pay — to discover how well MLB teams concoct positive chemistry.” Details (sort of):
Demographic factor: The impact from diversity, measured by age, tenure with the team, nationality, race and position. Teams with the highest scores have several overlapping groups based on shared traits and experiences.
Isolation factor: The impact from players who are isolated because of a lack of subgroups from these shared demographic traits. Too much diversity can, in fact, produce clubhouse isolation for players who don’t have teammates with similar backgrounds or experiences.
Ego factor: The impact from individuals’ differences in performance and monetary status. Too few All-Stars and highly paid players signal a lack of leadership; too many, however, create conflict. The ideal level falls in the middle.
Practical impact? Well, the Rays and Red Sox begin with virtually identical records. But after the chemistry algorithm works its magic, the Rays finish four games ahead of the Red Sox. That’s a big deal!
And it all sounds reasonable enough, I suppose. Everybody loves chemistry. Alas, we have absolutely no reason to believe that this particular model actually works. While there is presumably some scientific research on these matters – Vince Gennaro alluded to such research in his SABR Analytics presentation last month – it’s one thing to conclude that something leads to greater productivity and job satisfaction in a business setting, and quite another to translate such research into wins and losses for a baseball team.
The authors actually list base projections for every team — no source listed; PECOTA, perhaps? — and the impact of chemistry on each of them. This impact never exceeds two wins or losses. Four teams get no chemistry (or “chem”) adjustment, 12 gain or lose one game in the standings, and 14 gain or lose two games. The article says “these factors can produce a four-game swing during a season,” but that’s a stretch: no team moves more than two games in either direction within a single season. A four-game swing is only possible across seasons. Theoretically, if a team had the worst possible chemistry in one season and completely revamped the roster in the offseason, it could have the best possible chemistry the next season, going from -2 chem to +2 chem.
Granted, it seems unlikely that teams made many decisions last winter to improve their chem scores. If they did, one imagines they could find greater (theoretical) gains than +2 … but of course, while your chem score might go up, your base projection might go down because you’ve been selecting for chemistry rather than performance.
More to the point, there’s no reason for us to have any confidence in this method for scoring chemistry. It’s a black box with no history. While the method might be tied to actual research, we don’t know. It’s proprietary! For which, by the way, there’s no good excuse. What happened to peer-reviewed science? It’s like ESPN The Magazine is competing with the Yankees and the Red Sox. There’s nothing at stake here!
Still, there’s a workaround. If the method works, it should work retroactively. It wouldn’t be difficult to track down preseason projections for a number of past years. If the chem-adjusted projections performed better than the basic projections over the course of five or 10 years, we would have to take them seriously.
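That backtest is simple enough to sketch. The snippet below is a minimal, hypothetical illustration of the idea — the numbers are made up, and in practice you would gather real preseason projections (PECOTA or similar) and actual win totals for each team-season over five or 10 years. If the chem-adjusted projections consistently miss by less, the chemistry model has earned some credibility.

```python
# Hypothetical backtest: does a "chem" adjustment improve projections?
# All data below is invented for illustration only.

def mean_abs_error(projected, actual):
    """Average absolute miss, in wins, across all team-seasons."""
    return sum(abs(p - a) for p, a in zip(projected, actual)) / len(projected)

# (base projection, chem adjustment in wins, actual wins) per team-season
team_seasons = [
    (90, +2, 95),
    (85, -1, 80),
    (78,  0, 75),
    (88, -2, 89),
    (95, +1, 97),
]

base = [b for b, _, _ in team_seasons]
chem = [b + c for b, c, _ in team_seasons]
actual = [a for _, _, a in team_seasons]

print("Base MAE:", mean_abs_error(base, actual))
print("Chem-adjusted MAE:", mean_abs_error(chem, actual))
```

Run over many real seasons, a consistently lower chem-adjusted error would be the evidence the magazine never offers.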
But if this sort of retrospective work has been done, we’re not seeing it here, so I’m guessing it hasn’t been done. I’m guessing this is all completely theoretical. Which makes it … interesting, but hardly predictive and barely reputable.
Let me be very clear about something, though: This subject is ripe for study. It’s one thing to say we haven’t quantified chemistry (true), but another to say we can’t (possibly false). Last week, the New York Times published Errol Morris’s four-part series of essays about Donald Rumsfeld. Regardless of your opinions about Rumsfeld and his favored wars, I highly recommend Morris’s exploration of “known unknowns” and “unknown unknowns” and particularly the warning that “absence of evidence is not evidence of absence.”
If you’re like me, you would rather think about baseball than about armed conflict or climate change. Fortunately, unknown unknowns and absence of evidence are highly relevant in baseball, too. When I was young, I wasn’t convinced that chemistry mattered at all. When I was slightly less young, I allowed that chemistry might matter, but I didn’t believe it could ever be quantified. Now that I’m slightly, slightly less young, I believe that chemistry probably does matter and that with a great deal of work it might begin to be quantified.
But that work is just beginning. So I wouldn’t start making any bets quite yet.