Evaluating Statcast’s rookie season

We’re now well into the season, which means we’re now well into the first season in which the long-awaited Statcast technology is installed in all 30 major-league stadiums. But just a few weeks into the season, you might have seen this headline on a Beyond the Box Score column:

The exciting yet slightly disappointing debut of Statcast

The headline was, I think, accurate. I think it’s still perfectly accurate. What’s odd is how that conclusion was reached, which I think is best epitomized by this passage:

The question that is still on everyone’s mind is how will this information be disseminated? While this may seem trivial, it’s essential to the future of MLB. In 2012, the median age of people who watched nationally televised baseball games and MLB Network was 54 years old. Obviously baseball’s executives would love to see that number decrease. The long-term economic health of the game depends on it. Because of this, it’s imperative that MLB showcase the excitement that Statcast brings to the game. For this to happen, MLB must increase their social media presence and bring the videos to the people. Unfortunately, that hasn’t been the case thus far.

Words like essential and depends and imperative and must get my bullshit detector working really hard, and there’s a huge pile of bullshit in that paragraph. Here’s what I think: I think that Statcast means something close to nothing for the future of Major League Baseball. Statcast publicly, I mean. Privately, Statcast is going to win and lose championships, and it’s going to make some careers and break others.

But I am 99 percent sure that Statcast is just one of a hundred things that MLB hopes will bring younger fans to the national broadcasts, and that it ranks somewhere in the bottom half of that list.

So why is Statcast’s debut really disappointing, slightly or otherwise?

A couple of weeks ago, a story in the Wall Street Journal about defensive shifts included this passage near the end:

The current technology still has value when it’s understood properly. Orioles general manager Dan Duquette, who used Inside-Edge as early as the 1990s, said the new metrics “have really advanced the evaluation process” but that “people are trying to figure out how to use the data, and which data points are dependable and which aren’t.”

More comprehensive answers to these questions about how to evaluate defense may be on the way in short order. Now all 30 major-league parks are equipped with StatCast, a modern player-tracking system that will give teams precise data from exactly how far a player travels to catch a ball to how fast he ran to get there.

As I understand it, teams already have this precise data; already are finding comprehensive answers to these questions. Granted, there are many millions of pieces of data, and some teams will find the answers sooner than others.

What’s disappointing isn’t the data itself, which I’m sure is both ridiculously large and preposterously useful. What’s disappointing is that we’ve seen so little of the stuff we most wanted to see.

Let’s return to that Beyond the Box Score column, which essentially faults MLB for failing to propagate all the new data, via social media. Starting with Twitter.

Well, not much has changed since April. Perusing the Statcast Twitter feed, Sunday morning, I found only 21 tweets since the All-Star Break, and three categories stand out:

1. exit velocity and distance of home runs (12)

2t. fielding metrics (3)

2t. videos of guys talking about pitchers’ spin rates, etc. (3)

Now, I bring this up because ever since Statcast – which was actually nameless for a long time, during which I cottoned to OMGf/x – first was revealed to the public, nearly all of the attention was directed toward what seemed the inevitable discovery of THE HOLY GRAIL: fielding metrics that would absolutely settle things, once and for all.

Or almost absolutely, anyway.

Well, we’re still waiting. First the system was installed a year or two later than everyone expected. And now that it’s installed, only tiny bits of fielding data have been released, generally to punctuate a spectacular fielding play on television. And while the numbers can certainly sound impressive – LOOK AT THAT! 98.2 PERCENT ROUTE EFFICIENCY – these numbers have been used in Vin Scully’s least-favorite way: for support rather than illumination.

For illumination, we would of course need context. We would need to know just how good 98.2 percent really is, and we would also need to know if this outfielder’s 98.2 percent is typical for him, or if he’s usually down around 90 percent. And how good would 90 percent be?

When I first started talking to people about Statcast, which must have been about four years ago, I asked the obvious question: When, and how much?

And nobody ever had any precise answers, except there was always some expectation that the OMGf/x data would (roughly) follow the model of PITCHf/x and HITf/x, with a great deal of meaningful raw data released to the public.

Which was exciting!

Except I didn’t believe it would happen then, and I still don’t believe it will happen. Not really. Because precise information about fielders – where they start, where they finish, how long it took them to get started, how long it took them to get there, which route they took to get there – is basically the last bastion of performance-related subjectivity, at least publicly. And I don’t think the players are letting go of that information without a big fight.

Today, every fielder still has plausible deniability. Occasionally an enterprising writer will ask a player with lousy fielding stats what he thinks about these modern metrics. You can imagine the typical response. Of course usually the player’s full of it (or himself) … but still, his excuses (if he bothers to offer any) can seem plausible.

Or even leaving aside the newfangled stuff, if it’s obvious that an outfielder’s slow, he might still argue that he gets good jumps and takes good routes. Even if it’s obvious that a shortstop isn’t getting to many grounders up the middle, he might argue that maybe not, but look at all those plays he’s making in the hole! Lousy range? Doesn’t matter: I’m great at positioning myself before every pitch!

If the raw Statcast fielding data’s released, though? Well, now we get rankings. And if you’re an outfielder and you’re last on the list in route efficiency or you’re last on the list in “jump” or you’re last on the list in some other important, seemingly objective measure, you will suddenly have no logical recourse.

Wait, I know what you’re saying … “But Rob, there are already dozens and dozens of lists with guys at the bottom. Why would these be any different?”

First, you might be right. But second, these would be different because a) they would be new, and b) they would so fundamentally describe exactly what a player is supposed to do, that the tail-enders would have little material for their rebuttal.

At this point, it’s probably worth mentioning that there is some highly public Statcast data out there, perhaps best shown in Daren Willman’s tremendous site, Baseball Savant. There, just as an example, you can find out which players hit the ball the hardest (Giancarlo Stanton) … and the softest (Billy Hamilton). Along with everyone in between, and tons of other nifty stuff about batted balls.

But we already knew that Giancarlo Stanton hits the ball real hard, didn’t we? I’m sure exit velocities can be useful. But when I look at the leaders and the trailers, I’m seeing almost exactly the names I would expect. Which makes for a pretty little graphic on TV, but doesn’t really do much else. Again, a lot of support without a great deal of illumination (which isn’t to say I didn’t enjoy seeing that Randal Grichuk is a strong, strong man).

Statcast’s first season has been, to this point anyway, mildly disappointing because the single thing we most hoped to see, we haven’t really seen at all. Fielding is complicated. Let’s say you throw a bunch of route-efficiency numbers out there, but one center fielder’s seen 25 percent more line drives right at him than another center fielder. Aren’t some batted balls more difficult to judge than others? And wouldn’t you be more likely to take an inefficient route whilst chasing a hard-to-judge batted ball? You want to compare apples to apples, oranges to oranges, other like fruits to one another … and there are a lot of fruits in the fielding data.

So while it seems that exit velocity “stabilizes” after a surprisingly small number of batted balls, maybe it takes months for the fielding data to make much sense. And it wouldn’t make sense for MLB to release the data before it makes sense.

If that’s the problem, then my concerns about the players’ concerns are unfounded (and they are, I should admit, purely speculative). For now, though, all we can say is that Statcast has been at least mildly disappointing. And that we hope it becomes less disappointing in the coming months and years.