Through nine weeks: What we've learned about instant replay

What have we learned from instant replay in the first nine weeks of the MLB season?

Umpires Seth Buckminster (left) and Fieldin Culbreth review a play during a Brewers-Braves game.

Brett Davis / USA TODAY Sports

When a totally new system like Major League Baseball’s expanded instant replay — complete with brand-new job descriptions and job openings and technology — is assembled on the eve of the season, it’s easy to imagine its implementation would look more like an evolution than the arrival of a fully formed process.

By most accounts it has been. Whether it’s the change in the transfer rule that tangentially went along with it, or managers getting used to the silly choreography of how to argue with an umpire while simultaneously looking back at the dugout for a thumbs-up or a thumbs-down, everybody involved in the process seems to be getting better at it.

“We’re getting a little more educated in different ways not to manipulate the system but to use it properly,” Pirates manager Clint Hurdle told Baseball Prospectus. “We went in with a plan, but I think we’ve been able to refine and make it a little bit better along the way. ... Every team’s going about this in their own method. We’re keeping records on all the calls — specific calls that are made, leverage situations.”

Major League Baseball is keeping records, too.

According to data obtained from MLB’s central office on Tuesday, there were 381 traditional replay reviews — 15 percent of which were umpire-initiated rather than challenge-initiated — plus four more looks to check counts or other record-keeping procedures in the first 759 games. Those challenges totaled 830 minutes, meaning that the replays averaged 2 minutes, 9 seconds.

The measurements show replay review added 1:06 to each game, but that’s a blunt estimate: it measures only the time from the challenge to the decision, not the manager’s stalling with an eye on the dugout. It also ignores the delay that full-length arguments would have caused without replay.
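The two averages quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, and it assumes the 830 minutes cover all 385 looks (the 381 traditional reviews plus the four record-keeping checks), which is the reading that matches the 2:09 figure.

```python
# Figures quoted in the article; names are illustrative, not MLB's.
reviews = 381        # traditional replay reviews
record_checks = 4    # extra looks at counts or other record-keeping
games = 759
total_minutes = 830


def mmss(minutes: float) -> str:
    """Format a duration in minutes as M:SS, rounded to the nearest second."""
    seconds = round(minutes * 60)
    return f"{seconds // 60}:{seconds % 60:02d}"


# Assumption: the 830 minutes are spread over all 385 looks.
per_review = total_minutes / (reviews + record_checks)
per_game = total_minutes / games

print("Average per review:", mmss(per_review))
print("Added per game:", mmss(per_game))
```

Dividing by 381 instead of 385 would give an average about two seconds longer, so the choice of denominator matters little here.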

Most notable is just how high a percentage of calls have been overturned, with the frivolous challenge not really materializing. Despite no penalty for getting a challenge wrong, beyond just losing the challenge, the success rate of 47.2 percent overturns is considerably higher than the 40 percent that tends to be close to the yearly rate in the NFL, where lost challenges cost a valuable timeout.

The prevailing thought was that managers would challenge anything and everything because the chance of another close call showing up later wasn’t enough of a deterrent to keep them from challenging a play with only a small chance of an overturn.

For the most part, according to a couple of managers, that’s been true. The possibility of losing a challenge isn’t motivating them to hold back when there’s a play they’re considering challenging.

“Our philosophy is that if we think that we’re right and we think it’s a good challenge for us, we’re going to do it,” Nationals manager Matt Williams said in an interview with BP. “You never know when it’s going to come again.

“That’s our philosophy. Some have been good for us and some have been bad for us, but we’re not going to shy away from doing it because we may not have another opportunity to.”

What inning has seen the most challenges? The first, and by a lot.

Part of this is probably structural as it relates to the first inning. The first is the highest-scoring offensive inning, and the second and third are the lowest in the starter’s typical time in a game. It’s not a surprise that more challenges would come with more runners on base because more unusual things can happen.

Still, with so many challenges early, even proportional to offense, teams are clearly not afraid of losing a challenge.

“We’ve committed ourselves to the point with our group where we make a decision and we go with it,” Hurdle said. “If we win, we win. If we lose, we lose and we move on.”

Where my prediction that managers would overdo their challenges might be going wrong is that there might not be a lot of plays where there’s a 5 percent chance of getting an overturn. There’s no reason that the distribution has to be uniform or normal or anything else. What the first quarter of the season has shown is that the distribution may be a whole lot of plays with a zero percent chance of being overturned, a whole bunch with a 90-100 percent chance, and then a few in the middle.

There has been some controversy with bad angles that might make these chances not exactly zero or 100, but there really haven’t been too many surprises when umps come back from their chat session, especially on the side of bizarre overturns. If a manager needs a 5 percent chance of getting it overturned to make it statistically worthwhile, that just might be an area of the probability curve that doesn’t exist.

Baseball also has started to see some patterns emerge in what types of plays are getting overturned. By far, the plays that result in the greatest percentage of overturns are ones involving whether or not a ball was caught. That’s not surprising. In those cases, the replay official has to look at only one thing, and managers on the defensive side can lean on their players, who should know whether or not a ball was caught much better than they could gauge a close out-or-safe play.

Counting plays initially called either a catch or a no-catch, there have been a total of 12 challenges with 11 overturned, none confirmed, and only one in the distinct category of “standing,” with not enough evidence to overturn but no confirmation. Even that one involved the transfer rule (a Terry Francona challenge of an Elliot Johnson non-catch), which might be ruled a catch now, seven weeks later.

Here are the rest of the plays broken down by success rates of challenges on each type of initial call.

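| Initial call | Total plays | Overturned | Confirmed | Stands | Overturn % |
|---|---|---|---|---|---|
| Catch | 7 | 7 | 0 | 0 | 100.0% |
| No Catch | 5 | 4 | 0 | 1 | 80.0% |
| Safe 1st | 69 | 41 | 7 | 21 | 59.4% |
| Safe other bases | 83 | 42 | 18 | 23 | 50.6% |
| Out other bases | 64 | 31 | 13 | 20 | 48.4% |
| Out 1st | 82 | 36 | 20 | 26 | 43.9% |
| Other** | 19 | 8 | 5 | 6 | 42.1% |
| Foul | 12 | 4 | 2 | 6 | 33.3% |
| No home run | 16 | 4 | 8 | 4 | 25.0% |
| Home run | 9 | 2 | 6 | 1 | 22.2% |
| Plate block | 15 | 1 | 13 | 1 | 6.7% |
| Total | 381 | 180 | 92 | 109 | 47.2% |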

**Other includes things like HBPs and whether a ball hit the wall before being caught.

A few things jump out here, one being the relatively low rates on the home run vs. non-home run calls, which were the only reviewable plays before this year’s expansion. Managers can’t challenge homer calls, but they can request umpires to review them. With so many of those “confirmed” and not just “standing,” this might be where managers are taking their 100-to-1 shots, given that the stakes are so high.

The more intriguing one is the different rates both on plays at first and at other bases/basepaths when the call is initially “safe” vs. when the call is initially “out.” With the immediate caveats that we’ll know much more when there’s triple the data at the end of the season, and that this could be a sample size issue, the rates of overturns have been higher when the initial call is that the runner was safe.
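To get a rough sense of how much weight the sample-size caveat deserves, the safe and out rows can be pooled and compared. The sketch below is not the author's analysis: pooling first base with the other bases and using a two-proportion z-test (normal approximation) are both assumptions made here for illustration.

```python
import math

# Counts from this article's table: (total plays, overturned).
safe_calls = {"Safe 1st": (69, 41), "Safe other bases": (83, 42)}
out_calls = {"Out 1st": (82, 36), "Out other bases": (64, 31)}


def pooled(groups):
    """Sum plays and overturns across the given call types."""
    plays = sum(n for n, _ in groups.values())
    overturned = sum(k for _, k in groups.values())
    return plays, overturned


def two_proportion_z(n1, k1, n2, k2):
    """Two-sided z-test for a difference between two proportions."""
    p1, p2 = k1 / n1, k2 / n2
    p_pool = (k1 + k2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


n_safe, k_safe = pooled(safe_calls)
n_out, k_out = pooled(out_calls)
rate_safe, rate_out, z, p_value = two_proportion_z(n_safe, k_safe, n_out, k_out)

print(f"Safe calls overturned: {k_safe}/{n_safe} = {rate_safe:.1%}")
print(f"Out calls overturned:  {k_out}/{n_out} = {rate_out:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

With these counts the gap is about nine percentage points, but the test statistic comes out near 1.5, short of the conventional 5 percent significance threshold. That fits the caution that this could still be a sample-size issue.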

This could be seen as some initial insight on how umpires are making these calls, erring on the side of “safe” calls. It could be some subconscious bias, or it could be erring — in the age of replay — on the side of keeping plays alive because it’s easier to sort things out that way than to return baserunners to the field.

However, it could be something much simpler, rooted in the overturning rather than in the call. It’s generally easier to prove the existence of anything than the absence of anything. Hence, it would be easier to overturn a safe call into an out call by proving that a tag happened than to overturn an out call by proving that it didn’t.

Whether that applies to a tag on a video, though, is unclear. It might actually be easier to verify blank space between glove and back than to verify a tag. It will be interesting to see if these relationships hold, for plays at first and for plays on the bases with more time, more data and no transfer rule skewing the numbers at second.

Where the intuition is much more straightforward is in the idea put forth by broadcasters, who in this case are absolutely right in saying that the longer a challenge goes, the worse it is for the challenging team. Sort of.

The shortest replays are the ones that confirm the call, while the longest are the ones that end with the same practical result, the call standing. After a certain point, it can certainly be said that the longer it goes, the worse the result for the manager who made the challenge.

| Result | Total plays | Average time | Std. Dev. |
|---|---|---|---|
| Confirmed | 92 | 1:37 | 0:45 |
| Overturned | 180 | 2:03 | 0:53 |
| Call stands | 109 | 2:45 | 0:49 |

Two months into the process without a lot of major hangups, the goal should be to lower those numbers.
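As a consistency check, the per-result averages in the table above can be weighted by their play counts to recover an overall average. This is a sketch with illustrative names; it assumes the table's averages are exact to the second, which they are not, so the result is approximate.

```python
# (result, total plays, average time as M:SS) from the table in this article.
rows = [
    ("Confirmed", 92, "1:37"),
    ("Overturned", 180, "2:03"),
    ("Call stands", 109, "2:45"),
]


def to_seconds(mmss: str) -> int:
    """Convert an M:SS string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


total_plays = sum(plays for _, plays, _ in rows)
# Weight each result's average by how many reviews ended that way.
weighted_seconds = sum(plays * to_seconds(t) for _, plays, t in rows) / total_plays

overall = round(weighted_seconds)
print(f"{total_plays} reviews, weighted average {overall // 60}:{overall % 60:02d}")
```

The weighted figure lands on the same 2 minutes, 9 seconds quoted earlier, so the table and the headline average agree.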

Hurdle suggested another change, desiring a camera angle that aims directly down the first-base line to get a better view of plays at the bag, but most of the room for improvement will be in the flow of the process. Again, that average time counts only from challenge to verdict, not the time that the manager is stalling.

If there’s a major tweak in the offseason, it might not need to penalize teams that make frivolous challenges so much as enforce some time limits and regulate how long after a play a challenge can be made. But two months in, the system has been working, albeit slowly, with less funny business than anticipated, given how quickly it was all thrown together.

Zachary Levine is an author of Baseball Prospectus.