If someone has a good way to evaluate managers, please let us know

I’m a member of the Baseball Writers Association of America, which means that I’m part of the pool of writers who are asked to cast ballots for MLB’s postseason awards. The MVP and Cy Young are the two big ones that get most of the attention, and then there’s Rookie of the Year, which has more recently turned into the Prominent International Star Who Made His MLB Debut This Year award — Jose Abreu will be accepting that one for 2014, though they’ll probably stick with the shorter title for now.

There’s also a fourth award that BBWAA members are asked to vote on — Manager of the Year. It’s the one award that we’re asked to give out that doesn’t go to a player, and not coincidentally, it’s the one in which there is usually the least consensus. Last year, nine American League managers received at least one vote in one of the three slots listed on the ballot, which might not sound like a lot until you remember that there are only 15 managers in the American League; more managers got votes than didn’t.

This is what happens when you ask a diverse panel of voters to come to an agreement on a subject that is inherently difficult to measure. The challenge is compounded by the fact that the ballot comes with literally no criteria or guidance. For comparison, the MVP ballot contains the following instructions:

Dear Voter:

There is no clear-cut definition of what Most Valuable means. It is up to the individual voter to decide who was the Most Valuable Player in each league to his team. The MVP need not come from a division winner or other playoff qualifier. The rules of the voting remain the same as they were written on the first ballot in 1931:

1. Actual value of a player to his team, that is, strength of offense and defense.

2. Number of games played.

3. General character, disposition, loyalty and effort.

4. Former winners are eligible.

5. Members of the committee may vote for more than one member of a team. You are also urged to give serious consideration to all your selections, from 1 to 10. A 10th-place vote can influence the outcome of an election. You must fill in all 10 places on your ballot.

Only regular-season performances are to be taken into consideration. Keep in mind that all players are eligible for MVP, including pitchers and designated hitters.

The Manager of the Year ballot contains no such wording. In fact, it contains no wording at all, other than instructing us to submit our ballots before the first postseason game and to not reveal our selection until the results are announced. As a first-time awards voter tasked with choosing my candidates for NL Manager of the Year, I have essentially been instructed to vote based on any criteria I want.

The freedom to vote based on my own criteria is great, but also intimidating, because I’m being forced to admit that I really have no idea how to evaluate a manager’s performance in a given season. And, unfortunately, I’m not sure anyone else does either.

The history of the Manager of the Year award gives us a pretty decent insight into what voters have looked for in the past. Since its creation in 1983, 63 managers have been honored — it’s an odd number thanks to a tie in 1996 between Joe Torre and Johnny Oates — with the victor having led his team to a division title in 44 of those 63 manager-seasons. Another 15 winners finished in second place in their division, meaning that 94 percent of the awardees came from strong playoff contenders, and 62 of the 63 managed winning teams; Joe Girardi is the only manager to ever win the award while posting a losing record.

So, winning matters a lot, and it seems logical that the season’s best managing performance would come from a team that had a good year. But history also tells us that voters think less of a manager’s performance if his team wins too many games. Last year, Terry Francona (92 wins) beat out John Farrell (97 wins) in the AL, while Clint Hurdle (94 wins) beat out Fredi Gonzalez (96 wins) in the NL, and it is actually pretty rare for the manager of the team with the best record to win the Manager of the Year award. The assumption seems to be that if a team wins too many games, the credit shifts from the manager to the talent on his roster; he gets less credit for winning 100 or more games than he would have for winning 90-95.

We can see this same pattern when looking at the voting finishes for teams who are expected to win. Over the last 20 years, no team has won more often than the New York Yankees, and included in that timeframe is the 1998-2006 stretch in which they won the American League East for nine consecutive seasons. Joe Torre was their manager for all nine of those seasons, and he won the Manager of the Year award in only one of them. In fact, from 1999 to 2006 — a span in which the Yankees went 777-515, a .601 winning percentage — he never finished higher than third in the voting. In 2003, the Yankees won five more games than any other AL team, and he finished fifth. Fifth!

Torre simply couldn’t overcome the bias against voting for an expected winner. He won with juggernauts that would have been viewed as failures had they not won 100 games. We like to vote for managers who defy our expectations and win far more games than the preseason forecasts suggested. But that requires us to assume that our preseason estimates of a team’s talent level are the gold standard of evaluation, and that success above those expectations reflects a performance worth rewarding rather than evidence that our ability to project a team’s future performance is shoddy at best.

Maybe Clint Hurdle was a fantastic manager last year, which is why the Pirates won 94 games in a season in which few expected more than 84. Or maybe we all just did a terrible job of noticing that the Pirates had put together a strong roster, and their performance exposed our own weakness rather than Hurdle’s strength. This isn’t to denigrate Hurdle, or any manager of a team with a surprisingly strong record, but it seems presumptuous to suggest that our understanding of a team’s talent level during spring training should be the baseline against which his performance is measured.

Of course, we need a baseline, so maybe that’s the most rational one to select. Others have suggested using something like the gap between a team’s actual record and its Pythagorean expected record to try to quantify a manager’s impact, but beating your Pythagorean expected record mostly just means that you hit really well in the clutch. Do we really think that’s something a manager can control? Of all the things I could see a manager having a direct impact on, clutch hitting would rank somewhere near the bottom. It would be nice if there were a category that neatly summed up the value of a manager’s impact on a team, but I can’t see how clutch performance — the dominant driver of deviation from a team’s Pythag record — is that category.
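For readers who haven’t run into it, the Pythagorean expected record estimates a team’s winning percentage from its runs scored and runs allowed, and the “manager impact” idea above is just actual wins minus those expected wins. Here’s a minimal sketch of that calculation in Python; the 1.83 exponent is one commonly used refinement of Bill James’s original exponent of 2, and the team totals in the example are made up for illustration, not any club’s real numbers.

```python
# A minimal sketch of the Pythagorean expected record idea discussed above.
# The 1.83 exponent is a common refinement of Bill James's original exponent
# of 2; the example totals below are hypothetical, not real team data.

def pythagorean_win_pct(runs_scored, runs_allowed, exponent=1.83):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def pythag_gap(actual_wins, runs_scored, runs_allowed, games=162):
    """Actual wins minus Pythagorean expected wins, the proposed proxy for manager impact."""
    expected_wins = pythagorean_win_pct(runs_scored, runs_allowed) * games
    return actual_wins - expected_wins

# Hypothetical team: 88 actual wins on 750 runs scored and 700 allowed.
# A positive gap means the team beat its expected record.
print(round(pythag_gap(88, 750, 700), 1))  # roughly +1.9
```

Nothing in that calculation requires managerial skill, of course; as the paragraph above argues, the gap is driven mostly by sequencing and clutch performance, which isn’t obviously something a manager controls.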

With no easy way to evaluate managers, my BBWAA brethren seem to have essentially taken to voting for the manager of a small-to-mid-revenue team — the Yankees haven’t had a winner since 1998, the Red Sox since 1999, and the Dodgers since 1988 — who finished in first or second place, with bonus points going to a guy whose team had not historically been a contender, or wasn’t expected to contend that season. It has become an award for having a winning season on a moderate payroll, which is almost certainly harder than winning on a large payroll, but it seems implausible that the teams with the most money to spend on the best players just keep hiring managerial duds who are rarely worthy of recognition, no matter how many games they win.

I have roughly three weeks to figure out the criteria for my NL Manager of the Year vote, and I can honestly say that I don’t have any idea how I’m going to vote. It might be the most obscure award, but it is also the most difficult puzzle to crack. It seems to me that the traditional method of rewarding a guy for managing a team that did better than we expected maybe isn’t rigorous enough, but then again, I don’t know that I have a better alternative.

If someone wants to accurately figure out how to evaluate managers in the next three weeks, please, let me know.