defensive-fielding-stats-sabermetrics-problems-war-huh

In the wake of the Great Passan/Cameron Twitter Kerfuffle -- which I referenced in my column yesterday -- Baseball-Reference.com's Sean Forman wrote about the real problem with fielding metrics. No, the metrics don't correlate year-to-year as strongly as hitting metrics. Yes, the sample sizes are small, the measurers imperfect. But as Forman notes, there's a bigger problem:
The main issue with defensive stats is how do we account for positioning versus fielding skill.
For example, our defensive numbers come from Baseball Info Solutions and for every ball put into play BIS looks at where the ball is caught and its hang time and compares it to similar plays from previous (or even future games as it's updated at year end also) to know that 12% of balls like this were caught over X-number of years. Which is pretty straightforward to do and is probably what most everybody would do if they wanted to analytically measure fielding value.
The issue with defensive stats, however, is this. If the team is really good/bad or even lucky/unlucky at positioning players, it may be that the 12% catch would actually be caught 70% of the time given the player's initial positioning. BIS doesn't track player initial locations (other than noting shifts) because they aren't available on TV and even if they did, which number should we go with (88% or 30%) as we don't really know how much of the positioning is due to the team or the player?
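To make the arithmetic in Forman's example concrete, here's a minimal sketch of the usual "plays above average" bookkeeping, where a fielder who makes a catch gets credit for one minus the estimated catch probability. The function name and the simple credit rule are my own illustration, not BIS's actual method; the 12% and 70% figures come straight from the quote.

```python
def plays_above_average(caught: bool, catch_prob: float) -> float:
    """Credit for one ball in play, relative to an average fielder.

    catch_prob is the estimated share of similar balls that get caught.
    Hypothetical helper for illustration; not any provider's real formula.
    """
    if not 0.0 <= catch_prob <= 1.0:
        raise ValueError("catch_prob must be between 0 and 1")
    outcome = 1.0 if caught else 0.0
    return outcome - catch_prob


# Same catch, two different baselines from Forman's example.
location_only = plays_above_average(True, 0.12)     # baseline ignores positioning
with_positioning = plays_above_average(True, 0.70)  # baseline knows where he started

print(f"{location_only:.2f}")     # 0.88
print(f"{with_positioning:.2f}")  # 0.30
```

The gap between those two numbers, 0.58 of a play, is the part nobody knows how to assign: it belongs to whoever decided where the fielder stood.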
Here's a first step toward answering questions like this, though: Record which coach on each team is responsible for positioning the fielders. Last winter, I heard a great story -- which I wrote down, but now can't find -- about a second baseman who had tremendous defensive stats ... until he changed teams and didn't have the same coach telling him where to position himself before each pitch.
Of course, maybe the coach was just following orders from the manager, who was following orders from the Assistant General Manager, who was--
Anyway, you get the idea. Once we've got the OMGf/x data in a year or two, we can tease out the range from the positioning ... but even then, who should get the credit for the positioning? I can't fathom a good answer for that question. Which might be one more great reason for regressing defensive metrics when figuring Wins Above Replacement.
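For anyone wondering what "regressing" would look like in practice, here's a rough sketch of one common approach: shrink the observed number toward the league average, trusting it more as the number of chances grows. The function, the `stabilizer` parameter, and every number below are made up for illustration; I'm not claiming this is how any particular WAR implementation handles it.

```python
def regress_to_mean(observed: float, opportunities: int,
                    stabilizer: float, league_mean: float = 0.0) -> float:
    """Shrink an observed fielding value toward the league mean.

    stabilizer is a hypothetical constant: the number of opportunities at
    which the observed value and the league mean get equal weight.
    """
    if opportunities < 0 or stabilizer <= 0:
        raise ValueError("opportunities must be >= 0 and stabilizer > 0")
    weight = opportunities / (opportunities + stabilizer)
    return league_mean + weight * (observed - league_mean)


# Hypothetical fielder: +15 runs on 400 chances, with a made-up stabilizer of 400.
print(regress_to_mean(15.0, 400, 400.0))  # 7.5
```

With those invented inputs, half of the fielder's apparent value gets handed back to the league average, which is one way of admitting we don't know how much of it was really his.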