Half of baseball is offense: hitting and baserunning. Hitting is the easiest aspect of baseball to measure accurately. Use linear weights and Markov chains to assign run values to specific events (home run, strikeout, etc.), adjust for park and league factors, and then aggregate to calculate runs and wins. Baserunning is not too difficult either. Measure the value added by stolen bases and subtracted by times caught stealing. Then include the value of taking the extra base (first to third on a single, first to home on a double, etc.) and you can calculate the number of runs (and wins) a player created with his legs.
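To make the aggregation step concrete, here is a minimal sketch in Python. The run values in RUN_VALUES and the 10-runs-per-win conversion are placeholder assumptions for illustration, not figures from any published linear-weights table, and the park and league adjustments are left out entirely.

```python
# Hypothetical run values per event, for illustration only. Real linear
# weights are derived from run-expectancy data and change by season.
RUN_VALUES = {
    "single": 0.45,
    "double": 0.75,
    "home_run": 1.40,
    "walk": 0.30,
    "out": -0.27,
    "stolen_base": 0.20,
    "caught_stealing": -0.40,
}

# Assumed conversion rate from runs to wins (a rough rule of thumb).
RUNS_PER_WIN = 10.0


def runs_created(event_counts):
    """Sum run value times count over every event the player produced."""
    return sum(RUN_VALUES[event] * count for event, count in event_counts.items())


def wins_created(event_counts):
    """Convert the aggregated runs into wins using the assumed rate."""
    return runs_created(event_counts) / RUNS_PER_WIN


if __name__ == "__main__":
    season = {"home_run": 10, "out": 20, "stolen_base": 5, "caught_stealing": 2}
    print(round(runs_created(season), 2), round(wins_created(season), 3))
```

The same dictionary holds hitting and baserunning events, which is the point of the approach: once every event has a run value, adding up a season is just a weighted sum.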
The other half of baseball is run prevention: pitching and defense. Pitching is not too difficult to evaluate when looking at the correct numbers (not ERA). We know the value of each event (strikeouts, walks allowed, etc.). The only difficulty comes from determining how much to weight FIP (which assumes the pitcher has 0% control over balls in play and sequencing) vs. RA9 (which assumes the pitcher has 100% control over balls in play and sequencing). The true percentage is somewhere in between, but it is not certain where. Defense comprises about 10% of the overall game of baseball, yet it is the hardest aspect of the game to measure. Was that player able to make a good play because of positioning or range? Could that diving web gem catch by the outfielder in the gap have been caught on the run by a speedier defender who got a better jump? Defensive metrics are inexact because the questions behind them are hard to answer.
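The FIP vs. RA9 question amounts to picking a weight between two estimates. A minimal sketch, assuming both numbers have already been put on the same runs-per-nine-innings scale and using a hypothetical fip_weight parameter (the right value is exactly what is not known):

```python
def blended_runs_allowed(fip, ra9, fip_weight=0.5):
    """Blend FIP and RA9 into a single runs-allowed estimate.

    fip_weight=1.0 treats the pitcher as having no control over balls in
    play and sequencing (pure FIP); fip_weight=0.0 treats him as having
    full control (pure RA9). The 0.5 default is an arbitrary placeholder.
    """
    if not 0.0 <= fip_weight <= 1.0:
        raise ValueError("fip_weight must be between 0 and 1")
    return fip_weight * fip + (1.0 - fip_weight) * ra9


if __name__ == "__main__":
    # A pitcher whose FIP is 3.00 but who actually allowed 4.00 runs per nine.
    for weight in (0.0, 0.5, 1.0):
        print(weight, blended_runs_allowed(3.0, 4.0, weight))
```

Sliding the weight from 0 to 1 moves the estimate from what actually happened to what the strikeouts, walks, and home runs say should have happened.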
For seasons before video was available, Total Zone is used. It is built by reviewing Retrosheet data and sifting through box scores, but it is very imprecise and probably only directionally correct. Ultimate Zone Rating was implemented in 2002 with the help of video at Baseball Info Solutions. These values are much better than Total Zone or Fielding Percentage, but they still require at least a 3-year sample size and suffer from some human error in tracking. Defensive Runs Saved came about a year later, and it also has its limitations. The other advanced defensive metric is Fielding Runs Above Average at Baseball Prospectus. Teams certainly have proprietary data that measures defense more effectively than any of the public methods. The public will be left in the dark until FieldF/x is (hopefully) released.
However, there were 3 advancements in defense measurement over the past week. In order of significance:
MLBAM video: At the MIT Sloan Sports Analytics Conference, MLB Advanced Media announced the creation of a new video system that will help track player defense. The system will be available in Target Field, Citi Field, and Miller Park for the upcoming season, with hopes of eventually expanding league-wide. The on-screen video will display distance run to the ball, reaction time, top speed, acceleration, and a comparison of the direct (straight-line) path and the actual path to compute a route efficiency. A written explanation of the new video tracking doesn’t really do it justice, but it is sure to be a hit among baseball fans.
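The route efficiency idea is easy to sketch. MLBAM has not published its formula here, so the version below is an assumption: it divides the straight-line distance from the fielder's starting spot to the catch point by the distance he actually covered. The coordinate format (a list of (x, y) positions in feet) is also hypothetical.

```python
import math


def path_length(points):
    """Total distance covered along a sequence of (x, y) positions."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def route_efficiency(points):
    """Direct distance divided by actual distance (1.0 is a perfect route)."""
    actual = path_length(points)
    if actual == 0:
        return 1.0  # the fielder never moved, so there is no route to grade
    direct = math.dist(points[0], points[-1])
    return direct / actual


if __name__ == "__main__":
    # Fielder runs 30 feet sideways, then 40 feet back: 70 feet covered
    # to reach a spot that was 50 feet away in a straight line.
    route = [(0, 0), (30, 0), (30, 40)]
    print(round(route_efficiency(route), 3))
```

Under this definition a banana route scores well below 1.0 even if the catch is made, which is the kind of thing the existing public metrics cannot see.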
Catcher pitch framing: Harry Pavlidis and Dan Brooks of Baseball Prospectus went through PitchF/x data to create a Regressed Probabilistic Model to evaluate catcher pitch framing. Part of the model involved assigning a run value, based on the count, to each strike a catcher steals (or fails to steal). They published their calculations of catcher runs earned through framing from 2008-2013. Former Yankees Russell Martin (91 runs earned) and Chris Stewart (43 runs earned) rank in the top 10. Number one on their list for the time period is current Yankee Brian McCann. The Yankees clearly put a lot of value on this aspect of catcher defense. Pitch framing might still be a slightly undervalued asset in the marketplace, but that inefficiency seems to be drying up as all teams get better at measuring this aspect of the game. All-field/no-hit backup Jose Molina received a 2-year, $4.5 million contract this offseason. The Yankees deemed it okay to let Russell Martin go because of Chris Stewart’s defense, including his pitch framing. Noted bad pitch framers Ryan Doumit and Carlos Santana are effectively being moved off the position. Of course, all of this would be moot if robot umps were implemented.
As an aside: Mariano Rivera received 20 framing runs added per 2000 opportunities from 2008-13. This could be due to the large amount of movement on his cutter, which made umpires think it nicked the corners more often than it actually did. It could also be due to the Yankees having good pitch framers, as described in the above paragraph. Finally, it might be due to the umpires calling the name instead of the game, with Rivera getting extra strikes because of who he was. It was likely some combination of these 3 factors.
Inside Edge Fielding: This advancement has been around since 2012, and FanGraphs added it to the player pages on March 5. It is data that Inside Edge scouts record by grading the difficulty of each ball in play. The demarcations are impossible (0%), remote (1-10%), unlikely (10-40%), about even (40-60%), likely (60-90%), and almost certain/certain (90-100%). There will obviously be some human error, as these categorizations are pretty arbitrary. Also, there is no run value attached to each play made (or not), but it is still kind of fun to see the range of difficulty for balls in play that various defenders were tasked with and how often they converted them into outs.
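For anyone who wants to play with this kind of data, here is a rough sketch of the bucketing and the conversion-rate math. Inside Edge scouts assign the category directly, so mapping from a numeric probability is an assumption made for illustration, as is treating each shared boundary (10%, 40%, and so on) as belonging to the harder bucket. The function names and the (bucket, made) input format are hypothetical.

```python
from collections import defaultdict


def difficulty_bucket(probability):
    """Map an estimated chance of making the play (0.0-1.0) to a category."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability == 0.0:
        return "impossible"
    if probability <= 0.10:
        return "remote"
    if probability <= 0.40:
        return "unlikely"
    if probability <= 0.60:
        return "about even"
    if probability <= 0.90:
        return "likely"
    return "almost certain/certain"


def conversion_rates(plays):
    """Share of chances converted into outs, per bucket.

    plays is an iterable of (bucket, made) pairs, where made is True when
    the fielder turned the ball in play into an out.
    """
    chances = defaultdict(int)
    made = defaultdict(int)
    for bucket, was_made in plays:
        chances[bucket] += 1
        made[bucket] += int(was_made)
    return {bucket: made[bucket] / chances[bucket] for bucket in chances}


if __name__ == "__main__":
    sample = [("remote", True), ("remote", False), ("likely", True)]
    print(conversion_rates(sample))
```

Because there is no run value attached, the output is just a set of percentages per bucket, which is how the numbers appear on the player pages.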
There is some interesting Inside Edge data for current Yankees. Brian McCann converted 28.6% of the remote plays and 14.3% of the unlikely plays the past two seasons. Mark Teixeira graded out well on both easy plays (98.2% routine and 90.9% likely) and tough ones (66.7% unlikely and 14.3% remote) in his healthy 2012 season. Brian Roberts hasn’t had enough innings in the field the past two seasons to make any of the data meaningful. Brendan Ryan has made 2.8% of the remote play chances and 37% of the unlikely category. He has some serious range and arm strength. On the other hand, Derek Jeter didn’t make a single remote or unlikely chance in his healthy 2012 regular season. He did, however, convert 98.3% of the routine plays, which jibes with his “sure-handedness” tag. In 6817.1 innings the past 2 years at second base, Kelly Johnson has been sure-handed (98.8% routine) but not rangy (0% remote and 14.3% unlikely). Brett Gardner is unreal in the outfield (33.3% remote and 100% routine the past 2 years in all outfield spots). Jacoby Ellsbury can also go get it (33.3% remote, 30.0% unlikely, and 99.6% routine). Carlos Beltran’s range in right field has started to slip from his Gold Glove days in center field (0% remote and 25% unlikely).
Defense is still far away from being totally understood and measured properly, but these 3 improvements definitely move the needle.