There are 362,880 different ways to order 9 things. This is the problem MLB managers face before every game: in what order should they bat their 9 starting players? There have been some unusual machinations over the years as managers try to optimize the lineup and give their teams the greatest opportunity to score runs. Tony La Russa would regularly bat his pitcher 8th in order to have 3 “real” hitters bat in front of his #3 hitters (usually Mark McGwire and Albert Pujols). Earl Weaver would pencil in a starting pitcher who wasn’t pitching that day in the DH spot in the lower third of the order. This was to ensure that he got the platoon advantage when that lineup spot came up. The tactic would pay off on the off-chance that the opposing starting pitcher got hurt or was so ineffective that he was out of the game by the time the DH spot (being held by the pitcher) came up. Weaver would simply pinch hit a real hitter (righty or lefty) based on the handedness of the opposing pitcher at that point in time. Most surprisingly, Billy Martin randomly picked names out of a hat when the team struggled to score runs. No matter what the method, the lineup has always been a talking point among fans and the media.
Old logic, based entirely on inertial reasoning, includes batting your fastest player first without regard for how often he gets on base, using a good bunter and bat-control guy in the 2-hole, placing your best hitter in the 3-hole, and always batting the pitcher (or worst hitter) last. Some of the new studies present rules for constructing a lineup that directly conflict with the old theories. Among the many findings: the best all-around hitter (power and OBP) should bat second; the best hitters should bat 1st, 2nd, and 4th, while the 5th hitter should be almost as good as the 3rd hitter; and the “second leadoff man” theory is true, so the 9th hitter should be better than the 8th hitter. Furthermore, each spot down in the order loses around 18 plate appearances over the course of the season. Also, it is valuable to alternate lefties and righties as much as possible. This keeps the opposing manager from summoning some lefty or righty slider monster to mow through several consecutive spots in the order. In the end, most managers conform to the “old book” and might be squandering opportunities to produce more runs.
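To put the 18-plate-appearance figure in perspective, here is a rough sketch of how the gap accumulates down the order. The 750 PA baseline for the leadoff spot is my own round-number assumption for illustration, not a figure from the studies, and the function and variable names are made up for this example.

```python
# Assumption for illustration only: the leadoff spot gets about 750 plate
# appearances in a full season. The 18-PA drop per spot is the figure
# cited in the text.
LEADOFF_PA = 750
PA_LOSS_PER_SPOT = 18


def projected_pa(spot: int) -> int:
    """Estimated season plate appearances for a lineup spot (1 through 9)."""
    if not 1 <= spot <= 9:
        raise ValueError("lineup spot must be between 1 and 9")
    return LEADOFF_PA - PA_LOSS_PER_SPOT * (spot - 1)


pa_by_spot = {spot: projected_pa(spot) for spot in range(1, 10)}

for spot, pa in pa_by_spot.items():
    print(f"Spot {spot}: ~{pa} PA")

# Difference between batting first and batting ninth over a season.
gap = pa_by_spot[1] - pa_by_spot[9]
print(f"Leadoff hitter bats ~{gap} more times than the 9th hitter")
```

Under these assumptions the leadoff man comes to the plate roughly 144 more times than the 9th hitter, which is why it matters who gets those extra trips.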
Let’s see what the Yankees’ lineup should look like for 2014. Using the ZiPS projections for OBP and SLG and the Baseball Musings Lineup Analysis tool, we can get a very rough idea of what the Yankees’ lineup should be and how many runs per game it would score. The 9 players (with OBP/SLG) I will enter into the model are the projected starters for 2014: Alfonso Soriano (.297/.484), Brian McCann (.340/.451), Mark Teixeira (.340/.464), Brian Roberts (.304/.364), Derek Jeter (.322/.357), Kelly Johnson (.315/.405), Brett Gardner (.339/.388), Jacoby Ellsbury (.341/.438), and Carlos Beltran (.327/.479).
The optimal lineup for the 2014 Yankees based on this data would score 4.762 runs per game and look like this:
This simulation neglects some information, such as the fact that some players are much better or worse against a certain handedness of pitcher, or that Derek Jeter must bat first or second to ensure Twitter doesn’t blow up. It doesn’t take base-running into account at all, which matters because fast teams can steal bases and take the extra base, scoring more runs than a projection based entirely on hitting ability would suggest. Also, it only considers the magnitude of on-base percentage rather than its composition. For instance, 2 players with the exact same high OBP (say .365 for this exercise) and low SLG (.350) would both be ideal candidates to lead off. However, the one with a .260 batting average would be the better choice than the guy with a .330 batting average, because more of his OBP comes from walks and hit by pitches rather than hits. Walks are most valuable in the 1st spot because nobody is on base the first time the leadoff man hits, and the bases are oftentimes empty thereafter due to batting after the 3 worst hitters on the team. Walks can only advance current base-runners 1 base, so they are best utilized with no one on. Hits, on the other hand, can advance current base-runners multiple bases, so the .330 BA guy should bat after guys who get on base often (i.e. not first). Finally, ISO (SLG-BA) would be a better measure of power for this model, as it strips out singles and only considers extra-base hits. (For example, a guy with 40 hits (all singles) in 100 at bats has the same exact slugging percentage as a guy with 10 hits (all home runs) in 100 at bats, but the second guy clearly has more power.)
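The singles-versus-homers example is easy to check by hand, but a short sketch makes the difference between SLG and ISO concrete. The `HittingLine` class and its field names are hypothetical, invented for this illustration; the formulas are the standard ones (SLG is total bases per at bat, and ISO is SLG minus BA, which works out to extra bases per at bat).

```python
from dataclasses import dataclass


@dataclass
class HittingLine:
    """Hypothetical container for a hitter's at bats and hit types."""
    at_bats: int
    singles: int = 0
    doubles: int = 0
    triples: int = 0
    home_runs: int = 0

    def batting_average(self) -> float:
        hits = self.singles + self.doubles + self.triples + self.home_runs
        return hits / self.at_bats

    def slugging(self) -> float:
        total_bases = (self.singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats

    def iso(self) -> float:
        # SLG - BA simplifies to the extra bases beyond first, per at bat.
        extra_bases = self.doubles + 2 * self.triples + 3 * self.home_runs
        return extra_bases / self.at_bats


# The two hitters from the example: 40 singles vs. 10 home runs in 100 at bats.
singles_hitter = HittingLine(at_bats=100, singles=40)
slugger = HittingLine(at_bats=100, home_runs=10)

for name, line in [("Singles hitter", singles_hitter), ("Slugger", slugger)]:
    print(f"{name}: BA {line.batting_average():.3f}, "
          f"SLG {line.slugging():.3f}, ISO {line.iso():.3f}")
```

Both hitters slug .400, but the singles hitter has an ISO of .000 while the slugger's is .300, which is the distinction a lineup model built only on OBP and SLG can't see.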
The simulation also assumes normal sequencing. This means that the team won’t have abnormally high or low cluster luck. Cluster luck is when a team’s positive outcomes (hits, walks, ROE, etc.) all land within the same inning or innings. For an extreme example to illustrate the point, 2 teams might each have 9 hits. One of the teams gets exactly 1 hit each inning while the other gets all 9 hits in just one of the innings. The second team will obviously score more runs than the first. Cluster luck is not a repeatable skill; it is a function of chance. It also covers when extra-base hits (and specifically home runs) occur. Unlucky teams will hit a lot of their home runs in front of walks and singles, while lucky teams will hit most of their homers after men reach base via walks or hits. Two teams that have the exact same offensive events can distribute them over the course of the season in a way that results in drastically different runs-scored totals. Over multiple seasons this should even out, but over 1 season, and especially 1 game, teams can be especially lucky or unlucky. Additionally, the inputs are assumptions themselves: they are projections, not a guarantee of what will actually happen.
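The 9-hit example can be sketched with a deliberately crude model. The assumptions here are mine, not the simulator's: every hit is a single, every runner advances exactly one base per single, and outs never move or erase a runner. The function name is made up for this illustration.

```python
def runs_from_singles(singles_in_inning: int) -> int:
    """Runs scored in one inning under a station-to-station model.

    Simplifying assumptions: every hit is a single, each single moves
    every runner up exactly one base, and outs don't affect the runners.
    """
    bases = [False, False, False]  # occupied flags for first, second, third
    runs = 0
    for _ in range(singles_in_inning):
        if bases[2]:
            runs += 1  # the runner on third scores
        # The batter takes first; everyone else moves up one base.
        bases = [True, bases[0], bases[1]]
    return runs


# Two teams with 9 hits each over 9 innings.
spread_out = [1] * 9        # exactly one hit in every inning
clustered = [9] + [0] * 8   # all nine hits in a single inning

runs_spread = sum(runs_from_singles(hits) for hits in spread_out)
runs_clustered = sum(runs_from_singles(hits) for hits in clustered)

print(f"One hit per inning:     {runs_spread} runs")
print(f"Nine hits in one inning: {runs_clustered} runs")
```

In this toy model the spread-out team is shut out while the clustered team scores 6 runs on the identical hit total. Real games are messier, but the direction of the effect is the same.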
This simulator allows us to ascertain if the optimal Yankee lineup resembles the actual Yankee lineup. There’s a very strong chance that Joe Girardi will create a lineup that is not optimal. Yet his arrangement will be far from the worst possible and the Yankees should have a decent offense regardless of what the order of hitters looks like.