New Metrics for Leadoff Hitters

by Keith Glab,
January 1, 2007

When I evaluate and rank players, I tend to use batting runs.  The one area in which this metric doesn't work so well is evaluating leadoff hitters.  This is because the average run value of an event leading off an inning is quite different from its average run value in all situations.  For example, the average value of a home run is 1.4 runs, but any solo homer, including a leadoff one, is worth exactly one run.  The average value of a single is .47 runs, but if there's no one on base, it should be no more valuable than a walk at .33 (and actually, more runs are scored in innings led off with a walk than in those led off with a single).

Therefore, as I endeavor to evaluate some of the greatest leadoff seasons ever, I prominently use what I call Leadoff Adjusted Batting Runs.  LABR gives slightly more credit for the events that are more favorable for leadoff hitters.  First, I take one-fourth of the average of a leadoff batter's stolen bases and walks (one-eighth of their sum), and multiply that value by .15.  I get .15 from the notion that when no one is on base, a walk and a stolen base is exactly as valuable as a double (actually more so, since the next batter will receive pitchouts and meat fastballs with a speedster on first base).  The normal difference between a double (.78 runs) and a walk plus a steal (.33 + .30 = .63 runs) is .15, and it should be credited for a leadoff batter's excess plate appearances with no runners on base.  As the average leadoff batter will receive about one more leadoff plate appearance than another hitter, I divide the total by 4, crediting leadoff batters with only the extra leadoff plate appearances they get above what other batters normally do.

I then do something similar to penalize a leadoff batter for hitting too many homers with no one on base.  I subtract a leadoff homer's actual value of one run from an average homer's value of 1.4 runs and divide by four again so that I'm not penalizing leadoff guys for possible homers with runners on, which works out to .1 runs per homer.  Confused yet?  Suffice it to say that I'm giving leadoff hitters who get on base a little more credit than those who have lots of extra-base-hit potential (sorry Brady Anderson and Alfonso Soriano).  Here is the exact formula that I use:

LABR = ABR + 0.15*(BB+SB)/8 - 0.1*HR

The adjustment tends to be minor, but it gets as high as 3.6 runs for Rickey Henderson's stellar 1982 season.  It becomes quite significant when looking at differences in career values.
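As a rough illustration, the LABR formula can be written as a short Python function.  The function names are made up for this sketch, and the 1982 Henderson line used here (116 walks, 130 steals, 10 homers) is supplied from the public record rather than from this article, so check it against a reference before relying on it.

```python
def labr_adjustment(bb, sb, hr):
    """Leadoff adjustment in runs: 0.15*(BB+SB)/8 - 0.1*HR."""
    on_base_bonus = 0.15 * (bb + sb) / 8   # credit for walks and steals
    homer_penalty = 0.1 * hr               # (1.4 - 1.0) / 4 per home run
    return on_base_bonus - homer_penalty


def labr(abr, bb, sb, hr):
    """Leadoff Adjusted Batting Runs: batting runs plus the adjustment."""
    return abr + labr_adjustment(bb, sb, hr)


# Assumed stat line for Rickey Henderson, 1982 (not taken from the article).
print(round(labr_adjustment(bb=116, sb=130, hr=10), 1))  # 3.6
```

With those inputs the adjustment comes out to about 3.6 runs, in line with the figure quoted above.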

Some of the other things I look at when evaluating a leadoff season:

LOBP+ - A player's OBP when leading off an inning versus his overall OBP, expressed like OPS+.  Obviously, if Ichiro Suzuki is simply getting his walks because pitchers would rather face Randy Winn in a run-scoring situation, that's not quite as valuable as getting on base to start off the game.  (It turns out that a large percentage of Ichiro's walks are indeed intentional.)  Ideally, we would simply use leading-off-an-inning data for our LABR calculations as well, but we don't have stolen base info for that.  It would also be better to calculate LOBP+ by comparing LOBP with OBP when not leading off an inning, but the difference probably isn't worth the trouble.
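One plausible reading of "expressed like OPS+" is a simple index where 100 means the player reaches base exactly as often leading off an inning as he does overall.  The sketch below assumes that reading; the function name and the sample OBP values are hypothetical and do not describe any real player.

```python
def lobp_plus(lobp, obp):
    """Leadoff OBP as an index against overall OBP (100 = identical)."""
    return 100.0 * lobp / obp


# Hypothetical hitter: .380 OBP leading off an inning, .400 OBP overall.
print(round(lobp_plus(0.380, 0.400)))  # 95
```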

SBR - Stolen Base Runs.  This one's somewhat controversial.  The average run value of a stolen base is actually less than .3 runs.  Several sabermetricians decided to bump up the value to .3 because stolen base attempts tend to occur A) when a stolen base would have its greatest run value and B) when a run itself would have its greatest impact on a win outcome.  However, sabermetricians are notorious stolen base-haters, and they bumped up the value of a caught stealing disproportionately, so the actual SB% break-even point of approximately 63.5% was artificially pushed to over 70%.  That is why some people say that a player needs to steal three bases in every four attempts to help his team.

The stolen base weights that I use combine the extra weight for a stolen base with a caught stealing penalty that keeps the run relationship between the two consistent with their correct values.  As a result, my stolen base runs will be among the highest that you see.  The exact values that I use are .3 for a stolen base and -.52 for a caught stealing (a player with 16 steals in 25 attempts will be credited with .12 SBR for his 64% success rate).
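The weights above translate directly into code.  This is a minimal sketch with made-up function names; it reproduces the 16-for-25 example and shows the break-even success rate that the two weights imply.

```python
SB_VALUE = 0.30   # runs credited per stolen base
CS_VALUE = -0.52  # runs debited per caught stealing


def stolen_base_runs(sb, cs):
    """Stolen Base Runs under the .3 / -.52 weights."""
    return SB_VALUE * sb + CS_VALUE * cs


def break_even_rate():
    """Success rate at which steal attempts net exactly zero runs."""
    return -CS_VALUE / (SB_VALUE - CS_VALUE)


print(round(stolen_base_runs(sb=16, cs=9), 2))  # 0.12
print(round(break_even_rate(), 3))              # 0.634
```

The implied break-even of about 63.4% sits right at the roughly 63.5% figure mentioned earlier, well short of the three-in-four rule of thumb.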

R% - Runs Scored per Time on Base.  Asher first introduced me to this nifty little statistic and I quickly used it against him to extol the virtues of Fred Clarke, who ranks fourth all time in the metric.  More than anything, I use it to estimate the caught stealing totals for players for whom we do not have caught stealing data.  For example, Nap Lajoie swiped 380 bases in his career and scored 38.7% of the time when he reached base, while Sam Crawford nabbed 366 while scoring runs in just 35.7% of his chances on base.  We don't have caught stealing data for the two greats, but I would be willing to bet that Napoleon had a better SB%, even though R% favors a table-setter over a middle-of-the-order thumper.

But with leadoff batters, everyone's on more of an even keel.  Even when we have caught stealing data for players, R% is useful because it gives you a better idea of how good a baserunner someone is.  Is a player getting thrown out often trying to stretch singles into doubles?  He'll have a lower R%.  Is the player great at going first-to-third on a single?  Certainly, he'll score more often than a player who isn't.
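For readers who want to compute R% themselves, here is a minimal sketch.  The article does not spell out what counts as a time on base, so this assumes the common definition of hits plus walks plus hit-by-pitches; other definitions (for instance, ones that remove home runs) would give different numbers.  The stat line is hypothetical.

```python
def run_rate(runs, hits, walks, hbp=0):
    """R%: runs scored per time on base, as a percentage.

    Assumes times on base = H + BB + HBP, which is a guess at the
    definition, not something the article confirms.
    """
    times_on_base = hits + walks + hbp
    return 100.0 * runs / times_on_base


# Hypothetical season: 100 runs, 180 hits, 70 walks, 5 hit-by-pitches.
print(round(run_rate(runs=100, hits=180, walks=70, hbp=5), 1))  # 39.2
```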

There's still another problem with R%: it makes players on great offenses look like better baserunners than they really are.  How hard was it for Mickey Cochrane to score runs with Charlie Gehringer and Hank Greenberg hitting behind him in 1935?  But our next metric helps filter out this bias.

TR% - Percentage of his team's runs that the player himself scored.  In 1922, George Sisler scored 134 runs for the upstart St. Louis Browns, leading the American League.  In 1985, Tim Raines scored 115 runs for the mediocre Montreal Expos, finishing second to Dale Murphy for the league lead.  Who was the more productive table setter?  Look at it this way: Raines scored 18.2% of his team's run total for the year, while Sisler scored just 15.5% of the Browns' total.  This is a similar concept to Bill James' Win Shares system.  I feel that it helps provide context for team-dependent statistics like runs scored.
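The TR% comparison is simple division, sketched below.  The player run totals come from the paragraph above; the team totals (867 runs for the 1922 Browns, 633 for the 1985 Expos) are not given in this article and are supplied here as assumptions that happen to reproduce its percentages.

```python
def team_run_share(player_runs, team_runs):
    """TR%: share of the team's runs scored by one player, as a percentage."""
    return 100.0 * player_runs / team_runs


# Team totals below are assumed, not taken from the article.
sisler_1922 = team_run_share(player_runs=134, team_runs=867)
raines_1985 = team_run_share(player_runs=115, team_runs=633)

print(round(sisler_1922, 1))  # 15.5
print(round(raines_1985, 1))  # 18.2
```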


Okay, enough about the methodology.  Let's see it in action as we examine ten of the best leadoff seasons of all time.

Disagree with something? Got something to add? Wanna bring up something totally new? Keith resides in Chicago, Illinois and can be reached at