In my review of some recent sabermetric posts to SABR-L (By The Numbers, 2/99), I commented on one by Tom Ruane. He looked at the relative costs of strikeouts, groundouts, and flyouts. In part because he counted as outs the times that batters reached base on errors (and unsuccessful fielder's choices), he found that groundouts were the least costly. I implied in my commentary that this information could be useful in creating run scoring models.

Subsequently, Mr. Ruane supplied some data on how many times each hitter had reached base on error (henceforth ROE) from 1980 to 1998. I compared these data to other statistics to see if a model could be developed that would predict a hitter's ROE average.

I initially performed a multivariate regression analysis using the following factors, each on a per-at-bat basis: hits, doubles, triples, home runs, strikeouts, sacrifice flies, stolen bases, and grounded into double plays. I also used the hitter's batting side. Despite finding a strong relationship in a limited data set, I found almost no correlation between those factors and ROE once I ran the analysis on the full sample. After I eliminated all players with fewer than 1000 at bats, the correlation was .264.
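For readers who want to try this kind of analysis, here is a minimal sketch in Python of how a multiple correlation could be computed after dropping players with fewer than 1000 at bats. It is not the procedure or software I actually used. The player records, the field names, and the choice of only three predictors (strikeouts per at bat, stolen bases per at bat, and batting side) are all made up for illustration, and the helper functions `solve` and `multiple_r` are hypothetical names.

```python
from math import sqrt

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def multiple_r(rows, y):
    """Multiple correlation R from an ordinary least squares fit with an intercept."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, x)) for x in X]
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return sqrt(max(0.0, 1.0 - ss_res / ss_tot))

# Hypothetical career totals, invented for illustration only.
players = [
    {"ab": 4200, "so": 610, "sb": 40,  "right": 1, "roe": 66},
    {"ab": 3100, "so": 720, "sb": 12,  "right": 0, "roe": 31},
    {"ab": 5200, "so": 500, "sb": 210, "right": 1, "roe": 83},
    {"ab": 1500, "so": 310, "sb": 5,   "right": 0, "roe": 17},
    {"ab": 2800, "so": 390, "sb": 95,  "right": 0, "roe": 36},
    {"ab": 3600, "so": 800, "sb": 20,  "right": 1, "roe": 47},
    {"ab": 640,  "so": 90,  "sb": 3,   "right": 1, "roe": 12},  # dropped: under 1000 AB
]

qualified = [p for p in players if p["ab"] >= 1000]
rows = [(p["so"] / p["ab"], p["sb"] / p["ab"], float(p["right"])) for p in qualified]
roe_avg = [p["roe"] / p["ab"] for p in qualified]  # ROE per at bat

r = multiple_r(rows, roe_avg)
print(f"{len(qualified)} qualified players, multiple R = {r:.3f}")
```

With only a handful of invented players the resulting R means nothing; the point is only the shape of the calculation.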

I next divided players into groups with certain characteristics. One such group consisted of right-handed hitters who ground into double plays a lot, while another consisted of left-handed hitters who rarely ground into double plays. I also compared right-handed and left-handed home run hitters who strike out frequently. The average player in the sample reached base on error 14 times per 1000 at bats. The group expected to reach most often, right-handed ground ball hitters, averaged 15.5 ROE per 1000 at bats, while the uppercutting left-handers averaged 10.4. Using other groups, I found the difference between right-handed and left-handed hitters to be about 3 or 4 ROE per 1000 at bats. For every extra 100 strikeouts, a batter could be expected to ROE 1.5 fewer times. Speed also had a slight relationship; those stealing bases at three times the average rate had an ROE average .001 higher than normal, that is, one extra ROE per 1000 at bats. A similarly small, opposite relationship exists for slow runners. No relationship was apparent between grounding into double plays and ROE.
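The group comparison comes down to simple arithmetic, which could be sketched in Python as follows. The player totals below are invented, the function name `roe_per_1000` is hypothetical, and pooling each group's at bats (rather than averaging individual players' rates) is an assumption made for this example, not necessarily how the figures above were computed.

```python
from collections import defaultdict

def roe_per_1000(players, key):
    """Pool ROE and at bats within each group and return ROE per 1000 at bats."""
    totals = defaultdict(lambda: [0, 0])  # group -> [total ROE, total AB]
    for p in players:
        t = totals[key(p)]
        t[0] += p["roe"]
        t[1] += p["ab"]
    return {group: 1000 * roe / ab for group, (roe, ab) in totals.items()}

# Hypothetical players, invented for illustration only.
sample = [
    {"bats": "R", "ab": 2000, "roe": 30},
    {"bats": "R", "ab": 1000, "roe": 15},
    {"bats": "L", "ab": 1000, "roe": 10},
    {"bats": "L", "ab": 1000, "roe": 11},
]

rates = roe_per_1000(sample, key=lambda p: p["bats"])
for side, rate in sorted(rates.items()):
    print(f"{side}: {rate:.1f} ROE per 1000 at bats")
```

The `key` argument can be any function of a player record, so the same helper would serve for groups defined by strikeout rate, stolen base rate, or a combination of traits.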

Some other authors have looked at this question. In the 1984 edition of the __Bill James Baseball Abstract__, Mr. James studied how often Texas Rangers players reached base on error in the 1983 season. He concluded that right-handed batters ROE almost 30% more often than lefties, and fast runners ROE 12% more often than slow runners. In his 1986 __Abstract__, he reported that on the 1985 Mariners, right-handed hitters reached base on error just slightly more often than lefties, but fast runners made it 16% more often than slow runners. Mark Pankin, in his article "Subtle Aspects of the Game," used Project Scoresheet data for all major league games from 1984-1992. He found that fast runners and right-handed hitters reach base on errors more often. The advantage for righties overwhelmed the speed factor; slow right-handed batters reach base on errors more than fast lefties do.

In summary, just as evaluating fielders by fielding average is not very meaningful because the differences are so small, evaluating hitters by how often they reach base on error is not very meaningful either. If one is rating a hitter who seems to be at one extreme or the other in ROE, one should keep in mind that the hitter is a little more or less valuable than popular formulas such as runs created or linear weights would suggest.

Any comments or questions? E-mail me at CliffordBlau@yahoo.com