MLB Player Performance Index

For this project, I’m turning to the 2022 Lahman baseball database to develop an offensive score rating for players with over 4,000 at-bats. 

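Note that this data is updated through the 2022 season. Future Hall of Famers such as Miguel Cabrera and Albert Pujols were still active during this period. Pujols retired after the 2022 season, so his score here covers his full career, while Cabrera’s final season in 2023 is not included and could still change his score.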

The process of creating an offensive score rating is quite an interesting one, although not always straightforward. It’s a great way to learn more about various players, compare them with others from their era, and maybe even get a sense of their Hall of Fame prospects. It’s a bit of a fun exercise that offers some neat insights into the world of baseball statistics.

How did I create my score?

  1. To begin, I focused the dataset exclusively on position players who have recorded over 4,000 at-bats (removing players linked to steroid use), resulting in 1,233 players that we will rank.
  2. Each player’s performance in various offensive categories was then evaluated and converted into percentiles to standardize comparisons.
  3. I narrowed down the key categories to seven crucial ones: 1) home runs, 2) batting average, 3) runs batted in, 4) hits, 5) at-bats, 6) walks, and 7) strikeouts. I combined these percentiles for each player, applying different weights to each category based on their perceived significance in the game.
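The percentile step can be sketched roughly as follows. This is a minimal illustration, not the code behind the post: the function name `percentile_ranks`, the tie handling, and the rounding rule are all assumptions. It follows the convention used in the rest of this post, where the 1st percentile is the best group, and it treats fewer strikeouts as better.

```python
def percentile_ranks(values, higher_is_better=True):
    """Map each value to a percentile from 1 to 100, where 1 is the best group.

    Assumptions: tied values share the better rank, and rank r out of n
    maps to ceil(100 * r / n). The original analysis may bin differently.
    """
    n = len(values)
    ordered = sorted(values, reverse=higher_is_better)
    best_rank = {}
    for position, value in enumerate(ordered, start=1):
        best_rank.setdefault(value, position)  # keep the first (best) position
    # Integer ceiling division avoids floating-point rounding surprises.
    return [(100 * best_rank[v] + n - 1) // n for v in values]


# Career totals for five players, taken from the tables in this post.
home_runs = {"Ruth": 714, "Aaron": 755, "Mays": 660, "Musial": 475, "Hornsby": 301}
strikeouts = {"Ruth": 1330, "Aaron": 1383, "Mays": 1526, "Musial": 696, "Hornsby": 679}

hr_pct = dict(zip(home_runs, percentile_ranks(list(home_runs.values()))))
so_pct = dict(zip(strikeouts, percentile_ranks(list(strikeouts.values()),
                                               higher_is_better=False)))
print(hr_pct)
print(so_pct)
```

With only five players the percentiles come out in steps of 20. Across the full set of 1,233 players, the same rule would place the top dozen or so players in each category in the 1st percentile.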

Here is an overview of the weighting system that I used.

Category | Weight
Home Runs | 5
Average | 5
Runs Batted In | 4
Hits | 4
At Bats | 3
Base on Balls | 2
Strike Outs | 1

Here is an example of how I computed the overall score for Babe Ruth.

First, I calculate where Babe Ruth ranks in percentiles among all hitters with more than 4,000 at-bats, where the 1st percentile is the best. Ruth excelled, landing in the top percentile for home runs, batting average, runs batted in, and walks. However, as is common with heavy hitters, he also struck out often, placing him in the 90th percentile for strikeouts, which lowers his overall rating.

Player | Home Runs | Average | Runs Batted In | Hits | At Bats | Base on Balls | Strike Outs
Babe Ruth | 1 | 1 | 1 | 4 | 9 | 1 | 90

I then subtract each percentile from 100 (as we are used to seeing higher scores for better players) and multiply the result by the category weight described above. For home runs, for example, that is (100 - 1) × 5 = 495. This gives us the following for Babe Ruth.

Player | Home Runs | Average | Runs Batted In | Hits | At Bats | Base on Balls | Strike Outs
Babe Ruth | 495 | 495 | 396 | 384 | 273 | 198 | 10

The sum of these numbers is 2,251, the overall score we give Babe Ruth.
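The arithmetic for Babe Ruth can be reproduced in a few lines. This is a sketch of the calculation described above, not the original code; the names `WEIGHTS` and `offense_score` are made up for illustration, and the percentile row is read as 1, 1, 1, 4, 9, 1, 90.

```python
# Category weights from the weighting table.
WEIGHTS = {"HR": 5, "AVG": 5, "RBI": 4, "H": 4, "AB": 3, "BB": 2, "SO": 1}

# Babe Ruth's percentiles (1 is best), as listed in the example.
ruth_percentiles = {"HR": 1, "AVG": 1, "RBI": 1, "H": 4, "AB": 9, "BB": 1, "SO": 90}


def offense_score(percentiles, weights=WEIGHTS):
    """Sum of (100 - percentile) * weight over all categories."""
    return sum((100 - percentiles[cat]) * w for cat, w in weights.items())


ruth_components = {cat: (100 - p) * WEIGHTS[cat] for cat, p in ruth_percentiles.items()}
print(ruth_components)                  # per-category contributions
print(offense_score(ruth_percentiles))  # 2251

# Highest attainable score: 1st percentile in every category.
max_possible = offense_score({cat: 1 for cat in WEIGHTS})
print(max_possible)                     # 2376
```

Since the weights add up to 24, a player in the 1st percentile of every category would score 99 × 24 = 2,376, which gives a sense of scale for the scores that follow.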


My scoring system is designed to highlight players from the live ball era who demonstrated enduring excellence over their careers. It tends to favor those with lengthy, consistent careers, meaning a player who has sustained solid performance over 20 years is likely to rank higher than one who had a stellar career over a decade. Additionally, in an effort to maintain the integrity of the metrics, I’ve chosen to exclude players whose careers are associated with performance-enhancing drugs (PEDs).

The summary statistics on my dataset look like the following.

Maximum Score | 2,315
Minimum Score | 114
Range | 2,201
Average | 1,209
Standard Deviation | 497

Overall, first place goes to Stan Musial with a score of 2,315, and last place goes to Red Dooin, a catcher in the dead ball era, with a score of 114.

Reviewing the top ten players by offensive score, we see a list of all-time greats. My data runs through 2022, so it does not include Miguel Cabrera’s final season in 2023, which could still change his place on this list.

Rank | Score | Name | Position | Years Active | HR | AVG | RBI | H | AB | BB | SO
1 | 2,315 | Stan Musial | First Baseman | 1941 – 1963 | 475 | 0.330 | 1,951 | 3,630 | 10,972 | 1,599 | 696
2 | 2,261 | Ted Williams | Left Fielder | 1939 – 1960 | 521 | 0.344 | 1,839 | 2,654 | 7,706 | 2,021 | 709
3 | 2,260 | Lou Gehrig | First Baseman | 1923 – 1939 | 493 | 0.340 | 1,995 | 2,721 | 8,001 | 1,508 | 790
4 | 2,251 | Babe Ruth | Right Fielder | 1914 – 1935 | 714 | 0.342 | 2,217 | 2,873 | 8,398 | 2,062 | 1,330
5 | 2,239 | Mel Ott | Right Fielder | 1926 – 1947 | 511 | 0.304 | 1,860 | 2,876 | 9,456 | 1,708 | 896
6 | 2,238 | Hank Aaron | Right Fielder | 1954 – 1976 | 755 | 0.304 | 2,297 | 3,771 | 12,364 | 1,402 | 1,383
7 | 2,226 | Jimmie Foxx | First Baseman | 1925 – 1945 | 534 | 0.325 | 1,922 | 2,646 | 8,134 | 1,452 | 1,311
8 | 2,222 | Willie Mays | Center Fielder | 1951 – 1973 | 660 | 0.301 | 1,903 | 3,283 | 10,881 | 1,464 | 1,526
9 | 2,221 | Miguel Cabrera | First Baseman | 2003 – 2022* | 507 | 0.308 | 1,847 | 3,088 | 10,022 | 1,227 | 2,031
10 | 2,220 | Rogers Hornsby | Second Baseman | 1915 – 1937 | 301 | 0.358 | 1,584 | 2,930 | 8,173 | 1,038 | 679

From my previous analysis that reviews offensive categories by year, we determined that runs are trending ever so slightly downward, and home runs and strikeouts are increasing. By analyzing Hall of Fame inductees by their offense score and their last year playing Major League baseball, we gain valuable insights into evolving standards for Hall of Fame selection. This approach allows us to assess whether the bar for induction is being raised over time, offering a unique perspective on the criteria and benchmarks that define a Hall of Fame-worthy career.

From the following scatter plot showing only Hall of Fame inductees, we get a correlation coefficient of 0.40 between offense score and final season. Although this is not a strong correlation, the offensive output required to get elected appears to be increasing. Here, I have also limited the data set to Hall of Famers inducted as position players, excluding pitchers and those elected for their managerial roles. This removed Joe Torre, Al Lopez, Billy Southworth, Charlie Comiskey, Wilbert Robinson, Casey Stengel, Miller Huggins, Bucky Harris, Ned Hanlon, and Leo Durocher, who all had over 4,000 at-bats during their careers but were selected for the Hall of Fame due to their successful managerial careers.
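For readers who want to see how such a correlation is computed, here is a small sketch of the Pearson coefficient in plain Python. The helper name `pearson_r` is an assumption, and the eight (final season, score) pairs are only a hand-picked subset of Hall of Famers from the tables in this post, so the result will not match the 0.40 obtained from the full set of inductees.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# (final season, offense score) for a few Hall of Famers listed in this post.
sample = [
    (1897, 2067),  # Cap Anson
    (1917, 2025),  # Honus Wagner
    (1935, 2251),  # Babe Ruth
    (1947, 1744),  # Ernie Lombardi
    (1963, 2315),  # Stan Musial
    (1976, 2238),  # Hank Aaron
    (1989, 1875),  # Mike Schmidt
    (2014, 2133),  # Derek Jeter
]
last_years = [year for year, _ in sample]
scores = [score for _, score in sample]
r_sample = pearson_r(last_years, scores)
print(round(r_sample, 2))
```

A value near 0 means no linear relationship, while values toward 1 mean that later final seasons go together with higher scores.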


If we review all players, and not just Hall of Fame inductees, who had over 4,000 at-bats during their career, we do not see an increase in their offense score compared to the era in which they played.

This graph works best in proper visualization software, where you can hover over the data points to see which players are not inducted into the Hall of Fame but appear to have an offense score similar to their peers.

We can also look at the distribution of our offense score and see it does resemble a normal distribution.


Let’s now look at players by position. 

Examining players by position offers us a unique lens through which to identify potential Hall of Fame candidates based on their career achievements. Additionally, this perspective sheds light on those who played in the dead ball era, a time when offensive accomplishments were more challenging to achieve. Recognizing the extraordinary feats of these players within the context of their era adds a valuable dimension to our appreciation of their skills and contributions to the game.

Catchers

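Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
48 | 2,046 | Mike Piazza | Y | 1992 – 2007 | 427 | 0.307 | 1,335 | 2,127 | 6,911 | 759 | 1,113
69 | 2,006 | Yogi Berra | Y | 1946 – 1965 | 358 | 0.284 | 1,430 | 2,150 | 7,555 | 704 | 414
74 | 1,995 | Ted Simmons | Y | 1968 – 1988 | 248 | 0.284 | 1,389 | 2,472 | 8,680 | 855 | 694
77 | 1,988 | Joe Torre | Y | 1960 – 1977 | 252 | 0.297 | 1,185 | 2,342 | 7,874 | 779 | 1,094
102 | 1,943 | Bill Dickey | Y | 1928 – 1946 | 202 | 0.312 | 1,209 | 1,969 | 6,300 | 678 | 289
137 | 1,875 | Gabby Hartnett | Y | 1922 – 1941 | 236 | 0.297 | 1,179 | 1,912 | 6,432 | 703 | 697
158 | 1,847 | Carlton Fisk | Y | 1969 – 1993 | 376 | 0.269 | 1,330 | 2,356 | 8,756 | 849 | 1,386
184 | 1,803 | Joe Mauer | N | 2004 – 2018 | 143 | 0.306 | 923 | 2,123 | 6,930 | 939 | 1,034
191 | 1,788 | Johnny Bench | Y | 1967 – 1983 | 389 | 0.267 | 1,376 | 2,048 | 7,658 | 891 | 1,278
213 | 1,744 | Ernie Lombardi | Y | 1931 – 1947 | 190 | 0.306 | 990 | 1,792 | 5,855 | 430 | 262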

First Baseman

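Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
1 | 2,315 | Stan Musial | Y | 1941 – 1963 | 475 | 0.330 | 1,951 | 3,630 | 10,972 | 1,599 | 696
3 | 2,260 | Lou Gehrig | Y | 1923 – 1939 | 493 | 0.340 | 1,995 | 2,721 | 8,001 | 1,508 | 790
7 | 2,226 | Jimmie Foxx | Y | 1925 – 1945 | 534 | 0.325 | 1,922 | 2,646 | 8,134 | 1,452 | 1,311
9 | 2,221 | Miguel Cabrera | N | 2003 – 2022* | 507 | 0.308 | 1,847 | 3,088 | 10,022 | 1,227 | 2,031
11 | 2,200 | Albert Pujols | N | 2001 – 2022* | 703 | 0.296 | 2,218 | 3,384 | 11,421 | 1,373 | 1,404
20 | 2,164 | Todd Helton | N | 1997 – 2013 | 369 | 0.316 | 1,406 | 2,519 | 7,962 | 1,335 | 1,175
24 | 2,134 | Eddie Murray | Y | 1977 – 1997 | 504 | 0.287 | 1,917 | 3,255 | 11,336 | 1,333 | 1,516
30 | 2,106 | Jeff Bagwell | Y | 1991 – 2005 | 449 | 0.296 | 1,529 | 2,314 | 7,797 | 1,401 | 1,558
38 | 2,072 | Johnny Mize | Y | 1936 – 1953 | 359 | 0.312 | 1,337 | 2,011 | 6,443 | 856 | 524
40 | 2,067 | Cap Anson | Y | 1871 – 1897 | 97 | 0.334 | 2,075 | 3,435 | 10,281 | 984 | 330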

Second Baseman

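Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
10 | 2,220 | Rogers Hornsby | Y | 1915 – 1937 | 301 | 0.358 | 1,584 | 2,930 | 8,173 | 1,038 | 679
21 | 2,147 | Charlie Gehringer | Y | 1924 – 1942 | 184 | 0.320 | 1,427 | 2,839 | 8,860 | 1,186 | 372
42 | 2,066 | Robinson Cano | N | 2005 – 2022 | 335 | 0.300 | 1,306 | 2,639 | 8,773 | 620 | 1,214
49 | 2,043 | Jeff Kent | N | 1992 – 2008 | 377 | 0.289 | 1,518 | 2,461 | 8,498 | 801 | 1,522
62 | 2,023 | Roberto Alomar | Y | 1988 – 2004 | 210 | 0.300 | 1,134 | 2,724 | 9,073 | 1,032 | 1,140
79 | 1,986 | Frankie Frisch | Y | 1919 – 1937 | 105 | 0.316 | 1,244 | 2,880 | 9,112 | 728 | 272
82 | 1,980 | Craig Biggio | Y | 1988 – 2007 | 291 | 0.281 | 1,175 | 3,060 | 10,876 | 1,160 | 1,753
101 | 1,943 | Nap Lajoie | Y | 1896 – 1916 | 82 | 0.338 | 1,599 | 3,243 | 9,590 | 516 | 321
103 | 1,941 | Eddie Collins | Y | 1906 – 1930 | 47 | 0.333 | 1,300 | 3,315 | 9,949 | 1,499 | 400
121 | 1,904 | Ryne Sandberg | Y | 1981 – 1997 | 282 | 0.284 | 1,061 | 2,386 | 8,385 | 761 | 1,260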

Third Baseman

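Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
12 | 2,199 | George Brett | Y | 1973 – 1993 | 317 | 0.304 | 1,596 | 3,154 | 10,349 | 1,096 | 908
13 | 2,188 | Chipper Jones | Y | 1993 – 2012 | 468 | 0.303 | 1,623 | 2,726 | 8,984 | 1,512 | 1,409
33 | 2,094 | Adrian Beltre | N | 1998 – 2018 | 477 | 0.286 | 1,707 | 3,166 | 11,068 | 848 | 1,732
84 | 1,975 | Wade Boggs | Y | 1982 – 1999 | 118 | 0.327 | 1,014 | 3,010 | 9,180 | 1,412 | 745
99 | 1,950 | Aramis Ramirez | N | 1998 – 2015 | 386 | 0.283 | 1,417 | 2,303 | 8,136 | 633 | 1,238
110 | 1,928 | Ron Santo | Y | 1960 – 1974 | 342 | 0.276 | 1,331 | 2,254 | 8,143 | 1,108 | 1,343
117 | 1,918 | Eddie Mathews | Y | 1952 – 1968 | 512 | 0.271 | 1,453 | 2,315 | 8,537 | 1,444 | 1,487
123 | 1,900 | Ken Boyer | N | 1955 – 1969 | 282 | 0.287 | 1,141 | 2,143 | 7,455 | 713 | 1,017
136 | 1,878 | Scott Rolen | Y | 1996 – 2012 | 316 | 0.280 | 1,287 | 2,077 | 7,398 | 899 | 1,410
139 | 1,875 | Mike Schmidt | Y | 1972 – 1989 | 548 | 0.267 | 1,595 | 2,234 | 8,352 | 1,507 | 1,883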

Shortstop

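Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
25 | 2,133 | Derek Jeter | Y | 1995 – 2014 | 260 | 0.309 | 1,311 | 3,465 | 11,195 | 1,082 | 1,840
59 | 2,025 | Honus Wagner | Y | 1897 – 1917 | 101 | 0.327 | 1,733 | 3,420 | 10,439 | 963 | 735
61 | 2,025 | Cal Ripken | Y | 1981 – 2001 | 431 | 0.275 | 1,695 | 3,184 | 11,551 | 1,129 | 1,305
65 | 2,014 | Robin Yount | Y | 1974 – 1993 | 251 | 0.285 | 1,406 | 3,142 | 11,008 | 966 | 1,350
71 | 2,001 | Joe Cronin | Y | 1926 – 1945 | 170 | 0.301 | 1,424 | 2,285 | 7,579 | 1,059 | 700
98 | 1,952 | Julio Franco | N | 1982 – 2007 | 173 | 0.298 | 1,194 | 2,586 | 8,677 | 917 | 1,341
118 | 1,915 | Barry Larkin | Y | 1986 – 2004 | 198 | 0.294 | 960 | 2,340 | 7,937 | 939 | 817
146 | 1,868 | Michael Young | N | 2000 – 2013 | 185 | 0.299 | 1,030 | 2,375 | 7,918 | 575 | 1,235
157 | 1,847 | Alan Trammell | Y | 1977 – 1996 | 185 | 0.285 | 1,003 | 2,365 | 8,288 | 850 | 874
161 | 1,845 | George Davis | Y | 1890 – 1909 | 73 | 0.294 | 1,440 | 2,665 | 9,045 | 874 | 613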

Left Field

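Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
2 | 2,261 | Ted Williams | Y | 1939 – 1960 | 521 | 0.344 | 1,839 | 2,654 | 7,706 | 2,021 | 709
16 | 2,170 | Al Simmons | Y | 1924 – 1944 | 307 | 0.334 | 1,827 | 2,927 | 8,759 | 615 | 737
18 | 2,166 | Goose Goslin | Y | 1921 – 1938 | 248 | 0.315 | 1,609 | 2,735 | 8,656 | 949 | 585
26 | 2,125 | Carl Yastrzemski | Y | 1961 – 1983 | 452 | 0.285 | 1,844 | 3,419 | 11,988 | 1,845 | 1,393
27 | 2,119 | Billy Williams | Y | 1959 – 1976 | 426 | 0.289 | 1,475 | 2,711 | 9,350 | 1,045 | 1,046
43 | 2,065 | Jim Rice | Y | 1974 – 1989 | 382 | 0.298 | 1,451 | 2,452 | 8,225 | 670 | 1,423
52 | 2,037 | Luis Gonzalez | N | 1990 – 2008 | 354 | 0.282 | 1,439 | 2,591 | 9,157 | 1,155 | 1,218
56 | 2,029 | Moises Alou | N | 1990 – 2008 | 332 | 0.303 | 1,287 | 2,134 | 7,037 | 737 | 894
75 | 1,989 | Zack Wheat | Y | 1909 – 1927 | 132 | 0.316 | 1,248 | 2,884 | 9,106 | 650 | 572
76 | 1,989 | Bob Johnson | N | 1933 – 1945 | 288 | 0.296 | 1,283 | 2,051 | 6,920 | 1,075 | 851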

Center Field

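Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
8 | 2,222 | Willie Mays | Y | 1951 – 1973 | 660 | 0.301 | 1,903 | 3,283 | 10,881 | 1,464 | 1,526
22 | 2,147 | Joe DiMaggio | Y | 1936 – 1951 | 361 | 0.324 | 1,537 | 2,214 | 6,821 | 790 | 369
23 | 2,135 | Mickey Mantle | Y | 1951 – 1968 | 536 | 0.298 | 1,509 | 2,415 | 8,102 | 1,733 | 1,710
28 | 2,110 | Tris Speaker | Y | 1907 – 1928 | 117 | 0.344 | 1,529 | 3,514 | 10,195 | 1,381 | 323
34 | 2,093 | Ty Cobb | Y | 1905 – 1928 | 117 | 0.366 | 1,944 | 4,189 | 11,436 | 1,249 | 608
35 | 2,092 | Ken Griffey | Y | 1989 – 2010 | 630 | 0.283 | 1,836 | 2,781 | 9,801 | 1,312 | 1,779
54 | 2,033 | Bernie Williams | N | 1991 – 2006 | 287 | 0.296 | 1,257 | 2,336 | 7,869 | 1,069 | 1,212
57 | 2,027 | Duke Snider | Y | 1947 – 1964 | 407 | 0.295 | 1,333 | 2,116 | 7,161 | 971 | 1,237
58 | 2,026 | Carlos Beltran | N | 1998 – 2017 | 435 | 0.278 | 1,587 | 2,725 | 9,768 | 1,084 | 1,795
67 | 2,012 | Al Oliver | N | 1968 – 1985 | 219 | 0.303 | 1,326 | 2,743 | 9,049 | 535 | 756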

Right Field

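Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
4 | 2,251 | Babe Ruth | Y | 1914 – 1935 | 714 | 0.342 | 2,217 | 2,873 | 8,398 | 2,062 | 1,330
5 | 2,239 | Mel Ott | Y | 1926 – 1947 | 511 | 0.304 | 1,860 | 2,876 | 9,456 | 1,708 | 896
6 | 2,238 | Hank Aaron | Y | 1954 – 1976 | 755 | 0.304 | 2,297 | 3,771 | 12,364 | 1,402 | 1,383
14 | 2,187 | Al Kaline | Y | 1953 – 1974 | 399 | 0.297 | 1,583 | 3,007 | 10,116 | 1,277 | 1,020
15 | 2,171 | Frank Robinson | Y | 1956 – 1976 | 586 | 0.294 | 1,812 | 2,943 | 10,006 | 1,420 | 1,532
17 | 2,170 | Vladimir Guerrero | Y | 1996 – 2011 | 449 | 0.317 | 1,496 | 2,590 | 8,155 | 737 | 985
32 | 2,097 | Harry Heilmann | Y | 1914 – 1932 | 183 | 0.341 | 1,539 | 2,660 | 7,787 | 856 | 550
36 | 2,086 | Dave Winfield | Y | 1973 – 1995 | 465 | 0.282 | 1,833 | 3,110 | 11,003 | 1,216 | 1,686
39 | 2,069 | Larry Walker | Y | 1989 – 2005 | 383 | 0.312 | 1,311 | 2,160 | 6,907 | 913 | 1,231
41 | 2,066 | Roberto Clemente | Y | 1955 – 1972 | 240 | 0.317 | 1,305 | 3,000 | 9,454 | 621 | 1,230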

Designated Hitter

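Rank | Score | Player | HOF | Years Active | HR | AVG | RBI | H | AB | BB | SO
19 | 2,166 | Frank Thomas | Y | 1990 – 2008 | 521 | 0.301 | 1,704 | 2,468 | 8,199 | 1,667 | 1,397
29 | 2,109 | Paul Molitor | Y | 1978 – 1998 | 234 | 0.306 | 1,307 | 3,319 | 10,835 | 1,094 | 1,244
31 | 2,105 | Harold Baines | Y | 1980 – 2001 | 384 | 0.289 | 1,628 | 2,866 | 9,908 | 1,062 | 1,441
37 | 2,076 | Edgar Martinez | Y | 1987 – 2004 | 309 | 0.311 | 1,261 | 2,247 | 7,213 | 1,283 | 1,202
104 | 1,941 | Victor Martinez | N | 2002 – 2018 | 246 | 0.295 | 1,178 | 2,153 | 7,297 | 730 | 891
112 | 1,924 | Chili Davis | N | 1981 – 1999 | 350 | 0.274 | 1,372 | 2,380 | 8,673 | 1,194 | 1,698
171 | 1,820 | Hal McRae | N | 1968 – 1987 | 191 | 0.289 | 1,097 | 2,091 | 7,218 | 648 | 779
174 | 1,816 | Nelson Cruz | N | 2005 – 2022 | 459 | 0.274 | 1,302 | 2,018 | 7,358 | 732 | 1,870
220 | 1,735 | Brian Downing | N | 1973 – 1992 | 275 | 0.267 | 1,073 | 2,099 | 7,853 | 1,197 | 1,127
232 | 1,717 | Don Baylor | N | 1970 – 1988 | 338 | 0.260 | 1,276 | 2,135 | 8,198 | 805 | 1,069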

Summary

If we add up the top 10 scores for each position, we can get an idea of which positions have produced the strongest offensive players.

Position | Score
First Base | 21,765
Right Field | 21,574
Left Field | 20,950
Center Field | 20,897
Second Base | 20,253
Third Base | 19,905
Shortstop | 19,625
Catcher | 19,035
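These totals can be checked directly against the position tables. The sketch below sums the ten listed scores for three of the positions; the dictionary layout and variable names are only illustrative and are not taken from the original code.

```python
# Top-10 offense scores per position, copied from the position tables.
top10_scores = {
    "First Base": [2315, 2260, 2226, 2221, 2200, 2164, 2134, 2106, 2072, 2067],
    "Right Field": [2251, 2239, 2238, 2187, 2171, 2170, 2097, 2086, 2069, 2066],
    "Catcher": [2046, 2006, 1995, 1988, 1943, 1875, 1847, 1803, 1788, 1744],
}

# Sum the ten best scores at each position, then order positions by total.
totals = {pos: sum(sorted(scores, reverse=True)[:10])
          for pos, scores in top10_scores.items()}
ranking = sorted(totals.items(), key=lambda item: item[1], reverse=True)

for position, total in ranking:
    print(f"{position}: {total:,}")
```

The gap between first base and catcher is 21,765 - 19,035 = 2,730 points, or about 273 points per player across the top ten.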

Next, we can graph a cumulative summary of the First Base and Right Field positions to determine which has the higher offensive output across all MLB players with over 4,000 at-bats. In the cumulative summary graph, where player rank is on the x-axis, first basemen show the higher offensive output, but is the difference statistically significant?

Placing the individual offense scores of first basemen and right fielders into an online t-test calculator, we see no significant difference between the positions. As the calculator reports, “By conventional criteria, this difference is considered to be not quite statistically significant.”

The P-value is used to determine the statistical significance of the results from a hypothesis test. It represents the probability of observing a difference at least as large as the one measured, assuming the null hypothesis (no true difference between the positions) is true. For a 99% confidence level, you would typically use a significance level (alpha) of 0.01.

In the t-test we performed, the P-value obtained was approximately 0.066. To determine if the result is statistically significant at the 99% confidence level:

  • If the P-value is less than or equal to 0.01, the result is statistically significant at the 99% confidence level (meaning there is sufficient evidence to reject the null hypothesis at this level).
  • If the P-value is greater than 0.01, the result is not statistically significant at the 99% confidence level (meaning there is not enough evidence to reject the null hypothesis at this level).

Since the P-value (0.066) is greater than 0.01, the result is not statistically significant at the 99% confidence level. We therefore fail to reject the null hypothesis: the observed difference in offense scores could plausibly be due to random chance rather than a true difference between the groups. The same holds at the more conventional 95% level, since 0.066 is also greater than 0.05.
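As a rough illustration of what a t-test calculator does, the sketch below computes Welch's t statistic and its degrees of freedom using only the Python standard library, then applies the decision rule to the reported P-value. Several things here are assumptions: the post does not say which t-test variant the online calculator used, the function names are made up, and the two samples are just the top-10 scores from the position tables rather than the full groups behind the 0.066 result. Turning the statistic into a P-value requires the t distribution, for which a library call such as `scipy.stats.ttest_ind(a, b, equal_var=False)` is the usual route.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    var_a = variance(a) / len(a)  # squared standard error of mean(a)
    var_b = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t, df


def is_significant(p_value, alpha):
    """Reject the null hypothesis when the P-value is at or below alpha."""
    return p_value <= alpha


# Illustrative samples only: top-10 scores from the position tables.
first_base = [2315, 2260, 2226, 2221, 2200, 2164, 2134, 2106, 2072, 2067]
right_field = [2251, 2239, 2238, 2187, 2171, 2170, 2097, 2086, 2069, 2066]

t_stat, dof = welch_t(first_base, right_field)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")

# Decision rule applied to the P-value reported in this post.
for alpha in (0.01, 0.05, 0.10):
    print(alpha, is_significant(0.066, alpha))
```

With a P-value of 0.066, the difference would only count as significant at a 90% confidence level (alpha of 0.10), which is a much weaker standard than the 99% level used here.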


Conclusion

Evaluating Hall of Fame eligibility is far from a precise science. This exercise in devising my own evaluation system has been immensely engaging, involving countless hours of meticulous review, fine-tuning, and the discovery of players previously unknown to me.

My analysis leverages the comprehensive Lahman baseball dataset. If you are interested in the code used for this analysis, please feel free to send me a message; I’m more than happy to share my codebase with anyone interested in exploring it further. Be aware that the data handling involved a significant amount of aggregation, pivoting, cleaning, and calculating.

Happy Coding!