Thursday, October 20, 2011

An Interesting Case for Being Careful with Statistics

One of the difficulties of working with baseball statistics is that sometimes the devil is in the details. Failing to carefully consider how a statistic is computed can lead to misinterpreting its meaning. I have discussed this before in the misuse of "innings pitched" as a measure of opportunity rather than a measure of outs.

The example below is more complicated, but probably even more pernicious because the problem is less obvious. Let me start the conversation with a question. Which of the following two players would you rather have come to bat with runners in scoring position, given these career "slash" numbers with runners in scoring position*, AVG/OBP/SLG:

Player 1- .322/.383/.496
Player 2- .310/.527/.594

I think most of us would quickly choose Player 2 based on his superior power and OBP. In fact, it isn't really very close based on those numbers. But let me add these two non-standard stats:

Hits/Plate Appearances
Player 1- .278
Player 2- .200

Player 2 gets a hit in 1 of every 5 plate appearances with runners in scoring position, while Player 1 gets a hit in more than 1 of every 4. A similar thing happens when you look at total bases:

Total Bases/Plate Appearances
Player 1- .429
Player 2- .384

So which one would you rather have heading to the plate with runners in scoring position now? I think it's Player 1, and not particularly close. What is happening is pretty obvious: Player 2 is walking a lot. Walks count as plate appearances but not as at bats, so they shrink the denominator of both AVG and SLG and push both numbers very high. Certainly an argument can be made that all those walks have value, even with runners on base. But I think what you are really looking for in that situation is a hit, not a walk.
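To make the arithmetic concrete, here is a minimal Python sketch of the same effect. The two batting lines are hypothetical, made up for illustration rather than taken from either player's record, and the `rates` helper is a simplification that treats at bats as plate appearances minus walks, ignoring hit-by-pitch and sacrifices.

```python
def rates(pa, hits, total_bases, walks):
    """Return per-at-bat and per-plate-appearance rates for one batting line.

    Simplifying assumption: at bats = plate appearances - walks
    (hit-by-pitch and sacrifices are ignored here).
    """
    ab = pa - walks
    return {
        "AVG": hits / ab,
        "SLG": total_bases / ab,
        "H/PA": hits / pa,
        "TB/PA": total_bases / pa,
    }


# Hypothetical lines, not real player data.
contact_hitter = rates(pa=100, hits=28, total_bases=43, walks=5)
patient_hitter = rates(pa=100, hits=20, total_bases=38, walks=35)

for name, line in (("contact", contact_hitter), ("patient", patient_hitter)):
    print(name, {stat: round(value, 3) for stat, value in line.items()})
```

With these made-up numbers the patient hitter comes out ahead in AVG and SLG even though he has fewer hits and fewer total bases in the same number of plate appearances. The only thing that changed is the denominator.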

BTW, Player 1 is Kirby Puckett and Player 2 is Barry Bonds. To some extent they are extreme examples. Bonds is way over the top in terms of walks, and Puckett swung at, and could hit, almost any pitch anywhere near the plate. But if someone tells you batting average measures "how often" a batter gets a hit, that isn't really true. And if someone suggests that AVG does not reflect walks, that isn't really true either. The impact is indirect, but it is sometimes significant.


The House said...

Your analysis is flawed. The reason Barry Bonds had so few hits per plate appearance with runners on base is that the opposition pitched around him in those situations.

In 2004, Bonds' statistics with RISP were insane. His RISP OBP was .754. Think about that. Three out of every four times Barry Bonds came to the plate with a runner in scoring position he DID NOT MAKE AN OUT.

His 2004 RISP BA was .394 with a slugging percentage of .944. His two-out, RISP (critical situations) BA was .420.

TT said...

You apparently missed the point. Batting AVG and SLG both exaggerate a player's contribution when they are inflated by large numbers of walks. Bonds' "insane" numbers are a lot less impressive than they appear.

Whatever the reason, in 2004 Bonds got 28 hits in 187 plate appearances with runners in scoring position, less than 1 in 6 times. There were 289 runners on base and he got 55 RBIs. When you adjust for his 11 home runs, he drove in only 44 of those 289 base runners.

Kirby Puckett in 1988 had 332 runners in scoring position and got 70 hits in 209 plate appearances, more than one in three. And, not surprisingly, he drove in 88 of those base runners.

The point here is not that Puckett was better than Bonds; I don't think he was. And you are right, the reason Bonds had such a high OBP was that he was intentionally walked 90 times with runners in scoring position.

But even when you take out the intentional walks, Bonds got a hit in only .288 of his plate appearances with runners in scoring position, while Puckett got a hit in .335 of his plate appearances.
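For anyone who wants to check the arithmetic, a short Python sketch using only the figures quoted in this thread follows. The `hits_per_pa` helper is a made-up name for illustration. Puckett's intentional walks are not given above, so his rate is left unadjusted.

```python
def hits_per_pa(hits, plate_appearances, intentional_walks=0):
    """Hits per plate appearance, optionally excluding intentional walks."""
    return hits / (plate_appearances - intentional_walks)


# Figures as quoted in the comments above (with runners in scoring position).
bonds_2004 = hits_per_pa(28, 187)
bonds_2004_no_ibb = hits_per_pa(28, 187, intentional_walks=90)
puckett_1988 = hits_per_pa(70, 209)

print(f"Bonds 2004:              {bonds_2004:.3f}")
print(f"Bonds 2004, IBB removed: {bonds_2004_no_ibb:.3f}")
print(f"Puckett 1988:            {puckett_1988:.3f}")
```

Removing the intentional walks shrinks the denominator from 187 to 97 plate appearances, which is why Bonds' rate nearly doubles and still ends up below Puckett's.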

