Thursday, January 2, 2014

God and a counter-intuitive baseball statistic


(A reprint from my previous blog--with a few mods.)

God and Baseball Statistics

In the six days of sports creation, God created sports successively closer and closer to the perfect divine image. To be precise:

Day 1: Basketball (Intended for the Nephilim, to keep their minds off the daughters of men. Alas it didn't work, because the sport was too boring, the action constantly stopped by nitpicky fouls, and besides the daughters of men were hawt.)
Day 2: Soccer
Day 3: Real Football
Day 4: Hockey
Day 5: Baseball
Day 6: NASCAR

And on the seventh day he watched NASCAR. And it was very good. Except for Mark Martin hitting the wall in turn two.

A "Sports Theodicy" is an attempt to explain the puzzle of where figure skating, gymnastics and Formula One Racing came from, since God had nothing to do with these. He is never the author of sports that are highly feminized.

Though baseball is not the pinnacle of sports creation, it's darn close. And it has been given the special honor as the sport-most-holy in its conduciveness to statistical analysis.

We all know about batting average (BA). If you don't, well, in the words of that great American philosopher Foghorn Leghorn, "I say, there's just something yech about a boy who don't, I say don't like baseball." BA is simply the number of hits divided by the number of at bats. By divine fiat the number of significant digits shall always be kept at three. Never two or four, and five is just out of the question. And thou shalt omit the leading zero, lest thou be sentenced to be a Pittsburgh Pirate, er, Houston Astro fan.

So a player who has 207 hits in 611 at bats has a BA of .339.

Also by holy decree you shall multiply by 1000 before saying the BA. In the example above, you say "his average is three-thirty-nine" or, generically, "he is a three-hundred hitter." Surveys show that in the 1950s, 99.2% of all baseball fans could multiply by 1000 in their heads. In our advanced early 21st century era, thanks in no small part to all the money we have spent on education, 76% need a calculator to make the decimal point vanish.
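For the 76% who need a calculator, here is a minimal Python sketch of the two decrees above: three digits with no leading zero, and the multiply-by-1000 spoken form. The function names format_ba and spoken_points are made up for this illustration.

```python
def format_ba(hits: int, at_bats: int) -> str:
    """Batting average as three decimal digits with the leading zero omitted."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    text = f"{hits / at_bats:.3f}"  # e.g. '0.339'
    # Drop the leading zero; a perfect 1.000 keeps its leading digit.
    return text[1:] if text.startswith("0") else text


def spoken_points(hits: int, at_bats: int) -> int:
    """The average multiplied by 1000, as a fan would say it out loud."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return round(1000 * hits / at_bats)


# The example from the text: 207 hits in 611 at bats.
print(format_ba(207, 611))      # .339
print(spoken_points(207, 611))  # 339, said "three-thirty-nine"
```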

The counter-intuitive BABIP

A more interesting statistic is the batting average on balls in play (BABIP). For this statistic, you take the number of hits on balls put in play and divide by the number of balls put in play. Strikeouts and home runs are excluded, since in neither case does a fielder get a chance at the ball. Sacrifice flies, which do not count as at bats, are added back in because the ball was put in play. The formula is:

BABIP = (H – HR)/(AB – K – HR + SF)

where H is hits, HR is home runs, AB is at bats, K is strikeouts, and SF is sacrifice flies.

By comparison, the regular batting average is given by:

BA = H/AB
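The two formulas side by side, as a short Python sketch. The function names and the stat line in the example are invented for illustration; they do not belong to any real player.

```python
def batting_average(h: int, ab: int) -> float:
    """BA = H / AB."""
    if ab <= 0:
        raise ValueError("at bats must be positive")
    return h / ab


def babip(h: int, hr: int, ab: int, k: int, sf: int) -> float:
    """BABIP = (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


# Hypothetical season: 150 hits, 20 home runs, 500 at bats,
# 100 strikeouts, 5 sacrifice flies.
ba = batting_average(150, 500)
bip = babip(h=150, hr=20, ab=500, k=100, sf=5)
print(f"BA    = {ba:.3f}")
print(f"BABIP = {bip:.3f}")
```

In this made-up line the denominator shrinks from 500 at bats to 385 balls in play, while the numerator only drops from 150 to 130, which is why BABIP comes out above BA.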

The average BABIP is around .300. Usually, but not always, a hitter's BABIP is higher than his BA.

Here is where things get interesting. If you are a general manager and your team needs a hitter, you generally snag the one with the highest BA. But suppose there are two players available with the same BA but different BABIP. For example:

Bill Buckner: BA: .280, BABIP: .290
Omar Moreno: BA: .280, BABIP: .340

Which would you take? The counter-intuitive answer: take Buckner, the hitter with the lower BABIP.

Why?

Because it turns out that, to a good first approximation, once a batted ball is in play, whether or not it results in a safe hit is random. Does the ball go to where a defender ain't? So a BABIP below the average of .300 indicates a player who has, statistically speaking, been unlucky. His BA should be higher. Conversely, a player whose BABIP is higher than .300 has been lucky. His BA is artificially high.

Over time you expect the BA of a player with a high BABIP to drop, and the BA of a player with a low BABIP to rise.
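A rough Python sketch of that reasoning, applied to the two hitters above. The .300 average and the BABIP figures come from the text; the function name luck_read and the .005 tolerance are assumptions made up for this illustration, and the whole thing is only as good as the first approximation it rests on.

```python
LEAGUE_BABIP = 0.300  # the rough average quoted above


def luck_read(babip: float, league: float = LEAGUE_BABIP, tol: float = 0.005) -> str:
    """Classify a hitter by how far his BABIP sits from the league average.

    tol is a hypothetical dead band so tiny gaps are not over-read.
    """
    gap = babip - league
    if gap > tol:
        return "lucky"    # expect his BA to drop
    if gap < -tol:
        return "unlucky"  # expect his BA to rise
    return "about average"


# Same BA (.280), different BABIP, as in the example above.
hitters = {"Bill Buckner": 0.290, "Omar Moreno": 0.340}
for name, value in hitters.items():
    print(name, "->", luck_read(value))

# With BA tied, the first-approximation pick is the lower BABIP.
pick = min(hitters, key=hitters.get)
print("Take:", pick)
```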

So take Bill Buckner. Send Omar Moreno to AAA.
