## Friday, November 15, 2013

### Hitting streaks in the real world.

Question:  A previous post analyzed the likelihood that a hitter would have a 10-game hitting streak.  That post allowed the hitter's batting average against right-handed pitchers to differ from the batting average against left-handed pitchers.  However, at-bat outcomes against all left-handed pitching were assumed to be independent and identically distributed (iid), as were at-bat outcomes against all right-handed pitching.

Previous Post on Hitting Streaks:

How does the assumption that batting outcomes are independent and identically distributed affect the probability that a hitter has a 10-game hitting streak?

Do you anticipate that the actual likelihood of a hitting streak will be lower or higher than the estimate obtained under the assumption that outcomes of at-bats and outcomes across games are iid and binomially distributed?

Answer:  Pitchers vary in quality and effectiveness.  Hitters should get a disproportionately high number of hits in games where they face ineffective pitchers and a disproportionately low number of hits in games where they face relatively strong pitchers.  Hits therefore tend to bunch up in some games, leaving more hitless games than the iid model predicts.

I have posted some data that supports this hypothesis.

Over a 16-game stretch Tony Gwynn hit 0.347 and had four games with zero hits.  We compare this observed number of zero-hit games to the expected number of zero-hit games over a 16-game period.

What is the likelihood that a .347 hitter will have 0 hits in a game?

If at-bats are iid and binomially distributed, with 4 at-bats per game, this probability is (1 - 0.347)^4, which is about 0.18.
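The 0.18 figure can be checked with a short calculation. The sketch below assumes 4 at-bats per game (the assumption used in the earlier post); the function name `prob_hitless_game` is just an illustrative choice.

```python
def prob_hitless_game(batting_average: float, at_bats: int = 4) -> float:
    """Probability of zero hits in a game when every at-bat is an
    independent trial with the same hit probability (the iid assumption)."""
    return (1.0 - batting_average) ** at_bats


# A .347 hitter with an assumed 4 at-bats per game.
p_zero = prob_hitless_game(0.347)
print(round(p_zero, 2))  # 0.18
```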

The expected number of zero-hit games over a 16-game stretch (again assuming iid, binomially distributed at-bats) is 16 x 0.18, which is 2.88.  The expected number of zero-hit games is less than the actual number (2.88 < 4).
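The same comparison can be sketched in a few lines. Beyond the expected count, the snippet also computes the binomial probability of seeing four or more hitless games in 16, which gives a sense of how unusual the observed count would be under the iid model. The per-game value 0.18 is the figure used in this post; the helper `prob_at_least` is a hypothetical name.

```python
from math import comb

N_GAMES = 16
P_HITLESS = 0.18  # per-game probability of zero hits, as used in the post

# Expected number of hitless games under the iid model.
expected_hitless = N_GAMES * P_HITLESS


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


tail = prob_at_least(4, N_GAMES, P_HITLESS)
print(f"expected hitless games: {expected_hitless:.2f}")
print(f"P(at least 4 hitless games in 16): {tail:.3f}")
```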

This suggests that the actual likelihood of a 10-game streak is lower than the likelihood that one calculates based on the assumption that at-bats and games are binomial and iid.
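To see why varying pitcher quality pushes the streak probability down, here is a rough sketch. It compares the chance of getting a hit in each of 10 specified consecutive games under the iid model with a purely hypothetical two-pitcher model: in each game the hitter is equally likely to face a weak pitcher (against whom he bats .447) or a strong one (.247), so the overall average is still .347. Those two averages, and the 4 at-bats per game, are assumptions chosen for illustration, not data.

```python
AT_BATS = 4          # assumed at-bats per game
STREAK_LENGTH = 10   # hit in each of 10 specified consecutive games


def prob_hit_in_game(batting_average: float, at_bats: int = AT_BATS) -> float:
    """Probability of at least one hit in a game with iid at-bats."""
    return 1.0 - (1.0 - batting_average) ** at_bats


# iid model: the same .347 average in every game.
iid_streak = prob_hit_in_game(0.347) ** STREAK_LENGTH

# Hypothetical mixture: each game is against a weak pitcher (.447) or a
# strong pitcher (.247) with equal probability, independently per game.
game_hit_prob = 0.5 * prob_hit_in_game(0.447) + 0.5 * prob_hit_in_game(0.247)
mixed_streak = game_hit_prob**STREAK_LENGTH

print(f"iid model:            {iid_streak:.3f}")
print(f"varying pitcher model: {mixed_streak:.3f}")
```

With the same overall average, the mixed model yields the smaller streak probability, because the chance of a hitless game is a convex function of the batting average.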

Other assumptions:

The likelihood calculations in the post written yesterday are based on the assumption of 4 at-bats per game.  In some games the number of at-bats is larger than 4 and in others it is smaller.  This variability should result in a higher likelihood of no hits in some games and should decrease the likelihood of a streak.
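A small sketch of the at-bats effect follows. The distribution of at-bats per game used here (3, 4, or 5 with probabilities 0.25, 0.5, 0.25) is made up for illustration; it keeps the average at 4 at-bats so the comparison with the fixed-4 case is fair.

```python
BATTING_AVERAGE = 0.347

# Hypothetical distribution of at-bats per game (mean is still 4).
AT_BAT_DISTRIBUTION = {3: 0.25, 4: 0.50, 5: 0.25}

# Probability of a hitless game with exactly 4 at-bats every game.
fixed_hitless = (1.0 - BATTING_AVERAGE) ** 4

# Probability of a hitless game when the number of at-bats varies:
# average the per-game value over the assumed distribution.
variable_hitless = sum(
    weight * (1.0 - BATTING_AVERAGE) ** at_bats
    for at_bats, weight in AT_BAT_DISTRIBUTION.items()
)

print(f"fixed 4 at-bats:   {fixed_hitless:.4f}")
print(f"variable at-bats:  {variable_hitless:.4f}")
```

Under these assumed numbers the variable case gives a higher chance of a hitless game, which in turn lowers the chance of a streak.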

Actually, only 4 games with no hits over a 16-game period is pretty good.  This is probably because Tony Gwynn was a clutch hitter.  A .347 batting average is astounding, but not all .347 hitters are equal.  There are few players I would rather have at the plate in a situation where my team is down by a run late in the game.

Author's Note:

This problem first appeared in my book Statistical Applications of Baseball, published in 1996.  It is available at a very low price on Kindle.

https://www.amazon.com/Statistical-Applications-Baseball-Statistics-Sports-ebook/dp/B006M3PQWQ

Go back to baseball probability page by clicking here.

http://www.dailymathproblem.com/p/baseball.html