Testing for home field advantage in
baseball with pooled data
Previously, we tested for the existence of home field
advantage for all MLB teams separately during the 2015 regular season. We found that 8 teams did significantly
better at home than on the road.
In this post we test whether the home win proportion, pooled across
all major league teams, is greater than 0.50.
We compare the result obtained by pooling data from all
major league teams over the entire 2015 regular season to tests on each team
separately.
We then discuss the limitations of statistical analysis with
pooled cross-sectional data.
Question: Below is information on the home win-loss
record for all major league teams in the 2015 regular season. What percent of games
played in the major leagues during the 2015 regular season did the home team
win?
Is this percentage significantly greater than 50%?
What do we learn from this test compared to the tests for
individual teams?
Win-Loss Records in Home Games

Team                 Observed Home Wins   Observed Home Losses
Toronto              53                   28
New York (AL)        45                   36
Baltimore            47                   31
Tampa Bay            42                   42
Boston               43                   38
Kansas City          51                   30
Minnesota            46                   35
Cleveland            39                   41
Chicago (AL)         40                   41
Detroit              38                   43
Texas                43                   38
Houston              53                   28
Los Angeles (AL)     49                   32
Seattle              36                   45
Oakland              34                   47
New York (NL)        49                   32
Washington           46                   35
Miami                41                   40
Atlanta              42                   39
Philadelphia         37                   44
Saint Louis          55                   26
Pittsburgh           53                   28
Chicago (NL)         49                   32
Milwaukee            34                   47
Cincinnati           34                   47
Los Angeles (NL)     55                   26
San Francisco        47                   34
Arizona              39                   42
San Diego            39                   42
Colorado             36                   45

Answer: There were 2429 total games played during the
2015 MLB regular season. The home team
won 1315 or 54.1% of these games.
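
The totals quoted in the answer can be checked by summing the table. The snippet below is a minimal sketch: the variable names are illustrative, and the (wins, losses) pairs are simply copied from the table in the order the teams are listed.

```python
# Home (wins, losses) for each of the 30 teams, copied from the table
# in the order listed. Team names are not needed to compute the totals.
home_records = [
    (53, 28), (45, 36), (47, 31), (42, 42), (43, 38),  # Toronto .. Boston
    (51, 30), (46, 35), (39, 41), (40, 41), (38, 43),  # Kansas City .. Detroit
    (43, 38), (53, 28), (49, 32), (36, 45), (34, 47),  # Texas .. Oakland
    (49, 32), (46, 35), (41, 40), (42, 39), (37, 44),  # New York .. Philadelphia
    (55, 26), (53, 28), (49, 32), (34, 47), (34, 47),  # Saint Louis .. Cincinnati
    (55, 26), (47, 34), (39, 42), (39, 42), (36, 45),  # Los Angeles .. Colorado
]

total_wins = sum(wins for wins, _ in home_records)
total_losses = sum(losses for _, losses in home_records)
total_games = total_wins + total_losses
home_win_share = total_wins / total_games

print(f"Home wins: {total_wins} of {total_games} games "
      f"({100 * home_win_share:.1f}%)")
```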
Is the home win percentage of 54.1% significantly greater than
50%?
Let’s conduct a one-tailed test with a significance level of
0.01.
The z cutoff for a
one-tailed test with a significance level of 0.01 is 2.33.
The z-statistic used to test whether the home win proportion p
is greater than 0.5 is
Z = (p - 0.5) / ((0.5 x (1 - 0.5)) / 2429)^{0.5}
Here the standard error in the denominator is computed under the null
hypothesis that the home win probability is 0.5.
The value of this test statistic is 4.1.
We reject the null hypothesis that the win likelihood for a
home team is 0.5 in favor of the alternative hypothesis that this probability
is greater than 0.5.
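
The test can be reproduced with the Python standard library. This is a sketch under stated assumptions rather than the exact calculation used for the post: it takes the totals 1315 and 2429 from the answer, uses the normal approximation to the binomial, and computes the standard error under the null value 0.5. The variable names are illustrative.

```python
import math
from statistics import NormalDist

home_wins = 1315   # total home wins, from the answer above
games = 2429       # total games, from the answer above
p0 = 0.5           # home win probability under the null hypothesis
alpha = 0.01       # one-tailed significance level

p_hat = home_wins / games

# Standard error of the sample proportion, evaluated under the null.
se_null = math.sqrt(p0 * (1 - p0) / games)
z = (p_hat - p0) / se_null

# One-tailed critical value and p-value from the standard normal.
z_crit = NormalDist().inv_cdf(1 - alpha)
p_value = 1 - NormalDist().cdf(z)
reject_null = z > z_crit

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, cutoff = {z_crit:.2f}")
print(f"one-tailed p-value = {p_value:.6f}, reject null: {reject_null}")
```

Because the statistic is well above the cutoff, the conclusion does not depend on whether the standard error is computed under the null or from the sample proportion.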
Notes: The result presented here suggests that
the relationship between having the home field and winning is strong.
After all, the probability of winning at home is 0.541, which means the probability of winning on the
road is 0.459.
The win probability is clearly significantly different from
0.50.
But let's remember the team-specific results are not nearly
as robust.
Only eight teams have a significantly higher win probability
at home than on the road. The home win
probability was not significantly higher than the road win probability for 22
of 30 MLB teams in the 2015 regular season.
This means that if you rely on results from a test or model
on pooled data, you will overestimate the probability of winning at home for most
teams and underestimate it for some of the teams.
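
One rough way to see this is to compare the pooled home win rate with each team's observed home win rate. The sketch below makes a simplifying assumption that is not part of the tests described above: it treats a team's observed 2015 home record as its home win probability, ignoring sampling noise and the team's road record. Team labels follow the table, and repeated city names are kept as listed.

```python
# (team, home wins, home losses), copied from the table in listed order.
home_records = [
    ("Toronto", 53, 28), ("New York", 45, 36), ("Baltimore", 47, 31),
    ("Tampa Bay", 42, 42), ("Boston", 43, 38), ("Kansas City", 51, 30),
    ("Minnesota", 46, 35), ("Cleveland", 39, 41), ("Chicago", 40, 41),
    ("Detroit", 38, 43), ("Texas", 43, 38), ("Houston", 53, 28),
    ("Los Angeles", 49, 32), ("Seattle", 36, 45), ("Oakland", 34, 47),
    ("New York", 49, 32), ("Washington", 46, 35), ("Miami", 41, 40),
    ("Atlanta", 42, 39), ("Philadelphia", 37, 44), ("Saint Louis", 55, 26),
    ("Pittsburgh", 53, 28), ("Chicago", 49, 32), ("Milwaukee", 34, 47),
    ("Cincinnati", 34, 47), ("Los Angeles", 55, 26),
    ("San Francisco", 47, 34), ("Arizona", 39, 42), ("San Diego", 39, 42),
    ("Colorado", 36, 45),
]

# Pooled home win rate across all teams.
pooled_rate = (sum(w for _, w, _ in home_records)
               / sum(w + l for _, w, l in home_records))

# Teams whose observed home win rate falls below or above the pooled rate.
overestimated = [(t, w / (w + l)) for t, w, l in home_records
                 if w / (w + l) < pooled_rate]
underestimated = [(t, w / (w + l)) for t, w, l in home_records
                  if w / (w + l) > pooled_rate]

print(f"Pooled home win rate: {pooled_rate:.3f}")
print(f"Teams below the pooled rate: {len(overestimated)}")
print(f"Teams above the pooled rate: {len(underestimated)}")
```

The spread of the team-level rates around the pooled figure is what the pooled test hides.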
Additional insight into the impact of home field advantage
requires that we look at why some teams have it and others do not.
Author's Note: My book Statistical Applications of Baseball
is a bit dated, but it has some interesting problems in it and is very
inexpensive. Please consider buying it
on Kindle.
Go back to home field advantage testing problems:
http://www.dailymathproblem.com/p/homefieldhypothesistestingproblems.html