Tuesday, March 5, 2013

Assessing the ERA with game-level data.

Question: Below is game-level information on innings pitched and runs allowed for a pitcher. Use these data to calculate the following statistics: (1) the player's ERA in each game, (2) the aggregate ERA, (3) the average ERA over all pitching appearances, (4) the median ERA over all pitching appearances, and (5) the minimum and maximum ERA over the five appearances.

Game    Innings pitched    Runs allowed
1       3.00               1.00
2       0.33               1.00
3       1.00               2.00
4       5.00               1.00
5       3.00               0.00

Discuss the limitations and advantages of each statistic.

Here is my answer:

The ERA for each game is 9 x ER / IP, where ER is earned runs and IP is innings pitched. The runs allowed in the table are treated as earned runs. The game ERAs are in the far right column of the table below.

The aggregate ERA, based on total innings pitched and total earned runs allowed, is presented in the bottom row of the table. This aggregate ERA is 9 x 5 / 12.33 = 3.65.

The average of the five game ERAs is (3 + 27 + 18 + 1.8 + 0) / 5 = 9.96.

The median ERA is the game ERA for which half of the appearances are below and half are above. Order the observations from lowest to highest and take the one in the middle when there is an odd number of observations, or average the two in the middle when there is an even number of observations. The median of (0, 1.8, 3, 18, 27) is 3.

The minimum ERA for this player is 0 (the fifth appearance).  The maximum ERA is 27 (the second appearance).

Game     Innings pitched    Runs allowed    ERA
1        3.00               1.00            3.00
2        0.33               1.00            27.00
3        1.00               2.00            18.00
4        5.00               1.00            1.80
5        3.00               0.00            0.00
Total    12.33              5.00            3.65
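
The figures in the table can be reproduced with a short Python sketch. This is only an illustration under stated assumptions: the names games and era are mine, runs allowed are treated as earned runs, and game 2 is entered as exactly one-third of an inning (the table rounds it to 0.33).

```python
from fractions import Fraction
from statistics import mean, median

# (innings pitched, earned runs) for the five appearances in the table.
# Assumption: game 2 is exactly 1/3 of an inning (shown as 0.33 in the table).
games = [
    (Fraction(3), 1),
    (Fraction(1, 3), 1),
    (Fraction(1), 2),
    (Fraction(5), 1),
    (Fraction(3), 0),
]


def era(innings, earned_runs):
    """Earned run average: 9 * ER / IP."""
    return 9 * earned_runs / innings


# (1) ERA for each game
game_eras = [float(era(ip, er)) for ip, er in games]

# (2) Aggregate ERA from total innings and total earned runs
total_ip = sum(ip for ip, _ in games)
total_er = sum(er for _, er in games)
aggregate_era = float(era(total_ip, total_er))

# (3) mean, (4) median, (5) minimum and maximum of the game ERAs
mean_era = mean(game_eras)
median_era = median(game_eras)
min_era, max_era = min(game_eras), max(game_eras)

print("game ERAs:", [round(e, 2) for e in game_eras])
print("aggregate:", round(aggregate_era, 2))
print("mean:", round(mean_era, 2))
print("median:", median_era)
print("min, max:", min_era, max_era)
```

Using exact fractions for innings avoids the small rounding difference that appears if 0.33 is used for game 2 (9 x 1 / 0.33 is about 27.27 rather than 27).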

The most commonly used measure of ERA is the one calculated from aggregate totals of earned runs and innings pitched. It is better than the average of the five game ERAs because one or two very short appearances, where innings pitched are few and earned runs are many, would dominate the average. Here the second appearance (one-third of an inning, game ERA of 27) pulls the average up to 9.96, while the aggregate ERA is 3.65. In fact, a game ERA is undefined if a pitcher gives up a run and does not get anyone out, because innings pitched is then zero.
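
A small sketch makes the contrast concrete. The sixth appearance below (no outs recorded, one earned run) is hypothetical and not part of the data above, and treating its ERA as infinite rather than raising an error is my own convention for illustration.

```python
import math


def era(innings, earned_runs):
    """9 * ER / IP. With zero innings the ratio is undefined; as an
    illustrative assumption it is reported as infinity when runs scored
    and as NaN when no runs scored."""
    if innings == 0:
        return math.inf if earned_runs > 0 else math.nan
    return 9 * earned_runs / innings


def aggregate_era(games):
    """ERA from total innings pitched and total earned runs."""
    return era(sum(ip for ip, _ in games), sum(er for _, er in games))


def mean_game_era(games):
    """Simple average of the game-by-game ERAs."""
    eras = [era(ip, er) for ip, er in games]
    return sum(eras) / len(eras)


# The five appearances from the table: (innings pitched, earned runs).
games = [(3, 1), (1 / 3, 1), (1, 2), (5, 1), (3, 0)]

# Hypothetical sixth appearance: one earned run, nobody retired.
with_bad_outing = games + [(0, 1)]

print("aggregate ERA:", round(aggregate_era(with_bad_outing), 2))
print("mean of game ERAs:", mean_game_era(with_bad_outing))
```

With the hypothetical outing included, the aggregate ERA stays finite because total innings pitched is still positive, while the average of game ERAs is no longer a usable number.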

The median ERA is a measure of the typical outcome.   It is not affected by one extremely bad outcome.

It may also be useful to rank pitchers by the percentage of appearances in which they give up no earned runs. This statistic may be more useful than the ERA based on aggregates because it measures the likelihood of success, where success is assumed to mean giving up no earned runs.
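
As a rough sketch, this share can be computed directly from the earned runs in each appearance. The function name scoreless_share is mine, and the data are the five appearances from the table above.

```python
def scoreless_share(earned_runs_by_game):
    """Fraction of appearances in which no earned runs were allowed."""
    scoreless = sum(1 for runs in earned_runs_by_game if runs == 0)
    return scoreless / len(earned_runs_by_game)


# Earned runs in the five appearances from the table.
earned_runs = [1, 1, 2, 1, 0]

share = scoreless_share(earned_runs)
print(f"scoreless appearances: {share:.0%}")
```

For this pitcher only the fifth appearance was scoreless, so the share is one in five. Note that the measure ignores how long each appearance was, so it is best read alongside the aggregate ERA rather than in place of it.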

There is no unambiguously best statistic. How to use both season-total and game-level data is an interesting problem. More posts on this topic will follow.
