Estimating percentiles under the assumption of normality
This post shows how to estimate the 5th and 95th percentiles of a sample under the assumption that the data are normally distributed, given only the sample mean and the sample standard deviation. The data set used here consists of 60 observations on the price of the Vanguard small-cap ETF (VB). The same data were previously used to illustrate a confidence interval around the mean of the price data.
Previous post on confidence intervals:
The post above is also useful because it explains how to create a confidence interval and how to use the confidence function in Excel.
Question: The table below gives the sample mean and sample standard deviation from 60 observations on the Vanguard fund (VB). The table also contains the actual values of the 5th and 95th percentiles in the sample.
Use the sample mean and sample standard deviation to estimate the 5th and 95th percentiles under the assumption that the data are normally distributed.
How do these estimates compare to the actual values of the 5th and 95th percentiles obtained from the sample?
Based on this comparison, do you believe the data are positively or negatively skewed?
Sample Mean and Sample Standard Deviation for 60 Observations of Vanguard Fund VB

| Statistic | Value |
|---|---|
| Mean | 120.897 |
| Std | 1.995 |
| Actual 5th Percentile | 116.0 |
| Actual 95th Percentile | 123.5 |

Analysis: Under normality, each percentile is estimated by multiplying the corresponding standard normal quantile by the sample standard deviation and adding the result to the sample mean: estimate = mean + Z × std. Using the norm.inv function with a mean of 0 and a standard deviation of 1 (equivalently, NORM.S.INV), we find that Z_{0.05} is -1.645 and Z_{0.95} is 1.645.
We estimate the 5th percentile at 117.6 and the 95th percentile at 124.2.
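The arithmetic can be checked outside Excel. The following is a minimal sketch in Python, assuming the standard library's `statistics.NormalDist` as a stand-in for Excel's inverse normal function; the variable names are illustrative and not from the original post.

```python
from statistics import NormalDist

# Summary statistics reported in the post for the 60 VB price observations.
sample_mean = 120.897
sample_std = 1.995

# Standard normal quantiles (mean 0, standard deviation 1).
# Note the sign: the 5% quantile is negative, the 95% quantile is positive.
z_05 = NormalDist().inv_cdf(0.05)
z_95 = NormalDist().inv_cdf(0.95)

# Percentile estimate under normality: mean + z * std.
est_p05 = sample_mean + z_05 * sample_std
est_p95 = sample_mean + z_95 * sample_std

print(f"z_0.05 = {z_05:.3f}, z_0.95 = {z_95:.3f}")
print(f"Estimated 5th percentile:  {est_p05:.1f}")
print(f"Estimated 95th percentile: {est_p95:.1f}")
```

Because the quantiles here are not rounded to 1.645, the third decimal place of the estimates can differ slightly from the spreadsheet values.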
Estimates of the 5th and 95th Percentiles Under the Assumption of Normality

| Quantity | Value |
|---|---|
| norm.inv(0.05) | -1.645 |
| norm.inv(0.95) | 1.645 |
| Standard Deviation | 1.995 |
| Average | 120.897 |
| Estimate of 5th percentile | 117.615 |
| Estimate of 95th percentile | 124.179 |


Actual Values of 5th and 95th Percentiles

| Percentile | Value |
|---|---|
| 5th | 116.0 |
| 95th | 123.5 |

The actual value of the 5th percentile is 116.0, around 1.6 points lower than the value estimated under the assumption of normality.
The actual value of the 95th percentile is 123.5, around 0.7 points lower than the value estimated under the assumption of normality.
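These deviations between the estimated and actual percentiles suggest that the data are negatively skewed relative to the normal distribution. The actual 5th percentile lies about 4.9 points below the mean, while the actual 95th percentile lies only about 2.6 points above it, so the left tail is longer than the right.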
The skew estimate obtained from Excel is -1.3, which confirms the negative skew in this data.
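For readers who want to see what a skew statistic of this kind measures, here is a minimal sketch of the adjusted Fisher-Pearson sample skewness, which is the formula Excel documents for its SKEW function. The function name and the toy numbers are hypothetical; the toy lists are NOT the VB prices, which are not reproduced in this post.

```python
from math import sqrt

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1).
    std = sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if std == 0:
        raise ValueError("skewness is undefined for constant data")
    cubed = sum(((x - mean) / std) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical toy data for illustration only (not the VB observations).
left_tailed = [116.0, 119.5, 120.5, 121.0, 121.5, 122.0, 122.5]
symmetric = [118.0, 119.0, 120.0, 121.0, 122.0]

print(sample_skewness(left_tailed))  # negative: one low value stretches the left tail
print(sample_skewness(symmetric))    # zero for perfectly symmetric data
```

A negative value means the observations below the mean sit farther from it than those above, which is the pattern the percentile comparison points to.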
A thing for students to think about: In what way does the calculation of estimates of the 5th and 95th percentiles of this sample, based on the assumption of normality, differ from the calculation of a confidence interval around the mean, also based on the assumption of normality?
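One way to explore this question numerically is to put the two calculations side by side. The sketch below assumes a 90% confidence level so that both calculations use the same quantile of about 1.645; that level is an assumption made here for comparison and is not stated in the post.

```python
from math import sqrt
from statistics import NormalDist

# Values reported in the post.
n = 60
sample_mean = 120.897
sample_std = 1.995

# Assumed 90% two-sided level, so the same z applies to both calculations.
z = NormalDist().inv_cdf(0.95)

# Percentile estimates describe the spread of individual observations: z * s.
pct_half_width = z * sample_std

# A confidence interval describes uncertainty about the mean: z * s / sqrt(n).
ci_half_width = z * sample_std / sqrt(n)

print(f"5th to 95th percentile range: "
      f"{sample_mean - pct_half_width:.2f} to {sample_mean + pct_half_width:.2f}")
print(f"90% confidence interval for the mean: "
      f"{sample_mean - ci_half_width:.2f} to {sample_mean + ci_half_width:.2f}")
```

The only difference between the two half-widths is the division by the square root of the sample size, which is worth keeping in mind when answering the question.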
And how much does the answer differ if one does not assume normality, and what is a good way of calculating that? (Hint: Consider resampling.)
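The resampling hint can be sketched as a simple bootstrap of the empirical percentiles. This is one possible approach, not a method given in the post. The function names are hypothetical, and because the 60 VB prices are not listed here, the code runs on clearly synthetic placeholder data that should be replaced with the real observations.

```python
import random
from statistics import mean, quantiles

def empirical_percentiles(values):
    """Return the sample 5th and 95th percentiles without assuming normality."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(values, n=20, method="inclusive")
    return cuts[0], cuts[-1]

def bootstrap_percentiles(values, n_resamples=2000, seed=0):
    """Resample with replacement and recompute both percentiles each time."""
    rng = random.Random(seed)
    p05_draws, p95_draws = [], []
    for _ in range(n_resamples):
        resample = rng.choices(values, k=len(values))
        p05, p95 = empirical_percentiles(resample)
        p05_draws.append(p05)
        p95_draws.append(p95)
    return p05_draws, p95_draws

# SYNTHETIC placeholder data (not the VB prices): 60 left-skewed values.
placeholder_rng = random.Random(42)
placeholder_prices = [123.0 - placeholder_rng.expovariate(0.5) for _ in range(60)]

p05, p95 = empirical_percentiles(placeholder_prices)
boot_p05, boot_p95 = bootstrap_percentiles(placeholder_prices)

print(f"Empirical 5th and 95th percentiles: {p05:.2f}, {p95:.2f}")
print(f"Bootstrap mean of 5th percentile:  {mean(boot_p05):.2f}")
print(f"Bootstrap mean of 95th percentile: {mean(boot_p95):.2f}")
```

The spread of the bootstrap draws gives a rough sense of how uncertain each percentile is, which can then be compared with the gap between the actual and normal-based values.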