## Wednesday, September 28, 2016

### Estimating percentiles under the assumption of normality


This post shows how to estimate the 5th and 95th percentiles of a sample from its mean and standard deviation, under the assumption that the data is normally distributed around the mean. The dataset consists of 60 observations on the price of the Vanguard small-cap ETF (VB). The same data was previously used to illustrate a confidence interval around the mean price.

Previous Post on Confidence Interval:

The previous post is also useful because it explains how to create a confidence interval and how to use the confidence function in Excel.

Question: The table below gives the sample mean and sample standard deviation from 60 observations on the Vanguard fund (VB). It also gives the actual 5th and 95th percentiles of the sample.

Use the data on the sample mean and sample standard deviation to estimate the value of the 5th and 95th percentile under the assumption that the data is normally distributed.

How do the estimates of the 5th and 95th percentiles compare to the actual values of the 5th and 95th percentile obtained from the sample?

Based on this comparison, do you believe the data is positively or negatively skewed?

Sample Mean and Sample Standard Deviation for 60 observations of Vanguard Fund VB

| Statistic | Value |
| --- | --- |
| Mean | 120.897 |
| Std | 1.995 |
| Actual 5th Percentile | 116.0 |
| Actual 95th Percentile | 123.5 |

Analysis: Under the assumption of normality, each percentile is estimated as the sample mean plus the appropriate standard normal value multiplied by the sample standard deviation. We use the norm.inv function for the standard normal distribution (mean 0, standard deviation 1) and find Z0.05 is -1.645 and Z0.95 is 1.645.

We estimate the 5th percentile at 120.897 - 1.645 × 1.995 = 117.6 and the 95th percentile at 120.897 + 1.645 × 1.995 = 124.2.
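The same calculation can be reproduced outside Excel. The following is a minimal sketch in Python, assuming the standard library's `statistics.NormalDist` as a stand-in for Excel's norm.inv; the function name `normal_percentile` and the constant names are illustrative and do not come from the original spreadsheet.

```python
from statistics import NormalDist

# Summary statistics reported in the post for the 60 VB price observations.
SAMPLE_MEAN = 120.897
SAMPLE_STD = 1.995


def normal_percentile(p, mean, std):
    """Estimate the p-th quantile as mean + z_p * std, assuming normality."""
    # Standard normal quantile, roughly -1.645 for p = 0.05 and +1.645 for p = 0.95.
    z = NormalDist().inv_cdf(p)
    return mean + z * std


est_p05 = normal_percentile(0.05, SAMPLE_MEAN, SAMPLE_STD)
est_p95 = normal_percentile(0.95, SAMPLE_MEAN, SAMPLE_STD)

print(f"Estimated 5th percentile:  {est_p05:.1f}")
print(f"Estimated 95th percentile: {est_p95:.1f}")
```

Rounded to one decimal place, the two estimates agree with the 117.6 and 124.2 reported above. Small differences in the third decimal place can arise because the post rounds the normal values to 1.645.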

Estimates of the 5th and 95th Percentiles Under the Assumption of Normality

| Item | Value |
| --- | --- |
| norm.inv(0.05) | -1.645 |
| norm.inv(0.95) | 1.645 |
| Standard Deviation | 1.995 |
| Average | 120.897 |
| Estimate of 5th percentile | 117.615 |
| Estimate of 95th percentile | 124.179 |
| Actual 5th percentile | 116.0 |
| Actual 95th percentile | 123.5 |

The actual value of the 5th percentile is 116.0, around 1.6 points lower than the estimated value under the assumption of normality.

The actual value of the 95th percentile is 123.5, around 0.7 points lower than the estimated value under the assumption of normality.

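Both actual percentiles fall below their estimates under normality, and the gap is larger at the 5th percentile. The lower tail therefore extends farther from the mean than the upper tail, which indicates that the data is negatively skewed compared to the normal distribution.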

The skew estimate obtained from Excel is -1.3, which confirms the negative skew in this data.
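The reasoning behind the skew conclusion can be sketched as a comparison of how far each actual percentile sits from the mean. This is only a rough heuristic built from the numbers reported in the post, not the skewness statistic that Excel computes, and the variable names are illustrative.

```python
# Values reported in the post.
SAMPLE_MEAN = 120.897
ACTUAL_P05 = 116.0
ACTUAL_P95 = 123.5

# Distance of each actual percentile from the mean.
# For a symmetric distribution these two distances would be roughly equal.
lower_distance = SAMPLE_MEAN - ACTUAL_P05
upper_distance = ACTUAL_P95 - SAMPLE_MEAN

# A longer lower tail is consistent with negative skew.
left_tail_longer = lower_distance > upper_distance

print(f"Mean minus 5th percentile:  {lower_distance:.3f}")
print(f"95th percentile minus mean: {upper_distance:.3f}")
print(f"Lower tail longer than upper tail: {left_tail_longer}")
```

The lower distance is about 4.9 points and the upper distance about 2.6 points, compared with about 3.3 points on each side under normality.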

A thing for students to think about: In what way does the calculation of the estimates of the 5th and 95th percentiles of this sample, based on the assumption of normality, differ from the calculation of a confidence interval around the mean, also based on the assumption of normality?
