This post looks at the
validity of statistical tests when observations are not independent. The
specific application involves financial return data with overlapping holding
periods.
Question: The first table below contains financial return
data on large-cap value and growth stocks for 16 holding periods. The second table below contains financial
return data for small-cap value and growth stocks for the same 16 holding
periods.
What is the difference in average
returns between growth and value stocks, for large-cap and for small-cap stocks?
Conduct paired t-tests for
the hypothesis that there is no difference between the mean returns of growth and
value stocks, for small-cap and for large-cap stocks, using the return information
from the 16 overlapping holding periods.
Why is the use of this test
on these samples created from overlapping holding periods problematic?
Could an average of 16
holding periods be used in a cross-sectional study of ETF financial returns? Would use of such an average in a
cross-sectional study be worse or better than a cross-sectional study using a
single holding period?
Data:
Returns on Large-Cap Value and Growth Funds

| Obs No. | Purchase Date | Sale Date | VTV Value Fund | VUG Growth Fund | VUG-VTV |
|---|---|---|---|---|---|
| 1 | 7/1/13 | 7/1/17 | 11.0% | 13.6% | 2.6% |
| 2 | 10/1/13 | 7/1/17 | 10.9% | 12.4% | 1.5% |
| 3 | 1/1/14 | 7/1/17 | 11.3% | 12.5% | 1.2% |
| 4 | 4/1/14 | 7/1/17 | 9.7% | 12.2% | 2.5% |
| 5 | 7/1/13 | 10/1/17 | 11.5% | 14.1% | 2.6% |
| 6 | 10/1/13 | 10/1/17 | 11.4% | 13.0% | 1.6% |
| 7 | 1/1/14 | 10/1/17 | 11.8% | 13.2% | 1.4% |
| 8 | 4/1/14 | 10/1/17 | 10.3% | 12.9% | 2.6% |
| 9 | 7/1/13 | 1/1/18 | 13.0% | 15.7% | 2.7% |
| 10 | 10/1/13 | 1/1/18 | 13.1% | 14.7% | 1.6% |
| 11 | 1/1/14 | 1/1/18 | 13.5% | 15.0% | 1.5% |
| 12 | 4/1/14 | 1/1/18 | 12.2% | 14.9% | 2.7% |
| 13 | 7/1/13 | 4/1/18 | 10.8% | 13.5% | 2.7% |
| 14 | 10/1/13 | 4/1/18 | 10.7% | 12.6% | 1.9% |
| 15 | 1/1/14 | 4/1/18 | 11.0% | 12.7% | 1.7% |
| 16 | 4/1/14 | 4/1/18 | 9.6% | 12.4% | 2.8% |
Returns on Small-Cap Value and Growth Funds

| Obs No. | Purchase Date | Sale Date | VBR Value | VBK Growth | VBK-VBR |
|---|---|---|---|---|---|
| 1 | 7/1/13 | 7/1/17 | 10.8% | 8.8% | -2.0% |
| 2 | 10/1/13 | 7/1/17 | 10.2% | 7.5% | -2.7% |
| 3 | 1/1/14 | 7/1/17 | 10.1% | 7.0% | -3.1% |
| 4 | 4/1/14 | 7/1/17 | 9.2% | 7.9% | -1.3% |
| 5 | 7/1/13 | 10/1/17 | 11.2% | 9.7% | -1.5% |
| 6 | 10/1/13 | 10/1/17 | 10.7% | 8.6% | -2.1% |
| 7 | 1/1/14 | 10/1/17 | 10.6% | 8.1% | -2.5% |
| 8 | 4/1/14 | 10/1/17 | 9.8% | 9.1% | -0.7% |
| 9 | 7/1/13 | 1/1/18 | 11.8% | 10.8% | -1.0% |
| 10 | 10/1/13 | 1/1/18 | 11.3% | 9.8% | -1.5% |
| 11 | 1/1/14 | 1/1/18 | 11.3% | 9.4% | -1.9% |
| 12 | 4/1/14 | 1/1/18 | 10.5% | 10.4% | -0.1% |
| 13 | 7/1/13 | 4/1/18 | 10.3% | 9.9% | -0.4% |
| 14 | 10/1/13 | 4/1/18 | 9.7% | 8.8% | -0.9% |
| 15 | 1/1/14 | 4/1/18 | 9.6% | 8.5% | -1.1% |
| 16 | 4/1/14 | 4/1/18 | 8.9% | 9.3% | 0.4% |

Statistical Analysis:
Calculation of the paired t-test statistic: Just take the
average and standard deviation of VUG-VTV for large-cap stocks and VBK-VBR for
small-cap stocks.
The t-statistic is the
average divided by the standard error, which is the standard deviation divided
by the square root of the sample size.
The sample size is 16. The
square root of the sample size is 4.
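As a rough illustration of the calculation just described, the Python sketch below recomputes the growth-minus-value differences and the paired t-statistics from the rounded returns in the two tables above. The function name `paired_t` and the list names are illustrative labels, not part of the original analysis. Because the inputs are rounded to one decimal place, the output should be close to, but not necessarily identical to, the figures reported in this post.

```python
from math import sqrt
from statistics import mean, stdev

# Returns (%) for the 16 overlapping holding periods, copied from the tables above.
VTV = [11.0, 10.9, 11.3, 9.7, 11.5, 11.4, 11.8, 10.3,
       13.0, 13.1, 13.5, 12.2, 10.8, 10.7, 11.0, 9.6]   # large-cap value
VUG = [13.6, 12.4, 12.5, 12.2, 14.1, 13.0, 13.2, 12.9,
       15.7, 14.7, 15.0, 14.9, 13.5, 12.6, 12.7, 12.4]  # large-cap growth
VBR = [10.8, 10.2, 10.1, 9.2, 11.2, 10.7, 10.6, 9.8,
       11.8, 11.3, 11.3, 10.5, 10.3, 9.7, 9.6, 8.9]     # small-cap value
VBK = [8.8, 7.5, 7.0, 7.9, 9.7, 8.6, 8.1, 9.1,
       10.8, 9.8, 9.4, 10.4, 9.9, 8.8, 8.5, 9.3]        # small-cap growth


def paired_t(growth, value):
    """Return (average, standard deviation, t-statistic) of growth minus value."""
    diffs = [g - v for g, v in zip(growth, value)]
    avg = mean(diffs)
    sd = stdev(diffs)                     # sample standard deviation (n - 1)
    std_error = sd / sqrt(len(diffs))     # sqrt(16) = 4 for these tables
    return avg, sd, avg / std_error


large = paired_t(VUG, VTV)
small = paired_t(VBK, VBR)
print("Large cap: avg=%.2f%% sd=%.2f%% t=%.2f" % large)
print("Small cap: avg=%.2f%% sd=%.2f%% t=%.2f" % small)
```

A negative average or t-statistic here means value outperformed growth, since the difference is always taken as growth minus value.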
Results: The observed averages and standard deviations
of the difference in returns between growth and value stocks in both the large-cap and small-cap
sectors are presented below.
Difference Between Growth and Value Stocks

| Statistic | Large-Cap Stocks (VUG-VTV) | Small-Cap Stocks (VBK-VBR) |
|---|---|---|
| Average | 2.11% | -1.40% |
| Standard deviation | 0.58% | 0.96% |
| t-statistic | 14.54 | -5.83 |

In the large-cap sector,
growth stocks outperformed value stocks.
In the small-cap sector,
value stocks outperformed growth stocks.
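Discussion of validity of test: The sixteen observations
are not independent because the holding periods overlap. Every one of the sixteen holding periods
contains the stretch from 4/1/14 to 7/1/17, so the observations largely reflect the same market history.
Statistical tests can provide misleading results
when the assumption of independent observations is violated: positively correlated observations
tend to understate the standard error and therefore overstate the t-statistic.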
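To make the independence problem concrete, here is a small simulation sketch. It rests on assumptions that are not in the original analysis: the quarterly growth-minus-value difference is drawn as independent standard normal noise with a true mean of zero, and each holding-period figure is the simple average of the quarterly differences it spans (no compounding). The 16 windows mirror the purchase and sale dates in the tables, with quarters indexed 0 to 18. The function names are illustrative.

```python
import random
from math import sqrt
from statistics import stdev


def naive_t(obs):
    """Paired t-statistic that treats the observations as independent."""
    avg = sum(obs) / len(obs)
    return avg / (stdev(obs) / sqrt(len(obs)))


def simulate_rejection_rate(n_sims=2000, seed=0, t_crit=2.131):
    """Share of simulations in which the naive test rejects a true null.

    t_crit is the two-sided 5% critical value for 15 degrees of freedom.
    """
    rng = random.Random(seed)
    # Purchase at the start of quarters 0..3, sale at the start of quarters 16..19.
    windows = [(p, s) for s in range(16, 20) for p in range(4)]
    rejections = 0
    for _ in range(n_sims):
        # 19 independent quarterly differences with true mean zero (assumption).
        quarterly = [rng.gauss(0.0, 1.0) for _ in range(19)]
        # Each overlapping holding period averages the quarters it covers.
        obs = [sum(quarterly[p:s]) / (s - p) for p, s in windows]
        if abs(naive_t(obs)) > t_crit:
            rejections += 1
    return rejections / n_sims


rate = simulate_rejection_rate()
print("Share of false rejections at the nominal 5%% level: %.2f" % rate)
```

If the sixteen observations were truly independent, the share printed would be near 0.05. Under these assumptions it should come out far higher, because all sixteen windows share the same 13 middle quarters.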
Could an average of sixteen holding periods be used in
a cross-sectional study of ETF financial returns? Would use of such an average in a
cross-sectional study be worse or better than a cross-sectional study using a
single holding period?
Consider a cross-sectional
study where the analyst wants to examine whether one type of fund has better
returns than another type of fund. For example, the post below presents
overlapping return statistics for 10 sector-specific ETFs and 6 broad-market
ETFs.
Could we use this data to
test for differences between the performance of sector funds and broad funds?
The short answer is yes. The components of each fund's average overlap, but each fund contributes
only one observation (its average) to the cross-sectional test. The assumption that returns from
each sector are independent of one another is not altered by the use of overlapping holding
periods.
In my view, the use of a return
measure based on multiple purchase and sale dates provides a more meaningful
measure of return than a statistic based on a single purchase and single sale
date. The use of one holding period can
provide unusual readings due to volatile stock market conditions.
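The cross-sectional comparison described above can be sketched as follows. Each fund is reduced to a single number, its average return over the overlapping holding periods, and the two groups of fund averages are then compared. The two-sample Welch t-statistic used here is one reasonable choice for the comparison, not a method specified in this post, and the fund names and returns are hypothetical placeholders rather than real data. Like the discussion above, the sketch takes independence across funds as given.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Two-sample t-statistic that does not assume equal variances."""
    std_error = sqrt(variance(group_a) / len(group_a)
                     + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / std_error


# Hypothetical placeholder returns (%) over three overlapping holding periods.
sector_funds = {
    "SECTOR_A": [8.0, 9.0, 10.0],
    "SECTOR_B": [12.0, 11.0, 13.0],
    "SECTOR_C": [5.0, 6.0, 7.0],
}
broad_funds = {
    "BROAD_A": [10.0, 11.0, 12.0],
    "BROAD_B": [9.0, 10.0, 11.0],
    "BROAD_C": [11.0, 12.0, 13.0],
}

# One observation per fund: its average across the overlapping periods.
sector_avgs = [mean(r) for r in sector_funds.values()]
broad_avgs = [mean(r) for r in broad_funds.values()]

t_stat = welch_t(sector_avgs, broad_avgs)
print("Sector minus broad t-statistic: %.2f" % t_stat)
```

The overlap only affects how each fund's own average is built. The sample size for the test is the number of funds, not the number of holding periods.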
Final Note: The difference in outcomes (growth minus
value) for large-cap versus small-cap stocks is fascinating. I need to follow up on this issue.