Unweighted Two-Way Association Tests
A recent post on my health care blog looked at some
statistics on differences in the age profile of people obtaining private health
coverage through state exchanges versus people obtaining private health
coverage through their employer.
A post on the economic implications of the different age
profiles of state exchanges and employment-based insurance:
This post uses data on this issue to demonstrate how to
conduct an unweighted two-way association test, also known as the Pearson
chi-squared test for a two-way contingency table.
Question One: The table below has information on the ages of
people with a private health insurance plan and on whether the
plan was obtained from a state exchange or some other
source. (The primary source of private
health insurance for households with working-age people is the person's
employer.) Test the null hypothesis that
there is no association between the age category variable and the plan venue
variable. Use the Pearson chi-square
statistic to test this null hypothesis.
Contingency Table for Private Plan Type and Age Category

age_cat        Exchange Plan    Not Exchange Plan              Total
                                (primarily employer-based)
<=21                     645               14,889             15,534
21<age<=26               217                3,711              3,928
26<age<=35               497                7,100              7,597
35<age<=45               559                8,211              8,770
45<age<=55               713                8,984              9,697
55<age<=65               734                8,031              8,765
Total                  3,365               50,926             54,291


Answer: The test for an association between two
categorical variables compares the observed cell counts to the expected
cell counts. The observed counts are in the table
above. The expected count for each cell is equal to
the product of its row total and column total divided by the total sample size.
This is the count we would expect in the cell if the two variables were
unrelated.
Expected Plan Types by Age Category

age_cat        Exchange Plan    Not Exchange Plan
<=21                   962.8             14,571.2
21<age<=26             243.5              3,684.5
26<age<=35             470.9              7,126.1
35<age<=45             543.6              8,226.4
45<age<=55             601.0              9,096.0
55<age<=65             543.3              8,221.7
Total                3,365.0             50,926.0

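The expected counts can be reproduced from the observed table with a few lines of Python. This is a minimal sketch rather than the method used to build the post's tables; the names `observed` and `expected_counts` are made up for illustration.

```python
# Observed counts from the contingency table: (exchange, not exchange) per age group.
observed = {
    "<=21": (645, 14889),
    "21<age<=26": (217, 3711),
    "26<age<=35": (497, 7100),
    "35<age<=45": (559, 8211),
    "45<age<=55": (713, 8984),
    "55<age<=65": (734, 8031),
}


def expected_counts(table):
    """Expected count per cell: row total * column total / grand total.

    `expected_counts` is a hypothetical helper name, not from the post.
    """
    n_cols = len(next(iter(table.values())))
    col_totals = [sum(row[j] for row in table.values()) for j in range(n_cols)]
    grand_total = sum(col_totals)
    return {
        label: tuple(sum(row) * col / grand_total for col in col_totals)
        for label, row in table.items()
    }


expected = expected_counts(observed)
for label, (exch, non_exch) in expected.items():
    # One row per age group, rounded to one decimal like the table.
    print(f"{label:<12} {exch:>10.1f} {non_exch:>12.1f}")
```

Each printed row should match the corresponding row of the expected-value table up to rounding.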

Some observations:
In the exchange plan column, the observed counts for the two youngest cohorts
(age 26 or younger) are lower than expected, while the observed counts for the
four older age groups are higher than expected.
In the not-exchange plan column the pattern is reversed: observed counts exceed
expected counts for the two youngest groups and fall below expected counts for
the four older cohorts.
The younger age profile of employer-sponsored plans is not entirely the
result of young people not getting covered.
A disproportionate number of kids and young adults get insurance through
a parent's employer-based health plan.
The calculation of the chi-square test statistic:
The chi-square test statistic used in this problem is the
sum over all cells of (O - E)^{2}/E, where O is the observed cell count and E is the
expected cell count.
See the table below for the calculation of the chi-square
statistic. (The top six rows are values for
state exchange plans and the bottom six rows are values for plans not from state exchanges.)
Calculation of Chi-Squared Statistic

age_cat          Observed    Expected    (O - E)^{2}/E
State exchange
<=21                  645         963            104.9
21<age<=26            217         243              2.9
26<age<=35            497         471              1.5
35<age<=45            559         544              0.4
45<age<=55            713         601             20.9
55<age<=65            734         543             67.0
Not state exchange
<=21               14,889      14,571              6.9
21<age<=26          3,711       3,685              0.2
26<age<=35          7,100       7,126              0.1
35<age<=45          8,211       8,226              0.0
45<age<=55          8,984       9,096              1.4
55<age<=65          8,031       8,222              4.4
Chi Square                                       210.5

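The sum of the twelve cell contributions can be checked in code. The sketch below recomputes the expected counts at full precision instead of using the rounded values in the table, so its result may differ slightly from a hand sum of the rounded column. The function name `pearson_chi_square` is an illustrative choice.

```python
# Observed counts by age group, youngest to oldest.
exchange = [645, 217, 497, 559, 713, 734]
not_exchange = [14889, 3711, 7100, 8211, 8984, 8031]


def pearson_chi_square(rows):
    """Sum (O - E)^2 / E over every cell of a two-way table of counts."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for o, col_total in zip(row, col_totals):
            e = row_total * col_total / n  # expected count if the variables are unrelated
            stat += (o - e) ** 2 / e
    return stat


rows = list(zip(exchange, not_exchange))  # one (exchange, not exchange) pair per age group
chi_square = pearson_chi_square(rows)
print(f"chi-square = {chi_square:.1f}")
```

The same helper works for any table of counts with at least two rows and two columns, as long as no row or column total is zero.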

There are five degrees of freedom in this problem: (r - 1) x
(c - 1), or (6 - 1) x (2 - 1) = 5.
The p-value for this test is well below 0.00001.
We reject the null hypothesis that there is no association
between the row and column variables.
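The degrees of freedom and the p-value can also be checked in code. The sketch below uses a closed-form upper-tail probability that holds only for a chi-square distribution with 5 degrees of freedom. It is an assumption chosen to keep the example free of third-party libraries, not the method used in this post, and `chi2_sf_df5` is a made-up name.

```python
import math


def chi2_sf_df5(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 5 df.

    Closed form valid only for 5 degrees of freedom:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * (1 + x/3)
    """
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * (1 + x / 3))


n_rows, n_cols = 6, 2                 # six age categories, two plan venues
dof = (n_rows - 1) * (n_cols - 1)     # (r - 1) x (c - 1)
chi_square = 210.5                    # test statistic from the table above
p_value = chi2_sf_df5(chi_square)

print(f"degrees of freedom = {dof}")
print(f"p-value = {p_value:.3g}")
print("reject null at the 0.05 level:", p_value < 0.05)
```

In practice, `scipy.stats.chi2_contingency` applied to the observed table returns the statistic, p-value, degrees of freedom and expected counts in one call, which avoids hand-coding the tail probability.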
Resource:
A great resource for this type of statistical problem:
A future post will discuss how to calculate this chi-square
test when the unweighted survey data are not representative of the entire
population.