## Wednesday, October 12, 2016

### Unweighted Two-Way Association Tests

A recent post on my health care blog looked at some statistics on differences in the age profile of people obtaining private health coverage through state exchanges versus people obtaining private health coverage through their employer.

A post on the economic implications of the different age profiles of state exchanges and employment-based insurance:

This post uses data on this issue to demonstrate how to conduct an unweighted two-way association test, also known as the Pearson chi-squared test, for a two-way contingency table.

Question One: The table below shows the ages of people with a private health insurance plan and whether the plan was obtained from a state exchange or some other source. (The primary source of private health insurance for households with working-age people is the person’s employer.) Test the null hypothesis that there is no association between the age category variable and the plan venue variable, using the Pearson chi-square statistic.

Contingency Table for Private Plan Type and Age Category

| age_cat | Exchange Plan | Not Exchange Plan (primarily employer-based) | Total |
|---|---|---|---|
| <=21 | 645 | 14,889 | 15,534 |
| 21 | | | |

Answer: The test for an association between two categorical variables compares the observed cell counts to the expected cell counts. The observed values are above. The expected value for each cell equals the product of its row total and column total divided by the total sample size.
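The expected-count rule can be sketched in a few lines of Python. The function name `expected_counts` and the small 2 x 2 table are made up for illustration (they are not the survey counts from this post); the point is only to show row total times column total divided by the sample size.

```python
def expected_counts(observed):
    """Expected cell counts under no association for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)  # total sample size
    # Each expected cell is (row total * column total) / n.
    return [[r * c / n for c in col_totals] for r in row_totals]


# Hypothetical toy table, NOT the data from this post.
toy = [[10, 20],
       [30, 40]]
print(expected_counts(toy))  # [[12.0, 18.0], [28.0, 42.0]]
```

The expected counts in each row add up to the observed row total, which is a handy check. In the post's data, the expected values for the <=21 row, 962.8 and 14,571.2, add up to the observed row total of 15,534.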

Expected Plan Types by Age Category

| age_cat | Exchange Plan | Not Exchange Plan |
|---|---|---|
| <=21 | 962.8 | 14571.2 |
| 21 | | |

Some observations:

For exchange plans, the actual counts for the younger cohorts (age 26 or younger) are lower than the expected counts. For the four older age groups, the actual counts are higher than expected.

For non-exchange plans the pattern reverses: the actual counts exceed the expected counts for the younger cohorts and fall below the expected counts for the four older cohorts.

The younger age profile of employer-sponsored plans is not entirely the result of young people going uninsured. A disproportionate number of children and young adults are covered through a parent's employer-based plan.

The calculation of the chi-square test statistic:

The chi-square test statistic used in this problem is $\chi^2 = \sum (O - E)^2 / E$, where $O$ is the observed cell count and $E$ is the expected cell count, and the sum runs over all cells of the table.

See the table below for the calculation of the chi-square statistic. (The top rows are values for state exchange plans and the bottom rows are values for non-exchange plans.)

Calculation of Chi-Squared Statistic

| age_cat | Observed | Expected | (O-E)^2/E |
|---|---|---|---|
| <=21 | 645 | 963 | 104.9 |
| 21 | | | |
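As a quick check of the arithmetic, the per-cell contribution can be computed directly. This is a minimal sketch; the helper name `chi_square_contribution` is my own, and the inputs are the observed and expected counts for the <=21 row reported in this post.

```python
def chi_square_contribution(observed, expected):
    """One cell's contribution to the Pearson chi-square statistic: (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


# <=21 row: exchange plans, then non-exchange plans.
exchange = chi_square_contribution(645, 962.8)
not_exchange = chi_square_contribution(14889, 14571.2)

print(round(exchange, 1))      # 104.9
print(round(not_exchange, 1))  # 6.9
```

The full test statistic is the sum of these contributions over all twelve cells (six age groups by two plan types).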

There are five degrees of freedom in this problem: (r-1) x (c-1), or (6-1) x (2-1) = 5.

The p-value for this test is less than 0.00001.
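The degrees of freedom and the size of the p-value can be checked with a short sketch that uses only the Python standard library. The function name `chi2_sf` is my own (hypothetical); it evaluates the upper-tail probability of the chi-square distribution with the standard recurrence for the regularized incomplete gamma function instead of calling a statistics package. Because the sketch plugs in only 104.9, the contribution of a single cell, the result is an upper bound on the p-value of the full test, not the p-value itself.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # Q(1, z) = exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z) = erfc(sqrt(z))
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1.0
    return q


rows, cols = 6, 2                  # six age groups, two plan types
df = (rows - 1) * (cols - 1)
print(df)                          # 5

# 104.9 is the contribution of one cell only, so the full statistic is larger
# and its p-value is smaller than the bound computed here.
p_upper_bound = chi2_sf(104.9, df)
print(p_upper_bound < 0.00001)     # True
```

If SciPy is installed, `scipy.stats.chi2.sf(x, df)` returns the same tail probability, and `scipy.stats.chi2_contingency` runs the whole test from a table of observed counts.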

We reject the null hypothesis that there is no association between the row and column variables.

Resource:

A great resource for this type of statistical problem:

A future post will discuss how to calculate this chi-square test when the unweighted survey data are not representative of the entire population.