Paired Difference Versus Standard t-tests
A previous post considered how poll results differ when
respondents are asked to choose only between Clinton and Trump and when
respondents are also allowed to select a third-party candidate.
In the previous post, I employed a paired difference test
and a non-parametric sign test on pairs and concluded that the Clinton-Trump
margin was significantly smaller when polls included third-party
options.
This post considers whether a standard t-test on means would
lead to the same result. I also discuss
which test (the paired difference test or the comparison of means) is more
appropriate.
Question: Below
is information on the sample average, sample standard deviation, and sample
size for two types of polls.
Test the hypothesis that the variance of the poll results
from polls including only the two main candidates is equal to the variance of
poll results when third-party candidates are included.
Test the hypothesis that the mean Clinton-Trump margin is
identical for the two polling techniques.
Discuss whether the hypothesis test presented here is
superior or inferior to the paired difference technique used in the previous
post.
Clinton-Trump Margin for Two Different Poll Types

| | Two Main Candidates | Third-Party Candidates Included |
|---|---|---|
| Average | 6.25 | 4.17 |
| Standard Deviation | 2.49 | 1.85 |
| Sample Size | 12 | 12 |
Analysis: The F-statistic for the test of equal variances
is F = (2.49/1.85)^2 = 1.81. The
two-tailed p-value for this F statistic is 0.3387. The two-tailed test is appropriate because I
don’t have a clear prior as to which variance is larger.
With a p-value this large, I cannot reject the hypothesis that the variances are equal.
I nonetheless conduct the t-test on equal means without
assuming equal variances, because that version of the test remains valid
whether or not the variances are equal.
I get a t-value of 2.33, which is associated with a p-value of 0.0305.
Whether or not you will reject the null hypothesis of no
differences in means depends on the level of significance that you choose for
the test. If you choose a level of
significance of 0.01 you will not reject the null hypothesis. If you choose the level of significance of
0.05 you reject the null hypothesis.
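The two tests above can be reproduced from the summary table alone. The following is a minimal sketch, not the code used for this post: it assumes the rounded table values are the only inputs, and it computes p-values by numerically integrating the F and t densities with a hypothetical helper called simpson so that nothing beyond the Python standard library is needed.

```python
import math

# Summary statistics copied from the table (rounded values).
mean_two, sd_two, n_two = 6.25, 2.49, 12        # two main candidates only
mean_three, sd_three, n_three = 4.17, 1.85, 12  # third-party option included


def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    return ((d1 / d2) ** (d1 / 2) * x ** (d1 / 2 - 1)
            * (1 + d1 * x / d2) ** (-(d1 + d2) / 2) / math.exp(log_beta))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


# F-test for equal variances: ratio of the two sample variances.
f_stat = (sd_two / sd_three) ** 2
f_cdf = simpson(lambda x: f_pdf(x, n_two - 1, n_three - 1), 0.0, f_stat)
f_p = 2 * min(f_cdf, 1 - f_cdf)  # two-tailed p-value

# t-test on means without assuming equal variances (Welch's test).
v_two = sd_two ** 2 / n_two
v_three = sd_three ** 2 / n_three
t_stat = (mean_two - mean_three) / math.sqrt(v_two + v_three)
welch_df = (v_two + v_three) ** 2 / (
    v_two ** 2 / (n_two - 1) + v_three ** 2 / (n_three - 1))
t_p = 1 - 2 * simpson(lambda x: t_pdf(x, welch_df), 0.0, abs(t_stat))

print(f"F = {f_stat:.2f}, two-tailed p = {f_p:.4f}")
print(f"t = {t_stat:.2f}, df = {welch_df:.1f}, two-tailed p = {t_p:.4f}")
```

Because the inputs are rounded to two decimals, the printed statistics may differ slightly from the figures reported in this post.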
The p-value in the previous post that used the paired t-test
was 0.0023. The paired t-test unambiguously
rejects the null hypothesis of no difference in means at conventional
significance levels.
Which Test Is Better: For this data set the paired difference test
provides an unambiguous conclusion: reject the null hypothesis that the mean
difference in poll results is zero in favor of the hypothesis that the
mean difference is not zero. Results
from the standard comparison of means are less conclusive.
A strong case can be made for the use of the paired-difference
test in this example. The pairs in this
study occur naturally. The pairs involve
the two polling techniques on the same day by the same company. By evaluating paired differences in this
manner we are getting rid of extraneous variability caused by factors that are
unrelated to the inclusion or exclusion of the third-party option.
One source of variability involves changes in the state of the
general election race over time.
Movements in the net Clinton-Trump margin over time from +10 to +2 or,
in the future, even lower, have nothing to do with whether the inclusion of
questions on third-party candidates changes results. Similarly, differences in poll characteristics
like sample size and whether the polls use both cell phones and landlines are
irrelevant to the question of whether the inclusion of the third-party question
impacts results.
The standard error from the unpaired test that simply
compares means is much larger than the standard error of the paired
differences because it also measures variability in poll
results from these other sources.
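The effect of pairing on the standard error can be seen in a small illustration. The margins below are hypothetical numbers invented for this sketch, not the poll data from the previous post: the race tightens from +10 to +3 across eight polls, while the gap between the two question types stays between 1 and 2 points.

```python
import math
import statistics

# Hypothetical Clinton-Trump margins for eight same-day, same-pollster pairs.
two_way = [10, 9, 8, 7, 6, 5, 4, 3]      # two main candidates only
with_third = [8, 8, 6, 5, 5, 3, 2, 2]    # third-party option included
n = len(two_way)

# Paired approach: work with the within-pair differences.
diffs = [a - b for a, b in zip(two_way, with_third)]
mean_diff = statistics.mean(diffs)
se_paired = statistics.stdev(diffs) / math.sqrt(n)

# Unpaired approach: treat the two columns as independent samples.
se_unpaired = math.sqrt(
    statistics.variance(two_way) / n + statistics.variance(with_third) / n)

t_paired = mean_diff / se_paired
t_unpaired = mean_diff / se_unpaired

print(f"mean difference   = {mean_diff:.3f}")
print(f"paired:   SE = {se_paired:.3f}, t = {t_paired:.2f}")
print(f"unpaired: SE = {se_unpaired:.3f}, t = {t_unpaired:.2f}")
```

Both approaches estimate the same mean difference, but the unpaired standard error absorbs the drift in the race over time, so its t-value is far smaller.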
Some polling companies ask only one question: either Trump
versus Clinton, or Trump, Clinton, or a third-party candidate. I have excluded all poll results that do not
include both questions. Many
statisticians would argue that I should not be throwing away so much data.
It is possible to pair polls using different samples on the
same date. This would reduce
variability based on changes in the electoral mood. However, in my view it makes little sense to
compare results from a poll of 400 people based exclusively on landlines
to results from a poll of 1000 people that uses both landlines and cell phones.
It is possible to handle this problem in a regression
framework where the poll result (the Clinton-Trump margin) is the dependent
variable. The key explanatory variable
in the model would be whether the poll allows for a third-party option, but the
model would also include variables for the date of the poll and other polling
characteristics. In this framework, a
negative coefficient on the third-party dummy variable would
indicate that the inclusion of third-party candidates appears to favor Trump.
The main advantage of the regression framework is that it
allows for the use of both polls that ask only one question and polls that ask
both question types. The regression
method also allows for the statistician to estimate the impact of third-party
support on the Clinton-Trump margin.
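A minimal sketch of such a regression follows. Everything in it is an assumption made for illustration: the polls are hypothetical, the only control is the day of the poll, and the margins were generated without noise from margin = 8 - 0.5 * day - 2 * third_party so that the fit can be checked by eye. The helper names solve and ols are also hypothetical.

```python
# Hypothetical polls: (days since first poll, third-party option 0/1, margin).
polls = [
    (0, 0, 8.0), (0, 1, 6.0),   # pollster asked both questions
    (1, 0, 7.5),                # two-way question only
    (2, 1, 5.0),                # third-party question only
    (3, 0, 6.5), (3, 1, 4.5),   # pollster asked both questions
    (4, 1, 4.0),                # third-party question only
    (5, 0, 5.5),                # two-way question only
]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Design matrix columns: intercept, third-party dummy, day of poll.
X = [[1.0, float(third), float(day)] for day, third, _ in polls]
y = [margin for _, _, margin in polls]

intercept, b_third, b_day = ols(X, y)
print(f"intercept = {intercept:.2f}")
print(f"third-party dummy = {b_third:.2f}")
print(f"day of poll = {b_day:.2f}")
```

Note that the fit uses polls that asked only one of the two questions alongside polls that asked both, which is the advantage described above. With real poll data the coefficients would carry standard errors, which this sketch does not compute.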
Final Thoughts: Often statisticians should use paired
difference tests rather than a standard comparison of two means. For example, a comparison of hourly sales at
a Caribou Coffee versus a Starbucks should pair the observations across hours
because the changes in sales over the day are irrelevant to whether one location
dominates the other. Similarly, a test
comparing the number of people going to movies to the number of people watching
TV should pair days because changes in habits, which vary from day to day and
month to month, are not related to the long-term differential between the number
of people watching television and the number of people watching movies.
I am planning more work on the election. Please subscribe to my blog to get these
posts.