We'll develop these ideas by using three examples.
H-null: The numbers 1, 2, 3, and 4 are equally likely. H-alt: The numbers 1, 2, 3, and 4 are not equally likely.
We expected about 121.5 golf balls (1/4 of 486) of each number. By adding

$$\frac{(\text{observed} - \text{expected})^2}{\text{expected}}, \quad \text{e.g.,} \quad \frac{(137 - 121.5)^2}{121.5} = 1.9774,$$

for each of the four values (1 through 4), we obtained Chi-square = 8.46914:
| obs | exp | diff | n dif | chi sq. | P-value | cum prob |
| 137 | 121.5 | 15.5 | 1.97737 | 8.46914 | 0.0372487 | 0.962751 |
| 138 | 121.5 | 16.5 | 2.24074 | | | |
| 107 | 121.5 | -14.5 | 1.73045 | | | |
| 104 | 121.5 | -17.5 | 2.52058 | | | |
The P-value for this test is 0.0372487. This means that
if the golf ball numbers are uniformly distributed,
then we would expect to get a Chi-square value this big or bigger
about 3.7% of the time.
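The golf ball calculation can be reproduced with a short Python sketch. It assumes 3 degrees of freedom (four categories minus one), which the text does not state but which is consistent with the P-value shown. The helper name `chi_square_tail_df3` is made up for this illustration, and its closed-form tail probability is valid only for 3 degrees of freedom.

```python
import math

observed = [137, 138, 107, 104]           # counts for the numbers 1 through 4
expected = sum(observed) / len(observed)  # 486 / 4 = 121.5 under H-null

# Each number contributes (observed - expected)^2 / expected.
contributions = [(obs - expected) ** 2 / expected for obs in observed]
chi_square = sum(contributions)


def chi_square_tail_df3(x):
    """P(X >= x) for a Chi-square variable with exactly 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


p_value = chi_square_tail_df3(chi_square)  # assumes DF = 3

print([round(c, 5) for c in contributions])
print(round(chi_square, 5), round(p_value, 4))
```

For other degrees of freedom, a statistics package would normally supply the tail probability.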
H-null: There is no association between taking aspirin and having a heart attack. H-alt: There is an association between taking aspirin and having a heart attack. (That is, those taking aspirin are either more likely or less likely to have a heart attack than those taking a placebo.)
Here is a two-way table representing the data from this famous study:
| | Heart Attack | No Heart Attack |
| Aspirin | 104 | 10 933 |
| Placebo | 189 | 10 845 |
The table lists the number of subjects with each possible combination of treatment and outcome. For example, there were 104 subjects treated with aspirin who had a heart attack during the course of the study.
We can get more information from this table by adding another row and two additional columns:
| | Heart Attack | No Heart Attack | Total | Rate per 1000 |
| Aspirin | 104 | 10 933 | 11 037 | 9.4 |
| Placebo | 189 | 10 845 | 11 034 | 17.1 |
| Total | 293 | 21 778 | 22 071 | 13.3 |
Now we can clearly see that roughly the same number of subjects were in each treatment group, and that the aspirin group had a lower rate of heart attack. In fact the rate of heart attack for the aspirin group was only a little more than half the rate for the placebo group.
$$\text{Relative risk} = \frac{\text{rate for one group}}{\text{rate for other group}}$$

So in this case, we can say that the rate of heart attack for those taking the placebo was 1.82 (17.1 / 9.4) times the rate of heart attack for those taking aspirin. Sometimes this is expressed as an increased risk of 82%.
Of course, we can reverse the roles of the two groups, computing the relative risk to be 0.55 (9.4 / 17.1), and say that taking aspirin reduces the risk of heart attack by 45%.
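The rates and both relative risks can be checked with a few lines of Python. This is only an illustrative sketch: the variable and function names are invented here, and the group totals are computed by adding the two cells in each row of the table.

```python
# Counts from the two-way table: (heart attack, no heart attack).
aspirin_attacks, aspirin_total = 104, 104 + 10933
placebo_attacks, placebo_total = 189, 189 + 10845


def rate_per_1000(events, total):
    """Number of events per 1000 subjects."""
    return 1000 * events / total


aspirin_rate = rate_per_1000(aspirin_attacks, aspirin_total)
placebo_rate = rate_per_1000(placebo_attacks, placebo_total)

# Relative risk depends on which group goes in the numerator.
placebo_vs_aspirin = placebo_rate / aspirin_rate
aspirin_vs_placebo = aspirin_rate / placebo_rate

print(round(aspirin_rate, 1), round(placebo_rate, 1))
print(round(placebo_vs_aspirin, 2), round(aspirin_vs_placebo, 2))
```

The two relative risks are reciprocals of each other, so they describe the same comparison from opposite directions.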
At least that was the case for those in the study. That leaves us with at least two questions:
Had we divided the doctors into two groups randomly (group A and group B) and not given either group any aspirin or placebo, we would not expect both groups to have exactly the same number of heart attacks. Is the difference we see in this study large enough to be considered statistically significant? Or might it be attributed to random chance? That is what the Chi-Square test will help us determine.
This is a trickier matter. Aspirin seemed to cut the risk of heart attack by about half in the study. But will it reduce your risk? That depends on who you are. Results extend best to populations that are most similar to male physicians. The answer to this question lies in additional studies and in the medical explanations for the effects of aspirin. If we know why aspirin reduced the rate of heart attack in the physicians, then we could say better in what other populations it might also do so.
Chi-Square Test
Expected counts are printed below observed counts
Heart Attack?
Yes No Total
Aspirin 104 10933 11037
146.52 10890.48
Placebo 189 10845 11034
146.48 10887.52
Total 293 21778 22071
Chi-Sq = 12.339 + 0.166 +
12.343 + 0.166 = 25.014
DF = 1, P-Value = 0.000
The Chi-Square statistic is computed by adding the value of

$$\frac{(\text{observed} - \text{expected})^2}{\text{expected}}, \quad \text{so} \quad 12.339 = \frac{(104 - 146.52)^2}{146.52}, \quad \text{etc.,}$$

for each of the four cells in the original two-way table. The four values are added together to produce (in this example) 25.014.
How big is 25? That answer is given by the P-Value. It is listed here as 0.000, which means that it is less than 0.0005 (else it would round to more than 0.000). This P-value was so small that the study was actually terminated early. The evidence was so overwhelming in favor of aspirin that those conducting the study could no longer justify withholding it from the placebo group.
Notice that the expected counts in the two rows are approximately the same. That is because there were roughly equal numbers in each treatment group. Had there been more in one group than in the other, we should have expected more heart attacks (and more non-heart attacks) in the larger group. More specifically, since 11 037 of the 22 071 subjects took aspirin, we would expect (if the Null Hypothesis is in fact true) that of the 293 heart attacks, approximately

$$\frac{11\,037}{22\,071}(293) = \frac{11\,037 \times 293}{22\,071} = 146.52$$

would occur in the aspirin group. That's simply the fair share for that group. In general, the expected count is given by

$$\text{expected} = \frac{\text{Row Total}}{\text{Grand Total}}(\text{Column Total}) = \frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}$$
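As a rough check on the Minitab output, the expected counts and the Chi-Square statistic for the aspirin table can be computed directly from the row and column totals. The sketch below is plain Python, not Minitab's own procedure. The P-value line relies on the fact that, for 1 degree of freedom, the Chi-square tail probability equals `erfc(sqrt(x / 2))`.

```python
import math

observed = [[104, 10933],   # Aspirin: heart attack, no heart attack
            [189, 10845]]   # Placebo: heart attack, no heart attack

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# expected = row total * column total / grand total, for every cell
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum (observed - expected)^2 / expected over the four cells.
chi_square = sum((o - e) ** 2 / e
                 for obs_row, exp_row in zip(observed, expected)
                 for o, e in zip(obs_row, exp_row))

# Tail probability for 1 degree of freedom (a 2-by-2 table).
p_value = math.erfc(math.sqrt(chi_square / 2))

print([[round(e, 2) for e in row] for row in expected])
print(round(chi_square, 3), p_value)
```

The printed P-value is far below 0.0005, which is why Minitab reports it as 0.000.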
Suppose the study had been only about one-tenth as large. If the percentages of heart attacks remained roughly the same, the data would have been the following:
| | Heart Attack | No Heart Attack | Total |
| Aspirin | 10 | 1093 | 1103 |
| Placebo | 19 | 1085 | 1104 |
| Total | 29 | 2178 | 2207 |
Expected counts are printed below observed counts
Heart Attack?
Yes No Total
Aspirin 10 1093 1103
14.49 1088.51
Placebo 19 1085 1104
14.51 1089.49
Total 29 2178 2207
Chi-Sq = 1.393 + 0.019 +
1.392 + 0.019 = 2.822
DF = 1, P-Value = 0.093
Notice how much less significant the result is with a sample of this size!
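To see the effect of sample size side by side, one can run the same calculation on both versions of the table. The function name `chi_square_2x2` is invented for this sketch, and the P-value formula assumes 1 degree of freedom, as in the Minitab output.

```python
import math


def chi_square_2x2(table):
    """Return (Chi-square, P-value) for a 2-by-2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            chi_square += (obs - exp) ** 2 / exp
    # Tail probability for 1 degree of freedom.
    return chi_square, math.erfc(math.sqrt(chi_square / 2))


full_study = [[104, 10933], [189, 10845]]
small_study = [[10, 1093], [19, 1085]]

chi_full, p_full = chi_square_2x2(full_study)
chi_small, p_small = chi_square_2x2(small_study)

print(round(chi_full, 3), p_full)
print(round(chi_small, 3), round(p_small, 3))
```

The heart attack rates in the two tables are nearly identical, yet the smaller table gives a P-value near 0.09 while the full study gives one far below 0.0005.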
As an example, let's look at the data from Survey 1. We might be interested, for example, in whether men get more tickets than women. In order to do a Chi-square analysis, we first must decide what categorical variable to use for "getting tickets". One way to do this would be to compare men and women to see who has received any tickets at all. If we do so, Minitab produces the following output:
Rows: Sex Columns: any tickets
No Yes All
F 74.00 26.00 100.00
37 13 50
32.22 17.78 50.00
M 52.50 47.50 100.00
21 19 40
25.78 14.22 40.00
All 64.44 35.56 100.00
58 32 90
58.00 32.00 90.00
Chi-Square = 4.483, DF = 1, P-Value = 0.034
Cell Contents --
% of Row
Count
Exp Freq
Another way to do this would be to look at multiple offenders (those with 2 or more tickets). Here are the results:
Rows: Sex Columns: multiple tickets
No Yes All
F 92.00 8.00 100.00
46 4 50
40.56 9.44 50.00
M 67.50 32.50 100.00
27 13 40
32.44 7.56 40.00
All 81.11 18.89 100.00
73 17 90
73.00 17.00 90.00
Chi-Square = 8.706, DF = 1, P-Value = 0.003
Cell Contents --
% of Row
Count
Exp Freq
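The three numbers Minitab prints in each cell (row percentage, count, and expected frequency) can be recomputed from the counts alone. The following is a hedged Python sketch, not Minitab's algorithm. The function name `summarize` and the variable names are invented for illustration, and the P-value formula assumes 1 degree of freedom.

```python
import math


def summarize(table):
    """Row percentages, expected counts, Chi-square and P-value (DF = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    row_pct = [[100 * x / r for x in row]
               for row, r in zip(table, row_totals)]
    expected = [[r * c / grand_total for c in col_totals]
                for r in row_totals]
    chi_square = sum((o - e) ** 2 / e
                     for obs_row, exp_row in zip(table, expected)
                     for o, e in zip(obs_row, exp_row))
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return row_pct, expected, chi_square, p_value


# Rows: F, M.  Columns: No, Yes.
any_tickets = [[37, 13], [21, 19]]
multiple_tickets = [[46, 4], [27, 13]]

pct_any, exp_any, chi_any, p_any = summarize(any_tickets)
pct_multi, exp_multi, chi_multi, p_multi = summarize(multiple_tickets)

print(pct_any, [[round(e, 2) for e in row] for row in exp_any])
print(round(chi_any, 3), round(p_any, 3))
print(round(chi_multi, 3), round(p_multi, 3))
```

Comparing the two calls shows how the choice of categorical variable (any tickets versus multiple tickets) changes the statistic even though the same 90 respondents are being analyzed.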
Notice what these results say and don't say.
There may be many explanations for this data besides that men drive faster or more carelessly: perhaps they drive more miles, perhaps they drive cars that police officers are more likely to stop, perhaps they are less likely to "get out of a ticket" once pulled over, maybe the men were older (so they had more time to get tickets), etc.
One final note about statistical design. In a situation like the one we just looked at, there are two ways to design the study:
There are, however, a few little details to keep in mind.
This page is maintained by Randall Pruim. Please email comments, corrections, suggestions and the like to rpruim@calvin.edu.
Last Modified: Thursday, 11-Jan-2001 16:02:48 EST