Test 1 Summary


There were 95 points on this test, but you should probably think of it as 92 points plus 3 bonus points. The 3 bonus points:

  • 1 point for selecting your sample from non-breakfast eaters in problem 5 to avoid the possible ethical issue of having students who normally eat breakfast stop eating breakfast for your study. It also makes your subjects more similar to each other, which might reduce variability, also a good thing in most cases.

  • 2 points for (correctly) using a paired design in problem 8d. We certainly covered paired designs, so this isn’t as much of a stretch, but most of you opted for a 2 independent samples design (or some mixture of paired and 2-sample).
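
To see why pairing can matter, here is a minimal R sketch. The numbers are made up for illustration (they are not from problem 8d), and the names before, after, paired, and two_sample are mine. Each of 8 hypothetical subjects is measured under two conditions. The subjects differ a lot from one another, but each one changes by a similar small amount, so the paired analysis sees a clear difference while the 2 independent samples analysis does not.

```r
# Hypothetical scores for 8 subjects measured under two conditions.
# These numbers are invented for illustration only.
before <- c(60, 72, 85, 55, 90, 68, 77, 81)
after  <- before + c(3, 2, 4, 1, 3, 2, 5, 2)

# Paired analysis: works with the 8 within-subject differences.
paired <- t.test(after, before, paired = TRUE)

# 2 independent samples analysis: ignores which scores belong together.
two_sample <- t.test(after, before)

paired$p.value       # small: the differences are consistently positive
two_sample$p.value   # large: subject-to-subject variability hides the change
```

The mean difference is the same (2.75) in both analyses. What changes is the variability it is compared against.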

The point breakdown per page was

page   points
1      16
2      22
3      18
4      15
5      24


The plot below shows the distribution of scores colored by approximate letter grade (A, B, C, etc.).

[Plot: distribution of Test 1 scores, colored by approximate letter grade]

Test 2

The plot below shows the distribution of scores colored by approximate letter grade (A, B, C, etc.).

[Plot: distribution of Test 2 scores, colored by approximate letter grade]

Here is a plot showing the relationship between homework scores and test 2 scores. The colors are letter grade colors from test 1.

[Plot: homework scores versus Test 2 scores, colored by Test 1 letter grade]

Test 3

Test 3 had 92 points. The plot below shows the distribution of scores on each test colored by approximate letter grade (A, B, C, etc.).

[Plot: distribution of scores on each test, colored by approximate letter grade]

Some comments on Test 3

As always, I’m happy to discuss individual test problems with you. But since I’ll be out of town for a day and the final is coming soon, I thought I would write up some notes about the test.

  1. Problem 1 went well for most of you. Increasing the confidence level will increase the width of a confidence interval. This makes sense. If we want a confidence interval to have a better chance of “covering” the correct value, we need our confidence interval to cover more, i.e., be wider.

  2. This went well for most of you. Some of you computed the chi-squared statistic by hand (using a calculator, I assume) and some of you used chisq.test(). Either way was fine.

  3. This is going better each test for most of you. A few of you still sometimes confuse variables and statistics.
  • The one that caused the most trouble in that regard this time was the “success rate” of the smoking cessation programs. To calculate that rate, the variable you need is whether or not someone quit smoking (a categorical variable).
  • The smoking problem actually involves three variables: quit smoking (yes or no), program (one of three), and sex (male or female). This was a bit of a goof on my part. I intended to have “none of the above” in my list, but I see now that I did not. So in addition to accepting “none of the above”, I was pretty lenient as long as you chose reasonable variables and an analysis method that fit what you chose. Many of you ignored sex and focused on success rates of the 3 programs, which could be compared using a Chi-squared test for 2-way tables.
  • The last item was interesting in that there were two approaches. If you think of the fly papers as the cases, then the variables are color and number of bugs, and you can do a 1-way ANOVA. If you think of the flies as cases, then you can do a Chi-squared goodness of fit test. The latter isn’t quite as good since you can really only test whether the bugs are equally likely to be caught on any of the colors. The 1-way ANOVA lets us do more, like Tukey’s Honest Significant Differences to see which colors differ from which and by how much.
  • On the final I may ask for additional information. In particular, I might ask you to identify what the cases are. This is a good thing for you to do anyway, but sometimes it was difficult for me to be sure what you were considering to be a case.
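
The Chi-squared test for 2-way tables mentioned above for the three programs can be done by hand or with chisq.test(), as in problem 2. Here is a minimal R sketch showing that the two routes agree. The counts are made up (they are not the numbers from the test), and the program labels A, B, C and the object names are just for illustration.

```r
# Hypothetical results for three smoking cessation programs.
# These counts are invented for illustration only.
quit <- matrix(c(30, 70,
                 45, 55,
                 25, 75),
               nrow = 3, byrow = TRUE,
               dimnames = list(program = c("A", "B", "C"),
                               quit    = c("yes", "no")))

# By hand: expected count = (row total * column total) / grand total.
expected <- outer(rowSums(quit), colSums(quit)) / sum(quit)
x2_hand  <- sum((quit - expected)^2 / expected)
p_hand   <- pchisq(x2_hand, df = (3 - 1) * (2 - 1), lower.tail = FALSE)

# With chisq.test(): same statistic, degrees of freedom, and p-value.
result <- chisq.test(quit)

c(by_hand = x2_hand, chisq_test = unname(result$statistic))
c(by_hand = p_hand,  chisq_test = result$p.value)
```

For these made-up counts the statistic is 9.75 on 2 degrees of freedom either way.
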
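
The two approaches to the fly paper item can also be sketched in R. Again the data are made up (3 colors, 5 fly papers per color), and the names flies, fit, tukey, totals, and gof are mine. The point is only to show what each choice of cases leads to.

```r
# Hypothetical numbers of bugs caught on 15 fly papers (5 per color).
# These numbers are invented for illustration only.
flies <- data.frame(
  color = factor(rep(c("red", "blue", "yellow"), each = 5)),
  bugs  = c(10, 12,  9, 11, 13,
            14, 15, 13, 16, 12,
            25, 28, 24, 27, 26)
)

# Approach 1: fly papers are the cases; variables are color and bugs.
fit   <- aov(bugs ~ color, data = flies)
summary(fit)
tukey <- TukeyHSD(fit)   # which colors differ, and by how much
tukey

# Approach 2: flies are the cases; the only variable is color.
totals <- tapply(flies$bugs, flies$color, sum)
gof    <- chisq.test(totals)   # H0: a bug is equally likely to land on any color
gof
```

Notice that approach 2 throws away the paper-to-paper variability: it only uses the three totals.
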
  4. The regression problem went pretty well for most of you. Some of you didn’t remember how to compute a residual (observed - predicted) or confused it with the residuals as defined in Chi-squared tests.

  5. Many good responses here. When people made mistakes, they typically didn’t consider both the differences in the means (between/among group variability) and the spread within the groups (within group variability), or they misinterpreted the role one of those two elements plays.

  6. The two assumptions for ANOVA are that (a) each population group is normally distributed, and (b) each population group has the same standard deviation. Our rule of thumb is that the ratio of the largest sample standard deviation to the smallest sample standard deviation should be no more than 2:1. In this case the sample standard deviations are all very close, so we have no cause for concern on those grounds. We don’t really have information on the test paper to allow us to check the normality assumption. (That’s a hard one to check in such small samples anyway.)

  7. This problem had its ups and downs.

  • The most disappointing question on the test was 7d. Many of you had trouble expressing what a p-value is. If it helps, recall our very first example of the Lady Tasting Tea. We simulated a world in which the Lady was just guessing by tossing coins and saw that getting 9 or 10 of the cups correct just by guessing was pretty unlikely. That “unlikely” is the p-value. The p-value is the probability of observing data at least as unusual as our actual data, assuming the null hypothesis is true. In this case that would be the probability of observing such a large difference in the survival rates of the animals (at least as big as the 10% vs 5% that we observed) if the vaccine actually made no difference at all.

  • I was looking for more than just “reject \(H_0\)” or “do not reject \(H_0\)”. I was looking for an answer that said what this means in context. Our p-value is not small enough to reject \(H_0\). What does that mean? It does not mean that \(H_0\) must be true. It says that \(H_0\) is consistent with our data – we do not have enough evidence to be convinced \(H_0\) is false. But the problem might be that we just don’t have enough evidence, period. Some of you suggested that things were interesting enough that we might like a larger data set to see what is really going on. That’s a good suggestion (if you can afford it).

  • On the other hand, the computations on the second page of this problem went very well for the majority of you.
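
Returning to the p-value in 7d: here is a minimal R sketch of the Lady Tasting Tea reasoning. It assumes 10 cups and a Lady who is just guessing (each cup is a fair coin toss). The seed, the 10,000 repetitions, and the object names are arbitrary choices of mine.

```r
# The Lady Tasting Tea: 10 cups, and under H0 she is just guessing,
# so each cup is a fair coin toss (probability 0.5 of being correct).
set.seed(123)                                    # arbitrary seed, for reproducibility
correct <- rbinom(10000, size = 10, prob = 0.5)  # 10,000 simulated guessing Ladies

# Proportion of simulated Ladies who get 9 or 10 cups correct.
sim_p <- mean(correct >= 9)

# Exact probability of 9 or 10 correct when guessing: 11/1024.
exact_p <- sum(dbinom(9:10, size = 10, prob = 0.5))

sim_p
exact_p
```

The exact value is 11/1024, about 0.011, and the simulated proportion should land close to it. That number is a p-value: how often a world where \(H_0\) is true produces results at least as unusual as 9 correct cups.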

Points Breakdown

Here is the number of points on each page:

page   points
1      12
2      20
3      6
4      18
5      16
6-7    20
total  92