Math 143 C/E, Spring 2001

IPS Reading Questions
Chapter 7, Section 2 (pp. 537-548)

The tests of significance of Sections 6.2 and 7.1 have both been for means (one type of sample statistic for quantitative data), and both can be classified as 1-sample procedures. In Section 7.2 we learn about 2-sample procedures (two-sample z and two-sample t). Why is this naming scheme (one vs. two sample) appropriate?
The naming is appropriate: the 1-sample procedures require collecting only one sample, while the 2-sample procedures require two. (Note that this holds even for a matched-pairs t test, which analyzes a single sample of paired differences and so counts as a 1-sample procedure.) The 2-sample procedures are generally intended to compare two potentially different populations, with one sample drawn from each.

Suppose that x₁ and x₂ are quantitative variables that come from populations known to be distributed as N(μ₁, σ₁) and N(μ₂, σ₂) respectively. What distribution should you expect for x₂ - x₁? How about for x̄₂ - x̄₁, where x̄₁ represents the mean value of x₁ for a sample of n₁ units taken from the first population, and x̄₂ is found similarly from a sample of n₂ units taken from the second population?
Assuming the two variables (and the two samples) are independent, the (population) distribution for x₂ - x₁ will be N(μ₂ - μ₁, (σ₁² + σ₂²)^(1/2)). The other (sampling) distribution would be N(μ₂ - μ₁, (σ₁²/n₁ + σ₂²/n₂)^(1/2)).
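
As a quick numerical check of the two standard deviations in the answer above, here is a minimal Python sketch. The population means, standard deviations, and sample sizes are made-up values chosen so the arithmetic comes out cleanly; they do not come from IPS, and the function names are just illustrative. Independence of the two populations (and samples) is assumed.

```python
import math

def sd_of_difference(sigma1, sigma2):
    """Standard deviation of x2 - x1 for independent variables."""
    return math.sqrt(sigma1**2 + sigma2**2)

def sd_of_mean_difference(sigma1, n1, sigma2, n2):
    """Standard deviation of xbar2 - xbar1 for independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Hypothetical population parameters and sample sizes (not from the text).
mu1, sigma1, n1 = 10.0, 3.0, 9
mu2, sigma2, n2 = 12.0, 4.0, 16

center = mu2 - mu1                                        # mean of both distributions
pop_sd = sd_of_difference(sigma1, sigma2)                 # sqrt(9 + 16)
samp_sd = sd_of_mean_difference(sigma1, n1, sigma2, n2)   # sqrt(9/9 + 16/16)

print("x2 - x1       ~ N(%.1f, %.3f)" % (center, pop_sd))
print("xbar2 - xbar1 ~ N(%.1f, %.3f)" % (center, samp_sd))
```

Both distributions share the same center; only the spread shrinks when individual values are replaced by sample means.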

What would be the center of any level C confidence interval for the difference of means μ₂ - μ₁? Suppose the 95% confidence interval for this difference was (-1.75, -0.36). Which of the means μ₁ and μ₂ would you say (with 95% confidence) is larger? Would you say the same if that 95% confidence interval were (-1.75, 0.61)?
The center of any such confidence interval is x̄₂ - x̄₁. In the former case, you could say (with at least 95% confidence) that μ₁ is larger than μ₂, since every value in the interval is negative. In the latter case, you cannot make this claim, because the interval contains 0. There is a different level of confidence C (don't ask me what it is, but the value of C must be smaller than 95%; in fact, small enough so that the associated level C confidence interval comprises only negative numbers) for which you could say (with confidence C) that μ₁ is larger than μ₂.
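
The reasoning in this answer can be sketched in a few lines of Python, using the two intervals from the question. The helper names are hypothetical, and the rule encoded here is only the informal one used above: draw a conclusion about which mean is larger only when the whole interval lies on one side of 0.

```python
def interval_center(lower, upper):
    """Midpoint of the interval, i.e. the observed difference of sample means."""
    return (lower + upper) / 2

def which_mean_is_larger(lower, upper):
    """Interpret a confidence interval for (second mean - first mean)."""
    if upper < 0:
        return "first mean is larger"
    if lower > 0:
        return "second mean is larger"
    return "no conclusion (interval contains 0)"

for lower, upper in [(-1.75, -0.36), (-1.75, 0.61)]:
    print((lower, upper),
          "center = %.3f;" % interval_center(lower, upper),
          which_mean_is_larger(lower, upper))
```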

On p. 541 we receive the news that, when sample standard deviations replace the population standard deviations in the two-sample z statistic, the resulting statistic (a two-sample t statistic) does not have a t distribution. Nevertheless we are told how to find a value of df (we will use method 2) so that this statistic is approximately distributed as t(df). At this point, the authors say that you can count on this approximation being conservative, a word that they have used before for approximate distributions (on p. 516). Just what does the word "conservative" mean for us?
In the case of level C confidence intervals, it means that, if the approximation is off, our margins of error will be a little larger than necessary (so that our actual confidence that the population parameter lies inside the interval is, if anything, a little higher than C). In the case of a test of significance, we are even less likely to reject the null hypothesis erroneously (that is, when it is in fact true), because the P-values, if anything, will be larger than they would be if no approximation were necessary.
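
To make "conservative" a bit more concrete, the sketch below computes a two-sample t statistic along with two candidate values of df. It assumes that "method 2" means taking the smaller of n₁ - 1 and n₂ - 1 (the handout does not spell this out), and it compares that with the Welch-Satterthwaite approximation that statistical software commonly uses. The summary statistics are made-up numbers and the function names are illustrative only.

```python
import math

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Two-sample t statistic for the difference xbar2 - xbar1."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar2 - xbar1) / standard_error

def conservative_df(n1, n2):
    """Assumed 'method 2': the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Hypothetical sample summaries (not from the text).
xbar1, s1, n1 = 10.0, 3.0, 9
xbar2, s2, n2 = 12.0, 4.0, 16

t_stat = two_sample_t(xbar1, s1, n1, xbar2, s2, n2)
df_small = conservative_df(n1, n2)
df_welch = welch_df(s1, n1, s2, n2)

print("t = %.3f" % t_stat)
print("conservative df = %d, Welch df = %.2f" % (df_small, df_welch))
```

The smaller-df choice never exceeds the Welch value. A t distribution with fewer degrees of freedom has heavier tails, so it gives larger critical values (wider intervals) and larger P-values, which is the sense of "conservative" described above.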

Notice how the paragraph on p. 545 that immediately follows Example 7.15 addresses concerns raised by the seven critical components of a study. Ideally, any study you read about in the news media would give information like this explaining some of the pitfalls of that type of study as well as how they were addressed whenever possible by the researcher. (Note: There is no question posed here.)

Under what conditions are the results of the two-sample procedures most trustworthy?
If the two populations under consideration have similar distributions, the results should be trustworthy even for sample sizes as small as 5 (from each population). In general, even if the two population distributions have shapes that are quite different, the 2-sample procedures are pretty safe to use (i.e., will yield fairly accurate and conservative results) so long as the two sample sizes are roughly equal and the total number of units (from both samples combined) is at least 40. Assuming that the researcher has no difficulty getting 40 units to study, the bulk of attention needs to be placed upon getting an SRS or something close enough to it, since a biased sample contaminates any conclusions drawn later using statistical inference, no matter what the sample size.