Math 143 C/E, Spring 2001
IPS Reading Discussion Questions
Chapter 3, Section 4
A statistic is a quantity that is computed from the sample. Because it comes from data we actually have in hand, its value can be determined. A parameter is a quantity that describes the entire population. We generally do not know its value, but the desire to know it prompts us to select a sample and compute a statistic as an approximation to it.
Actually, these were not sampling distributions, but were a lot like them. A true sampling distribution would have included the percentage from every possible sample of size 10 (there are about 17 trillion such samples), not just the one sample per classmate that was in our dotplot. Moreover, no sample would appear twice in a sampling distribution; we made no attempt to make sure that such a thing did not happen in our dotplot. That said, it is unlikely that two people in class did wind up with the same sample of Senators, and we may well start to get the idea of the shape of the sampling distribution from the few samples that we included. (We just have to fill in another 17 trillion values to be sure.)
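The figure of about 17 trillion can be checked with a short calculation. The sketch below assumes the population is the 100 U.S. Senators and that a sample is an unordered choice of 10 of them. The class size of 30 is a made-up number used only for illustration, and the chance of a repeated sample is a rough upper bound rather than an exact value.

```python
import math

# Assumption: the population is the 100 U.S. Senators and each sample has size 10.
POPULATION_SIZE = 100
SAMPLE_SIZE = 10

def count_samples(population_size, sample_size):
    """Number of distinct samples (order ignored, no member chosen twice)."""
    return math.comb(population_size, sample_size)

def chance_of_repeated_sample(class_size, num_samples):
    """Rough upper bound on the chance that some pair of classmates
    drew exactly the same sample: (number of pairs) / (number of samples)."""
    pairs = class_size * (class_size - 1) // 2
    return pairs / num_samples

num_samples = count_samples(POPULATION_SIZE, SAMPLE_SIZE)
print(f"Possible samples of size 10: {num_samples:,}")

# Hypothetical class size, chosen only for illustration.
repeat_chance = chance_of_repeated_sample(30, num_samples)
print(f"Chance of a repeated sample is at most about {repeat_chance:.1e}")
```

With so many possible samples, even a large class covers only a vanishing fraction of the true sampling distribution.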
An increase in sample size decreases the spread (the standard deviation, specifically) of the sampling distribution of that statistic. What this means practically is that you can generally be more confident about the accuracy of the sample statistic in approximating the population parameter for larger SRSs than you can for smaller ones.
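A small simulation can make the shrinking spread visible. This is only a sketch: the population of 10,000 individuals with 30% having some trait, the sample sizes, and the 2000 repetitions are all hypothetical choices, and the function name is ours.

```python
import random
import statistics

def simulate_sample_proportions(population, sample_size, num_samples, rng):
    """Draw repeated SRSs and return the sample proportion from each one."""
    return [
        sum(rng.sample(population, sample_size)) / sample_size
        for _ in range(num_samples)
    ]

# Hypothetical population: 10,000 individuals, 30% with the trait (coded 1).
population = [1] * 3000 + [0] * 7000
rng = random.Random(143)  # fixed seed so the run is repeatable

spreads = {}
for n in (10, 40, 160):
    proportions = simulate_sample_proportions(population, n, 2000, rng)
    spreads[n] = statistics.stdev(proportions)
    print(f"n = {n:3d}: standard deviation of sample proportions = {spreads[n]:.3f}")
```

Each time the sample size is multiplied by 4, the standard deviation should come out at roughly half its previous value, which is the pattern the square-root rule for sample proportions predicts.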
Sample size (usually) does affect the variability of a statistic, but population size generally does not. So, a sample of 2500 from a population of 740,000 inhabitants does yield a more reliable statistic (in general) than a sample of size 100 from this same population. However, if we hold the sample size fixed while varying the population size (from 740,000 to 270 million), there is little effect on the reliability of the statistic. There is no contradiction here.
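To see why there is no contradiction, we can compute the standard deviation of a sample proportion from an SRS drawn without replacement, using the standard formula with the finite population correction factor (N - n)/(N - 1). The population proportion of 0.5 below is a hypothetical value, and the function name is ours; the sample and population sizes are the ones in the answer above.

```python
import math

def sd_sample_proportion(p, sample_size, population_size):
    """Standard deviation of a sample proportion for an SRS drawn
    without replacement, including the finite population correction."""
    base = p * (1 - p) / sample_size
    correction = (population_size - sample_size) / (population_size - 1)
    return math.sqrt(base * correction)

P = 0.5  # hypothetical population proportion, for illustration only

results = {}
for n in (100, 2500):
    for big_n in (740_000, 270_000_000):
        results[(n, big_n)] = sd_sample_proportion(P, n, big_n)
        print(f"n = {n:5d}, N = {big_n:11,d}: sd = {results[(n, big_n)]:.5f}")
```

Going from n = 100 to n = 2500 cuts the standard deviation from about 0.05 to about 0.01, while changing N from 740,000 to 270 million barely moves it, because the correction factor is nearly 1 whenever the sample is a tiny fraction of the population.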
Suppose you want to know about the general air quality in the U.S. If the air in your sample comes only from the vicinity of Newark, NJ, the proportion of contaminants you find is likely to be biased upward, overstating the contamination of U.S. air as a whole. The sampling distribution for samples taken in this fashion (only from Newark) will almost certainly have a higher mean level of contaminant than the sampling distribution for samples drawn from air across the whole country.
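The effect of bias on the center of a sampling distribution can also be sketched by simulation. Every number below is invented for illustration: the contaminant readings, the size of the polluted region, and the sample size are hypothetical and are not measurements of any real air.

```python
import random
import statistics

rng = random.Random(2001)  # fixed seed so the run is repeatable

# Entirely hypothetical contaminant readings (arbitrary units).
# One region is assumed to be more polluted than the rest of the country.
polluted_region = [rng.gauss(80, 10) for _ in range(1000)]
rest_of_country = [rng.gauss(40, 10) for _ in range(9000)]
whole_country = polluted_region + rest_of_country

def mean_of_sample_means(units, sample_size, num_samples, rng):
    """Average the sample mean over many SRSs drawn from `units`."""
    means = [
        statistics.mean(rng.sample(units, sample_size))
        for _ in range(num_samples)
    ]
    return statistics.mean(means)

true_mean = statistics.mean(whole_country)
srs_center = mean_of_sample_means(whole_country, 25, 1000, rng)
biased_center = mean_of_sample_means(polluted_region, 25, 1000, rng)

print(f"Population mean:                      {true_mean:.1f}")
print(f"Center, SRSs from the whole country:  {srs_center:.1f}")
print(f"Center, samples from one region only: {biased_center:.1f}")
```

Samples drawn from the whole country center on the population mean, while samples drawn only from the polluted region center well above it. Taking more or larger samples from that one region would not fix this, since bias concerns where the sampling distribution is centered, not how spread out it is.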