Math 143 C/E, Spring 2001

IPS Reading Discussion Questions
Chapter 3, Section 4

How do the concepts of statistic and parameter relate to those of population and sample?
A statistic is a quantity that is computed from the sample, so its value can actually be determined. A parameter is a quantity that describes the entire population. We generally do not know its value, but the desire to know it prompts us to select a sample and compute an approximate value (a statistic) for it.

In class on Wed., 2/7, we carried out an activity in which we each randomly selected 10 Senators and determined the percentage in the sample that were women, that were Democrats, and that came from states beginning with the letter 'M'. For each of these three random variables, we made a dotplot that included the percentage in your sample as well as that of each of your classmates. Was this dotplot an example of a sampling distribution?
Actually, these were not sampling distributions, but were a lot like them. A true sampling distribution would have included the percentage from every possible sample of size 10 (there are about 17 trillion such samples), not just the one sample per classmate that was in our dotplot. Moreover, no sample would appear twice in a sampling distribution; we made no attempt to make sure that such a thing did not happen in our dotplot. That said, it is unlikely that two people in class did wind up with the same sample of Senators, and we may well start to get the idea of the shape of the sampling distribution from the few samples that we included. (We just have to fill in another 17 trillion values to be sure.)
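The count of possible samples and the classroom activity can be sketched in a few lines of Python. This is only an illustration: the figure of 13 women in the Senate, the class size of 30, and the names `classroom_dotplot_values` and `num_women` are assumptions made for the example, not values taken from the activity.

```python
import math
import random

NUM_SENATORS = 100
SAMPLE_SIZE = 10

# Number of distinct samples of 10 Senators that can be chosen from 100.
num_samples = math.comb(NUM_SENATORS, SAMPLE_SIZE)
print(num_samples)  # 17310309456440, i.e. about 17 trillion


def classroom_dotplot_values(num_students, num_women=13, seed=0):
    """Return one 'percent women' value per student, as in the class dotplot.

    num_women is an assumed count used only for illustration.
    """
    rng = random.Random(seed)
    # 1 marks a woman, 0 marks a man.
    senate = [1] * num_women + [0] * (NUM_SENATORS - num_women)
    values = []
    for _ in range(num_students):
        sample = rng.sample(senate, SAMPLE_SIZE)  # an SRS, without replacement
        values.append(100 * sum(sample) / SAMPLE_SIZE)
    return values


# A hypothetical class of 30 students gives 30 dots, not 17 trillion.
class_values = classroom_dotplot_values(30)
print(sorted(class_values))
```

The 30 values only hint at the shape of the sampling distribution; the full distribution would need the percentage from every one of the `num_samples` possible samples.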

A statistic is a number that describes a sample. Examples include the mean height of a sample of students, the proportion of clam shells in a day's catch that contain pearls, etc. When a statistic is computed and plotted for each of many samples of size n, we begin to get this statistic's sampling distribution (actually, we would have to take every sample of the given size, computing the statistic each time, before we would have the full sampling distribution). As we will see, it is not unusual for a sampling distribution to be approximately normal, with a large spread (standard deviation) when the statistic varies a good deal from sample to sample and a small spread when it varies little. How does increasing sample size affect variability?
An increase in sample size decreases the spread (specifically, the standard deviation) of the sampling distribution of that statistic. Practically, this means that you can generally be more confident that the sample statistic is close to the population parameter for larger SRSs than for smaller ones.
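A small simulation can make this shrinking spread visible. The sketch below assumes a population proportion of 0.5 and draws each observation independently (as if the population were very large); the function name `spread_of_sample_proportion`, the sample sizes, and the number of repetitions are all choices made for illustration.

```python
import random
import statistics


def spread_of_sample_proportion(n, p=0.5, reps=1000, seed=1):
    """Estimate the standard deviation of the sample proportion for samples of size n."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(reps):
        # Each observation is a "success" with probability p (assumed value).
        successes = sum(rng.random() < p for _ in range(n))
        proportions.append(successes / n)
    return statistics.stdev(proportions)


# Compare the spread for three sample sizes.
spreads = {n: spread_of_sample_proportion(n) for n in (10, 100, 1000)}
for n, sd in spreads.items():
    print(f"n = {n:5d}   estimated spread = {sd:.4f}")
```

Each tenfold increase in sample size should cut the estimated spread by roughly a factor of three (the square root of 10), though the simulated values will wobble a little around that.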

Read the last two paragraphs on p. 273. Does the information from these two paragraphs contradict your answer above? Why or why not?
Sample size (usually) does affect the variability of a statistic, but population size generally does not. So, a sample of 2500 from a population of 740,000 inhabitants does yield a more reliable statistic (in general) than a sample of size 100 from this same population. However, if we hold the sample size fixed while varying the population size (from 740,000 to 270 million), there is little effect on the reliability of the statistic. There is no contradiction here.
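The numbers behind this answer can be checked with the standard formula for the standard deviation of a sample proportion from an SRS drawn without replacement, which includes a finite population correction. That formula is not part of the reading; it is brought in here as an illustration, with an assumed population proportion of 0.5 and a hypothetical function name `sd_sample_proportion`.

```python
import math


def sd_sample_proportion(n, N, p=0.5):
    """Standard deviation of the sample proportion for an SRS of size n
    drawn without replacement from a population of size N.

    p = 0.5 is an assumed population proportion for illustration.
    """
    finite_population_correction = math.sqrt((N - n) / (N - 1))
    return math.sqrt(p * (1 - p) / n) * finite_population_correction


# Same sample size, very different population sizes: almost no change.
sd_small_pop = sd_sample_proportion(2500, 740_000)
sd_large_pop = sd_sample_proportion(2500, 270_000_000)

# Same population, smaller sample: a much larger spread.
sd_small_sample = sd_sample_proportion(100, 740_000)

print(f"n = 2500, N = 740,000:      {sd_small_pop:.5f}")
print(f"n = 2500, N = 270,000,000:  {sd_large_pop:.5f}")
print(f"n = 100,  N = 740,000:      {sd_small_sample:.5f}")
```

Under these assumptions the first two values are both very close to 0.01, while the third is about five times larger, which matches the claim that sample size matters and population size barely does.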

Statistics for samples that are SRSs tend to be unbiased estimators of population parameters. Give an example of a sample selection process and a statistic one might calculate from the resulting sample that would not be an unbiased estimator of a population parameter.
Suppose you want to know about the general air quality in the U.S. If the air you include in your sample comes only from the vicinity of Newark, NJ, the proportion of contaminants you find in the sample is likely to be biased toward showing more contamination than is true of the air in the U.S. as a whole. The sampling distribution for samples taken in this fashion (only from Newark) will almost certainly have a higher mean level of contaminant than the sampling distribution for samples drawn from all U.S. air.
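The effect of this kind of biased selection can be imitated with made-up numbers. Everything in the sketch below is invented for illustration: the contaminant levels, the split between an industrial area and everywhere else, and the names `industrial`, `elsewhere`, and `center_of_sample_means`. It is not real air-quality data.

```python
import random
import statistics

rng = random.Random(2)

# Entirely hypothetical contaminant levels (arbitrary units).
industrial = [rng.gauss(60, 10) for _ in range(500)]   # one heavily industrial area
elsewhere = [rng.gauss(30, 10) for _ in range(9500)]   # the rest of the country
population = industrial + elsewhere
true_mean = statistics.mean(population)  # the parameter we want to estimate


def center_of_sample_means(pool, n=25, reps=500):
    """Average the sample mean over many samples of size n drawn from pool."""
    sample_means = [statistics.mean(rng.sample(pool, n)) for _ in range(reps)]
    return statistics.mean(sample_means)


srs_center = center_of_sample_means(population)     # SRS from the whole population
biased_center = center_of_sample_means(industrial)  # samples from one area only

print(f"population mean:            {true_mean:.1f}")
print(f"center, SRS samples:        {srs_center:.1f}")
print(f"center, industrial samples: {biased_center:.1f}")
```

With these invented numbers, the sample means from SRSs center near the population mean, while the sample means taken only from the industrial area center well above it. Taking more samples of the second kind does not fix the problem, because the bias comes from how the samples are selected.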

File translated from TeX by TtH, version 2.87.
On 12 Feb 2001, 12:43.