Math 143 C/E, Spring 2001
IPS Reading Discussion Questions
Chapter 3, Section 2 (pp. 237-250)
Bias is a predisposition to one outcome or option
over others. In terms of a statistical study,
one does not want a biased design since a predisposition
for one outcome over others stands in the way of getting
the truth about the question the study was meant to
answer.
Suppose we wish to know whether one arrangement of keys on a computer keyboard, say, the standard U.S. layout (aka QWERTY), allows for faster typing than another arrangement, say, the Dvorak layout. We might give two groups of people a certain amount of time to practice with the layout assigned to them and then do a speed test. However, if all of the people in both groups are already experienced typists on the QWERTY layout, the design is biased in favor of that layout over the other.
A placebo is a treatment (like a salt tablet) that is known to have no physiological effect. When people receive a treatment that they think may improve some medical condition, it has been observed that the psychological effect alone can bring improved health. Thus an experiment meant to determine whether a certain drug combats a certain illness is likely to be biased towards the affirmative conclusion if the control group knows it is not receiving any treatment. Giving the control group a placebo (so that the subjects in that group have the same psychological benefit as those in the treatment group) removes this source of bias and makes it more reasonable to attribute any difference in outcome between the groups to the drug itself rather than to the psychological effect of being treated. A study in which the subjects do not know which treatment they are receiving is called a blind study. There is still the possibility that a researcher who knows which patients are in each of the two (control and treatment) groups may treat members of one group differently from those of the other, and there are situations in which this may also introduce a bias. That risk is removed when even the researcher doesn't know who is in which group (i.e., a double-blind study).
Only through an experiment can one establish a cause-and-effect relationship between the treatment and the observed response. A particularly interesting example is the effect of cigarette smoking on lung cancer. To establish causation, one would have to select people (probably at birth), have certain ones smoke for a number of years and forbid the others from smoking, and then monitor each subject for lung cancer. Such an experiment would obviously be unethical. All of the evidence we have about the relationship between the two comes from (numerous) observational studies, and, as a result, the U.S. Surgeon General's Office has taken years to progress to the point of saying, "smoking cigarettes causes cancer."
There is just one factor (categorical variable) distinguishing the groups, that being the type of assistance the utility company provides in monitoring energy usage. There are three values (levels) of monitoring assistance: installing a meter in the house that shows the members of the household their energy use, providing information and charts for people to monitor their own use, and no assistance at all.
Random assignment is the best way to control the effect that lurking variables (unforeseen factors) might have on the outcome of the experiment, since more often than not it spreads experimental units with similar values of those lurking variables evenly across the treatment groups. The "more often than not" statement is put on more secure footing when there are many units/subjects in the study.
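As an illustration only, the following Python sketch shows one simple way to carry out a completely randomized assignment: shuffle the units, then deal them out to the treatments in turn. The function name, the household labels, and the three treatment labels (borrowed from the levels of the energy-monitoring factor described earlier) are assumptions made for this example, not part of the text.

```python
import random

def randomize(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn.

    Hypothetical helper for illustration; group sizes differ by at most one.
    """
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(units)         # copy so the caller's list is not changed
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical data: 60 households and three levels of monitoring assistance.
households = [f"household_{i}" for i in range(60)]
groups = randomize(households, ["meter", "charts", "none"], seed=143)
```

Because chance alone decides which household lands in which group, no characteristic of a household, foreseen or not, can systematically favor one treatment.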
When certain groups of experimental units are known to
be similar in a way that is thought likely to affect
the outcome, one may employ a block design: the units
within each such group (block) are divided evenly, at
random, among the treatments. In the particular case of
the nutrition study proposed above, it seems particularly
reasonable to consider gender blocks. (There may be other
important ones as well.)
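A block design of this kind can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the subject list, the gender labels, and the treatment names "diet_A" and "diet_B" are hypothetical, and the helper simply performs a separate random assignment inside each block.

```python
import random
from collections import defaultdict

def block_randomize(subjects, block_of, treatments, seed=None):
    """Randomly assign subjects to treatments separately within each block.

    Hypothetical helper for illustration. `block_of` maps a subject to its
    block label (here, gender).
    """
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for s in subjects:
        blocks[block_of(s)].append(s)
    assignment = {t: [] for t in treatments}
    for label in sorted(blocks):          # fixed block order for reproducibility
        members = blocks[label]
        rng.shuffle(members)              # randomize within the block only
        for i, s in enumerate(members):
            assignment[treatments[i % len(treatments)]].append(s)
    return assignment

# Hypothetical subjects: (id, gender) pairs, 12 women and 8 men.
subjects = [(i, "F") for i in range(12)] + [(i, "M") for i in range(12, 20)]
assignment = block_randomize(subjects, lambda s: s[1], ["diet_A", "diet_B"], seed=143)
```

With these made-up numbers each treatment group receives exactly 6 women and 4 men, so gender is balanced by construction, while chance still decides which individuals receive which treatment.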
It is true that we use randomization so that all variables (not just gender) have values (such as "Male", "Female") that are evenly represented in the various groups. While there is no better general way of achieving such parity, randomization is not a sure way to get it. There will almost certainly be occasions when certain values of a variable appear more often in one group than in another. When we foresee that such a variable may have a nontrivial effect on the outcome, it seems reasonable to take special care to distribute its values evenly across the groups.
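To see how often randomization alone can leave a variable unbalanced, consider a made-up case: 10 women and 10 men are split at random into two groups of 10. The Python sketch below computes, by direct counting, the chance that the first group ends up with 3 or fewer women, or 7 or more. The scenario and function names are assumptions chosen only for illustration.

```python
from math import comb

def prob_female_count(k, n_female, n_male, group_size):
    """Chance that a randomly chosen group of `group_size` has exactly k women."""
    ways = comb(n_female, k) * comb(n_male, group_size - k)
    return ways / comb(n_female + n_male, group_size)

def prob_lopsided(n_female, n_male, group_size, low, high):
    """Chance that the group has at most `low` or at least `high` women."""
    return sum(
        prob_female_count(k, n_female, n_male, group_size)
        for k in range(group_size + 1)
        if k <= low or k >= high
    )

# Hypothetical study: 10 women, 10 men, first group of size 10.
p = prob_lopsided(10, 10, 10, low=3, high=7)
print(round(p, 3))
```

For this small study the chance of a 7-to-3 split or worse is a little under 18%, which is far from negligible. Blocking on gender removes that possibility entirely.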
It is an example of the latter. Each of the various regions of the field in which a pole with sticky boards is placed is given both treatments, a yellow board and a green board.