Review: Exam 2

* Statistical studies

  1. Study types
    1. Controlled experiments, observational studies, sample surveys
    2. Advantages/disadvantages of one type of study over another in various settings
  2. Evaluating studies (Don't memorize, just be comfortable with these ideas)
    1. 7 critical components
    2. Pitfalls to watch out for in various types of studies
      1. Familiarize yourself with types of bias for observational studies, both those mentioned in text and those on this page
  3. Experimental design
    1. Identification of factors/levels and response variables
    2. Use of controls
      1. Placebos and their purpose
      2. Double-blind studies
    3. Randomization
      1. Importance of randomization when assigning experimental units to treatment groups (Note: Without this, cause-and-effect relationships could not be established; be able to explain why.)
      2. Use of a table of random digits
    4. Importance of replication of experiment on many units
    5. Block design
      1. Makes group assignment a little less random, but serves a purpose (Know what this purpose is.)
      2. Similar to stratification in sample surveys
      3. Special case: matched-pairs design — two forms:
        1. Each unit receives all treatments (probably only two) in a random order
        2. Experimental units come in pairs and are split up randomly so that one receives one treatment, the partner receives the other
  4. Sampling
    1. vs. taking a census
    2. Sampling methods (Which ones are valid? Be able to identify sample type given a scenario.)
      1. Simple random sample (SRS) of size n
        1. Every individual, and every group of size n, is equally likely to be chosen
        2. Reasons that an SRS is often impractical
        3. Identifying (from descriptions) sampling methods that do and do not result in an SRS
      2. Stratified random sample
      3. Systematic random sample (see problem 41, p. 264)
      4. Convenience sample (Ex.: A pollster who wants to know America's opinion interviews people who pass by her in a nearby mall parking lot on a Tuesday morning)
      5. Voluntary response sample
      6. Multistage sample
Terms to know:
statistical inference, anecdotal evidence, population, design, sampling frame, exploratory data analysis, experimental units/subjects, treatment, factors, levels, confounding/lurking variables, blocks, strata, voluntary bias, control/treatment groups, response, nonresponse, undercoverage, parameter, statistic, sampling variability
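
As a study aid, the sampling methods and the randomization step listed above can be sketched in a few lines of Python. This is only an illustration, not part of the course materials: the population of 100 numbered individuals, the two strata, the seed, and all function names are hypothetical, and Python's random number generator stands in for a table of random digits.

```python
import random

def simple_random_sample(population, n, rng):
    """SRS: every group of n individuals is equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified random sample: a separate SRS drawn inside each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def systematic_sample(population, k, rng):
    """Systematic random sample: random start among the first k, then every k-th."""
    start = rng.randrange(k)
    return population[start::k]

def randomize_to_groups(units, rng):
    """Shuffle the experimental units, then split them into two equal-sized groups."""
    shuffled = units[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(143)                  # fixed seed so the run is repeatable
population = list(range(1, 101))          # hypothetical individuals labeled 1..100

srs = simple_random_sample(population, 10, rng)
strata = {"first_half": population[:50], "second_half": population[50:]}
strat = stratified_sample(strata, 5, rng)
syst = systematic_sample(population, 10, rng)
treatment, control = randomize_to_groups(population[:20], rng)
```

Note that the systematic sample is not an SRS: two individuals whose labels differ by 1 can never be chosen together, so not every group of size 10 is equally likely.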
* Probability
  1. Concepts/terms
    1. Experiment, outcome, sample space, event
    2. Unions (“A or B”), intersections (“A and B”), complements (“not A”) of events (and depictions of each using a Venn diagram — note that in this applet they write AB when they mean A and B)
    3. Independence, disjointness (= mutual exclusivity) of events
    4. When outcomes are equally likely
  2. Randomness
    1. short-term unpredictability, long-term predictability
    2. Law of Large Numbers (pp. 328-332)
      1. What does it guarantee?
      2. What information does it leave out?
      3. What is the mistake characterized as the “law of small numbers” (or the “gambler's fallacy”)?
    3. why it must be a part of sample selection (how later analysis of data depends on it)
    4. Variability among samples — sampling distributions
  3. Rules of probability models
    1. 0 ≤ p ≤ 1 ; p = 0 for null events, p = 1 for certain events
    2. Sum of probabilities over all outcomes is 1
    3. Complementation rule: P(A^c) = 1 - P(A)
    4. P(A and B) = P(A) P(B) when A and B are independent
    5. P(A or B) = P(A) + P(B) - P(A and B); this becomes P(A or B) = P(A) + P(B) when A and B are disjoint
  4. Assessing probabilities
    1. of continuous random variables
      1. Uniform distributions: probability = area of an appropriate rectangle (See Example 4.17 and Figure 4.10, pp. 318-319)
      2. Normal distributions N(μ, σ)
        1. Standardizing values (converting a value of X to a standardized value Z) and the reverse process
        2. Using Table A to go back and forth between probabilities and standardized scores
        3. Interpreting a probability P(a < X < b) as area under a normal curve
        4. Interpretation of the standard deviation σ
          1. Distance from center to point where inflection occurs
          2. The 68-95-99.7 rule (remember, these numbers are not exact; you should be able to tell what they are exactly using Table A)
    2. of discrete random variables
      1. Binomial distributions via Table C, formula (learn it), or normal approximation (when appropriate)
      2. Uniform distributions
Terms to know:
probability, trials (of an experiment), expected value μ_X (= mean) and standard deviation σ_X of a random variable, variance
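
For checking Table A work, the standardizing step and the 68-95-99.7 rule from the outline above can be reproduced with Python's standard library. This is a rough sketch rather than the course's method: the function names are made up for illustration, and the N(100, 15) example values are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # N(0, 1), plays the role of Table A

def standardize(x, mu, sigma):
    """Convert a value of X ~ N(mu, sigma) to a standardized value z."""
    return (x - mu) / sigma

def unstandardize(z, mu, sigma):
    """The reverse process: convert a standardized value z back to the X scale."""
    return mu + z * sigma

def prob_between(a, b, mu, sigma):
    """P(a < X < b) as the area under the normal curve between a and b."""
    z_a = standardize(a, mu, sigma)
    z_b = standardize(b, mu, sigma)
    return STANDARD_NORMAL.cdf(z_b) - STANDARD_NORMAL.cdf(z_a)

# Area within 1, 2, and 3 standard deviations of the mean
# (the more precise figures behind the 68-95-99.7 rule).
within = {k: prob_between(-k, k, 0, 1) for k in (1, 2, 3)}

# Hypothetical example: X ~ N(100, 15), P(85 < X < 130)
example = prob_between(85, 130, 100, 15)

for k, area in within.items():
    print(f"within {k} sd: {area:.4f}")
print(f"P(85 < X < 130) = {example:.4f}")
```

In the example, 85 and 130 standardize to z = -1 and z = 2, so the answer is the Table A entry for 2 minus the entry for -1.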
* Sampling Distributions of sample statistics
  1. What are they (in general)?
  2. Relationship to population distributions (in fact, the sampling distribution for sample size n = 1 is the population distribution)
  3. Some specific sample statistics
    1. counts
      1. recognize situations in which the sampling distribution is
        1. binomial B(n, p)
        2. approximately binomial (close enough)
        3. approximately normal (already approximately binomial)
      2. expected value (mean) of the distribution is np
      3. variance of the distribution is np(1-p) (Remember: s.d. is the sq. root of variance.)
      4. determining probabilities like P(X < c), P(c < X < d), P(X > c)
        1. when count X is binomial (or approximately so, but not approximately normal): use Table C
        2. when count is approximately normal (use normal approximation)
    2. proportion = count/(sample size)
      1. recognize situations in which approximately normal (Note: same as when counts are distributed approximately normally)
      2. expected value (mean) of the distribution is p
      3. variance of the distribution is p(1-p)/n
    3. sample mean
      1. expected value (mean) of the distribution is μ, same as population
      2. variance of the distribution is σ²/n
      3. distributed normally (for every value of n) if population is
      4. central limit theorem
        1. What does it say?
        2. What implications does it have for sampling distributions for means?
        3. How is it related to the normal approximation to binomial distributions? (See p. 404)
Terms to know:
count, proportion, parameter, statistic, population, sample, population distribution, sampling distribution
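
The formulas for counts and proportions above can be tried out numerically. The sketch below compares an exact binomial probability (from the binomial formula) with the normal approximation that uses mean np and variance np(1-p). The values n = 100, p = 0.3, and the cutoff 25 are hypothetical, the function names are illustrative, and no continuity correction is applied.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(k, n, p):
    """Binomial formula: P(X = k) for a count X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(c, n, p):
    """P(X <= c), summing the binomial formula over k = 0..c."""
    return sum(binom_pmf(k, n, p) for k in range(c + 1))

n, p = 100, 0.3                        # hypothetical sample size and success probability

count_mean = n * p                     # mean of the count: np
count_sd = sqrt(n * p * (1 - p))       # s.d. of the count: square root of np(1-p)
prop_mean = p                          # mean of the sample proportion: p
prop_sd = sqrt(p * (1 - p) / n)        # s.d. of the proportion: square root of p(1-p)/n

exact = binom_cdf(25, n, p)                          # exact P(X <= 25)
approx = NormalDist(count_mean, count_sd).cdf(25)    # normal approximation

print(f"exact binomial: {exact:.4f}, normal approximation: {approx:.4f}")
```

Rerunning with a larger n (keeping p fixed) is one way to see the approximation improve, which is the connection to the central limit theorem asked about above.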
* Confidence intervals for means
  1. Purpose (Why are they used? What do they tell you? How should they be interpreted?)
  2. Construction: (estimator) ± z* × (spread of sampling distribution for estimator)
  3. Effect on CIs when n (sample size) or C (level of confidence) is changed
Terms to know:
margin of error, critical value, confidence level
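
The construction rule above, (estimator) ± z* × (spread of sampling distribution for estimator), can be sketched for a sample mean. This assumes the population standard deviation σ is known, so the spread is σ/√n. The sample values (x̄ = 50, σ = 10) and the function names are hypothetical and serve only to show how the margin of error responds to n and C.

```python
from math import sqrt
from statistics import NormalDist

def z_star(confidence):
    """Critical value z* leaving central area `confidence` under N(0, 1)."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

def margin_of_error(sigma, n, confidence):
    """z* times the s.d. of the sampling distribution of the sample mean."""
    return z_star(confidence) * sigma / sqrt(n)

def mean_ci(xbar, sigma, n, confidence):
    """Confidence interval for the population mean, assuming sigma is known."""
    m = margin_of_error(sigma, n, confidence)
    return xbar - m, xbar + m

# Hypothetical sample: mean 50 from a population with known s.d. 10
for n in (25, 100, 400):
    low, high = mean_ci(50, 10, n, 0.95)
    print(f"n = {n:3d}, C = 95%: ({low:.2f}, {high:.2f})")

for C in (0.90, 0.95, 0.99):
    low, high = mean_ci(50, 10, 100, C)
    print(f"n = 100, C = {C:.0%}: ({low:.2f}, {high:.2f})")
```

Quadrupling n halves the margin of error, while raising C widens the interval.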



This page maintained by:
Thomas L. Scofield
Department of Mathematics and Statistics
Calvin College

Last Modified: Monday, 26-Jul-2004 13:10:08 EDT