    2        175      325    500
          208.33   291.67
Total        625      875   1500
Chi-Sq = 2.667 + 1.905 +
         5.333 + 3.810 = 13.714
DF = 1, P-Value = 0.000
------------------------------------------------
Simpson's Paradox
If we split the data by the program applied to, we see a different story
------------------------------------------------
Expected counts are printed below observed counts
         acceptA  rejectA  Total
    1        400      250    650
          403.45   246.55
    2         50       25     75
           46.55    28.45
Total        450      275    725
Chi-Sq = 0.029 + 0.048 +
         0.255 + 0.418 = 0.751
DF = 1, P-Value = 0.386
------------------------------------------------
Expected counts are printed below observed counts
         acceptB  rejectB  Total
    1         50      300    350
           79.03   270.97
    2        125      300    425
           95.97   329.03
Total        175      600    775
Chi-Sq = 10.665 + 3.111 +
          8.783 + 2.562 = 25.120
DF = 1, P-Value = 0.000
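A minimal Python sketch for re-deriving the expected counts and Chi-Sq totals in the tables above. The observed counts are copied from the program A and program B tables; the combined table is formed by adding them cell by cell (an assumption about how the first table was built, although its row 2 and Total rows agree). The function and variable names are made up for illustration.

```python
def chi_square(table):
    """Return (expected counts, chi-square statistic) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    # Expected count = (row total)(column total) / (grand total)
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    # Sum of (observed - expected)^2 / expected over all cells
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    return expected, stat


# Rows are groups 1 and 2; columns are [accept, reject].
program_a = [[400, 250], [50, 25]]
program_b = [[50, 300], [125, 300]]
# ASSUMPTION: combined table = cell-by-cell sum of the two programs.
combined = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(program_a, program_b)]

for name, table in [("A", program_a), ("B", program_b), ("combined", combined)]:
    expected, stat = chi_square(table)
    rates = [row[0] / sum(row) for row in table]
    print(name, "chi-sq = %.3f" % stat, "acceptance rates = %.3f, %.3f" % tuple(rates))
```

The printed statistics should agree with the Chi-Sq totals above up to rounding. The acceptance rates show the paradox: group 2 has the higher rate within each program but the lower rate in the combined table.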
------------------------------------------------
hospital example (Utts chapter 12, pages 213-215)
give combined results first, then separate
            survive    die   s rate   d rate
combined:
  standard      505    595     .46      .54
  new           195    905     .18      .82
  total         700   1500
separate:
  standard        5     95     .05      .95
  new           100    900     .10      .90
  total         105    995
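A small Python sketch of the reversal in the hospital numbers. The counts are copied from the notes above. The second subgroup is NOT in the notes: it is obtained here by subtraction, assuming the combined table splits into exactly two subgroups. All names are made up for illustration.

```python
# (survive, die) counts copied from the notes above
combined = {"standard": (505, 595), "new": (195, 905)}
subgroup = {"standard": (5, 95), "new": (100, 900)}

# ASSUMPTION: there are only two subgroups, so the other one is
# the combined counts minus the subgroup shown in the notes.
other = {
    k: (combined[k][0] - subgroup[k][0], combined[k][1] - subgroup[k][1])
    for k in combined
}


def survival_rate(survive, die):
    """Fraction of patients who survive."""
    return survive / (survive + die)


for label, table in [("combined", combined), ("given subgroup", subgroup),
                     ("derived subgroup", other)]:
    rates = {k: round(survival_rate(*v), 2) for k, v in table.items()}
    print(label, rates)
```

Under that assumption, "new" has the higher survival rate in each subgroup while "standard" has the higher rate in the combined table.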
discrimination example (Utts, chapter 12, pages 215-217) ??
death penalty
326 cases, white defendant: 19/160 get death pen. (.119)
black defendant: 17/166 get death pen. (.102)
when separated by victim's race, see different story
[overhead from Moore 207]
point: statistically significant means that the effect is not
likely to be due to chance alone, but some factor
other than the obvious one may be the reason
Probability
random: long-term predictability vs short-term unpredictability
law of large numbers / "law" of small numbers
scale: 0 to 1 (0% to 100%)
personal vs. mathematical (relative frequency)
4 Rules and applications
axiomatic method
four rules "overhead"
examples
probability of losing luggage is 1/176 (Krantz)
P(heart attack kills) = .33, P(cancer kills) = .2
[assuming death]
estimated probability of grades
probability of two girls (P(boy) about .512)
probability of winning 2 of 3, 3 of 5 given an estimate
for each game
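A short Python sketch of the last two examples, assuming games (and births) are independent with the same probability each time. The per-game estimate of .6 is a made-up value for illustration; P(boy) = .512 is from the notes. The function name is hypothetical.

```python
from math import comb


def p_win_series(p, wins_needed):
    """P(win a best-of-(2*wins_needed - 1) series) when each game is won
    independently with probability p."""
    # The series ends on the winner's final win; before that game the
    # loser has k wins, for k = 0 .. wins_needed - 1.
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * (1 - p) ** k
        for k in range(wins_needed)
    )


p_game = 0.6  # hypothetical estimate for a single game
print("win 2 of 3:", p_win_series(p_game, 2))
print("win 3 of 5:", p_win_series(p_game, 3))

p_boy = 0.512  # from the notes
p_two_girls = (1 - p_boy) ** 2  # multiplication rule for independent events
print("two girls:", p_two_girls)
```

With a per-game probability above .5, the longer series favors the stronger side a bit more, which is a useful discussion point.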
video -- Life By the Numbers (#4 Prob)
02:00 (or 08:20) - 27:00: intro to prob., Graunt, casinos
27:00 - 42:25: polling, polio, prob assesses results
note that (p)(1-p) <= .25 (with equality at p = .5), so use .25
note that p-hat is usually very close to p, especially if
the sample is large
example: sample 1600 people and 500 people say yes
example: Reese's Pieces
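A Python sketch of the 1600-person example, assuming the goal is a rough 95% margin of error of the form 2 x (standard error of p-hat). The multiplier 2 and the variable names are assumptions, not something stated in the notes.

```python
from math import sqrt

n = 1600   # sample size from the example
yes = 500  # number who say yes
p_hat = yes / n  # sample proportion, 0.3125

# Conservative standard error: replace p(1-p) by its largest value, .25
se_conservative = sqrt(0.25 / n)
# Plug-in standard error: use p_hat in place of the unknown p
se_plugin = sqrt(p_hat * (1 - p_hat) / n)

# ASSUMPTION: margin of error = 2 standard errors (roughly 95%)
margin_conservative = 2 * se_conservative  # same as 1 / sqrt(n)
margin_plugin = 2 * se_plugin

print("p-hat:", p_hat)
print("conservative margin:", margin_conservative)
print("plug-in margin:", round(margin_plugin, 4))
```

The conservative margin is never smaller than the plug-in one, and the two are close unless p-hat is far from .5.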
Testing a hypothesis
1) determine null and alternative hypotheses
2) collect data
3) compute test statistic
test stat is a measure of how far the data are from
what the null hypothesis predicts
4) determine likelihood of such an extreme test
statistic if null hypothesis is true (p-value)
5) make a decision
Testing Hypotheses for Proportions
test statistic is z-score
example: predicting election outcomes
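A Python sketch of the five steps for one proportion, using a normal approximation and a two-sided p-value (both are assumptions; a one-sided test may suit the election example better). The poll numbers are made up for illustration and the function name is hypothetical.

```python
from math import sqrt, erfc


def one_proportion_z_test(successes, n, p0):
    """Return (z, two-sided p-value) for the null hypothesis p = p0."""
    p_hat = successes / n
    # Standard error is computed under the null hypothesis
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # P(|Z| >= |z|) for a standard normal Z
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# 1) null: p = .5 (race is tied); alternative: p is not .5
# 2) HYPOTHETICAL data: 540 of 1000 voters favor the candidate
z, p_value = one_proportion_z_test(540, 1000, 0.5)
# 3) and 4) test statistic and p-value
print("z = %.2f, p-value = %.4f" % (z, p_value))
# 5) decision: compare the p-value with the chosen cutoff (e.g. .05)
print("reject null at .05:", p_value < 0.05)
```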
==
Tuesday, January 25
Topic: Time Series
Topic: Wrap-Up
Due: HW #10 @ hw06.shtml
Vocab: time series, long-term trend, seasonal variation,
seasonal adjustment, cycle
A look at Calvin Tuition Data
Time series
plot Calvin Tuition Data
plot CPI (with Calvin Price Index?)
births
Dow Jones
postage?
Things to watch for
cherry picking data
choice of units ($ vs inflation adjustments, etc)
vertical axis doesn't start at zero (magnifies steepness)