# Math W50: How do they know my opinion?
# January, 2000

Monday, January 3
Note: No Class

Tuesday, January 4
Note: No Class

Wednesday, January 5
Note: No Class

Thursday, January 6
Topic: Introduction
Topic: 7 Critical Components @ overheads/seven-critical.shtml
Topic: Measurement
Act: Survey 1 @ data/survey01.shtml
Act: Survey 2 @ data/survey02.txt
Read: Utts 1-3
Vocab:
Vocab: statistics, 7 critical components, %%
  individual (unit), variable, value of variable, %%
  categorical variable, measurement variable, %%
  discrete, continuous, validity, reliability, bias, variability
#Note: Class begins 2pm in NH 295

course intro, student intros, why you are in the course
Def of stats
  get student ideas (on board)
  more than one use of word statistics
    descriptive (summary measures)
    inferential
  Video - FAPP #6 beginning to about 4:00 in diner
  some defs (primarily of inferential statistics)
    Utts: a collection of procedures and principles for gaining information in order to make decisions when faced with uncertainty
    Amabile: a way of taming uncertainty, of turning raw data into arguments that can resolve profound questions
    Moore: the science of gaining information from numerical data
    Garfunkel: the science of drawing conclusions from data with the aid of the mathematics of probability
    dictionary: the mathematics of the collection, organization, and interpretation of numerical data, especially the analysis of a population's characteristics by inference from sampling
    etymology: latin: statisticus - of state affairs
      political right down to etymology
  key elements
    data
    uncertainty
    information/decision-making ability
    science or math?
Video - Against All Odds #1: beginning thru about 13:30, 26:00 to end
  [whole thing if time -- but probably not]
Three phases of a statistical study (and outline of course)
  1) collect data (statistical design)
  2) organize data (data analysis)
  3) draw conclusions from data (statistical inference)
Survey 1: Have students add to existing list
  collect data from students, have them gather 2 more each
Quiz 0: info about students
----------- Break -----------
Fill out survey 1 and survey 2
7 critical components that determine soundness of statistical studies
  1) source of funding (why was it done?)
  2) researcher contact
  3) individuals studied and how selected
  4) measurements made (questions asked)
  5) setting
  6) extraneous differences (other explanations for effect)
  7) magnitude of claimed effect
  example: drug to cure excessive barking in dogs (page 22)
  example: US Voters Focus on Selves, Poll says (from Moore S:C&C)
  example: most women unhappy in their choice of husbands (page 24)
    didn't do this one
How data are organized:
  units/individuals/subjects
  variables
  values of variables
  imagine a grid layout
Some terminology
  categorical vs. measurement variables
  continuous vs. discrete
  validity (proxies)
  reliability
  bias -- systematically off in same direction
  variability
  pictures of variability/bias possibilities (target)
Some things are not easily or obviously measured:
  happiness (happiness newspaper article)
Apply terminology to Survey 1
Summary -- Measuring is a difficult task
  example: finding cheapest grocery store
Survey 2: Wording issues
  did questions 1 and 2 in class
  data: 1) 6-0; 0-6 in expected direction
        2) 2-4; 2-4 in expected direction
Wording Pitfalls
  Bias (Intentional or unintentional)
    Elian Gonzalez
      Do you agree that he should be returned to his father in Cuba?
        [with US Immigration and Naturalization Service]
      Do you agree that he should be allowed to remain with his relatives in Florida?
        [agree with boy's attorneys]
  Confidentiality and Anonymity (people may lie)
    positive AIDS test?
    financial and sexual issues
    a method to ask sensitive questions (some answers random)
  Desire to Please
    "How much do you smoke?" vs. cigarette sales
    didn't discuss this particular pair, put on quiz for tomorrow
  Unnecessary complexity, misunderstandings
    1992 American Jewish Committee [NY Times July 8, 1994]
      Does it seem possible or does it seem impossible to you that the Nazi extermination of the Jews never happened? (22% possible)
      Does it seem possible to you that the Nazi extermination of the Jews never happened, or do you feel certain that it happened? (1% possible)
-------------- only got this far --------------------
    "do you own stock?" cartoon [do on monday instead]
  Asking the uninformed
    will see an example in a video
    1975 Public Affairs Act [didn't exist] (page 35)
      in 1978 1/3 expressed opinion
      in 1995 nearly 1/2 expressed opinion
      with leading political bias, 53% expressed opinion and tended to go with political leanings
  Ordering of questions, additional information
    peer pressure example (34)
  Open/closed questions
    museum example
    Levi jeans example (page)
  Defining your terms
    adolescent sex: increasing or decreasing?
      (page 37)
    unemployment
      changes have been made to survey questions
        - define week as sunday to saturday so people don't underreport weekend work
        - redesign questions so definitions of 'work', 'looking for work', 'on layoff' are uniform
        - emphasize difference between 'on layoff' and 'fired'
        - also split long questions into series of short questions
  "The Truth but not the Whole Truth"

Friday, January 7
Topic: Sampling
Topic: Observation and Experimentation
Act: Random Samples of %%
  Circles @ http://www.calvin.edu/~rpruim/cgi-bin/random-digits.cgi
Act: What comes to mind @ data/words.txt
Read: Utts 4
Read: Utts 5
Due: HW #1 @ hw01.shtml
Vocab:
Vocab: observational study, experiment, %%
  unit, population, sample, sampling frame, sample survey, %%
  census, margin of error, "1 Over Root n Rule", %%
  simple random sampling, stratified random sampling, %%
  cluster sampling, systematic sampling, %%
  random digit dialing, %%
  multi-stage sampling, convenience sample, response rate, %%
  treatment, explanatory variable, %%
  response variable (outcome variable), control, %%
  interaction, %%
  confounding, placebo, placebo effect, Hawthorne effect, %%
  experimenter effect, double-blind, single-blind %%

Basic categories of studies
  sample survey -- ask a bunch of people a question
  experiment -- looking for relationships, cause/effect
    key: treatment
  observational study -- looking for relationship, but no treatment
  meta-analysis
  case study (popular in media -- clip from ABC news?)
  First 3 all have something in common: you don't measure every unit
To conduct a study properly
  1) get a representative sample
  2) get a large enough sample
  3) decide between observational study and experiment
Sampling terminology:
  unit, population, sample, sampling frame, sample survey, census, margin of error
  sampling vs.
  census: samples are often possible, faster, accurate
SRS: sampling circles using random digits
Video -- Against All Odds #14: skip 35:20-41:10
  [14 without bead sampling, 21 minutes with]
Sampling Methods -- make sure they understand SRS
  stratified random sampling
  cluster sampling
  systematic sampling
  random-digit dialing
  multistage sampling
Sampling difficulties
  wrong sampling frame
  not reaching selected individuals
  low response rate
  volunteer sample
  haphazard or convenience sample
  Literary Digest poll (1936)
    Alf Landon predicted to get 3-2 victory
    volunteer response to sample from poor frame
    George Gallup
Quiz 1
-------------------------------------
Break: do the three words experiment
-------------------------------------
Experiment terminology and set-up
  treatment, explanatory variable, response variable, control
  can't get cause/effect from observational study alone
  individuals divided into groups
  each group gets different treatment
  measurements taken and comparison made between groups
  looking for cause/effect: treatment -> response
Video -- FAPP segment on Physicians Health Study
Problems with experiments
  Placebo effect
    Gastric freezing to relieve ulcer pain (34% in gf group, 38% in placebo group)
  Lack of control, confounding variables -- randomize
    1940 propaganda experiment [Germany occupied France]
  Interaction -- measure and report possible variables
    can turn possible confounding variables into possible interaction variables by measuring
    nicotine patch and smokers at home [didn't mention]
  Hawthorne effect -- not always possible to avoid this problem
    new curricula
  Experimenter Bias -- blindness
  Ecological validity/generalizability
    what is the population, was setting a factor?
over weekend: how many words come to mind from 4 others
==
Monday, January 10
Topic: Statistical Summaries
Topic: Distributions
Read: Utts 7
Read: Utts 8
Act: How Many Raisins?
Due: HW #2 @ hw01.shtml
Vocab:
Vocab: mean, median, mode, outlier, range, stemplot, histogram, %%
  shape, symmetric, bell-shaped, unimodal, bimodal, skewed, %%
  five-number summary, quartile, boxplot, interquartile range, %%
  variance, standard deviation, frequency curve, normal curve, %%
  proportion, percentile, standardized score, z-score, %%
  standard normal distribution, "68-95-99.7 Rule"

Where are we now?
  many problems with statistical studies are not mathematical
    7 critical components
    Pam Plantinga left job because she was told what to find
  you can't do good statistics unless you start with good data
  individual issues
    four issues: validity, reliability, bias, variability
    measurement
      must decide what and how to measure -- not always easy
      ask about any problems with survey 1
    wording
      dealing with people is especially hard
      do you own stock cartoon
    proxies (validity/reliability trade-off)
  sampling/assignment issues
    good samples are representative and large enough
    1/root n rule
    experiment vs. observation, the role of treatment
    randomness used to reduce bias
  moving into a phase of "what do we do with all this data?"
  but first ...
... Ethics of experiments
  informed consent
    use of doctors in physicians health study
    kids in art experiment
  human subjects & review boards
    Stanley Milgram (Yale): shock and memory
      done 1960's, probably not doable today
    Penny's data collection in grad school
  risk: cost/benefit analysis
  reasonable hope, reasonable doubt criteria for clinical trials (did friday)
Some specific examples and issues
  Nazi data
    give them article (2 versions) and then discuss
    from Bouma's class
                  Yes   No
      good use      6    2   (1 of 2 struggled)
      criticism     5    3
  twins studies -- ideal matched pairs?
  PHS used only middle-aged men, what about women?
  minorities
    1 in 5 men has heart attack before age 65
    1 in 17 women has heart attack before age 65
    (did friday)
  AIDS and slow process of clinical trials
    measuring easier but less reliable things
    pressure to release drugs before effectiveness demonstrated
    (mentioned friday, didn't mention more here)
  domestic violence: warn and release or arrest
    can a randomized experiment be done? [no informed consent]
    [didn't do here]
Raisins guesses
Looking at the data: stemplots, histograms
  choosing bin-size
  subdividing stems
Measures of center -- what is a typical value?
  mean, median, mode
  what is unusual? outliers
Measures of spread
  range, standard deviation
Five-number summary & boxplots
Some shape descriptions
  symmetric, skewed, bell-shaped, unimodal, bimodal
Intro to frequency curves and cumulative probability (proportions)
  generalization of 5-number summary
  deciles, percentiles from standard tests
-----------------------------------------------
Quiz 2
get data from survey 1, survey 2, raisin counts
break
-----------------------------------------------
Normal distributions
  symmetric, bell-shaped, determined by mean and standard dev.
  Empirical Rule: 68-95-99.7
  standardized scores (z-scores)
  charts and computers to get other values (chart on page 137)
  examples of approx. normal distributions
    height (male 18-74)     N(5'9",3")       = N(69,3)
    height (female 18-74)   N(5'3.5",2.5")   = N(63.5,2.5)
    height (male 18-24)     N(5'10",3")      = N(70,2.8)
    height (female 18-24)   N(5'4.3",2.6")   = N(64.3,2.6)
    height (male 11)        N(146cm,8cm)     = N(57.5,3.15)
    weight (male 18-24)     N(162,29.1)
    weight (female 18-24)   N(134,27)
    1994 SAT mean Verbal = 423, 7% above 600, 42% below 400
      approx s.d. 120
      now renormalized to N(500,100)
      what percent score 800?
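A minimal sketch of getting the normal-curve percentages in the SAT example by computer instead of from the chart. It assumes the models stated in the notes (mean 423 and s.d. 120 for the 1994 verbal scores, N(500,100) after renormalizing) and reads "score 800" as "800 or above". The helper name `normal_cdf` is made up for illustration.

```python
import math


def normal_cdf(x, mean, sd):
    """Proportion of a N(mean, sd) distribution at or below x."""
    z = (x - mean) / sd  # standardized score (z-score)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# 1994 verbal SAT, modeled as N(423, 120) (s.d. is the approximate value from the notes)
above_600 = 1.0 - normal_cdf(600, 423, 120)
below_400 = normal_cdf(400, 423, 120)

# Renormalized scale, modeled as N(500, 100); 800 is 3 s.d. above the mean
at_least_800 = 1.0 - normal_cdf(800, 500, 100)

print(f"above 600:    {above_600:.1%}")
print(f"below 400:    {below_400:.1%}")
print(f"800 or above: {at_least_800:.2%}")
```

Under these assumptions the first two values come out near 7% and 42%, in line with the figures quoted for 1994, and the last is a bit over one tenth of one percent, the upper half of the 0.3% left outside 3 s.d. by the 68-95-99.7 rule.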
    1987 CA women's salaries mean=11,600; s.d.=10,500

Tuesday, January 11
Topic: More Pictures of Data
Topic: Relationships between Categorical Variables
Topic: Chi-Squared
Read: Utts 9
Due: HW #3 @ hw02.shtml
Vocab: pie chart, bar chart, pictogram, line graph, %%
  scatter plot

comments on quiz 2
  abstraction in mathematics
  precise use of language in mathematics
Look at survey 1 data
  correct typos, errors, etc.
Common problems with plots, graphs, and pictures
  1) missing labels
  2) scale doesn't start at 0
  3) changes in labeling along an axis
  4) misleading units
  5) poor information
Picture Checklist
  overall impression
    1) is message clear?
    2) is purpose clear?
    10) is there any clutter?
  source
    3) is source given?
    4) is source reliable?
  labeling
    5) is labeling clear?
    6) do axes start at 0?
    7) is scale constant?
    8) are there any breaks along axis? are they easy to spot?
    9) was inflation adjustment made?
Banner chart and follow-up letters
Utts, Figure 9.9 (page 149) and fixed version
Read: Utts 12
Act: Golf Balls in the Yard @ data/golfballs.shtml
Vocab: contingency table, cell, row, column, conditional percentage, %%
  rate, test statistic, chi-squared statistic, p-value, %%
  statistical significance, proportion, odds, relative risk, %%
  odds ratio, Simpson's paradox %%

Physician's Health Study data
             attack   no att.    total   rate/1000
  Aspirin       104    10,933   11,037         9.4
  Placebo       189    10,845   11,034        17.1
  Total         293    21,778   22,071
Question of the day: why is this data so compelling?
  significance: how unusual is this? (chi-squared)
  magnitude: how big is this?
Look at data from Survey 2 (in class) and simulate with cards
  results: 4-2 split not so unusual, 6-0 split more compelling
Golf ball distribution and test statistics
  4-sided dice, computer simulation
Chi-squared statistic (on golf ball data again)
  what should we expect if there is no association?
  how can we adjust our measurement to account for sample size?
  obs     exp    diff     n dif   chi sq.     P-value   cum prob
  137   121.5    15.5   1.97737   8.46914   0.0372487   0.962751
  138   121.5    16.5   2.24074
  107   121.5   -14.5   1.73045
  104   121.5   -17.5   2.52058
Return to Physician's Health Study
  significance: do chi squared
    P-value: interpreted the same for all statistical tests!
    chi-squared table (degrees of freedom)
  magnitude: relative risk
    percentage having trait = (# with trait / total #) * (100%)
    proportion having trait = (# with trait / total #)
      i.e. probability written as decimal
    risk of having trait = # with trait / total #
    odds of having trait = # with / # without to 1
                         = # with to # without
    odds against trait = # without / # with to 1
                       = # without to # with
    relative risk: = one risk / other risk
    increased risk: = change / original (* 100%)
Misrepresenting risk [covered only implicitly in discussion of PHS]
  1) no baseline risk given
  2) no time period given
  3) unclear population (may not apply to you)
Preview Simpson's Paradox -- Berkeley Admissions example

Wednesday, January 12
Topic: Probability and Randomness
Read: Utts 15
Read: Utts 16
Vocab: probability, relative frequency, personal probability, coherent,
Vocab: mutually exclusive events, independent events,
Vocab: cumulative probability, expected value, four probability rules
Due: HW #4 @ hw02.shtml

Review Chi-Squared
  Berkeley admissions (1=men 2=women) (Utts page 221, exercise 14)
  video clip -- FAPP #10 2:04:45 -- 2:08:40 [didn't show]

Expected counts are printed below observed counts
          accept    reject     Total
    1        450       550      1000
          416.67    583.33
    2        175       325       500
          208.33    291.67
Total        625       875      1500
Chi-Sq = 2.667 + 1.905 + 5.333 + 3.810 = 13.714
DF = 1, P-Value = 0.000
------------------------------------------------
Simpson's Paradox
If we divide by programs applied to, we see a different story
------------------------------------------------
Expected counts are printed below observed counts
         acceptA   rejectA     Total
    1        400       250       650
          403.45    246.55
    2         50        25        75
           46.55     28.45
Total        450       275       725
Chi-Sq = 0.029 + 0.048 + 0.255 + 0.418 = 0.751
DF = 1,
P-Value = 0.386
------------------------------------------------
Expected counts are printed below observed counts
         acceptB   rejectB     Total
    1         50       300       350
           79.03    270.97
    2        125       300       425
           95.97    329.03
Total        175       600       775
Chi-Sq = 10.665 + 3.111 + 8.783 + 2.562 = 25.120
DF = 1, P-Value = 0.000
------------------------------------------------
hospital example (Utts chapter 12, pages 213-215)
  give combined results first, then separate
              survive    die   s rate   d rate
  standard        505    595      .46      .54
  new             195    905      .18      .82
  total           700   1500

  standard          5     95      .05      .95
  new             100    900      .10      .90
  total           105    995
discrimination example (Utts, chapter 12, pages 215-217) ??
death penalty
  326 cases,
    white defendant: 19/160 get death pen. (.119)
    black defendant: 17/166 get death pen. (.102)
  when separated by victim's race, see different story
  [overhead from Moore 207]
point: statistically significant means that the effect is not likely to be due to chance alone, but there may be some other factor than the obvious one that is the reason
Probability
  random: long-term predictability vs short-term unpredictability
  law of large numbers / "law" of small numbers
  scale: 0 to 1 (0% to 100%)
  personal vs.
  mathematical (relative frequency)
4 Rules and applications
  axiomatic method
  four rules "overhead"
  examples
    probability of losing luggage is 1/176 (Krantz)
    P(heart attack kills) = .33, P(cancer kills) = .2 [assuming death]
    estimated probability of grades
    probability of two girls (P(boy) about .512)
    probability of winning 2 of 3, 3 of 5 given an estimate for each game
video -- Life By the Numbers (#4 Prob)
  02:00 (or 08:20) - 27:00: intro to prob., Graunt, casinos
  27:00 - 42:25: polling, polio, prob assesses results

Thursday, January 13
Topic: More Probability
Topic: Sampling Distributions
Act: Sampling Milk Lids
Vocab: expected value, false positive, false negative, %%
  gambler's fallacy, statistic, parameter, sampling distribution %%

HW/quiz questions/comments
  P-value
  observational study & cause/effect
  equal portions on normal curve
  pictures on HW
Long-term vs. short-term (Free-Throw simulation in Excel)
  gambler's fallacy, "law" of small numbers
Expected Value of Lottery Ticket
  insurance
False Positives/False Negatives
  data from Utts 303
Quiz 5
Sampling
  hands on (milk jug lids)
  web simulation
  video -- Life by the Numbers (#4)
Read: Utts 17
Read: Utts 18 (Categorical Parts)
Due: HW #5 @ hw03.shtml

Friday, January 14
Topic: Confidence Intervals
Topic: Hypothesis Testing
Act: Colors of Reese's Pieces
Read: Utts 19
Read: Utts 20 (optional)
Read: Utts 21
Due: HW #6 @ hw04.shtml
Vocab: confidence level, confidence interval, margin of error,%%
  hypothesis testing, test statistic,%%
  null hypothesis, alternative hypothesis, p-value

Questions on HW/Quiz, etc
  bin size for histograms (Old Faithful eruption times)
Nothing new today: just going to put all the pieces together
Two inference tasks
  estimating a parameter
    1) get sample
    2) compute statistic from sample
    3) determine the quality of that statistic as an estimate for parameter
       a) confidence level
       b) confidence interval, margin of error
  testing a hypothesis
Video -- Against All Odds #23 (01:40 - 20:15)
  Woburn Leukemia, BLS
  stats, example computations
Confidence Intervals for proportions
  conditions under which the math applies
    1) parameter must have a fixed (unknown) value for population
    2) simple random sample or repeatable experiment
    3) sample includes at least 5 of each outcome
    4) population at least 10 times size of sample
  the math: distribution of sample proportions (statistic) will be approximately normal
    N(p, root(p(1-p)/n))
  example: suppose fair coin (50% heads)
    flip it 100 times
    flip it 400 times
    flip it 1600 times
    for each:
      ______% of time with _______
      ______% of time between _____ and _____
  example: public opinion poll (suppose 30% rate)
    sample size 1500
  unknown p, what do we do?
    note that (p)(1-p) <= .25, so use .25
    note that p-hat is usually very close to p, especially if the sample is large
  example: sample 1600 people and 500 people say yes
  example: Reese's pieces
Testing a hypothesis
  1) determine null and alternative hypotheses
  2) collect data
  3) compute test statistic
     test stat is a measure of how true the null hypothesis seems to be
  4) determine likelihood of such an extreme test statistic if null hypothesis is true (p-value)
  5) make a decision
Testing Hypotheses for Proportions
  test statistic is z-score
  example: predicting election outcomes
==
Monday, January 17
Topic: More Inference for Proportions
#Topic: Confidence Intervals
#Topic: Hypothesis Testing for proportions
#Topic: Chi-Squared & Hypothesis Testing
Read: Utts 21
Read: Utts 22
Due: HW #7 @ hw04.shtml
Vocab: one-sided hypothesis test, two-sided hypothesis test,%%
  type 1 error, type 2 error, (power of a test)

Answer questions and do examples of inference procedures
  200 British couples, 10 with wife taller than husband
  61/165 correct in 1 out of 4 esp test
  55 quitters out of 120 volunteers assigned to use nicotine patch
    24 of 120 placebo users quit
    quitting measured after 8 weeks
  Hugo -- 11 times in 30 rolls; how unusual is that?
    105 times in 300 rolls?
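A rough sketch of the z-score test statistic for a proportion, applied to the Hugo example. It assumes the event being counted is a 6 on a fair die, so the null value is p = 1/6; the notes do not say this outright. The names `proportion_z` and `upper_tail` are illustrative only.

```python
import math


def proportion_z(successes, n, p0):
    """z-score of the sample proportion, assuming the null value p0 is true."""
    p_hat = successes / n
    std_dev = math.sqrt(p0 * (1.0 - p0) / n)  # s.d. of sample proportions under the null
    return (p_hat - p0) / std_dev


def upper_tail(z):
    """One-sided P-value: chance of a z-score at least this large on the normal curve."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


P_SIX = 1.0 / 6.0  # assumption: fair die, counting 6's

for successes, rolls in [(11, 30), (105, 300)]:
    z = proportion_z(successes, rolls, P_SIX)
    print(f"{successes} of {rolls}: z = {z:.2f}, one-sided P-value = {upper_tail(z):.2g}")
```

Under that assumption z is about 2.9 for 11 of 30 and about 8.5 for 105 of 300. The sample proportions are nearly the same, but the larger sample makes the result far more unusual. With only 30 rolls the expected number of 6's is 5, right at the edge of the "5 of each outcome" condition, so the normal approximation is rough there.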
    have students roll 30 dice several times and count number of 6's rolled (work in pairs)
  hypothesis testing
    ganzfeld experiments
      122 successes out of 355 trials
    drinking and sex
      77 of 404 men, 16 of 138 women
    report of smoking results above
      High smoking cessation rates were observed in the active nicotine patch group at 8 weeks (46.7% vs 20%) (P<.001) and at 1 year (27.5% vs 14.2%) (P = 0.11).

Tuesday, January 18
Topic: More about Inference
Topic: Significance and Importance
Read: Utts 23
Due: HW #8 @ hw05.shtml

Wednesday, January 19
Topic: Risk Assessment
Read: Utts 12.3-12.4
Read: Utts 16

Greatest Risks (problem 11.4 on page 191 of Utts)
  have students rank risks of 10 to 30 different items
  discuss how to measure relative risk
Act: Video: Are We Scaring Ourselves to Death? @ videos/stossel.shtml

Thursday, January 20
Topic: Test

Friday, January 21
Topic: Work on Projects
Due: Report on video (via email) @ videos/stossel.shtml
==
Monday, January 24
Topic: Consumer Price Index @ overheads/prices.shtml
Due: HW #9 @ hw05.shtml
Vocab: inflation, Consumer Price Index, price index, base year,%%
  Index of Leading Economic Indicators

Review of video -- comments and questions
progress report on projects
Consumer Price Index
Break
hand out exams
inference review
  example -- Hugo 5/12; how unusual is that?
  example -- 100 coin tosses
    does coin look fair if ...
      45 heads
      40 heads
      35 heads
comments on test 1

Tuesday, January 25
Topic: Time Series
Topic: Wrap-Up
Due: HW #10 @ hw06.shtml
Vocab: time series, long-term trend, seasonal variation,%%
  seasonal adjustment, cycle

A look at Calvin Tuition Data
Time series plot
  Calvin Tuition Data
  plot CPI (with Calvin Price Index?)
  births
  Dow Jones
  postage?
Things to watch for
  cherry picking data
  choice of units ($ vs inflation adjustments, etc)
  vertical axis doesn't start at zero (magnifies steepness)

Wednesday, January 26
Topic: Test
Due: HW #11 @ hw06.shtml

Thursday, January 27
Note: no class

Friday, January 28
Note: no class
== end of calendar