Problem Set #3
Due Tuesday 10 PM EST
Exercise 3.1: What’s the normal probability? For each scenario,
(a) sketch the normal curve and shade the area of interest;
(b) estimate the probability using the Empirical Rule;
(c) determine the actual probability (and show how you found it).
Scenario 3.1a: The IQ scores of students at a certain college follow a normal distribution with µ = 115 and σ = 12. What is the likelihood that a student would score below 100?
Scenario 3.1b: The National Vital Statistics Report reveals that the duration of human pregnancies is normally distributed with µ = 270 days and σ = 15 days. What is the likelihood that a pregnancy would last between 260 and 280 days (8.5 and 9.5 months)?
Scenario 3.1c: The National Center for Health Statistics revealed that weights of American women aged 20-29 are normally distributed with µ = 140 lbs. and σ = 30 lbs. What is the likelihood that an American woman in this age group would weigh more than 200 lbs.?
Adapted from Workshop Statistics, pp. 357-359
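One way to check an exact normal probability is to standardize the cutoff and evaluate the normal cumulative distribution function. The following is a minimal sketch, assuming Python and its standard library only; the helper names (normal_cdf, prob_below, prob_above, prob_between) are illustrative and not part of the assignment.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X < x) for X ~ Normal(mu, sigma), computed from the error function."""
    z = (x - mu) / sigma  # standardize the cutoff
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_below(x, mu, sigma):
    return normal_cdf(x, mu, sigma)

def prob_above(x, mu, sigma):
    return 1.0 - normal_cdf(x, mu, sigma)

def prob_between(lo, hi, mu, sigma):
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

# The three scenarios, using the parameters stated in the exercise.
p_a = prob_below(100, mu=115, sigma=12)
p_b = prob_between(260, 280, mu=270, sigma=15)
p_c = prob_above(200, mu=140, sigma=30)

for label, p in [("3.1a", p_a), ("3.1b", p_b), ("3.1c", p_c)]:
    print(f"Scenario {label}: {p:.4f}")
```

Comparing each result with the Empirical Rule estimate is a useful sanity check: the two should agree roughly, not exactly.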
Exercise 3.2: Revisiting DoW #3 (posted below)
Return to your group’s data for DoW #3. Run a hypothesis test to determine whether the difference in the mean number of raisins for the generic brand and the Sun-Maid brand is statistically significant. Use a significance level of alpha = .005 (0.5%).
Be sure to include:
The Null Hypothesis (H0) and the Alternative Hypothesis (HA)
The Hypothesis Test values
The p-value
An interpretation of the results
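The pieces listed above can be computed along the following lines. This is a hedged sketch, not the required method: it assumes a two-sided, two-sample Welch t-test (unequal variances) and uses only the Python standard library, with the p-value obtained by numerically integrating the t density. The counts in generic and sun_maid are made-up placeholders; substitute your group's data.

```python
import math
import statistics

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - P(-|t| < T < |t|), using Simpson's rule."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (h / 3) * total  # density is symmetric about 0
    return max(0.0, 1.0 - central)

def welch_t_test(x, y):
    """Return (t statistic, degrees of freedom, two-sided p-value)."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    se2 = vx / nx + vy / ny
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(se2)
    df = se2 ** 2 / ((vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    return t, df, two_sided_p(t, df)

# PLACEHOLDER counts, invented for illustration only.
generic = [28, 30, 27, 31, 29, 26, 30, 28]
sun_maid = [27, 25, 29, 26, 28, 24, 27, 26]

alpha = 0.005
t_stat, df, p_value = welch_t_test(generic, sun_maid)
print(f"t = {t_stat:.3f}, df = {df:.1f}, p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Here H0 is that the two brands have the same mean number of raisins per box, and HA is that the means differ. If your course uses a pooled-variance test or a z-test instead, the statistic and degrees of freedom change accordingly.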
DATA TO BE USED
Plan for research of raisins: Group B
Question:
Does the number of raisins in a ½ ounce box of raisins differ for a store-brand box of raisins and a name-brand box of raisins?
Variables:
1. Store brand versus name brand raisins (categorical)
2. The number of raisins (quantitative)
3. Size of box (quantitative), held constant
Observational units:
1. An equal number of boxes of raisins (store brand and name brand).
- If we cannot get ½ ounce boxes of raisins, then we would have to measure the quantity of raisins per ounce in the name brand and the store brand.
2. The stores where the raisins were bought.
Type of study:
This is an observational study: the researcher randomly chooses samples of both store brand and name brand raisins and counts the number of raisins in each, without imposing any treatment.
Expected difference:
We suspect that there will be a difference, but not a significant one. We think that boxes of store brand and name brand raisins may not contain equal numbers of raisins. Even two boxes of Sun-Maid raisins, for example, may not contain the same number of raisins.
Sampling Strategy:
We think stratified random sampling would be the better approach, to ensure that we have a representative number of boxes from both the store brand and the name brand.
So we would first randomly choose 30 boxes (½ ounce each) of store brand raisins and 30 boxes (½ ounce each) of name brand raisins.
Potential bias:
Since not all group members have access to the same store brand of raisins, there may be a bias.
Our plan:
1. Research to find out the brand of raisins that are available.
2. We could each pick a brand and be responsible for that, or we could have two people do the store brand (different or the same) and two people do the name brand.
3. Buy multiple boxes of the same brand at the same store.
4. We will count and record the number of raisins in each box.
5. We will share our data in a table or spreadsheet format.
6. We will analyze our findings.
Exercise 3.3: High School Degree and Poverty Rate
Eight states were randomly selected from among the 50 United States. This data set presents the percentage of households in each state that were below the poverty level (Poverty Rate) and the percentage of adults in the state who had earned a high school degree or higher (HS and Above).
(a) Determine the relationship (if any) between these two variables and the strength of this relationship. Justify your response.
(b) Determine the Least Squares Regression Best-Fit Line. Discuss how well this line fits the data.
(c) The state of Massachusetts had 75% of its population with a high school degree or higher at the time of the study. Predict the poverty rate using the Best-Fit Line for this data. Discuss the reliability of this prediction.
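The correlation, the least squares line, and the prediction can be computed along these lines. This is a minimal sketch, assuming Python's standard library; the eight (HS and Above, Poverty Rate) pairs below are invented placeholders, since the data table is not reproduced here, and must be replaced with the actual values for the eight states.

```python
import math

def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return slope, intercept, r

# PLACEHOLDER values, invented for illustration only.
hs_and_above = [77.5, 80.1, 82.3, 84.0, 85.6, 86.9, 88.2, 90.4]
poverty_rate = [17.2, 15.9, 14.1, 13.0, 11.8, 11.1, 9.7, 8.6]

slope, intercept, r = least_squares(hs_and_above, poverty_rate)
predicted = intercept + slope * 75  # prediction at HS and Above = 75%

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
print(f"predicted poverty rate at 75%: {predicted:.2f}")
print(f"observed x range: {min(hs_and_above)} to {max(hs_and_above)}")
```

The sign and size of r describe the direction and strength of the linear relationship, and r^2 is one common measure of how well the line fits. Printing the observed range of x makes it easy to see whether 75% falls inside the data or requires extrapolation, which bears on how far the prediction can be trusted.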
Exercise 3.4: Gender-Stereotyping Toy Advertising
Allan J. Rossman, Beth L. Chance, and Robin H. Lock, Workshop Statistics: Discovery with Data (Key College Publishing, 2001), p. 160
To study whether toy advertisements tend to picture children with toys considered to be typical of their gender, researchers examined a random collection of pictures of toys in a number of children’s catalogs. For each picture, they recorded whether the child in the picture was a boy or a girl (ignoring ads with both a boy and a girl together). They also recorded whether the toy pictured was a traditional “male” toy (like a truck or a toy soldier) or a traditional “female” toy (like a doll or a play kitchen) or a “neutral” toy (like a puzzle or a toy phone). The results are summarized in the table below:
                   Boy shown    Girl shown
“boy” toy              59           15
“girl” toy              2           24
“neutral” toy          36           47
(a) Construct a segmented bar graph to display the conditional distribution for ads showing boys and ads showing girls. Briefly describe any key features you see in this graph.
A Chi Square Test for Independence was run to test for an association between these two variables, at the alpha = .005 (0.5%) significance level. The results are shown:
(b) State the Null Hypothesis, the Alternative Hypothesis, and the p-value for this test. Interpret the results of this test. Comment on whether the researchers’ data support the claim that toy advertisers do indeed tend to picture children with toys stereotypical of their gender.
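The conditional distributions and the chi-square statistic can be reproduced from the counts in the table. The following is a sketch under stated assumptions: it uses only the Python standard library, the function name chi_square_independence is illustrative, and the p-value line relies on the fact that for exactly 2 degrees of freedom the chi-square upper-tail probability equals exp(-x/2). For other table sizes a chi-square table or statistical software would be needed.

```python
import math

# Observed counts from the table: rows are toy types, columns are (boy shown, girl shown).
toy_types = ["boy toy", "girl toy", "neutral toy"]
observed = [
    [59, 15],
    [2, 24],
    [36, 47],
]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Conditional distribution of toy type within each column (for the segmented bar graph).
col_totals = [sum(col) for col in zip(*observed)]
conditional = [[obs / col_totals[j] for j, obs in enumerate(row)] for row in observed]
for name, row in zip(toy_types, conditional):
    print(f"{name:12s} boys: {row[0]:.3f}  girls: {row[1]:.3f}")

chi2, df = chi_square_independence(observed)
# Valid only when df == 2: upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None
print(f"chi-square = {chi2:.2f}, df = {df}, p-value = {p_value:.3g}")
```

Each column of the conditional distribution sums to 1, which is what makes the two segmented bars directly comparable even though the number of ads differs between boys and girls.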