Statistics
Measures of Distribution Shape and Detecting Outliers
Statistics (exercises)
Aleksandra Pawłowska
May 19, 2020
Glossary
Skewness: A measure of the shape of a data distribution. Data skewed to the left result in negative skewness; a symmetric data distribution results in zero skewness; and data skewed to the right result in positive skewness.
z-score: A value computed by dividing the deviation about the mean (xi − x̄) by the standard deviation s. A z-score is referred to as a standardized value and denotes the number of standard deviations xi is from the mean.
Chebyshev’s theorem: A theorem that can be used to make statements about the proportion of data values that must be within a specified number of standard deviations of the mean.
Empirical rule: A rule that can be used to compute the percentage of data values that must be within one, two, and three standard deviations of the mean for data that exhibit a bell-shaped distribution.
Outlier: An unusually small or unusually large data value.
Task 1
Consider a sample with data values of 10, 20, 12, 17, and 16. Compute the z-score for each of the five observations.
Task 1 – solution
Consider a sample with data values of 10, 20, 12, 17, and 16. Compute the z-score for each of the five observations.
The sample mean is x̄ = 15 and the sample standard deviation is s = 4, so:
z1 = −1.25, z2 = 1.25, z3 = −0.75, z4 = 0.5, z5 = 0.25
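The calculation can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module; the function name `z_scores` is only an illustration, and the sample standard deviation (divisor n − 1) is assumed, which is what gives s = 4 here.

```python
from statistics import mean, stdev

def z_scores(data):
    """Return the z-score of each value, using the sample standard deviation."""
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    return [(x - x_bar) / s for x in data]

sample = [10, 20, 12, 17, 16]
for x, z in zip(sample, z_scores(sample)):
    print(f"x = {x:2d}  z = {z:+.2f}")
```

Using the population standard deviation (divisor n) instead would give slightly larger z-scores in absolute value.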
Task 2
Consider a sample with a mean of 500 and a standard deviation of 100. What are the z-scores for the following data values: 520, 650, 500, 450, and 280?
Task 2 – solution
Consider a sample with a mean of 500 and a standard deviation of 100. What are the z-scores for the following data values: 520, 650, 500, 450, and 280?
z1 = 0.2, z2 = 1.5, z3 = 0, z4 = −0.5, z5 = −2.2
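Each value follows directly from the z-score definition in the glossary. As a worked illustration, here is the general formula with the last data value substituted:

```latex
z_i = \frac{x_i - \bar{x}}{s},
\qquad
z_5 = \frac{280 - 500}{100} = -2.2
```

A negative z-score means the value lies below the mean; 280 is 2.2 standard deviations below 500.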
Task 3
Consider a sample with a mean of 30 and a standard deviation of 5. Use Chebyshev’s theorem to determine the percentage of the data within each of the following ranges:
1 20 to 40
2 15 to 45
3 22 to 38
4 18 to 42
5 12 to 48
Task 3 – solution
1 20 to 40: at least 75%
2 15 to 45: at least 89%
3 22 to 38: at least 61%
4 18 to 42: at least 83%
5 12 to 48: at least 92%
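Chebyshev's theorem states that at least 1 − 1/k² of the data lie within k standard deviations of the mean, for any k > 1. The sketch below applies that bound to the five ranges; the function name `chebyshev_lower_bound` is hypothetical, and the code assumes each interval is symmetric about the mean, as all five are here.

```python
def chebyshev_lower_bound(mean, sd, low, high):
    """Minimum fraction of data in [low, high], an interval symmetric about the mean."""
    k = (high - mean) / sd
    if abs((mean - low) / sd - k) > 1e-9:
        raise ValueError("interval must be symmetric about the mean")
    if k <= 1:
        return 0.0  # the theorem gives no useful bound for k <= 1
    return 1 - 1 / k**2

for low, high in [(20, 40), (15, 45), (22, 38), (18, 42), (12, 48)]:
    bound = chebyshev_lower_bound(30, 5, low, high)
    print(f"{low} to {high}: at least {bound:.0%}")
```

Note that k does not have to be an integer: the range 22 to 38 corresponds to k = 1.6.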
Task 4
Suppose the data have a bell-shaped distribution with a mean of 30 and a standard deviation of 5. Use the empirical rule to determine the percentage of data within each of the following ranges:
1 20 to 40
2 15 to 45
3 25 to 35
Task 4 – solution
1 20 to 40: 95%
2 15 to 45: almost 100%
3 25 to 35: 68%
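Each range is the mean plus or minus a whole number of standard deviations, so the empirical rule can be read off directly. The mapping, written out (99.7% is the usual empirical-rule figure for three standard deviations, reported above as "almost 100%"):

```latex
\begin{align*}
[20,\,40] &= 30 \pm 2 \cdot 5 &&\Rightarrow\ \text{approximately } 95\% \\
[15,\,45] &= 30 \pm 3 \cdot 5 &&\Rightarrow\ \text{approximately } 99.7\% \\
[25,\,35] &= 30 \pm 1 \cdot 5 &&\Rightarrow\ \text{approximately } 68\%
\end{align*}
```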
Task 5
The results of a national survey showed that on average, adults sleep 6.9 hours per night. Suppose that the standard deviation is 1.2 hours.
1 Use Chebyshev’s theorem to calculate the percentage of individuals who sleep between 4.5 and 9.3 hours.
2 Use Chebyshev’s theorem to calculate the percentage of individuals who sleep between 3.9 and 9.9 hours.
3 Assume that the number of hours of sleep follows a bell-shaped distribution. Use the empirical rule to calculate the percentage of individuals who sleep between 4.5 and 9.3 hours per day. How does this result compare to the value that you obtained using Chebyshev’s theorem in part 1?
Task 5 – solution
1 Use Chebyshev’s theorem to calculate the percentage of individuals who sleep between 4.5 and 9.3 hours. At least 75%
2 Use Chebyshev’s theorem to calculate the percentage of individuals who sleep between 3.9 and 9.9 hours. At least 84%
3 Assume that the number of hours of sleep follows a bell-shaped distribution. Use the empirical rule to calculate the percentage of individuals who sleep between 4.5 and 9.3 hours per day. How does this result compare to the value that you obtained using Chebyshev’s theorem in part 1? Approximately 95%. This is higher than the 75% from Chebyshev’s theorem: Chebyshev’s bound holds for any distribution, whereas in a bell-shaped distribution more observations are concentrated close to the mean.
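The gap between the two answers can be illustrated numerically. The sketch below compares Chebyshev's lower bound with the probability under an exactly normal model, using `statistics.NormalDist` from the Python standard library. Exact normality is an assumption made here for illustration; the task only says the distribution is bell-shaped.

```python
from statistics import NormalDist

mean, sd = 6.9, 1.2
low, high = 4.5, 9.3

k = (high - mean) / sd            # number of standard deviations (2 here)
chebyshev = 1 - 1 / k**2          # lower bound valid for any distribution
sleep = NormalDist(mean, sd)      # assumption: hours of sleep are exactly normal
normal = sleep.cdf(high) - sleep.cdf(low)

print(f"Chebyshev: at least {chebyshev:.1%}")
print(f"Normal model: about {normal:.1%}")
```

The normal model gives a value close to the empirical rule's 95%, well above the 75% that Chebyshev's theorem guarantees without any assumption about shape.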
Task 6
The Energy Information Administration reported that the mean retail price per gallon of regular grade gasoline was $2.05 (Energy Information Administration, May 2009). Suppose that the standard deviation was $0.10 and that the retail price per gallon has a bell-shaped distribution.
1 What percentage of regular grade gasoline sold between $1.95 and $2.15 per gallon?
2 What percentage of regular grade gasoline sold between $1.95 and $2.25 per gallon?
3 What percentage of regular grade gasoline sold for more than $2.25 per gallon?
Task 6 – solution
1 What percentage of regular grade gasoline sold between $1.95 and $2.15 per gallon? 68%
2 What percentage of regular grade gasoline sold between $1.95 and $2.25 per gallon? 81.5%
3 What percentage of regular grade gasoline sold for more than $2.25 per gallon? 2.5%
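Parts 2 and 3 involve intervals that are not symmetric about the mean, so the empirical-rule percentages have to be split into halves and combined. A minimal sketch of that bookkeeping follows; the names `WITHIN`, `below`, `between`, and `k_of` are hypothetical helpers, and the code only handles boundaries that fall a whole number of standard deviations from the mean.

```python
# Empirical-rule percentages within k standard deviations of the mean.
WITHIN = {0: 0.0, 1: 68.0, 2: 95.0, 3: 99.7}

def below(k):
    """Approximate percentage of data below mean + k*sd, for integer k in -3..3."""
    half = WITHIN[abs(k)] / 2  # by symmetry, half lies on each side of the mean
    return 50 + half if k >= 0 else 50 - half

def between(k_low, k_high):
    """Approximate percentage of data between mean + k_low*sd and mean + k_high*sd."""
    return below(k_high) - below(k_low)

mean, sd = 2.05, 0.10

def k_of(price):
    """Whole number of standard deviations between price and the mean."""
    return round((price - mean) / sd)

print(between(k_of(1.95), k_of(2.15)))   # part 1
print(between(k_of(1.95), k_of(2.25)))   # part 2
print(100 - below(k_of(2.25)))           # part 3
```

For part 2, this combines half of 68% (from $1.95 up to the mean) with half of 95% (from the mean up to $2.25).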
Task 7
The national average for the math portion of the College Board’s Scholastic Aptitude Test (SAT) is 515 (The World Almanac, 2009). The College Board periodically rescales the test scores such that the standard deviation is approximately 100. Answer the following questions using a bell-shaped distribution and the empirical rule for the math test scores.
1 What percentage of students have an SAT math score greater than 615?
2 What percentage of students have an SAT math score greater than 715?
3 What percentage of students have an SAT math score between 415 and 515?
4 What percentage of students have an SAT math score between 315 and 615?
Task 7 – solution
1 What percentage of students have an SAT math score greater than 615? 16%
2 What percentage of students have an SAT math score greater than 715? 2.5%
3 What percentage of students have an SAT math score between 415 and 515? 34%
4 What percentage of students have an SAT math score between 315 and 615? 81.5%
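The four answers come from splitting the empirical-rule percentages symmetrically about the mean of 515, with a standard deviation of 100. Written out, with X denoting a student's score (a symbol introduced here for illustration):

```latex
\begin{align*}
P(X > 615) &\approx \frac{100\% - 68\%}{2} = 16\% \\
P(X > 715) &\approx \frac{100\% - 95\%}{2} = 2.5\% \\
P(415 < X < 515) &\approx \frac{68\%}{2} = 34\% \\
P(315 < X < 615) &\approx \frac{95\%}{2} + \frac{68\%}{2} = 81.5\%
\end{align*}
```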