Business Statistics Exam 2
Note: Both Part I and Part II consist of multiple-choice questions, and each question has exactly one correct answer. Part I: Concept-Review Questions (30 points, each question worth 1 point) 1. Which of the following would be an appropriate null hypothesis? a) The mean of a population is equal to 55. b) The mean of a sample is equal to 55. c) The mean of a population is greater than 55. d) The mean of a sample is greater than 55. 2. Which of the following would be an appropriate null hypothesis? a) The population proportion is less than 0.65. b) The sample proportion is less than 0.65. c) The population proportion is not less than 0.65. d) The sample proportion is no less than 0.65. 3. Which of the following would be an appropriate alternative hypothesis? a) The mean of a population is equal to 55. b) The mean of a sample is equal to 55. c) The mean of a population is greater than 55. d) The mean of a sample is greater than 55. 4. A Type I error is committed when a) you reject a null hypothesis that is true. b) you don't reject a null hypothesis that is true. c) you reject a null hypothesis that is false. d) you don't reject a null hypothesis that is false. 5. A Type II error is committed when a) you reject a null hypothesis that is true. b) you don't reject a null hypothesis that is true. c) you reject a null hypothesis that is false. d) you don't reject a null hypothesis that is false.
6. The power of a test is measured by its capability of a) rejecting a null hypothesis that is true. b) not rejecting a null hypothesis that is true. c) rejecting a null hypothesis that is false. d) not rejecting a null hypothesis that is false. 7. If an economist wishes to determine whether there is evidence that mean family income in a community exceeds $50,000 a) either a one-tail or two-tail test could be used with equivalent results. b) a one-tail test should be utilized. c) a two-tail test should be utilized. d) None of the above.
8. If the p-value is less than α in a two-tail test, a) the null hypothesis should not be rejected. b) the null hypothesis should be rejected. c) a one-tail test should be used. d) no conclusion should be reached.
9. If a test of hypothesis has a Type I error probability (α) of 0.01, it means that a) if the null hypothesis is true, you don't reject it 1% of the time. b) if the null hypothesis is true, you reject it 1% of the time. c) if the null hypothesis is false, you don't reject it 1% of the time. d) if the null hypothesis is false, you reject it 1% of the time.
10. If the Type I error (α) for a given test is to be decreased, then for a fixed sample size n
a) the Type II error (β) will also decrease.
b) the Type II error (β) will increase. c) the power of the test will increase. d) a one-tail test must be utilized.
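The relationship between α and β for a fixed sample size can be explored numerically. The sketch below assumes an upper-tail Z test with a known population standard deviation; the effect size, standard deviation, and sample size are hypothetical values chosen only for illustration and do not come from the exam.

```python
from statistics import NormalDist


def type_ii_error(alpha, effect, sigma, n):
    """Beta for an upper-tail Z test of H0: mu <= mu0 when the true mean is mu0 + effect."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # critical value for the chosen alpha
    shift = effect / (sigma / n ** 0.5)      # true shift measured in standard errors
    return std_normal.cdf(z_crit - shift)    # P(do not reject H0 | H0 is false)


# Hypothetical inputs (assumptions for illustration): effect = 5, sigma = 20, n = 36.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, effect=5, sigma=20, n=36)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")
```

Running the loop with other hypothetical inputs shows the same pattern as long as n is held fixed.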
11. You have created a 95% confidence interval for μ with the result 10 ≤ μ ≤ 15. What decision will you make if you test H0 : μ = 16 versus H1 : μ ≠ 16 at α = 0.05? a) Reject H0 in favor of H1. b) Do not reject H0 in favor of H1. c) Fail to reject H0 in favor of H1. d) We cannot tell what our decision will be from the information given.
12. An entrepreneur is considering the purchase of a coin-operated laundry. The current owner claims that over the past 5 years, the mean daily revenue was $675 with a population standard deviation of $75. A sample of 30 days reveals a daily mean revenue of $625. If you were to test the null hypothesis that the daily mean revenue was $675, which test would you use? a) Z-test of a population mean b) Z-test of a population proportion c) t-test of population mean d) t-test of a population proportion 13. A pizza chain is considering opening a new store in an area that currently does not have any such stores. The chain will open if there is evidence that more than 5,000 of the 20,000 households in the area have a favorable view of its brand. It conducts a telephone poll of 300 randomly selected households in the area and finds that 96 have a favorable view. State the test of hypothesis that is of interest to the pizza chain. a) H0 : π ≤ 0.32 versus H1 : π > 0.32 b) H0 : π ≤ 0.25 versus H1 : π > 0.25 c) H0 : π ≤ 5,000 versus H1 : π > 5,000 d) H0 : µ ≤ 5,000 versus H1 : µ > 5,000
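As a rough illustration of the two one-sample Z statistics that the laundry and pizza scenarios point toward, the sketch below plugs the numbers from questions 12 and 13 into the usual formulas. The function names are hypothetical, and the choice of test is an assumption based on the population standard deviation being given in question 12 and a proportion being tested in question 13.

```python
from math import sqrt


def z_for_mean(sample_mean, mu0, sigma, n):
    """Z statistic for a population mean when sigma is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def z_for_proportion(successes, n, pi0):
    """Z statistic for a population proportion, using pi0 in the standard error."""
    p = successes / n
    return (p - pi0) / sqrt(pi0 * (1 - pi0) / n)


# Question 12: claimed mean $675, sigma $75, sample of 30 days with mean $625.
z_laundry = z_for_mean(625, 675, 75, 30)

# Question 13: 5,000 of 20,000 households gives a hypothesized proportion of 0.25.
z_pizza = z_for_proportion(96, 300, 5000 / 20000)

print(f"laundry Z = {z_laundry:.4f}")
print(f"pizza Z   = {z_pizza:.4f}")
```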
14. The marketing manager for an automobile manufacturer is interested in determining the proportion of new compact-car owners who would have purchased a GPS navigation system if it had been available for an additional cost of $300. The manager believes from previous information that the proportion is 0.30. Suppose that a survey of 200 new compact-car owners is selected and 79 indicate that they would have purchased the GPS navigation system. If you were to conduct a test to determine whether there is evidence that the proportion is different from 0.30, which test would you use? a) Z-test of a population mean b) Z-test of a population proportion c) t-test of population mean d) t-test of a population proportion 15. The t test for the difference between the means of 2 independent populations assumes that the respective a) sample sizes are equal. b) sample variances are equal. c) populations are approximately normal. d) All the above.
16. If we are testing for the difference between the means of 2 independent populations
presuming equal variances with samples of n1 = 20 and n2 = 20, the number of degrees of freedom is equal to a) 39. b) 38. c) 19. d) 18. 17. In testing for differences between the means of two independent populations, the null hypothesis is:
a) H0 : μ1 − μ2 = 2.
b) H0 : μ1 − μ2 = 0.
c) H0 : μ1 − μ2 > 0.
d) H0 : μ1 − μ2 < 2.
18. Given the following information, calculate sp², the pooled sample variance that should be used in the pooled-variance t test. s1² = 4, s2² = 6, n1 = 16, n2 = 25 a) sp² = 6.00 b) sp² = 5.00 c) sp² = 5.23 d) sp² = 4.00
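The pooled sample variance weights each sample variance by its degrees of freedom. A minimal sketch of that arithmetic, using the values given in question 18 (the function name is hypothetical):

```python
def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled sample variance: each variance weighted by (n - 1)."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)


# Values from question 18: s1^2 = 4, n1 = 16, s2^2 = 6, n2 = 25.
sp_sq = pooled_variance(4, 16, 6, 25)
print(f"pooled variance = {sp_sq:.4f}")
```

The denominator n1 + n2 − 2 is also the degrees of freedom of the pooled-variance t test.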
19. In testing for the differences between the means of two related populations, you assume that the differences follow a _______ distribution. a) normal b) t c) F d) χ²
20. The Y-intercept (b0) represents the a) predicted value of Y when X = 0. b) change in estimated Y per unit change in X. c) predicted value of Y. d) variation around the sample regression line.
21. The slope (b1) represents a) predicted value of Y when X = 0. b) the estimated average change in Y per unit change in X. c) the predicted value of Y. d) variation around the line of regression. 22. The residuals represent a) the difference between the actual Y values and the mean of Y. b) the difference between the actual Y values and the predicted Y values. c) the square root of the slope. d) the predicted value of Y for the average X value. 23. The strength of the linear relationship between two numerical variables may be measured by the a) scatter plot. b) coefficient of correlation. c) slope. d) Y-intercept. 24. Based on the residual plot below, you will conclude that there might be a violation of which of the following assumptions.
a) Linearity of the relationship b) Normality of errors c) Homoscedasticity d) Independence of errors
25. If the Durbin-Watson statistic has a value close to 0, which assumption is violated?
a) Normality of the errors. b) Independence of errors. c) Homoscedasticity. d) None of the above. 26. Assuming a linear relationship between X and Y, if the coefficient of correlation (r) equals – 0.30, a) there is no correlation. b) the slope (b1) is negative. c) variable X is larger than variable Y. d) the variance of X is negative. 27. In a multiple regression problem involving two independent variables, if b1 is computed to be +2.0, it means that a) the relationship between X1 and Y is significant. b) the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, holding X2 constant. c) the estimated mean of Y increases by 2 units for each increase of 1 unit of X1, without regard to X2. d) the estimated mean of Y is 2 when X1 equals zero. 28. In a multiple regression model, the value of the coefficient of multiple determination a) has to fall between –1 and +1. b) has to fall between 0 and +1. c) has to fall between –1 and 0. d) can fall between any pair of real numbers. 29. If a categorical independent variable contains 2 categories, then _________ dummy variable(s) will be needed to uniquely represent these categories. a) 1 b) 2 c) 3 d) 4 30. An interaction term in a multiple regression model may be used when a) the coefficient of determination is small.
b) there is a curvilinear relationship between the dependent and independent variables. c) neither one of 2 independent variables contribute significantly to the regression model. d) the relationship between X1 and Y changes for differing values of X2. Part II: Scenario Questions (70 points, each question worth 2 points) Questions 31-34 are based on the following scenario: How many tissues should the Kimberly Clark Corporation package of Kleenex contain? Researchers determined that 60 tissues is the mean number of tissues used during a cold. Suppose a random sample of 100 Kleenex users yielded the following data on the number of tissues used during a cold: X̄ = 52, S = 22. 31. Give the null and alternative hypotheses to determine if the number of tissues used during a cold is less than 60.
a) H0 : μ ≤ 60 and H1 : μ > 60.
b) H0 : μ ≥ 60 and H1 : μ < 60.
c) H0 : X̄ ≥ 60 and H1 : X̄ < 60.
d) H0 : X̄ = 52 and H1 : X̄ ≠ 52. 32. Using the sample information provided, calculate the value of the test statistic.
a) t = (52 − 60) / 22
b) t = (52 − 60) / (22 / 100)
c) t = (52 − 60) / (22 / 100²) d) t = (52 − 60) / (22 / 10)
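For reference, the one-sample t statistic divides the difference between the sample mean and the hypothesized mean by the estimated standard error S / √n. A short sketch with the Kleenex sample values (variable names are illustrative):

```python
from math import sqrt

x_bar = 52    # sample mean number of tissues
mu0 = 60      # hypothesized population mean
s = 22        # sample standard deviation
n = 100       # sample size

standard_error = s / sqrt(n)         # estimated standard error of the mean
t_stat = (x_bar - mu0) / standard_error
df = n - 1                           # degrees of freedom for the t test

print(f"standard error = {standard_error:.2f}, t = {t_stat:.4f}, df = {df}")
```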
33. Suppose the alternative you wanted to test was H1 : μ < 60. State the correct rejection region for α = 0.05. a) Reject H0 if t > 1.6604. b) Reject H0 if t < – 1.6604. c) Reject H0 if t > 1.9842 or Z < – 1.9842. d) Reject H0 if t < – 1.9842. 34. Suppose the test statistic does fall in the rejection region at α = 0.05. Which of the following conclusions is correct? a) At α = 0.05, there is not sufficient evidence to conclude that the mean number of tissues used during a cold is 60 tissues.
b) At α = 0.05, there is sufficient evidence to conclude that the mean number of tissues used during a cold is 60 tissues. c) At α = 0.05, there is insufficient evidence to conclude that the mean number of tissues used during a cold is not 60 tissues. d) At α = 0.10, there is sufficient evidence to conclude that the mean number of tissues used during a cold is not 60 tissues. Questions 35-39 are based on the following scenario:
A researcher randomly sampled 30 graduates of an MBA program and recorded data concerning their starting salaries. Of primary interest to the researcher was the effect of gender on starting salaries. The result of the pooled-variance t-test of the mean salaries of the females (Population 1) and males (Population 2) in the sample is given below.
35. The researcher was attempting to show statistically that the female MBA graduates have a significantly lower mean starting salary than the male MBA graduates. Which of the following is an appropriate alternative hypothesis?
a) H1 : μfemales > μmales
b) H1 : μfemales < μmales
c) H1 : μfemales ≠ μmales
d) H1 : μfemales = μmales 36. From the analysis in this scenario, the correct test statistic is: a) -6610 b) -1.3763 c) -1.7011 d) 0.0898 37. The proper conclusion for this test is:
a) At the α = 0.05 level, there is sufficient evidence to indicate that females have a lower mean starting salary than male MBA graduates. b) At the α = 0.05 level, there is sufficient evidence to indicate that females have a higher mean starting salary than male MBA graduates. c) At the α = 0.05 level, there is insufficient evidence to indicate that females have a lower mean starting salary than male MBA graduates. d) At the α = 0.05 level, there is insufficient evidence to indicate that females have a higher mean starting salary than male MBA graduates. 38. What is the 95% confidence interval estimate for the difference between two means?
a) −$16,447.85 to −$3,227.85
b) −$16,447.85 to $3,227.85
c) −$3,227.85 to $3,227.85
d) $3,227.85 to $16,447.85 39. The researcher was attempting to show statistically that the female MBA graduates have a significantly lower mean starting salary than the male MBA graduates. What assumptions were necessary to conduct this hypothesis test? a) Both populations of salaries (male and female) must have approximate normal distributions. b) The population variances are approximately equal. c) The samples were randomly and independently selected. d) All of the above assumptions were necessary.
Questions 40-47 are based on the following scenario: A corporation randomly selects 150 salespeople and finds that 66% who have never taken a self-improvement course would like such a course. The firm did a similar study 10 years ago in which 60% of a random sample of 160 salespeople wanted a self-improvement course. The
groups are assumed to be independent random samples. Let π1 and π2 represent the true proportion of workers who would like to attend a self-improvement course in the recent study and the past study, respectively. 40. If the firm wanted to test whether this proportion has changed from the previous study, which represents the relevant hypotheses?
a) H0: π1 − π2 = 0 versus H1: π1 − π2 ≠ 0
b) H0: π1 − π2 ≠ 0 versus H1: π1 − π2 = 0
c) H0: π1 − π2 ≤ 0 versus H1: π1 − π2 > 0
d) H0: π1 − π2 ≥ 0 versus H1: π1 − π2 < 0 41. If the firm wanted to test whether a greater proportion of workers would currently like to attend a self-improvement course than in the past, which represents the relevant hypotheses?
a) H0: π1 − π2 = 0 versus H1: π1 − π2 ≠ 0
b) H0: π1 − π2 ≠ 0 versus H1: π1 − π2 = 0
c) H0: π1 − π2 ≤ 0 versus H1: π1 − π2 > 0
d) H0: π1 − π2 ≥ 0 versus H1: π1 − π2 < 0 42. What is the point estimate for the difference between the two population proportions? a) 0.06 b) 0.10 c) 0.15 d) 0.22 43. What is/are the critical value(s) when performing a Z test on whether population proportions are different if α = 0.05? a) 1.645 b) 1.96
c) −1.96 d) 2.08 44. What is/are the critical value(s) when testing whether the current population proportion is higher than before if α = 0.05? a) + 1.337 b) + 1.645 c) + 1.96 d) + 2.25
45. What is the estimated standard error of the difference between the two sample proportions?
a) 0.629 b) 0.500 c) 0.055 d) 0 46. What is the value of the test statistic to use in evaluating the alternative hypothesis that there is a difference in the two population proportions? a) 4.335 b) 1.96 c) 1.093 d) 0 47. The company tests to determine at the 0.05 level whether the population proportion has changed from the previous study. Which of the following is correct? a) Reject the null hypothesis and conclude that the proportion of employees who are interested in a self-improvement course has changed over the intervening 10 years. b) Do not reject the null hypothesis and conclude that the proportion of employees who are interested in a self-improvement course has not changed over the intervening 10 years. c) Reject the null hypothesis and conclude that the proportion of employees who are interested in a self-improvement course has increased over the intervening 10 years. d) Do not reject the null hypothesis and conclude that the proportion of employees who are interested in a self-improvement course has increased over the intervening 10 years.
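The quantities asked about in this scenario follow from the pooled two-proportion Z test. The sketch below reconstructs the success counts from the reported percentages, which is an assumption (66% of 150 and 60% of 160 are taken as 99 and 96 respondents), and uses Python's `statistics.NormalDist` for the critical values.

```python
from math import sqrt
from statistics import NormalDist

n1, p1 = 150, 0.66    # recent study
n2, p2 = 160, 0.60    # study from 10 years ago

# Assumed counts recovered from the reported percentages.
x1 = round(p1 * n1)
x2 = round(p2 * n2)

point_estimate = p1 - p2
pooled = (x1 + x2) / (n1 + n2)                       # pooled proportion under H0
std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z_stat = point_estimate / std_error

alpha = 0.05
z_two_tail = NormalDist().inv_cdf(1 - alpha / 2)     # critical value for a two-tail test
z_upper_tail = NormalDist().inv_cdf(1 - alpha)       # critical value for an upper-tail test

print(f"estimate = {point_estimate:.2f}, pooled = {pooled:.3f}")
print(f"standard error = {std_error:.4f}, Z = {z_stat:.3f}")
print(f"two-tail critical = ±{z_two_tail:.3f}, upper-tail critical = {z_upper_tail:.3f}")
```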
Questions 48-53 are based on the following scenario: A candy bar manufacturer is interested in trying to estimate how sales are influenced by the price of their product. To do this, the company randomly chooses 6 small cities and offers the candy bar at different prices. Using candy bar sales as the dependent variable, the company will conduct a simple linear regression on the data below:
City          Price ($)   Sales
River Falls   1.3         100
Hudson        1.6         90
Ellsworth     1.8         90
Prescott      2           40
Rock Elm      2.4         38
Stillwater    2.9         32
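The simple linear regression described in this scenario can be reproduced by hand with the least-squares formulas. The following is a sketch using only the six data points in the table; the variable names are illustrative and no statistics package is assumed.

```python
from math import sqrt

prices = [1.3, 1.6, 1.8, 2.0, 2.4, 2.9]   # X: price in dollars
sales = [100, 90, 90, 40, 38, 32]          # Y: candy bar sales

n = len(prices)
x_bar = sum(prices) / n
y_bar = sum(sales) / n

ssx = sum((x - x_bar) ** 2 for x in prices)                          # sum of squares of X
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(prices, sales))  # cross products
sst = sum((y - y_bar) ** 2 for y in sales)                           # total sum of squares

b1 = sxy / ssx                      # estimated slope
b0 = y_bar - b1 * x_bar             # estimated Y-intercept
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(prices, sales))   # error sum of squares

r_squared = 1 - sse / sst           # coefficient of determination
s_yx = sqrt(sse / (n - 2))          # standard error of the estimate
s_b1 = s_yx / sqrt(ssx)             # standard error of the slope
predicted_at_2 = b0 + b1 * 2.0      # predicted sales at a price of $2

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, r^2 = {r_squared:.4f}")
print(f"S_YX = {s_yx:.3f}, S_b1 = {s_b1:.3f}, prediction at $2 = {predicted_at_2:.1f}")
```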
48. What is the estimated slope for the candy bar price and sales data? a) 161.386 b) 0.784 c) –3.810 d) –48.193 49. What is the coefficient of determination (r2) for these data? a) –0.8854 b) –0.7839 c) 0.7839 d) 0.8854 50. What percentage of the total variation in candy bar sales is explained by prices? a) 100% b) 88.54% c) 78.39% d) 48.19% 51. What is the standard error of the estimate, SYX, for the data? a) 0.784 b) 0.885 c) 12.650 d) 16.299 52. What is the standard error of the regression slope estimate, Sb1? a) 0.784 b) 0.885
c) 12.650 d) 16.299 53. If the price of the candy bar is set at $2, the predicted sales will be a) 30 b) 65 c) 90 d) 100 Questions 54-57 are based on the following scenario:
A manager of a product sales group believes the number of sales made by an employee (Y) depends on how many years that employee has been with the company (X1) and how he/she scored on a business aptitude test (X2). A random sample of 8 employees provides the following:
Employee Y X1 X2
1 100 10 7
2 90 3 10
3 80 8 9
4 70 5 4
5 60 5 8
6 50 7 5
7 40 1 4
8 30 1 1
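With two independent variables, the least-squares coefficients can be obtained by solving the normal equations on mean-centered data. The sketch below does this with plain Python for the eight employees in the table; it is one possible way to carry out the calculation, not necessarily the method used to prepare the exam.

```python
sales = [100, 90, 80, 70, 60, 50, 40, 30]   # Y: number of sales
years = [10, 3, 8, 5, 5, 7, 1, 1]           # X1: years with the company
scores = [7, 10, 9, 4, 8, 5, 4, 1]          # X2: aptitude test score


def center(values):
    """Return the mean and the list of deviations from the mean."""
    mean = sum(values) / len(values)
    return mean, [v - mean for v in values]


y_bar, dy = center(sales)
x1_bar, d1 = center(years)
x2_bar, d2 = center(scores)

# Sums of squares and cross products of the centered variables.
s11 = sum(a * a for a in d1)
s22 = sum(b * b for b in d2)
s12 = sum(a * b for a, b in zip(d1, d2))
s1y = sum(a * y for a, y in zip(d1, dy))
s2y = sum(b * y for b, y in zip(d2, dy))

# Solve the 2x2 normal equations with Cramer's rule.
det = s11 * s22 - s12 ** 2
b1 = (s1y * s22 - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det
b0 = y_bar - b1 * x1_bar - b2 * x2_bar

# Estimated sales for 5 years with the company and an aptitude score of 9.
predicted = b0 + b1 * 5 + b2 * 9

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}, prediction = {predicted:.2f}")
```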
54. For these data, what is the value for the regression constant, b0? a) 0.998 b) 3.103 c) 4.698 d) 21.293 55. What is the estimated coefficient for the variable representing years an employee has been with the company, b1? a) 0.998 b) 3.103 c) 4.698 d) 21.293 56. What is the estimated coefficient for the variable representing scores on the aptitude test, b2? a) 0.998 b) 3.103
c) 4.698 d) 21.293 57. If an employee who had been with the company 5 years scored a 9 on the aptitude test, what would his estimated expected sales be? a) 79.09 b) 60.88 c) 55.62 d) 17.98 Questions 58-65 are based on the following scenario: A real estate builder wishes to determine how house size (House) is influenced by family income (Income) and family size (Size). House size is measured in hundreds of square feet and income is measured in thousands of dollars. The builder randomly selected 50 families and ran the multiple regression. Partial Microsoft Excel output is provided below:
58. What fraction of the variability in house size is explained by income and size of family? a) 17.56% b) 53.78% c) 71.89%
d) 84.79% 59. Which of the independent variables in the model are significant at the 5% level? a) Income only b) Size only c) Income and Size d) None
60. When the builder used a simple linear regression model with house size (House) as the dependent variable and family size (Size) as the independent variable, he obtained an r² value of 1.25%. What additional percentage of the total variation in house size has been explained by including income in the multiple regression?
a) 15.00% b) 70.64% c) 71.50% d) 73.62% 61. Suppose the builder wants to test whether the coefficient on Income is significantly different from 0. What is the value of the relevant t-statistic? a) –0.7630 b) 3.2708 c) 10.8668 d) 60.0864 62. At the 0.01 level of significance, what conclusion should the builder reach regarding the inclusion of Income in the regression model? a) Income is significant in explaining house size and should be included in the model because its p-value is less than 0.01. b) Income is significant in explaining house size and should be included in the model because its p-value is more than 0.01. c) Income is not significant in explaining house size and should not be included in the model because its p-value is less than 0.01. d) Income is not significant in explaining house size and should not be included in the model because its p-value is more than 0.01. 63. What annual income (in thousands of dollars) would an individual with a family size of 9 need to attain a predicted 5,000 square foot home (House = 50)? a) 10.19 b) 11.19 c) 13.19 d) 15.19
64. The observed value of the F-statistic is missing from the printout. What are the degrees of freedom for this F-statistic? a) 2 for the numerator, 47 for the denominator b) 2 for the numerator, 49 for the denominator c) 49 for the numerator, 47 for the denominator d) 47 for the numerator, 49 for the denominator
65. Allowing for a 1% probability of committing a Type I error, what is the decision and conclusion for the test H0 : β1 = β2 = 0 vs. H1 : At least one βj ≠ 0, j = 1, 2? a) Do not reject H0 and conclude that the 2 independent variables taken as a group have significant linear effects on house size. b) Do not reject H0 and conclude that the 2 independent variables taken as a group do not have significant linear effects on house size. c) Reject H0 and conclude that the 2 independent variables taken as a group have significant linear effects on house size. d) Reject H0 and conclude that the 2 independent variables taken as a group do not have significant linear effects on house size.
Bonus Questions (10 points, each question worth 2 points)
66. What null hypothesis would you test to determine whether the slope of the linear relationship between weight loss (Y) and time on the program (X1) varies according to time of session?
a) H0 : 1 = 0
b) H0 : = 0
c) H0 : = 0
d) H0 : = = 0
67. In terms of the βs in the model, give the mean change in weight loss (Y) for every 1 month increase in time on the program (X1) when not attending the morning session.
a) b) + c) + d) +
68. In terms of the βs in the model, give the mean change in weight loss (Y) for every 1 month increase in time on the program (X1) when attending the morning session.
a) b) + c) + d) + 69. Which of the following statements is supported by the analysis shown?
a) There is sufficient evidence (at α = 0.05) of curvature in the relationship between weight loss (Y) and months on program (X1).
b) There is sufficient evidence (at α = 0.05) to indicate that the relationship between weight loss (Y) and months on program (X1) varies with session time.
c) There is insufficient evidence (at α = 0.05) of curvature in the relationship between weight loss (Y) and months on program (X1).
d) There is insufficient evidence (at α = 0.05) to indicate that the relationship between weight loss (Y) and months on program (X1) varies with session time. 70. What is the unit for this analysis?
a) A clinic b) A client on a weight-loss program c) A month d) A morning, afternoon, or evening session