Module 7 - Assignment: Submit Quantitative Data Analysis
Answers do not need to read like a novel; keep them to the point, but accurate and precise. Also, most of the output you get from StatCrunch, such as graphs and diagrams, must be copied and pasted into the Word document you are using. Please see the reference sample paper from a previous student in one of the two attachments.
RSCH 202 – Introduction to Research Methods
RSCH202 – Quantitative Data Analysis Assignment
In this assignment, you will be required to download data and then use techniques from descriptive and inferential statistics to analyze that data. The following sets the scenario for the data set that you will download.
StatCrunch U is a fictitious university made up of 46,000 students (the population). Each student completed the survey shown here:
That gives us a database with that information for all 46,000 students.
For this assignment, you will first select a sample of 200 students from this population and then analyze the data from the sample to draw conclusions about the entire population of StatCrunch U students. The following instructions tell you how to select your sample.
Log in to StatCrunch.com and click on Resources at the top of the page, then on StatCrunch U on the left side. You should see the following paragraph.
Click on the StatCrunchU link in the paragraph above.
In the next screen, scroll down until you see the following, set the sample size to 200, and click Survey.
Your sample with the survey data should then show on the StatCrunch screen as shown below. If you don’t see the file, click MyStatCrunch, then My Preferences, then make sure classic StatCrunch is chosen. After that, try doing the survey. Be sure to click on Data>Save data and save the file to your My Data folder.
You should see a menu like the one below where you can give the file a name. Click on Save and you should get confirmation that the file was saved.
Next go to your StatCrunch My Data folder and find the file so that you can work with it to answer the following questions. Each question is worth 10 points.
1. Copy and paste the first 15 rows from your StatCrunch data file below. You can do that by highlighting the data in StatCrunch, then clicking on Edit>Copy. Next return to this file, place your cursor in the space below, and click Paste. The purpose of this is to allow your instructor to see the first part of your data set. Use all 200 students in your sample to answer the remaining questions.
Gender  Class  Hours  Work  Loans  CC Debt
Male    1      9      27.5  2863   1469
Male    3      20     0     0      4934
Female  1      21     8     5029   1241
Female  3      18     0     0      4020
Male    4      15     0     0      14587
Male    4      15     0     0      2762
Female  2      13     0     0      2073
Male    2      11     20    9193   2311
Male    1      16     18.5  0      1507
Female  4      4      34    15272  4181
Male    4      6      31    15735  9514
Female  3      15     13    11006  3910
Male    4      3      41.5  15597  8518
Male    3      12     0     11026  0
Male    1      15     0     0      0
2. What is the shape of the distribution of credit hours? Compute summary statistics and use StatCrunch to construct a histogram of the credit hour data. Include a short paragraph answering the question.
The histogram above shows that the distribution of credit hours is slightly left-skewed, with the largest group of students taking 15 to 17.5 credit hours.
The summary statistics for the 200 students support this: the mean (14.165 credit hours) is below the median (15 credit hours), which is what a longer left tail produces. The quartiles are Q1 = 12 and Q3 = 17, the minimum is 3 credit hours, and the maximum is 21 credit hours.
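The comparison of mean and median can be checked with a short Python sketch. It uses only the 15 rows listed in Question 1 as illustration, so its mean will not match the full-sample value of 14.165, and the mean-below-median rule is a rule of thumb rather than a formal test of skewness.

```python
import statistics

# Credit hours from the first 15 rows shown in Question 1
# (illustration only, not the full sample of 200 students).
hours = [9, 20, 21, 18, 15, 15, 13, 11, 16, 4, 6, 15, 3, 12, 15]

mean_hours = statistics.mean(hours)
median_hours = statistics.median(hours)

# Rule of thumb: a mean below the median suggests a longer left tail.
if mean_hours < median_hours:
    shape = "left-skewed"
elif mean_hours > median_hours:
    shape = "right-skewed"
else:
    shape = "roughly symmetric"

print(f"mean={mean_hours:.2f}, median={median_hours}, shape={shape}")
```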
3. Suppose you want to construct a 95% confidence interval for the mean credit hours taken by StatCrunch U students. What sample size would be needed to limit the margin of error to 0.5 credit hours? Use the sample standard deviation from your sample as an estimate of the population standard deviation. You will need to follow the example on page 266 of the text.
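s = sample standard deviation = 3.692 (standard error = s/√n = 0.261 for n = 200)
z = 1.96 for 95% confidence, E = margin of error = 0.5
n = (z × s / E)^2 = (1.96 × 3.692 / 0.5)^2 ≈ 209.5
Rounding up, n = 210
Using the formula from page 266, the sample size required to limit the margin of error to 0.5 credit hours at a 95% confidence level is about 210, slightly more than the 200 students in our sample.
From our table above, we can say that we are 95% confident that the mean credit hours is between 13.65 and 14.68 credit hours. That interval has a margin of error of about 0.515, which is consistent with needing slightly more than 200 students to reach 0.5.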
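The sample-size calculation can be sketched in Python using the standard formula n = (z × s / E)^2, rounded up. This is a minimal sketch, assuming the normal critical value is used and taking s = 3.692 from the work above; the function name `required_sample_size` is only illustrative.

```python
import math
from statistics import NormalDist

def required_sample_size(std_dev, margin, confidence=0.95):
    """Smallest n with z * std_dev / sqrt(n) <= margin (normal approximation)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * std_dev / margin) ** 2)

# s = 3.692 credit hours, target margin of error = 0.5 credit hours.
n_needed = required_sample_size(std_dev=3.692, margin=0.5)
print(n_needed)
```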
4. (a) What is the proportion of females at StatCrunch U? Create a pie chart showing the proportion of female students at StatCrunch U. Be sure to include a couple of sentences answering the question.
The proportion of females at StatCrunch U is approximately 57.5%.
(b) Does the proportion of females change across classes? Create a stacked bar chart and a contingency table to show how the proportion changes across classes. Make sure your bar chart shows proportions or percentages, not counts. Be sure to include a couple of sentences answering the question.
According to the stacked bar chart and the contingency table above, the proportion of females does change across classes.
In Class 1, 29 of 58 students (50.0%) are female, and in Class 2, 29 of 47 students (61.7%) are female.
In Class 3, 38 of 57 students (66.7%) are female, and in Class 4, 19 of 38 students (50.0%) are female.
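The class-by-class proportions can be reproduced from the counts in the contingency table. The sketch below uses only the counts reported above; the dictionary names are illustrative.

```python
# Female counts and class totals as reported from the contingency table.
females = {1: 29, 2: 29, 3: 38, 4: 19}
totals = {1: 58, 2: 47, 3: 57, 4: 38}

# Proportion of females within each class.
proportions = {cls: females[cls] / totals[cls] for cls in totals}

# Overall proportion of females across all 200 students.
overall = sum(females.values()) / sum(totals.values())

for cls, p in proportions.items():
    print(f"Class {cls}: {p:.1%}")
print(f"Overall: {overall:.1%}")
```

The overall value agrees with the 57.5% reported in part (a).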
5. Does the number of credit hours taken vary depending on whether or not students work? Create two boxplots on the same set of axes showing the number of credit hours taken by students who work and by students who do not work. Describe the distributions and any similarities or differences.
To do this in StatCrunch, you will first need to create an additional column in your data file that indicates whether or not a student works. You can do that in StatCrunch by using the Data>Bin column command as shown to the right and explained below. You will need to use the column you create here in problems 7, 8, and 10 also.
With your StatCrunch U file open, go to StatCrunch>Data>Bin Columns and complete the menu as shown to the right. Then click Calculate. This will create a new column titled Bin(Work) in the StatCrunch file. In that column working students will be labeled, “0.1 or above.” See below.
Go to StatCrunch>Graphics>Boxplot and set up the boxplot menu as shown to the right, and click Create Graph! Be sure to include the graph in your paper and also include a few sentences answering the question.
According to the boxplots above, students who do not work tend to take more credit hours than students who work.
6. Does the mean number of credit hours taken by all students appear to be significantly below 15? Use StatCrunch to conduct a one-sample t-test. Be sure to state the null and alternate hypotheses, include the output from StatCrunch, and briefly answer the question including justification for your answer.
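Null Hypothesis (Ho): The mean number of credit hours taken by all students is 15
Alternate Hypothesis (HA): The mean number of credit hours taken by all students is less than 15
According to the table above, the P-value is 0.0008, which is well below 0.05. We therefore reject the null hypothesis in favor of the alternate hypothesis: the mean number of credit hours taken by all students appears to be significantly below 15.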
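The test statistic behind this result can be sketched from the summary values reported earlier (mean 14.165, standard error 0.261). This is only a rough check: it assumes the normal distribution is a close stand-in for the t distribution with 199 degrees of freedom, so the approximate P-value will be slightly smaller than the StatCrunch value.

```python
from statistics import NormalDist

sample_mean = 14.165   # mean credit hours from the sample
std_error = 0.261      # standard error of the mean (s / sqrt(n), n = 200)
mu_0 = 15              # value of the mean under the null hypothesis

t_stat = (sample_mean - mu_0) / std_error

# Assumption: with 199 degrees of freedom the t distribution is close to the
# standard normal, so the normal CDF approximates the left-tailed P-value.
approx_p = NormalDist().cdf(t_stat)

print(f"t = {t_stat:.3f}, approximate left-tailed p = {approx_p:.4f}")
```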
7. For students who work, are there differences in the average loan amounts across classes? Use StatCrunch to conduct an ANOVA test to answer this question. Be sure to state the null and alternate hypotheses, include the output from StatCrunch, and briefly answer the question.
To set up this problem in StatCrunch, you will need to use the Bin(Work) column you created in Problem 5. Go to STAT>ANOVA>One Way, and complete the menu as shown to the right.
Null Hypothesis (Ho): The average loan amount is the same across all classes
Alternate Hypothesis (HA): The average loan amount is not the same across all classes
According to the table above, the P-value is <0.0001, so we reject the null hypothesis in favor of the alternate hypothesis. For students who work, the average loan amount is not the same across all classes.
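For reference, the F statistic that a one-way ANOVA reports can be sketched in plain Python. The loan values below are hypothetical and are not taken from the StatCrunch sample; they only show how the between-class and within-class variation are compared.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times the squared distance
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each value from its
    # own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical loan amounts (in $1000s) for working students in three classes.
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

A large F means the class means differ by more than the spread within classes would explain, which is what a very small P-value reflects.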
8. For students that work, is there a relationship between the dollar amount of loans they have and the number of hours per week that they work?
To set up this problem, you will need to use the Bin(Work) column you created in Problem 5. Open the StatCrunch U file and go to Stat>Regression>Simple Linear as shown to the right. In the menu that appears, select the x-variable, the y-variable, and complete the “Where” entry as shown. Include the StatCrunch output in your paper and be sure to briefly answer the question and explain your answer.
· Construct a scatter plot with Work Hours on the x-axis and Loan Amount on the y-axis.
· Compute the coefficient of correlation, the coefficient of determination, and the linear regression equation.
The coefficient of correlation is -0.2246831
The coefficient of determination is 0.050483494
The linear regression equation is: Loans = 11528.058 - 300.57454 Hours
· Does there appear to be a relationship between work hours and loan amount? Explain your answer.
Yes, there is a negative linear relationship between work hours and loan amounts, although it is weak (r is about -0.22). The more hours a student works per week, the smaller the amount of student loans they tend to have.
· Explain the meaning of the coefficient of determination as it applies specifically to this problem.
The coefficient of determination in our table is about 0.05 (5%), which means that only about 5% of the variation in “Loans” (y-axis) is explained by “Hours” (x-axis).
· Explain the meaning of the slope in the regression equation as it applies specifically to this problem. Is this the relationship you expected? Explain. If it isn’t what you expected, explain why it might have occurred.
Loans = 11528.058 - 300.57454 Hours
The slope in the regression equation (Loans = 11528.058 - 300.57454 Hours) means that, for each additional hour worked per week, the predicted loan amount decreases by approximately $300.57.
This is the relationship I expected, because students who work more hours do not need to take out as much in loans as students who work fewer hours.
· Does a linear model appear to be appropriate for this comparison? Explain.
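Yes, because the regression model shows a negative linear trend: students who work longer hours tend to take out smaller loans, and vice versa. However, with a coefficient of determination of only about 0.05, the line explains little of the variation in loan amounts.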
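The regression results above can be turned into a small Python sketch that shows what the slope and the coefficient of determination mean in practice. The coefficients are copied from the StatCrunch output; the function name `predicted_loans` and the example hours (10 and 20) are only illustrative.

```python
INTERCEPT = 11528.058   # predicted loans at 0 work hours, from the output
SLOPE = -300.57454      # change in predicted loans per additional work hour
R = -0.2246831          # coefficient of correlation

def predicted_loans(hours):
    """Predicted loan amount (dollars) for a given number of weekly work hours."""
    return INTERCEPT + SLOPE * hours

# Share of the variation in loans explained by work hours.
r_squared = R ** 2

print(f"10 hours: ${predicted_loans(10):,.2f}")
print(f"20 hours: ${predicted_loans(20):,.2f}")
print(f"r^2 = {r_squared:.4f}")
```

Each extra hour lowers the prediction by the same amount (the slope), while the small r^2 shows that individual students can sit far from the line.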
9. For students with credit card debt, does there appear to be a difference in the mean amount of credit card debt based on gender of the student? Conduct an appropriate two-sample t-test. Be sure to state the null and alternate hypotheses, include the output from StatCrunch, and briefly answer the question.
NOTE: You will need to use the following in the Where blocks in StatCrunch for this test as shown to the right: Gender=Male and “CC Debt”>0, Gender=Female and “CC Debt”>0.
The null hypothesis is that, for students with credit card debt, there is no difference in mean credit card debt between male and female students. The alternate hypothesis is that there is a difference in mean credit card debt between male and female students.
According to the table above, the P-value is 0.047, which is below 0.05, so we reject the null hypothesis in favor of the alternate hypothesis. Therefore, mean credit card debt shows a significant difference between genders.
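The calculation behind a two-sample t-test can be sketched from summary statistics. This sketch assumes the unpooled (Welch) form of the test; whether StatCrunch pooled the variances for this output is not stated here. The numbers passed in are hypothetical and are not the credit card debt figures from the sample.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a two-sample t-test without pooled variances."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t_stat = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df

# Hypothetical summary statistics, NOT taken from the StatCrunch output.
t_stat, df = welch_t(mean1=10, sd1=2, n1=4, mean2=8, sd2=2, n2=4)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```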
10. Is there evidence of a relationship between class and whether or not students work? Conduct a chi-square test of independence. State the null and alternate hypotheses, include StatCrunch output, and explain the results of your test.
To set up this problem, you will need to use the Bin(Work) column you created in Problem 5. Open the StatCrunch U file and go to Stat>Tables>Contingency>With Data and complete the menu as shown to the right.
The null hypothesis is that there is no relationship between class and whether or not students work. The alternate hypothesis is that there is a relationship between class and whether or not students work.
According to the table above, the P-value of our chi-square test is 0.634, which is well above 0.05, so we fail to reject the null hypothesis. There is no evidence of a relationship between class and whether or not students work.
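The chi-square statistic for a test of independence can be sketched in plain Python. The counts below are hypothetical: the row totals match the class sizes from Question 4, but the split between working and non-working students is made up for illustration and is not the StatCrunch output.

```python
def chi_square_statistic(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts (rows: classes 1-4; columns: works, does not work).
example_table = [[30, 28], [26, 21], [31, 26], [22, 16]]
chi2, df = chi_square_statistic(example_table)
print(f"chi-square = {chi2:.3f}, df = {df}")
```

With 4 classes and 2 work categories the test has 3 degrees of freedom, and a statistic below the 0.05 critical value of about 7.815 corresponds to a P-value above 0.05.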