Case Study 1 - Data Visualization and Descriptive Statistics
The data file Home_Values.xlsx contains median home values ( Home Value ), median household income ( HH Inc ), median per capita ( Per Cap Inc ) and percent of homes that are owner occupied ( Pct Owner Occ ) for each state and the District of Columbia. Prior to a more detailed analysis of the data, a company wants to get a good understanding of the 4 variables (e.g. central tendency, variability, shape of the distribution, pattern of relationship between the variables). A company representative contracts with you to help with this process. To help the company get a better understanding of the data, you are asked to perform the following analysis steps:
Using Data>Data Analysis>Descriptive Statistics in Excel, calculate the mean, median, range and standard deviation of each variable and summarize the results in table.
Using Excel, create a frequency histogram for each variable to determine the shape of the distributions. Be sure to give each chart a title and label the axes clearly.
Using Excel, create boxplots for each variable. Be sure to give each chart a title and label the axes clearly.
Using Excel, create scatterplots of each variable with each other variable (hint: you should have 6 scatterplots). Be sure to give each chart a title and label the axes clearly.
Using Data>Data Analysis>Correlation in Excel, calculate the correlation coefficient each variable with each other variable.
In Word, write a summary report of the findings that includes the tables and charts from steps 1-5 and includes the following:
An introductory paragraph summarizes the purpose of the analysis.
A section (1 or more paragraphs) describing what the tabular data from step 1 indicate about the central tendency, variability and distribution of each variable. For example, do the variables appear to be distributed in a symmetric or skewed pattern.
A section (1 or more paragraphs) describing how the frequency histograms from step 2 and the boxplots from step 3 support and clarify the findings of the tabular data. Include in this section any evidence suggesting outliers in the data.
A section (1 or more paragraphs) describing what the scatterplots from step 4 and correlations from step 5 indicate about the relationship between the various pairs of variables (e.g., are the variables related?, does the relationship appear to be linear or nonlinear?, is the direction of the relationship positive or negative?).
A concluding paragraph summarizing the key findings of the analysis and making recommendations for the variable among HH Inc , Per Cap Inc and Pct Owner Occ that is most strongly correlated with Home Value .
Note: Submit a single Excel workbook showing all work for steps 1-5 and a Word document of your summary report that addresses all parts step 6. Make sure to clearly label your tables and charts.