Instructions:
Congratulations. You are a finalist for a data analyst position with a Major League Baseball (MLB) team. As you prepare for the final round of interviews, you've been asked to use the above data set to create a series of analytics and dashboards that show how well the team is doing on two important KPIs: home-game attendance and salaries.
Within the MLB, the San Francisco Giants are in:
League = National League
Division = West (W)
The intended audience for this dashboard is the Director of Analytics.
Limitations: Clearly, this project is limited in the scope of its data. In a real-world setting there would be ticket sales, customer demographic information, television viewership ratings, social media mentions / hits, and a whole host of additional data to churn through. But realistically, as with any project, it's good to start with one piece of the puzzle at a time, and in sequence. So consider this an initial step in what could be a much larger project.
Two files are needed for this submission: your Power BI dashboard file and the answers to the questions below (in a Word document).
Broadly speaking, this project's learning outcomes include:
· Assigning KPIs
· Trend analysis
· Comparative analysis
· Creating columns and measures
· Creating relationships between multiple data sources
· Creating the best visualization to appropriately show the data
Hint: Use the TeamsMostRecent table as the central table that all others relate to. Connect only Salaries to Team_Statistics and Team_Statistics to TeamsMostRecent; unnecessary relationships would create a circular dependency in your design.
Hint 2: You will need to create a new column to join the Salaries and Team_Statistics tables together. Which two (or more) fields combine into a unique identifier that exists in both of these tables? You will need to use this new field to join the tables.
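The idea behind a combined key column can be sketched outside Power BI in a few lines of Python. This is only an illustration under stated assumptions: the column names (teamID, yearID, salary, W), the choice of fields in the key, and every value below are hypothetical and are not taken from the data set.

```python
# Hypothetical sample rows: column names and values are illustrative
# assumptions, not the real Salaries / Team_Statistics tables.
salaries = [
    {"teamID": "AAA", "yearID": 2013, "playerID": "p1", "salary": 500_000},
    {"teamID": "AAA", "yearID": 2013, "playerID": "p2", "salary": 750_000},
    {"teamID": "AAA", "yearID": 2012, "playerID": "p1", "salary": 450_000},
]
team_stats = [
    {"teamID": "AAA", "yearID": 2012, "W": 70},
    {"teamID": "AAA", "yearID": 2013, "W": 80},
]


def make_key(row):
    """Concatenate two fields into a single text key (assumed key fields)."""
    return f"{row['teamID']}-{row['yearID']}"


# Index the "one" side of the relationship by the combined key.
stats_by_key = {make_key(r): r for r in team_stats}

# The key must be unique on this side, otherwise the join is ambiguous.
if len(stats_by_key) != len(team_stats):
    raise ValueError("combined key is not unique in team_stats")

# Many salary rows map to one team-season row through the same key.
joined = [
    {**s, "key": make_key(s), "W": stats_by_key[make_key(s)]["W"]}
    for s in salaries
    if make_key(s) in stats_by_key
]
```

The uniqueness check mirrors what a one-to-many relationship needs: the key may repeat on the Salaries side but has to identify exactly one row on the other side.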
Analytics portion:
1. Get a sense of the data to start. Create a matrix that has every ball club, each year (2006-2014) and total games played. This will allow you to see if there are any significant gaps in the data. Are there? Explain.
2. a. Choose the most appropriate visualization to show the total attendance for the team from 2006 - 2014. What is the trend?
b. Choose the most appropriate visualization to show the total attendance for each year and each club in the team's division for 2006 - 2014. What is the trend for the team? Which team came closest to surpassing them in attendance and in what year?
c. Choose the most appropriate visualization to show how the team's attendance average (combined for all years, 06-14) compares with the attendance average of all other teams in the League. Sort by average attendance in descending order (most to least). How are they ranked? Overall, are their attendance numbers considered "good" or "bad"? How do you know?
3. Plot all stadium addresses on a map. Include the club names in the legend. Include in the plot the average attendance (for the size of the circles). How does the team compare with other clubs in the region? Between the visualizations you created in problems 2 and 3, which do you think best tells the "attendance" story and why?
4. a. Show the total salary for every ball club in 2013. (2014 salary information is incomplete for some clubs.) Which 5 ball clubs have the highest total salary? Which has the lowest? Where does the team reside in the salary "rankings"? Highlight where they rank.
b. Show what the team's salary has been like for the last 5 years. What is the trend?
c. Again for 2013, show the number of wins per ball club and highlight where the team ranks. Create a measure that calculates Salaries by Wins to measure how much each win "costs" based on player salary (total salary / total wins). Where does the team rank? Describe your findings.
5. a. Select the best visualization type to show the average salary per player for every ball club in 2013. Which club has the highest average salary? The lowest? Where does the team sit?
b. Create a slicer by ball club name and create an additional visualization that lists all ball players and their total salary. Filter on the team and sort by salary in descending order (highest to lowest). Who were the top 5 players by salary in 2013? What was the most common amount (mode) paid to the players?
6. Create your own visualization here that does not repeat any of the above questions. Describe what you are showing, the question being addressed, and the results of the analysis.
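For reference, the arithmetic behind the "Salaries by Wins" measure (total salary / total wins) and the mode can be sketched in Python. This is a minimal sketch, not the Power BI measure itself: the club codes, player names, salaries, and win totals are made-up placeholders.

```python
from collections import defaultdict
from statistics import multimode

# Made-up placeholder rows: (club, player, salary). Not real 2013 figures.
salary_rows = [
    ("AAA", "player1", 400_000),
    ("AAA", "player2", 400_000),
    ("AAA", "player3", 1_200_000),
    ("BBB", "player4", 900_000),
    ("BBB", "player5", 600_000),
]
# Made-up win totals per club for the same season.
wins = {"AAA": 80, "BBB": 50}

# Total salary per club.
total_salary = defaultdict(int)
for club, _player, salary in salary_rows:
    total_salary[club] += salary

# Cost per win = total salary / total wins (clubs with zero wins are skipped).
cost_per_win = {
    club: total_salary[club] / wins[club] for club in wins if wins[club]
}

# Rank clubs from most to least expensive per win.
ranking = sorted(cost_per_win, key=cost_per_win.get, reverse=True)

# Mode of the salaries on one club; multimode returns every tied value.
club_salaries = [s for club, _p, s in salary_rows if club == "AAA"]
most_common = multimode(club_salaries)
```

Using `multimode` rather than `mode` keeps ties visible, which matters if two salary amounts are equally common on a roster.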