Mid-term Exam, summer 2019
ITS632 – Introduction to Data Mining

Instructions: You must show all of your calculations.

Ghost Map

1. (1 point) Write a one-sentence summary of how John Snow used his crude form of data mining to conclude that the Broad Street well was the source of cholera.

Type of data and measurement scales

2. (8 points) Use the four data and measurement scales on the left to categorize the descriptions on the right.

   a. nominal     ____ customer service rating from 1 to 5
   b. ordinal     ____ gender: male or female
   c. interval    ____ today’s low temperature is 50F and today’s high is 75F
   d. ratio       ____ hair color such as black, brown, red
                  ____ he is 6 feet tall
                  ____ pain level from 1 to 10
                  ____ average age in the course is 24.3
                  ____ he ran the mile in exactly 4 minutes

Scatter diagram

3. (1 point) Using one sentence, explain the correlation between the number of beach visitors and the average daily temperature.

4. (3 points) Gini Index
   Use the following table to calculate your answers to the three questions below.

   a. What is the Gini Index for Home Owners?
   b. What is the Gini Index for non-Home Owners?
   c. Compute the weighted average Gini Index for the Home Owner type.

5. (1 point) Bayes Theorem
   Probability of a dangerous fire = 1%
   Probability of smoke (fairly common, mainly due to barbecues) = 10%
   Probability of smoke when there is a dangerous fire = 90%
   Calculate the probability of a dangerous fire when there is smoke.

6. (6 points) Decision Trees

   a. Examine the following dataset. If a datapoint with an x coordinate = 3 is added, what color would the datapoint be?
   b. Given the following dataset, write rules for each color of datapoints.
      1) green datapoints
      2) red datapoints
      3) blue datapoints
   c. Calculate the Gini impurities for the following imperfect split.
      1) Left =
      2) Right = ...
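
The arithmetic behind the Gini Index and Bayes Theorem questions can be checked with a short Python sketch. It assumes the Gini Index of a group is 1 minus the sum of squared class proportions, that the weighted average weights each group by its share of the records, and that the 90% figure is read as the probability of smoke given a dangerous fire. The class counts are hypothetical placeholders, since the exam's tables and datasets are not reproduced here, and the function names are illustrative only.

```python
def gini_impurity(class_counts):
    """Gini impurity of one group: 1 - sum of squared class proportions."""
    total = sum(class_counts)
    if total == 0:
        return 0.0  # an empty group is treated as pure (assumption)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


def weighted_gini(groups):
    """Average the impurity of each group, weighted by group size."""
    n = sum(sum(group) for group in groups)
    return sum(sum(group) / n * gini_impurity(group) for group in groups)


def posterior(prior, likelihood, evidence):
    """Bayes Theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return likelihood * prior / evidence


# Hypothetical class counts per group, NOT taken from the exam's table.
left = [3, 1]   # e.g. 3 records of one class, 1 of another
right = [0, 4]  # a pure group

print(gini_impurity(left))           # 0.375
print(gini_impurity(right))          # 0.0
print(weighted_gini([left, right]))  # 0.1875

# Question 5 values: P(fire) = 0.01, P(smoke | fire) = 0.90, P(smoke) = 0.10
print(posterior(prior=0.01, likelihood=0.90, evidence=0.10))  # about 0.09
```

Substituting the counts from the exam's own table into `gini_impurity` and `weighted_gini` gives the values asked for in questions 4 and 6c.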