Part 1 - 250 words
Create a discussion thread (with your name) and answer the following question:
Discussion (Chapter 4): What are the privacy issues with data mining? Do you think they are substantiated?
Your response should be 250-300 words.
Part 2 - 200 words
Reply to 2 students, 100 words each
Part 3 - APA paper, 300 words
Chapter 4 – discussion questions #1-5 & exercise 1
1. Examine how new data capture devices such as RFID tags help organizations accurately identify and segment their customers for activities such as targeted marketing. Many of these applications involve data mining. Scan the literature and the Web and then propose five potential new data mining applications that can use the data created with RFID technology. What issues could arise if a country’s laws required such devices to be embedded in everyone’s body for a national identification system?
2. Interview administrators in your college or executives in your organization to determine how data mining, data warehousing, Online Analytical Processing (OLAP), and visualization tools could assist them in their work. Write a proposal describing your findings. Include cost estimates and benefits in your report.
3. A very good repository of data that has been used to test the performance of many data mining algorithms is available at ics.uci.edu/~mlearn/MLRepository.html. Some of the data sets are meant to test the limits of current machine-learning algorithms and to compare their performance with new approaches to learning. However, some of the smaller data sets can be useful for exploring the functionality of any data mining software, such as RapidMiner or KNIME. Download at least one data set from this repository (e.g., Credit Screening Databases, Housing Database) and apply decision tree or clustering methods, as appropriate. Prepare a report based on your results. (Some of these exercises, especially the ones that involve large or challenging data sets or problems, may be used as semester-long term projects.)
4. Large and feature-rich data sets are made available by the U.S. government or its subsidiaries on the Internet. For instance, see a large collection of government data sets (data.gov), the Centers for Disease Control and Prevention data sets (www.cdc.gov/DataStatistics), Cancer.org’s Surveillance, Epidemiology, and End Results data sets (http://seer.cancer.gov/data), and the Department of Transportation’s Fatality Analysis Reporting System crash data sets (www.nhtsa.gov/FARS). These data sets are not preprocessed for data mining, which makes them a great resource to experience the complete data mining process. Another rich source for a collection of analytics data sets is listed on KDnuggets.com (KDnuggets.com/datasets/index.html).
Exercise 1. Visit the AI Exploratorium at cs.ualberta.ca/~aixplore. Click the Decision Tree link. Read the narrative on basketball game statistics. Examine the data, and then build a decision tree. Report your impressions of its accuracy. Also explore the effects of different algorithms.
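
For anyone who wants to see what "build a decision tree and report its accuracy" looks like in code, here is a minimal sketch in plain Python. It assumes an ID3-style tree that splits on information gain, which may differ from the algorithms offered by the AI Exploratorium, RapidMiner, or KNIME. The game records, the feature names (location, opponent), and every function name are hypothetical and made up for illustration; they are not taken from the AI Exploratorium data.

```python
import math
from collections import Counter

# Hypothetical toy records (NOT the AI Exploratorium data):
# (location, opponent strength, game outcome)
ROWS = [
    ("home", "weak", "win"),
    ("home", "strong", "win"),
    ("home", "weak", "win"),
    ("away", "weak", "win"),
    ("away", "strong", "loss"),
    ("away", "strong", "loss"),
    ("home", "strong", "loss"),
    ("away", "weak", "win"),
]
FEATURES = {"location": 0, "opponent": 1}  # feature name -> column index
LABEL = 2  # column index of the outcome


def entropy(rows):
    """Shannon entropy (in bits) of the outcome labels in rows."""
    counts = Counter(r[LABEL] for r in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, col):
    """Entropy reduction obtained by splitting rows on column col."""
    total = len(rows)
    remainder = 0.0
    for value in {r[col] for r in rows}:
        subset = [r for r in rows if r[col] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(rows) - remainder


def build_tree(rows, features):
    """Return a leaf label or a nested dict describing a split."""
    labels = Counter(r[LABEL] for r in rows)
    majority = labels.most_common(1)[0][0]
    # Stop when the node is pure or no features are left to split on.
    if len(labels) == 1 or not features:
        return majority
    best = max(features, key=lambda name: information_gain(rows, features[name]))
    if information_gain(rows, features[best]) <= 0:
        return majority
    col = features[best]
    remaining = {k: v for k, v in features.items() if k != best}
    branches = {
        value: build_tree([r for r in rows if r[col] == value], remaining)
        for value in {r[col] for r in rows}
    }
    return {"feature": best, "branches": branches, "default": majority}


def predict(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, dict):
        col = FEATURES[tree["feature"]]
        tree = tree["branches"].get(row[col], tree["default"])
    return tree


def accuracy(tree, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(tree, r) == r[LABEL] for r in rows) / len(rows)


tree = build_tree(ROWS, FEATURES)
train_accuracy = accuracy(tree, ROWS)
print(tree)
print(train_accuracy)
```

Note that the accuracy printed here is measured on the same records used to build the tree, so it is likely optimistic. When reporting impressions of accuracy, it is usually more informative to hold out part of the data and score the tree on records it has not seen.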