Complete the following assignment in one MS Word document:
Chapter 5 –
Discussion Question #1 Go to the Teradata University Network Web site (teradatauniversitynetwork.com) or a URL given by your instructor. Locate Web seminars related to data mining and neural networks. Specifically, view the seminar given by Professor Hugh Watson at the SPIRIT2005 conference at Oklahoma State University; then, answer the following questions:
a. Which real-time application at Continental Airlines might have used a neural network?
b. What inputs and outputs can be used in building a neural network application?
c. Given that its data mining applications are in real time, how might Continental implement a neural network in practice?
d. What other neural network applications would you propose for the airline industry?
Discussion Question #2 Go to the Teradata University Network Web site (teradatauniversitynetwork.com) or a URL given by your instructor. Locate the Harrah’s case. Read the case and answer the following questions:
a. Which of the Harrah’s data applications are most likely implemented using neural networks?
b. What other applications could Harrah’s develop using the data it collects from its customers?
c. What are some concerns you might have as a customer at this casino?
Discussion Question #3 A bankruptcy-prediction problem can be viewed as a problem of classification. The data set you will be using for this problem includes five ratios that have been computed from the financial statements of real-world firms. These five ratios have been used in studies involving bankruptcy prediction. The first sample includes data on firms that went bankrupt and firms that did not. This will be your training sample for the neural network. The second sample of 10 firms also consists of some bankrupt firms and some nonbankrupt firms. Your goal is to use neural networks, SVM, and nearest neighbor algorithms to build a model using the first 20 data points and then to test its performance on the other 10 data points. (Try to analyze the new cases yourself manually before you run the neural network and see how well you do.) The following tables show the training sample and test data you should use for this exercise.
Training Sample
Firm WC/TA RE/TA EBIT/TA MVE/TD S/TA BR/NB
1 0.1650 0.1192 0.2035 0.8130 1.6702 1
2 0.1415 0.3868 0.0681 0.5755 1.0579 1
3 0.5804 0.3331 0.0810 1.1964 1.3572 1
4 0.2304 0.2960 0.1225 0.4102 3.0809 1
5 0.3684 0.3913 0.0524 0.1658 1.1533 1
6 0.1527 0.3344 0.0783 0.7736 1.5046 1
7 0.1126 0.3071 0.0839 1.3429 1.5736 1
8 0.0141 0.2366 0.0905 0.5863 1.4651 1
9 0.2220 0.1797 0.1526 0.3459 1.7237 1
10 0.2776 0.2567 0.1642 0.2968 1.8904 1
11 0.2689 0.1729 0.0287 0.1224 0.9277 0
12 0.2039 -0.0476 0.1263 0.8965 1.0457 0
13 0.5056 -0.1951 0.2026 0.5380 1.9514 0
14 0.1759 0.1343 0.0946 0.1955 1.9218 0
15 0.3579 0.1515 0.0812 0.1991 1.4582 0
16 0.2845 0.2038 0.0171 0.3357 1.3258 0
17 0.1209 0.2823 -0.0113 0.3157 2.3219 0
18 0.1254 0.1956 0.0079 0.2073 1.4890 0
19 0.1777 0.0891 0.0695 0.1924 1.6871 0
20 0.2409 0.1660 0.0746 0.2516 1.8524 0
Describe the results of the neural network, SVM, and nearest neighbor model predictions, including software, architecture, and training information.
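For the nearest neighbor part of this question, one possible starting point is sketched below. It is not a prescribed solution: it uses only the Python standard library, and the choices of k = 3, z-score scaling, Euclidean distance, and leave-one-out validation on the 20 training firms are assumptions made for illustration. The function names are hypothetical. The labels are copied from the BR/NB column as given, and the 10 test firms are not included because they are not listed here.

```python
import math

# Training sample copied from the table above.
# Each entry: ((WC/TA, RE/TA, EBIT/TA, MVE/TD, S/TA), BR/NB label)
TRAIN = [
    ((0.1650, 0.1192, 0.2035, 0.8130, 1.6702), 1),
    ((0.1415, 0.3868, 0.0681, 0.5755, 1.0579), 1),
    ((0.5804, 0.3331, 0.0810, 1.1964, 1.3572), 1),
    ((0.2304, 0.2960, 0.1225, 0.4102, 3.0809), 1),
    ((0.3684, 0.3913, 0.0524, 0.1658, 1.1533), 1),
    ((0.1527, 0.3344, 0.0783, 0.7736, 1.5046), 1),
    ((0.1126, 0.3071, 0.0839, 1.3429, 1.5736), 1),
    ((0.0141, 0.2366, 0.0905, 0.5863, 1.4651), 1),
    ((0.2220, 0.1797, 0.1526, 0.3459, 1.7237), 1),
    ((0.2776, 0.2567, 0.1642, 0.2968, 1.8904), 1),
    ((0.2689, 0.1729, 0.0287, 0.1224, 0.9277), 0),
    ((0.2039, -0.0476, 0.1263, 0.8965, 1.0457), 0),
    ((0.5056, -0.1951, 0.2026, 0.5380, 1.9514), 0),
    ((0.1759, 0.1343, 0.0946, 0.1955, 1.9218), 0),
    ((0.3579, 0.1515, 0.0812, 0.1991, 1.4582), 0),
    ((0.2845, 0.2038, 0.0171, 0.3357, 1.3258), 0),
    ((0.1209, 0.2823, -0.0113, 0.3157, 2.3219), 0),
    ((0.1254, 0.1956, 0.0079, 0.2073, 1.4890), 0),
    ((0.1777, 0.0891, 0.0695, 0.1924, 1.6871), 0),
    ((0.2409, 0.1660, 0.0746, 0.2516, 1.8524), 0),
]


def column_stats(rows):
    """Return per-column means and (population) standard deviations."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 if a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return means, stds


def standardize(row, means, stds):
    """Z-score one feature vector so no single ratio dominates the distance."""
    return tuple((v - m) / s for v, m, s in zip(row, means, stds))


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest training points (use an odd k)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes_for_one = sum(label for _, label in nearest)
    return 1 if votes_for_one * 2 > k else 0


def leave_one_out_accuracy(data, k=3):
    """Hold out each firm in turn, fit scaling on the rest, and score it."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        means, stds = column_stats([row for row, _ in rest])
        scaled = [(standardize(row, means, stds), lab) for row, lab in rest]
        hits += knn_predict(scaled, standardize(x, means, stds), k) == y
    return hits / len(data)


if __name__ == "__main__":
    print("Leave-one-out accuracy (k=3):", leave_one_out_accuracy(TRAIN, k=3))
```

To score the 10 test firms, the same `knn_predict` call could be applied to each test row after standardizing it with the means and standard deviations computed from all 20 training firms. Leave-one-out validation is used here only because 20 points are too few to split further.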
Discussion Question #4 The purpose of this exercise is to develop models to predict the type of forest cover using a number of cartographic measures. The given data set (see Online Supplements) includes four wilderness areas found in the Roosevelt National Forest of northern Colorado. A total of 12 cartographic measures were utilized as independent variables; seven major forest cover types were used as dependent variables. This is an excellent example of a multi-class classification problem. The data set is rather large (with 581,012 unique instances) and feature rich. As you will see, the data are also raw and skewed (unbalanced for different cover types). As a model builder, you are to make necessary decisions to preprocess the data and build the best possible predictor. Use your favorite tool to build the models for neural networks, SVM, and nearest neighbor algorithms, and document the details of your results and experiences in a written report. Use screenshots within your report to illustrate important and interesting findings. You are expected to discuss and justify any decision that you make along the way.
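Because the cover types are unbalanced, two preprocessing decisions that may be worth documenting are how the data are split and whether the training portion is rebalanced. The sketch below illustrates one option for each: a stratified train/test split, followed by downsampling the training rows to the size of the rarest class. It is only an illustration under stated assumptions. The function names, the 20% test fraction, and the toy label list are made up for the example and are not taken from the actual data set. Other choices, such as oversampling or class weights, may suit a given tool better.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split in the same ratio."""
    by_class = defaultdict(list)
    for i, lab in enumerate(labels):
        by_class[lab].append(i)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        # Keep at least one test row per class when the class has 2+ rows.
        n_test = max(1, round(len(idx) * test_fraction)) if len(idx) > 1 else 0
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def downsample_to_smallest(indices, labels, seed=0):
    """Randomly keep the same number of rows per class (rarest class size)."""
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    n = min(len(v) for v in by_class.values())
    rng = random.Random(seed)
    kept = []
    for idx in by_class.values():
        kept.extend(rng.sample(idx, n))
    return sorted(kept)


# Made-up, deliberately skewed labels for illustration only (not the real data).
toy_labels = [1] * 50 + [2] * 30 + [3] * 10 + [7] * 10

train_idx, test_idx = stratified_split(toy_labels, test_fraction=0.2)
balanced_train = downsample_to_smallest(train_idx, toy_labels)

print("test set class counts:     ", Counter(toy_labels[i] for i in test_idx))
print("balanced train class counts:", Counter(toy_labels[i] for i in balanced_train))
```

Only the training rows are rebalanced here. The test rows keep the original class proportions, so the reported accuracy reflects the skew a deployed model would actually face.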
Exercise #6 Go to neoxi.com. Identify at least two software tools that have not been mentioned in this chapter. Visit the Web sites of those tools and prepare a brief report on their capabilities.
Internet Exercise #7 Go to neuroshell.com. Look at the Gee Whiz examples. Comment on the feasibility of achieving the results claimed by the developers of this neural network model. (The Gee Whiz example is no longer on the page. Instead, go to neuroshell.com, click on the examples, and review the examples currently listed.)