CSCI 680/490 Tpc in Computer Science: Algorithm Design & Analysis Assignment 2 10 pts
You have learned the data structure of red-black tree, which is a height balanced binary search tree. In this assignment, you will implement this data structure using a certain programming language based on the course survey during the first lecture. Please look up your assignment in Table 1. You will also compare the performance of your implementation (of the red-black tree) with a library class which implements a red-black tree. You are required to write a report about the result and running time of your program. You need to submit your source file(s) to Blackboard, and your report on the due day in class. See requirements below.
1. Your program runs on the Linux server (hopper/turing) at the Computer Science Department.
2. Implementation of the data structure -- Provide at least the following public member methods for the red-black tree class:
Insertion: insert a value in the tree. If the value already exists in the tree, this operation has no effect on the tree.
Search : return true if a given value exists in the tree; otherwise false. Height : return the height of the tree. The height of an empty tree is -1. Size: return the number of elements in the tree. The size of an empty tree is 0.
1) It is best to design a system of base and derived classes of binary tree (Height, Size), binary search tree (Search, Binary-Search-Tree-Insertion), and red-black tree (Red-Black-Tree-Insertion) classes. But it is not required to do so. You will also need to design your Node class(es) accordingly.
2) You may find it relatively easy to implement most of above methods in a recursive fashion.
3. The library class that implements a red-black tree: C++ user: STL set Java user: java.util.TreeSet Python user: There is no built-in library class in python. You may need to install a separate package. Please contact the instructor if you want to switch to another programming language.
4. About the program: 1) The driver program takes two command line arguments in form of “[N=n] [L=Y/N]”. The first
argument specifies the number of elements to be inserted in the tree, i.e. the number of integers. You will use values 100, 1000, 10000, 100000, 1000000, … etc. to test your program. The second argument specifies whether to use the library implementation of the red black tree (e.g. STL set in C++). ‘Y’ means yes; ‘N’ means no. “[]” indicates an argument is optional. By default, N is 1000, L is ‘N’. For example, invoking the following command
./your-program N=100000 L=Y
will generate 100000 integers to be added in the tree provided by the library. And the command
2
./your-program
will generate 1000 integers to be stored in the tree that you implement. I assumed a C++ program in above examples. If yours is a Java or Python program, you need to use proper command.
2) Preparation of your input data: For the command line argument N, produce two collections of data, each has N random integers in the range of [1, 2N]. These two collections of random integers need to be generated by different seed values of the random generator (e.g. srand() and rand() for C++). One collection of integers (referred to as tree-data) is to be inserted into the red black tree. One collection of integers (referred to as search-data) is to be searched in the red black tree. Note that you will get the same sequence of random numbers given a same seed value for the random number generator. You want to use the same seed value for all tree-data, and the same seed value for all search-data to compare the performance of your implementation and the library class of the red-black tree.
3) Implementation of the program: a. Produce the tree-data and search-data described above; b. Create an instance of the red-black-tree (your own implementation or the library class,
depending on the command line argument); c. Insert all tree-data to the red-black-tree; d. Print out the height of the tree and the size of the tree. Since there may be duplicated values
from the random generator, this value of size may be less than your input N. If the class library does not support certain operation, you can skip it. For example, the STL set does not have public method to display height. Do not print out the values of whole tree. (But you can print out these values during testing and debugging stage of your work.)
e. Search each value from the search-data in the red-black-tree, count the number of successful searches, and divide it by the total number of searches. Print the percentage of successful searches.
4) When running your program, use the time command to record time spent on the program. E.g.
time ./your-program N=100000
Again, this example assumes a C++ program. If yours is a Java or Python program, you need to use a proper command to run your program. When N is small, your program will finish in a very short time. You can gradually increase your input size to test your program. If your program has run for over a few seconds, kill the program and stop increasing the size.
5) Documentation: The documentation standards are largely consistent with CSCI 241. Refer to this site for details: http://faculty.cs.niu.edu/~mcmahon/CS241/241DocStandards.html. In the documentation box on top of your program, indicate how to compile and/or run your program from the command line.
5. Report: Summarize your discovery about the data structure you have implemented in respect to the relationship between the height and the problem size, the relationship between time cost and problem size, and compare the result and performance of your implementation and the library class. Your report needs to be typewritten. The next page contains a form you can use.
http://faculty.cs.niu.edu/%7Emcmahon/CS241/241DocStandards.html
3
CSCI 680/490 Tpc in Computer Science: Algorithm Design & Analysis Assignment 2 Report Name: ___________________________ Zid: ___________________________ Programming Language: __________________________________________________ The library class implementing a red-black tree you used (e.g. STL set for C++): _____________ Command to compile the program (if applicable): ______________________________________ Command to run the program (Give an example): _____________________________________
Size N Library (Y/N)
Tree size Tree height
Percentage successful search (%)
System time User time Elapsed time
Summary of your observation: ____________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
4
Table 1. Assignments of programming languages.
________________________________________________________________________________
First Name Last Name Programming Language
Srikar
Akula
Java Yasser
Algasem
C++
Harika
Bandi
Python Magda
Baniukiewicz Python
Terina
Burr
C++ Karishma
Carvalho
Nithin
Devang
Python Bhaskara Reddy Devarapalli Java Deva Nag Avaneesh
Divvela Java
Robert
Durham
Java Timothy
Gehant
Java
Nicholas
Guerrero
Java Vishrant Krishna
Gupta Java
Brian
Homerding C++ Lin
Jia
C++
Bharat
Kale
C++ Kyle
Koschak
C++
Satyanarayana Kotha
Python Andrew
Krutke
C++
Eric
Lavin
Python Pradeep
Maddipatla
Logan
McMillin
C++ Shoeb
Mohammed Java
Nicole
Myers
C++ Marcus
Nguyen
Python
Syed
Rizvi
C++ Robert
Ruschke
C++
Aditya
Sabbineni C++ Nicholas
Sattler
C++
Rajarshi
Sen
Java James
Sheputis
C++
Tejasvi
Srigiriraju C++ Randall
Suvanto
Java
Sai
Vemuri
Java Jagadeesh Vinnakota Python Reid
Wixom
Python
Saiteja Yagni Java