Assg 13: Dictionaries and Hashing
COSC 2336 Spring 2019
April 18, 2019
Dates:
Due: Sunday May 05, by Midnight
Objectives
• More practice with using class templates
• Learn about implementing and using key/value pair Dictionary ab- straction
• Implement and learn about some basic hashing techniques, like mid- square hasing and quadratic probing for closed hashing schemes.
Description
In this assignment you will be implementing some basic mechanisms of a hash table to implement a Dictionary that uses hashing to store and search for items in its collection. You have been given many files for this assign- ment. I have provided an Employee class in "Employee.[hpp|cpp]" and a KeyValuePair class in "KeyValuePair.[hpp|cpp]". You will not need to make any changes to these files or classes, they should work as given for this as- signment.
You will be adding and implementing some member functions to the HashDictionary class. The initial "HashDictionary.[hpp|cpp]" file contains a constructor and destructor for a HashDictionary as well as some other accessors and operators already implemented that are used for testing.
You will be implementing a closed hash table mechanism using quadratic probing of the slots. You will also implement a version of the mid-square
1
hashing function described in our textbook (Shaffer section 9.4.3 on closed hashing mechinsms).
For this assignment you need to perform the following tasks.
1. Your first task is to implement methods to define the probe sequence for closed hasing. Add a member function named probe() to the HashDictionary class. Be aware that the HashDictionary class is a templatized on templates, thus when you implement the class methods you need to templatize the class methods correctly. You can look at the example implementations of size() and the con- structors to remind yourself how to do this correctly.
In any case, probe() is a member function that takes two parameters, a Key and an integer index value. We are not using secondary hashing (as described in our textbook) so the Key value will actually not be used in your function. However, keep it as a parameter as the gen- eral abstraction/API for the probe function should include it for cases where secondary hashing is used. probe() should be a const class member function, as calling it does not change the dictionary. Finally probe() will return an ineger as its result.
Your probe() funciton should implement a quadratic probing scheme as described in our Shaffer textbook section 9.4.3 on pg. 338. Use c1 = 1, c2 =2, c3 = 2 as the parameters for your quadratic probe (the tests of probe() assume your probe sequence is using these pa- rameter values for the quadratic function).
2. You second tasks is to implement a hash function for integer like keys using the described mid-square hasing function (Shaffer 9.4.1 Example 9.6 pg. 327). We will create a slight variation of this algorithm for our hashing dictionary. First of all the hash() member functions should take a Key as its only input parameter, and it will then return a regular int as its result. Since this is a hash function, the integer value should be in the range 0 - tableSize-1, so don’t forget to mod by the tableSize before returning your hash result.
hash() should work like this. First of all, you should square the key value that is passed in. Then, assuming we are working with a 32 bit int, we want to only keep the middle 16 bits of the square of the key to use for our hash. There are many ways to work with and get the bits you need, but most likely you will want to use C bitwise operators to do this. For example, a simple method to get the middle 16 bits is
2
to first mask out the upper 8 bits using the bitwise & operator (e.g. key & 0x00FFFFFF) will mask out the high order 8 bits to 0). Then once you have removed the upper most significant 8 bits, you can left shift the key by 8 bits, thus dropping out the lower least significant 8 bits (e.g. key >> 8). Performing a mask of the upper 8 bits and shifting out the lower 8 bits will result in you only retaining the middle 16 bits.
If your system is using 64 bit integers rather than 32 bit integers, perform the mid-square method but retain the middle 32 bits of the result. You can use the sizeof(int) method to determine how many bytes are in an int on your system. I will give a bonus point if you write your hash() function to correctly work for both 32 and 64 bit values by testing sizeof(int) and doing the appopriate work to get the middle bits. Again after you square the key and get the middle bits, make sure you modulo the result to get an actual has index in the correct range.
3. The third task is to add the insert()method to your HashDictionary so that you can insert new key/value pairs into the dictionary. insert() should take a constant Key reference and a constant Value reference as its input parameters (note that both of these parameters should be declared as const, and they should both be reference pa- rameters, so use the & to indicate they are passed by reference). Your insert() function does not return a result, so it will be a void func- tion.
The algorithm for insert is described in Shaffer 9.4.3 on pg. 334. You need to call and use the probe() and hash() funciton you created in the first 2 steps to correctly define/implement your closed hashing probe sequence. The basica algorithm is that you use hash() to de- termine the initial home slot, and probe() gives an offset you should add. Basically you have to search the hashTable using the probe se- quence until you find an empty slot. Once you find an empty slot, you should create a new instance of a KeyValuePair object, that contains the key and value that were provided as input param- eters to your insert() function. This KeyValuePair instance should then be inserted into the table at the location where you find the first empty slot on the probe sequence. Also don’t forget to update the valueCount paramemter of the HashDictionary class that keeps track of the number of items currently in the dictionary.
3
4. Finally you will also implement the find() method to search for a particular key in your dictionary. The find() member function taks a single Key parameter as input (it should be a const Key& reference parameter). The find() functin will return a Value as a result, which will be the Value of the record associated with the given Key if it was found in the dictionary, or an empty Value() object if it was not found.
The find() method uses the same probe sequence as insert() imple- mented by your probe() and hash() methods. So you should again search along the probe sequence, until you either find the key you were given to search for, or else find an empty slot. Then at the end, if you found the key in the hashTable you should return the value that corresponds to the key that was searched for. If the search failed and you found an empty slot on your probe sequence, you should instead return an empty Value() object, which is used as an indicator for a failed search.
In this assignment you will be given a lot of starting code. As usual, there is an "assg-13.cpp" file which contains commented out tests of the code/functions you are to write. You have been given and "Em- ployee.[hpp|cpp]" file containing a simple definition of a (non-templated) class/record that holds a few pieces of information about a theoretical Employee. We use this class to create a hash dictionary for testing with the employee id as the key, and the Employee record as the associated value in our Dictionary. You have also been given a template class in the file "KeyValuePair.[hpp|cpp]". This contains a templatized container to holding a key/value pair of items. You will not need to add any code or make any changes in the Employee or KeyValuePair class files.
You have also been given a "HashDictionary.[hpp|cpp]" file containing beginning defintions of a HashDictionary class. The member functions you need to add for this assignment should be added to these files.
Here is an example of the output you should get if your code is passing all of the tests and is able to run the simulation. You may not get the exact same statistics for the runSimulation() output, as the simulation is generating random numbers, but you should see similar values.
----- testing Employee record and KeyValuePair class ----------- test key: 42 test value: blue ( id: 3, Derek Harter, 1234 Main Street, Commerce TX, 12345.67 )
4
-------------- testing quadratic probing ----------------------- Newly created hash dictionary should be empty, size: 0 probe index: 0 returned probe value: 2 probe index: 1 returned probe value: 5 probe index: 5 returned probe value: 37
-------------- testing mid-square hashing ---------------------- Assuming 32 bit (4 byte) ints for these tests: 4 hash key: 3918 returned hash value: 1 hash key: 48517 returned hash value: 6 hash key: 913478 returned hash value: 5 hash key: 8372915 returned hash value: 4 counts for slot[0] = 143055 counts for slot[1] = 143040 counts for slot[2] = 143362 counts for slot[3] = 142399 counts for slot[4] = 142966 counts for slot[5] = 142658 counts for slot[6] = 142520
-------------- testing dictionary insertion -------------------- After inserting 4 employees: Slot: 0
Key : 362817371 Value: ( id: 362817371, Carol Black, 8913 FM 24 Cooper TX, 28.50 )
Slot: 1 Key : 438901234 Value: ( id: 438901234, Derek Harter, 123 Main St. Commerce TX, 58.23 )
Slot: 2 Key : 0 Value: ( id: 0, , , 0.00 )
Slot: 3 Key : 192834192 Value: ( id: 192834192, Alice White, 384 Bois'darc. Campbell TX, 45.45 )
Slot: 4
5
Key : 998439281 Value: ( id: 998439281, Bob Green, 92 Washington Apt. 5 Greenville TX, 16.00 )
Slot: 5 Key : 0 Value: ( id: 0, , , 0.00 )
Slot: 6 Key : 0 Value: ( id: 0, , , 0.00 )
-------------- testing dictionary search ----------------------- Search for id: 438901234
Found employee: ( id: 438901234, Derek Harter, 123 Main St. Commerce TX, 58.23 )
Search for id: 362817371 Found employee: ( id: 362817371, Carol Black, 8913 FM 24 Cooper TX, 28.50 )
Unsuccessful Search for id: 239481432 Found employee: ( id: 0, , , 0.00 )
Assignment Submission
A MyLeoOnline submission folder has been created for this assignment. You should attach and upload your completed "HashDictionary.[hpp|cpp]" source files to the submission folder to complete this assignment. You do not need to submit your "assg-13.cpp" file with the tests, nor the Employee or KeyVal- uePair files, since you should not have made changes to any of these (except to uncomment out the tests in assg-13.cpp). Please only submit the asked for source code files, I do not need your build projects, executables, project files, etc.
6
Requirements and Grading Rubrics
Program Execution, Output and Functional Requirements
1. Your program must compile, run and produce some sort of output to be graded. 0 if not satisfied.
2. (20 pts.) probe() member function implemented. Function is using quadratic probing as asked for, with correct values for c1, c2 and c3 parameters. Probe sequence appears correct and passes tests.
3. (20 pts.) hash() member function implemented correctly. Function implements the mid-square method as described. Function correctly uses only the 16 middle bits if system uses 32 bit integers.
4. (30 pts.) insert() member function implemented and working. Func- tion appears to be correctly generating probe sequence using the probe() and hash() functions. Items are correctly inserted into ex- pected location in the hash table.
5. (30 pts.) find()member function implemented and working. Function appears to be also correctly using the probe sequence in the same was as insert(). Function passes the expected tests.
Program Style
Your programs must conform to the style and formatting guidelines given for this class. The following is a list of the guidelines that are required for the assignment to be submitted this week.
1. Most importantly, make sure you figure out how to set your indentation settings correctly. All programs must use 2 spaces for all indentation levels, and all indentation levels must be correctly indented. Also all tabs must be removed from files, and only 2 spaces used for indentation.
2. A function header must be present for member functions you define. You must give a short description of the function, and document all of the input parameters to the function, as well as the return value and data type of the function if it returns a value for the member functions, just like for regular functions. However, setter and getter methods do not require function headers.
7
3. You should have a document header for your class. The class header document should give a description of the class. Also you should doc- ument all private member variables that the class manages in the class document header.
4. Do not include any statements (such as system("pause") or inputting a key from the user to continue) that are meant to keep the terminal from going away. Do not include any code that is specific to a single operating system, such as the system("pause") which is Microsoft Windows specific.