Suppose we are a member of a group of music lovers. Each music lover has a machine on which they store their favorite mp3 files. Each machine is connected to the other music lover's machines over a network. Wouldn't it be great if there were some easy way to share our favorite songs with our fellow music lovers? And if we could download songs from other people's machines over the network? And if we didn't have to worry about copyright infringement? In this assignment we will simulate how this can be done (well, everything except the legal issues) over a peer-to-peer network (as opposed to a client-server style network).
Of course we will need some way to store our song databases, and for this we will be using the Dictionary ADT. A Dictionary is a collection of data that supports the operations Lookup, Insert and Delete. Furthermore, each data element stored in the Dictionary must have a unique key value that may be used as a parameter to the three Dictionary operations. Here are some examples: in an English Language Dictionary the data are the word definitions and the keys are the words themselves; in a phonebook the data are names, phone numbers and addresses (and occasional advertisements) and the keys are the people's names; any book may be thought of as a dictionary, with each page being the data element and the page number being the key. In this assignment our data will be songs and our keys will be unique song identification numbers.
Our network will be represented by a collection of nodes (each node representing some music lover's machine). Each node will keep a database of songs (a Dictionary), and will have a predetermined capacity (i.e. maximum number of songs that can be stored in the database). Each node will also have a preferred neighbor node. A node may only initiate network communication with its preferred neighbor, however it may respond to any node that contacts it. In our simulation, all network communication will be either requests or responses to requests. A request is initiated when a user at a given machine requests a song that doesn't already exist in that machine's database. That node will then request the song from its preferred neighbor. If the node that receives the request has the song in its database, it will respond by sending the requested song back over the network. Otherwise, it will in turn request the song from its preferred neighbor. If the song is not found within a specified number of network "hops" from the original machine then the request will time out, and NULL will be returned. If the song is eventually found, it will be copied into the database of every node along the request path (including the node that initiated the request). If any node becomes full then it deletes the oldest song (i.e. the song that's been in the database the longest) from its database before adding the new song. (I know what you're thinking--oldies are the best! But this was the easiest way to work the need for deletion into the assignment.)
At the beginning of the simulation the network configuration will be read from an input file (see the Input files section below) describing the number of nodes, each node's capacity, and each node's preferred neighbor. Then a separate input file will be read for each node, initializing that node with a list of songs for its database. Finally, a request file will be read, listing the requested songs and at which node each request originates. Once all the requests are processed, the program will display the total number of file transfers (see the Viewing the results section below) and terminate.
You may work in teams of two on this assignment. Please refer to the Teamwork section of the Programming Assignment Guidelines. We strongly encourage you to work in teams! Individuals will be required to complete the same amount of work.
Your assignment has the following three parts:
The Dictionary ADT is by definition implementation independent. We will need to perform many Lookups and Inserts, as well as some Deletes, and we'd like to find an efficient implementation of the Dictionary ADT for this purpose. We will be examining and comparing three implementations: the SortedVectorDictionary, the BSTree, and the AVLTree. The first two are already implemented; your job is to implement the third.
Your AVLTree class must be derived from the provided BSTree class. You will probably also want to create your own AVLNode class derived from the BSTNode class, since the latter doesn't store height information required to keep the balance. You may not add any public methods to AVLTree beyond what BSTree has unless you clear it with me (Nic) beforehand. However, you may add as many private methods as you wish (within reason).
The text explains fairly well how AVL Trees work, and even provides code for some of the methods. I encourage you to refer to this code if you are stuck, but you will not be able to copy it directly because you must conform to the interface of the BSTree class. The text does not cover AVL deletion, however! It is up to you to come up with a deletion algorithm that runs in O(log N) time. Hint: AVL deletion may require O(log N) rotations (in contrast to AVL insert, which requires at most two single rotations).
IMPORTANT: You must name your AVL class "AVLTree" (not "AvlTree" or "AVL_Tree"). Your specification file must be named "AVLTree.h", and your implementation file must be named "AVLTree.cpp". If you do not follow these requirements then your code may not work properly with my test files.
Currently, the Node class uses only a SortedVectorDictionary as its song database. Implement the command-line options -BST and -AVL to indicate which data structure the Node class should use. If neither of these flags are given then a SortedVectorDictionary should be used. If both of these flags are given the program should output an error message and exit.
A large proportion of your grade will be based on the completeness and clarity of your README file. See the The README section below for a description of what your README file must include.
As a starting point we've provided a few files:
All provided files can be found in the project directory:
/cse/courses/cse326/01sp/project2
There are three types of input files for this project:
Some sample input files may be found in the same directory with the other provided files.
You are required to turn in your project2 directory, which must include at least the following files:
Turn in your files electronically using turnin:
% cd ~/your326Directory/yourProject2Directory
% make clean
% cd ../
% turnin -c cse326 -p project2 yourProject2Directory
Note that you do not need to print out anything or hand in any paper copies! Only electronic submissions will be accepted.
Your README file should document your submission (though it does not by any means replace comments in the code!). It should contain the following:
None for this assignment.
For a small amount of extra credit, you may attempt one or more of the following. If you choose to do one of these, it is important that you first have your program working exactly according to the specifications, so that it will work with my test scripts. If you would like to try something not on this list, please email me about it first (bone@cs).
The BSTree class is modified from Weiss' BinarySearchTree class. The idea behind this assignment was developed by Alon Halevy, Maya Rodrig, and Nic Bone. Except as mentioned above, this document and all the supplied code were written by Nic Bone.