CSE 326 Data Structures
Project 4
Heaps vs. Calendar Queues
Due:
Tuesday, June 2nd, 11:30 PM for the program and Wednesday, June 3rd, in
class, for the written report.
Objective:
To compare the performance of heaps and calendar queue as the event
queue in a discrete event simulator and write a report describing
your findings.
Overview
A discrete event simulator needs a priority queue to maintain its event
queue. The event queue is used in the main simulation loop: the event
with the smallest time is removed from the event queue and new events
are inserted into the event queue
as a result of the processing of the smallest time event. There are
many implementations of priority queues. One of the very best is the
elegant binary heap. Another, which is especially good for
discrete event simulation, is the calendar queue. The calendar queue
has two user controlled
parameters, bucket width and number of buckets, that govern
its performance. If the parameters are chosen correctly then the
calendar queue can have constant average time per event processed.
By contrast, the binary heap has logarithmic average time per event.
In this project you will write programs implementing the binary heap
and calendar queue and explore the differences between the two.
In addition, you will explore the process of selecting the parameters
of the calendar queue so as to achieve optimal performance.
A principal part of the project is a written report with your findings,
including your experimental methodology and a number of
graphs showing the results of your research. In this project
we will only consider discrete event simulations where the number of
active events does not vary, that is, the static case.
Assignment
There are three pieces of code that you will have to write:
the binary heap, the calendar queue, and the driver code that
exercises the two priority queues and measures the time used by each of them.
We provide one function, NewEvent,
that emulates the processing of an event. The function can also be used
to seed the priority queue with initial events. The NewEvent
function has three parameters: t, the time of the event being
processed, m, the mean of the jump, and w, the amount
of work needed to process the event. The function returns a time
t' that is t plus a random amount that is
exponentially
distributed with mean m. The function takes about w
times .01 milliseconds to return the value, which emulates the time to process
an event in a real simulation. To seed the priority queue with N
events you simply call NewEvent N times with t = 0 and
w = 0. The parameter m can be chosen to an
appropriate fixed positive constant. The NewEvent function
is provided for you in the files "newevent.h" and
"newevent.C", which can be found in the
/cse/courses/cse326/98sp/project4 directory.
The report should describe and compare
the performance of the calendar queue and
the binary heap.
Four graphs that should be included in your report. These graphs
should be presented professionally using plotting tools in common
spreadsheets.
- Time per event vs. bucket width for the calendar queue.
Fix the number of events to be 1,000, fix the mean of the
jump to be 1,000,000, fix
the work to be 0, and fix the number of buckets to be 5,000.
Plot the expected time per event for varying sizes of bucket
width. Your plot should demonstrate that there is an optimal bucket width.
- Time per event vs. number of buckets for the calendar queue.
Fix the number of events to be 1,000, fix the mean of the
jump to be 1,000,000, fix
the work to be 0, and fix the bucket width to be approximately optimal
for the number buckets equal to 5,000. Plot the expected time per event
for varying number of buckets. Your plot should demonstrate that choosing
a small number of buckets degrades performance and that as the number
of buckets grows performance eventually levels off.
- Compare the binary heap and the calendar queue for varying priority
queue size.
Fix the mean of the jump to be 1,000,000, fix and the work to be 0. Plot the
expected time per event for each of the binary heap and the calendar queue
for varying number of events ranging from 100 events to 10,000 events.
Care must be taken to choose the bucket width and number of buckets
to achieve good performance for the calendar queue for the various
numbers of events.
- Compare the binary heap and the calendar queue for varying work.
Fix the mean of the jump to be 1,000,000 and the number of events to be 10,000.
Fix a bucket width and number of bucket for near optimal performance for
the calendar queue. Define the speed-up to be the expected time of
the calendar queue divided by the expected time for the binary heap.
Plot the speed-up for varying amounts of work, starting with work equal
to 0.
Experimental Methodology
Care must be taken to make sure that the numbers you generate are valid
and reflect the true performance of two priority queues.
You should use a function to measure the wall clock time. A good one to
use is the standard C library function clock (defined in
"time.h")
which returns the number of clock ticks that have passed since your program
started. There are also some non-standard functions available on our UNIX
machines that may be more accurate. You will want to run the simulation
for a number of steps before starting to measure the time per event. This
is the "warm-up" period. How long should the warm-up period be?
You will then want to run the simulation for
long enough to get a good measurement of the time per event. How long
is long enough? One common test is that if you run the simulation
for K steps and 2K steps and there is little change in the time per event,
the K steps is probably enough.
You will discover the need to make
repeated experiments for each of your data points. If you are running
your experiment on a multi-tasking system, then your timing measurement
can be affected by other tasks running on the same machine. How many
trials should you make in order to get good data? What data should be
thrown out? A common experimental approach is to make a small number of
trials like 15 and take the median as indicative of the correct value.
Taking the average of 15 would include bad data.
To minimize interference from other tasks you might consider running your
experiments on a single user computer. Even on a single user machine that
is also multi-tasking you should be careful to stop any background jobs.
Remember that your project report should contain a description of your
experimental methodology.
What to hand in
You are required to hand in your project4 directory with your implementation
of the binary heap and calendar queue. You may have implemented
a number of different drivers to generate the data for your four plots.
We don't need to see them all. You can just include one in your
project4 directory. You also need to turn in a written report on Wednesday,
June 3rd.
-
Source Code:
If you generated your data on a machine other than a Sieg 329 Lab machine
then be sure to compile and run your programs on a Lab machine before
turning them in.
-
README: - This is a text file describing your project
implementation. This does not need to duplicate information in your project
4 report.
-
Project 4 Report: This is a formal report on what you have learned
from the experiments you did in this project. The report must be
type written and well organized. It should include the four plots
described above. For each plot write a paragraph that explains what
the plot demonstrates about the performance of the calendar queue
and/or the binary heap. Your report should contain your experimental
methodology. Your report should contain a conclusion that
gives the pros and cons of using a binary heap or calendar queue as the
event queue in a discrete event simulator. The report should be less
than 5 pages not including the plots. This report is to be
turned in on Wednesday, June 3rd, in class.
Turn your source code and README
files in electronically using turnin:
% cd ~/326stuff/myproj4dir
% make clean
% cd ../
% turnin -c cse326=AA -p project4 myproj4dir
Warning
Some of your experiments can take a long time. You do not want to
wait until the last minute to do this project.