Running SortExplorer: The sort explorer jar file is here: SortExplorer.jar . Once downloaded, most likely you can just click on the .jar file in a directory and it will run. If not, more information on running a jar file can be found here. For large input sizes and/or particular input patterns, some algorithms may run out of stack space. You will notice this because you click on the name of a sort and it returns without the experimental results changing. This should return fairly quickly (as opposed to waiting awhile for something to finally return) allowing you to hit other buttons – you will not be able to hit other buttons while waiting for a long-running algorithm to finish. If you run the jar file from the command line you will see the stack overflow messages reported there. If you run it by just clicking on the .jar file you will not see the messages. If you are interested in changing the runtime stack size you can do so when running from the command line. To increase stack size to 2 MB type: java -jar -Xss2m SortExplorer.jar
Don’t forget your student ID!! Remember to type in your student ID before running any experiments! Your student ID must be visible in the upper window before you run any sorts. Different student numbers will have different mappings of sorts to buttons. Be sure you are collecting data for YOUR student number or else we will not be able to grade your homework!
What to turn
in
Turn
in a written report (pdf or word are the preferred file formats) explaining
which sort is which, including supporting data to back up your claims. The following supporting data is required,
for every sorting algorithm:
1. Full statistics for input size 16: The runtime, number of comparisons, and movements for Random, InOrder, and ReverseOrder for an array of size 16. A table will be sufficient as your answer for question 1, no explanation required..
2.
Runtimes for at least four nontrivial
input sizes: In addition you must record the runtime for several (at
least four) nontrivial input sizes for Random, InOrder, and
ReverseOrder. A nontrivial input size
means one where the runtime is more than just a few milliseconds. Feel free to also record number of
comparisons and data movements for all of your input sizes, but it is only
required for size 16. Feel free to record data on AlmostOrder but that is not
required. A
table will be sufficient as your answer for question 2, no explanation required.
When you can, increase the input size until the runtime takes at least one second. If input sizes are chosen well, they will allow you to see enough variation in the statistics to give you strong evidence to support your answers to (3) and (4) below. You need not use the same input sizes for every algorithm, since this might not produce the most useful data; but if you intend to compare two algorithms directly by their statistics, you should use the same input size and list properties for that comparison.
Each algorithm should have its own table. You will probably find it useful to create your tables in a spreadsheet program and just cut and paste values from the SortExplorer directly into the spreadsheet. Later on you can transfer your tables into a document. Here is a sample table, input sizes shown are just an example, and you are not required to put your values for N=16 into the table but you may find it convenient to keep all your data in one place. Shaded boxes show data that is optional (but may be useful) to collect.
Sort Alpha:
Input Size |
Random |
InOrder |
ReverseOrder |
||||||
|
ms |
Comp |
Mvmt |
ms |
Comp |
Mvmt |
ms |
Comp |
Mvmt |
16 |
0 |
5 |
23 |
0 |
75 |
15 |
0 |
80 |
45 |
10000 |
71 |
34234 |
234 |
123 |
7567 |
6767 |
237 |
4545 |
4545 |
20000 |
463 |
3434234 |
2323 |
456 |
87567 |
5767 |
628 |
5454 |
45453 |
30000 |
1527 |
9342342 |
52245 |
1891 |
755675 |
6756 |
2500 |
53454 |
54545 |
40000 |
2007 |
23423234 |
523424 |
2543 |
756776 |
7567 |
4370 |
545455 |
54545 |
3. Worst Case Big-O: Your estimation of the worst case Big-O runtime for each algorithm. You must gather appropriate data that clearly shows the relevant patterns of the algorithm's runtime. This likely means measuring different algorithms at different input sizes, since some algorithms are much faster than others.
No explanation is required here, however, if the runtime data you collected in question 2 is in conflict or in no way suggesting the runtimes you give for question 3 then you should consider collecting more runtime data. We don't expect curves that show a perfect nlogn or anything like that though, just reasonable data. The recommendation of "measuring different algorithms at different input sizes" is a good one to be mindful of. Try to pick values of n that help you discern the running time of a particular algorithm.
4. Which algorithm is it? Your statement of which sorting algorithm it is, along with your reasoning. You should base your reasoning on the following:
• growth rate (Big-O) of the algorithm's runtime as input size changes
• speed or number of comparisons and data movements of the algorithm compared to the other algorithms
• changes in behavior of the algorithm, if any, for different input orderings
If errors caused by the algorithm caused you to deduce that algorithm, explain why you suspect that algorithm would cause the error.
If all of your claims are based only on relative speeds of the algorithms ("sort Beta was the fastest, so therefore it's X-Sort"), you will not receive full credit. You will also not receive full credit if you use "process of elimination" to state which algorithm is being used ("I have already figured out the other four algorithms, so the last algorithm must be X-Sort").
It is only question 4 that requires you to give explanations. A good answer to question 4 is a strong argument. I would try for two, if possible three strong points that support your claim of what sort this is. I would recommend looking at this posting may help clarify what is needed for the different questions.