We have similarly discussed other parameters of VM systems - page size, number of physical pages, the application's locality - but, again, have not presented evidence that our theoretical analysis holds in practice.
This assignment seeks to explore these replacement algorithms and VM parameters, using real data collected by Dennis Lee, a graduate of the department. The data was collected using Etch, a tool for instrumenting Windows NT applications.
Etch produces trace files that, for our purposes, list every virtual address referenced (be it an instruction fetch, a load, or a store) by a program during execution. Note that these trace files (like the applications) do not have any information about the underlying pages. These files are stored in .et format, and the parsing of that format is taken care of for you.
Since this project does not involve modifying the kernel, it does not require use of VMware. The vmtrace package should work on pretty much any UNIX machine (which should include any recent version of Linux, Mac OS X, *BSD, Solaris, or even Linux running under VMware). It should even work on Windows using the Cygwin package (make sure you install the zlib package if you are using Cygwin). Because trace analysis is very CPU intensive, we encourage you to use your own machine, if possible. As always, please do not use tahiti, fiji, ceylon, or sumatra.
If you are using a shared machine, please nice(1) your vmtrace process. For example: nice ./vmtrace [vmtrace arguments]
Vmtrace is available on spinlock/coredump in /cse451/doug/vmtrace-1.X.tar.gz, where X is the release number, which may be updated; use the latest version. For your convenience, the latest version is also available via http.
Like simplethreads, vmtrace contains a lot of files, but most are safe to ignore. Pay attention to:
File | Contents |
---|---|
vmtrace.c | The main() routine; very simple. |
vmtrace.h | Defines common datatypes (e.g. vaddr_t). |
simulate.{c,h} | The main loop; gets the next reference, determines if it is a fault, and updates the modified/reference bits. |
fault.{c,h} | The fault handlers; this is where you'll be adding most of your code. |
pagetable.{c,h} | Implements a pagetable. Also contains the definition of the pte_t struct. |
physmem.{c,h} | Models physical memory, which your replacement algorithm needs to manage. |
stats.{c,h} | Collects and outputs statistics. Note that the increment accessors are in stats.h as inline functions. |
util.{c,h} | Utility routines to access bit fields and compute logarithms/exponents (base 2). util.h also contains vaddr_to_vfn, which converts a virtual address to a virtual frame number. |
options.{c,h} | Parses command line options; if you add configuration parameters to your algorithm, you can parse them here. |
input.{c,h} | Parses the tracefile and returns the next reference. You probably won't need to modify or use these files. |
Makefile.am | This file lists the source files (both .c and .h) for the project. See below for instructions on adding new files. |
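As an illustration of the kind of bit manipulation the util routines perform, here is a minimal sketch of splitting a virtual address into a page number and an offset, assuming the page size is a power of two. All names prefixed with sketch_ are hypothetical stand-ins written for this example; the real vaddr_t and vaddr_to_vfn are defined in vmtrace.h and util.h and may differ in type and signature.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for vmtrace's vaddr_t (the real one is in vmtrace.h). */
typedef uint32_t sketch_vaddr_t;

/* Base-2 logarithm of a power of two, e.g. 4096 -> 12. */
unsigned sketch_log2(uint32_t n)
{
    unsigned bits = 0;

    assert(n != 0 && (n & (n - 1)) == 0); /* n must be a power of two */
    while (n > 1) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* Virtual page (frame) number: shift away the offset bits. */
uint32_t sketch_vaddr_to_vfn(sketch_vaddr_t addr, uint32_t page_size)
{
    return addr >> sketch_log2(page_size);
}

/* Offset within the page: keep only the low-order bits. */
uint32_t sketch_vaddr_to_offset(sketch_vaddr_t addr, uint32_t page_size)
{
    return addr & (page_size - 1);
}
```

With 4096-byte pages, for instance, address 0x12345678 falls in page 0x12345 at offset 0x678. Because the trace files carry only addresses, changing the page size in an experiment changes only this mapping, not the trace itself.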
The build procedure should seem familiar: it is identical to that for simplethreads. Vmtrace should compile without any warnings.
In summary, the steps are:
Run ./vmtrace -h to see the help/usage information. Note that you do not need to gunzip the tracefiles before using them; vmtrace will decompress them on the fly (assuming the zlib library is available on your system; the -h output will confirm this).
vmtrace has several options intended to make simulation easier. It can append the statistics to a given file (-o FILE) rather than printing them to stdout. The results are reported in comma-separated-value format (CSV) for ease of analysis. I recommend using the -o option to save your stats in combination with the -v option, which will output progress information.
Vmtrace, as shipped, contains only a single page-replacement algorithm (random). For part 1, it is your task to add the LRU algorithm. The algorithm should find a space in physical memory for the given pte. This may mean evicting (physmem_evict) a page which is already occupying that space (note that nothing bad will happen if you call evict on a PFN that is not occupied). It should then call physmem_load to insert the pte.
To make your algorithms available, add them to the fault_handlers array in fault.c. See the random algorithm for an example.
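To make the LRU policy concrete, the following is a small standalone sketch that is not based on the vmtrace code. It assumes a tiny model of physical memory with a logical clock, and every name prefixed with sketch_ is hypothetical. In vmtrace itself the eviction and load steps would go through physmem_evict and physmem_load, whose signatures are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_FRAMES 64

/* Hypothetical model of physical memory; vmtrace's own is in physmem.{c,h}. */
typedef struct {
    uint32_t vfn[SKETCH_MAX_FRAMES];       /* page held by each frame */
    uint64_t last_used[SKETCH_MAX_FRAMES]; /* logical time of last reference */
    int      occupied[SKETCH_MAX_FRAMES];  /* 0 = free, 1 = in use */
    int      nframes;                      /* number of frames simulated */
    uint64_t clock;                        /* incremented on every reference */
} sketch_lru_t;

void sketch_lru_init(sketch_lru_t *m, int nframes)
{
    assert(nframes > 0 && nframes <= SKETCH_MAX_FRAMES);
    m->nframes = nframes;
    m->clock = 0;
    for (int i = 0; i < nframes; i++)
        m->occupied[i] = 0;
}

/* Reference page vfn. Returns 1 on a page fault, 0 on a hit. */
int sketch_lru_reference(sketch_lru_t *m, uint32_t vfn)
{
    int victim = -1;

    m->clock++;
    for (int i = 0; i < m->nframes; i++) {
        if (m->occupied[i] && m->vfn[i] == vfn) {
            m->last_used[i] = m->clock; /* hit: refresh the timestamp */
            return 0;
        }
    }

    /* Fault: prefer a free frame, otherwise the least recently used one. */
    for (int i = 0; i < m->nframes; i++) {
        if (!m->occupied[i]) {
            victim = i;
            break;
        }
        if (victim < 0 || m->last_used[i] < m->last_used[victim])
            victim = i;
    }

    /* A real handler would evict the victim here, then load the new page. */
    m->vfn[victim] = vfn;
    m->occupied[victim] = 1;
    m->last_used[victim] = m->clock;
    return 1;
}
```

The linear scan keeps the sketch short but costs time proportional to the number of frames on every reference. Since a full trace can take hours to simulate, a linked list or similar structure ordered by recency is probably worth considering for the real handler.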
Design an experiment using the scientific method to examine some aspect of virtual memory. There are many parameters available in the simulation - replacement algorithm, number of pages, page size, and parameters of the algorithm - that you may choose to vary; note that a good experiment will probably focus on one parameter (at a time).
The simulation currently reports the following statistics for each type of reference (instruction fetch, load, and store):
Statistic | Description |
---|---|
references | Total number of memory accesses. |
miss | Number of page faults. |
compulsory | Number of compulsory faults (first time a page was accessed). |
evictions | Number of times a page was removed from physical memory. |
pageouts | Number of times an eviction required a write to disk. |
In addition, the statistics output includes the number of physical pages used, the page size (in bytes), the input file name, the replacement algorithm, and the simulation limit on number of references (or 0 if unlimited). This is intended to make it easier to track multiple experiments; using the same output file (-o), you can append successive trials to a single stats file. Note that the type statistics (ifetch/load/store) relate to the cause of the eviction or pageout, not the type of page that was evicted. You may find it useful to add more statistics to the simulation.
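When analyzing results, ratios derived from these counters are often easier to compare across trials than the raw counts. The sketch below shows a few such derived values. The struct and function names are hypothetical and are not part of stats.{c,h}; the fields simply mirror the statistics listed in the table above for one reference type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters for one reference type (ifetch, load, or store). */
typedef struct {
    uint64_t references;
    uint64_t miss;
    uint64_t compulsory;
    uint64_t evictions;
    uint64_t pageouts;
} sketch_stats_t;

/* Fraction of references that faulted; 0.0 if nothing was referenced. */
double sketch_miss_rate(const sketch_stats_t *s)
{
    return s->references ? (double)s->miss / (double)s->references : 0.0;
}

/* Faults that were not first-touch faults, i.e. faults on pages that
 * had been resident earlier and were evicted in the meantime. */
uint64_t sketch_non_compulsory(const sketch_stats_t *s)
{
    assert(s->compulsory <= s->miss);
    return s->miss - s->compulsory;
}

/* Fraction of evictions that required a write to disk. */
double sketch_pageout_ratio(const sketch_stats_t *s)
{
    return s->evictions ? (double)s->pageouts / (double)s->evictions : 0.0;
}
```

Compulsory faults occur regardless of replacement policy or memory size, so the non-compulsory count is usually the more informative number when comparing algorithms on the same trace.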
You may wish to burn the trace onto CDs. CD burners are available in the labs. The files are accessible by mounting the cse451 drive as was done in project 2 to transfer your kernel files. For example, the Windows NT command net use l: \\coredump.cs.washington.edu\cse451 would map the cse451 directory from coredump to drive letter L in Windows NT (alternatively, you may be able to just enter \\coredump.cs.washington.edu\cse451 in any window path, though I've had better luck using the net command). Note: Because the CD burner requires a constant stream of data, it may be helpful to copy the file to the local machine before burning (but be sure to delete it afterwards).
A full trace simulation can take hours, so make sure to leave plenty of time for actually conducting the experiment. The nohup(1) command may be useful: normally, if you log out, your simulation would end, but nohup in combination with background (&) will allow you to run your command and come back for the results later.
Write a report, presenting your experimental design, data, analysis, and conclusions. As always, concise reports are better than overly verbose ones! You should consider how to most effectively present your data (graphs, charts, tables, and/or a discussion).
While your conclusions should contain a discussion of what you believe the experimental results mean, you should be careful to distinguish between what your experiment has actually proven and what you are speculating on. You are not required to write a separate report for each team member, but you may do so if you wish; if you do, clearly indicate this in your submission.
Accepted file formats are TXT, HTML, DOC, PS, and PDF. Give the file an obvious name, such as report.pdf, so that we can easily find it among the files in your submission.
In your top-level vmtrace directory, run make dist. This will produce a file named vmtrace-1.X.tar.gz. Submit this file using the turnin(1L) program under project name proj4 by 11:59pm on the day it is due. turnin will not work on coredump/spinlock, so you'll need to use one of the general-purpose machines (sumatra, fiji, ceylon, or tahiti).
If you have added any files, run tar -tzf vmtrace-1.X.tar.gz and check to make sure your new files are listed.
Make sure to include your report along with your submission.
Print your submitted report and bring it to lecture on Wednesday, May 28.