Supplement to the hw6 main assignment.Please note the CSE lab bug description at the bottom of this page
Overview
Execution ofbenchImage
takes more than a million cycles. What if something goes wrong in the middle? How do you figure out what went wrong?I've written some tools that might help. The situation is inherently difficult, so don't expect the tools to be a magic bullet. But, they should help.
The basic idea is to create a trace of the
PC
and instruction executed each cycle, writing it to a file. You actually create two traces, one on a known working machine (e.g., Cebollita), and one on the flawed machine. You then compare the two trace files, finding the first place they differ.The tools are the files
convertTrace.pl, diffTrace.pl,
anddiffTrace
. They're available as traceTools.zip.The Process
- Obtain a reference trace -- a trace from a working machine:
- I have produce one using Cebollita to run
benchImage
(with an unaltered OS). It's at
/cygdrive/o/unix/projects/instr/05wi/cse378/cebBenchmarkTrace.txt
(It's 75MB, so don't copy it, just leave it there.)- You can produce one using Cebollita, but it requires distribution cebollita-full.2-4.20050308.jar (Version 2.4 (Build 42)) or later:
$ cebsim --trace cebTrace.txt --bootable benchImage
- Here are the first few lines of a Cebollita produced trace:
00000000 20084000 | 0 "addi $t0, $0, 16384" 00000004 20090010 | 1 "addi $t1, $0, 16" 00000008 01094004 | 2 "sllv $t0, $t0, $t1" 0000000c 8d080028 | 3 "lw $t0, 40($t0)" 00010000 <- 40000028 00000010 2109fe00 | 4 "addi $t1, $t0, -512" 00000014 00096825 | 5 "or $t5, $0, $t1" 00000018 00005020 | 6 "add $t2, $0, $0" 0000001c 8d4b0000 | 7 "lw $t3, 0($t2)" 20084000 <- 00000000 00000020 ad2b0000 | 8 "sw $t3, 0($t1)" 20084000 -> 0000fe00The first two columns are thePC
and the instruction, respectively. The '|' is used to indicate the rest of the line is a comment. For Cebollita produced traces the information is the cycle and the disassembled instruction. If the instruction is a load or a store it also shows the data and the address (in that order, for both loads and stores).
- Create a trace of your
SMOK
machine's execution:A potentially REALLY big file is now in whatever
- Start
SMOK
and load your machine.- Add a
Trace
component to your machine.- Double-click on the
Trace
component to bring up its properties dialog.- For
#Inputs
, say 3.- Type a name for
Trace Output file
, e.g.,smokTrace.txt
.- Hit OK. Here is a picture of the
Trace
component when you've done this, except that I've added names to the input ports here for clarity:![]()
- Connect the uppermost input to your
PC
, the middle input to yourICache
's output (i.e., the current instruction), and the bottom input to a line that is 1 if the machine is stalling this cycle and 0 otherwise.- Boot
- Shutdown
SMOK
SMOK
's working directory was - probably the directory your model file came from. The file does not have a nice format, so I'm not showing any of it here.
- Convert the
SMOK
trace file to a nicer format:$ perl convertTrace.pl smokTrace.txt >smokConvertedTrace.txtThis step requires that you have perl installed, obviously (it comes, or at least can come, withcygwin
), and that you have extracted theconvertTrace.pl
file from the.zip
distrbution.Here's the beginning of a converted
SMOK
trace file:00000000 20084000 | 0 2 00000004 20090010 | 1 5 00000008 01094004 | 2 8 0000000c 8d080028 | 3 11 00000010 2109fe00 | 4 14 00000014 00096825 | 5 17 00000018 00005020 | 6 20 0000001c 8d4b0000 | 7 23 00000020 ad2b0000 | 8 26The first two columns are thePC
and instruction. The comment fields are the "virtual cycle" and the actual cycle. This trace was taken on a machine that does stall. All stall cycles have been eliminated in the conversion process. The virtual cycle is a count of how many instructions have actually been executed so far (i.e., ignoring stalls). The actual cycle is the actualSMOK
cycle (in case you want to set aHalt
component on a counter that simply counts up one each cycle, for debugging).
- diff the two traces:
For example,Eventually (it takes a while) you will get some output in the$ ./diffTrace smokConvertedTrace.txt /cygdrive/o/unix/projects/instr/05wi/cse378/cebBenchmarkTrace.txt
cygwin
window that shows where and how the files differ. Here is an example:792,797d791 < 0000fe68 8d0c0000 < 0000fe6c 01816026 < 0000fe70 140cfffd < 0000fe68 8d0c0000 < 0000fe6c 01816026 < 0000fe70 140cfffd ...The first line means that lines 792-797 in the first file (smokConvertedTrace.txt
) would need to be deleted for the two files to continue matching (at line 791 in the second file). It also means that if you deleted those lines, the files would start matching again. Thus, those cycles are instructions that theSMOK
machine executed but the Cebollita machine did not.To better understand what these lines mean, it's often easiest to look in the Cebollita trace file, since it has instruction disassemlies in it.
- Resolve differences between the traces:
Unfortunately, just because the traces differ at some point, it doesn't mean there's a problem. In the case above, for instance, if we look in the Cebollita trace file at the instructions around the problem area, we see this:0000fe68 8d0c0000 | 788 "lw $t4, 0($t0)" 00002000 <- 40000010 0000fe6c 01816026 | 789 "xor $t4, $t4, $at" 0000fe70 140cfffd | 790 "bne $t4, $0, -3"Here the code is waiting for the block controller to indicate that it is done doing whatever operation was asked of it. Because I/O operations take an unpredictable amount of time, the number of times around this polling loop differs in theSMOK
and Cebollita executions. That is, both trace files indicate working machines, even though the traces differ.To resolve this difference, edit
smokConvertedTrace.txt
, deleting lines 792-797. Then return to the previous step.Doing that, I get:
838,858d837 < 0000fecc 8d0c0000 < 0000fed0 01816024 < 0000fed4 100cfffd < 0000fecc 8d0c0000 < 0000fed0 01816024 < 0000fed4 100cfffd < 0000fecc 8d0c0000 < 0000fed0 01816024 < 0000fed4 100cfffd < 0000fecc 8d0c0000 < 0000fed0 01816024 < 0000fed4 100cfffd < 0000fecc 8d0c0000 < 0000fed0 01816024 < 0000fed4 100cfffd < 0000fecc 8d0c0000 < 0000fed0 01816024 < 0000fed4 100cfffd < 0000fecc 8d0c0000 < 0000fed0 01816024 < 0000fed4 100cfffd ...This is another instance of the same problem. Delete those lines and try again. Eventually either you have eliminated all differences as benign, in which case yourSMOK
machine behaved the same as the reference machine, or you have found a place where theSMOK
machine went wrong. In the latter case, looking at the preceeding few instructions in the Cebollita trace will often tell you just what isn't working. (E.g., if your machine fails on the instruction that follows anrfe
, it's pretty like thatrfe
isn't working, or thatExceptionPC
isn't being written correctly.)Notes
- The two traces can reasons other than the amount of time I/O takes without it indicating a problem. For example, my
SMOK
machine sets the cause to 4 (addressing violation) when the addressing exception occurs in the final benchmark app, while the Cebollita machine sets it to 6 (bus error). When the OS prints the error line, the executions differ (because one is trying to print 4 and the other 6). Both are working.
- The trace files can be very, very big -- a few hundred megabytes. If working on a lab machine, you might have to create the file on the C: or D: drive, for quota reasons. (It will be lost when you logout.)
CSE Lab Bug
I announced some tools a couple days ago that let you compare a trace of your smok machine's execution with a cebollita execution. The tools involve using perl for a number of the steps. Unfortunately, the instructional machines have been (mis?)configured with a version of perl for 64-bit machines. This is causing a problem with the tools, in particular with the step in which you run convertTrace.pl to convert the smok generated trace to a format suitable for comparing with the cebollita trace. A workaround: - $ scp convertTrace.pl diffTrace diffTrace.pl attu: - $ scp mySmokTrace.txt attu: - $ ssh attu - attu> perl convertTrace.pl mySmokTrace.txt >smokTrace - attu> ./diffTrace smokTrace /projects/instr/05wi/cse378/cebBenchmarkTrace.txt -j