|
CSE Home | About Us | Search | Contact Info |
Homework #8: Dependences & ILP
Due: Friday, 3/9/2007, 10:30 AM
This assignment should be done individually, not in teams.
Assignment Overview
This assignment asks you to determine legal parallel execution schedules for a short execution sequence of machine instructions. To do this, you must first determine the dependences that exist between instructions. Then you "hand simulate" a parallel execution that respects those dependences, under the assumption that there are infinite functional units (i.e., no structural hazards): at each cycle you issue all instructions that have no dependences on any as yet unexecuted instructions.To keep things simple, we assume that each instructions fetches its operands, executes, and then writes the result register all in a single cycle. (Writing occurs as the last thing done in the cycle.)
You should do two simulations, one intended to approximate the properties of the scoreboarding algorithm, and the other Tomasulo's algorithm. In particular, in the first simulation an instruction cannot issue iff it is the later instruction involved in a RAW, WAR, or WAW dependence. In the second, an instruction cannot issue iff it is the later instruction involved in a RAW instruction.
We assume you'll solve this problem by hand - i.e., no lab, no programming. Turnin will be online, though, as explained below.
The Instruction Sequence
Here is the code you need to parallelize. It is assembler, but with two annotations:
- Comments that relate the assembler back to a C program it was derived from. Strictly stpeaking, the comments and the C program are completely irrelevant to this assignment - you're parallelizing the execution sequence described by the assembler; the C code is just to show that this isn't completely contrived.
- Each instruction has a label. We'll use those labels when describing the parallel schedule, as described in the next section.
Here is the instruction stream to be parallelized:
//------------------------------------ // Assembly code, ignoring stack operations //------------------------------------ // lp = 0x20000000 01: lui $t4, $0, 0x2000 // readptr = buff; 02: ori $s0, 0, 0x00004060 // tmp1 = *readptr; 03: lw $t5, 0($s0) // *readptr++; 04: add $s0, $s0, 4 // tmp2 = *readptr; 05: lw $t6, 0($s0) // *readptr = tmp2 + ((lp * (tmp1 - tmp2)) >> 32); 06: sub $t7, $t5, $t6 07: mult $t7, $t4 08: mfhi $t7 09: add $t7, $t7, $t6 10: sw $t7, 0($s0) // tmp1 = tmp2; 11: addi $t5, $t6, 0 // *readptr++; 12: add $s0, $s0, 4 // tmp2 = *readptr; 13: lw $t6, 0($s0) // *readptr = tmp2 + ((lp * (tmp1 - tmp2)) >> 32); 14: sub $t7, $t5, $t6 15: mult $t7, $t4 16: mfhi $t7 17: add $t7, $t7, $t6 18: sw $t7, 0($s0) // tmp1 = tmp2; 19: addi $t5, $t6, 0 // *readptr++; 20: add $s0, $s0, 4 // tmp2 = *readptr; 21: lw $t6, 0($s0) // *readptr = tmp2 + ((lp * (tmp1 - tmp2)) >> 32); 22: sub $t7, $t5, $t6 23: mult $t7, $t4 24: mfhi $t7 25: add $t7, $t7, $t6 26: sw $t7, 0($s0) // return (*readptr * gain) >> 32; 27: mult $t7, $a0 28: mfhi $v0Although irrelevant to this assignment, that execution sequence is based on the following
C
code implementing a lowpass filter://------------------------------------ // C code for a 3-pole fixed point lowpass filter //------------------------------------ int buff[4]; // sample, lpbuff1, lpbuff2, lpbuff3 int do_lowpass(int gain){ int* readptr; int tmp1, tmp2; int lp = 0x20000000 // sets filter frequency readptr = buff; // Retrieve values for filter 1 calculation tmp1 = *readptr; *readptr++; tmp2 = *readptr; // Filter 1 calculation *readptr = tmp2 + ((lp * (tmp1 - tmp2)) >> 32); // Retrieve values for filter 2 calculation tmp1 = tmp2; *readptr++; tmp2 = *readptr; // Filter 2 calculation *readptr = tmp2 + ((lp * (tmp1 - tmp2)) >> 32); // Retrieve values for filter 3 calculation tmp1 = tmp2; *readptr++; tmp2 = *readptr; // Filter 3 calculation *readptr = tmp2 + ((lp * (tmp1 - tmp2)) >> 32); // Modify result by output gain and return; return (*readptr * gain)>>32; }Turnin
Useturnin
to hand in two files,alldep.txt
andflowdep.txt
, the former describing your schedule when all of RAW, WAR, and WAW dependences are used, and the latter when only RAW dependences prevent instruction issue.Each file should be encoded with a single line per instruction. Each line contains the label of the instruction followed by the cycle number during which the instruction executes, assuming the first cycle is cycle 1 (i.e., not cycle 0).
For instance:
01: 1 02: 2 03: 3 04: 1 05: 2would mean that instructions 1 and 4 executed during the first cycle, 2 and 5 during the second, and 3 during the third.
This work is licensed under a Creative Commons Attribution-Share Alike 2.5 License.Department of Computer Science & Engineering
University of Washington
Box 352350
Seattle, WA 98195-2350
(206) 543-1695 voice, (206) 543-2969 FAX
[comments to zahorjan]