Convert 101011 to decimal (not 2's complement) Convert 1111001000 to decimal (2's complement) Convert 001101111000 to hex Convert 11011110101011011011111011101111 to hex. (and if you want, try decimal!) 125 in eight bit two's complement binary. -113 in eight bit two's complement binary. add 10101101 and 00101110. 1) How are the execution time and performance of a program related? (If computer A takes 5 seconds to run a program and computer B takes 20 seconds, how do their performances compare?) 2) What are some metrics that can be misleading when judging the speed of a computer? (Which ones should you not use when comparing different architectures?) 3) If your average CPI is 1.3, your execution time is 2 seconds, and your processor speed is 500MHz, how many instructions did you execute? 4) Describe Amdahl's Law briefly. 5) There are two means commonly used in measuring the performance of a workload. What are these means and in which cases are they used? 6) In MIPS assembly language, write a function to reverse a string. The base address of the string is in $a0 and the length of the string is stored in $a1. Write the result into the address located in $a2. (Hint: Try writing it in C first, and translate directly. No shortcuts.) If you are a group of MIPS gurus, or if you want extra practice after class, try it without $a1 and $a2, writing the result back into $a0. 7) In MIPS assembly language, write an "itoa" program which converts a positive integer to an ASCII string. The integer to convert is stored in $a0. The base address of the array in which to store your resulting characters is in $a1. You are given a function called "reverse" which takes a string pointer argument in $a0 and replaces the memory pointed to by $a0 with the reversed string. If you use this function, be sure to observe the MIPS calling conventions. (You can assume that $a0 will always be a positive integer and that the array pointed to by $a1 will always have enough space for the resulting string.) CHALLENGERS: 8) Write a MIPS program that will sort an array of integers. The array is pointed to by $a0. The number of elements in the array is stored in $a1. The sorted array should be at the same address as the original array. 9) Write a MIPS program that will determine if a number is prime. Return a 1 in $v0 is the argument in $a0 is prime. Return a 0 in $v0 if the argument in $a0 is not prime. Solutions to previous problems: 1) How are the execution time and performance of a program related? (If computer A takes 5 seconds to run a program and computer B takes 20 seconds, how do their performances compare?) Inversely. 2) What are some metrics that can be misleading when judging the speed of a computer? (Which ones should you not use?) MIPS, MFLOPS. These numbers when used on very different architectures are very poor comparisons. Parallelism, CISC vs. RISC, floating point programs versus integer programs, etc. 3) If your average CPI is 1.3, your execution time is 2 seconds, and your processor speed is 500MHz, how many instructions did you execute? 770 million. 4) Describe Amdahl's Law briefly. "Performance improvement from speeding up a part of a computer system is limited by the proportion of the time the enhancement is used." Basically, your the overall improvement of your computer is based on the execution time of your programs, not on how much you improved some portion of it. 5) There are two means commonly used in measuring the performance of a workload. What are these means and in which cases are they used? The arithmetic mean is used for averaging execution times. The harmonic mean is used to average rates. The harmonic mean is simply the inverse of the arithmetic mean. So a low arithemetic mean is good while a high harmonic mean is good. 6) In MIPS assembly language, write a function to reverse a string. The base address of the string is in $a0 and the length of the string is stored in $a1. Write the result into the address located in $a2. (Hint: Try writing it in C first, and translate directly. No shortcuts.) If you are a group of MIPS gurus, or if you want extra practice after class, try it without $a1 and $a2, writing the result back into $a0. # Here is my pseudo code $t1 = $a1 - 1; # t1 is the last index into the # destination string $t0 = 0; # t0 is the first index into the # source string. while ($t0 < $a1) { # Loop while we haven't finished # reading the string $a2[$t1] = $a0[$t0]; # Store a character from the source # string into the proper place in the # destination string. $t1--; # Decrement our destination pointer. $t0++; # Increment our source pointer. } $a2[$a1] = 0; # Null terminate our string. return; # Done. # And here is the (untested) assembly. sub $t1, $a1, 1 # t1 is the last index into the # destination string. move $t0, $0 # t0 is the first index into the source # string. loop: beq $t0, $a1, done # Loop until we have finished looking at # the whole source string (when t0 == a1) add $t3, $t0, $a0 # Get the address for the current byte in # the source string. lb $t4, 0($t3) # Load the current byte from the source # string add $t3, $a2, $t1 # Get the address for the current byte in # the destination string. sb $t4, 0($t3) # Load the current byte from the # destination string. sub $t1, 1; # Decrement our index into the destination # string. add $t0, 1; # Increment our index into the source # string. j loop # Continue looping done: add $t3, $a2, $a1 # Get the index to the end of the string sb $0, 0($t3) # and store a 0 to null-terminate it. jr $ra # return; 7) In MIPS assembly language, write an "itoa" program which converts a positive integer to an ASCII string. The integer to convert is stored in $a0. The base address of the array in which to store your resulting characters is in $a1. You are given a function called "reverse" which takes a string pointer argument in $a0 and replaces the memory pointed to by $a0 with the reversed string. If you use this function, be sure to observe the MIPS calling conventions. (You can assume that $a0 will always be a positive integer and that the array pointed to by $a1 will always have enough space for the resulting string.) # My pseudocode $t0 = $a0; $t4 = $a1; if ($t0 == 0) { # 0 is a special case. Deal with it # first. $t2 = '0'; # t2 will hold the character '0' sb $t2, 0($t4) # Store '0' in our destination $t4++; # Keep track of our location in the # string. } else { # This is the non-zero case while ($t0 != 0) { # Keep looping until we have characterized # the entire integer $t1 = $t0 % 10; # This is the last digit of the integer. $t0 = $t0/10; # This is everything but the last digit of # the integer. $t2 = $t1 + '0'; # Add '0' character to get the proper # character for this digit. '0' + 0 = '0' # '0' + 1 = '1', etc. sb $t2, 0($t4); # Store the character version of our digit # at our current location in the string. $t4++; # Increment our current location in the # string. } } sb 0, 0($t4) # Null terminate our string. reverse ($t4) # We have a backwards version of our # integer now, so reverse it. return; # all done. # Here's the MIPS (untested). addiu $sp, $sp, -12 # We are going to call a procedure at some sw $ra, 12($sp) # point during this procuedure, so save sw $a0, 8($sp) # important registers on the stack. sw $a1, 4($sp) move $t0, $a0 # t0 will hold the remaining digits of our # integer move $t4, $a1 # t4 will be the pointer to our current # location in the destination string. bne $t0, $0, WhileLoop # If we have a 0 integer, that is a # special case, so deal with it. li $t2, 48 # 48 is the character code for '0' sb $t2, 0($t4) # Store the character in our destination # string. addi $t4, $t4, 1 # Advance our pointer to the string. b done # Go finish up (this was the 0 case) WhileLoop: beq $t0, $0, done # Keep looping until we have no digits # left div $t0, 10 # Divide t0 by 10. Remember that div # stores the remainder in hi and the # quotient in low. There may be # psuedoinstructions such as rem and div # which take destination registers. If # so, you'd probably want to use those # instead of mfhi and mflo as I do. mfhi $t1 # $t1 = $t0%10 (last digit of t0) mflo $t0 # $t0 = $t0/10 (first digits of t0) addi $t2, $t1, 48 # Store the character code, '0' + $t1, in sb $t2, 0($t4) # our destination string. addi $t4, $t4, 1 # Increment our string pointer. j WhileLoop # Continue looping. done: sb $0, 0($t4) # Null terminate our string. move $a1, $a0 # Put our backwards string as argument a0 jal reverse # Reverse the string. lw $ra, 12(sp) # Restore variables. Oops, guess I didn't # need to save $a0 and $a1 addiu $sp, $sp, 12 # Put stack back in place. jr $ra # Return to calling function. CHALLENGERS (didn't bother writing solutions): 8) Write a MIPS program that will sort an array of integers. The array is pointed to by $a0. The number of elements in the array is stored in $a1. The sorted array should be at the same address as the original array. 9) Write a MIPS program that will determine if a number is prime. Return a 1 in $v0 is the argument in $a0 is prime. Return a 0 in $v0 if the argument in $a0 is not prime. 10/21/99 Practice problems 1) Here is a dump of SPIM (or some similar program). Translate the memory dump into the Java bytecodes it represents. This SPIM machine is little-endian. [0x10000100] 0x10361536 0x073653660 0x12109b64 0x5922039a [0x10000110] 0x00ac1215 2) Finish filling out the control worksheet we started. (Which you can find on ftp://ftp.mkp.com/COD2e/slides/PCppt/ch5win95.ppt, slide 17. Imagine that the chart isn't filled in). *3) What would you need to do to add a jal instruction to the datapath shown in the handout? (Hint: What happens to the PC? What happens to the return address?) Solutions to 10/21/99 Practice problems 1) Here is a dump of SPIM (or some similar program). Translate the memory dump into the Java bytecodes it represents. This SPIM machine is little-endian. [0x10000100] 0x10361536 (bad->)0x073653660 0x12109b64 0x5922039a [0x10000110] 0x00ac1215 The second number should have been: 0x0736053660 (which is also wrong), but in my haste, I messed up! Sorry. istore 21 istore 16 isub iflt 16 18 ifne 3 34 dup iload 18 ireturn 2) Finish filling out the control worksheet we started. 1 0 0 1 0 0 0 0 1 1 1 1 0 0 x 1 x 0 0 1 0 x 0 x 0 0 0 1 *3) What would you need to do to add a jal instruction to the datapath shown in the handout? (Hint: What happens to the PC? What happens to the return address?) The mux which controls the destination register (the one we are writing to) needs to be expanded to handle an extra input. That input should be hardwired to "31" (the return address register). Control will select this option on the mux when we do a JAL. The input to "write data" on the register file needs a mux with two inputs: the current write data line and the PC+4 output (the output of the adder that adds four to the PC). The control will select the PC+4 input when we do a jump and link and will select the other input at all other times. The mux that is an input to the PC needs to be expanded to handle an extra input. This input would be the lower 26 bits of the jal instruction shifted left by 2 (requiring a shifter) and OR'd with the upper 4 bits of the PC. The control will select this new input when we do a jump and link. Other control signals asserted during jal: RegDst 2 (the 31 option). ALUSrc X (We aren't using the ALU) MemtoReg X (This result will not make it to the register file if our new mux determing the data to write is set properly) RegWrite 1 MemRead 0 MemWrite 0 Branch 0 As for what I forgot in class - benefits of a multicycle datapath: * Instructions may now vary in length (adds shorter than memory, etc.) * Thus, an improved instruction execution time (assuming not all of your instructions are memory accesses). * Don't need to replicate as many pieces of datapath (adder can be used both for calculating branch target and for determining whether or not to branch). * As we have seen today, makes it easy to implement pipelining. More practice problems: > For practice (didn't get to this today) - > > Try adding a jal and jr instruction to the multicycle datapath. > Make sure to update the finite state machine appropriately. > > Find as many different ways as you can to cause each type of hazard in the > pipelined datapath. (Hint: There aren't that many ways. There may be no > way to cause some hazards.) And my solutions: Here are the answers to the questions I gave you through e-mail a week ago. If I made any mistakes or you don't understand something, let me know: > > Try adding a jal and jr instruction to the multicycle datapath. > > Make sure to update the finite state machine appropriately. Here is my method (using figure 5.33): First, I'll handle jal. Since we will need to write register 31 on a jal, I want to add a hardwired 31 input to the "RegDst" mux (this mux is the input to the "Write register" input of the Registers unit). I'll make this input 2, or 10 in binary, to the mux (and 0 will become 00 and 1 will become 01). Additionally, since I want to write my PC+4 to register 31, I will need to somehow connect the PC+4 line to the "MemtoReg" mux. Since the PC+4 is written to the PC during the instruction fetch stage, I will want to connect the output of the PC to the "MemtoReg" mux, which requires expanding the mux to 3 inputs. This new input will be 2, or 10 in binary, while changing the old ones, 0 and 1, to 00 and 01 respectively. For the FSM (using figure 5.42), we would add an "Op='jal'" branch to the decode stage, making a new bubble. We would have PCWrite = 1, which signals the PC to latch in our jump address. We would have PCSource = 10 to use our jump address to write the PC. We would also have to have RegWrite = 1 to make sure we write our RA. We also would need our RegDst = 10 and our MemtoReg = 10 to write our return ardress to register 31. For jr: We need to be able to read our new address from a register. So we should be hooking the output of register A to our PC mux. This means we need to expand that mux to have 4 inputs. Our new input will be "3" or 11 in binary. That's the only major change to the datapath, I think. As for the FSM, we will need a jr path out of decode. This would have PCWrite asserted, PCSource would be 11. Let me know if I've missed something and I'll update these answers. > > Find as many different ways as you can to cause each type of hazard in the > > pipelined datapath. (Hint: There aren't that many ways. There may be no > > way to cause some hazards.) Data hazards: R-type instruction followed by anything that depends on the result of the R-type such as an "add $t0, $t1, $t2" followed by an add "$t3, $t0, $t0" or "addi $t3, $t0, 1" or "beq $t0, $t1, next" or "lw $t3, 0($t0)" or "sw $t3, 0($t0)"...you get the point. Non R-type instruction that isn't a load followed by something that depends on it: "addi $t0, $t1, 1" followed by any of those things in the previous example. Load followed by anything that depends on it such as "lw $t0, 0($t1)" followed by any of the dependent instructions listed in the first part. Note that you will have to introduce a bubble in this case due to the fact that the load will not have the result until after the memory stage while the following instruction needs it in the decode stage. Load followed by a store could possibly be more complicated in the following case: "lw $t0, 0($t1)" "sw $t0, 0($t2)". Why? Because the sw doesn't really need t0 until the memory stage, so you wouldn't need to introduce a bubble! But for our purposes, we'll just insert a bubble anyway. What would be different about this case with regard to inserting a bubble or not: "lw $t0, 0($t1)" "sw $t2, 0($t0)" Structural Hazards: This doesn't happen in our multicycle datapath. No component is ever needed by two instructions at the same time in either the multi- or single cycle datapaths. Control Hazards: All branch instructions introduce control hazards because we don't know whether the branch will be taken or not, and thus we don't know which instructions to execute. Jumps introduce the same hazard.