Convert 101011 to decimal (not 2's complement)
Convert 1111001000 to decimal (2's complement)
Convert 001101111000 to hex
Convert 11011110101011011011111011101111 to hex.
(and if you want, try decimal!)
125 in eight bit two's complement binary.
-113 in eight bit two's complement binary.
add 10101101 and 
    00101110.


1) How are the execution time and performance of a program related?
(If computer A takes 5 seconds to run a program and computer B takes 20
seconds, how do their performances compare?)

2) What are some metrics that can be misleading when judging the speed
of a computer?  (Which ones should you not use when comparing different
architectures?)

3) If your average CPI is 1.3, your execution time is 2 seconds, and
your processor speed is 500MHz, how many instructions did you execute?

4) Describe Amdahl's Law briefly.

5) There are two means commonly used in measuring the performance of a
workload.  What are these means and in which cases are they used?

6) In MIPS assembly language, write a function to reverse a string.
The base address of the string is in $a0 and the length of the string
is stored in $a1.  Write the result into the address located in $a2.
(Hint: Try writing it in C first, and translate directly.  No
shortcuts.)
   If you are a group of MIPS gurus, or if you want extra practice
after class, try it without $a1 and $a2, writing the result back into
$a0.

7) In MIPS assembly language, write an "itoa" program which converts a
positive integer to an ASCII string.  The integer to convert is stored
in $a0.  The base address of the array in which to store your
resulting characters is in $a1.
   You are given a function called "reverse" which takes a string
pointer argument in $a0 and replaces the memory pointed to by $a0 with
the reversed string.  If you use this function, be sure to observe the
MIPS calling conventions.
   (You can assume that $a0 will always be a positive integer and that
the array pointed to by $a1 will always have enough space for the
resulting string.)

CHALLENGERS: 

8) Write a MIPS program that will sort an array of integers.  The array
is pointed to by $a0.  The number of elements in the array is stored
in $a1.  The sorted array should be at the same address as the
original array.

9) Write a MIPS program that will determine if a number is prime.  Return
a 1 in $v0 is the argument in $a0 is prime.  Return a 0 in $v0 if the
argument in $a0 is not prime.


Solutions to previous problems:

1) How are the execution time and performance of a program related?
(If computer A takes 5 seconds to run a program and computer B takes 20
seconds, how do their performances compare?)

Inversely.

2) What are some metrics that can be misleading when judging the speed of
a computer?  (Which ones should you not use?)

MIPS, MFLOPS.  These numbers when used on very different architectures
are very poor comparisons.  Parallelism, CISC vs. RISC, floating point
programs versus integer programs, etc.

3) If your average CPI is 1.3, your execution time is 2 seconds, and
your processor speed is 500MHz, how many instructions did you execute?

770 million.

4) Describe Amdahl's Law briefly.

"Performance improvement from speeding up a part of a computer system is
limited by the proportion of the time the enhancement is used."

Basically, your the overall improvement of your computer is based on the
execution time of your programs, not on how much you improved some portion
of it.

5) There are two means commonly used in measuring the performance of a
workload.  What are these means and in which cases are they used?

The arithmetic mean is used for averaging execution times.
The harmonic mean is used to average rates.

The harmonic mean is simply the inverse of the arithmetic mean.  So a low
arithemetic mean is good while a high harmonic mean is good.

6) In MIPS assembly language, write a function to reverse a string.
The base address of the string is in $a0 and the length of the string
is stored in $a1.  Write the result into the address located in $a2.
(Hint: Try writing it in C first, and translate directly.  No
shortcuts.)  If you are a group of MIPS gurus, or if you want extra
practice after class, try it without $a1 and $a2, writing the result
back into $a0.

# Here is my pseudo code

$t1 = $a1 - 1;			# t1 is the last index into the
				# destination string
$t0 = 0;			# t0 is the first index into the
				# source string.
while ($t0 < $a1) {		# Loop while we haven't finished
				# reading the string
      $a2[$t1] = $a0[$t0];	# Store a character from the source
				# string into the proper place in the
				# destination string.
      $t1--;			# Decrement our destination pointer.
      $t0++;			# Increment our source pointer.
}
$a2[$a1] = 0;			# Null terminate our string.
return;				# Done.


# And here is the (untested) assembly.

	sub	$t1, $a1, 1	# t1 is the last index into the
				# destination string.
	move	$t0, $0		# t0 is the first index into the source
				# string.
loop:
	beq	$t0, $a1, done	# Loop until we have finished looking at
				# the whole source string (when t0 == a1)
	add	$t3, $t0, $a0	# Get the address for the current byte in
				# the source string.
	lb	$t4, 0($t3)	# Load the current byte from the source
				# string
	add	$t3, $a2, $t1	# Get the address for the current byte in
				# the destination string.
	sb	$t4, 0($t3)	# Load the current byte from the
				# destination string.
	sub	$t1, 1;		# Decrement our index into the destination
				# string.
	add	$t0, 1;		# Increment our index into the source
				# string.
	j	loop		# Continue looping
done:
	add	$t3, $a2, $a1	# Get the index to the end of the string
	sb	$0, 0($t3)	# and store a 0 to null-terminate it.
	jr	$ra		# return;

7) In MIPS assembly language, write an "itoa" program which converts a
positive integer to an ASCII string.  The integer to convert is stored
in $a0.  The base address of the array in which to store your
resulting characters is in $a1.  You are given a function called
"reverse" which takes a string pointer argument in $a0 and replaces
the memory pointed to by $a0 with the reversed string.  If you use
this function, be sure to observe the MIPS calling conventions.  (You
can assume that $a0 will always be a positive integer and that the
array pointed to by $a1 will always have enough space for the
resulting string.)

# My pseudocode

$t0 = $a0;
$t4 = $a1;
if ($t0 == 0) {			# 0 is a special case.  Deal with it
				# first.
   $t2 = '0';			# t2 will hold the character '0'
   sb  $t2, 0($t4)		# Store '0' in our destination
   $t4++;			# Keep track of our location in the
				# string.

} else {			# This is the non-zero case
  while ($t0 != 0) {		# Keep looping until we have characterized
				# the entire integer
      $t1 = $t0 % 10;		# This is the last digit of the integer.
      $t0 = $t0/10;		# This is everything but the last digit of
				# the integer.
      $t2 = $t1 + '0';		# Add '0' character to get the proper
				# character for this digit.  '0' + 0 = '0'
				# '0' + 1 = '1', etc.
      sb  $t2, 0($t4);		# Store the character version of our digit
				# at our current location in the string.
      $t4++;			# Increment our current location in the
				# string.
  }
}
sb	0, 0($t4)		# Null terminate our string.
reverse ($t4)			# We have a backwards version of our
				# integer now, so reverse it.
return;				# all done.


# Here's the MIPS (untested).

	addiu	$sp, $sp, -12	# We are going to call a procedure at some
	sw	$ra, 12($sp)	# point during this procuedure, so save
	sw	$a0, 8($sp)	# important registers on the stack.
	sw	$a1, 4($sp)

	move	$t0, $a0	# t0 will hold the remaining digits of our
				# integer
	move	$t4, $a1	# t4 will be the pointer to our current
				# location in the destination string.
	bne	$t0, $0, WhileLoop
				# If we have a 0 integer, that is a
				# special case, so deal with it.
	li	$t2, 48		# 48 is the character code for '0'
	sb	$t2, 0($t4)	# Store the character in our destination
				# string.
	addi	$t4, $t4, 1	# Advance our pointer to the string.
	b	done		# Go finish up (this was the 0 case)
WhileLoop:
	beq	$t0, $0, done	# Keep looping until we have no digits
				# left
	div	$t0, 10		# Divide t0 by 10.  Remember that div
				# stores the remainder in hi and the
				# quotient in low.  There may be
				# psuedoinstructions such as rem and div
				# which take destination registers.  If
				# so, you'd probably want to use those
				# instead of mfhi and mflo as I do.
	mfhi	$t1		# $t1 = $t0%10 (last digit of t0)
	mflo	$t0		# $t0 = $t0/10 (first digits of t0)

	addi	$t2, $t1, 48	# Store the character code, '0' + $t1, in
	sb	$t2, 0($t4)	# our destination string.

	addi	$t4, $t4, 1	# Increment our string pointer.
	j	WhileLoop	# Continue looping.
done:
	sb	$0, 0($t4)	# Null terminate our string.
	move	$a1, $a0	# Put our backwards string as argument a0
	jal	reverse		# Reverse the string.
	lw	$ra, 12(sp)	# Restore variables.  Oops, guess I didn't
				# need to save $a0 and $a1
	addiu	$sp, $sp, 12	# Put stack back in place.
	jr	$ra		# Return to calling function.


CHALLENGERS (didn't bother writing solutions):

8) Write a MIPS program that will sort an array of integers.  The
array is pointed to by $a0.  The number of elements in the array is
stored in $a1.  The sorted array should be at the same address as the
original array.

9) Write a MIPS program that will determine if a number is prime.
Return a 1 in $v0 is the argument in $a0 is prime.  Return a 0 in $v0
if the argument in $a0 is not prime.


10/21/99 Practice problems

1)  Here is a dump of SPIM (or some similar program).  Translate the
    memory dump into the Java bytecodes it represents.  This SPIM
    machine is little-endian.

[0x10000100]  0x10361536  0x073653660  0x12109b64  0x5922039a
[0x10000110]  0x00ac1215


2)  Finish filling out the control worksheet we started. (Which you can find on ftp://ftp.mkp.com/COD2e/slides/PCppt/ch5win95.ppt, slide 17.  Imagine that the chart isn't filled in).


*3)  What would you need to do to add a jal instruction to the
     datapath shown in the handout?  (Hint:  What happens to
     the PC?  What happens to the return address?)


Solutions to 10/21/99 Practice problems

1)  Here is a dump of SPIM (or some similar program).  Translate the
    memory dump into the Java bytecodes it represents.  This SPIM
    machine is little-endian.

[0x10000100]  0x10361536  (bad->)0x073653660  0x12109b64  0x5922039a
[0x10000110]  0x00ac1215

The second number should have been: 0x0736053660 (which is also wrong),
but in my haste, I messed up!  Sorry.

istore 21
istore 16
isub
iflt 16 18
ifne 3 34
dup
iload 18
ireturn


2)  Finish filling out the control worksheet we started.

1 0 0 1 0 0 0
0 1 1 1 1 0 0
x 1 x 0 0 1 0
x 0 x 0 0 0 1


*3)  What would you need to do to add a jal instruction to the
     datapath shown in the handout?  (Hint:  What happens to
     the PC?  What happens to the return address?)

The mux which controls the destination register (the one we are writing
to) needs to be expanded to handle an extra input.  That input should be
hardwired to "31" (the return address register).  Control will select this
option on the mux when we do a JAL.

The input to "write data" on the register file needs a mux with two
inputs:  the current write data line and the PC+4 output (the output of
the adder that adds four to the PC).  The control will select the PC+4
input when we do a jump and link and will select the other input at all
other times.

The mux that is an input to the PC needs to be expanded to handle an extra
input.  This input would be the lower 26 bits of the jal instruction
shifted left by 2 (requiring a shifter) and OR'd with the upper 4 bits of
the PC.  The control will select this new input when we do a jump and
link. 

Other control signals asserted during jal:

RegDst 2 (the 31 option).
ALUSrc X (We aren't using the ALU)
MemtoReg X (This result will not make it to the register file if our new
mux determing the data to write is set properly)
RegWrite 1
MemRead 0
MemWrite 0
Branch 0


As for what I forgot in class - benefits of a multicycle datapath:

* Instructions may now vary in length (adds shorter than memory, etc.) 
* Thus, an improved instruction execution time (assuming not all of your
instructions are memory accesses).
* Don't need to replicate as many pieces of datapath (adder can be used
both for calculating branch target and for determining whether or not to
branch).
* As we have seen today, makes it easy to implement pipelining.


More practice problems:

> For practice (didn't get to this today) -
>
> Try adding a jal and jr instruction to the multicycle datapath.
> Make sure to update the finite state machine appropriately.
>
> Find as many different ways as you can to cause each type of hazard in the
> pipelined datapath.  (Hint: There aren't that many ways.  There may be no
> way to cause some hazards.)


And my solutions:

Here are the answers to the questions I gave you through e-mail a week
ago.  If I made any mistakes or you don't understand something, let me
know:

> > Try adding a jal and jr instruction to the multicycle datapath.
> > Make sure to update the finite state machine appropriately.

Here is my method (using figure 5.33):

First, I'll handle jal.  Since we will need to write register 31 on a jal,
I want to add a hardwired 31 input to the "RegDst" mux (this mux is the
input to the "Write register" input of the Registers unit).  I'll make
this input 2, or 10 in binary, to the mux (and 0 will become 00 and 1 will
become 01).

Additionally, since I want to write my PC+4 to register 31, I will need to
somehow connect the PC+4 line to the "MemtoReg" mux.  Since the PC+4 is
written to the PC during the instruction fetch stage, I will want to
connect the output of the PC to the "MemtoReg" mux, which requires
expanding the mux to 3 inputs.  This new input will be 2, or 10 in binary,
while changing the old ones, 0 and 1, to 00 and 01 respectively.

For the FSM (using figure 5.42), we would add an "Op='jal'" branch to the
decode stage, making a new bubble.  We would have PCWrite = 1, which
signals the PC to latch in our jump address.  We would have PCSource = 10
to use our jump address to write the PC.  We would also have to have
RegWrite = 1 to make sure we write our RA.  We also would need our RegDst
= 10 and our MemtoReg = 10 to write our return ardress to register 31.

For jr:  We need to be able to read our new address from a register.  So
we should be hooking the output of register A to our PC mux.  This means
we need to expand that mux to have 4 inputs.  Our new input will be "3" or
11 in binary.  That's the only major change to the datapath, I think.

As for the FSM, we will need a jr path out of decode.  This would have
PCWrite asserted, PCSource would be 11.

Let me know if I've missed something and I'll update these answers.

> > Find as many different ways as you can to cause each type of hazard in the
> > pipelined datapath.  (Hint: There aren't that many ways.  There may be no
> > way to cause some hazards.)

Data hazards:

R-type instruction followed by anything that depends on the result of the
R-type such as an "add $t0, $t1, $t2" followed by an add "$t3, $t0, $t0"  
or "addi $t3, $t0, 1" or "beq $t0, $t1, next" or "lw $t3, 0($t0)" or
"sw $t3, 0($t0)"...you get the point.

Non R-type instruction that isn't a load followed by something that
depends on it:  "addi $t0, $t1, 1" followed by any of those things in the
previous example.

Load followed by anything that depends on it such as "lw $t0, 0($t1)"
followed by any of the dependent instructions listed in the first part.
Note that you will have to introduce a bubble in this case due to the fact
that the load will not have the result until after the memory stage while
the following instruction needs it in the decode stage.

Load followed by a store could possibly be more complicated in the
following case:  "lw $t0, 0($t1)" "sw $t0, 0($t2)".  Why?  Because the sw
doesn't really need t0 until the memory stage, so you wouldn't need to
introduce a bubble!  But for our purposes, we'll just insert a bubble
anyway.  What would be different about this case with regard to inserting 
a bubble or not: "lw $t0, 0($t1)" "sw $t2, 0($t0)"


Structural Hazards:

This doesn't happen in our multicycle datapath.  No component is ever
needed by two instructions at the same time in either the multi- or single
cycle datapaths.


Control Hazards:

All branch instructions introduce control hazards because we don't know
whether the branch will be taken or not, and thus we don't know which
instructions to execute.

Jumps introduce the same hazard.