Homework Assignment 2
Writing Question
We have seen superscalar execution with dynamic scheduling, vector processing, and
multithreading. Write a few paragraphs discussing the effectiveness and
trade-offs of each technique. If you were to choose one of these features
for your processor, which one would you choose? Why?
Vector Machines
- Use the VMIPS assembly described in Appendix F (on CD) of the textbook to write a short program that counts the number of values in an array that are above a threshold. Assume the address of the array is in Ra, the threshold is in F0, and the array has 37 elements. The result should be placed in register R1.
- Write a VMIPS vector program for the following Fortran code (assume vector registers have 64 entries):
      real a(*), b(*), c(*)
      real t
      do 760 i = 1, 64
         if (a(i) .gt. b(i)) then
            t = a(i) - b(i)
            c(i) = c(i) + t
            a(i) = t
         endif
  760 continue
      return
      end
- Complete Problem F.11, parts a and b, in Appendix F of the textbook.
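When checking the first two vector programs above, it can help to have a scalar reference to compare results against. The sketch below is one such model in Python; it is not VMIPS code and not a solution. The function names `count_above` and `masked_update` are hypothetical, and reading "above a threshold" as a strict `>` comparison is an assumption. Both functions follow the same pattern a masked vector program would: compare to form a mask, then act only where the mask is set.

```python
def count_above(values, threshold):
    """Count elements strictly greater than threshold.

    The strict '>' is an assumed reading of 'above a threshold'.
    """
    # Build a per-element mask, as a vector compare would.
    mask = [v > threshold for v in values]
    # Count the set mask bits (True counts as 1).
    return sum(mask)


def masked_update(a, b, c):
    """Scalar model of the Fortran loop: update a and c only where a(i) > b(i)."""
    a, c = list(a), list(c)  # work on copies so the inputs are left untouched
    for i in range(len(a)):
        if a[i] > b[i]:
            t = a[i] - b[i]
            c[i] = c[i] + t
            a[i] = t
    return a, c
```

For example, `masked_update([5.0, 1.0], [2.0, 3.0], [10.0, 10.0])` returns `([3.0, 1.0], [13.0, 10.0])`: only the first element satisfies a(i) > b(i), so the second element of each array is unchanged.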
SMT & CMP
- What are vertical and horizontal waste in superscalar processors? How can they be reduced?
- Is the Return Address Stack predictor shared or duplicated in an SMT processor? Why?
- Comparing the performance of different SMT processors running the same multiprocessing mix can be tricky. Why is this so? How can it be fixed?
- There are factors in favor of and against each class of multiple-issue, single-chip processor. Assume we have the following architectures:
  - SMT: a 4-threaded, 8-issue SMT with 64KB of L1 cache.
  - CMP: a 4-core CMP, each processor being 4-issue with 32KB of L1 cache.
  - Superscalar (Sup): a 6-issue superscalar processor with 64KB of L1 cache.
Suppose we are comparing these processors. For each of the relative rankings below, describe a benchmark that might produce that result:
  - SMT > CMP > Sup: SMT performs best, Superscalar worst.
  - SMT > Sup > CMP: SMT performs best, CMP worst.
  - Sup > SMT > CMP: Superscalar performs best, CMP worst.
  - Sup > CMP > SMT: Superscalar performs best, SMT worst.
  - CMP > Sup > SMT: CMP performs best, SMT worst.
  - CMP > SMT > Sup: CMP performs best, Superscalar worst.
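For the question on vertical and horizontal waste above, the usual accounting in the SMT literature treats every slot of a cycle in which nothing issues as vertical waste, and the empty slots of a partially filled cycle as horizontal waste. A minimal sketch of that accounting follows, assuming a hypothetical trace that records the number of instructions issued in each cycle. The function name `issue_waste` and the sample trace are invented for illustration and are not measured data.

```python
def issue_waste(issued_per_cycle, issue_width):
    """Return (vertical, horizontal, total) issue slots for a per-cycle trace.

    issued_per_cycle: hypothetical trace, instructions issued in each cycle.
    issue_width: number of issue slots available per cycle.
    """
    vertical = 0
    horizontal = 0
    for n in issued_per_cycle:
        if not 0 <= n <= issue_width:
            raise ValueError("issued count must lie between 0 and issue_width")
        if n == 0:
            # Nothing issued: the whole cycle is lost (vertical waste).
            vertical += issue_width
        else:
            # Something issued: only the unfilled slots are lost (horizontal waste).
            horizontal += issue_width - n
    total = issue_width * len(issued_per_cycle)
    return vertical, horizontal, total
```

With the made-up trace `[4, 0, 1, 2, 0, 3]` on a 4-issue machine, `issue_waste` returns `(8, 6, 24)`: 8 slots lost in the two empty cycles, 6 lost in partially filled cycles, and the remaining 10 of 24 slots used.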