Due: April 28th (extended) at 23:00 via the Catalyst dropbox
This homework has two parts that cover different topics from the lectures. We won't accept late submissions so we recommend you start early!
Questions about this assignment should go to the TA, Thierry.
You should submit your modified discover-bpred.py
file and ilp_scheduler.py
to the Catalyst dropbox for part 1 and 2 respectively.
The formatting of the output of the python scripts should be kept intact so our grading scripts will work, though you can otherwise change the code to your heart's content. We will test your code by calling main()
with one of our insn streams and expect output on stdout; as long as you maintain this interface your code will work fine with our tests.
Dr. Chip D. Signer has been put in charge of designing a new chip for his employer. Dr. Signer wants to reverse engineer the branch predictor of a rival chipmaker in order to ensure that his chip performs better. All Dr. Signer knows is that the rival's chip uses a correlating (2-level) branch predictor that uses global history. In order to understand more about the branch predictor, Dr. Signer wants to determine:
Your assignment is to come up with an algorithm to determine the parameters of a
a series of "mystery" branch predictors. Conveniently, these mystery predictors
are provided in a Python file called branchpredictor.py
. Each branch
predictor can be accessed via the following interface:
predict(pc)
:
ask the branch predictor to predict the direction of a branch at the specified PC. Returns True
if the branch is predicted taken, False
if not taken. Note that calls to predict()
do not update the predictor's internal state.
actual(pc, direction)
:
updates the branch predictor's state with the outcome of a branch at the specified PC going in the specified direction (True
means taken, False
means not taken). This method does not return anything.
reset()
:
reset the state of the branch predictor. All entries in the BHR are reset to not taken. Each saturating counter in the BHT is set to a value such that it currently reads not taken, but with 1 increment will read "taken".
The initial state of the branch predictor is the same as after a call to reset()
(see above).
The constants MAX_*
in discover-bpred.py
list the maximum values (inclusive) for various branch predictor parameters, so your inference algorithm needn't check or handle values outside these ranges. The minimum value for each parameter is 1.
The indexing function for the BHT uses the bits of the BHR concatenated with the bits from the PC in the following manner: {BHR bits}{PC bits}, i.e. the PC bits are the low-order bits.
You must fill in the missing function definitions in discover-bpred.py
to infer the parameters of the branch predictors implemented by branchpredictor.py
. You should submit your modified discover-bpred.py
file via the Catalyst dropbox.
We will be looking over your source code, so don't cheat by using any interface to the branch predictor other than the two functions specified above. You can of course change things to help debug, but your code will be tested against our versions of branchpredictor.py
.
Feel free to change what you need to in discover-bpred.py
, but please make sure that the discover()
function returns a dictionary in the format indicated there (so our grading scripts will work). We recommend following the given order of inferring parameters (first BHR size, then saturating counter size, then BHT entries) though you can solve them in a different order if you wish.
Download the Python files you need for the assignment here: hw1-mystery-branch-predictor.tar.gz.
You can run the code for this assignment with the command python discover-bpred.py
in your Windows, Mac or Linux command terminal. You may need to install Python first; it's easy to download and install. Be sure to install Python 2.x; this code will not work with the new syntax of Python 3.x.
You should ensure your code works for the included mystery branch predictors. We will test your code against other mystery branch predictors, too, so feel free to come up with other test cases yourself.
Dr. Chip D. Signer has been put in charge of designing a new dynamically scheduled processor for his employer. In order to know how to best utilize on-chip resources, Dr. Signer wants to understand a few things about the workloads that will be running on his new chip. Dr. Signer wants to simulate these workloads on an ideal machine with infinite resources, to see how many resources these workloads require. In particular, he wants to know, for a given program (i.e. stream of instructions):
Since this is a dynamically scheduled processor, the register operands
of each insn need to be renamed before the insns can be scheduled. In
order to know which physical registers are free and which are still
live, you must track the live-range (region of execution between dispatch of the instruction that allocated the register, and the retirement of the instruction that writes to the same logical register) of each physical register p
. This
range starts as soon as p
is used for the first time, and ends after
p
's corresponding architectural register a
is overwritten, as any
subsequent uses of a
should be routed to the new physical register
responsible for storing a
. p
is then free to be reallocated to hold
the value for some new architectural register.
Your assignment is to write code that takes an insn stream, and performs register renaming and then scheduling of those insns on an infinite machine. After you have the insn schedule, you can answer Dr. Signer's 3 questions above.
There are a number of simplifying assumptions in this assignment to make your life easier.
do
have to worry about the live-ranges of registers, so
that you can use registers as efficiently as possiblegenInsns()
) to
make it easier for you to test your codeThe format for insns is documented at the top of ilp_scheduler.py
.
Each insn is a simple Python dictionary.
Our solution added about 70 lines of Python to ilp_scheduler.py
.
This diagram walks through a modified version of
random-3ticks-3width-4pregs.insns
(included with the assignment), which
has RAW, WAW and WAR data dependences. It shows how to optimally rename
the trace, with an added insn F to make things more interesting.
A few things we can observe:
Since insns B, C and E all execute in cycle 2, they each need a distinct physical register for their output to avoid WAW conflicts. Some of these registers are then dead in the next cycle: e.g., blue is dead (since it's been overwritten by green in insn C) and so can be used in cycle 3 to store the output from insn D.
Keeping track of when a register is free can be tricky, as insn E shows. Data dependences allow it to execute in cycle 2, but we have to make sure that insn D gets the right value for architectural register 1 (from green), so insn E writes into orange, avoiding a WAR conflict. Since E overwrote architectural register 1, the value in green is dead. But we can't free green yet, because D needs to read from green "in the future". This is where OoO execution complicates renaming: all insns get renamed in-order (the light blue arrows), even though they may execute OoO if their data dependences (black arrows) allow.
This situation actually arises in other places, too, but some traces have not really required precise management of the free list. E.g., even in this trace, it's straightforward to rename insns A-D: the blue register always could be (incorrectly) set as "free immediately" and things would work ok, because of the data dependences. Once we add insns E and F, we're not so lucky: we can't let insn F write to green, as this would interfere with insn D which executes at the same time. So we need an extra physical register, yellow, to allow things to execute as quickly as possible.
Download the files you need for this assignment: hw2-ilp-scheduler.tar.gz.
You can run the code for this assignment with the command python ilp_scheduler.py
in your Windows, Mac or Linux command terminal. You may need to install Python first; it's easy to download and install. Be sure to install Python 2.x; this code will not work with the new syntax of Python 3.x.
You should ensure your code works for the included insn trace files (*.insns
). We will test your code on these traces as well as others that we generate randomly, so you should test using the random insn trace generator yourself as well.
Thank you to Adrian Sampson who designed and created the first version of this assignment.