CSE 378 Fall 2006
Extra Credit Lab: Caching
Assigned: 11/29/2006

Description
This lab is an optional hardware lab that you may complete to regain some of the points you may have lost on the midterm exam. The requirements for this lab differ slightly from those of other labs, particularly in regard to your choice of partner. You may work with a partner, but you must have scored within 10 points of your partner's score on the midterm. The assignments for those working with partners and those working alone also vary slightly.
Warning: This lab will be significantly less structured than previous labs. Start early.

Phase 0: Administration
For this lab, you will be provided with a board consisting of instruction and data caches, various I/O components, a memory system, and a place for your processor. You will take your processor from lab 4, place it on this board, and make whatever modifications are required to make your processor compatible with the new system for accessing memory through the caches.
Remember to transfer not just your processor from the previous lab but also any other files that your processor depends on, such as the branch, hazard, and forwarding units.

Phase 1: Designing the Data Cache
This lab introduces a new memory system which you will interface with the provided processor via an instruction cache (provided) and a data cache, which you will construct. The new memory system uses the block RAMs on the board and can thus store significantly more data, but it is slightly more complicated to interface with because it does not provide asynchronous reads and may require more than one cycle to retrieve or store data. This is a benefit in disguise, as it more closely resembles an actual system, where a memory access takes more than one cycle.
The new memory system uses a request system: your cache interacts with main memory by issuing requests to read from or write to it. If you want to read data from memory, set a Read request. At some point later, the memory will respond with valid data. Similarly, writing data back to memory is a matter of setting a Write request.
Here's a cursory overview of the ports your Data Cache provides for interfacing with the memory system, and the uses for them:
Requesting a Read/Write to the memory looks something like this:
Note that the signal names aren't quite the same. This was a write to the data cache that resulted in a cache miss. The Stall signal goes high immediately, and then ReadRequest is asserted for one clock cycle. A few cycles pass (the number CAN change), and then Valid is asserted. The data from memory is written into the correct cache line during that cycle, which means the cache line is then valid and the processor can unstall.
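The miss sequence described above can be sketched as a small cycle-by-cycle model. This is a behavioral sketch in Python, not the Verilog you will write in DataCache.v. The names SlowMemory and service_miss are made up for illustration, the lowercase signal names only mirror the ones in the text, and asserting the read request in the same cycle that the stall goes high is an assumption; check the real timing against the provided memory system.

```python
class SlowMemory:
    """Toy main memory that answers a read request after `latency` cycles."""

    def __init__(self, latency, contents):
        self.latency = latency      # cycles from request to Valid; must be >= 1
        self.contents = contents    # line address -> list of 4 words
        self.countdown = None       # None while no request is outstanding
        self.pending = None         # line address of the outstanding request

    def tick(self, read_request, line_addr=None):
        """Advance one clock cycle and return (valid, data)."""
        if read_request:
            self.pending, self.countdown = line_addr, self.latency
            return False, None
        if self.countdown is not None:
            self.countdown -= 1
            if self.countdown == 0:
                self.countdown = None
                return True, self.contents[self.pending]
        return False, None


def service_miss(memory, line_addr):
    """Return the per-cycle (stall, read_request, valid) trace and the fetched line."""
    # First cycle: the miss is detected, the processor is stalled, and the
    # read request is asserted for exactly one cycle (assumed same cycle).
    memory.tick(True, line_addr)
    trace = [(1, 1, 0)]
    # Keep stalling, without knowing how long, until memory asserts valid.
    while True:
        valid, data = memory.tick(False)
        trace.append((1, 0, 1 if valid else 0))
        if valid:
            break                   # the line is written into the cache this cycle
    trace.append((0, 0, 0))         # line is now valid: the processor unstalls
    return trace, data


mem = SlowMemory(latency=3, contents={0x40: [10, 11, 12, 13]})
trace, line = service_miss(mem, 0x40)
for cycle, (stall, req, valid) in enumerate(trace):
    print(cycle, stall, req, valid)
```

The loop waits on valid rather than counting cycles, which is the point of the handshake: the cache must not assume a fixed memory latency.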
Another key aspect of this lab is making the I/O devices cooperate with the cache. As you have seen with your programs in the previous lab, every I/O access should actually poll the device rather than rely on a previously stored value, so that I/O accesses always reflect the current state of the device. To make this possible, your cache must allow I/O read and write operations to bypass the standard caching system and be sent directly to the I/O devices on the board.
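A minimal sketch of the bypass idea, again as a Python model rather than Verilog. The address IO_BASE and the assumption that everything at or above it is I/O are hypothetical; take the real address ranges from board.h. CounterDevice and BypassingCache are illustrative names, and the cache is simplified to one word per entry with no write-back logic.

```python
IO_BASE = 0x80000000  # hypothetical boundary; the real ranges are in board.h


def is_io_address(addr):
    """True if the access must go straight to an I/O device (assumed map)."""
    return addr >= IO_BASE


class CounterDevice:
    """Stand-in I/O device whose value changes on every read."""

    def __init__(self):
        self.reads = 0
        self.last_write = None

    def read(self, addr):
        self.reads += 1
        return self.reads

    def write(self, addr, value):
        self.last_write = (addr, value)


class BypassingCache:
    """Word-granularity cache model that never stores I/O accesses."""

    def __init__(self, io_device, memory):
        self.io_device = io_device
        self.memory = memory        # dict: address -> word
        self.lines = {}             # cached copies of memory words

    def read(self, addr):
        if is_io_address(addr):
            return self.io_device.read(addr)     # always poll the device
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]  # miss: fill from memory
        return self.lines[addr]

    def write(self, addr, value):
        if is_io_address(addr):
            self.io_device.write(addr, value)    # sent directly, not cached
        else:
            self.lines[addr] = value             # write-back to memory omitted


cache = BypassingCache(CounterDevice(), {0x100: 7})
print(cache.read(IO_BASE), cache.read(IO_BASE))  # two fresh device reads
print(cache.read(0x100))                         # ordinary cached read
```

In this model two consecutive reads of the same I/O address return different values because each one reaches the device, while a memory address is served from the stored copy after the first access.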
The third major consideration of this lab is making your processor interface correctly with the data and instruction caches. Since memory accesses can now take more than one cycle to complete, you will need to incorporate a new stalling mechanism into your processor to handle stall signals coming from the instruction and data caches while they are reading or writing main memory. During these stalls, you should bring the entire processor to a halt. This is different from bubbling: you will NOT insert a nop into the pipeline at the location of the stall. Instead, you will simply preserve the values in each pipeline register until the memory operation is completed.
Here is a list of things that you should keep in mind while designing your cache:
If you are working alone, proceed to Phase 1A for instructions on the cache that you will be constructing. If you are working with a partner, proceed to Phase 1B.

Phase 1A: Designing the Data Cache (Working Alone)
If you are working alone on this lab, you will be constructing a direct-mapped cache with 4-word lines. Your cache should store at least 16 lines of data. This implementation is relatively simple: your cache reads data in and evicts it based on the address. If a collision occurs, you will evict the existing line (taking care to write it back if necessary) and replace it with the new data from memory. To implement your cache, fill in the DataCache.v file provided with the board. Once you have completed this, continue on to Phase 1C.

Phase 1B: Designing the Data Cache (Group of 2)
If you are working with a partner on this lab, you will be constructing a 2-way set-associative cache with 4-word lines. Your cache should contain at least 16 lines divided into two ways; in other words, you will have 2 ways of 8 lines each (8 sets of 2 lines). This is slightly more complicated than writing a direct-mapped cache, as you will have to add logic to check both ways of the cache and decide from which way to evict data when a collision requires an eviction. To implement your cache, fill in the DataCache.v file provided with the board. Once you have completed this, continue on to Phase 1C.

Phase 1C: Modifying the processor
Now that you have data and instruction caches on the board, you must modify the processor to operate with these caches. The key feature that you will need to implement in your processor design is the ability to stall when a memory operation is in progress. Keep in mind when implementing this feature that a stall is NOT a bubble. You will not insert nops into the pipeline to effect a stall; rather, you will actually hold the state of each pipeline stage.
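The difference between holding state and bubbling can be sketched with a toy model in which the pipeline is just a list of four pipeline registers. This is an illustration in Python, not processor code. The function names, the register ordering [IF/ID, ID/EX, EX/MEM, MEM/WB], and the instruction labels are all assumptions made for the example.

```python
def step(pipeline, fetched, stall):
    """Advance the pipeline registers by one clock.

    pipeline is [IF/ID, ID/EX, EX/MEM, MEM/WB]; fetched is the instruction
    entering IF/ID. While a cache stall is asserted, every register keeps
    its value and the fetched instruction is not accepted.
    """
    if stall:
        return list(pipeline)                 # hold all state, insert nothing
    return [fetched] + pipeline[:-1]          # normal one-stage shift


def step_with_bubble(pipeline, position):
    """For contrast only: a bubble inserted at pipeline[position].

    Registers before `position` are held, a nop is written at `position`,
    and everything after it advances by one stage.
    """
    return pipeline[:position] + ["nop"] + pipeline[position:-1]


regs = ["i4", "i3", "i2", "i1"]               # i1 is the oldest instruction

print(step(regs, "i5", stall=True))           # unchanged: a cache stall
print(step(regs, "i5", stall=False))          # everything moves one stage
print(step_with_bubble(regs, 1))              # a nop appears in ID/EX
```

Under a stall the register contents are identical before and after the clock edge, so no instruction is lost or duplicated. The bubble version changes the pipeline contents, which is exactly what a cache stall should not do.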
There should not be any change in the state of the pipeline while the stall is in effect, and the pipeline resumes after the stall ends as though there had never been a stall.

Test Fixture
A test fixture that you can use to test your processor and data cache is available here. Add the files to your design, then add your processor to the test fixture board. Run the simulation for about 180us and see if the lights on the board have been blinking. If they have, your processor and cache should be functional. When synthesizing, do NOT include any files from this archive.

Phase 2: Implementing your design
Once you have modified your processor as necessary and have completed your caches, you will need to implement your design and put it on the board. The steps for this are the same as in previous labs, so please refer to them for more detailed instructions. First, synthesize your design. Use the top-level design board.bde. Make sure that you have included lib378 in the libraries for synthesis and that you have the compile parameter SYNTHESIS set. Also, before starting, confirm that you are synthesizing for a Virtex2P vp30ff896. Once synthesis has completed, implement your design using the Xilinx ISE as you have done in previous labs. Use the provided eclab.ucf file to specify the pin connections. When the bit file has been generated, you can put it on the board and transfer programs to it via the bootloader as before.

Phase 3: Programming for your new board
Implementing the data cache provides you with one key benefit when programming for your new processor design: you now have a lot more memory accessible to you. Since this design is capable of accessing the block RAMs on the board, there is a much larger range of memory available for your programs. However, some slight changes will have to be made to the board.h file that you are using, as the memory system on this board assigns the VGA controller to a different range of addresses.
Download the new board.h and replace the existing one.
Your programming assignment for this lab is to get the game that you have been working on to run on your processor. This game should be Pong at a minimum, and you are free to do anything more complicated or awesome that will run on the boards. There will be a contest for the best game at the end of the class, and the winner will receive some kind of prize.

Checkoff
When you are done, show your board running your program to Mark or a TA. If none of us is around, email to schedule a time when we can check you off for this lab.
Computer Science & Engineering, University of Washington, Box 352350, Seattle, WA 98195-2350, (206) 543-1695 voice, (206) 543-2969 FAX