CS152 Computer Architecture and Engineering

Homework #4 / Lab #4

Spring 2003, Prof John Kubiatowicz


Homework 4 due Wednesday 3/19 in class.  There will be a short quiz in class on that day.

Lab reports for Lab 4 are due Thursday 3/20 at midnight via the submit program.  You will demonstrate your lab to your TA on Thursday in section.  This is the first lab to be done in groups.  Please review this with your partners and get a tentative division of labor to your TA by midnight on Tuesday 3/11.  This means that you decide which components of the lab are going to be done by which partners.  This lab is rather long, so get started early!

Please put the TIME or TA NAME of the DISCUSSION section that you attend as well as your NAME and STUDENT ID. Homeworks and labs will be handed back in discussion.

Homework Policy: Homework assignments are due in class. No late homeworks will be accepted. There will be a short quiz in lecture the day the assignment is due; the quiz will be based on the homework. Study groups are encouraged, but what you turn in must be your own work.

Lab Policy: Lab reports are due at midnight via the submit program. No late labs will be accepted.

As decided in class, the penalty for cheating on homework or labs is no credit for the full assignment.


Homework 4

Please do the following problems from P&H: 5.2, 5.11, 5.17, 5.18, 5.20, 5.21, 5.22, 5.24, 5.27, 5.28, 5.29
Homework assignments should continue to be done individually.

Lab 4

In this assignment you will build a single-cycle datapath like the one discussed in class and in chapter 5, and verify that it executes a subset of the MIPS instruction set.  This lab has two goals: (1) getting you more familiar with Verilog and the mixing of Verilog and schematics.  (2) constructing a complete working single-cycle datapath.

This lab assignment is to be completed with your project partners. To help you build a successful machine, we will give you intermediate milestones for this assignment. Each group should elect a spokesperson for this lab. By Tuesday (3/11) at Midnight, the spokesperson for each group should e-mail your TA the assignments for each team member. The responsibility of the spokesperson is to communicate questions to the TA and reply to questions from the TA. For each lab, you will need to select a different spokesperson.

Note that all of the simulation in this lab will be done at the high-level (i.e. we will not map things down to gates or do a place-and-route).

Problem 1: Datapath Design

Problem 1a: High-Level components

The following components are provided for this assignment and future ones.  The Verilog descriptions of these components are located in the cs152 component lib (M:\lib\high-level).

  1. alu.v: arithmetic logical unit
  2. ramblock2048.v: memory of capacity 2048 words
There is a description of these components in the directory as well, called LAB4_Help.pdf.

Start by copying these files to your lab 4 directory.  Look at the top of ramblock2048.v.  Make two new top-level ramblocks, called "instmem()" and "datamem()", in the same style as the top-level ramblock2048(). (They will instantiate copies of ramblock2048b().)  For your instruction memory, use an initialization file called "instmem.contents".  For your data memory, use a file called "datamem.contents".  Make new files with these names by copying memory.contents from M:\lib\high-level.  Data from these files will be loaded into your memory at the beginning of the simulation.

Problem 1b: Datapath Components

For this first part of the lab, you should design the following components. Try to divide this work evenly among the various partners in your groups:
  1. bts32: 32-bit tri-state buffer
  2. extend: sign/zero extender
  3. m16x2, m32x2, m32x3, m32x5, m5x2: MUXes with various widths and number of inputs.
  4. reg32: 32-bit register
  5. regfile: register file
  6. shifter: arithmetic logical multibit shifting unit
All your Verilog components must incorporate delays!  For each component, think about how that component would be implemented using discrete gates.  Use this mental discrete gate implementation and the gate delays from the on-line components to estimate the delays to use.  There is no exact right or wrong answer for the delays to use; however, if the delays are too large or too small, you will be asked to justify your decision.  The delay models may be kept simple, i.e. one delay value can be used for an entire component (which should be the worst-case delay).  For the datapath controller you should use a delay of 20ns.
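
As an illustration of the "one delay value per component" model, here is a minimal sketch of the m32x2 MUX. The delay value MUX_DELAY and the 1ns time unit are assumptions made for the example, not numbers from the course library; you must estimate and justify your own.

```verilog
`timescale 1ns / 1ps

// Sketch: 32-bit, 2-input MUX with a single lumped worst-case delay.
// MUX_DELAY is a placeholder value chosen only for illustration.
module m32x2 (A, B, SEL0, DOUT);
    parameter MUX_DELAY = 2;        // assumed delay, in ns

    input  [31:0] A, B;
    input         SEL0;
    output [31:0] DOUT;

    // A is routed to DOUT when SEL0 = 0, B when SEL0 = 1.
    // Every change on A, B, or SEL0 reaches DOUT after MUX_DELAY.
    assign #(MUX_DELAY) DOUT = SEL0 ? B : A;
endmodule
```

Keeping the delay in a parameter makes it easy to list all component delays later and to change one without editing the logic.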

All the components should be implemented in Verilog. We suggest that you adhere to standards when building your modules -- it will make your life easier.  Make schematic symbols for each of your high-level components.

For instance, you should do the following:

  1. In all the components the label DOUT refers to the data output, and if there is only one data input (e.g., reg32), it is called DIN.
  2. If there is more than one data input, they are labeled A, B, C, ....  MUX output selectors are labeled SEL0, SEL1,..., with SEL0 representing the least significant bit of the selector. A is routed to the output when the select bits represent 0, B when they represent 1, and so on.
  3. Control signals, such as the write enable on reg32, have a suffix H or L to indicate whether they are active high or active low.
  4. All clock inputs to the modules are called CLK.
All modules that require a clock should be rising-edge triggered. Please look through the comments (lines preceded by //) at the beginning of the Verilog description for the components you have been given (again, Verilog files are located in the U:\cs152\lib\high-level directory).
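
The naming rules above could look like this for reg32. This is only a sketch: the exact spelling of the write-enable pin (WE_H here) and the clock-to-output delay CLK_TO_Q are assumptions, since the handout fixes only the H/L suffix convention and the names DIN, DOUT, and CLK.

```verilog
`timescale 1ns / 1ps

// Sketch: 32-bit rising-edge register following the suggested naming rules.
// WE_H (active-high write enable) and CLK_TO_Q are illustrative choices.
module reg32 (CLK, WE_H, DIN, DOUT);
    parameter CLK_TO_Q = 3;         // assumed clock-to-output delay, in ns

    input         CLK;
    input         WE_H;
    input  [31:0] DIN;
    output [31:0] DOUT;
    reg    [31:0] DOUT;

    // DIN is sampled at the rising edge of CLK; the new value appears
    // on DOUT CLK_TO_Q later. With WE_H low the old value is held.
    always @(posedge CLK)
        if (WE_H)
            DOUT <= #(CLK_TO_Q) DIN;
endmodule
```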

Problem 1c: subcomponent test benches

Design test-benches for each of your components.  A test bench is a top-level module, written in Verilog, that incorporates the module under test with structural Verilog.  The main body of the test-bench should be a process which steps through a number of test vectors, verifying the results as it goes.  Your test-bench should test as many cases as you can think of.   Try to imagine strange things that might happen to a real (non-Verilog) implementation of a component.   One possible diagnostic is to have your test-bench output the bad cases to a diagnostic file, using the file I/O primitives (look at section 17.2 of the IEEE spec on the handouts page).

Each component will have its associated test bench.  Thus, a test bench for a given component should be written by the same person that wrote that component.  Note that these test benches are only used during the testing phase of your components and are not wired into your processor.  However, you may reuse the test benches later in the term if you try more complicated implementations of these components.

Turn in the output from your test-benches that illustrate that your components are working.  You should be able to simulate your Verilog code by itself, without using the schematic capture tool.
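
A test bench in this style might look like the sketch below. The extend module shown here, its SIGNED_H pin, its 2ns delay, and the log file name are all hypothetical and are included only so that the example is self-contained; substitute your own component and port names.

```verilog
`timescale 1ns / 1ps

// Hypothetical sign/zero extender, present only to give the bench a DUT.
module extend (DIN, SIGNED_H, DOUT);
    input  [15:0] DIN;
    input         SIGNED_H;         // 1 = sign extend, 0 = zero extend
    output [31:0] DOUT;
    assign #(2) DOUT = { {16{SIGNED_H & DIN[15]}}, DIN };
endmodule

// Test bench: instantiates the module structurally and steps through vectors.
module extend_tb;
    reg  [15:0] din;
    reg         signed_h;
    wire [31:0] dout;
    integer     log;                // descriptor of the diagnostic file
    integer     errors;

    extend dut (.DIN(din), .SIGNED_H(signed_h), .DOUT(dout));

    // Apply one vector, wait for the output to settle, compare.
    task check;
        input [15:0] d;
        input        s;
        input [31:0] expected;
        begin
            din = d;
            signed_h = s;
            #10;
            if (dout !== expected) begin
                errors = errors + 1;
                $fdisplay(log, "FAIL: DIN=%h SIGNED_H=%b DOUT=%h expected=%h",
                          d, s, dout, expected);
            end
        end
    endtask

    initial begin
        errors = 0;
        log = $fopen("extend_tb.log");
        check(16'h0001, 1'b0, 32'h00000001);
        check(16'h8000, 1'b0, 32'h00008000);    // zero extend, MSB set
        check(16'h8000, 1'b1, 32'hFFFF8000);    // sign extend, negative
        check(16'h7FFF, 1'b1, 32'h00007FFF);    // sign extend, positive
        check(16'hFFFF, 1'b1, 32'hFFFFFFFF);
        $display("extend_tb finished with %0d error(s)", errors);
        $fclose(log);
        $finish;
    end
endmodule
```

The comparison uses !== rather than != so that an unknown (x) output counts as a failure instead of being silently skipped.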

Problem 1d: assemble the data path

Using the components listed above, build a single-cycle datapath (with schematics) that implements the following subset of MIPS instructions: addu, addiu, subu, ori, lw, lui, sll, srl, slt, sw, jr, beq (these are a subset of the instructions for Lab #2). Make sure that your design is clean -- you will be breaking it apart to build a pipelined processor in future assignments. In this section, you don't have to put in the PC and instruction memory. Just set the values of the 32-bit instruction and the control signals in the simulator and test your datapath.

Problem 1e: Test your data path

To test your datapath, build a test-bench in the following way.  Build a top-level Verilog file that incorporates your datapath and has a single initial block containing a series of test cases.  This module should drive each of the control signals and the instruction value, and observe some set of result signals (such as the destination register, the equal? signal, etc.).  Each of these test cases should be of the form:

initial
begin
    // Test case #1: Make sure that register read works
    A = something; B = somethingelse; CLK = 0;
    #100;  // Wait for datapath to settle
    if (Foo != something || ...)
        $display("Failure for Test case #1\n");

    // Test case #2, ...

    .....
end

To get each test case, you could:

  1. Write a simple diagnostic program to test each of the instructions. Put the source in a .s file (e.g. 'foo.s').
  2. Run mipsasm (mipsasm is in M:\bin\mipsasm.exe) to create a .mem file.
  3. Manually convert your .mem file into vectors for your datapath.  You will also need to add some other control (such as setting up a clock).
Turn in a copy of your diagnostic program and a log of your simulation showing that the instructions worked. The objective of this question is to make sure that you have all the parts of the datapath in place to execute all the instructions and you have a clear idea of what the control signals have to be for each instruction. The testing need not be exhaustive, but should be complete.
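
When converting instructions into vectors by hand, it can help to build each 32-bit word from its fields. The sketch below is an illustration only; the module name encode_example is made up. It assembles two instructions from the standard MIPS R-type and I-type field layouts and prints them so they can be compared against the mipsasm output.

```verilog
// Sketch: building instruction vectors from MIPS fields.
module encode_example;
    reg [31:0] inst;

    initial begin
        // addu r3, r4, r5
        // R-type: opcode=0, rs=4, rt=5, rd=3, shamt=0, funct=0x21
        inst = {6'h00, 5'd4, 5'd5, 5'd3, 5'd0, 6'h21};
        $display("addu r3, r4, r5 = 0x%h", inst);   // 0x00851821

        // lw r2, 8(r1)
        // I-type: opcode=0x23, rs=1 (base), rt=2 (dest), imm=8
        inst = {6'h23, 5'd1, 5'd2, 16'd8};
        $display("lw r2, 8(r1)    = 0x%h", inst);   // 0x8c220008
    end
endmodule
```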
 

Problem 2: Verilog Controller for the Datapath

Problem 2a: the actual controller

Now we are ready to build the controller for the datapath you built in problem 1. You will use Verilog to create the control module. Make this into a module with pins labeled appropriately and connect it to the datapath you built in problem 1. This controller should take the bits from the 32-bit instruction and implement a combinational circuit that generates the control signals. You should also add in the instruction memory, PC and nPC, and the next instruction logic here.
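
One possible shape for such a combinational controller is sketched below. The output names (REGWR_H, MEMWR_H, ALUSRC_H, BRANCH_H) are hypothetical, and the set of signals is deliberately incomplete: a real controller for your datapath will also need ALU operation, extender, write-back, and jump controls. The opcode and funct values are the standard MIPS encodings, and the 20ns delay is the controller delay given in Problem 1b.

```verilog
`timescale 1ns / 1ps

// Sketch: combinational decode of a few control signals.
// Signal names are illustrative; your datapath defines the real set.
module control (INST, REGWR_H, MEMWR_H, ALUSRC_H, BRANCH_H);
    input  [31:0] INST;
    output        REGWR_H;      // write the register file
    output        MEMWR_H;      // write data memory
    output        ALUSRC_H;     // ALU operand B comes from the immediate
    output        BRANCH_H;     // instruction is beq

    wire [5:0] op    = INST[31:26];
    wire [5:0] funct = INST[5:0];
    reg  [3:0] ctl;             // {REGWR, MEMWR, ALUSRC, BRANCH}, undelayed

    always @(op or funct) begin
        case (op)
            // R-type: addu, subu, slt, sll, srl write rd; jr (funct 0x08) does not
            6'h00:   ctl = (funct == 6'h08) ? 4'b0000 : 4'b1000;
            6'h09,                          // addiu
            6'h0d,                          // ori
            6'h0f,                          // lui
            6'h23:   ctl = 4'b1010;         // lw
            6'h2b:   ctl = 4'b0110;         // sw
            6'h04:   ctl = 4'b0001;         // beq
            default: ctl = 4'b0000;         // unknown: change no state
        endcase
    end

    // Single lumped controller delay of 20ns.
    assign #(20) {REGWR_H, MEMWR_H, ALUSRC_H, BRANCH_H} = ctl;
endmodule
```

Defaulting unknown opcodes to "write nothing" keeps an undecoded instruction from corrupting registers or memory while you debug.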

Problem 2b: The disassembly monitor module

Next, build a special Verilog module that takes as input the clock, the instruction address, the output of the instruction memory, the output of the registers, and the destination register.  This will be a monitor module which will monitor the execution of your processor.  It should output information in textual format, using the "$display" statement or using file I/O to output to a diagnostic trace file.  So, the output of your monitor should be something like:

    0x00000000: addu r3, r4, r5       R[r4]=0x34235432, R[r5]=0x45557322, R[r3]=0x7978C754
    0x00000004:  <etc>

Your monitor should be able to disassemble the complete set of instructions above and print out the course of execution.  This monitor will show you in readable form that your processor is "doing the right thing".  It should print a new line at the falling edge of the clock. Note that the $display statements are much like a C printf() statement, allowing fairly extensive formatting.

Wire your monitor module into your datapath.  Note that, later in the term, you will have to add some pipelining
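
A minimal sketch of such a monitor follows. It decodes only addu and prints everything else as a raw word; the port names (PC, INST, RS_DATA, RT_DATA, RD_DATA) are assumptions for illustration, and a complete monitor would add one case per instruction in the subset.

```verilog
// Sketch: disassembly monitor that prints one line per falling clock edge.
// Port names are hypothetical; only addu is decoded here.
module monitor (CLK, PC, INST, RS_DATA, RT_DATA, RD_DATA);
    input        CLK;
    input [31:0] PC;            // instruction address
    input [31:0] INST;          // output of the instruction memory
    input [31:0] RS_DATA;       // register file read port for rs
    input [31:0] RT_DATA;       // register file read port for rt
    input [31:0] RD_DATA;       // value being written to the destination

    wire [5:0] op    = INST[31:26];
    wire [4:0] rs    = INST[25:21];
    wire [4:0] rt    = INST[20:16];
    wire [4:0] rd    = INST[15:11];
    wire [5:0] funct = INST[5:0];

    // By the falling edge the datapath values for this cycle have settled,
    // provided the clock period is long enough.
    always @(negedge CLK) begin
        if (op == 6'h00 && funct == 6'h21)
            $display("0x%h: addu r%0d, r%0d, r%0d       R[r%0d]=0x%h, R[r%0d]=0x%h, R[r%0d]=0x%h",
                     PC, rd, rs, rt, rs, RS_DATA, rt, RT_DATA, rd, RD_DATA);
        else
            $display("0x%h: <not decoded in this sketch> 0x%h", PC, INST);
    end
endmodule
```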

Problem 2c: Top-level driver

Wire together your datapath, control path, memory modules, and monitor modules.  Make sure to keep track of the instance names for your memory modules.  These will form a complete path that you can use in the simulator to examine memory during simulation.  For instance, let's say that you called the ramblock2048b in your instmem() module "tempBlock" and then called the block "MyInstMem" at the data path level.  Then, you will be able to refer to the memory block as "MyInstMem/tempBlock/inst/mem" in ModelSim.  In fact, you will use the "show" command to deal with this.  Note that the middle "inst" and "mem" pieces are not under your control (they are inside the "ramblock2048b" and "BLKMEMSP_V4_0" blocks).

Make a top-level module to drive your processor.  It should have a clearly-defined clock generator defined as an always loop with a delay statement (like #100).  This is the file that we will potentially place breakpoints in from ModelSim.  For instance, you could do:

    always
    begin
        #clock_half_period;   // clock_half_period: a parameter you define
        CLK = 1;
        #clock_half_period;
        CLK = 0;
    end

Then, set breakpoints at one of the lines that set the clock.

Problem 2d: debugging your processor

The LAB4_Help.pdf file in m:\lib\high-level directory describes how to load information into your instruction memory.  (Recall that in 1a we made the memory contents be "instmem.contents" and "datamem.contents").  You should be able to go into the command mode of ModelSim to set breakpoints at points of your clock cycle and thereby step cycle by cycle.  (Note that the mipsconvert.exe program is in M:\bin)

You should write a diagnostic program to test this mini-MIPS processor; this program may be a subset of the program you used for Lab #2 (broken-spim).  Remember, a single-cycle processor does not have delay slots, and you have not implemented the complete subset of instructions from Lab #2. (Later on, you will be adding more instructions to your processor and will be able to use the complete diagnostic program.) Your program should be written such that the result of the test is stored in memory, which you can peek at using the show command under ModelSim (remember that variables can be referred to by their instance names). In general, you will not have access to the contents of the registers like you did in spim. It is important to write good diagnostic programs to test your hardware because it will simplify your life later on when you have to debug a more complicated processor.

We have provided one diagnostic program for you to use in M:\lab4.  This is similar to the type of program that we will run in Section to test your processor (although we will use something different).

Other types of monitor modules may help you debug your processor.  You can produce a set of modules that simply check for conditions that should "never happen" and output

Turn in a copy of your Verilog code, processor schematic, diagnostic program(s), a simulation log that shows the correct execution of the program, and your on-line logs.

Problem 3: Delays and Maximum Clock Rate

Now that we have a functional datapath, we will investigate the performance of this processor.  Find the critical path for your processor and determine the maximum clock rate. Run a diagnostic program at this clock rate to make sure it works.

Turn in a list of all your components with associated delays.  What is the critical path and maximum clock rate? What happens if you increase the clock rate and then try to run the diagnostic program?
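
The arithmetic behind a critical-path estimate can be sketched as follows. Every delay below is a made-up placeholder except the 20ns controller delay from Problem 1b, and the path shown (a load, with control decode and register read overlapping) is only an assumed candidate; your own critical path depends on your design and your delay values.

```verilog
// Sketch: summing component delays along one candidate path (a load).
// All values are in ns and, apart from T_CTRL, are placeholders.
module critical_path_example;
    parameter T_CLK_TO_Q = 3;       // PC register clock-to-output
    parameter T_IMEM     = 20;      // instruction memory read
    parameter T_CTRL     = 20;      // controller (value from the handout)
    parameter T_REGREAD  = 10;      // register file read
    parameter T_ALU      = 15;      // address calculation
    parameter T_DMEM     = 20;      // data memory read
    parameter T_MUX      = 2;       // write-back MUX
    parameter T_SETUP    = 2;       // register file setup time

    // Control decode and register read start together after the fetch,
    // so only the slower of the two lies on the path.
    parameter T_DECODE = (T_CTRL > T_REGREAD) ? T_CTRL : T_REGREAD;

    integer period_ns;

    initial begin
        period_ns = T_CLK_TO_Q + T_IMEM + T_DECODE + T_ALU
                  + T_DMEM + T_MUX + T_SETUP;
        // 1000 / period_ns converts a period in ns to a rate in MHz
        // (integer division, so the result is rounded down).
        $display("candidate path: %0d ns, max clock rate about %0d MHz",
                 period_ns, 1000 / period_ns);
    end
endmodule
```

Repeating this sum for each instruction class and taking the largest total gives the minimum clock period, which you can then confirm by running the diagnostic program at that period and at a slightly shorter one.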
 

Problem 4: Write Up

Make sure to structure your laboratory report to be easily readable.  If you have really complicated listings or schematics, you can include them in an appendix.  Make sure to discuss your testing philosophy for Problems 2 and 3 in detail.  Also, turn in a copy of your laboratory notebook.