CS 310
4/26/2004
Handout #56
C. Edmondson-Yurkanan

Lecture Notes for RTL and Control given a specific microarchitecture (v 2.0)
(for 2 lectures)

I. The CPU's MICROARCHITECTURE consists of 2 components:
   * datapath, consisting of combinational logic and registers
   * control (the finite state machine with finite inputs, finite outputs, and the current state register)

II. Introduction to RTL (register transfer level, aka register transfer language)

- Overview
   - RTL describes exactly what happens on a cycle-by-cycle basis
   - it is essentially the "program" for executing an instruction
   - in each cycle information is transferred from 1 or more src registers to 1 or more dest. registers (by passing through combinational logic)

- RTL (register transfer language)
   - Implementation of instruction (step-by-step)
   - Each step corresponds to a clock cycle on the datapath (we will assume that non-memory steps take 1 clock cycle (cc); Patt/Patel assume that a memory read takes 5 cc's once MAR has the address and while the memory read control signal is asserted)
   - Each step reads from one set of register(s) and writes to another set of register(s). Note the parallelism implied here.
   - Assume that registers are implemented with master-slave flip-flops, such that a register can only be read in the first half of the clock cycle, and written in the second half of the clock cycle.
   - While a microarchitecture might have parallelism, it must avoid resource hazards (i.e. only one value can be on one section of the datapath/bus at a time)

III. Review: The Phases of the LC-3 fetch-execute instruction cycle:
   - FETCH (fetch next instruction to execute)
   - DECODE (figure out what it is supposed to do)
   - FETCH OPERANDS (read register file)
   - EVALUATE ADDRESS (compute memory address, i.e. effective address, for instr. with a memory addressing mode)
   - EXECUTE (compute using ALU)
   - STORE RESULT (write result into register file or memory)

   NOTE: very, very, very few instructions require all 6 phases

IV. RTL Examples

**Example: ADD Rdst, Rsrc1, Rsrc2

   Phase 1:      MAR <- PC, PC <- PC + 1      (1 clock cycle)
                 MDR <- MEM[MAR]              (5 clock cycles)
                 IR <- MDR                    (1 clock cycle)
   Phase 2:                                   (1 clock cycle)
   Phase 3&5&6:  GPR[IR[11:9]] <- GPR[IR[8:6]] + GPR[IR[2:0]],
                 NZP <- setCC(the sum that is on the global system bus)
                                              (1 clock cycle)

- Let's take a look at exactly how the ADD progresses through the datapath.
   - Global Bus is used to send PC to MAR; at the same time PC is read, incremented, & modified only at the "end" of the clock cycle
   - Memory access is local to the main memory microarchitecture
   - Global Bus is used to send the instruction from MDR to IR
   - Datapath is used to send values from the register file to the ALU, producing one ALU result which is then put on the global system bus to store into the register file.
     NOTE that the NZP condition code bits will also be set based on the value that is on the global system bus.

--------------------------------------------------------------------------

**Example: LD

   MAR <- PC, PC <- PC + 1
   MDR <- MEM[MAR]
   IR <- MDR
   MAR <- PC + SEXT( IR[8:0] )
   MDR <- MEM[MAR]
   GPR[ IR[11:9] ] <- MDR, NZP <- setCC( MDR )

   How many clock cycles did this instruction take?

-----------------------------------------------------------------------------
-----------------------------------------------------------------------------
2nd lecture

I. Quick review by finishing the BRz and ST instructions

**Example: BRz

   MAR <- PC, PC <- PC + 1
   MDR <- MEM[MAR]
   IR <- MDR
   (CCz == 1) ? "go to next step" : "done with this instruction"
   PC <- PC + SEXT( IR[8:0] )

**Example: ST

   MAR <- PC, PC <- PC + 1
   MDR <- MEM[MAR]
   IR <- MDR
   MAR <- PC + SEXT( IR[8:0] )
   MDR <- GPR[ IR[11:9] ]
   MEM[MAR] <- MDR

-------
CONTROL:

II. Introduction to control signals and control logic

To actually implement an instruction (following our RTL for each clock cycle) the control signals must be specified appropriately to make the datapath obey the RTL.
For this we need to actually name all of the control signals so that we can describe whether each is set to 1 or 0. Check out Figure C.3 on page 570 (datapath augmented with control signal names).

The general convention is as follows:

signals of form LD.?? = (e.g. LD.PC) functions as a "datapath_register write_enable signal" which, when asserted, causes the register to "load" the data from its input

signals beginning with Gate?? = (e.g. GatePC) this is the signal to control a tri-state device
   - A tri-state device has 2 inputs: control and data

        control   data   output
           1        0       0
           1        1       1
           0        0       -
           0        1       -

   - Control signal selectively connects the input (data) to the output
   - if control signal is zero, then no connection
   - if control signal is one, then the driver can drive a zero or one

   ???What happens if more than one driver is connected to the bus?
   ???Whose responsibility is it to guarantee that only one driver is connected to the bus?

signals ending in ??MUX = (e.g. MARMUX) mux control signals. Typical naming convention in the datapath for a mux: IDEALLY its inputs are labeled with the least significant input on the right and the most significant one on the left. Thus for a four-input mux the order of inputs from left to right is: 11 10 01 00

R.W = specifies whether the memory access is to be a read or a write

MIO.EN = 1, "if this clock cycle is to access Memory"; or = 0, "if the cycle is accessing a memory-mapped IO register"

ALUK = choose function for ALU (ADD, AND, NOT, "pass through")

Set.Priv = Mode of process currently running: Supervisor=0, User=1

III. Finite State Machines

- We've already studied FSMs, so now let's map one to the "control" box in the CPU diagram in Figure C.3
- Remember that an FSM is a model of computation consisting of: (for the LC-3 FSM see the "control" in Figure C.1)
   - a finite number of states (e.g. 52 states without interrupts)
   - a start state (state 18 in Figure C.2)
   - transition function (state transition diagram) --- see Figure C.2
   - 2 types of input:
      (1) external inputs (typically for a CPU, it's the entire IR)
          Figure C.1 shows the LC-3 detailed inputs that are needed to generate output from the FSM combinational logic, which are:
          * IR[15:11], i.e. the opcode bits IR[15:12] plus IR[11],
          * interrupt signal, i.e. did an interrupt just occur,
          * PSR[15], i.e. is this supervisor or user mode,
          * R, an indication that a memory access has completed and is "ready",
          * BEN (conditional branch condition is T), i.e. the condition code bits have been tested and one is True
      (2) current state: (10 bits)
          * while P/P divide them up into 3 parts, think of it as just an enumeration of the state number
   - 2 types of output:
      (1) external (39 bits of control signals generated by the FSM combinational logic)
      (2) next state value (10 bits generated by the FSM combinational logic for "next state")

IV. Let's look at the RTL for the first clock cycle of all instructions and itemize all 39 control signals (ignoring interrupts and PSR)

   RTL                          Control signal and nextstate settings
   ---                          -------------------------------------
1) MAR <- PC, PC <- PC + 1      GatePC=YES(1), LD.MAR=LOAD(1), LD.PC=LOAD(1), PCMUX=PC+1(0),
                                all other Gates=NO(0), all other Loads=NO(0), MIO.EN=NO(0),
                                all other muxes/signals = "don't care"
                                and next_state = 33
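
The settings for step 1 can be tried out with a small Python sketch. Only the signal values and next_state = 33 come from the table above; the dictionary layout, the helper names (bus_drivers, step), and the choice to model just PC and MAR are assumptions made for illustration. The sketch checks that at most one Gate signal drives the bus, and it reads every register before writing any, mimicking the master-slave "read in the first half, write in the second half" behavior.

```python
# Control settings for step 1 (MAR <- PC, PC <- PC + 1), taken from the table.
# Signals not listed are treated as 0 (not asserted) or "don't care".
FETCH_STEP_1 = {
    "GatePC": 1,   # PC drives the global bus
    "LD.MAR": 1,   # MAR loads from the bus
    "LD.PC": 1,    # PC loads from PCMUX
    "PCMUX": 0,    # PCMUX selects PC + 1
    "MIO.EN": 0,   # no memory access in this cycle
}
NEXT_STATE = 33


def bus_drivers(control):
    """Return the names of all asserted Gate signals (hypothetical helper)."""
    return [name for name, value in control.items()
            if name.startswith("Gate") and value == 1]


def step(regs, control):
    """Apply one clock cycle to regs and return the new register values.

    Illustrative model only: it knows about GatePC, LD.MAR, LD.PC and
    PCMUX=0, which is all that step 1 needs.
    """
    drivers = bus_drivers(control)
    if len(drivers) > 1:
        raise ValueError("bus conflict, multiple drivers: %s" % drivers)

    # First half of the cycle: read the old register values.
    bus = regs["PC"] if control.get("GatePC") == 1 else None
    pc_plus_1 = (regs["PC"] + 1) & 0xFFFF  # LC-3 addresses are 16 bits

    # Second half of the cycle: write the destination registers.
    new_regs = dict(regs)
    if control.get("LD.MAR") == 1:
        if bus is None:
            raise ValueError("LD.MAR asserted but nothing drives the bus")
        new_regs["MAR"] = bus
    if control.get("LD.PC") == 1:
        if control.get("PCMUX", 0) != 0:
            raise NotImplementedError("only PCMUX=0 (PC + 1) is modeled")
        new_regs["PC"] = pc_plus_1
    return new_regs


if __name__ == "__main__":
    before = {"PC": 0x3000, "MAR": 0x0000}
    after = step(before, FETCH_STEP_1)
    print("MAR = x%04X, PC = x%04X, next_state = %d"
          % (after["MAR"], after["PC"], NEXT_STATE))
```

Note that MAR receives the old PC value even though PC is incremented in the same cycle, which is the parallelism the RTL line implies. Asserting a second Gate signal in the same control word makes step raise an error, which is one way to picture the bus question raised in the tri-state discussion.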