Experiment: RISC-V pipelined processor design and implementation
1. Pipeline design
![TheStructureChart]
2. RTL files
The RTL files are in the root directory folder.
3. Pipeline implementation approach
(1) Complete the design of each component
1. Design of the operation unit
This includes adder, alu, data_mem, imm_gen and the mux modules, all of which are fairly simple.
imm_gen can be implemented by following the content of the "RISC_V_Experiment-2022" PPT.


2. Design of the storage unit
3. Design of the control unit
①proc_controller
proc_controller is the main control unit and one of the most important modules.
A figure in the PPT "2021-Chapter 4 Lecture 3-Single-cycle Controller" gives some methods for designing the control signals of the main control unit.
Some tips for control signal design are also given in the proc_controller file:
(ps: The jal_sel variable is added here because the branch unit has to treat jal specially, as discussed under branch_unit below.)
(pss: The ALUOp encoding here differs from the one Mr. Fu gave at the beginning. RType is separated from IType, and loads and stores are separated from addi, ori and jalr, because I think this makes alu_controller easier to design.)
Here, the opcodes of the instructions to be implemented are first given names as variables, which makes the later code easier to write.
②branch_unit
branch_unit determines the address of the next instruction. If no jal, jalr or taken branch is executed, the next address is pc+4. Otherwise, branch_target is set to the jump address of the corresponding instruction. The jal_sel variable is added here because jal and jalr jump differently: jalr also uses a source register. As shown in the figure:
(ps: Strictly speaking, the jalr result should not be placed in the pc_plus_imm variable. That variable should hold only the pc+imm of jal and branch, whereas the jalr target is rs1+imm.)
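The separation suggested in the note above could look roughly like the following. This is only a sketch under assumptions: the module name `branch_target_sketch` and the signals `rs1_data`, `branch_hit` and `next_pc` are hypothetical and do not appear in the project files, and the widths are fixed at 32 bits for simplicity.

```systemverilog
// Hypothetical sketch, not the project's BranchUnit:
// the jalr target is kept in its own signal instead of pc_plus_imm.
module branch_target_sketch (
    input  logic [31:0] pc,           // address of the instruction in EX
    input  logic [31:0] imm,          // sign-extended immediate
    input  logic [31:0] rs1_data,     // value of rs1 (assumed input, used only by jalr)
    input  logic        jalr_sel,
    input  logic        jal_sel,
    input  logic        branch_hit,   // assumed: branch instruction whose condition holds
    output logic [31:0] pc_plus_imm,  // jal / branch target only
    output logic [31:0] next_pc
);
    logic [31:0] jalr_target;

    always_comb begin
        pc_plus_imm = pc + imm;
        jalr_target = (rs1_data + imm) & ~32'd1;  // rs1 + imm with bit 0 cleared

        if (jalr_sel)                   next_pc = jalr_target;
        else if (jal_sel || branch_hit) next_pc = pc_plus_imm;
        else                            next_pc = pc + 32'd4;
    end
endmodule
```

With this split, pc_plus_imm keeps a single meaning for every instruction, at the cost of one extra adder for the jalr target.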
③alu_controller
(ps: The opcode is added as an extra input here in order to tell the funct fields apart from the imm field in IType and UType instructions. For example, UType has no funct field, but the bits of imm that occupy the same positions may happen to equal a funct value.)
The output operation codes can be chosen freely, as long as the operation that each 4-bit code is expected to perform matches the operation that the alu file performs for the same alu_ctrl value.
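One way to keep the two files in agreement is to define the codes once and import them in both places. The sketch below is an assumption, not part of the project: the package name `alu_ops_pkg`, the type `alu_op_e` and the `ALU_*` names are hypothetical, while the 4-bit values are the ones used in the alu and alu_controller listings of this document.

```systemverilog
// Hypothetical shared definition of the ALU operation codes.
// The values mirror the case labels in the alu file.
package alu_ops_pkg;
    typedef enum logic [3:0] {
        ALU_AND  = 4'b0000,
        ALU_OR   = 4'b0001,
        ALU_ADD  = 4'b0010,
        ALU_MUL  = 4'b0011,
        ALU_SUB  = 4'b0110,  // also used by beq
        ALU_XOR  = 4'b0111,
        ALU_SLT  = 4'b1000,
        ALU_BNE  = 4'b1001,
        ALU_BLT  = 4'b1010,
        ALU_BGE  = 4'b1011,
        ALU_BLTU = 4'b1100,
        ALU_BGEU = 4'b1101
    } alu_op_e;
endpackage
```

After `import alu_ops_pkg::*;`, the controller could write `operation = ALU_ADD;` and the ALU could use `ALU_ADD:` as a case label, so a mismatch between the two files becomes a compile error rather than a simulation bug.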
4. Resolving hazards
①Structural hazards
In this experiment, storage such as the register file is accessed by several instructions at the same time, but there appears to be no structural hazard or resource conflict. For example, the register file has separate read and write ports, so it can be read and written in the same cycle.
②Data hazards
Stalling and forwarding, implemented in the hazard_detector and forwarding_unit files, are used to resolve read-after-write (RAW) hazards.
The design follows this diagram from the PPT "2021-Chapter 4, Lecture 4-Pipeline Principle".
That is, as shown in the figure:
If a data hazard is detected while the current instruction is being decoded in the ID stage (the forwarding code is attached later),
then the instruction in ID is stalled: the pc value does not change, so the same instruction is issued again in the next cycle. The ALU operands pass through the ALU data selectors shown in the figure below, which replace the not yet updated value of a source register with the result of an earlier instruction that has not been written back:
At this point, the stall operation shown on the PPT has been implemented.
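The stall condition described above (a load in EX whose destination is a source of the instruction in ID) can be sketched as purely combinational logic. This is a variant under assumptions, not the project's Hazard_detector: the module name `load_use_stall_sketch` is hypothetical, the output is not registered, and a check that the destination is not x0 has been added.

```systemverilog
// Hypothetical combinational load-use detector.
// Port names follow the Hazard_detector listing; the x0 check is an addition.
module load_use_stall_sketch (
    input  logic [4:0] if_id_rs1,      // rs1 of the instruction in ID
    input  logic [4:0] if_id_rs2,      // rs2 of the instruction in ID
    input  logic [4:0] id_ex_rd,       // destination of the instruction in EX
    input  logic       id_ex_memread,  // instruction in EX is a load
    output logic       stall
);
    assign stall = id_ex_memread
                && (id_ex_rd != 5'd0)  // x0 never creates a dependency
                && ((id_ex_rd == if_id_rs1) || (id_ex_rd == if_id_rs2));
endmodule
```

For example, with a load writing x5 in EX and an instruction reading x5 in ID, `stall` is 1 for one cycle, after which the loaded value can be forwarded from the MEM/WB stage.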
The forwarding code is listed under forwarding_unit below.
③Control hazards
There are three basic approaches to resolving control hazards, as shown in the figure:
This experiment adopts the first method, that is, flushing the instructions fetched after the branch instruction. First, if a branch instruction appears and its jump condition is satisfied (or the instruction is jal or jalr), branch_unit sets its pc_sel output to 1, and data_path uses this output as the BrFlush control signal:
pc_sel = ((branch_taken && alu_result == 0) || jalr_sel || jal_sel) ? 1'b1 : 1'b0;
Then the instructions behind it in the pipeline are squashed.
The instruction that has just been fetched is squashed by the code shown in the figure below, which "flushes" the IF/ID register to "empty":
The instruction that follows it in the pipeline is squashed by the following code, which turns it into an "empty instruction" (a bubble):
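The flushing of a pipeline register can be illustrated in isolation as follows. This is a minimal sketch, assuming a generic register with a hypothetical name (`pipe_reg_sketch`) and a hypothetical `flush` input; in the project itself the same effect is written directly into the IF/ID and ID/EX update blocks of data_path, and the priority between flush and stall chosen here is an assumption.

```systemverilog
// Hypothetical pipeline register with stall and flush.
// An all-zero word acts as a bubble because every write enable in it is 0.
module pipe_reg_sketch #(
    parameter int WIDTH = 32
) (
    input  logic             clock,
    input  logic             reset,
    input  logic             flush,  // e.g. driven by BrFlush
    input  logic             stall,  // hold the current contents
    input  logic [WIDTH-1:0] d,
    output logic [WIDTH-1:0] q
);
    always_ff @(posedge clock) begin
        if (reset || flush) q <= '0;  // flush wins over stall in this sketch
        else if (!stall)    q <= d;
    end
endmodule
```

Clearing the control bits is what matters: a bubble must not write the register file or the data memory, so it passes through the remaining stages without any visible effect.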
Once branch_unit has computed branch_target, the address of the next instruction to be fetched is set to branch_target by the code in the following figure:
(2) Connect the components (data_path)
1. Design the IF unit and pass the current instruction address and the undecoded instruction to the IF/ID register.
①If reset is asserted, the pc is initialized to 0. If stall is asserted, the next instruction is not fetched and the PC remains unchanged.
PC <= reset ? 0 : stall ? PC : PC_mux_result;
②Select the next PC according to BrFlush, which is driven by the output of branch_unit.
assign PC_mux_result = BrFlush ? BrPC : PCplus4;
③ Pass the current instruction address and the undecoded instruction to the IF/ID register
(ps: These two assignments take effect only when the pipeline is not stalled.)
2. Understanding of data_path
The code structure of this file is very clear. In short, it follows the structure of the figure below:
data_path

    `timescale 1ns / 1ps
    // data path
    `include "pipeline_regs.sv"
    import Pipe_Buf_Reg_PKG::*;

    module Datapath #(
        parameter PC_W       = 9,   // width of the Program Counter
        parameter INS_W      = 32,  // Instruction Width
        parameter DATA_W     = 32,  // register width
        parameter DM_ADDRESS = 9,   // width of the data memory address
        parameter ALU_CC_W   = 4    // width of the ALU Control code
    ) (
        input  logic clock,
        // reset, sets the PC to zero
        input  logic reset,
        // Register file writing enable
        input  logic reg_write_en,
        // Selection signal for the source of value used to write the register
        input  logic WrRegDataSrc,
        // Register file or Immediate MUX
        input  logic alu_src,
        // Memory Writing Enable
        input  logic mem_write_en,
        // Memory Reading Enable
        input  logic mem_read_en,
        // Branch Enable
        input  logic branch_taken,
        // Jalr Mux Select
        input  logic jalr_sel,
        // Jal select, distinguishes jal from jalr in the branch unit
        input  logic jal_sel,
        input  logic [1:0] alu_op,
        // Mux4to1 Select
        input  logic [1:0] RWSel,
        // ALU Control Code (input of the ALU)
        input  logic [ALU_CC_W-1:0] alu_cc,
        output logic [6:0] opcode,
        output logic [6:0] opcode_IDEX,
        output logic [6:0] funct7,
        output logic [2:0] funct3,
        output logic [1:0] aluop_current,
        // data written back to the register file
        output logic [DATA_W-1:0] wb_data
    );

        // define the pipeline registers
        if_id_reg  PipeRegIFID;
        id_ex_reg  PipeRegIDEX;
        ex_mem_reg PipeRegEXMEM;
        mem_wb_reg PipeRegMEWB;
        /* todo: analysis here
         * the registers are declared here.
         * each struct in pipeline_regs.sv is a set of registers
         */

        // ====================================================================
        // Instruction Fetch (IF)
        // ====================================================================
        //
        // peripheral logic here.
        //
        logic BrFlush, stall;
        logic [31:0] PC_mux_result, PC, PCplus4, BrPC, instr;

        // PC generation
        assign PCplus4       = PC + 4;
        assign PC_mux_result = BrFlush ? BrPC : PCplus4;

        always @(posedge clock) begin
            // ①: if reset is asserted, initialize the pc
            // ②: if stall is asserted, hold the PC so the next instruction is not fetched
            // The pc must be assigned with <= (non-blocking) instead of =
            PC <= reset ? 0 : stall ? PC : PC_mux_result;
        end

        // instruction memory
        Insn_mem IM (
            .read_address(PC[PC_W-1 : 0]),
            .insn(instr)
        );

        // ====================================================================
        // End of Instruction Fetch (IF)
        // ====================================================================

        // ====================================================================
        // The IF/ID register keeps pc and instr in step. It holds its value
        // while stall is asserted and is cleared (set to a bubble) on BrFlush.
        // ====================================================================
        always @(posedge clock, posedge reset) begin
            // update the IF_ID_Register
            if (!stall) begin
                PipeRegIFID.Curr_Pc    <= (reset | BrFlush) ? 9'b0  : PC;
                PipeRegIFID.Curr_Instr <= (reset | BrFlush) ? 32'b0 : instr;
            end
        end

        // ====================================================================
        // Instruction Decoding (ID)
        // ====================================================================
        //
        // peripheral logic here.
        //
        assign opcode = PipeRegIFID.Curr_Instr[6:0];
        logic [31:0] rd1, rd2, ImmG;

        // ====================================================================
        // Register file: holds the register contents. The rf_init file in the
        // test_r folder of the tb folder lists registers x0-x31 from top to bottom.
        // ====================================================================
        Reg_file RF (
            .clock(clock),
            .reset(reset),
            .write_en(PipeRegMEWB.RegWrite),
            .write_addr(PipeRegMEWB.rd),
            .data_in(wb_data),
            .read_addr1(PipeRegIFID.Curr_Instr[19:15]),
            .read_addr2(PipeRegIFID.Curr_Instr[24:20]),
            .data_out1(rd1),
            .data_out2(rd2)
        );

        // ====================================================================
        // Immediate generator
        // ====================================================================
        Imm_gen Imm_Gen (
            .inst_code(PipeRegIFID.Curr_Instr),
            .imm_out  (ImmG)
        );

        // ====================================================================
        // End of Instruction Decoding (ID)
        // ====================================================================
        always @(posedge clock, posedge reset) begin
            // update the ID_EX_Register
            if (reset | stall | BrFlush) begin
                PipeRegIDEX.ALU2ndOperandSrc <= 1'b0;
                PipeRegIDEX.WrRegDataSrc     <= 1'b0;
                PipeRegIDEX.RegWrite         <= 1'b0;
                PipeRegIDEX.MemRead          <= 1'b0;
                PipeRegIDEX.MemWrite         <= 1'b0;
                PipeRegIDEX.ALUOp            <= 2'b0;
                PipeRegIDEX.Branch           <= 1'b0;
                PipeRegIDEX.JalrSel          <= 1'b0;
                PipeRegIDEX.JalSel           <= 1'b0;
                PipeRegIDEX.RWSel            <= 2'b0;
                PipeRegIDEX.Curr_Pc          <= 9'b0;
                PipeRegIDEX.RD_One           <= 32'b0;
                PipeRegIDEX.RD_Two           <= 32'b0;
                PipeRegIDEX.RS_One           <= 5'b0;
                PipeRegIDEX.RS_Two           <= 5'b0;
                PipeRegIDEX.rd               <= 5'b0;
                PipeRegIDEX.ImmG             <= 32'b0;
                PipeRegIDEX.func3            <= 3'b0;
                PipeRegIDEX.func7            <= 7'b0;
                PipeRegIDEX.Curr_Instr       <= 32'b0;
            end else if (!stall) begin
                PipeRegIDEX.ALU2ndOperandSrc <= alu_src;
                PipeRegIDEX.WrRegDataSrc     <= WrRegDataSrc;
                PipeRegIDEX.RegWrite         <= reg_write_en;
                PipeRegIDEX.MemRead          <= mem_read_en;
                PipeRegIDEX.MemWrite         <= mem_write_en;
                PipeRegIDEX.ALUOp            <= alu_op;
                PipeRegIDEX.Branch           <= branch_taken;
                PipeRegIDEX.JalrSel          <= jalr_sel;
                PipeRegIDEX.JalSel           <= jal_sel;
                PipeRegIDEX.RWSel            <= RWSel;
                PipeRegIDEX.Curr_Pc          <= PipeRegIFID.Curr_Pc;
                PipeRegIDEX.RD_One           <= rd1;
                PipeRegIDEX.RD_Two           <= rd2;
                PipeRegIDEX.RS_One           <= PipeRegIFID.Curr_Instr[19:15];
                PipeRegIDEX.RS_Two           <= PipeRegIFID.Curr_Instr[24:20];
                PipeRegIDEX.rd               <= PipeRegIFID.Curr_Instr[11:7];
                PipeRegIDEX.ImmG             <= ImmG;
                PipeRegIDEX.func3            <= PipeRegIFID.Curr_Instr[14:12];
                PipeRegIDEX.func7            <= PipeRegIFID.Curr_Instr[31:25];
                PipeRegIDEX.Curr_Instr       <= PipeRegIFID.Curr_Instr;
            end
        end

        // ====================================================================
        // Execution (EX)
        // ====================================================================
        //
        // ALU, branch unit and peripheral logic
        //
        logic [31:0] FA_mux_result, FB_mux_result;
        logic [31:0] ALU_result, PCplusImm, PCplus4_EX, src_mux_result, mem_data;
        logic [1:0]  ForwardA, ForwardB;
        logic        zero;

        assign aluop_current = PipeRegIDEX.ALUOp;
        assign opcode_IDEX   = PipeRegIDEX.Curr_Instr[6:0];
        assign funct3        = PipeRegIDEX.func3;
        assign funct7        = PipeRegIDEX.func7;

        alu ALU (
            .operand_a(FA_mux_result),
            .operand_b(src_mux_result),
            .alu_ctrl(alu_cc),
            .alu_result(ALU_result),
            .zero(zero)
        );

        // ====================================================================
        // Branch unit and forwarding muxes
        // ====================================================================
        BranchUnit Branch_unit (
            .cur_pc(PipeRegIDEX.Curr_Pc),
            .imm(PipeRegIDEX.ImmG),
            .jalr_sel(PipeRegIDEX.JalrSel),
            .jal_sel(PipeRegIDEX.JalSel),
            .branch_taken(PipeRegIDEX.Branch),
            .alu_result(ALU_result),
            .pc_plus_imm(PCplusImm),
            .pc_plus_4(PCplus4_EX),
            .branch_target(BrPC),
            .pc_sel(BrFlush)
        );

        mux4 FA_mux (
            .d00(PipeRegIDEX.RD_One),
            .d10(mem_data),
            .d01(wb_data),
            .d11(32'b0),
            .s  (ForwardA),
            .y  (FA_mux_result)
        );

        mux4 FB_mux (
            .d00(PipeRegIDEX.RD_Two),
            .d10(mem_data),
            .d01(wb_data),
            .d11(32'b0),
            .s  (ForwardB),
            .y  (FB_mux_result)
        );

        mux2 src_mux (
            .d0(FB_mux_result),
            .d1(PipeRegIDEX.ImmG),
            .s (PipeRegIDEX.ALU2ndOperandSrc),
            .y (src_mux_result)
        );

        // ====================================================================
        // End of Execution (EX)
        // ====================================================================
        always @(posedge clock, posedge reset) begin
            // update the EX_MEM_Register
            if (reset) begin
                PipeRegEXMEM.RegWrite     <= 1'b0;
                PipeRegEXMEM.WrRegDataSrc <= 1'b0;
                PipeRegEXMEM.MemRead      <= 1'b0;
                PipeRegEXMEM.MemWrite     <= 1'b0;
                PipeRegEXMEM.RWSel        <= 2'b0;
                PipeRegEXMEM.Pc_Imm       <= 32'b0;
                PipeRegEXMEM.Pc_Four      <= 32'b0;
                PipeRegEXMEM.Imm_Out      <= 32'b0;
                PipeRegEXMEM.Alu_Result   <= 32'b0;
                PipeRegEXMEM.MemWrData    <= 32'b0;
                PipeRegEXMEM.rd           <= 5'b0;
                PipeRegEXMEM.func3        <= 3'b0;
                PipeRegEXMEM.func7        <= 7'b0;
                PipeRegEXMEM.Curr_Instr   <= 32'b0;
            end else begin
                PipeRegEXMEM.RegWrite     <= PipeRegIDEX.RegWrite;
                PipeRegEXMEM.WrRegDataSrc <= PipeRegIDEX.WrRegDataSrc;
                PipeRegEXMEM.MemRead      <= PipeRegIDEX.MemRead;
                PipeRegEXMEM.MemWrite     <= PipeRegIDEX.MemWrite;
                PipeRegEXMEM.RWSel        <= PipeRegIDEX.RWSel;
                PipeRegEXMEM.Pc_Imm       <= PCplusImm;
                PipeRegEXMEM.Pc_Four      <= PCplus4_EX;
                PipeRegEXMEM.Imm_Out      <= PipeRegIDEX.ImmG;
                PipeRegEXMEM.Alu_Result   <= ALU_result;
                PipeRegEXMEM.MemWrData    <= FB_mux_result;
                PipeRegEXMEM.rd           <= PipeRegIDEX.rd;
                PipeRegEXMEM.func3        <= PipeRegIDEX.func3;
                PipeRegEXMEM.func7        <= PipeRegIDEX.func7;
                PipeRegEXMEM.Curr_Instr   <= PipeRegIDEX.Curr_Instr;
            end
        end

        // ====================================================================
        // Memory Access (MEM)
        // ====================================================================
        // data memory
        logic [31:0] ReadData;

        datamemory DM (
            .clock(clock),
            .read_en(PipeRegEXMEM.MemRead),
            .write_en(PipeRegEXMEM.MemWrite),
            .address(PipeRegEXMEM.Alu_Result[11:0]),
            .data_in(PipeRegEXMEM.MemWrData),
            .funct3(PipeRegEXMEM.func3),
            .data_out(ReadData)
        );

        mux4 mem_mux (
            .d00(PipeRegEXMEM.Alu_Result),
            .d01(PipeRegEXMEM.Pc_Four),
            .d10(PipeRegEXMEM.Imm_Out),
            .d11(PipeRegEXMEM.Pc_Imm),
            .s  (PipeRegEXMEM.RWSel),
            .y  (mem_data)
        );

        // ====================================================================
        // End of Memory Access (MEM)
        // ====================================================================
        always @(posedge clock) begin
            // update the MEM_WB_Register
            if (reset) begin
                PipeRegMEWB.RegWrite     <= 1'b0;
                PipeRegMEWB.WrRegDataSrc <= 1'b0;
                PipeRegMEWB.RWSel        <= 2'b0;
                PipeRegMEWB.Pc_Imm       <= 32'b0;
                PipeRegMEWB.Pc_Four      <= 32'b0;
                PipeRegMEWB.Imm_Out      <= 32'b0;
                PipeRegMEWB.Alu_Result   <= 32'b0;
                PipeRegMEWB.MemReadData  <= 32'b0;
                PipeRegMEWB.rd           <= 5'b0;
                PipeRegMEWB.Curr_Instr   <= 32'b0;  // was 5'b0: width now matches the 32-bit field
            end else begin
                PipeRegMEWB.RegWrite     <= PipeRegEXMEM.RegWrite;
                PipeRegMEWB.WrRegDataSrc <= PipeRegEXMEM.WrRegDataSrc;
                PipeRegMEWB.RWSel        <= PipeRegEXMEM.RWSel;
                PipeRegMEWB.Pc_Imm       <= PipeRegEXMEM.Pc_Imm;
                PipeRegMEWB.Pc_Four      <= PipeRegEXMEM.Pc_Four;
                PipeRegMEWB.Imm_Out      <= PipeRegEXMEM.Imm_Out;
                PipeRegMEWB.Alu_Result   <= PipeRegEXMEM.Alu_Result;
                PipeRegMEWB.MemReadData  <= ReadData;
                PipeRegMEWB.rd           <= PipeRegEXMEM.rd;
                PipeRegMEWB.Curr_Instr   <= PipeRegEXMEM.Curr_Instr;
                // $display("------------------------------------------------------------------");
                // $display("PipeRegMEWB.Curr_Instr = %x",PCplus4);
                // // $display("PipeRegMEWB.Imm_Out = %x", PipeRegMEWB.Imm_Out);
                // // $display("PipeRegMEWB.Pc_Four = %x", PipeRegMEWB.Pc_Four);
                // $display("PCplusImm = %x", PipeRegMEWB.Imm_Out);
                // $display("PCplus4_EX = %x", PipeRegMEWB.Pc_Four);
                // $display("PipeRegMEWB.WrRegDataSrc = %x", BrPC);
                // $display("PipeRegIDEX.Curr_Pc = %x", PipeRegIDEX.Curr_Pc);
                // $display("PipeRegMEWB.Alu_Result = %x", PipeRegMEWB.Alu_Result);
                // $display("wb_data = %x",wb_data);
            end
        end

        // ====================================================================
        // Write Back (WB)
        // ====================================================================
        //
        // write back logic
        //
        logic [31:0] res_mux_result;

        mux2 res_mux (
            .d0(PipeRegMEWB.Alu_Result),
            .d1(PipeRegMEWB.MemReadData),
            .s (PipeRegMEWB.WrRegDataSrc),
            .y (res_mux_result)
        );

        mux4 wrs_mux (
            .d00(res_mux_result),
            .d01(PipeRegMEWB.Pc_Four),
            .d10(PipeRegMEWB.Imm_Out),
            .d11(PipeRegMEWB.Pc_Imm),
            .s  (PipeRegMEWB.RWSel),
            .y  (wb_data)
        );

        // ====================================================================
        // End of Write Back (WB)
        // ====================================================================

        // ====================================================================
        // other logic
        // ====================================================================
        //
        // hazard detection logic
        //
        Hazard_detector hazard_unit (
            .clock(clock),
            .reset(reset),
            .if_id_rs1(PipeRegIFID.Curr_Instr[19:15]),
            .if_id_rs2(PipeRegIFID.Curr_Instr[24:20]),
            .id_ex_rd(PipeRegIDEX.rd),
            .id_ex_memread(PipeRegIDEX.MemRead),
            .stall(stall)
        );

        //
        // forwarding logic
        //
        ForwardingUnit forwarding_unit (
            .rs1(PipeRegIDEX.RS_One),
            .rs2(PipeRegIDEX.RS_Two),
            .ex_mem_rd(PipeRegEXMEM.rd),
            .mem_wb_rd(PipeRegMEWB.rd),
            .ex_mem_regwrite(PipeRegEXMEM.RegWrite),
            .mem_wb_regwrite(PipeRegMEWB.RegWrite),
            .forward_a(ForwardA),
            .forward_b(ForwardB)
        );

        //
        // possible extra code
        //
    endmodule
pipeline_regs

    //
    package Pipe_Buf_Reg_PKG;

        // Reg A
        typedef struct packed {
            logic [8:0]  Curr_Pc;
            logic [31:0] Curr_Instr;
        } if_id_reg;

        // Reg B
        typedef struct packed {
            logic        ALU2ndOperandSrc;
            logic        WrRegDataSrc;
            logic        RegWrite;
            logic        MemRead;
            logic        MemWrite;
            logic [1:0]  ALUOp;
            logic        Branch;
            logic        JalrSel;
            logic        JalSel;
            logic [1:0]  RWSel;
            logic [8:0]  Curr_Pc;
            logic [31:0] RD_One;      // 1st read data from the reg file.
            logic [31:0] RD_Two;      // 2nd read data from the reg file.
            logic [4:0]  RS_One;      // 1st read register address of Curr_Instr
            logic [4:0]  RS_Two;      // 2nd read register address of Curr_Instr
            logic [4:0]  rd;          // dst register to write of Curr_Instr
            logic [31:0] ImmG;        // Immediate value generated by Imm_gen
            logic [2:0]  func3;
            logic [6:0]  func7;
            logic [31:0] Curr_Instr;
        } id_ex_reg;

        // Reg C
        typedef struct packed {
            logic        RegWrite;
            logic        WrRegDataSrc;
            logic        MemRead;
            logic        MemWrite;
            logic [1:0]  RWSel;
            logic [31:0] Pc_Imm;
            logic [31:0] Pc_Four;
            logic [31:0] Imm_Out;
            logic [31:0] Alu_Result;
            logic [31:0] MemWrData;   // data written to the memory
            logic [4:0]  rd;          // dst register to write of Curr_Instr
            logic [2:0]  func3;
            logic [6:0]  func7;
            logic [31:0] Curr_Instr;
        } ex_mem_reg;

        // Reg D
        typedef struct packed {
            logic        RegWrite;
            logic        WrRegDataSrc;
            logic [1:0]  RWSel;
            logic [31:0] Pc_Imm;
            logic [31:0] Pc_Four;
            logic [31:0] Imm_Out;
            logic [31:0] Alu_Result;
            logic [31:0] MemReadData;
            logic [4:0]  rd;
            logic [31:0] Curr_Instr;
        } mem_wb_reg;

    endpackage
proc_controller

    `timescale 1ns / 1ps
    // main controller
    module Proc_controller (
        // ======================== Inputs ========================
        input  logic [6:0] Opcode,  // 7-bit opcode field of the instruction

        // ======================== Outputs ========================
        // Selection signal for the source of the 2nd ALU operand:
        // 0: the 2nd read data from the register file;
        // 1: the immediate value generated by Imm_gen
        output logic ALU2ndOperandSrc,
        // Selection signal for the source of value used to write the register:
        // 0: ALU; 1: data memory.
        output logic WrRegDataSrc,
        // Write register enable
        output logic WrRegEn,
        // Write memory enable
        output logic WrMemEn,
        // Read memory enable
        output logic RdMemEn,
        // 00: LW/SW/AUIPC; 01: Branch;
        // 10: RType; 11: JAL/LUI/JALR and the IType ALU instructions.
        // IType is kept apart from RType because their funct3/funct7 fields
        // overlap and cannot be told apart reliably.
        output logic [1:0] ALUOp,
        // 0: branch is not taken; 1: branch is taken
        output logic Branch,
        // 0: Jalr is not taken; 1: jalr is taken
        output logic JalrSel,
        // Kept separate because the branch unit handles jalr and jal
        // differently: jal adds imm directly to the pc
        output logic JalSel,
        // 00: Register Write Back;
        // 01: PC+4 write back (JAL/JALR);
        // 10: imm-gen write back (LUI);
        // 11: pc+imm-gen write back (AUIPC)
        output logic [1:0] RWSel
    );
        logic [10:0] con;

        /******************** instruction types ********************/
        // RType
        logic [6:0] RType = 7'b0110011;  // includes slt
        // IType
        logic [6:0] ori   = 7'b0010011;  // also addi, andi
        logic [6:0] lw    = 7'b0000011;  // also lb, lh, lbu, lhu
        logic [6:0] jalr  = 7'b1100111;
        // JType
        logic [6:0] jal   = 7'b1101111;
        // UType
        logic [6:0] lui   = 7'b0110111;
        // SType
        logic [6:0] sw    = 7'b0100011;  // also sb, sh
        // BType
        logic [6:0] beq   = 7'b1100011;  // also bne, blt, bge, bltu, bgeu

        /******************** control signals ********************/
        assign ALU2ndOperandSrc = (Opcode == lw || Opcode == sw || Opcode == ori ||
                                   Opcode == jalr || Opcode == jal) ? 1'b1 : 1'b0;
        assign WrRegDataSrc = (Opcode == lw) ? 1'b1 : 1'b0;
        assign WrRegEn = (Opcode == RType || Opcode == lw || Opcode == ori ||
                          Opcode == lui || Opcode == jal || Opcode == jalr) ? 1'b1 : 1'b0;
        assign RdMemEn = (Opcode == lw) ? 1'b1 : 1'b0;
        assign WrMemEn = (Opcode == sw) ? 1'b1 : 1'b0;
        assign ALUOp   = (Opcode == RType) ? 2'b10 :
                         (Opcode == beq)   ? 2'b01 :
                         (Opcode == jal || Opcode == jalr || Opcode == lui || Opcode == ori) ? 2'b11 :
                                             2'b00;
        assign Branch  = (Opcode == beq)  ? 1'b1 : 1'b0;
        assign JalrSel = (Opcode == jalr) ? 1'b1 : 1'b0;
        assign JalSel  = (Opcode == jal)  ? 1'b1 : 1'b0;
        assign RWSel   = (Opcode == jalr || Opcode == jal) ? 2'b01 :
                         (Opcode == lui) ? 2'b10 :
                         (Opcode == beq) ? 2'b11 :
                                           2'b00;
    endmodule
reg_file

    `timescale 1ns / 1ps
    // register file
    module Reg_file #(
        parameter DATA_WIDTH    = 32,  // number of bits in each register
        parameter ADDRESS_WIDTH = 5,   // number of registers = 2^ADDRESS_WIDTH
        parameter NUM_REGS      = 2 ** ADDRESS_WIDTH
    )(
        // Inputs
        input                      clock,       // clock
        input                      reset,       // synchronous reset; reset all regs to 0 upon assertion.
        input                      write_en,    // write enable
        input  [ADDRESS_WIDTH-1:0] write_addr,  // address of the register to be written
        input  [DATA_WIDTH-1:0]    data_in,     // data to be written into the register file
        input  [ADDRESS_WIDTH-1:0] read_addr1,  // first address to be read from
        input  [ADDRESS_WIDTH-1:0] read_addr2,  // second address to be read from
        // Outputs
        output logic [DATA_WIDTH-1:0] data_out1,  // content of register_file[read_addr1]
        output logic [DATA_WIDTH-1:0] data_out2   // content of register_file[read_addr2]
    );
        integer i;
        logic [DATA_WIDTH-1:0] register_file [NUM_REGS-1:0];

        // Registers are written on the falling clock edge; x0 (write_addr == 0) is never written.
        always @(negedge clock) begin
            if (reset == 1'b1)
                for (i = 0; i < NUM_REGS; i = i + 1) begin
                    register_file[i] <= 0;
                end
            else if (write_en == 1'b1 && write_addr) begin
                register_file[write_addr] <= data_in;
            end
        end

        // Reads are combinational.
        assign data_out1 = register_file[read_addr1];
        assign data_out2 = register_file[read_addr2];
    endmodule
riscv_proc

    `timescale 1ns / 1ps
    // the top module for the RISC-V processor
    // basically, you do not need to modify this file.
    module riscv #(
        parameter DATA_W = 32
    ) (
        input  logic clock,
        input  logic reset,
        output logic [31:0] WB_Data  // The ALU_Result
    );
        logic [6:0] opcode;
        logic [6:0] opcode_IDEX;
        // JalSel is declared explicitly here instead of relying on an implicit net
        logic ALU2ndOperandSrc, WrRegDataSrc, RegWrite, MemRead, MemWrite, Branch, JalrSel, JalSel;
        logic [1:0] RWSel;
        logic [1:0] ALUop;
        logic [1:0] ALUop_Reg;
        logic [6:0] Funct7;
        logic [2:0] Funct3;
        logic [3:0] Operation;

        Proc_controller proc_controller (
            opcode,
            ALU2ndOperandSrc,
            WrRegDataSrc,
            RegWrite,
            MemWrite,
            MemRead,
            ALUop,
            Branch,
            JalrSel,
            JalSel,
            RWSel
        );

        ALU_Controller proc_alu_controller (
            ALUop_Reg,
            Funct7,
            Funct3,
            opcode_IDEX,
            Operation
        );

        Datapath proc_data_path (
            clock,
            reset,
            RegWrite,
            WrRegDataSrc,
            ALU2ndOperandSrc,
            MemWrite,
            MemRead,
            Branch,
            JalrSel,
            JalSel,
            ALUop,
            RWSel,
            Operation,
            opcode,
            opcode_IDEX,
            Funct7,
            Funct3,
            ALUop_Reg,
            WB_Data
        );
    endmodule
adder

    `timescale 1ns / 1ps
    // adder
    module adder #(
        parameter WIDTH = 8
    ) (
        input  logic [WIDTH-1:0] a, b,
        output logic [WIDTH-1:0] y
    );
        // adder logic
        assign y = a + b;
    endmodule
alu_controller

    `timescale 1ns / 1ps
    // arithmetic logic unit controller
    module ALU_Controller (
        input  logic [1:0] alu_op,    // 2-bit opcode field from the Proc_controller
        input  logic [6:0] funct7,    // insn[31:25]
        input  logic [2:0] funct3,    // insn[14:12]
        input  logic [6:0] opcode,    // insn[6:0], used to tell funct bits from imm bits in IType and UType
        output logic [3:0] operation  // operation selection for ALU
    );
        always_comb begin
            case (alu_op)
                2'b00:  // store & load
                    operation = 4'b0010;
                2'b01:  // branch
                    case (funct3)
                        3'b000: operation = 4'b0110;  // beq -> sub
                        3'b001: operation = 4'b1001;  // bne
                        3'b100: operation = 4'b1010;  // blt
                        3'b101: operation = 4'b1011;  // bge
                        3'b110: operation = 4'b1100;  // bltu
                        3'b111: operation = 4'b1101;  // bgeu
                        default: begin
                            operation = 4'b0000;
                            $display("---------------------------------->Undefined_ALU_controller_Type\n");
                        end
                    endcase
                2'b10:  // RType
                    case (funct3)
                        3'b000:
                            case (funct7)
                                7'b0000000: operation = 4'b0010;  // add
                                7'b0100000: operation = 4'b0110;  // sub
                                default:    operation = 4'b0000;
                            endcase
                        3'b111: operation = 4'b0000;  // and
                        3'b110: operation = 4'b0001;  // or
                        3'b100: operation = 4'b0111;  // xor
                        3'b101: operation = 4'b0011;  // mul: funct7=0, funct3=101
                        3'b010:
                            case (funct7)
                                7'b0000000: operation = 4'b1000;  // slt
                                default: begin
                                    operation = 4'b0000;
                                    $display("---------------------------------->Undefined_ALU_controller_Type\n");
                                end
                            endcase
                        default: begin
                            operation = 4'b0000;
                            $display("---------------------------------->Undefined_ALU_controller_Type\n");
                        end
                    endcase
                2'b11:  // IType, UType
                    case (opcode)
                        7'b0010011:
                            case (funct3)
                                3'b000:  operation = 4'b0010;  // addi
                                3'b111:  operation = 4'b0000;  // andi
                                3'b110:  operation = 4'b0001;  // ori
                                3'b101:  operation = 4'b0011;  // muli
                                default: operation = 4'b0000;
                            endcase
                        default: operation = 4'b0010;  // jalr
                    endcase
            endcase
        end
    endmodule
alu

    `timescale 1ns / 1ps
    // arithmetic logic unit
    module alu #(
        parameter DATA_WIDTH    = 32,
        parameter OPCODE_LENGTH = 4
    ) (
        input  logic signed [DATA_WIDTH-1:0]    operand_a,
        input  logic signed [DATA_WIDTH-1:0]    operand_b,
        input  logic        [OPCODE_LENGTH-1:0] alu_ctrl,  // Operation
        output logic signed [DATA_WIDTH-1:0]    alu_result,
        output logic                            zero
    );
        logic [31:0] s;
        logic [31:0] signed_s, unsigned_operand_a, unsigned_operand_b;

        // Unsigned copies of the operands, used by bltu and bgeu
        assign unsigned_operand_a = operand_a;
        assign unsigned_operand_b = operand_b;

        always_comb begin
            // For the branch comparisons the result is 0 when the condition holds,
            // because branch_unit takes the branch when alu_result == 0.
            case (alu_ctrl)
                4'b0000: alu_result = operand_a & operand_b;   // and
                4'b0001: alu_result = operand_a | operand_b;   // or
                4'b0010: alu_result = operand_a + operand_b;   // add
                4'b0110: alu_result = operand_a - operand_b;   // sub, also beq
                4'b0111: alu_result = operand_a ^ operand_b;   // xor
                4'b0011: alu_result = operand_a * operand_b;   // mul, muli
                4'b1000: alu_result = (operand_a <  operand_b) ? 1'b1 : 1'b0;  // slt
                4'b1001: alu_result = (operand_a != operand_b) ? 1'b0 : 1'b1;  // bne
                4'b1010: alu_result = (operand_a <  operand_b) ? 1'b0 : 1'b1;  // blt
                4'b1011: alu_result = (operand_a >= operand_b) ? 1'b0 : 1'b1;  // bge
                4'b1100: alu_result = (unsigned_operand_a <  unsigned_operand_b) ? 1'b0 : 1'b1;  // bltu
                4'b1101: alu_result = (unsigned_operand_a >= unsigned_operand_b) ? 1'b0 : 1'b1;  // bgeu
                default: begin
                    alu_result = 32'b0;
                    $display("---------------------------------->Undefined_ALU_Type\n");
                end
            endcase
            // procedural assignment: "assign" is not allowed inside always_comb
            zero = (alu_result == 0) ? 1'b1 : 1'b0;
            //$display("this operation: oper_a:%x oper_b:%x ctrl:%x result:%x", operand_a, operand_b, alu_ctrl, alu_result);
        end
    endmodule
branch_unit

    `timescale 1ns / 1ps
    // Jump unit
    module BranchUnit #(
        parameter PC_W = 9
    ) (
        input  logic [PC_W-1:0] cur_pc,
        input  logic [31:0]     imm,
        input  logic            jalr_sel,
        input  logic            jal_sel,
        input  logic            branch_taken,   // Branch
        input  logic [31:0]     alu_result,
        output logic [31:0]     pc_plus_imm,    // PC + imm
        output logic [31:0]     pc_plus_4,      // PC + 4
        output logic [31:0]     branch_target,  // BrPC
        output logic            pc_sel
    );
        logic [31:0] pc;
        assign pc = {23'b0, cur_pc};

        always_comb begin
            pc_plus_4 = pc + 4;
            // jal and jalr behave differently: jalr uses register rs1, while jal only uses imm.
            // For jalr, alu_result is rs1+imm and its lowest bit is cleared to 0.
            pc_plus_imm = (jalr_sel) ? (alu_result >> 1) << 1
                                     : pc + imm;
            pc_sel = ((branch_taken && alu_result == 0) || jalr_sel || jal_sel) ? 1'b1 : 1'b0;
            branch_target = pc_plus_imm;
        end
    endmodule
data_mem

    `timescale 1ns / 1ps
    // data storage
    module datamemory #(
        parameter ADDR_WIDTH = 12,
        parameter DATA_WIDTH = 32
    ) (
        input  logic                  clock,
        input  logic                  read_en,
        input  logic                  write_en,
        input  logic [ADDR_WIDTH-1:0] address,   // read/write address
        input  logic [DATA_WIDTH-1:0] data_in,   // write data
        input  logic [2:0]            funct3,    // insn[14:12]
        output logic [DATA_WIDTH-1:0] data_out   // read data
    );
        // Byte-wide memory. The difference between accessing 1, 2 and 4 bytes
        // was not considered at first.
        logic [7:0] MEM [(2**12)-1:0];
        logic sin01, sin00;  // unused

        always @(posedge clock)
            if (write_en) begin
                // the number of bytes written depends on funct3
                case (funct3)
                    3'b000: begin  // sb
                        MEM[address] <= data_in[7:0];
                    end
                    3'b001: begin  // sh
                        MEM[address]     <= data_in[7:0];
                        MEM[address + 1] <= data_in[15:8];
                    end
                    3'b010: begin  // sw
                        MEM[address]     <= data_in[7:0];
                        MEM[address + 1] <= data_in[15:8];
                        MEM[address + 2] <= data_in[23:16];
                        MEM[address + 3] <= data_in[31:24];
                    end
                endcase
            end

        always_comb begin
            data_out = 32'b0;  // default, also when read_en is low (avoids a latch)
            if (read_en) begin
                case (funct3)
                    3'b000: begin  // lb: sign-extended byte
                        data_out = {{(24){MEM[address][7]}}, MEM[address]};
                    end
                    3'b001: begin  // lh: sign-extended halfword
                        data_out = {{(16){MEM[address + 1][7]}}, MEM[address + 1], MEM[address]};
                    end
                    3'b010: begin  // lw
                        data_out[31:24] = MEM[address + 3];
                        data_out[23:16] = MEM[address + 2];
                        data_out[15:8]  = MEM[address + 1];
                        data_out[7:0]   = MEM[address];
                    end
                    3'b100: begin  // lbu: zero-extended byte
                        data_out[7:0] = MEM[address];
                    end
                    3'b101: begin  // lhu: zero-extended halfword
                        data_out[15:8] = MEM[address + 1];
                        data_out[7:0]  = MEM[address];
                    end
                    default: ;  // keep the zero default
                endcase
            end
        end
    endmodule
flipflop

    `timescale 1ns / 1ps
    //
    // Module Name: flipflop
    // Description: An edge-triggered register
    //   When reset is `1`, the value of the register is set to 0.
    //   Otherwise:
    //     - if stall is set, the register preserves its original data
    //     - else, it is updated by `d`.
    //
    // edge-triggered register
    module flipflop #(
        parameter WIDTH = 8
    )(
        input  logic             clock,
        input  logic             reset,
        input  logic [WIDTH-1:0] d,
        input  logic             stall,
        output logic [WIDTH-1:0] q
    );
        always_ff @(posedge clock, posedge reset) begin
            if (reset)
                q <= 0;
            else if (!stall)
                q <= d;
        end
    endmodule
forwarding_unit

    `timescale 1ns / 1ps
    // Forwarding unit
    module ForwardingUnit (
        input  logic [4:0] rs1,
        input  logic [4:0] rs2,
        input  logic [4:0] ex_mem_rd,
        input  logic [4:0] mem_wb_rd,
        input  logic       ex_mem_regwrite,
        input  logic       mem_wb_regwrite,
        output logic [1:0] forward_a,
        output logic [1:0] forward_b
    );
        // forwarding logic
        // forward_a & forward_b see book P300
        assign forward_a = ((ex_mem_regwrite) && (ex_mem_rd != 0)    && (ex_mem_rd == rs1)) ? 2'b10 :
                           ((mem_wb_regwrite) && (mem_wb_rd != 5'b0) && (rs1 == mem_wb_rd)) ? 2'b01 :
                                                                                              2'b00;
        assign forward_b = ((ex_mem_regwrite) && (ex_mem_rd != 0)    && (rs2 == ex_mem_rd)) ? 2'b10 :
                           ((mem_wb_regwrite) && (mem_wb_rd != 5'b0) && (rs2 == mem_wb_rd)) ? 2'b01 :
                                                                                              2'b00;
    endmodule
hazard_detector

    `timescale 1ns / 1ps
    // Hazard detector (stall generator)
    module Hazard_detector (
        input  logic       clock,
        input  logic       reset,
        input  logic [4:0] if_id_rs1,
        input  logic [4:0] if_id_rs2,
        input  logic [4:0] id_ex_rd,
        input  logic       id_ex_memread,
        output logic       stall
    );
        // hazard detection logic
        logic [1:0] counter;

        // stall when the instruction in EX is a load whose destination
        // is a source register of the instruction in ID
        always @(negedge clock) begin
            stall <= (id_ex_memread && ((id_ex_rd == if_id_rs1) || (id_ex_rd == if_id_rs2))) ? 1'b1 : 1'b0;
        end
    endmodule
imm_gen

    `timescale 1ns / 1ps
    // immediate generator
    module Imm_gen (
        input  logic [31:0] inst_code,
        output logic [31:0] imm_out
    );
        logic [6:0] test;
        assign test = inst_code[6:0];

        always_comb begin
            imm_out = 32'b0;  // default value
            case (test)
                7'b0110011:  // RType does not use imm
                    imm_out = 32'b0;
                7'b0010011:  // addi, ori, etc.: the immediate is sign-extended
                    imm_out = {{20{inst_code[31]}}, inst_code[31:20]};
                7'b0000011, 7'b1100111:  // lw, jalr
                    imm_out = {{(20){inst_code[31]}}, inst_code[31:20]};
                7'b0110111:  // lui
                    imm_out = inst_code[31:12] << 12;
                7'b1101111:  // jal (11 sign bits so that the result is a full 32 bits)
                    imm_out = {{11{inst_code[31]}}, inst_code[31], inst_code[19:12],
                               inst_code[20], inst_code[30:21], 1'b0};
                7'b0100011:  // sw (sign-extended so that negative offsets work)
                    imm_out = {{20{inst_code[31]}}, inst_code[31:25], inst_code[11:7]};
                7'b1100011:  // beq
                    imm_out = {{(19){inst_code[31]}}, inst_code[31], inst_code[7],
                               inst_code[30:25], inst_code[11:8], 1'b0};
                default:
                    imm_out = 32'b0;
            endcase
        end
    endmodule
insn_mem

    `timescale 1ns / 1ps
    //
    // instruction memory
    module Insn_mem #(
        parameter ADDR_WIDTH = 9,
        parameter INSN_WIDTH = 32
    )(
        input  logic [ADDR_WIDTH-1:0] read_address,
        output logic [INSN_WIDTH-1:0] insn
    );
        logic [INSN_WIDTH-1:0] insn_array [(2**(ADDR_WIDTH - 2))-1:0];

        // word-addressed: the two lowest address bits are dropped
        assign insn = insn_array[read_address[ADDR_WIDTH-1 : 2]];
    endmodule
mux2

    `timescale 1ns / 1ps
    // Two-port multiplexer
    module mux2 #(
        parameter WIDTH = 32
    ) (
        input  logic [WIDTH-1:0] d0, d1,
        input  logic             s,
        output logic [WIDTH-1:0] y
    );
        assign y = s ? d1 : d0;
    endmodule
mux4

    `timescale 1ns / 1ps
    // Four-port multiplexer
    module mux4 #(
        parameter WIDTH = 32
    ) (
        input  logic [WIDTH-1:0] d00,
        input  logic [WIDTH-1:0] d01,
        input  logic [WIDTH-1:0] d10,
        input  logic [WIDTH-1:0] d11,
        input  logic [1:0]       s,
        output logic [WIDTH-1:0] y
    );
        assign y = (s == 2'b00) ? d00 :
                   (s == 2'b01) ? d01 :
                   (s == 2'b10) ? d10 :
                                  d11;
    endmodule