A short note on a computer system course experiment: RISC-V pipeline processor design and implementation

Experiment: RISC-V pipeline processor design and implementation

1. Pipeline design

![TheStructureChart]

2. RTL files

The RTL files are in the root directory.

3. Pipeline implementation ideas

(1) Complete the design of each component

1. Design of the operation unit

This includes adder, alu, data_mem, imm_gen, and mux, most of which are very simple.
imm_gen can be implemented by following the content of the "RISC_V_Experiment-2022" PPT.
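
As an illustration of what imm_gen has to do, here is a minimal sketch based on the standard RISC-V immediate formats. The module name imm_gen_sketch is hypothetical and this is not the course's reference code; only the port names inst_code and imm_out are taken from the instantiation in data_path.

```systemverilog
// Sketch only: module name is hypothetical; port names follow the
// Imm_gen instantiation in data_path.
module imm_gen_sketch (
    input  logic [31:0] inst_code,  // raw 32-bit instruction
    output logic [31:0] imm_out     // immediate extended to 32 bits
);
  always_comb begin
    case (inst_code[6:0])
      // I-type (loads, register-immediate ALU ops, jalr): inst[31:20]
      7'b0000011, 7'b0010011, 7'b1100111:
        imm_out = {{20{inst_code[31]}}, inst_code[31:20]};
      // S-type (stores): inst[31:25] and inst[11:7]
      7'b0100011:
        imm_out = {{20{inst_code[31]}}, inst_code[31:25], inst_code[11:7]};
      // B-type (branches): 13-bit offset whose bit 0 is always zero
      7'b1100011:
        imm_out = {{19{inst_code[31]}}, inst_code[31], inst_code[7],
                   inst_code[30:25], inst_code[11:8], 1'b0};
      // U-type (lui, auipc): upper 20 bits, lower 12 bits zero
      7'b0110111, 7'b0010111:
        imm_out = {inst_code[31:12], 12'b0};
      // J-type (jal): 21-bit offset whose bit 0 is always zero
      7'b1101111:
        imm_out = {{11{inst_code[31]}}, inst_code[31], inst_code[19:12],
                   inst_code[20], inst_code[30:21], 1'b0};
      default:
        imm_out = 32'b0;
    endcase
  end
endmodule
```

For example, the encoding 32'hFFF00093 (addi x1, x0, -1) produces the immediate 32'hFFFFFFFF.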

The most complicated of these is the data_mem file. RISC-V uses the little-endian format, and the number of bytes read or written depends on the funct3 field of the load/store instruction. Different load instructions also need different extensions of the loaded value: for example, lh sign-extends the halfword, while lhu zero-extends it.
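
The load side of this can be sketched as follows. The module name load_extend_sketch and its byte-wise ports are assumptions made for illustration (the real data_mem has a different interface); the funct3 values are the standard RISC-V load encodings.

```systemverilog
// Sketch only: hypothetical module that extends a little-endian load.
// byte0 is the byte at the load address, byte1 the byte at address+1, etc.
module load_extend_sketch (
    input  logic [7:0]  byte0,
    input  logic [7:0]  byte1,
    input  logic [7:0]  byte2,
    input  logic [7:0]  byte3,
    input  logic [2:0]  funct3,
    output logic [31:0] data_out
);
  always_comb begin
    case (funct3)
      3'b000:  data_out = {{24{byte0[7]}}, byte0};         // lb: sign-extend byte
      3'b001:  data_out = {{16{byte1[7]}}, byte1, byte0};  // lh: sign-extend halfword
      3'b010:  data_out = {byte3, byte2, byte1, byte0};    // lw: full word
      3'b100:  data_out = {24'b0, byte0};                  // lbu: zero-extend byte
      3'b101:  data_out = {16'b0, byte1, byte0};           // lhu: zero-extend halfword
      default: data_out = 32'b0;
    endcase
  end
endmodule
```

Because the format is little-endian, the byte at the lowest address ends up in the least-significant bits of the result. With byte0 = 8'h80 and byte1 = 8'hFF, lh yields 32'hFFFFFF80 while lhu yields 32'h0000FF80.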

2. Design of the storage unit

3. Design of the control unit

①proc_controller

proc_controller is the main control unit and a very important one.
The PPT "2021-Chapter 4 Lecture 3-Single-cycle Controller" contains a figure that gives some methods for designing the control signals of the main control unit.

The proc_controller file itself also gives some tips for control signal design:

(ps: The jal_sel variable is added here because branch_unit needs to handle jal specially; this is explained in the branch_unit section below.)
(pss: The ALUOp encoding here differs from the one Mr. Fu gave at the beginning. RType is separated from IType, and the load/store instructions are also separated from addi, ori, and jalr, because I think this makes alu_controller easier to design.)
First, the opcodes of the instructions to be implemented are given names, which makes the code that follows easier to write.

②branch_unit

branch_unit determines the address of the next instruction. If no jal, jalr, or taken branch is being executed, the next address is pc+4. Otherwise, branch_target is set to the jump address of the corresponding instruction. The jal_sel variable is added here because jal and jalr jump differently: the jalr target depends on a source register. As shown in the figure:

(ps: Strictly speaking, the jalr result should not be stored in the pc_plus_imm variable. That variable should hold only the pc+imm of jal and the branch instructions, because the jalr target also involves rs1. The RISC-V specification defines the jalr target as rs1+imm with the lowest bit cleared.)
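
The behaviour described above can be sketched as follows. This is not the author's branch_unit: the module name branch_unit_sketch is hypothetical, the port names are copied from the BranchUnit instantiation in data_path, and it is an assumption that alu_result carries rs1+imm for jalr and is 0 when a branch condition holds.

```systemverilog
// Sketch only: hypothetical branch unit with the port names used in data_path.
module branch_unit_sketch #(
    parameter PC_W = 9  // width of the PC kept in the pipeline registers
) (
    input  logic [PC_W-1:0] cur_pc,
    input  logic [31:0]     imm,
    input  logic            jalr_sel,
    input  logic            jal_sel,
    input  logic            branch_taken,  // the instruction is a conditional branch
    input  logic [31:0]     alu_result,    // assumed: 0 if the branch condition holds,
                                           // rs1+imm for jalr
    output logic [31:0]     pc_plus_imm,
    output logic [31:0]     pc_plus_4,
    output logic [31:0]     branch_target,
    output logic            pc_sel
);
  logic [31:0] pc_ext;

  assign pc_ext      = {{(32 - PC_W){1'b0}}, cur_pc};  // zero-extend the PC
  assign pc_plus_4   = pc_ext + 32'd4;
  assign pc_plus_imm = pc_ext + imm;                   // jal and branch target

  // jalr jumps to rs1+imm with the lowest bit cleared; jal and branches to pc+imm
  assign branch_target = jalr_sel ? {alu_result[31:1], 1'b0} : pc_plus_imm;

  // redirect the PC for a taken branch or any jump
  assign pc_sel = (branch_taken && (alu_result == 32'b0)) || jalr_sel || jal_sel;
endmodule
```

Keeping the jalr target out of pc_plus_imm, as the note above suggests, leaves pc_plus_imm free to mean pc+imm for every instruction.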

③alu_controller

(ps: opcode is added as an extra input here in order to tell the funct fields apart from the immediate bits in IType and UType instructions. For example, a UType instruction has no funct field, but the immediate bits in that position may happen to equal a funct value.)
The output operation codes can be defined freely, as long as each 4-bit code stands for the same operation here as it does in the alu file, where alu_ctrl selects the operation.

4. Resolving hazards

①Structural hazards

In this experiment, storage such as the register file is used by several instructions at the same time, but there appears to be no structural hazard (resource conflict). For example, the read and write ports of the register file are effectively separate, so it can be read and written in the same cycle.

②Data hazards

Stalling and forwarding, implemented in the hazard file and the forward_unit file, are used to resolve read-after-write (RAW) hazards.
The design follows the diagram in the PPT "2021-Chapter 4, Lecture 4-Pipeline Principle".

That is, as shown in the figure:

If a data hazard is detected while the current instruction is being decoded in the ID stage (the forward file is described later),
then the instruction is held in ID and the pc value does not change, so the next instruction to proceed is still this one. The ALU operands pass through the ALU data selectors shown in the figure below: the result that an earlier instruction is about to write to one of the current instruction's source registers replaces the stale value read from the register file:

At this point, the stall operation shown in the PPT has been implemented.
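
The stall condition can be sketched as a load-use check. This is a minimal sketch, not the author's hazard file: the module name hazard_detector_sketch is hypothetical, the port names follow the Hazard_detector instantiation in data_path (without its clock and reset ports, which a purely combinational check does not need), and the x0 exclusion is an added assumption.

```systemverilog
// Sketch only: hypothetical combinational load-use hazard detector.
module hazard_detector_sketch (
    input  logic [4:0] if_id_rs1,      // source registers of the instruction in ID
    input  logic [4:0] if_id_rs2,
    input  logic [4:0] id_ex_rd,       // destination of the instruction in EX
    input  logic       id_ex_memread,  // the instruction in EX is a load
    output logic       stall
);
  // Stall for one cycle when the instruction in ID needs the result of a
  // load that is still in EX; writes to x0 never create a dependency.
  assign stall = id_ex_memread && (id_ex_rd != 5'd0) &&
                 ((id_ex_rd == if_id_rs1) || (id_ex_rd == if_id_rs2));
endmodule
```

In data_path, stall holds the PC and the IF/ID register and clears the ID/EX register, so the bubble it inserts also deasserts id_ex_memread and the stall lasts a single cycle.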
The forwarding logic is implemented in the forward file and instantiated in data_path as ForwardingUnit.
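
A forwarding unit of this kind can be sketched as follows. This is not the author's forward file: the module name forwarding_unit_sketch is hypothetical and the port names follow the ForwardingUnit instantiation in data_path. The select encoding is an assumption read off the FA_mux and FB_mux connections (2'b10 for the EX/MEM value on port d10, 2'b01 for the write-back value on port d01).

```systemverilog
// Sketch only: hypothetical forwarding unit with the port names used in data_path.
module forwarding_unit_sketch (
    input  logic [4:0] rs1,              // source registers of the instruction in EX
    input  logic [4:0] rs2,
    input  logic [4:0] ex_mem_rd,        // destination of the instruction in MEM
    input  logic [4:0] mem_wb_rd,        // destination of the instruction in WB
    input  logic       ex_mem_regwrite,
    input  logic       mem_wb_regwrite,
    output logic [1:0] forward_a,        // 00: register file, 10: EX/MEM, 01: MEM/WB
    output logic [1:0] forward_b
);
  always_comb begin
    // Default: use the value read from the register file.
    forward_a = 2'b00;
    forward_b = 2'b00;

    // The newer result (EX/MEM) takes priority over the older one (MEM/WB).
    if (ex_mem_regwrite && (ex_mem_rd != 5'd0) && (ex_mem_rd == rs1))
      forward_a = 2'b10;
    else if (mem_wb_regwrite && (mem_wb_rd != 5'd0) && (mem_wb_rd == rs1))
      forward_a = 2'b01;

    if (ex_mem_regwrite && (ex_mem_rd != 5'd0) && (ex_mem_rd == rs2))
      forward_b = 2'b10;
    else if (mem_wb_regwrite && (mem_wb_rd != 5'd0) && (mem_wb_rd == rs2))
      forward_b = 2'b01;
  end
endmodule
```

Checking EX/MEM before MEM/WB matters when two consecutive instructions write the same register: the instruction in EX must see the most recent value.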

③Control hazards
There are three basic approaches to resolving control hazards, as shown in the figure:

This experiment adopts the first approach, that is, blocking the three instructions after the branch instruction. First, if a branch instruction appears and its jump condition is satisfied, the BrFlush control signal, which is driven by the pc_sel output of branch_unit, is set to 1:
pc_sel = ((branch_taken && alu_result == 0) || jalr_sel || jal_sel) ? 1'b1 : 1'b0;
Then the instructions that follow are cancelled.
The instruction that has just been fetched is cancelled by the code shown in the figure below, which flushes the IF/ID register to "empty":

The instruction currently being decoded is cancelled by the following code, which turns it into an "empty instruction" (a bubble):

Once the branch_target value has been calculated in branch_unit, the code in the following figure sets the address of the next instruction to be fetched to branch_target:
(2) Connect the components (data_path)

1. Design the IF unit, which passes the current instruction address and the undecoded instruction to the IF/ID register.

①On reset, the pc is initialized to 0. On a stall, the next instruction is not fetched and the PC keeps its value.
PC <= reset ? 0 : stall ? PC : PC_mux_result;
②Select the next PC according to BrFlush, which comes from the output of branch_unit.
assign PC_mux_result = BrFlush ? BrPC : PCplus4;
③Pass the current instruction address and the undecoded instruction to the IF/ID register.

(ps: These two assignments take effect only when the pipeline is not stalled.)

2. Understanding of data_path

The code structure of this file is very clear. In short, it is similar to the structure of the following figure:

data_path

`timescale 1ns / 1ps
// data path
`include "pipeline_regs.sv"
import Pipe_Buf_Reg_PKG::*;

module Datapath #(
    parameter PC_W       = 9,   // width of the Program Counter
    parameter INS_W      = 32,  // Instruction Width
    parameter DATA_W     = 32,  // register width
    parameter DM_ADDRESS = 9,   // width of the data memory address
    parameter ALU_CC_W   = 4    // width of the ALU Control code
) (
    input logic clock,

    // reset , sets the PC to zero
    input logic reset,

    // Register file writing enable
    input logic reg_write_en,

    // Selection signal for the source of the value written to the register
    input logic WrRegDataSrc,

    // Register file or Immediate MUX
    input logic alu_src,

    // Memory Writing Enable
    input logic mem_write_en,

    // Memory Reading Enable
    input logic mem_read_en,

    // Branch Enable
    input logic branch_taken,

    // Jalr Mux Select
    input logic jalr_sel,
    // Jal Instruction distinguish from jalr used in the branch function
    input logic jal_sel,

    input logic [1:0] alu_op,

    // Mux4to1 Select
    input logic [1:0] RWSel,

    // ALU Control Code ( input of the ALU )
    input logic [ALU_CC_W -1:0] alu_cc,

    output logic [6:0] opcode,

    output logic [6:0] opcode_IDEX,

    output logic [6:0] funct7,

    output logic [2:0] funct3,

    output logic [1:0] aluop_current,

    // data write back to register
    output logic [DATA_W-1:0] wb_data
);

  // define the pipeline registers
  if_id_reg  PipeRegIFID;
  id_ex_reg  PipeRegIDEX;
  ex_mem_reg PipeRegEXMEM;
  mem_wb_reg PipeRegMEWB;
  /*todo: analysis here
    * the registers are declared here.
    * each struct in pipeline_regs.sv is a set of registers

  */

  // ====================================================================================
  //                                Instruction Fetch (IF)
  // ====================================================================================
  //
  // peripheral logic here.
  //
  logic BrFlush, stall;
  logic [31:0] PC_mux_result, PC, PCplus4, BrPC, instr;

  // add your code here for PC generation
  
  assign PCplus4 = PC + 4;
  assign PC_mux_result = BrFlush ? BrPC : PCplus4;
  always @(posedge clock) begin
    // reset: initialize the PC to 0
    // stall: hold the PC so that the next instruction is not fetched
    // the PC must be assigned with <= (non-blocking) instead of =
    PC <= reset ? 0 : stall ? PC : PC_mux_result;
  end

  

  // your instruction memory
  Insn_mem IM (
      .read_address(PC[PC_W-1 : 0]),
      .insn(instr)
  );

  // ====================================================================================
  //                             End of Instruction Fetch (IF)
  // ====================================================================================

// ====================================================================================
  //  The IF/ID register keeps pc and instr in step. It holds its contents on a stall
  //  and is cleared (set to a bubble) on reset or BrFlush
  // ====================================================================================
  always @(posedge clock, posedge reset) begin
    // add your logic here to update the IF_ID_Register
    if(!stall) begin
      PipeRegIFID.Curr_Pc    <= (reset|BrFlush) ? 9'b0:PC;
      PipeRegIFID.Curr_Instr <= (reset|BrFlush) ? 32'b0:instr;
    end
  end

  // ====================================================================================
  //                                Instruction Decoding (ID)
  // ====================================================================================
  //
  // peripheral logic here.
  //
  assign opcode = PipeRegIFID.Curr_Instr[6:0];
  logic [31:0] rd1, rd2, ImmG;

  //
  // add your register file here.
  //
  // ====================================================================================
  //  Register file: holds the contents of the registers. The rf_init file in the
  //  test_r folder of the tb folder lists registers x0-x31 from top to bottom
  // ====================================================================================
  Reg_file RF (
      .clock(clock),
      .reset(reset),
      .write_en(PipeRegMEWB.RegWrite),
      .write_addr(PipeRegMEWB.rd),
      .data_in(wb_data),
      .read_addr1(PipeRegIFID.Curr_Instr[19:15]),
      .read_addr2(PipeRegIFID.Curr_Instr[24:20]),
      .data_out1(rd1),
      .data_out2(rd2)
  );

  //
  // add your immediate generator here
  //
  // ====================================================================================
  //  Immediate generator (extends the immediate encoded in the instruction)
  // ====================================================================================
  Imm_gen Imm_Gen (
      .inst_code(PipeRegIFID.Curr_Instr),
      .imm_out  (ImmG)
  );

  // ====================================================================================
  //                                End of Instruction Decoding (ID)
  // ====================================================================================
  always @(posedge clock, posedge reset) begin
    // add your logic here to update the ID_EX_Register
    if (reset | stall | BrFlush) begin
      PipeRegIDEX.ALU2ndOperandSrc <= 1'b0;
      PipeRegIDEX.WrRegDataSrc     <= 1'b0;
      PipeRegIDEX.RegWrite         <= 1'b0;
      PipeRegIDEX.MemRead          <= 1'b0;
      PipeRegIDEX.MemWrite         <= 1'b0;
      PipeRegIDEX.ALUOp            <= 2'b0;
      PipeRegIDEX.Branch           <= 1'b0;
      PipeRegIDEX.JalrSel          <= 1'b0;
      PipeRegIDEX.JalSel           <= 1'b0; 
      PipeRegIDEX.RWSel            <= 2'b0;
      PipeRegIDEX.Curr_Pc          <= 9'b0;
      PipeRegIDEX.RD_One           <= 32'b0;
      PipeRegIDEX.RD_Two           <= 32'b0;
      PipeRegIDEX.RS_One           <= 5'b0;
      PipeRegIDEX.RS_Two           <= 5'b0;
      PipeRegIDEX.rd               <= 5'b0;
      PipeRegIDEX.ImmG             <= 32'b0;
      PipeRegIDEX.func3            <= 3'b0;
      PipeRegIDEX.func7            <= 7'b0;
      PipeRegIDEX.Curr_Instr       <= 32'b0;
    end else if (!stall) begin
      PipeRegIDEX.ALU2ndOperandSrc <= alu_src;
      PipeRegIDEX.WrRegDataSrc     <= WrRegDataSrc;
      PipeRegIDEX.RegWrite         <= reg_write_en;
      PipeRegIDEX.MemRead          <= mem_read_en;
      PipeRegIDEX.MemWrite         <= mem_write_en;
      PipeRegIDEX.ALUOp            <= alu_op;
      PipeRegIDEX.Branch           <= branch_taken;
      PipeRegIDEX.JalrSel          <= jalr_sel;
      PipeRegIDEX.JalSel           <= jal_sel;
      PipeRegIDEX.RWSel            <= RWSel;
      PipeRegIDEX.Curr_Pc          <= PipeRegIFID.Curr_Pc;
      PipeRegIDEX.RD_One           <= rd1;
      PipeRegIDEX.RD_Two           <= rd2;
      PipeRegIDEX.RS_One           <= PipeRegIFID.Curr_Instr[19:15];
      PipeRegIDEX.RS_Two           <= PipeRegIFID.Curr_Instr[24:20];
      PipeRegIDEX.rd               <= PipeRegIFID.Curr_Instr[11:7];
      PipeRegIDEX.ImmG             <= ImmG;
      PipeRegIDEX.func3            <= PipeRegIFID.Curr_Instr[14:12];
      PipeRegIDEX.func7            <= PipeRegIFID.Curr_Instr[31:25];
      PipeRegIDEX.Curr_Instr       <= PipeRegIFID.Curr_Instr;
    end
  end

  // ====================================================================================
  //                                    Execution (EX)
  // ====================================================================================
  //
  // add your ALU, branch unit and with peripheral logic here
  //
  logic [31:0] FA_mux_result, FB_mux_result;
  logic [31:0] ALU_result, PCplusImm, PCplus4_EX, src_mux_result, mem_data;
  logic [1:0] ForwardA, ForwardB;
  logic zero;

  assign aluop_current = PipeRegIDEX.ALUOp;

  assign opcode_IDEX = PipeRegIDEX.Curr_Instr[6:0];

  assign funct3 = PipeRegIDEX.func3;

  assign funct7 = PipeRegIDEX.func7;

  alu ALU (
      .operand_a(FA_mux_result),
      .operand_b(src_mux_result),
      .alu_ctrl(alu_cc),
      .alu_result(ALU_result),
      .zero(zero)
  );
// ====================================================================================
  //  Branch unit: computes the jump target and BrFlush if there is a jump or taken branch
  // ====================================================================================
  BranchUnit Branch_unit (
      .cur_pc(PipeRegIDEX.Curr_Pc),
      .imm(PipeRegIDEX.ImmG),
      .jalr_sel(PipeRegIDEX.JalrSel),
      .jal_sel(PipeRegIDEX.JalSel),
      .branch_taken(PipeRegIDEX.Branch),
      .alu_result(ALU_result),
      .pc_plus_imm(PCplusImm),
      .pc_plus_4(PCplus4_EX),
      .branch_target(BrPC),
      .pc_sel(BrFlush)
  );
  mux4 FA_mux (
      .d00(PipeRegIDEX.RD_One),
      .d10(mem_data),
      .d01(wb_data),
      .d11(32'b0),
      .s  (ForwardA),
      .y  (FA_mux_result)
  );

  mux4 FB_mux (
      .d00(PipeRegIDEX.RD_Two),
      .d10(mem_data),
      .d01(wb_data),
      .d11(32'b0),
      .s  (ForwardB),
      .y  (FB_mux_result)
  );

  mux2 src_mux (
      .d0(FB_mux_result),
      .d1(PipeRegIDEX.ImmG),
      .s (PipeRegIDEX.ALU2ndOperandSrc),
      .y (src_mux_result)
  );

  // ====================================================================================
  //                                End of Execution (EX)
  // ====================================================================================
  always @(posedge clock, posedge reset) begin
    // add your logic here to update the EX_MEM_Register
    if (reset) begin
      PipeRegEXMEM.RegWrite     <= 1'b0;
      PipeRegEXMEM.WrRegDataSrc <= 1'b0;
      PipeRegEXMEM.MemRead      <= 1'b0;
      PipeRegEXMEM.MemWrite     <= 1'b0;
      PipeRegEXMEM.RWSel        <= 2'b0;
      PipeRegEXMEM.Pc_Imm       <= 32'b0;
      PipeRegEXMEM.Pc_Four      <= 32'b0;
      PipeRegEXMEM.Imm_Out      <= 32'b0;
      PipeRegEXMEM.Alu_Result   <= 32'b0;
      PipeRegEXMEM.MemWrData    <= 32'b0;
      PipeRegEXMEM.rd           <= 5'b0;
      PipeRegEXMEM.func3        <= 3'b0;
      PipeRegEXMEM.func7        <= 7'b0;
      PipeRegEXMEM.Curr_Instr   <= 32'b0;
    end else begin
      PipeRegEXMEM.RegWrite     <= PipeRegIDEX.RegWrite;
      PipeRegEXMEM.WrRegDataSrc <= PipeRegIDEX.WrRegDataSrc;
      PipeRegEXMEM.MemRead      <= PipeRegIDEX.MemRead;
      PipeRegEXMEM.MemWrite     <= PipeRegIDEX.MemWrite;
      PipeRegEXMEM.RWSel        <= PipeRegIDEX.RWSel;
      PipeRegEXMEM.Pc_Imm       <= PCplusImm;
      PipeRegEXMEM.Pc_Four      <= PCplus4_EX;
      PipeRegEXMEM.Imm_Out      <= PipeRegIDEX.ImmG;
      PipeRegEXMEM.Alu_Result   <= ALU_result;
      PipeRegEXMEM.MemWrData    <= FB_mux_result;
      PipeRegEXMEM.rd           <= PipeRegIDEX.rd;
      PipeRegEXMEM.func3        <= PipeRegIDEX.func3;
      PipeRegEXMEM.func7        <= PipeRegIDEX.func7;
      PipeRegEXMEM.Curr_Instr   <= PipeRegIDEX.Curr_Instr;
    end
  end
  // ====================================================================================
  //                                    Memory Access (MEM)
  // ====================================================================================
  // add your data memory here.
  logic [31:0] ReadData;
  datamemory DM (
      .clock(clock),
      .read_en(PipeRegEXMEM.MemRead),
      .write_en(PipeRegEXMEM.MemWrite),
      .address(PipeRegEXMEM.Alu_Result[11:0]),
      .data_in(PipeRegEXMEM.MemWrData),
      .funct3(PipeRegEXMEM.func3),
      .data_out(ReadData)
  );

  mux4 mem_mux (
      .d00(PipeRegEXMEM.Alu_Result),
      .d01(PipeRegEXMEM.Pc_Four),
      .d10(PipeRegEXMEM.Imm_Out),
      .d11(PipeRegEXMEM.Pc_Imm),
      .s  (PipeRegEXMEM.RWSel),
      .y  (mem_data)
  );

  // ====================================================================================
  //                                End of Memory Access (MEM)
  // ====================================================================================
  always @(posedge clock) begin
    // add your logic here to update the MEM_WB_Register
    if (reset) begin
      PipeRegMEWB.RegWrite     <= 1'b0;
      PipeRegMEWB.WrRegDataSrc <= 1'b0;
      PipeRegMEWB.RWSel        <= 2'b0;
      PipeRegMEWB.Pc_Imm       <= 32'b0;
      PipeRegMEWB.Pc_Four      <= 32'b0;
      PipeRegMEWB.Imm_Out      <= 32'b0;
      PipeRegMEWB.Alu_Result   <= 32'b0;
      PipeRegMEWB.MemReadData  <= 32'b0;
      PipeRegMEWB.rd           <= 5'b0;
      PipeRegMEWB.Curr_Instr   <= 32'b0;
    end else begin
      PipeRegMEWB.RegWrite     <= PipeRegEXMEM.RegWrite;
      PipeRegMEWB.WrRegDataSrc <= PipeRegEXMEM.WrRegDataSrc;
      PipeRegMEWB.RWSel        <= PipeRegEXMEM.RWSel;
      PipeRegMEWB.Pc_Imm       <= PipeRegEXMEM.Pc_Imm;
      PipeRegMEWB.Pc_Four      <= PipeRegEXMEM.Pc_Four;
      PipeRegMEWB.Imm_Out      <= PipeRegEXMEM.Imm_Out;
      PipeRegMEWB.Alu_Result   <= PipeRegEXMEM.Alu_Result;
      PipeRegMEWB.MemReadData  <= ReadData;
      PipeRegMEWB.rd           <= PipeRegEXMEM.rd;
      PipeRegMEWB.Curr_Instr   <= PipeRegEXMEM.Curr_Instr;
      // $display("------------------------------------------------------------------");
      // $display("PipeRegMEWB.Curr_Instr = %x",PCplus4);
      // // $display("PipeRegMEWB.Imm_Out = %x", PipeRegMEWB.Imm_Out);
      // // $display("PipeRegMEWB.Pc_Four = %x", PipeRegMEWB.Pc_Four);
      // $display("PCplusImm = %x", PipeRegMEWB.Imm_Out);
      // $display("PCplus4_EX = %x", PipeRegMEWB.Pc_Four);
      // $display("PipeRegMEWB.WrRegDataSrc = %x", BrPC);
      // $display("PipeRegIDEX.Curr_Pc = %x", PipeRegIDEX.Curr_Pc);
      // $display("PipeRegMEWB.Alu_Result = %x", PipeRegMEWB.Alu_Result);
      // $display("wb_data = %x",wb_data);
    end
  end

  // ====================================================================================
  //                                  Write Back (WB)
  // ====================================================================================
  //
  // add your write back logic here.
  //
  logic [31:0] res_mux_result;
  mux2 res_mux (
      .d0(PipeRegMEWB.Alu_Result),
      .d1(PipeRegMEWB.MemReadData),
      .s (PipeRegMEWB.WrRegDataSrc),
      .y (res_mux_result)
  );

  mux4 wrs_mux (
      .d00(res_mux_result),
      .d01(PipeRegMEWB.Pc_Four),
      .d10(PipeRegMEWB.Imm_Out),
      .d11(PipeRegMEWB.Pc_Imm),
      .s  (PipeRegMEWB.RWSel),
      .y  (wb_data)
  );

  // ====================================================================================
  //                               End of Write Back (WB)
  // ====================================================================================
  // ====================================================================================
  //                                   other logic
  // ====================================================================================
  //
  // add your hazard detection logic here
  //
  Hazard_detector hazard_unit (
      .clock(clock),
      .reset(reset),
      .if_id_rs1(PipeRegIFID.Curr_Instr[19:15]),
      .if_id_rs2(PipeRegIFID.Curr_Instr[24:20]),
      .id_ex_rd(PipeRegIDEX.rd),
      .id_ex_memread(PipeRegIDEX.MemRead),
      .stall(stall)
  );

  //
  // add your forwarding logic here
  //
  ForwardingUnit forwarding_unit (
      .rs1(PipeRegIDEX.RS_One),
      .rs2(PipeRegIDEX.RS_Two),
      .ex_mem_rd(PipeRegEXMEM.rd),
      .mem_wb_rd(PipeRegMEWB.rd),
      .ex_mem_regwrite(PipeRegEXMEM.RegWrite),
      .mem_wb_regwrite(PipeRegMEWB.RegWrite),
      .forward_a(ForwardA),
      .forward_b(ForwardB)
  );
  //
  // possible extra code
  //


endmodule

pipeline_regs

//
package  Pipe_Buf_Reg_PKG;
    // Reg A
    typedef struct packed{
        logic [8:0]     Curr_Pc;
        logic [31:0]    Curr_Instr;
    } if_id_reg;

    // Reg B
    typedef struct packed{
        logic ALU2ndOperandSrc;
        logic WrRegDataSrc;
        logic RegWrite;
        logic MemRead;
        logic MemWrite;
        logic [1:0]     ALUOp;
        logic Branch;
        logic JalrSel;
        logic JalSel;               
        logic [1:0]     RWSel;
        logic [8:0]     Curr_Pc;
        logic [31:0]    RD_One;     // 1st read data from the reg file.
        logic [31:0]    RD_Two;     // 2nd read data from the reg file.
        logic [4:0]     RS_One;     // 1st read register address of Curr_Instr
        logic [4:0]     RS_Two;     // 2nd read register address of Curr_Instr
        logic [4:0]     rd;         // dst register to write of Curr_Instr
        logic [31:0]    ImmG;       // Immediate value generated by Imm_gen
        logic [2:0]     func3;
        logic [6:0]     func7;
        logic [31:0]    Curr_Instr;
    } id_ex_reg;

    // Reg C
    typedef struct packed{
        logic RegWrite;
        logic WrRegDataSrc;
        logic MemRead;
        logic MemWrite;
        logic [1:0]     RWSel;
        logic [31:0]    Pc_Imm;
        logic [31:0]    Pc_Four;
        logic [31:0]    Imm_Out;
        logic [31:0]    Alu_Result;
        logic [31:0]    MemWrData;  // data written to the memory
        logic [4:0]     rd;         // dst register to write of Curr_Instr
        logic [2:0]     func3;
        logic [6:0]     func7;
        logic [31:0]    Curr_Instr;
    } ex_mem_reg;

    // Reg D
    typedef struct packed{
        logic RegWrite;
        logic WrRegDataSrc;
        logic [1:0]     RWSel;
        logic [31:0]    Pc_Imm;
        logic [31:0]    Pc_Four;
        logic [31:0]    Imm_Out;
        logic [31:0]    Alu_Result;
        logic [31:0]    MemReadData;
        logic [4:0]     rd;
        logic [31:0]    Curr_Instr;
    } mem_wb_reg;
endpackage

proc_controller

`timescale 1ns / 1ps
// main controller
module Proc_controller (

    // ======================== Inputs ========================
    input logic [6:0] Opcode,  // 7-bit opcode field of the instruction

    // ======================== Outputs ========================
    // Selection signal for the source of the 2nd ALU operand:
    //0: the 2nd read data from the register file;
    //1: the immediate value generated by Imm_gen
    output logic ALU2ndOperandSrc,

    // Selection signal for the source of the value written to the register:
    // 0: ALU; 1: data memory.
    output logic WrRegDataSrc,

    // Write register enable
    output logic WrRegEn,

    // Write memory enable
    output logic WrMemEn,

    // Read memory enable
    output logic RdMemEn,

    // 00: LW/SW/AUIPC;  01: Branch;
    // 10: Rtype;  11: JAL/LUI/JALR and the Itype ALU instructions.
    // Itype is kept apart from Rtype because the funct3 values of the two
    // formats overlap, which would otherwise make them ambiguous
    output logic [1:0] ALUOp,

    //0: branch is not taken; 1: branch is taken
    output logic Branch,

    //0: Jalr is not taken; 1: jalr is taken
    output logic JalrSel,
    output logic JalSel,  // kept separate from JalrSel because the branch unit handles jal and jalr differently: jal adds the immediate to the pc directly

    // 00: Register Write Back;
    // 01: PC+4 write back(JAL/JALR);
    // 10: imm-gen write back(LUI);
    // 11: pc+imm-gen write back(AUIPC)
    output logic [1:0] RWSel
);

  logic [10:0] con;
  /***************************************************** instruction types *****************************************************/
  // RType
  localparam logic [6:0] RType = 7'b0110011;  // includes slt
  // IType
  localparam logic [6:0] ori   = 7'b0010011;  // also andi
  localparam logic [6:0] lw    = 7'b0000011;  // also lb, lh, lbu, lhu
  localparam logic [6:0] jalr  = 7'b1100111;
  // JType
  localparam logic [6:0] jal   = 7'b1101111;
  // UType
  localparam logic [6:0] lui   = 7'b0110111;
  // SType
  localparam logic [6:0] sw    = 7'b0100011;  // also sb, sh
  // BType
  localparam logic [6:0] beq   = 7'b1100011;  // also bne, blt, bge, bltu, bgeu
  /***************************************************** control signals *****************************************************/
  assign ALU2ndOperandSrc = (Opcode == lw || Opcode == sw || Opcode == ori || Opcode == jalr || Opcode == jal) ? 1'b1 : 1'b0;
  assign WrRegDataSrc = (Opcode == lw) ? 1'b1 : 1'b0;
  assign WrRegEn = (Opcode == RType || Opcode == lw || Opcode == ori || Opcode == lui || Opcode == jal || Opcode == jalr) ? 1'b1 : 1'b0;
  assign RdMemEn = (Opcode == lw) ? 1'b1 : 1'b0;
  assign WrMemEn = (Opcode == sw) ? 1'b1 : 1'b0;
  assign ALUOp = (Opcode == RType) ? 2'b10 : (Opcode == beq) ? 2'b01 : (Opcode == jal || Opcode == jalr || Opcode == lui || Opcode == ori) ? 2'b11 : 2'b00;
  assign Branch = (Opcode == beq) ? 1'b1 : 1'b0;
  assign JalrSel = (Opcode == jalr) ? 1'b1 : 1'b0;
  assign JalSel = (Opcode == jal) ? 1'b1 : 1'b0;
  assign RWSel = (Opcode == jalr || Opcode == jal)? 2'b01 : (Opcode == lui) ? 2'b10 :(Opcode == beq) ? 2'b11 : 2'b00;
endmodule

reg_file

`timescale 1ns / 1ps
// register file
module Reg_file #(
    parameter DATA_WIDTH    = 32,  // number of bits in each register
    parameter ADDRESS_WIDTH = 5, //number of registers = 2^ADDRESS_WIDTH
    parameter NUM_REGS      = 2 ** ADDRESS_WIDTH
)(
   // Inputs
   input  clock,                  //clock
   input  reset,                  //synchronous reset; reset all regs to 0  upon assertion.
   input  write_en,            //write enable
   input  [ADDRESS_WIDTH-1:0] write_addr, //address of the register to be written
   input  [DATA_WIDTH-1:0]    data_in, // data to be written into the register file
   input  [ADDRESS_WIDTH-1:0] read_addr1, //first address to be read from
   input  [ADDRESS_WIDTH-1:0] read_addr2, //second address to be read from

   // Outputs
   output logic [DATA_WIDTH-1:0] data_out1, //content of register_file[read_addr1]
   output logic [DATA_WIDTH-1:0] data_out2  //content of register_file[read_addr2]
);


integer i;

logic [DATA_WIDTH-1:0] register_file [NUM_REGS-1:0];

always @( negedge clock )
begin
    if( reset == 1'b1 )
        for (i = 0; i < NUM_REGS ; i = i + 1) begin
            register_file [i] <= 0;
        end
    else if( write_en == 1'b1 && write_addr ) begin
        register_file [ write_addr ] <=    data_in;
    end
end
assign data_out1 = register_file[read_addr1];
assign data_out2 = register_file[read_addr2];


endmodule

riscv_proc

`timescale 1ns / 1ps
// the top module for the RISC-V processor
// basically, you do not need to modify this file.
module riscv #(
    parameter DATA_W = 32
) (
    input logic clock,
    input logic reset,
    output logic [31:0] WB_Data  // data written back to the register file
);

  logic [6:0] opcode;
  logic [6:0] opcode_IDEX;
  logic
      ALU2ndOperandSrc,
      WrRegDataSrc,
      RegWrite,
      MemRead,
      MemWrite,
      Branch,
      JalrSel,
      JalSel;  // declared explicitly instead of relying on an implicit net
  logic [1:0] RWSel;

  logic [1:0] ALUop;
  logic [1:0] ALUop_Reg;
  logic [6:0] Funct7;
  logic [2:0] Funct3;
  logic [3:0] Operation;

  Proc_controller proc_controller (
      opcode,
      ALU2ndOperandSrc,
      WrRegDataSrc,
      RegWrite,
      MemWrite,
      MemRead,
      ALUop,
      Branch,
      JalrSel,
      JalSel,
      RWSel
  );

  ALU_Controller proc_alu_controller (
      ALUop_Reg,
      Funct7,
      Funct3,
      opcode_IDEX,
      Operation
  );

  Datapath proc_data_path (
      clock,
      reset,
      RegWrite,
      WrRegDataSrc,
      ALU2ndOperandSrc,
      MemWrite,
      MemRead,
      Branch,
      JalrSel,
      JalSel,
      ALUop,
      RWSel,
      Operation,
      opcode,
      opcode_IDEX,
      Funct7,
      Funct3,
      ALUop_Reg,
      WB_Data
  );

endmodule

adder

`timescale 1ns / 1ps
// adder
module adder #(
    parameter WIDTH = 8
) (
    input  logic [WIDTH-1:0] a,
    b,
    output logic [WIDTH-1:0] y
);

  // add your adder logic here
  assign y = a + b;
endmodule

alu_controller

`timescale 1ns / 1ps
// arithmetic logic unit controller
module ALU_Controller (
    input  logic [1:0] alu_op,    // 2-bit opcode field from the Proc_controller
    input  logic [6:0] funct7,    // insn[31:25]
    input  logic [2:0] funct3,    // insn[14:12]
    input  logic [6:0] opcode,    // insn[6:0], used to tell funct bits from imm bits: Utype has no funct field, but its imm bits may equal a funct value
    output logic [3:0] operation  // operation selection for ALU
);

  // add your code here.
  
  always_comb begin
    
    case (alu_op)
      2'b00: //Store & load
        operation = 4'b0010;

      2'b01://branch 
      case (funct3)
        3'b000: 
          operation = 4'b0110; // beq -> sub
        3'b001: 
          operation = 4'b1001; // bne
        3'b100: 
          operation = 4'b1010; // blt
        3'b101: 
          operation = 4'b1011; // bge
        3'b110: 
          operation = 4'b1100; // bltu
        3'b111: 
          operation = 4'b1101; // bgeu
        default: begin
          operation = 4'b0000;
          $display("---------------------------------->Undefined_ALU_controller_Type\n");
        end
      endcase

      2'b10://RType
      case (funct3)
        3'b000:
        case (funct7)
          //add
          7'b0000000: 
            operation = 4'b0010;
          //sub
          7'b0100000: 
            operation = 4'b0110;
          // any other funct7 value is not a supported encoding; fall back to and
          default: 
            operation = 4'b0000;
        endcase
        3'b111://and
          operation = 4'b0000;
        3'b110://or
          operation = 4'b0001;
        3'b100://xor
          operation = 4'b0111;
        3'b101://mul in this design (funct7=0, funct3=101); standard RV32I uses this encoding for srl
          operation = 4'b0011;
        3'b010:
        case (funct7)
          //slt
          7'b0000000: operation = 4'b1000;
          default: begin
            operation = 4'b0000;
            $display("---------------------------------->Undefined_ALU_controller_Type\n");
          end
        endcase
        default: begin
          operation = 4'b0000;
          $display("---------------------------------->Undefined_ALU_controller_Type\n");
        end
      endcase

      2'b11: //IType,Utype
        case (opcode) 
          7'b0010011:
            case (funct3)
              3'b000: operation = 4'b0010;//addi
              3'b111: operation = 4'b0000;//andi
              3'b110: operation = 4'b0001;//ori
              3'b101: operation = 4'b0011;//muli (specific to this design; not a standard RISC-V instruction)
            default: operation = 4'b0000;
            endcase
          default:operation = 4'b0010;//jalr
        endcase
    endcase
  end

endmodule

alu

`timescale 1ns / 1ps
// arithmetic logic unit
module alu #(
    parameter DATA_WIDTH    = 32,
    parameter OPCODE_LENGTH = 4
) (
    input  logic signed [   DATA_WIDTH - 1 : 0] operand_a,
    input  logic signed [   DATA_WIDTH - 1 : 0] operand_b,
    input  logic [OPCODE_LENGTH - 1 : 0] alu_ctrl,    // Operation
    output logic signed [   DATA_WIDTH - 1 : 0] alu_result,
    output logic                         zero
);

  // Unsigned views of the operands, used by bltu and bgeu
  logic [DATA_WIDTH - 1 : 0] unsigned_operand_a, unsigned_operand_b;
  assign unsigned_operand_a = operand_a;
  assign unsigned_operand_b = operand_b;

  // The branch comparisons (bne ... bgeu) return 0 when the branch condition
  // holds, like beq (a - b == 0), so the branch unit can test alu_result == 0.
  always_comb begin
    case (alu_ctrl)
      4'b0000:  // and
        alu_result = operand_a & operand_b;
      4'b0001:  // or
        alu_result = operand_a | operand_b;
      4'b0010:  // add
        alu_result = operand_a + operand_b;
      4'b0110:  // sub, also used by beq
        alu_result = operand_a - operand_b;
      4'b0111:  // xor
        alu_result = operand_a ^ operand_b;
      4'b0011:  // mul, muli
        alu_result = operand_a * operand_b;
      4'b1000:  // slt
        alu_result = (operand_a < operand_b) ? 1'b1 : 1'b0;
      4'b1001:  // bne
        alu_result = (operand_a != operand_b) ? 1'b0 : 1'b1;
      4'b1010:  // blt
        alu_result = (operand_a < operand_b) ? 1'b0 : 1'b1;
      4'b1011:  // bge
        alu_result = (operand_a >= operand_b) ? 1'b0 : 1'b1;
      4'b1100:  // bltu
        alu_result = (unsigned_operand_a < unsigned_operand_b) ? 1'b0 : 1'b1;
      4'b1101:  // bgeu
        alu_result = (unsigned_operand_a >= unsigned_operand_b) ? 1'b0 : 1'b1;
      default: begin
        alu_result = 32'b0;
        $display("---------------------------------->Undefined_ALU_Type\n");
      end
    endcase
    // Plain blocking assignment; an `assign` statement does not belong inside always_comb.
    zero = (alu_result == 0);
  end

endmodule


branch_unit

`timescale 1ns / 1ps
// Jump unit
module BranchUnit #(
    parameter PC_W = 9
) (
    input  logic [PC_W - 1:0] cur_pc,
    input  logic [      31:0] imm,
    input  logic              jalr_sel,
    input  logic              jal_sel,
    input  logic              branch_taken,   // Branch
    input  logic [      31:0] alu_result,
    output logic [      31:0] pc_plus_imm,    // PC + imm
    output logic [      31:0] pc_plus_4,      // PC + 4
    output logic [      31:0] branch_target,  // BrPC
    output logic              pc_sel
);

  logic [31:0] pc;
  assign pc = {{(32 - PC_W){1'b0}}, cur_pc};  // zero-extend the PC to 32 bits

  always_comb begin
    pc_plus_4 = pc + 4;
    // jal and jalr compute their targets differently: jalr uses register rs1
    // (alu_result = rs1 + imm, with bit 0 cleared), while jal and branches use pc + imm.
    pc_plus_imm = jalr_sel ? {alu_result[31:1], 1'b0} : pc + imm;
    // A branch is taken when the ALU comparison result is 0.
    pc_sel = (branch_taken && alu_result == 0) || jalr_sel || jal_sel;
    branch_target = pc_plus_imm;
  end

endmodule
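
The target rule used by the branch unit can be checked in isolation. The following is a minimal self-checking sketch, not part of the design: the module name `next_pc_demo` and the function `next_target` are hypothetical, and the function only mirrors the rule "jalr uses rs1 + imm with bit 0 cleared, everything else uses pc + imm".

```systemverilog
`timescale 1ns / 1ps
// Hypothetical stand-alone check of the jump-target rule (not part of the design).
module next_pc_demo;

  // jalr: (rs1 + imm) with bit 0 cleared; jal and branches: pc + imm.
  function automatic logic [31:0] next_target(
      input logic        is_jalr,
      input logic [31:0] pc,
      input logic [31:0] rs1,
      input logic [31:0] imm);
    logic [31:0] sum;
    sum = is_jalr ? (rs1 + imm) : (pc + imm);
    return is_jalr ? {sum[31:1], 1'b0} : sum;
  endfunction

  initial begin
    // jalr: 0x100 + 7 = 0x107, bit 0 cleared gives 0x106
    assert (next_target(1'b1, 32'h0000_0020, 32'h0000_0100, 32'h0000_0007) == 32'h0000_0106)
      else $error("jalr target mismatch");
    // jal: 0x20 + 0x10 = 0x30
    assert (next_target(1'b0, 32'h0000_0020, 32'h0000_0000, 32'h0000_0010) == 32'h0000_0030)
      else $error("jal target mismatch");
    // backward branch: 0x20 + (-8) = 0x18
    assert (next_target(1'b0, 32'h0000_0020, 32'h0000_0000, 32'hFFFF_FFF8) == 32'h0000_0018)
      else $error("branch target mismatch");
    $display("next_pc_demo: checks finished");
    $finish;
  end

endmodule
```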

data_mem

`timescale 1ns / 1ps
// Data memory (byte-addressable, little-endian)
module datamemory #(
    parameter ADDR_WIDTH = 12,
    parameter DATA_WIDTH = 32
) (
    input  logic                     clock,
    input  logic                     read_en,
    input  logic                     write_en,
    input  logic [ADDR_WIDTH -1 : 0] address,   // read/write address
    input  logic [DATA_WIDTH -1 : 0] data_in,   // write Data
    input  logic [              2:0] funct3,    // insn[14:12]
    output logic [DATA_WIDTH -1 : 0] data_out   // read data
);

  // Byte-wide memory so that sb/sh/sw and lb/lh/lw can access 1, 2 or 4 bytes.
  // Multi-byte values are stored little-endian (lowest byte at the lowest address).
  logic [7 : 0] MEM[(2**ADDR_WIDTH) - 1 : 0];

  always @(posedge clock)
    if (write_en) begin
      case (funct3)
        3'b000: begin  // sb
          MEM[address] <= data_in[7:0];
        end
        3'b001: begin  // sh
          MEM[address]     <= data_in[7:0];
          MEM[address + 1] <= data_in[15:8];
        end
        3'b010: begin  // sw
          MEM[address]     <= data_in[7:0];
          MEM[address + 1] <= data_in[15:8];
          MEM[address + 2] <= data_in[23:16];
          MEM[address + 3] <= data_in[31:24];
        end
        default: ;  // other funct3 values do not write
      endcase
    end

  always_comb begin
    // The default also covers read_en == 0, so no latch is inferred.
    data_out = 32'b0;
    if (read_en) begin
      case (funct3)
        3'b000:  // lb: one byte, sign-extended
          data_out = {{24{MEM[address][7]}}, MEM[address]};
        3'b001:  // lh: two bytes, sign-extended
          data_out = {{16{MEM[address + 1][7]}}, MEM[address + 1], MEM[address]};
        3'b010:  // lw
          data_out = {MEM[address + 3], MEM[address + 2], MEM[address + 1], MEM[address]};
        3'b100:  // lbu: one byte, zero-extended
          data_out = {24'b0, MEM[address]};
        3'b101:  // lhu: two bytes, zero-extended
          data_out = {16'b0, MEM[address + 1], MEM[address]};
        default: data_out = 32'b0;
      endcase
    end
  end

endmodule
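
To make the little-endian layout and the sign/zero extension concrete, here is a minimal self-checking sketch. It is an illustration only: the module `little_endian_demo`, its 16-byte array `mem` and the function `load` are hypothetical names, and the funct3 values follow the RV32I load encodings (lb = 000, lh = 001, lw = 010, lbu = 100, lhu = 101).

```systemverilog
`timescale 1ns / 1ps
// Hypothetical stand-alone illustration of little-endian loads (not part of the design).
module little_endian_demo;

  logic [7:0] mem [0:15];  // small byte array, for illustration only

  // Combinational load, selected by funct3 as in RV32I.
  function automatic logic [31:0] load(input logic [2:0] funct3, input logic [3:0] addr);
    case (funct3)
      3'b000:  return {{24{mem[addr][7]}}, mem[addr]};                          // lb
      3'b001:  return {{16{mem[addr + 1][7]}}, mem[addr + 1], mem[addr]};       // lh
      3'b010:  return {mem[addr + 3], mem[addr + 2], mem[addr + 1], mem[addr]}; // lw
      3'b100:  return {24'b0, mem[addr]};                                       // lbu
      3'b101:  return {16'b0, mem[addr + 1], mem[addr]};                        // lhu
      default: return 32'b0;
    endcase
  endfunction

  initial begin
    // sw 0x81775432 to address 4: the least significant byte goes to the lowest address
    mem[4] = 8'h32;
    mem[5] = 8'h54;
    mem[6] = 8'h77;
    mem[7] = 8'h81;

    assert (load(3'b010, 4'd4) == 32'h8177_5432) else $error("lw mismatch");
    assert (load(3'b000, 4'd4) == 32'h0000_0032) else $error("lb (positive) mismatch");
    assert (load(3'b000, 4'd7) == 32'hFFFF_FF81) else $error("lb (negative) mismatch");
    assert (load(3'b100, 4'd7) == 32'h0000_0081) else $error("lbu mismatch");
    assert (load(3'b001, 4'd6) == 32'hFFFF_8177) else $error("lh mismatch");
    assert (load(3'b101, 4'd6) == 32'h0000_8177) else $error("lhu mismatch");
    $display("little_endian_demo: checks finished");
    $finish;
  end

endmodule
```

The lb/lbu and lh/lhu pairs read the same bytes and differ only in how the upper bits are filled.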


flipflop

`timescale 1ns / 1ps

//
// Module Name: flipflop
// Description:  An edge-triggered register
//  When reset is `1`, the value of the register is set to 0.
//  Otherwise:
//    - if stall is set, the register preserves its original data
//    - else, it is updated by `d`.
//

module flipflop # (
    parameter WIDTH = 8
)(
    input  logic clock,
    input  logic reset,
    input  logic [WIDTH-1:0] d,
    input  logic stall,
    output logic [WIDTH-1:0] q
);

    always_ff @(posedge clock, posedge reset)
    begin
        if (reset)
            q <= 0;
        else if (!stall)
            q <= d;
    end


endmodule

forwarding_unit

`timescale 1ns / 1ps
// Data forwarding (bypass) unit
module ForwardingUnit (
    input logic [4:0] rs1,
    input logic [4:0] rs2,
    input logic [4:0] ex_mem_rd,
    input logic [4:0] mem_wb_rd,
    input logic ex_mem_regwrite,
    input logic mem_wb_regwrite,
    output logic [1:0] forward_a,
    output logic [1:0] forward_b
);

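  // Forwarding selects for the two ALU operands (see book P300):
  //   2'b10: forward from EX/MEM, 2'b01: forward from MEM/WB, 2'b00: use the register file value.
  // EX/MEM is checked first because it holds the most recent result.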
  assign forward_a = 
                    ((ex_mem_regwrite) && (ex_mem_rd != 0) && (ex_mem_rd == rs1))     ? 2'b10 :
                    ((mem_wb_regwrite) && (mem_wb_rd != 5'b0) && (rs1 == mem_wb_rd))  ? 2'b01 : 
                                                                                        2'b00 ;
  assign forward_b = 
                    ((ex_mem_regwrite) && (ex_mem_rd != 0) && (rs2 == ex_mem_rd))     ?  2'b10 : 
                    ((mem_wb_regwrite) && (mem_wb_rd != 5'b0) && (rs2 == mem_wb_rd))  ?  2'b01 : 
                                                                                         2'b00 ;

endmodule
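
The order of the two conditions matters when both pipeline registers hold a write to the same register, for example two consecutive instructions that both write x1 followed by one that reads it. The sketch below restates the same selection as a function and checks that case; `forward_priority_demo` and `fwd` are hypothetical names used only for this illustration.

```systemverilog
`timescale 1ns / 1ps
// Hypothetical stand-alone check of the forwarding priority (not part of the design).
module forward_priority_demo;

  // Same selection rule as forward_a / forward_b, written as a function.
  function automatic logic [1:0] fwd(
      input logic [4:0] rs,
      input logic [4:0] ex_mem_rd,
      input logic [4:0] mem_wb_rd,
      input logic       ex_mem_regwrite,
      input logic       mem_wb_regwrite);
    if (ex_mem_regwrite && (ex_mem_rd != 5'd0) && (ex_mem_rd == rs)) return 2'b10;
    if (mem_wb_regwrite && (mem_wb_rd != 5'd0) && (mem_wb_rd == rs)) return 2'b01;
    return 2'b00;
  endfunction

  initial begin
    // Both stages write x1: the newer EX/MEM value wins.
    assert (fwd(5'd1, 5'd1, 5'd1, 1'b1, 1'b1) == 2'b10) else $error("priority mismatch");
    // Only MEM/WB matches.
    assert (fwd(5'd1, 5'd2, 5'd1, 1'b1, 1'b1) == 2'b01) else $error("MEM/WB mismatch");
    // x0 is never forwarded.
    assert (fwd(5'd0, 5'd0, 5'd0, 1'b1, 1'b1) == 2'b00) else $error("x0 mismatch");
    // A matching rd without regwrite does not forward.
    assert (fwd(5'd3, 5'd3, 5'd4, 1'b0, 1'b1) == 2'b00) else $error("regwrite mismatch");
    $display("forward_priority_demo: checks finished");
    $finish;
  end

endmodule
```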

hazard_detector

`timescale 1ns / 1ps
// Hazard detector (stall generator)
module Hazard_detector (
    input logic clock,
    input logic reset,
    input logic [4:0] if_id_rs1,
    input logic [4:0] if_id_rs2,
    input logic [4:0] id_ex_rd,
    input logic id_ex_memread,
    output logic stall
);

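  // Load-use hazard: the instruction in EX is a load whose destination register
  // is read by the instruction in ID, so the pipeline must stall.
  // stall is updated on the falling clock edge.
  always @(negedge clock) begin
    stall <= id_ex_memread && ((id_ex_rd == if_id_rs1) || (id_ex_rd == if_id_rs2));
  end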

endmodule
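
The stall condition is the classic load-use check: forwarding cannot help when the producer is a load that is still in EX. A minimal sketch of the condition on its own, with a few instruction pairs as examples, could look like this; `load_use_demo` and `load_use` are hypothetical names, and the clocking of the real module is deliberately left out.

```systemverilog
`timescale 1ns / 1ps
// Hypothetical stand-alone check of the load-use condition (not part of the design).
module load_use_demo;

  // Purely combinational version of the stall condition.
  function automatic logic load_use(
      input logic       id_ex_memread,
      input logic [4:0] id_ex_rd,
      input logic [4:0] if_id_rs1,
      input logic [4:0] if_id_rs2);
    return id_ex_memread && ((id_ex_rd == if_id_rs1) || (id_ex_rd == if_id_rs2));
  endfunction

  initial begin
    // lw x5, 0(x2) followed by add x6, x5, x7: stall needed
    assert (load_use(1'b1, 5'd5, 5'd5, 5'd7) == 1'b1) else $error("rs1 case mismatch");
    // lw x5, 0(x2) followed by add x6, x7, x5: stall needed
    assert (load_use(1'b1, 5'd5, 5'd7, 5'd5) == 1'b1) else $error("rs2 case mismatch");
    // add x5, ... followed by add x6, x5, x7: not a load, forwarding is enough
    assert (load_use(1'b0, 5'd5, 5'd5, 5'd7) == 1'b0) else $error("non-load mismatch");
    // lw x5, 0(x2) followed by add x8, x6, x7: no dependency
    assert (load_use(1'b1, 5'd5, 5'd6, 5'd7) == 1'b0) else $error("independent mismatch");
    $display("load_use_demo: checks finished");
    $finish;
  end

endmodule
```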

imm_gen

`timescale 1ns / 1ps
// Immediate generator (sign extension)
module Imm_gen (
    input  logic [31:0] inst_code,
    output logic [31:0] imm_out
);
  
  logic [6:0] opcode;
  assign opcode = inst_code[6:0];
  always_comb begin
    imm_out = 32'b0;  // default value
    case (opcode)
      7'b0110011:  // RType does not use imm
        imm_out = 32'b0;
      7'b0010011:  // addi, ori, etc.: 12-bit immediate, sign-extended
        imm_out = {{20{inst_code[31]}}, inst_code[31:20]};
      7'b0000011, 7'b1100111:  // lw, jalr
        imm_out = {{20{inst_code[31]}}, inst_code[31:20]};
      7'b0110111:  // lui
        imm_out = {inst_code[31:12], 12'b0};
      7'b1101111:  // jal: 21-bit offset, sign-extended (11 sign bits, 32 bits in total)
        imm_out = {{11{inst_code[31]}}, inst_code[31], inst_code[19:12], inst_code[20], inst_code[30:21], 1'b0};
      7'b0100011:  // sw: 12-bit offset, sign-extended
        imm_out = {{20{inst_code[31]}}, inst_code[31:25], inst_code[11:7]};
      7'b1100011:  // beq and the other branches
        imm_out = {{19{inst_code[31]}}, inst_code[31], inst_code[7], inst_code[30:25], inst_code[11:8], 1'b0};
      default: imm_out = 32'b0;
    endcase
  end
endmodule
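
In RV32I every immediate format is sign-extended from instruction bit 31, which is easy to get wrong for the S and J formats because their bits are scattered. The sketch below assembles two instructions with negative offsets from their fields and checks the decoded value; `imm_check_demo`, `imm_s` and `imm_j` are hypothetical names for this illustration only.

```systemverilog
`timescale 1ns / 1ps
// Hypothetical stand-alone check of S-type and J-type immediates (not part of the design).
module imm_check_demo;

  // S-type: imm[11:5] = inst[31:25], imm[4:0] = inst[11:7]
  function automatic logic [31:0] imm_s(input logic [31:0] inst);
    return {{20{inst[31]}}, inst[31:25], inst[11:7]};
  endfunction

  // J-type: imm[20|10:1|11|19:12] = inst[31|30:21|20|19:12], imm[0] = 0
  function automatic logic [31:0] imm_j(input logic [31:0] inst);
    return {{11{inst[31]}}, inst[31], inst[19:12], inst[20], inst[30:21], 1'b0};
  endfunction

  logic [31:0] sw_inst, jal_inst;

  initial begin
    // sw x5, -4(x2): imm = 0xFFC, fields {imm[11:5], rs2, rs1, funct3, imm[4:0], opcode}
    sw_inst  = {7'b1111111, 5'd5, 5'd2, 3'b010, 5'b11100, 7'b0100011};
    // jal x0, -8: fields {imm[20], imm[10:1], imm[11], imm[19:12], rd, opcode}
    jal_inst = {1'b1, 10'b1111111100, 1'b1, 8'hFF, 5'd0, 7'b1101111};

    assert (imm_s(sw_inst)  == 32'hFFFF_FFFC) else $error("S-type immediate mismatch");
    assert (imm_j(jal_inst) == 32'hFFFF_FFF8) else $error("J-type immediate mismatch");
    $display("imm_check_demo: checks finished");
    $finish;
  end

endmodule
```

With a 9-bit PC and a 12-bit data address the upper immediate bits are truncated anyway, so a missing sign bit can go unnoticed in this experiment, but the full 32-bit value is what the ISA defines.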

insn_mem

`timescale 1ns / 1ps
// Instruction memory: word-indexed, so the two low address bits are dropped
module Insn_mem #(
    parameter ADDR_WIDTH = 9,
    parameter INSN_WIDTH = 32
)(
    input  logic [ADDR_WIDTH - 1 : 0] read_address,
    output logic [INSN_WIDTH - 1 : 0] insn
);

    logic [INSN_WIDTH-1 :0] insn_array [(2**(ADDR_WIDTH - 2))-1:0];

    assign insn = insn_array[read_address[ADDR_WIDTH - 1 : 2]];

endmodule

mux2

`timescale 1ns / 1ps
// Two-port multiplexer
module mux2 #(
    parameter WIDTH = 32
) (
    input logic [WIDTH-1:0] d0,
    input logic [WIDTH-1:0] d1,
    input logic s,
    output logic [WIDTH-1:0] y
);

  assign y = s ? d1 : d0;

endmodule

mux4

`timescale 1ns / 1ps
// Four-port multiplexer
module mux4 #(
    parameter WIDTH = 32
) (
    input logic [WIDTH-1:0] d00,
    input logic [WIDTH-1:0] d01,
    input logic [WIDTH-1:0] d10,
    input logic [WIDTH-1:0] d11,
    input logic [1:0] s,
    output logic [WIDTH-1:0] y
);

  // add your logic here
  assign y = (s == 2'b00) ? d00 :
             (s == 2'b01) ? d01 :
             (s == 2'b10) ? d10 : 
                            d11 ;

endmodule

Tags: risc-v

Posted by quasiman on Mon, 26 Sep 2022 04:28:33 +0530