How to implement an asynchronous FIFO more completely: function, simulation, and timing constraints

Functional implementation and timing constraints of an asynchronous FIFO

1 Introduction

FIFO stands for First In, First Out: a data buffer in which the first word written is the first word read. Unlike ordinary memory, a FIFO has no external read or write address lines. This makes it very simple to use, at the cost that data can only be written sequentially and read back in the same order.

FIFOs are widely used to pass data across clock domains. In practice we usually instantiate the Xilinx IP core directly, or use another already verified FIFO, rather than writing our own. Why not write our own FIFO? I think the main concerns are as follows.

 1. Can a FIFO that passes functional simulation still work correctly under the stricter timing requirements of higher clock frequencies?

 2. Can the FIFO work correctly when the read and write clock frequencies differ greatly?

 3. Are the timing constraints we apply to the FIFO reasonable?

None of these points is guaranteed for a home-made FIFO. Using the vendor's IP core avoids design errors caused by an overlooked corner case, so writing your own FIFO is not recommended. Here I would like to explain, from my own understanding, how to implement a FIFO more completely, purely for discussion. If you find any errors, please point them out. Many blog posts already cover the principle and basic implementation of a FIFO, so we will not repeat them here.

2 Implementation

2.1 Definition of ports and parameters

The functional implementation of a FIFO is covered in earlier blog posts (see "IC Foundation (I): asynchronous FIFO principle and code implementation" in the references). The first step is to fix the parameters and ports of the FIFO. Here we build a FIFO that is 16 bits wide and 256 words deep, so the address needs 8 bits and each pointer needs 9 bits, the extra bit being used to tell empty from full. We also declare a dual-port RAM to store the data.

//**************************************************************************
// ***Name: async_fifo.v
// ***Author: Bean Bear
// ***Date: April 20, 2021
// ***Description: asynchronous fifo module
//**************************************************************************

module async_fifo
//========================< parameters >==========================================
#(
    parameter  DATA_WIDTH         = 16,                  //Data bit width
    parameter  DATA_DEPTH         = 256,                 //FIFO depth
    parameter  ADDR_WIDTH         = 8                    //FIFO address bit width
)
//========================< port >==========================================
(
    input   wire                        rst,
    //Write port
    input   wire                        wr_clk,          //Write clock
    input   wire                        wr_en,           //Write enable
    input   wire      [DATA_WIDTH-1:0]  din,             //Write data bit width is 16 bits
    //Read port
    input   wire                        rd_clk,          //Read clock
    input   wire                        rd_en,           //Read enable
    //output
    output  reg                         dout_vld,        //Output data valid flag
    output  reg       [DATA_WIDTH-1:0]  dout,            //Output data
    output                              empty,           //FIFO empty flag
    output                              full             //FIFO full flag
);
//========================< signal >========================================== 
reg           [ADDR_WIDTH:0]    wr_addr_ptr;                //Write pointer: one bit wider than the address; the MSB tells whether both pointers are on the same lap
reg           [ADDR_WIDTH:0]    rd_addr_ptr;                //Read pointer, likewise one bit wider than the address
wire          [ADDR_WIDTH-1:0]  wr_addr;                    //Write address
wire          [ADDR_WIDTH-1:0]  rd_addr;                    //Read address

wire          [ADDR_WIDTH:0]    wr_addr_ptr_gray;           //Gray code of the write pointer
reg           [ADDR_WIDTH:0]    wr_addr_ptr_gray_d1;        //Write pointer Gray code, first synchronizer stage (rd_clk domain)
reg           [ADDR_WIDTH:0]    wr_addr_ptr_gray_d2;        //Write pointer Gray code, second synchronizer stage (rd_clk domain)

wire          [ADDR_WIDTH:0]    rd_addr_ptr_gray;           //Gray code of the read pointer
reg           [ADDR_WIDTH:0]    rd_addr_ptr_gray_d1;        //Read pointer Gray code, first synchronizer stage (wr_clk domain)
reg           [ADDR_WIDTH:0]    rd_addr_ptr_gray_d2;        //Read pointer Gray code, second synchronizer stage (wr_clk domain)

reg           [DATA_WIDTH-1:0]  fifo_ram [DATA_DEPTH-1:0];  //Dual-port RAM storage array
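
The relationship between depth, address width, and pointer width described above can also be derived rather than typed in by hand. The following is only a sketch, assuming a tool that supports the Verilog-2005 `$clog2` system function and a power-of-two depth; the module name `fifo_width_demo` and the name `PTR_WIDTH` are hypothetical and do not appear in async_fifo.v.

```verilog
// Hypothetical helper, not part of async_fifo.v: shows how the widths relate.
module fifo_width_demo;
    parameter  DATA_DEPTH = 256;                  // FIFO depth (assumed to be a power of two)
    localparam ADDR_WIDTH = $clog2(DATA_DEPTH);   // address bits needed to index DATA_DEPTH words
    localparam PTR_WIDTH  = ADDR_WIDTH + 1;       // one extra wrap bit for empty / full detection

    initial begin
        $display("ADDR_WIDTH = %0d, PTR_WIDTH = %0d", ADDR_WIDTH, PTR_WIDTH);
    end
endmodule
```

For a depth of 256 this gives the 8 address bits and 9 pointer bits used in this article.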

2.2 Initializing the dual-port RAM and reading / writing data

Here we clear the RAM while reset is asserted, write data when the FIFO is not full, and read data when the FIFO is not empty.

//==========================================================================
//==Initializing dual port ram and writing data
//==========================================================================
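//A single always block drives the memory. A generate loop would create
//DATA_DEPTH separate blocks that all write fifo_ram[wr_addr] (multiple drivers).
integer i;
always @(posedge wr_clk or posedge rst) begin
    if (rst == 1'b1) begin
        //Clear every word on reset
        for (i = 0; i < DATA_DEPTH; i = i + 1)
            fifo_ram[i] <= {DATA_WIDTH{1'b0}};
    end
    else if ((wr_en == 1'b1) && (full == 1'b0)) begin
        //Write only when the FIFO is not full; otherwise the memory holds its contents
        fifo_ram[wr_addr] <= din;
    end
end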
//==========================================================================
//==Generate output data and valid
//==========================================================================
//Out of reset, a read request while the FIFO is not empty produces the read data and the valid flag
always @(posedge rd_clk or posedge rst) begin
    if (rst == 1'b1) begin
        dout     <= 'h0;
        dout_vld <= 1'b0;
    end
    else if ((rd_en == 1'b1) && (empty == 1'b0)) begin
        dout     <= fifo_ram [rd_addr];
        dout_vld <= 1'b1;
    end
    else begin
        dout     <= 'h0;
        dout_vld <= 1'b0;
    end
end
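
Clearing every word of the memory on reset, as the code above does, usually forces the tools to build the array from flip-flops instead of block RAM. A common alternative, shown here only as a sketch, is a simple dual-port RAM with no reset on the storage array. The module name `sdp_ram` is hypothetical, and whether block RAM is actually inferred depends on the synthesis tool and target device.

```verilog
// Hypothetical simple dual-port RAM: one write port, one read port, separate clocks.
// The storage array has no reset, which is what usually allows block RAM inference.
module sdp_ram
#(
    parameter DATA_WIDTH = 16,
    parameter ADDR_WIDTH = 8
)
(
    input  wire                   wr_clk,
    input  wire                   wr_en,      // caller qualifies this with !full
    input  wire [ADDR_WIDTH-1:0]  wr_addr,
    input  wire [DATA_WIDTH-1:0]  din,
    input  wire                   rd_clk,
    input  wire                   rd_en,      // caller qualifies this with !empty
    input  wire [ADDR_WIDTH-1:0]  rd_addr,
    output reg  [DATA_WIDTH-1:0]  dout
);
    reg [DATA_WIDTH-1:0] mem [(1<<ADDR_WIDTH)-1:0];

    // Write port
    always @(posedge wr_clk) begin
        if (wr_en)
            mem[wr_addr] <= din;
    end

    // Read port: registered output that holds its value while rd_en is low
    always @(posedge rd_clk) begin
        if (rd_en)
            dout <= mem[rd_addr];
    end
endmodule
```

Unlike the FIFO code above, this sketch holds dout between reads instead of clearing it to zero, so a separate valid flag is still needed to mark fresh data.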

2.3 Read / write address and pointer generation

The read and write addresses are straightforward. Each pointer is one bit wider than its address, and this extra MSB is used to tell empty from full. If the two pointers are identical, MSB included, the FIFO is empty. If the MSBs differ while all lower bits match, the write pointer is exactly one full lap ahead of the read pointer, so the FIFO is full.

Note that because the read and write clock domains differ, each pointer must be synchronized into the other domain before the comparison. A multi-bit control signal, however, cannot simply be synchronized by registering it through flip-flops: the individual bits may be captured on different clock edges, producing a value that never existed. We therefore convert the pointers to Gray code, in which only one bit changes per increment. A two-stage flip-flop synchronizer then captures either the old value or the new value, never an invalid one.
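
The single-bit-change property is easy to confirm in simulation. The sketch below is a self-checking module, assuming the 9-bit pointer of the 256-deep FIFO; the module name `gray_onebit_check` and the helper functions are hypothetical, while the conversion expression is the same one used for the FIFO pointers.

```verilog
// Self-checking sketch: successive Gray codes of a 9-bit pointer differ in exactly one bit.
module gray_onebit_check;
    localparam PTR_WIDTH = 9;                       // ADDR_WIDTH + 1 for the 256-deep FIFO

    // Binary to Gray, same expression as used for the FIFO pointers
    function [PTR_WIDTH-1:0] bin2gray(input [PTR_WIDTH-1:0] b);
        bin2gray = (b >> 1) ^ b;
    endfunction

    // Count the bits that are set in a vector
    function integer ones(input [PTR_WIDTH-1:0] v);
        integer k;
        begin
            ones = 0;
            for (k = 0; k < PTR_WIDTH; k = k + 1)
                ones = ones + v[k];
        end
    endfunction

    integer n;
    integer errors;
    reg [PTR_WIDTH-1:0] cur, nxt;

    initial begin
        errors = 0;
        for (n = 0; n < (1 << PTR_WIDTH); n = n + 1) begin
            cur = n;
            nxt = n + 1;                            // truncation wraps 511 back to 0
            if (ones(bin2gray(cur) ^ bin2gray(nxt)) != 1) begin
                errors = errors + 1;
                $display("not exactly one bit changes at pointer %0d", n);
            end
        end
        if (errors == 0)
            $display("every increment changes exactly one Gray bit");
        $finish;
    end
endmodule
```

The loop also covers the wrap from the largest pointer value back to zero, which is where a plain binary counter would change every bit at once.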

After the conversion, the empty condition is still that the two Gray-code pointers are identical, here the read pointer and the write pointer synchronized into the read clock domain. The full condition becomes that the Gray-code write pointer and the Gray-code read pointer synchronized into the write clock domain differ in their top two bits, while all remaining bits are identical.

//==========================================================================
//==Generate empty / full signal
//==========================================================================
//Full: the write pointer and the read pointer synchronized into the write clock domain differ in the top two Gray bits and match in all others
assign full  = (wr_addr_ptr_gray == {~rd_addr_ptr_gray_d2[ADDR_WIDTH:ADDR_WIDTH-1],rd_addr_ptr_gray_d2[ADDR_WIDTH-2:0]});
//Empty: the read pointer and the write pointer synchronized into the read clock domain are identical
assign empty = (rd_addr_ptr_gray == wr_addr_ptr_gray_d2);
//==========================================================================
//==Generate read / write address
//==========================================================================
//Write address
assign wr_addr = wr_addr_ptr[ADDR_WIDTH-1:0];
//Read address
assign rd_addr = rd_addr_ptr[ADDR_WIDTH-1:0];
//==========================================================================
//==Pointer generation and cross clock domain synchronization
//==========================================================================
//Write address pointer generation
always @(posedge wr_clk or posedge rst) begin
    if (rst == 1'b1) begin
        wr_addr_ptr <= 'h0;
    end
    else if ((wr_en == 1'b1) && (full == 1'b0)) begin
        wr_addr_ptr <= wr_addr_ptr + 1'b1;
    end
    else begin
        wr_addr_ptr <= wr_addr_ptr;
    end
end
//Write address pointer gray code conversion
assign wr_addr_ptr_gray = (wr_addr_ptr >> 1) ^ wr_addr_ptr;
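//Write pointer synchronization into the read clock domain (two flip-flop stages)
always @(posedge rd_clk or posedge rst) begin
    if (rst == 1'b1) begin
        wr_addr_ptr_gray_d1 <= 'h0;
        wr_addr_ptr_gray_d2 <= 'h0;
    end
    else begin
        wr_addr_ptr_gray_d1 <= wr_addr_ptr_gray;
        wr_addr_ptr_gray_d2 <= wr_addr_ptr_gray_d1;
    end
end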

//Read address pointer generation
always @(posedge rd_clk or posedge rst) begin
    if (rst == 1'b1) begin
        rd_addr_ptr <= 'h0;
    end
    else if ((rd_en == 1'b1) && (empty == 1'b0)) begin
        rd_addr_ptr <= rd_addr_ptr + 1'b1;
    end
    else begin
        rd_addr_ptr <= rd_addr_ptr;
    end
end
//Read address pointer gray code conversion
assign rd_addr_ptr_gray = (rd_addr_ptr >> 1) ^ rd_addr_ptr;
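//Read pointer synchronization into the write clock domain (two flip-flop stages)
always @(posedge wr_clk or posedge rst) begin
    if (rst == 1'b1) begin
        rd_addr_ptr_gray_d1 <= 'h0;
        rd_addr_ptr_gray_d2 <= 'h0;
    end
    else begin
        rd_addr_ptr_gray_d1 <= rd_addr_ptr_gray;
        rd_addr_ptr_gray_d2 <= rd_addr_ptr_gray_d1;
    end
end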
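
The claim that the Gray-code comparisons are equivalent to the binary empty / full definitions can be checked exhaustively. The following self-checking sketch compares both forms for every pair of 9-bit pointers; the module name `gray_flag_check` and its signal names are hypothetical, and the check deliberately ignores the synchronizer delay.

```verilog
// Self-checking sketch: Gray-code empty / full flags match the binary definitions.
module gray_flag_check;
    localparam ADDR_WIDTH = 8;
    localparam PTR_WIDTH  = ADDR_WIDTH + 1;

    function [PTR_WIDTH-1:0] bin2gray(input [PTR_WIDTH-1:0] b);
        bin2gray = (b >> 1) ^ b;
    endfunction

    integer w, r, errors;
    reg [PTR_WIDTH-1:0] wp, rp, wg, rg;
    reg full_bin, empty_bin, full_gray, empty_gray;

    initial begin
        errors = 0;
        for (w = 0; w < (1 << PTR_WIDTH); w = w + 1) begin
            for (r = 0; r < (1 << PTR_WIDTH); r = r + 1) begin
                wp = w;
                rp = r;
                wg = bin2gray(wp);
                rg = bin2gray(rp);
                // Binary definitions: identical pointers, or MSB differs with equal addresses
                empty_bin  = (wp == rp);
                full_bin   = (wp[ADDR_WIDTH] != rp[ADDR_WIDTH]) &&
                             (wp[ADDR_WIDTH-1:0] == rp[ADDR_WIDTH-1:0]);
                // Gray-code definitions, written as in the FIFO
                empty_gray = (rg == wg);
                full_gray  = (wg == {~rg[ADDR_WIDTH:ADDR_WIDTH-1], rg[ADDR_WIDTH-2:0]});
                if ((empty_bin !== empty_gray) || (full_bin !== full_gray))
                    errors = errors + 1;
            end
        end
        $display("mismatches: %0d", errors);
        $finish;
    end
endmodule
```

In the real FIFO the pointer from the other domain arrives two clock cycles late, so full and empty may stay asserted slightly longer than strictly necessary. That pessimism is safe: it can delay a write or a read, but it cannot cause overflow or underflow.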

3 Simulation

The testbench is given below. To make simulation easier, I also include a script I commonly use. The script is named run.do; entering do run.do in the ModelSim console runs the whole simulation. With these two files you should be able to simulate the asynchronous FIFO easily.

//**************************************************************************
// ***Name: tb_async_fifo.v
// ***Author: Bean Bear
// ***Date: April 20, 2021
// ***Description: asynchronous fifo testbench
//**************************************************************************
`timescale 1ns/1ns

module  tb_async_fifo();
//========================< port >==========================================
reg             rst                 ;

reg             wr_clk              ;
reg             wr_en               ;
reg     [15:0]  din                 ;
wire            full                ;

reg             rd_clk              ;
reg             rd_en               ;
wire    [15:0]  dout                ;
wire            dout_vld            ;
wire            empty               ;

//==========================================================================
//==Variable assignment
//==========================================================================
//Clock, reset signal
initial
  begin
    wr_clk  =   1'b1    ;
    rd_clk  =   1'b1    ;
    rst     <=  1'b1    ;
    wr_en   <=  1'b0    ;
    rd_en   <=  1'b0    ;
    #200
    rst     <=  1'b0    ;
    wr_en   <=  1'b1    ;
    #20000
    wr_en   <=  1'b0    ;
    rd_en   <=  1'b1    ;
    #20000
    rd_en   <=  1'b0    ;
    $stop;
  end

always  #10 wr_clk = ~wr_clk;
always  #30 rd_clk = ~rd_clk;

always@(posedge wr_clk or posedge rst)begin
    if(rst == 1'b1)
        din <= 'd0;
    else if(wr_en)
        din <= din + 1'b1;
    else
        din <= din;
end
//==========================================================================
//==Modular instantiation
//==========================================================================
async_fifo async_fifo_inst
(
	.rst                (rst                ),
							
	.wr_clk             (wr_clk             ),
	.wr_en              (wr_en              ),
	.din                (din                ),
	.full               (full               ),
			
	.rd_clk             (rd_clk             ),
	.rd_en              (rd_en              ),
	.dout               (dout               ),
    .dout_vld           (dout_vld           ),
    .empty              (empty              )
);

endmodule

# run.do: ModelSim simulation script

# Exit current simulation

quit -sim

vlib work

# Compile the design file and testbench respectively.

vlog "../src/*.v"
vlog "../tb/*.v"

# Start simulation and select according to different versions

vsim -voptargs=+acc work.tb_async_fifo
#vsim -gui -novopt  work.tb_async_fifo

# Add specified signal
# Add all signals at the top level
# Set the window types

# Open waveform window

view wave
view structure

# Open signal window

view signals

# Add waveform
add wave -position insertpoint sim:/tb_async_fifo/*
add wave -position insertpoint sim:/tb_async_fifo/async_fifo_inst/*

.main clear

# Run until the testbench reaches $stop (about 40.2 us of simulated time)

run -all
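
The testbench above only produces waveforms that have to be inspected by eye. A small checker can compare the read data automatically. The sketch below is hypothetical (the module name `fifo_scoreboard` is not part of the original files) and assumes that the written words form a sequence counting up from 0, which holds for this testbench because din starts at 0 and writing stops before reading begins.

```verilog
// Hypothetical checker: expects dout to count up from 0 whenever dout_vld is high.
module fifo_scoreboard
#(
    parameter DATA_WIDTH = 16
)
(
    input wire                   rd_clk,
    input wire                   rst,
    input wire                   dout_vld,
    input wire [DATA_WIDTH-1:0]  dout
);
    reg [DATA_WIDTH-1:0] expected;   // next value that should appear on dout
    integer              errors;     // number of mismatches seen so far

    initial errors = 0;

    always @(posedge rd_clk or posedge rst) begin
        if (rst == 1'b1) begin
            expected <= {DATA_WIDTH{1'b0}};
        end
        else if (dout_vld == 1'b1) begin
            if (dout !== expected) begin
                errors = errors + 1;
                $display("%0t: dout = %0d, expected %0d", $time, dout, expected);
            end
            expected <= expected + 1'b1;
        end
    end
endmodule

// Possible instantiation inside tb_async_fifo:
// fifo_scoreboard u_scoreboard (.rd_clk(rd_clk), .rst(rst), .dout_vld(dout_vld), .dout(dout));
```

A silent run then means every word came out in the order it went in. If reads and writes were interleaved while the FIFO was full, the testbench would skip values and the checker would need a reference queue instead of a counter.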

4 Constraints

Because the asynchronous FIFO contains a clock-domain crossing (CDC), the obvious approach is to apply set_false_path from wr_clk to rd_clk, or to declare the clocks asynchronous with set_clock_groups, so that timing analysis ignores these paths altogether. This is not ideal, however. As the clock frequency rises, the timing requirements on the circuit become stricter: a placement and routing result that causes no problems at low frequency may violate timing at high frequency. Here we give one workable set of constraints.

First consider the first synchronization stage of the read / write pointers, that is, the paths from wr_addr_ptr_gray to wr_addr_ptr_gray_d1 and from rd_addr_ptr_gray to rd_addr_ptr_gray_d1. Setup and hold may not be met at this stage, and that is acceptable; otherwise the second stage would not be needed. We therefore apply set_false_path to these paths directly.

Next consider the second synchronization stage, that is, the paths from wr_addr_ptr_gray_d1 to wr_addr_ptr_gray_d2 and from rd_addr_ptr_gray_d1 to rd_addr_ptr_gray_d2. To keep the bits from arriving at different times or with excessive delay, there must first of all be no combinational logic between the two stages. The code above already satisfies this, but we must also account for the delay added by placement and routing. We can therefore apply set_max_delay with a value of one clock period. In practice it can be set tighter, for example 0.4 to 0.5 clock periods.
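
The two groups of constraints described above might look as follows in a Vivado XDC file. This is only a sketch under several assumptions: the clock periods are taken from the testbench (20 ns for wr_clk, 60 ns for rd_clk), the FIFO clocks come straight from top-level ports of the same name, and synthesis keeps the usual `_reg` suffix on register names. Check the cell names in the synthesized netlist before using anything like this.

```tcl
# Clock definitions (periods assumed from the testbench)
create_clock -name wr_clk -period 20.000 [get_ports wr_clk]
create_clock -name rd_clk -period 60.000 [get_ports rd_clk]

# Cell name patterns below are assumptions about how synthesis names the registers.
# wr_addr_ptr_gray and rd_addr_ptr_gray are combinational, so the launching
# registers of the crossing are the binary pointers themselves.
set wr_ptr [get_cells -hierarchical -filter {NAME =~ "*wr_addr_ptr_reg*"}]
set rd_ptr [get_cells -hierarchical -filter {NAME =~ "*rd_addr_ptr_reg*"}]
set wr_d1  [get_cells -hierarchical -filter {NAME =~ "*wr_addr_ptr_gray_d1_reg*"}]
set wr_d2  [get_cells -hierarchical -filter {NAME =~ "*wr_addr_ptr_gray_d2_reg*"}]
set rd_d1  [get_cells -hierarchical -filter {NAME =~ "*rd_addr_ptr_gray_d1_reg*"}]
set rd_d2  [get_cells -hierarchical -filter {NAME =~ "*rd_addr_ptr_gray_d2_reg*"}]

# First synchronizer stage: the clock-domain crossing itself
set_false_path -from $wr_ptr -to $wr_d1
set_false_path -from $rd_ptr -to $rd_d1

# Second synchronizer stage: half a period of the capturing clock
# wr_addr_ptr_gray_d1 -> d2 is clocked by rd_clk (60 ns period)
set_max_delay -from $wr_d1 -to $wr_d2 30.000
# rd_addr_ptr_gray_d1 -> d2 is clocked by wr_clk (20 ns period)
set_max_delay -from $rd_d1 -to $rd_d2 10.000
```

Because the Gray code is produced by combinational logic directly in front of the first synchronizer stage, a more conservative variant would register the Gray-code pointer in its own clock domain first, so that the crossing path starts at a flip-flop output.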

With that, the FIFO is reasonably constrained. Other constraints are possible, of course, but I still do not recommend using this FIFO in real projects. Note that in Vivado 2019.1, when an IP core is instantiated, the tool automatically adds suitable constraints, including the clock and FIFO constraints. So when the requirements are modest, timing can usually be met, and when they are demanding, an unverified FIFO is unreliable anyway. We therefore recommend using the standard IP core provided with the tool.

---

Original tutorial; please credit the source when reprinting: PAC bear, "How to more completely implement an asynchronous FIFO: function implementation, simulation, and timing constraints"

Reference: IC Foundation (I): asynchronous FIFO principle and code implementation

Tags: Verilog

Posted by billshackle on Wed, 01 Jun 2022 22:43:14 +0530