Commit
The first optimization for Altera FPGAs is to move the instruction queue to LUTRAM. The optimization previously done for Xilinx does not work here because it relies on asynchronous RAM primitives, which Altera does not support. This optimization therefore uses synchronous RAM for the instruction queue and for the FIFOs inside the wt axi adapter.

The main changes to the existing code are:

- New RAM module to infer synchronous RAM on Altera with independent read and write ports (SyncDpRam_ind_r_w.sv).
- Changes inside cva6_fifo_v3 to adapt it to synchronous RAM instead of asynchronous RAM:
  - When the FIFO is not empty, the next data word is always read and available at the output, which hides the read latency introduced by synchronous RAM (similar to the fall-through approach). This simplification is possible because in a FIFO the next address to be read is always known.
  - When data is read right after being written, the previous method cannot be used, because the data must first be written to the FIFO and only then read back. For this reason the new design adds an auxiliary register to hide this latency. It is used only when the FIFO is empty: the design detects that the word being written is the first word and keeps it in this register. If a read arrives in the next cycle, the output data is taken from the aux register. Afterwards the data is available in the RAM and can be read continuously as in the first case.

All of this is used only if the FpgaAlteraEn parameter is enabled; otherwise the previous implementation with asynchronous RAM applies (when FpgaEn is set), or the register-based implementation (when FpgaEn is not set).
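The two mechanisms described above (look-ahead read plus an auxiliary register) can be sketched as a small standalone FIFO. This is a minimal illustration under stated assumptions, not the cva6_fifo_v3 code from this commit: the module name `sync_ram_fifo_sketch`, all port and signal names, and the address-compare condition used to select the aux register are hypothetical. The compare also covers the case of a simultaneous push and pop with a single stored word, which the description above does not mention.

```systemverilog
// Illustrative sketch only: names are hypothetical and this is NOT the
// cva6_fifo_v3 implementation. Assumes ADDR_WIDTH >= 1.
module sync_ram_fifo_sketch #(
  parameter int unsigned DATA_WIDTH = 32,
  parameter int unsigned ADDR_WIDTH = 2   // depth is 2**ADDR_WIDTH
) (
  input  logic                  clk_i,
  input  logic                  rst_ni,
  input  logic                  push_i,
  input  logic [DATA_WIDTH-1:0] data_i,
  input  logic                  pop_i,
  output logic [DATA_WIDTH-1:0] data_o,   // valid whenever empty_o is low
  output logic                  empty_o,
  output logic                  full_o
);
  localparam int unsigned DEPTH = 2 ** ADDR_WIDTH;

  logic [DATA_WIDTH-1:0] mem [DEPTH];
  logic [ADDR_WIDTH-1:0] wr_ptr_q, rd_ptr_q, rd_addr;
  logic [ADDR_WIDTH:0]   cnt_q;
  logic [DATA_WIDTH-1:0] ram_q, aux_q;
  logic                  use_aux_q, do_push, do_pop;

  assign empty_o = (cnt_q == '0);
  assign full_o  = (cnt_q == DEPTH);
  assign do_push = push_i && !full_o;
  assign do_pop  = pop_i  && !empty_o;

  // Look-ahead read: always present the address of the word that will be
  // at the head of the FIFO in the next cycle.
  assign rd_addr = do_pop ? rd_ptr_q + 1'b1 : rd_ptr_q;

  // Synchronous RAM: one write port, one registered read port.
  // A read of the address being written returns the old word.
  always_ff @(posedge clk_i) begin
    if (do_push) mem[wr_ptr_q] <= data_i;
    ram_q <= mem[rd_addr];
  end

  always_ff @(posedge clk_i or negedge rst_ni) begin
    if (!rst_ni) begin
      wr_ptr_q  <= '0;
      rd_ptr_q  <= '0;
      cnt_q     <= '0;
      aux_q     <= '0;
      use_aux_q <= 1'b0;
    end else begin
      // The word written now is the next head, so the RAM cannot deliver
      // it in time: keep a copy in the aux register for one cycle.
      use_aux_q <= do_push && (wr_ptr_q == rd_addr);
      if (do_push) begin
        aux_q    <= data_i;
        wr_ptr_q <= wr_ptr_q + 1'b1;
      end
      if (do_pop) rd_ptr_q <= rd_ptr_q + 1'b1;
      unique case ({do_push, do_pop})
        2'b10:   cnt_q <= cnt_q + 1'b1;
        2'b01:   cnt_q <= cnt_q - 1'b1;
        default: ;  // no change on idle or simultaneous push and pop
      endcase
    end
  end

  // One cycle after a first-word write the output comes from the aux
  // register; afterwards it comes from the RAM read register.
  assign data_o = use_aux_q ? aux_q : ram_q;
endmodule
```

In this sketch the aux register is selected for exactly one cycle; from the following cycle on, the RAM read register already holds the same word, so reads continue from the RAM as in the first case.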
1 parent
f54b9d4
commit 33c5d77
Showing 4 changed files with 128 additions and 25 deletions.
59 changes: 59 additions & 0 deletions
vendor/pulp-platform/fpga-support/rtl/SyncDpRam_ind_r_w.sv
@@ -0,0 +1,59 @@
// Copyright 2024 PlanV Technologies
//
// Licensed under the Solderpad Hardware Licence, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// SPDX-License-Identifier: Apache-2.0 WITH SHL-2.0
// You may obtain a copy of the License at https://solderpad.org/licenses
//
// Inferable, synchronous dual-port RAM with fully independent write and read ports
//
//
// This module is designed to work with Xilinx, Microchip and Altera FPGA tools by following the respective
// guidelines:
// - Xilinx UG901 Vivado Design Suite User Guide: Synthesis
// - Inferring Microchip PolarFire RAM Blocks
// - Altera Quartus II Handbook Volume 1: Design and Synthesis (p. 768)
//
// Current Maintainers: Angela Gonzalez - PlanV Technologies

module SyncDpRam_ind_r_w
#(
  parameter ADDR_WIDTH = 10,
  parameter DATA_DEPTH = 1024, // usually 2**ADDR_WIDTH, but can be lower
  parameter DATA_WIDTH = 32
)(
  input  logic                  Clk_CI,

  // Write port
  input  logic                  WrEn_SI,
  input  logic [ADDR_WIDTH-1:0] WrAddr_DI,
  input  logic [DATA_WIDTH-1:0] WrData_DI,

  // Read port
  input  logic [ADDR_WIDTH-1:0] RdAddr_DI,
  output logic [DATA_WIDTH-1:0] RdData_DO
);

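  // logic [DATA_WIDTH-1:0] mem [DATA_DEPTH-1:0]= '{default:0};
  // The ramstyle attribute asks Quartus to map the memory to MLABs (LUTRAM)
  (* ramstyle = "mlab" *) logic [DATA_WIDTH-1:0] mem [DATA_DEPTH-1:0]= '{default:0};

  // WRITE and registered READ
  always_ff @(posedge Clk_CI)
  begin
    if (WrEn_SI) begin
      mem[WrAddr_DI] <= WrData_DI;
    end
    // Nonblocking assignment, so the registered read output cannot race with
    // logic that samples it on the same clock edge in simulation.
    RdData_DO <= mem[RdAddr_DI];
  end
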
  ////////////////////////////
  // assertions
  ////////////////////////////

  // pragma translate_off
  assert property
    (@(posedge Clk_CI) (longint'(2)**longint'(ADDR_WIDTH) >= longint'(DATA_DEPTH)))
    else $error("depth out of bounds");
  // pragma translate_on

endmodule
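
To make the one-cycle read latency of this module concrete, here is a small self-checking testbench sketch. It is not part of the commit: the testbench name, the signal names, and the chosen widths are hypothetical. It assumes simulation behaviour only, where the memory starts at zero because of the `'{default:0}` initializer and a read of the address being written returns the old word; read-during-write behaviour on the actual device may differ.

```systemverilog
// Hypothetical testbench, not part of the commit. Stimulus is applied on
// the falling edge so it never races with the RAM's rising-edge sampling.
module SyncDpRam_ind_r_w_tb;
  localparam int unsigned AW = 4;
  localparam int unsigned DW = 8;

  logic          Clk_C    = 1'b0;
  logic          WrEn_S   = 1'b0;
  logic [AW-1:0] WrAddr_D = '0;
  logic [AW-1:0] RdAddr_D = '0;
  logic [DW-1:0] WrData_D = '0;
  logic [DW-1:0] RdData_D;

  always #5 Clk_C = ~Clk_C;

  SyncDpRam_ind_r_w #(
    .ADDR_WIDTH(AW),
    .DATA_DEPTH(2**AW),
    .DATA_WIDTH(DW)
  ) i_ram (
    .Clk_CI   (Clk_C),
    .WrEn_SI  (WrEn_S),
    .WrAddr_DI(WrAddr_D),
    .WrData_DI(WrData_D),
    .RdAddr_DI(RdAddr_D),
    .RdData_DO(RdData_D)
  );

  initial begin
    // Write 0xA5 to address 3 while reading the same address.
    @(negedge Clk_C);
    WrEn_S   = 1'b1;
    WrAddr_D = 4'd3;
    WrData_D = 8'hA5;
    RdAddr_D = 4'd3;

    // The rising edge in between performed the write, but the read port
    // captured the old (zero-initialized) word.
    @(negedge Clk_C);
    WrEn_S = 1'b0;
    assert (RdData_D == 8'h00) else $error("expected old data after write");

    // One more rising edge: the written word is now visible on the read port.
    @(negedge Clk_C);
    assert (RdData_D == 8'hA5) else $error("expected written data");

    $display("SyncDpRam_ind_r_w_tb finished");
    $finish;
  end
endmodule
```

This delay between writing a word and being able to read it back is the latency that the auxiliary register in cva6_fifo_v3 is described as hiding.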