3
0
Fork 0
mirror of https://github.com/YosysHQ/yosys synced 2025-06-06 06:03:23 +00:00

Reorganising documentation

Also changing to furo theme.
This commit is contained in:
Krystine Sherwin 2023-08-03 09:20:29 +12:00
parent 4f1cd66829
commit 045c04096e
No known key found for this signature in database
40 changed files with 661 additions and 1282 deletions

View file

@ -0,0 +1,9 @@
Using Yosys (advanced)
======================
.. toctree::
:maxdepth: 2
more_scripting
memory_mapping
yosys_flows

View file

@ -0,0 +1,654 @@
.. _chapter:memorymap:
Memory mapping
==============
Documentation for the Yosys ``memory_libmap`` memory mapper. Note that not all supported patterns
are included in this document, of particular note is that combinations of multiple patterns should
generally work. For example, `Write port with byte enables`_ could be used in conjunction with any
of the simple dual port (SDP) models. In general if a hardware memory definition does not support a
given configuration, additional logic will be instantiated to guarantee behaviour is consistent with
simulation.
See also: `passes/memory/memlib.md <https://github.com/YosysHQ/yosys/blob/master/passes/memory/memlib.md>`_
Additional notes
----------------
Memory kind selection
~~~~~~~~~~~~~~~~~~~~~
The memory inference code will automatically pick target memory primitive based on memory geometry
and features used. Depending on the target, there can be up to four memory primitive classes
available for selection:
- FF RAM (aka logic): no hardware primitive used, memory lowered to a bunch of FFs and multiplexers
- Can handle arbitrary number of write ports, as long as all write ports are in the same clock domain
- Can handle arbitrary number and kind of read ports
- LUT RAM (aka distributed RAM): uses LUT storage as RAM
- Supported on most FPGAs (with notable exception of ice40)
- Usually has one synchronous write port, one or more asynchronous read ports
- Small
- Will never be used for ROMs (lowering to plain LUTs is always better)
- Block RAM: dedicated memory tiles
- Supported on basically all FPGAs
- Supports only synchronous reads
- Two ports with separate clocks
- Usually supports true dual port (with notable exception of ice40 that only supports SDP)
- Usually supports asymmetric memories and per-byte write enables
- Several kilobits in size
- Huge RAM:
- Only supported on several targets:
- Some Xilinx UltraScale devices (UltraRAM)
- Two ports, both with mutually exclusive synchronous read and write
- Single clock
- Initial data must be all-0
- Some ice40 devices (SPRAM)
- Single port with mutually exclusive synchronous read and write
- Does not support initial data
- Nexus (large RAM)
- Two ports, both with mutually exclusive synchronous read and write
- Single clock
- Will not be automatically selected by memory inference code, needs explicit opt-in via
ram_style attribute
In general, you can expect the automatic selection process to work roughly like this:
- If any read port is asynchronous, only LUT RAM (or FF RAM) can be used.
- If there is more than one write port, only block RAM can be used, and this needs to be a
hardware-supported true dual port pattern
- … unless all write ports are in the same clock domain, in which case FF RAM can also be used,
but this is generally not what you want for anything but really small memories
- Otherwise, either FF RAM, LUT RAM, or block RAM will be used, depending on memory size
This process can be overridden by attaching a ram_style attribute to the memory:
- `(* ram_style = "logic" *)` selects FF RAM
- `(* ram_style = "distributed" *)` selects LUT RAM
- `(* ram_style = "block" *)` selects block RAM
- `(* ram_style = "huge" *)` selects huge RAM
It is an error if this override cannot be realized for the given target.
Many alternate spellings of the attribute are also accepted, for compatibility with other software.
Initial data
~~~~~~~~~~~~
Most FPGA targets support initializing all kinds of memory to user-provided values. If explicit
initialization is not used the initial memory value is undefined. Initial data can be provided by
either initial statements writing memory cells one by one of ``$readmemh`` or ``$readmemb`` system
tasks. For an example pattern, see `Synchronous read port with initial value`_.
Write port with byte enables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Byte enables can be used with any supported pattern
- To ensure that multiple writes will be merged into one port, they need to have disjoint bit
ranges, have the same address, and the same clock
- Any write enable granularity will be accepted (down to per-bit write enables), but using smaller
granularity than natively supported by the target is very likely to be inefficient (eg. using
4-bit bytes on ECP5 will result in either padding the bytes with 5 dummy bits to native 9-bit
units or splitting the RAM into two block RAMs)
.. code:: verilog
reg [31 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable[0])
mem[write_addr][7:0] <= write_data[7:0];
if (write_enable[1])
mem[write_addr][15:8] <= write_data[15:8];
if (write_enable[2])
mem[write_addr][23:16] <= write_data[23:16];
if (write_enable[3])
mem[write_addr][31:24] <= write_data[31:24];
if (read_enable)
read_data <= mem[read_addr];
end
Simple dual port (SDP) memory patterns
--------------------------------------
Asynchronous-read SDP
~~~~~~~~~~~~~~~~~~~~~
- This will result in LUT RAM on supported targets
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk)
if (write_enable)
mem[write_addr] <= write_data;
assign read_data = mem[read_addr];
Synchronous SDP with clock domain crossing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in block RAM or LUT RAM depending on size
- No behavior guarantees in case of simultaneous read and write to the same address
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge write_clk) begin
if (write_enable)
mem[write_addr] <= write_data;
end
always @(posedge read_clk) begin
if (read_enable)
read_data <= mem[read_addr];
end
Synchronous SDP read first
~~~~~~~~~~~~~~~~~~~~~~~~~~
- The read and write parts can be in the same or different processes.
- Will result in block RAM or LUT RAM depending on size
- As long as the same clock is used for both, yosys will ensure read-first behavior. This may
require extra circuitry on some targets for block RAM. If this is not necessary, use one of the
patterns below.
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
if (read_enable)
read_data <= mem[read_addr];
end
Synchronous SDP with undefined collision behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Like above, but the read value is undefined when read and write ports target the same address in
the same cycle
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
if (read_enable) begin
read_data <= mem[read_addr];
// 👇 this if block 👇
if (write_enable && read_addr == write_addr)
read_data <= 'x;
end
end
- Or below, using the no_rw_check attribute
.. code:: verilog
(* no_rw_check *)
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
if (read_enable)
read_data <= mem[read_addr];
end
Synchronous SDP with write-first behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in block RAM or LUT RAM depending on size
- May use additional circuitry for block RAM if write-first is not natively supported. Will always
use additional circuitry for LUT RAM.
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
if (read_enable) begin
read_data <= mem[read_addr];
if (write_enable && read_addr == write_addr)
read_data <= write_data;
end
end
Synchronous SDP with write-first behavior (alternate pattern)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- This pattern is supported for compatibility, but is much less flexible than the above
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
read_addr_reg <= read_addr;
end
assign read_data = mem[read_addr_reg];
Single-port RAM memory patterns
-------------------------------
Asynchronous-read single-port RAM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in single-port LUT RAM on supported targets
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk)
if (write_enable)
mem[addr] <= write_data;
assign read_data = mem[addr];
Synchronous single-port RAM with mutually exclusive read/write
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in single-port block RAM or LUT RAM depending on size
- This is the correct pattern to infer ice40 SPRAM (with manual ram_style selection)
- On targets that don't support read/write block RAM ports (eg. ice40), will result in SDP block RAM instead
- For block RAM, will use "NO_CHANGE" mode if available
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[addr] <= write_data;
else if (read_enable)
read_data <= mem[addr];
end
Synchronous single-port RAM with read-first behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will only result in single-port block RAM when read-first behavior is natively supported;
otherwise, SDP RAM with additional circuitry will be used
- Many targets (Xilinx, ECP5, …) can only natively support read-first/write-first single-port RAM
(or TDP RAM) where the write_enable signal implies the read_enable signal (ie. can never write
without reading). The memory inference code will run a simple SAT solver on the control signals to
determine if this is the case, and insert emulation circuitry if it cannot be easily proven.
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[addr] <= write_data;
if (read_enable)
read_data <= mem[addr];
end
Synchronous single-port RAM with write-first behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in single-port block RAM or LUT RAM when supported
- Block RAMs will require extra circuitry if write-first behavior not natively supported
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[addr] <= write_data;
if (read_enable)
if (write_enable)
read_data <= write_data;
else
read_data <= mem[addr];
end
Synchronous read port with initial value
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Initial read port values can be combined with any other supported pattern
- If block RAM is used and initial read port values are not natively supported by the target, small
emulation circuit will be inserted
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
reg [DATA_WIDTH - 1 : 0] read_data;
initial read_data = 'h1234;
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
if (read_enable)
read_data <= mem[read_addr];
end
Read register reset patterns
----------------------------
Resets can be combined with any other supported pattern (except that synchronous reset and
asynchronous reset cannot both be used on a single read port). If block RAM is used and the
selected reset (synchronous or asynchronous) is used but not natively supported by the target, small
emulation circuitry will be inserted.
Synchronous reset, reset priority over enable
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
if (read_reset)
read_data <= {sval};
else if (read_enable)
read_data <= mem[read_addr];
end
Synchronous reset, enable priority over reset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
if (read_enable)
if (read_reset)
read_data <= 'h1234;
else
read_data <= mem[read_addr];
end
Synchronous read port with asynchronous reset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
end
always @(posedge clk, posedge reset_read) begin
if (reset_read)
read_data <= 'h1234;
else if (read_enable)
read_data <= mem[read_addr];
end
Asymmetric memory patterns
--------------------------
To construct an asymmetric memory (memory with read/write ports of differing widths):
- Declare the memory with the width of the narrowest intended port
- Split all wide ports into multiple narrow ports
- To ensure the wide ports will be correctly merged:
- For the address, use a concatenation of actual address in the high bits and a constant in the
low bits
- Ensure the actual address is identical for all ports belonging to the wide port
- Ensure that clock is identical
- For read ports, ensure that enable/reset signals are identical (for write ports, the enable
signal may vary — this will result in using the byte enable functionality)
Asymmetric memory is supported on all targets, but may require emulation circuitry where not
natively supported. Note that when the memory is larger than the underlying block RAM primitive,
hardware asymmetric memory support is likely not to be used even if present as it is more expensive.
Wide synchronous read port
~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: verilog
reg [7:0] mem [0:255];
wire [7:0] write_addr;
wire [5:0] read_addr;
wire [7:0] write_data;
reg [31:0] read_data;
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
if (read_enable) begin
read_data[7:0] <= mem[{read_addr, 2'b00}];
read_data[15:8] <= mem[{read_addr, 2'b01}];
read_data[23:16] <= mem[{read_addr, 2'b10}];
read_data[31:24] <= mem[{read_addr, 2'b11}];
end
end
Wide asynchronous read port
~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Note: the only target natively supporting this pattern is Xilinx UltraScale
.. code:: verilog
reg [7:0] mem [0:511];
wire [8:0] write_addr;
wire [5:0] read_addr;
wire [7:0] write_data;
wire [63:0] read_data;
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
end
assign read_data[7:0] = mem[{read_addr, 3'b000}];
assign read_data[15:8] = mem[{read_addr, 3'b001}];
assign read_data[23:16] = mem[{read_addr, 3'b010}];
assign read_data[31:24] = mem[{read_addr, 3'b011}];
assign read_data[39:32] = mem[{read_addr, 3'b100}];
assign read_data[47:40] = mem[{read_addr, 3'b101}];
assign read_data[55:48] = mem[{read_addr, 3'b110}];
assign read_data[63:56] = mem[{read_addr, 3'b111}];
Wide write port
~~~~~~~~~~~~~~~
.. code:: verilog
reg [7:0] mem [0:255];
wire [5:0] write_addr;
wire [7:0] read_addr;
wire [31:0] write_data;
reg [7:0] read_data;
always @(posedge clk) begin
if (write_enable[0])
mem[{write_addr, 2'b00}] <= write_data[7:0];
if (write_enable[1])
mem[{write_addr, 2'b01}] <= write_data[15:8];
if (write_enable[2])
mem[{write_addr, 2'b10}] <= write_data[23:16];
if (write_enable[3])
mem[{write_addr, 2'b11}] <= write_data[31:24];
if (read_enable)
read_data <= mem[read_addr];
end
True dual port (TDP) patterns
-----------------------------
- Many different variations of true dual port memory can be created by combining two single-port RAM
patterns on the same memory
- When TDP memory is used, memory inference code has much less maneuver room to create requested
semantics compared to individual single-port patterns (which can end up lowered to SDP memory
where necessary) — supported patterns depend strongly on the target
- In particular, when both ports have the same clock, it's likely that "undefined collision" mode
needs to be manually selected to enable TDP memory inference
- The examples below are non-exhaustive — many more combinations of port types are possible
- Note: if two write ports are in the same process, this defines a priority relation between them
(if both ports are active in the same clock, the later one wins). On almost all targets, this will
result in a bit of extra circuitry to ensure the priority semantics. If this is not what you want,
put them in separate processes.
- Priority is not supported when using the verific front end and any priority semantics are ignored.
TDP with different clocks, exclusive read/write
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk_a) begin
if (write_enable_a)
mem[addr_a] <= write_data_a;
else if (read_enable_a)
read_data_a <= mem[addr_a];
end
always @(posedge clk_b) begin
if (write_enable_b)
mem[addr_b] <= write_data_b;
else if (read_enable_b)
read_data_b <= mem[addr_b];
end
TDP with same clock, read-first behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- This requires hardware inter-port read-first behavior, and will only work on some targets (Xilinx, Nexus)
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable_a)
mem[addr_a] <= write_data_a;
if (read_enable_a)
read_data_a <= mem[addr_a];
end
always @(posedge clk) begin
if (write_enable_b)
mem[addr_b] <= write_data_b;
if (read_enable_b)
read_data_b <= mem[addr_b];
end
TDP with multiple read ports
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- The combination of a single write port with an arbitrary amount of read ports is supported on all
targets — if a multi-read port primitive is available (like Xilinx RAM64M), it'll be used as
appropriate. Otherwise, the memory will be automatically split into multiple primitives.
.. code:: verilog
reg [31:0] mem [0:31];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] <= write_data;
end
assign read_data_a = mem[read_addr_a];
assign read_data_b = mem[read_addr_b];
assign read_data_c = mem[read_addr_c];
Not yet supported patterns
--------------------------
Synchronous SDP with write-first behavior via blocking assignments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Would require modifications to the Yosys Verilog frontend.
- Use `Synchronous SDP with write-first behavior`_ instead
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @(posedge clk) begin
if (write_enable)
mem[write_addr] = write_data;
if (read_enable)
read_data <= mem[read_addr];
end
Asymmetric memories via part selection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Would require major changes to the Verilog frontend.
- Build wide ports out of narrow ports instead (see `Wide synchronous read port`_)
.. code:: verilog
reg [31:0] mem [2**ADDR_WIDTH - 1 : 0];
wire [1:0] byte_lane;
wire [7:0] write_data;
always @(posedge clk) begin
if (write_enable)
mem[write_addr][byte_lane * 8 +: 8] <= write_data;
if (read_enable)
read_data <= mem[read_addr];
end
Undesired patterns
------------------
Asynchronous writes
~~~~~~~~~~~~~~~~~~~
- Not supported in modern FPGAs
- Not supported in yosys code anyhow
.. code:: verilog
reg [DATA_WIDTH - 1 : 0] mem [2**ADDR_WIDTH - 1 : 0];
always @* begin
if (write_enable)
mem[write_addr] = write_data;
end
assign read_data = mem[read_addr];

View file

@ -0,0 +1,8 @@
More scripting
--------------
.. toctree::
opt_passes
selections
troubleshooting

View file

@ -0,0 +1,332 @@
.. _chapter:opt:
Optimization passes
===================
.. TODO: copypaste
Yosys employs a number of optimizations to generate better and cleaner results.
This chapter outlines these optimizations.
Simple optimizations
--------------------
The Yosys pass opt runs a number of simple optimizations. This includes removing
unused signals and cells and const folding. It is recommended to run this pass
after each major step in the synthesis script. At the time of this writing the
opt pass executes the following passes that each perform a simple optimization:
- Once at the beginning of opt:
- opt_expr
- opt_merge -nomux
- Repeat until result is stable:
- opt_muxtree
- opt_reduce
- opt_merge
- opt_rmdff
- opt_clean
- opt_expr
The following section describes each of the opt\_ passes.
The opt_expr pass
~~~~~~~~~~~~~~~~~
This pass performs const folding on the internal combinational cell types
described in :ref:`chapter:celllib`. This means a cell with all
constant inputs is replaced with the constant value this cell drives. In some
cases this pass can also optimize cells with some constant inputs.
.. table:: Const folding rules for $_AND\_ cells as used in opt_expr.
:name: tab:opt_expr_and
:align: center
========= ========= ===========
A-Input B-Input Replacement
========= ========= ===========
any 0 0
0 any 0
1 1 1
--------- --------- -----------
X/Z X/Z X
1 X/Z X
X/Z 1 X
--------- --------- -----------
any X/Z 0
X/Z any 0
--------- --------- -----------
:math:`a` 1 :math:`a`
1 :math:`b` :math:`b`
========= ========= ===========
.. How to format table?
:numref:`Table %s <tab:opt_expr_and>` shows the replacement rules used for
optimizing an $_AND\_ gate. The first three rules implement the obvious const
folding rules. Note that any' might include dynamic values calculated by other
parts of the circuit. The following three lines propagate undef (X) states.
These are the only three cases in which it is allowed to propagate an undef
according to Sec. 5.1.10 of IEEE Std. 1364-2005 :cite:p:`Verilog2005`.
The next two lines assume the value 0 for undef states. These two rules are only
used if no other substitutions are possible in the current module. If other
substitutions are possible they are performed first, in the hope that the any'
will change to an undef value or a 1 and therefore the output can be set to
undef.
The last two lines simply replace an $_AND\_ gate with one constant-1 input with
a buffer.
Besides this basic const folding the opt_expr pass can replace 1-bit wide $eq
and $ne cells with buffers or not-gates if one input is constant.
The opt_expr pass is very conservative regarding optimizing $mux cells, as these
cells are often used to model decision-trees and breaking these trees can
interfere with other optimizations.
The opt_muxtree pass
~~~~~~~~~~~~~~~~~~~~
This pass optimizes trees of multiplexer cells by analyzing the select inputs.
Consider the following simple example:
.. code:: verilog
:number-lines:
module uut(a, y); input a; output [1:0] y = a ? (a ? 1 : 2) : 3; endmodule
The output can never be 2, as this would require ``a`` to be 1 for the outer
multiplexer and 0 for the inner multiplexer. The opt_muxtree pass detects this
contradiction and replaces the inner multiplexer with a constant 1, yielding the
logic for ``y = a ? 1 : 3``.
The opt_reduce pass
~~~~~~~~~~~~~~~~~~~
This is a simple optimization pass that identifies and consolidates identical
input bits to $reduce_and and $reduce_or cells. It also sorts the input bits to
ease identification of shareable $reduce_and and $reduce_or cells in other
passes.
This pass also identifies and consolidates identical inputs to multiplexer
cells. In this case the new shared select bit is driven using a $reduce_or cell
that combines the original select bits.
Lastly this pass consolidates trees of $reduce_and cells and trees of $reduce_or
cells to single large $reduce_and or $reduce_or cells.
These three simple optimizations are performed in a loop until a stable result
is produced.
The opt_rmdff pass
~~~~~~~~~~~~~~~~~~
This pass identifies single-bit d-type flip-flops ($_DFF\_, $dff, and $adff
cells) with a constant data input and replaces them with a constant driver.
The opt_clean pass
~~~~~~~~~~~~~~~~~~
This pass identifies unused signals and cells and removes them from the design.
It also creates an ``\unused_bits`` attribute on wires with unused bits. This
attribute can be used for debugging or by other optimization passes.
The opt_merge pass
~~~~~~~~~~~~~~~~~~
This pass performs trivial resource sharing. This means that this pass
identifies cells with identical inputs and replaces them with a single instance
of the cell.
The option -nomux can be used to disable resource sharing for multiplexer cells
($mux and $pmux. This can be useful as it prevents multiplexer trees to be
merged, which might prevent opt_muxtree to identify possible optimizations.
FSM extraction and encoding
---------------------------
The fsm pass performs finite-state-machine (FSM) extraction and recoding. The
fsm pass simply executes the following other passes:
- Identify and extract FSMs:
- fsm_detect
- fsm_extract
- Basic optimizations:
- fsm_opt
- opt_clean
- fsm_opt
- Expanding to nearby gate-logic (if called with -expand):
- fsm_expand
- opt_clean
- fsm_opt
- Re-code FSM states (unless called with -norecode):
- fsm_recode
- Print information about FSMs:
- fsm_info
- Export FSMs in KISS2 file format (if called with -export):
- fsm_export
- Map FSMs to RTL cells (unless called with -nomap):
- fsm_map
The fsm_detect pass identifies FSM state registers and marks them using the
``\fsm_encoding = "auto"`` attribute. The fsm_extract extracts all FSMs marked
using the ``\fsm_encoding`` attribute (unless ``\fsm_encoding`` is set to
"none") and replaces the corresponding RTL cells with a $fsm cell. All other
fsm\_ passes operate on these $fsm cells. The fsm_map call finally replaces the
$fsm cells with RTL cells.
Note that these optimizations operate on an RTL netlist. I.e. the fsm pass
should be executed after the proc pass has transformed all RTLIL::Process
objects to RTL cells.
The algorithms used for FSM detection and extraction are influenced by a more
general reported technique :cite:p:`fsmextract`.
FSM detection
~~~~~~~~~~~~~
The fsm_detect pass identifies FSM state registers. It sets the ``\fsm_encoding
= "auto"`` attribute on any (multi-bit) wire that matches the following
description:
- Does not already have the ``\fsm_encoding`` attribute.
- Is not an output of the containing module.
- Is driven by single $dff or $adff cell.
- The ``\D``-Input of this $dff or $adff cell is driven by a multiplexer tree
that only has constants or the old state value on its leaves.
- The state value is only used in the said multiplexer tree or by simple
relational cells that compare the state value to a constant (usually $eq
cells).
This heuristic has proven to work very well. It is possible to overwrite it by
setting ``\fsm_encoding = "auto"`` on registers that should be considered FSM
state registers and setting ``\fsm_encoding = "none"`` on registers that match
the above criteria but should not be considered FSM state registers.
Note however that marking state registers with ``\fsm_encoding`` that are not
suitable for FSM recoding can cause synthesis to fail or produce invalid
results.
FSM extraction
~~~~~~~~~~~~~~
The fsm_extract pass operates on all state signals marked with the
(``\fsm_encoding != "none"``) attribute. For each state signal the following
information is determined:
- The state registers
- The asynchronous reset state if the state registers use asynchronous reset
- All states and the control input signals used in the state transition
functions
- The control output signals calculated from the state signals and control
inputs
- A table of all state transitions and corresponding control inputs- and
outputs
The state registers (and asynchronous reset state, if applicable) is simply
determined by identifying the driver for the state signal.
From there the $mux-tree driving the state register inputs is recursively
traversed. All select inputs are control signals and the leaves of the $mux-tree
are the states. The algorithm fails if a non-constant leaf that is not the state
signal itself is found.
The list of control outputs is initialized with the bits from the state signal.
It is then extended by adding all values that are calculated by cells that
compare the state signal with a constant value.
In most cases this will cover all uses of the state register, thus rendering the
state encoding arbitrary. If however a design uses e.g. a single bit of the
state value to drive a control output directly, this bit of the state signal
will be transformed to a control output of the same value.
Finally, a transition table for the FSM is generated. This is done by using the
ConstEval C++ helper class (defined in kernel/consteval.h) that can be used to
evaluate parts of the design. The ConstEval class can be asked to calculate a
given set of result signals using a set of signal-value assignments. It can also
be passed a list of stop-signals that abort the ConstEval algorithm if the value
of a stop-signal is needed in order to calculate the result signals.
The fsm_extract pass uses the ConstEval class in the following way to create a
transition table. For each state:
1. Create a ConstEval object for the module containing the FSM
2. Add all control inputs to the list of stop signals
3. Set the state signal to the current state
4. Try to evaluate the next state and control output
5. If step 4 was not successful:
- Recursively goto step 4 with the offending stop-signal set to 0.
- Recursively goto step 4 with the offending stop-signal set to 1.
6. If step 4 was successful: Emit transition
Finally a $fsm cell is created with the generated transition table and added to
the module. This new cell is connected to the control signals and the old
drivers for the control outputs are disconnected.
FSM optimization
~~~~~~~~~~~~~~~~
The fsm_opt pass performs basic optimizations on $fsm cells (not including state
recoding). The following optimizations are performed (in this order):
- Unused control outputs are removed from the $fsm cell. The attribute
``\unused_bits`` (that is usually set by the opt_clean pass) is used to
determine which control outputs are unused.
- Control inputs that are connected to the same driver are merged.
- When a control input is driven by a control output, the control input is
removed and the transition table altered to give the same performance without
the external feedback path.
- Entries in the transition table that yield the same output and only differ in
the value of a single control input bit are merged and the different bit is
removed from the sensitivity list (turned into a don't-care bit).
- Constant inputs are removed and the transition table is altered to give an
unchanged behaviour.
- Unused inputs are removed.
FSM recoding
~~~~~~~~~~~~
The fsm_recode pass assigns new bit pattern to the states. Usually this also
implies a change in the width of the state signal. At the moment of this writing
only one-hot encoding with all-zero for the reset state is supported.
The fsm_recode pass can also write a text file with the changes performed by it
that can be used when verifying designs synthesized by Yosys using Synopsys
Formality .
Logic optimization
------------------
Yosys can perform multi-level combinational logic optimization on gate-level
netlists using the external program ABC . The abc pass extracts the
combinational gate-level parts of the design, passes it through ABC, and
re-integrates the results. The abc pass can also be used to perform other
operations using ABC, such as technology mapping (see :ref:`sec:techmap_extern`
for details).

View file

@ -0,0 +1,6 @@
Selections
~~~~~~~~~~
See :doc:`/cmd/select`
Also :doc:`/cmd/show`

View file

@ -0,0 +1,4 @@
Troubleshooting
~~~~~~~~~~~~~~~
See :doc:`/cmd/bugpoint`

View file

@ -0,0 +1,8 @@
Flows, command types, and order
-------------------------------
Synthesis granularity
~~~~~~~~~~~~~~~~~~~~~
Formal verification
~~~~~~~~~~~~~~~~~~~