mirror of
https://github.com/YosysHQ/yosys
synced 2025-09-20 16:34:51 +00:00
Docs: Reflow line length
parent
829e02ec5b
commit
40ba92e956
20 changed files with 782 additions and 785 deletions
|
@ -10,13 +10,12 @@ fine-grained optimisation and LUT mapping.
|
|||
Yosys has two different commands, which both use this logic toolbox, but use it
|
||||
in different ways.
|
||||
|
||||
The `abc` pass can be used for both ASIC (e.g. :yoscrypt:`abc
|
||||
-liberty`) and FPGA (:yoscrypt:`abc -lut`) mapping, but this page will focus on
|
||||
FPGA mapping.
|
||||
The `abc` pass can be used for both ASIC (e.g. :yoscrypt:`abc -liberty`) and
|
||||
FPGA (:yoscrypt:`abc -lut`) mapping, but this page will focus on FPGA mapping.
|
||||
|
||||
The `abc9` pass generally provides superior mapping quality due to
|
||||
being aware of combination boxes and DFF and LUT timings, giving it a more
|
||||
global view of the mapping problem.
|
||||
The `abc9` pass generally provides superior mapping quality due to being aware
|
||||
of combination boxes and DFF and LUT timings, giving it a more global view of
|
||||
the mapping problem.
|
||||
|
||||
.. _ABC: https://github.com/berkeley-abc/abc
|
||||
|
||||
|
|
|
@ -98,8 +98,8 @@ our internal cell library will be mapped to:
|
|||
:name: mycells-lib
|
||||
:caption: :file:`mycells.lib`
|
||||
|
||||
Recall that the Yosys built-in logic gate types are `$_NOT_`, `$_AND_`,
|
||||
`$_OR_`, `$_XOR_`, and `$_MUX_` with an assortment of dff memory types.
|
||||
Recall that the Yosys built-in logic gate types are `$_NOT_`, `$_AND_`, `$_OR_`,
|
||||
`$_XOR_`, and `$_MUX_` with an assortment of dff memory types.
|
||||
:ref:`mycells-lib` defines our target cells as ``BUF``, ``NOT``, ``NAND``,
|
||||
``NOR``, and ``DFF``. Mapping between these is performed with the commands
|
||||
`dfflibmap` and `abc` as follows:
|
||||
|
@ -117,8 +117,8 @@ The final version of our ``counter`` module looks like this:
|
|||
|
||||
``counter`` after hardware cell mapping
|
||||
|
||||
The design is then finally output as a Verilog file with `write_verilog`,
|
||||
which can then be loaded into another tool:
|
||||
The design is then finally output as a Verilog file with `write_verilog`, which can
|
||||
then be loaded into another tool:
|
||||
|
||||
.. literalinclude:: /code_examples/intro/counter.ys
|
||||
:language: yoscrypt
|
||||
|
|
|
@ -1,12 +1,12 @@
|
|||
The extract pass
|
||||
----------------
|
||||
|
||||
- Like the `techmap` pass, the `extract` pass is called with a
|
||||
map file. It compares the circuits inside the modules of the map file with the
|
||||
design and looks for sub-circuits in the design that match any of the modules
|
||||
in the map file.
|
||||
- If a match is found, the `extract` pass will replace the matching
|
||||
subcircuit with an instance of the module from the map file.
|
||||
- Like the `techmap` pass, the `extract` pass is called with a map file. It
|
||||
compares the circuits inside the modules of the map file with the design and
|
||||
looks for sub-circuits in the design that match any of the modules in the map
|
||||
file.
|
||||
- If a match is found, the `extract` pass will replace the matching subcircuit
|
||||
with an instance of the module from the map file.
|
||||
- In a way, the `extract` pass is the inverse of the `techmap` pass.
|
||||
|
||||
.. todo:: add/expand supporting text, also mention custom pattern matching and
|
||||
|
@ -68,23 +68,23 @@ The wrap-extract-unwrap method
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Often a coarse-grain element has a constant bit-width, but can be used to
|
||||
implement operations with a smaller bit-width. For example, an 18x25-bit multiplier
|
||||
can also be used to implement 16x20-bit multiplication.
|
||||
implement operations with a smaller bit-width. For example, an 18x25-bit
|
||||
multiplier can also be used to implement 16x20-bit multiplication.
|
||||
|
||||
A way of mapping such elements in coarse grain synthesis is the
|
||||
wrap-extract-unwrap method:
|
||||
|
||||
wrap
|
||||
Identify candidate-cells in the circuit and wrap them in a cell with a
|
||||
constant wider bit-width using `techmap`. The wrappers use the same
|
||||
parameters as the original cell, so the information about the original width
|
||||
of the ports is preserved. Then use the `connwrappers` command to
|
||||
connect up the bit-extended in- and outputs of the wrapper cells.
|
||||
constant wider bit-width using `techmap`. The wrappers use the same parameters
|
||||
as the original cell, so the information about the original width of the ports
|
||||
is preserved. Then use the `connwrappers` command to connect up the
|
||||
bit-extended in- and outputs of the wrapper cells.
|
||||
|
||||
extract
|
||||
Now all operations are encoded using the same bit-width as the coarse grain
|
||||
element. The `extract` command can be used to replace circuits with
|
||||
cells of the target architecture.
|
||||
element. The `extract` command can be used to replace circuits with cells of
|
||||
the target architecture.
|
||||
|
||||
unwrap
|
||||
The remaining wrapper cell can be unwrapped using `techmap`.
|
||||
|
|
|
@ -25,9 +25,8 @@ following description:
|
|||
- Does not already have the ``\fsm_encoding`` attribute.
|
||||
- Is not an output of the containing module.
|
||||
- Is driven by a single `$dff` or `$adff` cell.
|
||||
- The ``\D``-Input of this `$dff` or `$adff` cell is driven by a
|
||||
multiplexer tree that only has constants or the old state value on its
|
||||
leaves.
|
||||
- The ``\D``-Input of this `$dff` or `$adff` cell is driven by a multiplexer
|
||||
tree that only has constants or the old state value on its leaves.
|
||||
- The state value is only used in the said multiplexer tree or by simple
|
||||
relational cells that compare the state value to a constant (usually `$eq`
|
||||
cells).
|
||||
|
@ -87,8 +86,8 @@ given set of result signals using a set of signal-value assignments. It can also
|
|||
be passed a list of stop-signals that abort the ConstEval algorithm if the value
|
||||
of a stop-signal is needed in order to calculate the result signals.
|
||||
|
||||
The `fsm_extract` pass uses the ConstEval class in the following way to
|
||||
create a transition table. For each state:
|
||||
The `fsm_extract` pass uses the ConstEval class in the following way to create a
|
||||
transition table. For each state:
|
||||
|
||||
1. Create a ConstEval object for the module containing the FSM
|
||||
2. Add all control inputs to the list of stop signals
|
||||
|
@ -108,13 +107,12 @@ drivers for the control outputs are disconnected.
|
|||
FSM optimization
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
The `fsm_opt` pass performs basic optimizations on `$fsm` cells (not
|
||||
including state recoding). The following optimizations are performed (in this
|
||||
order):
|
||||
The `fsm_opt` pass performs basic optimizations on `$fsm` cells (not including
|
||||
state recoding). The following optimizations are performed (in this order):
|
||||
|
||||
- Unused control outputs are removed from the `$fsm` cell. The attribute
|
||||
``\unused_bits`` (that is usually set by the `opt_clean` pass) is
|
||||
used to determine which control outputs are unused.
|
||||
``\unused_bits`` (that is usually set by the `opt_clean` pass) is used to
|
||||
determine which control outputs are unused.
|
||||
|
||||
- Control inputs that are connected to the same driver are merged.
|
||||
|
||||
|
@ -134,11 +132,10 @@ order):
|
|||
FSM recoding
|
||||
~~~~~~~~~~~~
|
||||
|
||||
The `fsm_recode` pass assigns new bit patterns to the states. Usually
|
||||
this also implies a change in the width of the state signal. At the moment of
|
||||
this writing only one-hot encoding with all-zero for the reset state is
|
||||
supported.
|
||||
The `fsm_recode` pass assigns new bit patterns to the states. Usually this also
|
||||
implies a change in the width of the state signal. At the moment of this writing
|
||||
only one-hot encoding with all-zero for the reset state is supported.
|
||||
|
||||
The `fsm_recode` pass can also write a text file with the changes
|
||||
performed by it that can be used when verifying designs synthesized by Yosys
|
||||
using Synopsys Formality.
|
||||
The `fsm_recode` pass can also write a text file with the changes performed by
|
||||
it that can be used when verifying designs synthesized by Yosys using Synopsys
|
||||
Formality.
|
||||
|
|
|
@ -8,17 +8,16 @@ coarse-grain optimizations before being mapped to hard blocks and fine-grain
|
|||
cells. Most commands in Yosys will target either coarse-grain representation or
|
||||
fine-grain representation, with only a select few compatible with both states.
|
||||
|
||||
Commands such as `proc`, `fsm`, and `memory` rely on
|
||||
the additional information in the coarse-grain representation, along with a
|
||||
number of optimizations such as `wreduce`, `share`, and
|
||||
`alumacc`. `opt` provides optimizations which are useful in
|
||||
both states, while `techmap` is used to convert coarse-grain cells
|
||||
to the corresponding fine-grain representation.
|
||||
Commands such as `proc`, `fsm`, and `memory` rely on the additional information
|
||||
in the coarse-grain representation, along with a number of optimizations such as
|
||||
`wreduce`, `share`, and `alumacc`. `opt` provides optimizations which are
|
||||
useful in both states, while `techmap` is used to convert coarse-grain cells to
|
||||
the corresponding fine-grain representation.
|
||||
|
||||
Single-bit cells (logic gates, FFs) as well as LUTs, half-adders, and
|
||||
full-adders make up the bulk of the fine-grain representation and are necessary
|
||||
for commands such as `abc`\ /`abc9`, `simplemap`,
|
||||
`dfflegalize`, and `memory_map`.
|
||||
for commands such as `abc`\ /`abc9`, `simplemap`, `dfflegalize`, and
|
||||
`memory_map`.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
|
|
|
@ -5,10 +5,10 @@ The `memory` command
|
|||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
In the RTL netlist, memory reads and writes are individual cells. This makes
|
||||
consolidating the number of ports for a memory easier. The `memory`
|
||||
pass transforms memories to an implementation. By default that is logic for
|
||||
address decoders and registers. It also is a macro command that calls the other
|
||||
common ``memory_*`` passes in a sensible order:
|
||||
consolidating the number of ports for a memory easier. The `memory` pass
|
||||
transforms memories to an implementation. By default that is logic for address
|
||||
decoders and registers. It also is a macro command that calls the other common
|
||||
``memory_*`` passes in a sensible order:
|
||||
|
||||
.. literalinclude:: /code_examples/macro_commands/memory.ys
|
||||
:language: yoscrypt
|
||||
|
@ -22,11 +22,11 @@ Some quick notes:
|
|||
- `memory_dff` merges registers into the memory read- and write cells.
|
||||
- `memory_collect` collects all read and write cells for a memory and
|
||||
transforms them into one multi-port memory cell.
|
||||
- `memory_map` takes the multi-port memory cell and transforms it to
|
||||
address decoder logic and registers.
|
||||
- `memory_map` takes the multi-port memory cell and transforms it to address
|
||||
decoder logic and registers.
|
||||
|
||||
For more information about `memory`, such as disabling certain
|
||||
subcommands, see :doc:`/cmd/memory`.
|
||||
For more information about `memory`, such as disabling certain subcommands, see
|
||||
:doc:`/cmd/memory`.
|
||||
|
||||
Example
|
||||
-------
|
||||
|
@ -75,20 +75,20 @@ For example:
|
|||
techmap -map my_memory_map.v
|
||||
memory_map
|
||||
|
||||
`memory_libmap` attempts to convert memory cells (`$mem_v2` etc.) into
|
||||
hardware supported memory using a provided library (:file:`my_memory_map.txt` in the
|
||||
`memory_libmap` attempts to convert memory cells (`$mem_v2` etc.) into hardware
|
||||
supported memory using a provided library (:file:`my_memory_map.txt` in the
|
||||
example above). Where necessary, emulation logic is added to ensure functional
|
||||
equivalence before and after this conversion. :yoscrypt:`techmap -map
|
||||
my_memory_map.v` then uses `techmap` to map to hardware primitives. Any
|
||||
leftover memory cells unable to be converted are then picked up by
|
||||
`memory_map` and mapped to DFFs and address decoders.
|
||||
my_memory_map.v` then uses `techmap` to map to hardware primitives. Any leftover
|
||||
memory cells unable to be converted are then picked up by `memory_map` and
|
||||
mapped to DFFs and address decoders.
|
||||
|
||||
.. note::
|
||||
|
||||
More information about what mapping options are available and associated
|
||||
costs of each can be found by enabling debug outputs. This can be done with
|
||||
the `debug` command, or by using the ``-g`` flag when calling Yosys
|
||||
to globally enable debug messages.
|
||||
the `debug` command, or by using the ``-g`` flag when calling Yosys to
|
||||
globally enable debug messages.
|
||||
|
||||
For more on the lib format for `memory_libmap`, see
|
||||
`passes/memory/memlib.md
|
||||
|
@ -110,13 +110,15 @@ Notes
|
|||
Memory kind selection
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The memory inference code will automatically pick the target memory primitive based on memory geometry
|
||||
and features used. Depending on the target, there can be up to four memory primitive classes
|
||||
available for selection:
|
||||
The memory inference code will automatically pick the target memory primitive based
|
||||
on memory geometry and features used. Depending on the target, there can be up
|
||||
to four memory primitive classes available for selection:
|
||||
|
||||
- FF RAM (aka logic): no hardware primitive used, memory lowered to a bunch of FFs and multiplexers
|
||||
- FF RAM (aka logic): no hardware primitive used, memory lowered to a bunch of
|
||||
FFs and multiplexers
|
||||
|
||||
- Can handle arbitrary number of write ports, as long as all write ports are in the same clock domain
|
||||
- Can handle arbitrary number of write ports, as long as all write ports are
|
||||
in the same clock domain
|
||||
- Can handle arbitrary number and kind of read ports
|
||||
|
||||
- LUT RAM (aka distributed RAM): uses LUT storage as RAM
|
||||
|
@ -131,7 +133,8 @@ available for selection:
|
|||
- Supported on basically all FPGAs
|
||||
- Supports only synchronous reads
|
||||
- Two ports with separate clocks
|
||||
- Usually supports true dual port (with notable exception of ice40 that only supports SDP)
|
||||
- Usually supports true dual port (with notable exception of ice40 that only
|
||||
supports SDP)
|
||||
- Usually supports asymmetric memories and per-byte write enables
|
||||
- Several kilobits in size
|
||||
|
||||
|
@ -155,19 +158,22 @@ available for selection:
|
|||
- Two ports, both with mutually exclusive synchronous read and write
|
||||
- Single clock
|
||||
|
||||
- Will not be automatically selected by memory inference code, needs explicit opt-in via
|
||||
ram_style attribute
|
||||
- Will not be automatically selected by memory inference code, needs explicit
|
||||
opt-in via ram_style attribute
|
||||
|
||||
In general, you can expect the automatic selection process to work roughly like this:
|
||||
In general, you can expect the automatic selection process to work roughly like
|
||||
this:
|
||||
|
||||
- If any read port is asynchronous, only LUT RAM (or FF RAM) can be used.
|
||||
- If there is more than one write port, only block RAM can be used, and this needs to be a
|
||||
hardware-supported true dual port pattern
|
||||
- If there is more than one write port, only block RAM can be used, and this
|
||||
needs to be a hardware-supported true dual port pattern
|
||||
|
||||
- … unless all write ports are in the same clock domain, in which case FF RAM can also be used,
|
||||
but this is generally not what you want for anything but really small memories
|
||||
- … unless all write ports are in the same clock domain, in which case FF RAM
|
||||
can also be used, but this is generally not what you want for anything but
|
||||
really small memories
|
||||
|
||||
- Otherwise, either FF RAM, LUT RAM, or block RAM will be used, depending on memory size
|
||||
- Otherwise, either FF RAM, LUT RAM, or block RAM will be used, depending on
|
||||
memory size
|
||||
|
||||
This process can be overridden by attaching a ram_style attribute to the memory:
|
||||
|
||||
|
@ -178,15 +184,17 @@ This process can be overridden by attaching a ram_style attribute to the memory:
|
|||
|
||||
It is an error if this override cannot be realized for the given target.
|
||||
|
||||
Many alternate spellings of the attribute are also accepted, for compatibility with other software.
|
||||
Many alternate spellings of the attribute are also accepted, for compatibility
|
||||
with other software.
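
As a minimal sketch of such an override, the module below requests block RAM
for a small memory. The module name, port names, and memory size are
hypothetical, and the sketch assumes the target provides a block RAM primitive
that can realize this pattern; otherwise the override is an error, as noted
above.

.. code:: verilog

   // Hypothetical 256x8 memory with an explicit ram_style override.
   module ram_style_example (
       input  wire       clk,
       input  wire       we,
       input  wire [7:0] addr,
       input  wire [7:0] wdata,
       output reg  [7:0] rdata
   );
       // Ask memory inference for block RAM instead of automatic selection.
       (* ram_style = "block" *)
       reg [7:0] mem [0:255];

       always @(posedge clk) begin
           if (we)
               mem[addr] <= wdata;
           // Synchronous read; block RAM cannot serve asynchronous reads.
           rdata <= mem[addr];
       end
   endmodule

Without the attribute, a memory of this size could just as well end up in LUT
RAM or FF RAM, depending on the target.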
|
||||
|
||||
Initial data
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Most FPGA targets support initializing all kinds of memory to user-provided values. If explicit
|
||||
initialization is not used the initial memory value is undefined. Initial data can be provided by
|
||||
either initial statements writing memory cells one by one or the ``$readmemh`` or ``$readmemb`` system
|
||||
tasks. For an example pattern, see `sr_init`_.
|
||||
Most FPGA targets support initializing all kinds of memory to user-provided
|
||||
values. If explicit initialization is not used the initial memory value is
|
||||
undefined. Initial data can be provided by either initial statements writing
|
||||
memory cells one by one or the ``$readmemh`` or ``$readmemb`` system tasks. For an
|
||||
example pattern, see `sr_init`_.
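
The two ways of providing initial data can be sketched as follows. This is an
illustrative example only: the module name, memory size, and the file name
``init_data.hex`` are assumptions, not taken from the Yosys sources.

.. code:: verilog

   // Hypothetical 16x8 ROM-style memory with initial contents.
   module init_example (
       input  wire       clk,
       input  wire [3:0] addr,
       output reg  [7:0] rdata
   );
       reg [7:0] mem [0:15];
       integer i;

       // Variant 1: an initial block writing the memory cells one by one.
       initial begin
           for (i = 0; i < 16; i = i + 1)
               mem[i] = i;  // the low 8 bits of i are stored
       end

       // Variant 2 (use instead of variant 1): load the contents from a
       // hex file. The file name here is a placeholder.
       // initial $readmemh("init_data.hex", mem);

       always @(posedge clk)
           rdata <= mem[addr];
   endmodule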
|
||||
|
||||
.. _wbe:
|
||||
|
||||
|
@ -194,12 +202,13 @@ Write port with byte enables
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Byte enables can be used with any supported pattern
|
||||
- To ensure that multiple writes will be merged into one port, they need to have disjoint bit
|
||||
ranges, have the same address, and the same clock
|
||||
- Any write enable granularity will be accepted (down to per-bit write enables), but using smaller
|
||||
granularity than natively supported by the target is very likely to be inefficient (e.g. using
|
||||
4-bit bytes on ECP5 will result in either padding the bytes with 5 dummy bits to native 9-bit
|
||||
units or splitting the RAM into two block RAMs)
|
||||
- To ensure that multiple writes will be merged into one port, they need to have
|
||||
disjoint bit ranges, have the same address, and the same clock
|
||||
- Any write enable granularity will be accepted (down to per-bit write enables),
|
||||
but using smaller granularity than natively supported by the target is very
|
||||
likely to be inefficient (e.g. using 4-bit bytes on ECP5 will result in either
|
||||
padding the bytes with 5 dummy bits to native 9-bit units or splitting the RAM
|
||||
into two block RAMs)
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -240,7 +249,8 @@ Synchronous SDP with clock domain crossing
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Will result in block RAM or LUT RAM depending on size
|
||||
- No behavior guarantees in case of simultaneous read and write to the same address
|
||||
- No behavior guarantees in case of simultaneous read and write to the same
|
||||
address
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -261,9 +271,9 @@ Synchronous SDP read first
|
|||
|
||||
- The read and write parts can be in the same or different processes.
|
||||
- Will result in block RAM or LUT RAM depending on size
|
||||
- As long as the same clock is used for both, yosys will ensure read-first behavior. This may
|
||||
require extra circuitry on some targets for block RAM. If this is not necessary, use one of the
|
||||
patterns below.
|
||||
- As long as the same clock is used for both, yosys will ensure read-first
|
||||
behavior. This may require extra circuitry on some targets for block RAM. If
|
||||
this is not necessary, use one of the patterns below.
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -281,8 +291,8 @@ Synchronous SDP read first
|
|||
Synchronous SDP with undefined collision behavior
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Like above, but the read value is undefined when read and write ports target the same address in
|
||||
the same cycle
|
||||
- Like above, but the read value is undefined when read and write ports target
|
||||
the same address in the same cycle
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -322,8 +332,8 @@ Synchronous SDP with write-first behavior
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Will result in block RAM or LUT RAM depending on size
|
||||
- May use additional circuitry for block RAM if write-first is not natively supported. Will always
|
||||
use additional circuitry for LUT RAM.
|
||||
- May use additional circuitry for block RAM if write-first is not natively
|
||||
supported. Will always use additional circuitry for LUT RAM.
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -343,7 +353,8 @@ Synchronous SDP with write-first behavior
|
|||
Synchronous SDP with write-first behavior (alternate pattern)
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- This pattern is supported for compatibility, but is much less flexible than the above
|
||||
- This pattern is supported for compatibility, but is much less flexible than
|
||||
the above
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -378,8 +389,10 @@ Synchronous single-port RAM with mutually exclusive read/write
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Will result in single-port block RAM or LUT RAM depending on size
|
||||
- This is the correct pattern to infer ice40 SPRAM (with manual ram_style selection)
|
||||
- On targets that don't support read/write block RAM ports (e.g. ice40), will result in SDP block RAM instead
|
||||
- This is the correct pattern to infer ice40 SPRAM (with manual ram_style
|
||||
selection)
|
||||
- On targets that don't support read/write block RAM ports (e.g. ice40), will
|
||||
result in SDP block RAM instead
|
||||
- For block RAM, will use "NO_CHANGE" mode if available
|
||||
|
||||
.. code:: verilog
|
||||
|
@ -396,12 +409,14 @@ Synchronous single-port RAM with mutually exclusive read/write
|
|||
Synchronous single-port RAM with read-first behavior
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Will only result in single-port block RAM when read-first behavior is natively supported;
|
||||
otherwise, SDP RAM with additional circuitry will be used
|
||||
- Many targets (Xilinx, ECP5, …) can only natively support read-first/write-first single-port RAM
|
||||
(or TDP RAM) where the write_enable signal implies the read_enable signal (i.e. can never write
|
||||
without reading). The memory inference code will run a simple SAT solver on the control signals to
|
||||
determine if this is the case, and insert emulation circuitry if it cannot be easily proven.
|
||||
- Will only result in single-port block RAM when read-first behavior is natively
|
||||
supported; otherwise, SDP RAM with additional circuitry will be used
|
||||
- Many targets (Xilinx, ECP5, …) can only natively support
|
||||
read-first/write-first single-port RAM (or TDP RAM) where the write_enable
|
||||
signal implies the read_enable signal (i.e. can never write without reading).
|
||||
The memory inference code will run a simple SAT solver on the control signals
|
||||
to determine if this is the case, and insert emulation circuitry if it cannot
|
||||
be easily proven.
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -418,7 +433,8 @@ Synchronous single-port RAM with write-first behavior
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Will result in single-port block RAM or LUT RAM when supported
|
||||
- Block RAMs will require extra circuitry if write-first behavior is not natively supported
|
||||
- Block RAMs will require extra circuitry if write-first behavior is not natively
|
||||
supported
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -440,8 +456,8 @@ Synchronous read port with initial value
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- Initial read port values can be combined with any other supported pattern
|
||||
- If block RAM is used and initial read port values are not natively supported by the target, a small
|
||||
emulation circuit will be inserted
|
||||
- If block RAM is used and initial read port values are not natively supported
|
||||
by the target, a small emulation circuit will be inserted
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -459,10 +475,11 @@ Synchronous read port with initial value
|
|||
Read register reset patterns
|
||||
----------------------------
|
||||
|
||||
Resets can be combined with any other supported pattern (except that synchronous reset and
|
||||
asynchronous reset cannot both be used on a single read port). If block RAM is used and the
|
||||
selected reset (synchronous or asynchronous) is used but not natively supported by the target, small
|
||||
emulation circuitry will be inserted.
|
||||
Resets can be combined with any other supported pattern (except that synchronous
|
||||
reset and asynchronous reset cannot both be used on a single read port). If
|
||||
block RAM is used and the selected reset (synchronous or asynchronous) is used
|
||||
but not natively supported by the target, small emulation circuitry will be
|
||||
inserted.
|
||||
|
||||
Synchronous reset, reset priority over enable
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
@ -520,22 +537,26 @@ Synchronous read port with asynchronous reset
|
|||
Asymmetric memory patterns
|
||||
--------------------------
|
||||
|
||||
To construct an asymmetric memory (memory with read/write ports of differing widths):
|
||||
To construct an asymmetric memory (memory with read/write ports of differing
|
||||
widths):
|
||||
|
||||
- Declare the memory with the width of the narrowest intended port
|
||||
- Split all wide ports into multiple narrow ports
|
||||
- To ensure the wide ports will be correctly merged:
|
||||
|
||||
- For the address, use a concatenation of actual address in the high bits and a constant in the
|
||||
low bits
|
||||
- Ensure the actual address is identical for all ports belonging to the wide port
|
||||
- For the address, use a concatenation of actual address in the high bits and
|
||||
a constant in the low bits
|
||||
- Ensure the actual address is identical for all ports belonging to the wide
|
||||
port
|
||||
- Ensure that clock is identical
|
||||
- For read ports, ensure that enable/reset signals are identical (for write ports, the enable
|
||||
signal may vary — this will result in using the byte enable functionality)
|
||||
- For read ports, ensure that enable/reset signals are identical (for write
|
||||
ports, the enable signal may vary — this will result in using the byte
|
||||
enable functionality)
|
||||
|
||||
Asymmetric memory is supported on all targets, but may require emulation circuitry where not
|
||||
natively supported. Note that when the memory is larger than the underlying block RAM primitive,
|
||||
hardware asymmetric memory support is likely not to be used even if present as it is more expensive.
|
||||
Asymmetric memory is supported on all targets, but may require emulation
|
||||
circuitry where not natively supported. Note that when the memory is larger
|
||||
than the underlying block RAM primitive, hardware asymmetric memory support is
|
||||
likely not to be used even if present as it is more expensive.
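
The rules above can be sketched for a memory with a 16-bit write port and an
8-bit read port. The module and signal names and the memory size are
hypothetical; whether the two narrow writes are merged into one native wide
port depends on the target.

.. code:: verilog

   // Hypothetical asymmetric memory: 16-bit writes, 8-bit reads.
   module wide_write_example (
       input  wire        clk,
       input  wire        we,
       input  wire [6:0]  waddr,  // address of a 16-bit word
       input  wire [15:0] wdata,
       input  wire [7:0]  raddr,  // address of an 8-bit byte
       output reg  [7:0]  rdata
   );
       // Declared with the width of the narrowest port (8 bits).
       reg [7:0] mem [0:255];

       always @(posedge clk) begin
           if (we) begin
               // The wide port is split into two narrow writes. Both use the
               // same actual address in the high bits, a constant in the low
               // bit, and the same clock.
               mem[{waddr, 1'b0}] <= wdata[7:0];
               mem[{waddr, 1'b1}] <= wdata[15:8];
           end
       end

       always @(posedge clk)
           rdata <= mem[raddr];
   endmodule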
|
||||
|
||||
.. _wide_sr:
|
||||
|
||||
|
@ -615,20 +636,25 @@ Wide write port
|
|||
True dual port (TDP) patterns
|
||||
-----------------------------
|
||||
|
||||
- Many different variations of true dual port memory can be created by combining two single-port RAM
|
||||
patterns on the same memory
|
||||
- When TDP memory is used, memory inference code has much less maneuver room to create requested
|
||||
semantics compared to individual single-port patterns (which can end up lowered to SDP memory
|
||||
where necessary) — supported patterns depend strongly on the target
|
||||
- In particular, when both ports have the same clock, it's likely that "undefined collision" mode
|
||||
needs to be manually selected to enable TDP memory inference
|
||||
- The examples below are non-exhaustive — many more combinations of port types are possible
|
||||
- Note: if two write ports are in the same process, this defines a priority relation between them
|
||||
(if both ports are active in the same clock, the later one wins). On almost all targets, this will
|
||||
result in a bit of extra circuitry to ensure the priority semantics. If this is not what you want,
|
||||
put them in separate processes.
|
||||
- Many different variations of true dual port memory can be created by combining
|
||||
two single-port RAM patterns on the same memory
|
||||
- When TDP memory is used, memory inference code has much less maneuver room to
|
||||
create requested semantics compared to individual single-port patterns (which
|
||||
can end up lowered to SDP memory where necessary) — supported patterns depend
|
||||
strongly on the target
|
||||
- In particular, when both ports have the same clock, it's likely that
|
||||
"undefined collision" mode needs to be manually selected to enable TDP memory
|
||||
inference
|
||||
- The examples below are non-exhaustive — many more combinations of port types
|
||||
are possible
|
||||
- Note: if two write ports are in the same process, this defines a priority
|
||||
relation between them (if both ports are active in the same clock, the later
|
||||
one wins). On almost all targets, this will result in a bit of extra circuitry
|
||||
to ensure the priority semantics. If this is not what you want, put them in
|
||||
separate processes.
|
||||
|
||||
- Priority is not supported when using the verific front end and any priority semantics are ignored.
|
||||
- Priority is not supported when using the verific front end and any priority
|
||||
semantics are ignored.
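
The priority note above can be illustrated with a small sketch. All names and
the memory size are hypothetical, and whether TDP block RAM is actually
inferred for it depends on the target.

.. code:: verilog

   // Hypothetical two-port memory with both ports in a single process.
   module write_priority_example (
       input  wire       clk,
       input  wire       we_a,
       input  wire       we_b,
       input  wire [7:0] addr_a,
       input  wire [7:0] addr_b,
       input  wire [7:0] wdata_a,
       input  wire [7:0] wdata_b,
       output reg  [7:0] rdata_a,
       output reg  [7:0] rdata_b
   );
       reg [7:0] mem [0:255];

       // Both write ports share one process, so the later assignment wins
       // when both ports write the same address in the same cycle: port B
       // has priority over port A here.
       always @(posedge clk) begin
           if (we_a)
               mem[addr_a] <= wdata_a;
           else
               rdata_a <= mem[addr_a];

           if (we_b)
               mem[addr_b] <= wdata_b;
           else
               rdata_b <= mem[addr_b];
       end
   endmodule

Moving the port B logic into its own ``always`` block removes the priority
relation, and with it the extra circuitry needed to enforce it.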
|
||||
|
||||
TDP with different clocks, exclusive read/write
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
@ -654,7 +680,8 @@ TDP with different clocks, exclusive read/write
|
|||
TDP with same clock, read-first behavior
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- This requires hardware inter-port read-first behavior, and will only work on some targets (Xilinx, Nexus)
|
||||
- This requires hardware inter-port read-first behavior, and will only work on
|
||||
some targets (Xilinx, Nexus)
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
@ -677,9 +704,10 @@ TDP with same clock, read-first behavior
|
|||
TDP with multiple read ports
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
- The combination of a single write port with an arbitrary amount of read ports is supported on all
|
||||
targets — if a multi-read port primitive is available (like Xilinx RAM64M), it'll be used as
|
||||
appropriate. Otherwise, the memory will be automatically split into multiple primitives.
|
||||
- The combination of a single write port with an arbitrary amount of read ports
|
||||
is supported on all targets — if a multi-read port primitive is available
|
||||
(like Xilinx RAM64M), it'll be used as appropriate. Otherwise, the memory
|
||||
will be automatically split into multiple primitives.
|
||||
|
||||
.. code:: verilog
|
||||
|
||||
|
|
|
@ -9,9 +9,9 @@ This chapter outlines these optimizations.
|
|||
The `opt` macro command
|
||||
--------------------------------
|
||||
|
||||
The Yosys pass `opt` runs a number of simple optimizations. This
|
||||
includes removing unused signals and cells and const folding. It is recommended
|
||||
to run this pass after each major step in the synthesis script. As listed in
|
||||
The Yosys pass `opt` runs a number of simple optimizations. This includes
|
||||
removing unused signals and cells and const folding. It is recommended to run
|
||||
this pass after each major step in the synthesis script. As listed in
|
||||
:doc:`/cmd/opt`, this macro command calls the following ``opt_*`` commands:
|
||||
|
||||
.. literalinclude:: /code_examples/macro_commands/opt.ys
|
||||
|
@ -69,17 +69,17 @@ undef.
|
|||
The last two lines simply replace an `$_AND_` gate with one constant-1 input
|
||||
with a buffer.
|
||||
|
||||
Besides this basic const folding the `opt_expr` pass can replace 1-bit
|
||||
wide `$eq` and `$ne` cells with buffers or not-gates if one input is
|
||||
constant. Equality checks may also be reduced in size if there are redundant
|
||||
bits in the arguments (i.e. bits which are constant on both inputs). This can,
|
||||
for example, result in a 32-bit wide constant like ``255`` being reduced to the
|
||||
8-bit value of ``8'b11111111`` if the signal being compared is only 8-bit as in
|
||||
Besides this basic const folding the `opt_expr` pass can replace 1-bit wide
|
||||
`$eq` and `$ne` cells with buffers or not-gates if one input is constant.
|
||||
Equality checks may also be reduced in size if there are redundant bits in the
|
||||
arguments (i.e. bits which are constant on both inputs). This can, for example,
|
||||
result in a 32-bit wide constant like ``255`` being reduced to the 8-bit value
|
||||
of ``8'b11111111`` if the signal being compared is only 8-bit as in
|
||||
:ref:`addr_gen_clean` of :doc:`/getting_started/example_synth`.
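
As a rough illustration of the two reductions described above, the following
hypothetical module (its name and ports are not taken from the Yosys examples)
contains an AND with a constant-1 input and an equality check of an 8-bit
signal against a 32-bit constant. It is only a sketch of the kind of input
`opt_expr` can simplify, not a statement of the exact cells it produces.

.. code:: verilog

   // Hypothetical example; module and signal names are illustrative only.
   module opt_expr_sketch (
       input  wire       a,
       input  wire [7:0] addr,
       output wire       y,
       output wire       hit
   );
       // AND with a constant-1 input: logically just a buffer of a.
       assign y = a & 1'b1;

       // addr is zero-extended to 32 bits for the comparison, so the upper
       // 24 bits are constant on both sides and only 8 bits need comparing.
       assign hit = (addr == 32'd255);
   endmodule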
|
||||
|
||||
The `opt_expr` pass is very conservative regarding optimizing `$mux`
|
||||
cells, as these cells are often used to model decision-trees and breaking these
|
||||
trees can interfere with other optimizations.
|
||||
The `opt_expr` pass is very conservative regarding optimizing `$mux` cells, as
|
||||
these cells are often used to model decision-trees and breaking these trees can
|
||||
interfere with other optimizations.
|
||||
|
||||
.. literalinclude:: /code_examples/opt/opt_expr.ys
|
||||
:language: Verilog
|
||||
|
@ -100,9 +100,9 @@ identifies cells with identical inputs and replaces them with a single instance
|
|||
of the cell.
|
||||
|
||||
The option ``-nomux`` can be used to disable resource sharing for multiplexer
|
||||
cells (`$mux` and `$pmux`). This can be useful as it prevents multiplexer
|
||||
trees from being merged, which might prevent `opt_muxtree` from identifying
|
||||
possible optimizations.
|
||||
cells (`$mux` and `$pmux`). This can be useful as it prevents multiplexer trees
|
||||
from being merged, which might prevent `opt_muxtree` from identifying possible
|
||||
optimizations.
|
||||
|
||||
.. literalinclude:: /code_examples/opt/opt_merge.ys
|
||||
:language: Verilog
|
||||
|
@ -128,9 +128,9 @@ Consider the following simple example:
|
|||
:caption: example verilog for demonstrating `opt_muxtree`
|
||||
|
||||
The output can never be ``c``, as this would require ``a`` to be 1 for the outer
|
||||
multiplexer and 0 for the inner multiplexer. The `opt_muxtree` pass
|
||||
detects this contradiction and replaces the inner multiplexer with a constant 1,
|
||||
yielding the logic for ``y = a ? b : d``.
|
||||
multiplexer and 0 for the inner multiplexer. The `opt_muxtree` pass detects this
|
||||
contradiction and replaces the inner multiplexer with a constant 1, yielding the
|
||||
logic for ``y = a ? b : d``.
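
The contradiction can be seen in a small sketch. The first assignment is an
assumed form of the nested multiplexer (the example file itself is not
reproduced here, and the module and output names are made up for
illustration); the second is the simplified logic stated above.

.. code:: verilog

   // Hypothetical sketch; not the literal contents of the example file.
   module opt_muxtree_sketch (
       input  wire a, b, c, d,
       output wire y_before,
       output wire y_after
   );
       // c is only reachable when the outer select sees a == 1 and the
       // inner select sees a == 0, which can never happen together.
       assign y_before = a ? (a ? b : c) : d;

       // Equivalent logic once the unreachable branch is dropped.
       assign y_after  = a ? b : d;
   endmodule

Both outputs are equal for every input combination, which is why dropping the
inner multiplexer is safe.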
|
||||
|
||||
.. figure:: /_images/code_examples/opt/opt_muxtree.*
|
||||
:class: width-helper invert-helper
|
||||
|
@ -141,9 +141,9 @@ Simplifying large MUXes and AND/OR gates - `opt_reduce`
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This is a simple optimization pass that identifies and consolidates identical
|
||||
input bits to `$reduce_and` and `$reduce_or` cells. It also sorts the input
|
||||
bits to ease identification of shareable `$reduce_and` and `$reduce_or`
|
||||
cells in other passes.
|
||||
input bits to `$reduce_and` and `$reduce_or` cells. It also sorts the input bits
|
||||
to ease identification of shareable `$reduce_and` and `$reduce_or` cells in
|
||||
other passes.
|
||||
|
||||
This pass also identifies and consolidates identical inputs to multiplexer
|
||||
cells. In this case the new shared select bit is driven using a `$reduce_or`
|
||||
|
@ -162,8 +162,8 @@ This pass identifies mutually exclusive cells of the same type that:
|
|||
a. share an input signal, and
|
||||
b. drive the same `$mux`, `$_MUX_`, or `$pmux` multiplexing cell,
|
||||
|
||||
allowing the cell to be merged and the multiplexer to be moved from
|
||||
multiplexing its output to multiplexing the non-shared input signals.
|
||||
allowing the cell to be merged and the multiplexer to be moved from multiplexing
|
||||
its output to multiplexing the non-shared input signals.
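
A minimal sketch of the transformation, assuming two adders that share the
input ``a`` and feed one multiplexer. The module, port names and widths are
hypothetical and are not taken from the example file.

.. code:: verilog

   // Hypothetical sketch of the before/after structure.
   module opt_share_sketch (
       input  wire       sel,
       input  wire [7:0] a, b, c,
       output wire [7:0] y_before,
       output wire [7:0] y_after
   );
       // Two adders, only one of which is used at a time; the multiplexer
       // selects between their outputs.
       assign y_before = sel ? (a + b) : (a + c);

       // One adder; the multiplexer now selects the non-shared input.
       assign y_after  = a + (sel ? b : c);
   endmodule

The second form needs a single adder, at the cost of moving the multiplexer in
front of it.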
|
||||
|
||||
.. literalinclude:: /code_examples/opt/opt_share.ys
|
||||
:language: Verilog
|
||||
|
@ -176,16 +176,16 @@ multiplexing its output to multiplexing the non-shared input signals.
|
|||
|
||||
Before and after `opt_share`
|
||||
|
||||
When running `opt` in full, the original `$mux` (labeled ``$3``) is
|
||||
optimized away by `opt_expr`.
|
||||
When running `opt` in full, the original `$mux` (labeled ``$3``) is optimized
|
||||
away by `opt_expr`.
|
||||
|
||||
Performing DFF optimizations - `opt_dff`
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This pass identifies single-bit d-type flip-flops (`$_DFF_`, `$dff`, and
|
||||
`$adff` cells) with a constant data input and replaces them with a constant
|
||||
driver. It can also merge clock enables and synchronous reset multiplexers,
|
||||
removing unused control inputs.
|
||||
This pass identifies single-bit d-type flip-flops (`$_DFF_`, `$dff`, and `$adff`
|
||||
cells) with a constant data input and replaces them with a constant driver. It
|
||||
can also merge clock enables and synchronous reset multiplexers, removing unused
|
||||
control inputs.
|
||||
|
||||
Called with ``-nodffe`` and ``-nosdff``, this pass is used to prepare a design
|
||||
for :doc:`/using_yosys/synthesis/fsm`.
|
||||
|
@ -200,20 +200,20 @@ attribute can be used for debugging or by other optimization passes.
|
|||
When to use `opt` or `clean`
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Usually it does not hurt to call `opt` after each regular command in
|
||||
the synthesis script. But it increases the synthesis time, so it is favourable
|
||||
to only call `opt` when an improvement can be achieved.
|
||||
Usually it does not hurt to call `opt` after each regular command in the
|
||||
synthesis script. But it increases the synthesis time, so it is favourable to
|
||||
only call `opt` when an improvement can be achieved.
|
||||
|
||||
It is generally a good idea to call `opt` before inherently expensive
|
||||
commands such as `sat` or `freduce`, as the possible gain is
|
||||
much higher in these cases than the possible loss.
|
||||
It is generally a good idea to call `opt` before inherently expensive commands
|
||||
such as `sat` or `freduce`, as the possible gain is much higher in these cases
|
||||
than the possible loss.
|
||||
|
||||
The `clean` command, which is an alias for `opt_clean` with
|
||||
fewer outputs, on the other hand is very fast and many commands leave a mess
|
||||
(dangling signal wires, etc.). For example, most commands do not remove any wires
|
||||
or cells. They just change the connections and depend on a later call to clean
|
||||
to get rid of the now unused objects. So the occasional ``;;``, which itself is
|
||||
an alias for `clean`, is a good idea in every synthesis script, e.g.:
|
||||
The `clean` command, which is an alias for `opt_clean` with fewer outputs, on
|
||||
the other hand is very fast and many commands leave a mess (dangling signal
|
||||
wires, etc.). For example, most commands do not remove any wires or cells. They
|
||||
just change the connections and depend on a later call to clean to get rid of
|
||||
the now unused objects. So the occasional ``;;``, which itself is an alias for
|
||||
`clean`, is a good idea in every synthesis script, e.g.:
|
||||
|
||||
.. code-block:: yoscrypt
|
||||
|
||||
|
|
|
@ -5,23 +5,23 @@ Converting process blocks
|
|||
:language: yoscrypt
|
||||
|
||||
The Verilog frontend converts ``always``-blocks to RTL netlists for the
|
||||
expressions and "processes" for the control- and memory elements. The
|
||||
`proc` command then transforms these "processes" to netlists of RTL
|
||||
multiplexer and register cells. It is also a macro command that calls the other
|
||||
``proc_*`` commands in a sensible order:
|
||||
expressions and "processes" for the control- and memory elements. The `proc`
|
||||
command then transforms these "processes" to netlists of RTL multiplexer and
|
||||
register cells. It is also a macro command that calls the other ``proc_*``
|
||||
commands in a sensible order:
|
||||
|
||||
.. literalinclude:: /code_examples/macro_commands/proc.ys
|
||||
:language: yoscrypt
|
||||
:start-after: #end:
|
||||
:caption: Passes called by `proc`
|
||||
|
||||
After all the ``proc_*`` commands, `opt_expr` is called. This can be
|
||||
disabled by calling :yoscrypt:`proc -noopt`. For more information about
|
||||
`proc`, such as disabling certain sub commands, see :doc:`/cmd/proc`.
|
||||
After all the ``proc_*`` commands, `opt_expr` is called. This can be disabled by
|
||||
calling :yoscrypt:`proc -noopt`. For more information about `proc`, such as
|
||||
disabling certain sub commands, see :doc:`/cmd/proc`.
|
||||
|
||||
Many commands cannot operate on modules with "processes" in them. Usually a
|
||||
call to `proc` is the first command in the actual synthesis procedure
|
||||
after design elaboration.
|
||||
call to `proc` is the first command in the actual synthesis procedure after
|
||||
design elaboration.
|
||||
|
||||
Example
|
||||
^^^^^^^
|
||||
|
|