mirror of https://github.com/YosysHQ/yosys synced 2025-09-20 16:34:51 +00:00

Docs: Reflow line length

Krystine Sherwin 2024-05-03 13:38:01 +12:00
parent 829e02ec5b
commit 40ba92e956
20 changed files with 782 additions and 785 deletions

@ -10,13 +10,12 @@ fine-grained optimisation and LUT mapping.
Yosys has two different commands, which both use this logic toolbox, but use it
in different ways.
The `abc` pass can be used for both ASIC (e.g. :yoscrypt:`abc
-liberty`) and FPGA (:yoscrypt:`abc -lut`) mapping, but this page will focus on
FPGA mapping.
The `abc` pass can be used for both ASIC (e.g. :yoscrypt:`abc -liberty`) and
FPGA (:yoscrypt:`abc -lut`) mapping, but this page will focus on FPGA mapping.
The `abc9` pass generally provides superior mapping quality due to
being aware of combination boxes and DFF and LUT timings, giving it a more
global view of the mapping problem.
The `abc9` pass generally provides superior mapping quality due to being aware
of combination boxes and DFF and LUT timings, giving it a more global view of
the mapping problem.
.. _ABC: https://github.com/berkeley-abc/abc

@ -98,8 +98,8 @@ our internal cell library will be mapped to:
:name: mycells-lib
:caption: :file:`mycells.lib`
Recall that the Yosys built-in logic gate types are `$_NOT_`, `$_AND_`,
`$_OR_`, `$_XOR_`, and `$_MUX_` with an assortment of dff memory types.
Recall that the Yosys built-in logic gate types are `$_NOT_`, `$_AND_`, `$_OR_`,
`$_XOR_`, and `$_MUX_` with an assortment of dff memory types.
:ref:`mycells-lib` defines our target cells as ``BUF``, ``NOT``, ``NAND``,
``NOR``, and ``DFF``. Mapping between these is performed with the commands
`dfflibmap` and `abc` as follows:
@ -117,8 +117,8 @@ The final version of our ``counter`` module looks like this:
``counter`` after hardware cell mapping
Before finally being output as a verilog file with `write_verilog`,
which can then be loaded into another tool:
Before finally being output as a verilog file with `write_verilog`, which can
then be loaded into another tool:
.. literalinclude:: /code_examples/intro/counter.ys
:language: yoscrypt

@ -1,12 +1,12 @@
The extract pass
----------------
- Like the `techmap` pass, the `extract` pass is called with a
map file. It compares the circuits inside the modules of the map file with the
design and looks for sub-circuits in the design that match any of the modules
in the map file.
- If a match is found, the `extract` pass will replace the matching
subcircuit with an instance of the module from the map file.
- Like the `techmap` pass, the `extract` pass is called with a map file. It
compares the circuits inside the modules of the map file with the design and
looks for sub-circuits in the design that match any of the modules in the map
file.
- If a match is found, the `extract` pass will replace the matching subcircuit
with an instance of the module from the map file.
- In a way the `extract` pass is the inverse of the techmap pass.
.. todo:: add/expand supporting text, also mention custom pattern matching and
@ -68,23 +68,23 @@ The wrap-extract-unwrap method
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Often a coarse-grain element has a constant bit-width, but can be used to
implement operations with a smaller bit-width. For example, an 18x25-bit multiplier
can also be used to implement 16x20-bit multiplication.
implement operations with a smaller bit-width. For example, an 18x25-bit
multiplier can also be used to implement 16x20-bit multiplication.
A way of mapping such elements in coarse grain synthesis is the
wrap-extract-unwrap method:
wrap
Identify candidate-cells in the circuit and wrap them in a cell with a
constant wider bit-width using `techmap`. The wrappers use the same
parameters as the original cell, so the information about the original width
of the ports is preserved. Then use the `connwrappers` command to
connect up the bit-extended in- and outputs of the wrapper cells.
constant wider bit-width using `techmap`. The wrappers use the same parameters
as the original cell, so the information about the original width of the ports
is preserved. Then use the `connwrappers` command to connect up the
bit-extended in- and outputs of the wrapper cells.
extract
Now all operations are encoded using the same bit-width as the coarse grain
element. The `extract` command can be used to replace circuits with
cells of the target architecture.
element. The `extract` command can be used to replace circuits with cells of
the target architecture.
unwrap
The remaining wrapper cell can be unwrapped using `techmap`.

@ -25,9 +25,8 @@ following description:
- Does not already have the ``\fsm_encoding`` attribute.
- Is not an output of the containing module.
- Is driven by single `$dff` or `$adff` cell.
- The ``\D``-Input of this `$dff` or `$adff` cell is driven by a
multiplexer tree that only has constants or the old state value on its
leaves.
- The ``\D``-Input of this `$dff` or `$adff` cell is driven by a multiplexer
tree that only has constants or the old state value on its leaves.
- The state value is only used in the said multiplexer tree or by simple
relational cells that compare the state value to a constant (usually `$eq`
cells).
@ -87,8 +86,8 @@ given set of result signals using a set of signal-value assignments. It can also
be passed a list of stop-signals that abort the ConstEval algorithm if the value
of a stop-signal is needed in order to calculate the result signals.
The `fsm_extract` pass uses the ConstEval class in the following way to
create a transition table. For each state:
The `fsm_extract` pass uses the ConstEval class in the following way to create a
transition table. For each state:
1. Create a ConstEval object for the module containing the FSM
2. Add all control inputs to the list of stop signals
@ -108,13 +107,12 @@ drivers for the control outputs are disconnected.
FSM optimization
~~~~~~~~~~~~~~~~
The `fsm_opt` pass performs basic optimizations on `$fsm` cells (not
including state recoding). The following optimizations are performed (in this
order):
The `fsm_opt` pass performs basic optimizations on `$fsm` cells (not including
state recoding). The following optimizations are performed (in this order):
- Unused control outputs are removed from the `$fsm` cell. The attribute
``\unused_bits`` (that is usually set by the `opt_clean` pass) is
used to determine which control outputs are unused.
``\unused_bits`` (that is usually set by the `opt_clean` pass) is used to
determine which control outputs are unused.
- Control inputs that are connected to the same driver are merged.
@ -134,11 +132,10 @@ order):
FSM recoding
~~~~~~~~~~~~
The `fsm_recode` pass assigns a new bit pattern to the states. Usually
this also implies a change in the width of the state signal. At the time of
writing, only one-hot encoding with all-zero for the reset state is
supported.
The `fsm_recode` pass assigns a new bit pattern to the states. Usually this
also implies a change in the width of the state signal. At the time of writing,
only one-hot encoding with all-zero for the reset state is supported.
The `fsm_recode` pass can also write a text file with the changes
performed by it that can be used when verifying designs synthesized by Yosys
using Synopsys Formality.
The `fsm_recode` pass can also write a text file with the changes performed by
it that can be used when verifying designs synthesized by Yosys using Synopsys
Formality.

@ -8,17 +8,16 @@ coarse-grain optimizations before being mapped to hard blocks and fine-grain
cells. Most commands in Yosys will target either coarse-grain representation or
fine-grain representation, with only a select few compatible with both states.
Commands such as `proc`, `fsm`, and `memory` rely on
the additional information in the coarse-grain representation, along with a
number of optimizations such as `wreduce`, `share`, and
`alumacc`. `opt` provides optimizations which are useful in
both states, while `techmap` is used to convert coarse-grain cells
to the corresponding fine-grain representation.
Commands such as `proc`, `fsm`, and `memory` rely on the additional information
in the coarse-grain representation, along with a number of optimizations such as
`wreduce`, `share`, and `alumacc`. `opt` provides optimizations which are
useful in both states, while `techmap` is used to convert coarse-grain cells to
the corresponding fine-grain representation.
Single-bit cells (logic gates, FFs) as well as LUTs, half-adders, and
full-adders make up the bulk of the fine-grain representation and are necessary
for commands such as `abc`\ /`abc9`, `simplemap`,
`dfflegalize`, and `memory_map`.
for commands such as `abc`\ /`abc9`, `simplemap`, `dfflegalize`, and
`memory_map`.
.. toctree::
:maxdepth: 3

@ -5,10 +5,10 @@ The `memory` command
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In the RTL netlist, memory reads and writes are individual cells. This makes
consolidating the number of ports for a memory easier. The `memory`
pass transforms memories to an implementation. Per default that is logic for
address decoders and registers. It also is a macro command that calls the other
common ``memory_*`` passes in a sensible order:
consolidating the number of ports for a memory easier. The `memory` pass
transforms memories to an implementation. Per default that is logic for address
decoders and registers. It also is a macro command that calls the other common
``memory_*`` passes in a sensible order:
.. literalinclude:: /code_examples/macro_commands/memory.ys
:language: yoscrypt
@ -22,11 +22,11 @@ Some quick notes:
- `memory_dff` merges registers into the memory read- and write cells.
- `memory_collect` collects all read and write cells for a memory and
transforms them into one multi-port memory cell.
- `memory_map` takes the multi-port memory cell and transforms it to
address decoder logic and registers.
- `memory_map` takes the multi-port memory cell and transforms it to address
decoder logic and registers.
For more information about `memory`, such as disabling certain sub
commands, see :doc:`/cmd/memory`.
For more information about `memory`, such as disabling certain sub commands, see
:doc:`/cmd/memory`.
Example
-------
@ -75,20 +75,20 @@ For example:
techmap -map my_memory_map.v
memory_map
`memory_libmap` attempts to convert memory cells (`$mem_v2` etc) into
hardware supported memory using a provided library (:file:`my_memory_map.txt` in the
`memory_libmap` attempts to convert memory cells (`$mem_v2` etc) into hardware
supported memory using a provided library (:file:`my_memory_map.txt` in the
example above). Where necessary, emulation logic is added to ensure functional
equivalence before and after this conversion. :yoscrypt:`techmap -map
my_memory_map.v` then uses `techmap` to map to hardware primitives. Any
leftover memory cells unable to be converted are then picked up by
`memory_map` and mapped to DFFs and address decoders.
my_memory_map.v` then uses `techmap` to map to hardware primitives. Any leftover
memory cells unable to be converted are then picked up by `memory_map` and
mapped to DFFs and address decoders.
.. note::
More information about what mapping options are available and associated
costs of each can be found by enabling debug outputs. This can be done with
the `debug` command, or by using the ``-g`` flag when calling Yosys
to globally enable debug messages.
the `debug` command, or by using the ``-g`` flag when calling Yosys to
globally enable debug messages.
For more on the lib format for `memory_libmap`, see
`passes/memory/memlib.md
@ -110,13 +110,15 @@ Notes
Memory kind selection
~~~~~~~~~~~~~~~~~~~~~
The memory inference code will automatically pick target memory primitive based on memory geometry
and features used. Depending on the target, there can be up to four memory primitive classes
available for selection:
The memory inference code will automatically pick target memory primitive based
on memory geometry and features used. Depending on the target, there can be up
to four memory primitive classes available for selection:
- FF RAM (aka logic): no hardware primitive used, memory lowered to a bunch of FFs and multiplexers
- FF RAM (aka logic): no hardware primitive used, memory lowered to a bunch of
FFs and multiplexers
- Can handle arbitrary number of write ports, as long as all write ports are in the same clock domain
- Can handle arbitrary number of write ports, as long as all write ports are
in the same clock domain
- Can handle arbitrary number and kind of read ports
- LUT RAM (aka distributed RAM): uses LUT storage as RAM
@ -131,7 +133,8 @@ available for selection:
- Supported on basically all FPGAs
- Supports only synchronous reads
- Two ports with separate clocks
- Usually supports true dual port (with notable exception of ice40 that only supports SDP)
- Usually supports true dual port (with notable exception of ice40 that only
supports SDP)
- Usually supports asymmetric memories and per-byte write enables
- Several kilobits in size
@ -155,19 +158,22 @@ available for selection:
- Two ports, both with mutually exclusive synchronous read and write
- Single clock
- Will not be automatically selected by memory inference code, needs explicit opt-in via
ram_style attribute
- Will not be automatically selected by memory inference code, needs explicit
opt-in via ram_style attribute
In general, you can expect the automatic selection process to work roughly like this:
In general, you can expect the automatic selection process to work roughly like
this:
- If any read port is asynchronous, only LUT RAM (or FF RAM) can be used.
- If there is more than one write port, only block RAM can be used, and this needs to be a
hardware-supported true dual port pattern
- If there is more than one write port, only block RAM can be used, and this
needs to be a hardware-supported true dual port pattern
- … unless all write ports are in the same clock domain, in which case FF RAM can also be used,
but this is generally not what you want for anything but really small memories
- … unless all write ports are in the same clock domain, in which case FF RAM
can also be used, but this is generally not what you want for anything but
really small memories
- Otherwise, either FF RAM, LUT RAM, or block RAM will be used, depending on memory size
- Otherwise, either FF RAM, LUT RAM, or block RAM will be used, depending on
memory size
This process can be overridden by attaching a ram_style attribute to the memory:
@ -178,15 +184,17 @@ This process can be overridden by attaching a ram_style attribute to the memory:
It is an error if this override cannot be realized for the given target.
Many alternate spellings of the attribute are also accepted, for compatibility with other software.
Many alternate spellings of the attribute are also accepted, for compatibility
with other software.
Initial data
~~~~~~~~~~~~
Most FPGA targets support initializing all kinds of memory to user-provided values. If explicit
initialization is not used the initial memory value is undefined. Initial data can be provided by
either initial statements writing memory cells one by one or ``$readmemh`` or ``$readmemb`` system
tasks. For an example pattern, see `sr_init`_.
Most FPGA targets support initializing all kinds of memory to user-provided
values. If explicit initialization is not used the initial memory value is
undefined. Initial data can be provided by either initial statements writing
memory cells one by one or ``$readmemh`` or ``$readmemb`` system tasks. For an
example pattern, see `sr_init`_.
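
As a rough illustration of the two ways of providing initial data mentioned
above, the following sketch shows a small memory initialized from a file. The
module name, port names, memory geometry, and the file name ``init.hex`` are
hypothetical placeholders and are not taken from the Yosys sources:

.. code:: verilog

   // Hypothetical example module; all names are placeholders.
   module rom_init (
       input  wire       clk,
       input  wire [3:0] addr,
       output reg  [7:0] data
   );
       reg [7:0] mem [0:15];

       // Option 1: load the whole memory from a hex file.
       // "init.hex" is an assumed file name.
       initial $readmemh("init.hex", mem);

       // Option 2 (alternative to option 1): write the memory cells one by
       // one from an initial statement.
       // integer i;
       // initial for (i = 0; i < 16; i = i + 1) mem[i] = i[7:0];

       // Synchronous read port.
       always @(posedge clk)
           data <= mem[addr];
   endmodule

Whether the initial contents end up in a hardware memory primitive depends on
the target, as described above.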
.. _wbe:
@ -194,12 +202,13 @@ Write port with byte enables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Byte enables can be used with any supported pattern
- To ensure that multiple writes will be merged into one port, they need to have disjoint bit
ranges, have the same address, and the same clock
- Any write enable granularity will be accepted (down to per-bit write enables), but using smaller
granularity than natively supported by the target is very likely to be inefficient (eg. using
4-bit bytes on ECP5 will result in either padding the bytes with 5 dummy bits to native 9-bit
units or splitting the RAM into two block RAMs)
- To ensure that multiple writes will be merged into one port, they need to have
disjoint bit ranges, have the same address, and the same clock
- Any write enable granularity will be accepted (down to per-bit write enables),
but using smaller granularity than natively supported by the target is very
likely to be inefficient (eg. using 4-bit bytes on ECP5 will result in either
padding the bytes with 5 dummy bits to native 9-bit units or splitting the RAM
into two block RAMs)
.. code:: verilog
@ -240,7 +249,8 @@ Synchronous SDP with clock domain crossing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in block RAM or LUT RAM depending on size
- No behavior guarantees in case of simultaneous read and write to the same address
- No behavior guarantees in case of simultaneous read and write to the same
address
.. code:: verilog
@ -261,9 +271,9 @@ Synchronous SDP read first
- The read and write parts can be in the same or different processes.
- Will result in block RAM or LUT RAM depending on size
- As long as the same clock is used for both, yosys will ensure read-first behavior. This may
require extra circuitry on some targets for block RAM. If this is not necessary, use one of the
patterns below.
- As long as the same clock is used for both, yosys will ensure read-first
behavior. This may require extra circuitry on some targets for block RAM. If
this is not necessary, use one of the patterns below.
.. code:: verilog
@ -281,8 +291,8 @@ Synchronous SDP read first
Synchronous SDP with undefined collision behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Like above, but the read value is undefined when read and write ports target the same address in
the same cycle
- Like above, but the read value is undefined when read and write ports target
the same address in the same cycle
.. code:: verilog
@ -322,8 +332,8 @@ Synchronous SDP with write-first behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in block RAM or LUT RAM depending on size
- May use additional circuitry for block RAM if write-first is not natively supported. Will always
use additional circuitry for LUT RAM.
- May use additional circuitry for block RAM if write-first is not natively
supported. Will always use additional circuitry for LUT RAM.
.. code:: verilog
@ -343,7 +353,8 @@ Synchronous SDP with write-first behavior
Synchronous SDP with write-first behavior (alternate pattern)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- This pattern is supported for compatibility, but is much less flexible than the above
- This pattern is supported for compatibility, but is much less flexible than
the above
.. code:: verilog
@ -378,8 +389,10 @@ Synchronous single-port RAM with mutually exclusive read/write
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in single-port block RAM or LUT RAM depending on size
- This is the correct pattern to infer ice40 SPRAM (with manual ram_style selection)
- On targets that don't support read/write block RAM ports (eg. ice40), will result in SDP block RAM instead
- This is the correct pattern to infer ice40 SPRAM (with manual ram_style
selection)
- On targets that don't support read/write block RAM ports (eg. ice40), will
result in SDP block RAM instead
- For block RAM, will use "NO_CHANGE" mode if available
.. code:: verilog
@ -396,12 +409,14 @@ Synchronous single-port RAM with mutually exclusive read/write
Synchronous single-port RAM with read-first behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will only result in single-port block RAM when read-first behavior is natively supported;
otherwise, SDP RAM with additional circuitry will be used
- Many targets (Xilinx, ECP5, …) can only natively support read-first/write-first single-port RAM
(or TDP RAM) where the write_enable signal implies the read_enable signal (ie. can never write
without reading). The memory inference code will run a simple SAT solver on the control signals to
determine if this is the case, and insert emulation circuitry if it cannot be easily proven.
- Will only result in single-port block RAM when read-first behavior is natively
supported; otherwise, SDP RAM with additional circuitry will be used
- Many targets (Xilinx, ECP5, …) can only natively support
read-first/write-first single-port RAM (or TDP RAM) where the write_enable
signal implies the read_enable signal (ie. can never write without reading).
The memory inference code will run a simple SAT solver on the control signals
to determine if this is the case, and insert emulation circuitry if it cannot
be easily proven.
.. code:: verilog
@ -418,7 +433,8 @@ Synchronous single-port RAM with write-first behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Will result in single-port block RAM or LUT RAM when supported
- Block RAMs will require extra circuitry if write-first behavior is not natively supported
- Block RAMs will require extra circuitry if write-first behavior is not
natively supported
.. code:: verilog
@ -440,8 +456,8 @@ Synchronous read port with initial value
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Initial read port values can be combined with any other supported pattern
- If block RAM is used and initial read port values are not natively supported by the target, a small
emulation circuit will be inserted
- If block RAM is used and initial read port values are not natively supported
by the target, a small emulation circuit will be inserted
.. code:: verilog
@ -459,10 +475,11 @@ Synchronous read port with initial value
Read register reset patterns
----------------------------
Resets can be combined with any other supported pattern (except that synchronous reset and
asynchronous reset cannot both be used on a single read port). If block RAM is used and the
selected reset (synchronous or asynchronous) is used but not natively supported by the target, small
emulation circuitry will be inserted.
Resets can be combined with any other supported pattern (except that synchronous
reset and asynchronous reset cannot both be used on a single read port). If
block RAM is used and the selected reset (synchronous or asynchronous) is used
but not natively supported by the target, small emulation circuitry will be
inserted.
Synchronous reset, reset priority over enable
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -520,22 +537,26 @@ Synchronous read port with asynchronous reset
Asymmetric memory patterns
--------------------------
To construct an asymmetric memory (memory with read/write ports of differing widths):
To construct an asymmetric memory (memory with read/write ports of differing
widths):
- Declare the memory with the width of the narrowest intended port
- Split all wide ports into multiple narrow ports
- To ensure the wide ports will be correctly merged:
- For the address, use a concatenation of actual address in the high bits and a constant in the
low bits
- Ensure the actual address is identical for all ports belonging to the wide port
- For the address, use a concatenation of actual address in the high bits and
a constant in the low bits
- Ensure the actual address is identical for all ports belonging to the wide
port
- Ensure that clock is identical
- For read ports, ensure that enable/reset signals are identical (for write ports, the enable
signal may vary — this will result in using the byte enable functionality)
- For read ports, ensure that enable/reset signals are identical (for write
ports, the enable signal may vary — this will result in using the byte
enable functionality)
Asymmetric memory is supported on all targets, but may require emulation circuitry where not
natively supported. Note that when the memory is larger than the underlying block RAM primitive,
hardware asymmetric memory support is likely not to be used even if present as it is more expensive.
Asymmetric memory is supported on all targets, but may require emulation
circuitry where not natively supported. Note that when the memory is larger
than the underlying block RAM primitive, hardware asymmetric memory support is
likely not to be used even if present as it is more expensive.
.. _wide_sr:
@ -615,20 +636,25 @@ Wide write port
True dual port (TDP) patterns
-----------------------------
- Many different variations of true dual port memory can be created by combining two single-port RAM
patterns on the same memory
- When TDP memory is used, memory inference code has much less maneuver room to create requested
semantics compared to individual single-port patterns (which can end up lowered to SDP memory
where necessary) — supported patterns depend strongly on the target
- In particular, when both ports have the same clock, it's likely that "undefined collision" mode
needs to be manually selected to enable TDP memory inference
- The examples below are non-exhaustive — many more combinations of port types are possible
- Note: if two write ports are in the same process, this defines a priority relation between them
(if both ports are active in the same clock, the later one wins). On almost all targets, this will
result in a bit of extra circuitry to ensure the priority semantics. If this is not what you want,
put them in separate processes.
- Many different variations of true dual port memory can be created by combining
two single-port RAM patterns on the same memory
- When TDP memory is used, memory inference code has much less maneuver room to
create requested semantics compared to individual single-port patterns (which
can end up lowered to SDP memory where necessary) — supported patterns depend
strongly on the target
- In particular, when both ports have the same clock, it's likely that
"undefined collision" mode needs to be manually selected to enable TDP memory
inference
- The examples below are non-exhaustive — many more combinations of port types
are possible
- Note: if two write ports are in the same process, this defines a priority
relation between them (if both ports are active in the same clock, the later
one wins). On almost all targets, this will result in a bit of extra circuitry
to ensure the priority semantics. If this is not what you want, put them in
separate processes.
- Priority is not supported when using the verific front end and any priority semantics are ignored.
- Priority is not supported when using the verific front end and any priority
semantics are ignored.
TDP with different clocks, exclusive read/write
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -654,7 +680,8 @@ TDP with different clocks, exclusive read/write
TDP with same clock, read-first behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- This requires hardware inter-port read-first behavior, and will only work on some targets (Xilinx, Nexus)
- This requires hardware inter-port read-first behavior, and will only work on
some targets (Xilinx, Nexus)
.. code:: verilog
@ -677,9 +704,10 @@ TDP with same clock, read-first behavior
TDP with multiple read ports
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- The combination of a single write port with an arbitrary amount of read ports is supported on all
targets — if a multi-read port primitive is available (like Xilinx RAM64M), it'll be used as
appropriate. Otherwise, the memory will be automatically split into multiple primitives.
- The combination of a single write port with an arbitrary amount of read ports
is supported on all targets — if a multi-read port primitive is available
(like Xilinx RAM64M), it'll be used as appropriate. Otherwise, the memory
will be automatically split into multiple primitives.
.. code:: verilog

@ -9,9 +9,9 @@ This chapter outlines these optimizations.
The `opt` macro command
--------------------------------
The Yosys pass `opt` runs a number of simple optimizations. This
includes removing unused signals and cells and const folding. It is recommended
to run this pass after each major step in the synthesis script. As listed in
The Yosys pass `opt` runs a number of simple optimizations. This includes
removing unused signals and cells and const folding. It is recommended to run
this pass after each major step in the synthesis script. As listed in
:doc:`/cmd/opt`, this macro command calls the following ``opt_*`` commands:
.. literalinclude:: /code_examples/macro_commands/opt.ys
@ -69,17 +69,17 @@ undef.
The last two lines simply replace an `$_AND_` gate with one constant-1 input
with a buffer.
Besides this basic const folding the `opt_expr` pass can replace 1-bit
wide `$eq` and `$ne` cells with buffers or not-gates if one input is
constant. Equality checks may also be reduced in size if there are redundant
bits in the arguments (i.e. bits which are constant on both inputs). This can,
for example, result in a 32-bit wide constant like ``255`` being reduced to the
8-bit value of ``8'b11111111`` if the signal being compared is only 8-bit as in
Besides this basic const folding the `opt_expr` pass can replace 1-bit wide
`$eq` and `$ne` cells with buffers or not-gates if one input is constant.
Equality checks may also be reduced in size if there are redundant bits in the
arguments (i.e. bits which are constant on both inputs). This can, for example,
result in a 32-bit wide constant like ``255`` being reduced to the 8-bit value
of ``8'b11111111`` if the signal being compared is only 8-bit as in
:ref:`addr_gen_clean` of :doc:`/getting_started/example_synth`.
The `opt_expr` pass is very conservative regarding optimizing `$mux`
cells, as these cells are often used to model decision-trees and breaking these
trees can interfere with other optimizations.
The `opt_expr` pass is very conservative regarding optimizing `$mux` cells, as
these cells are often used to model decision-trees and breaking these trees can
interfere with other optimizations.
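
To make the equality-check reduction described above more concrete, here is a
minimal sketch of the kind of input it applies to. The module and signal names
are hypothetical and are not part of the Yosys example set:

.. code:: verilog

   // Hypothetical example module; names are placeholders.
   module eq_const (
       input  wire [7:0] a,
       output wire       y
   );
       // The unsized constant 255 is at least 32 bits wide, so the
       // comparison is elaborated at that width and ``a`` is zero-extended.
       // The upper bits are then constant zero on both inputs, which is
       // the redundancy that allows the check to be narrowed to 8 bits.
       assign y = (a == 255);
   endmodule

The exact cell widths before and after optimization depend on the front end
and on which passes have already run.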
.. literalinclude:: /code_examples/opt/opt_expr.ys
:language: Verilog
@ -100,9 +100,9 @@ identifies cells with identical inputs and replaces them with a single instance
of the cell.
The option ``-nomux`` can be used to disable resource sharing for multiplexer
cells (`$mux` and `$pmux`). This can be useful as it prevents multiplexer
trees from being merged, which might prevent `opt_muxtree` from identifying
possible optimizations.
cells (`$mux` and `$pmux`). This can be useful as it prevents multiplexer trees
from being merged, which might prevent `opt_muxtree` from identifying possible
optimizations.
.. literalinclude:: /code_examples/opt/opt_merge.ys
:language: Verilog
@ -128,9 +128,9 @@ Consider the following simple example:
:caption: example verilog for demonstrating `opt_muxtree`
The output can never be ``c``, as this would require ``a`` to be 1 for the outer
multiplexer and 0 for the inner multiplexer. The `opt_muxtree` pass
detects this contradiction and replaces the inner multiplexer with a constant 1,
yielding the logic for ``y = a ? b : d``.
multiplexer and 0 for the inner multiplexer. The `opt_muxtree` pass detects this
contradiction and replaces the inner multiplexer with a constant 1, yielding the
logic for ``y = a ? b : d``.
.. figure:: /_images/code_examples/opt/opt_muxtree.*
:class: width-helper invert-helper
@ -141,9 +141,9 @@ Simplifying large MUXes and AND/OR gates - `opt_reduce`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is a simple optimization pass that identifies and consolidates identical
input bits to `$reduce_and` and `$reduce_or` cells. It also sorts the input
bits to ease identification of shareable `$reduce_and` and `$reduce_or`
cells in other passes.
input bits to `$reduce_and` and `$reduce_or` cells. It also sorts the input bits
to ease identification of shareable `$reduce_and` and `$reduce_or` cells in
other passes.
This pass also identifies and consolidates identical inputs to multiplexer
cells. In this case the new shared select bit is driven using a `$reduce_or`
@ -162,8 +162,8 @@ This pass identifies mutually exclusive cells of the same type that:
a. share an input signal, and
b. drive the same `$mux`, `$_MUX_`, or `$pmux` multiplexing cell,
allowing the cell to be merged and the multiplexer to be moved from multiplexing
its output to multiplexing the non-shared input signals.
.. literalinclude:: /code_examples/opt/opt_share.ys
:language: Verilog
@ -176,16 +176,16 @@ multiplexing its output to multiplexing the non-shared input signals.
Before and after `opt_share`
When running `opt` in full, the original `$mux` (labeled ``$3``) is optimized
away by `opt_expr`.
Performing DFF optimizations - `opt_dff`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This pass identifies single-bit D-type flip-flops (`$_DFF_`, `$dff`, and `$adff`
cells) with a constant data input and replaces them with a constant driver. It
can also merge clock enables and synchronous reset multiplexers, removing unused
control inputs.
Called with ``-nodffe`` and ``-nosdff``, this pass is used to prepare a design
for :doc:`/using_yosys/synthesis/fsm`.
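
A minimal sketch of that use, assuming the design has already been through
`proc`:

.. code-block:: yoscrypt

   # flip-flops with constant inputs are still removed, but no enable or
   # synchronous reset logic is folded into the flip-flops yet
   opt_dff -nodffe -nosdff

   # extract and optimize finite state machines
   fsm

   # afterwards the full set of DFF optimizations can be applied
   opt_dff
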
@ -200,20 +200,20 @@ attribute can be used for debugging or by other optimization passes.
When to use `opt` or `clean`
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Usually it does not hurt to call `opt` after each regular command in the
synthesis script. However, doing so increases synthesis time, so it is
preferable to call `opt` only where an improvement can be expected.
It is generally a good idea to call `opt` before inherently expensive commands
such as `sat` or `freduce`, as the possible gain in these cases is much higher
than the possible loss.
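
For instance, a script might tidy up the design right before an expensive
pass. This is only a sketch: the file name ``design.v`` and the top module
``top`` are placeholders.

.. code-block:: yoscrypt

   read_verilog design.v
   hierarchy -check -top top
   proc

   # cheap clean-up first, so freduce has less logic to analyse
   opt
   freduce

   # remove any wires or cells that are now unused
   clean
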
The `clean` command, on the other hand, is an alias for `opt_clean` with less
output and is very fast. Many commands leave a mess (dangling signal wires,
etc.): most do not remove any wires or cells, they just change the connections
and depend on a later call to `clean` to get rid of the now unused objects. So
the occasional ``;;``, which itself is an alias for `clean`, is a good idea in
every synthesis script, e.g.:
.. code-block:: yoscrypt

@ -5,23 +5,23 @@ Converting process blocks
:language: yoscrypt
The Verilog frontend converts ``always``-blocks to RTL netlists for the
expressions and "processes" for the control- and memory elements. The `proc`
command then transforms these "processes" to netlists of RTL multiplexer and
register cells. It is also a macro command that calls the other ``proc_*``
commands in a sensible order:
.. literalinclude:: /code_examples/macro_commands/proc.ys
:language: yoscrypt
:start-after: #end:
:caption: Passes called by `proc`
After all the ``proc_*`` commands, `opt_expr` is called. This can be disabled by
calling :yoscrypt:`proc -noopt`. For more information about `proc`, such as
disabling certain sub commands, see :doc:`/cmd/proc`.
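
As a sketch, the trailing `opt_expr` call can be skipped and run by hand
later, for example to inspect the unoptimized result first. The exact options
that `proc` passes to `opt_expr` are not reproduced here; see
:doc:`/cmd/proc` for those.

.. code-block:: yoscrypt

   # run the proc_* passes only
   proc -noopt

   # ... inspect the design at this point, e.g. with show or dump ...

   # then perform the expression optimization explicitly
   opt_expr
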
Many commands cannot operate on modules with "processes" in them. Usually a
call to `proc` is the first command in the actual synthesis procedure after
design elaboration.
Example
^^^^^^^