mirror of https://github.com/YosysHQ/yosys synced 2025-10-25 00:44:37 +00:00
Commit graph

256 commits

Author SHA1 Message Date
whitequark
4aa65f406f cxxrtl: treat internal wires used only for debug as constants.
Fixes #2739 (again).
2021-07-17 14:23:57 +00:00
whitequark
2db4137514 Merge pull request #2874 from whitequark/cxxrtl-fix-2589
cxxrtl: run hierarchy pass regardless of (*top*) attribute presence
2021-07-16 11:12:19 +00:00
whitequark
efc43270fa Merge pull request #2873 from whitequark/cxxrtl-fix-2500
cxxrtl: emit debug items for unused public wires
2021-07-16 11:01:10 +00:00
whitequark
5b003d6e5c cxxrtl: run hierarchy pass regardless of (*top*) attribute presence.
The hierarchy pass does a lot more than just finding the top module,
mainly resolving implicit (positional, wildcard) module connections.

Fixes #2589.
2021-07-16 10:27:47 +00:00
whitequark
09218896d6 cxxrtl: emit debug items for unused public wires.
This greatly improves debug information coverage.

Fixes #2500.
2021-07-16 10:14:40 +00:00
whitequark
b28ca7f5ac cxxrtl: don't expect user cell inputs to be wires.
Ports can be connected to constants, too. (Usually resets.)

Fixes #2521.
2021-07-16 09:51:52 +00:00
whitequark
44a3d924ce cxxrtl: don't mark buffered internal wires as UNUSED for debug.
Public wires may alias buffered internal wires, so keep BUFFERED
wires in debug information even if they are private. Debug items are
only created for public wires, so this does not otherwise affect how
debug information is emitted.

Fixes #2540.
Fixes #2841.
2021-07-16 07:54:49 +00:00
whitequark
54b6cb645f cxxrtl: mark dead local wires as unused even with inlining disabled.
Fixes #2739.
2021-07-15 22:27:27 +00:00
Marcelina Kościelnicka
8bf9cb407d kernel/mem: Add a coalesce_inits helper.
While this helper is already useful to squash sequential initializations
into one in cxxrtl, its main purpose is to squash overlapping masked memory
initializations (when they land) and avoid having to deal with them in
cxxrtl runtime.
2021-07-13 15:59:11 +02:00
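As an illustration of what squashing sequential initializations means, here is a minimal sketch. It is not the Yosys `coalesce_inits` helper itself: the `Init` struct and `coalesce_sketch` function are hypothetical names, words are modeled as bytes, and masked initializations are not handled. Later initializations overwrite earlier ones, and adjacent addresses are merged into one run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model of one memory initialization: a start address and a
// run of consecutive words (bytes here, for brevity).
struct Init {
    std::size_t addr;
    std::vector<std::uint8_t> words;
};

// Merge initializations so that each contiguous address range is described
// by exactly one Init. Inputs are applied in order, so a later Init wins
// wherever two of them overlap.
std::vector<Init> coalesce_sketch(const std::vector<Init> &inits) {
    std::map<std::size_t, std::uint8_t> cells; // address -> final word
    for (const Init &init : inits)
        for (std::size_t i = 0; i < init.words.size(); i++)
            cells[init.addr + i] = init.words[i];

    std::vector<Init> result;
    for (const auto &cell : cells) {
        // Start a new run unless this address directly follows the last one.
        if (result.empty() ||
            result.back().addr + result.back().words.size() != cell.first)
            result.push_back(Init{cell.first, {}});
        result.back().words.push_back(cell.second);
    }
    return result;
}
```

With this shape, a consumer such as a simulator runtime only ever sees disjoint, non-adjacent ranges and needs no overlap handling of its own.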
Marcelina Kościelnicka
37506d737c cxxrtl: Support memory writes in processes. 2021-07-12 18:27:48 +02:00
Marcelina Kościelnicka
af7fa62251 cxxrtl: Add support for memory read port reset. 2021-07-12 18:27:48 +02:00
Marcelina Kościelnicka
be5cf29699 cxxrtl: Add support for mem read port initial data. 2021-07-12 18:27:48 +02:00
Marcelina Kościelnicka
d5c9595668 cxxrtl: Convert to Mem helpers.
This *only* does conversion, but doesn't add any new functionality —
support for memory read port init/reset is still upcoming.
2021-07-12 18:27:48 +02:00
whitequark
ab76d9cec5 cxxrtl: don't assert on edge sync rules tied to a constant.
These are commonly the result of tying an async reset to an inactive
level.
2021-03-07 14:29:30 +00:00
whitequark
d1de08e38a cxxrtl: allow always sync rules in debug_eval.
These can be produced from `always @*` processes, if `-noproc`
is used.
2021-03-07 14:28:45 +00:00
whitequark
9dd813374e Merge pull request #2635 from whitequark/cxxrtl-memrd-async-addr
cxxrtl: follow aliases to outlines when emitting $memrd.ADDR
2021-03-05 05:30:19 -08:00
whitequark
06da2e0f18 Merge pull request #2634 from whitequark/cxxrtl-debug-wire-types
cxxrtl: add pass debug flag to show assigned wire types
2021-03-05 04:57:22 -08:00
whitequark
14ce8bdaa6 cxxrtl: follow aliases to outlines when emitting $memrd.ADDR. 2021-03-05 12:09:02 +00:00
whitequark
8471808834 cxxrtl: add pass debug flag to show assigned wire types.
Refs #2543.
2021-03-05 11:58:59 +00:00
whitequark
a9a873a1d2 cxxrtl: don't crash on empty designs. 2021-03-05 11:05:19 +00:00
whitequark
a77fa6709b Merge pull request #2563 from whitequark/cxxrtl-msvc
cxxrtl: do not use `->template` for non-dependent names
2021-01-26 21:55:12 +00:00
whitequark
4b6e764c46 cxxrtl: do not use ->template for non-dependent names.
Using it there breaks the build on MSVC but not on GCC/Clang.
2021-01-26 18:09:53 +00:00
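To show the distinction the commit above relies on, here is a small sketch of where the `template` disambiguator belongs. The `Val` type and both functions are hypothetical and only loosely resemble CXXRTL's value templates.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-width value type with a member function template.
template<std::size_t Bits>
struct Val {
    unsigned data;

    template<std::size_t NewBits>
    Val<NewBits> zext() const { return Val<NewBits>{data}; }
};

// Dependent name: the type of `v` depends on the template parameter N, so
// the compiler needs `template` to parse `zext<16>` as a template call.
template<std::size_t N>
Val<16> widen_dependent(const Val<N> *v) {
    return v->template zext<16>();
}

// Non-dependent name: the type of `v` is fully known here, so the plain
// member call is the portable spelling.
inline Val<16> widen_concrete(const Val<8> *v) {
    return v->zext<16>();
}
```

Both functions return the same result; only the first one needs the keyword.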
Iris Johnson
c8415884d1 Improves the previous commit with a more complete coverage of the cases 2021-01-15 13:59:20 -06:00
Iris Johnson
86607d0fdc Handle sliced bits as clock inputs (fixes #2542) 2021-01-14 16:36:21 -06:00
whitequark
f14074d2c2 cxxrtl: don't crash generating debug information for unused wires. 2020-12-22 06:51:38 +00:00
whitequark
7378194169 cxxrtl: split processes into sync and case nodes.
Similar to the treatment of black boxes, splitting processes into two
scheduling nodes adds sufficient freedom so that netlists with
well-behaved processes (e.g. those emitted by nMigen) can immediately
converge.

Because processes are not emitted into edge-triggered regions, this
approach has comparable performance to -O5 (without -noproc), which
is substantially slower than -O6.
2020-12-22 03:48:09 +00:00
whitequark
b2221c1077 cxxrtl: completely rewrite netlist layout code.
The exact shape of C++ code emitted by CXXRTL has a critical effect
on performance, both compile-time and runtime. CXXRTL's performance
greatly improved when it started localizing and inlining wires, not
only because this assists the optimizer and register allocator, but
also because inlining code into edge-triggered regions cuts the time
spent in eval() by at least a factor of two.

However, the logic of netlist layout has always been ad-hoc, fragile,
and very hard to understand and modify. After commit ece25a45, which
introduced outlining, the same logic started being applied to two
distinct netlists at once instead of one, which barely worked.

This commit makes four major changes:
  * There is now a single unambiguous source of truth (per subgraph)
    for the layout of any emitted wire.
  * Netlist layout is now done entirely during analysis using well
    known graph algorithms; no graph operations happen when emitting.
  * Netlist layout now happens completely separately for eval() and
    debug_eval() subgraphs.
  * Unreachable (within subgraph scope) netlist nodes are now neither
    emitted nor considered for wire inlining decisions.
The netlist layout code should also now closely match the described
semantics.

As a part of this large cleanup, it includes many miscellaneous
improvements:
  * The "bare minimum" debug level introduced in commit dd6a761d was
    split into two levels; -g1 now emits debug information *only* for
    inputs and state wires, and -g2 now emits debug information for
    all public members. The old behavior matches -g2. This is done
    to avoid bloat on low optimization levels.
  * Debug aliases and inlined connections are now handled separately,
    and complex RHS never interferes with inlined connections.
  * Aliases to outlined wires now carry a pointer to the outline.
  * Cell sync outputs can now be emitted in debug_eval().
  * Black box debug information now includes comb/sync driver flags.
  * The comment emitted for inlined cells is now accurate.
  * Debug information statistics now has less noise.
  * Netlist layout code is now much better documented.

Due to more precise inlining decisions, unmodified (i.e. with no
Yosys script being used) netlists now have much more logic inlined
into edge-triggered regions. On Minerva SoC SRAM, this improves
runtime by 20-25% across compilers and optimization levels.

Due to more precise reachability analysis, much less C++ code is now
emitted, especially at the maximum debug level. On Minerva SoC SRAM,
this improves clang compile time by 30-50% depending on options.
gcc is not affected.
2020-12-22 03:48:09 +00:00
whitequark
e825cf9d73 cxxrtl: simplify logic choosing wire type. NFCI. 2020-12-21 07:24:52 +00:00
whitequark
6f42b26cea cxxrtl: clarify node use-def construction. NFCI. 2020-12-21 07:24:52 +00:00
whitequark
406f866659 cxxrtl: fix typo. 2020-12-21 07:24:52 +00:00
whitequark
b9721bedf0 cxxrtl: speed up bit repeats (sign extends, etc).
On Minerva SoC SRAM, depending on the compiler, this change improves
overall time by 4-7%.
2020-12-21 02:20:34 +00:00
whitequark
40ca9d038b cxxrtl: speed up commits on clang.
On Minerva SoC SRAM compiled with clang-11, this change cuts commit
time in half (!) and overall time by 20%. When compiled with gcc-10,
there is no difference.
2020-12-21 02:20:30 +00:00
whitequark
3d3ea5099d cxxrtl: use static inline instead of inline in the C API.
In C, non-static inline functions require an implementation elsewhere
(even though the body is right there in the header). It is basically
never desirable to use those as opposed to static inline ones.
2020-12-20 14:48:16 +00:00
whitequark
d889a3df35 cxxrtl: print names of cells inlined in connections. 2020-12-15 11:02:38 +00:00
whitequark
f75bc6c7aa cxxrtl: disable optimization of debug_items().
Implementing outlining has greatly increased the amount of debug
information in a typical build, and consequently exposed performance
issues in C++ compilers, which are similar for both GCC and Clang;
the compile time of Minerva SoC SRAM increased almost twofold.

Although one would expect the slowdown to be caused by the increased
use of templates in `debug_eval()`, it is actually almost entirely
attributable to optimizations and codegen for `debug_items()`.

Fortunately, it is neither possible nor desirable to optimize
`debug_items()`: in most cases it is called exactly once, and its
body is a linear sequence of calls with unique arguments.

This commit turns off optimizations for `debug_items()` on GCC and
Clang, improving -Os compile time of Minerva SoC SRAM by ~40% (!)
2020-12-15 11:02:38 +00:00
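A minimal sketch of how a single function can be excluded from optimization on GCC and Clang, in the spirit of the commit above. The macro name `SKETCH_NO_OPT` and the function are hypothetical; only the `optnone` (Clang) and `optimize` (GCC) function attributes are the compilers' own. On other compilers the macro expands to nothing.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical macro name wrapping the compiler-specific attributes.
#if defined(__clang__)
#define SKETCH_NO_OPT __attribute__((optnone))
#elif defined(__GNUC__)
#define SKETCH_NO_OPT __attribute__((optimize(0)))
#else
#define SKETCH_NO_OPT
#endif

// A function shaped like the one the commit describes: called once, with a
// body that is a linear sequence of calls with unique arguments. Optimizing
// it costs compile time and gains nothing at runtime.
SKETCH_NO_OPT
void debug_items_sketch(std::map<std::string, int> &items) {
    items["top clk"] = 1;
    items["top rst"] = 1;
    items["top data"] = 8;
}
```

The rest of the translation unit is still compiled with whatever optimization level was requested on the command line.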
whitequark
4d40595d64 cxxrtl: make alias analysis outlining-aware.
Before this commit, if a sequence of wires assigned in a chain would
terminate on a cell, none of the wires would get marked as aliases,
and typically all of the public wires would get outlined. The reason
for this behavior is that alias analysis predates outlining and in
fact runs before it.

After this commit, alias analysis runs after outlining and considers
outlined wires valid aliasees. More importantly, if the chained wires
contain any valid aliasees, then all of the wires are aliased to
the one that is topologically deepest.

Aliased wires incur virtually no overhead for the VCD writer, unlike
outlined wires that would otherwise take their place. On Minerva SoC
SRAM, size of the full VCD dump is reduced by ~65%, and throughput
is increased by ~55%.
2020-12-15 11:02:38 +00:00
whitequark
dd6a761db0 cxxrtl: add a "bare minimum" debug information level.
Useful to reduce overhead when no debug capabilities are necessary
except for access to design state.
2020-12-14 01:27:56 +00:00
whitequark
ece25a45d4 cxxrtl: implement debug information outlining.
Aggressive wire localization and inlining is necessary for CXXRTL to
achieve high performance. However, that comes with a cost: reduced
debug information coverage. Previously, as a workaround, the `-Og`
option could have been used to guarantee complete coverage, at a cost
of a significant performance penalty.

This commit introduces debug information outlining. The main eval()
function is compiled with the user-specified optimization settings.
In tandem, an auxiliary debug_eval() function, compiled from the same
netlist, can be used to reconstruct the values of localized/inlined
signals on demand. To the extent that it is possible, debug_eval()
reuses the results of computations performed by eval(), only filling
in the missing values.

Benchmarking a representative design (Minerva SoC SRAM) shows that:
  * Switching from `-O4`/`-Og` to `-O6` reduces runtime by ~40%.
  * Switching from `-g1` to `-g2`, both used with `-O6`, increases
    compile time by ~25%.
  * Although `-g2` increases the resident size of generated modules,
    this has no effect on runtime.

Because the impact of `-g2` is minimal and the benefits of having
unconditional 100% debug information coverage (and the performance
improvement as well) are major, this commit removes `-Og` and changes
the defaults to `-O6 -g2`.

We'll have our cake and eat it too!
2020-12-14 01:27:27 +00:00
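The split between eval() and debug_eval() described above can be pictured with a toy module. This is a hand-written sketch, not generated CXXRTL output; the struct and its member names are hypothetical.

```cpp
#include <cassert>

// Toy design: sum = a + b; out = sum * c.
struct AdderSketch {
    unsigned a = 0, b = 0, c = 0; // inputs
    unsigned out = 0;             // output, always computed by eval()
    unsigned dbg_sum = 0;         // outlined wire, filled in on demand only

    // Hot path: `sum` has been inlined into the expression for `out`, so
    // it does not exist as a separate stored value.
    void eval() { out = (a + b) * c; }

    // Cold path: reconstructs the inlined wire from values that eval()
    // already left behind, without touching `out`.
    void debug_eval() { dbg_sum = a + b; }
};
```

A debugger or waveform writer would call debug_eval() only when it actually needs the intermediate value, so the hot path keeps its optimized shape.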
whitequark
3b5a1314cd cxxrtl: rename "elision" to "inlining". NFC.
"Elision" in this context is an unusual and not very descriptive term
whereas "inlining" is common and straightforward. Also, introducing
"inlining" makes it easier to introduce its dual under the obvious
name "outlining".
2020-12-13 15:34:00 +00:00
whitequark
57759c3d1f cxxrtl: fix outdated comment. NFC. 2020-12-13 15:33:58 +00:00
whitequark
ac1a78923a cxxrtl: use IdString::isPublic(). NFC. 2020-12-13 15:33:55 +00:00
whitequark
e4aa8bc96b cxxrtl: don't overwrite buffered inputs.
Before this commit, a cell's input was always assigned like:

    p_cell.p_input = (value...);

If `p_input` is buffered (e.g. if the design is built at -O0), this
is not correct. (In practice, this breaks clocking.) Unfortunately,
the incorrect design was compiled without diagnostics because wire<>
was move-assignable and also implicitly constructible from value<>.

After this commit, cell inputs are no longer incorrectly assumed to
always be unbuffered, and wires are not assignable from values.
2020-12-11 23:32:06 +00:00
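The failure mode described above can be reproduced in miniature. In the sketch below, `Value` and `Wire` are simplified, hypothetical stand-ins for CXXRTL's `value<>` and `wire<>`; the point is that an explicit constructor plus deleted assignment operators turns `wire = value` into a compile error, forcing writers to go through the buffered `next` slot.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for value<>: a plain bit vector.
struct Value {
    std::uint32_t data;
    explicit Value(std::uint32_t d = 0) : data(d) {}
};

// Hypothetical stand-in for a buffered wire<>.
struct Wire {
    Value curr; // what readers see during the current delta cycle
    Value next; // what readers will see after commit()

    Wire() = default;
    // Explicit, so a Value never converts to a Wire implicitly.
    explicit Wire(const Value &init) : curr(init), next(init) {}
    Wire(const Wire &) = delete;
    Wire &operator=(const Wire &) = delete;
    Wire &operator=(Wire &&) = delete;

    // The only way to write a buffered wire.
    void set(const Value &v) { next = v; }

    // Makes the pending value visible; returns true if anything changed.
    bool commit() {
        if (curr.data == next.data)
            return false;
        curr = next;
        return true;
    }
};

// `some_wire = some_value;` is rejected at compile time.
static_assert(!std::is_assignable<Wire &, Value>::value,
              "wires must not be assignable from values");
```

If the constructor were implicit and the move assignment were left available, `some_wire = some_value` would compile and replace both `curr` and `next` at once, which is the silent overwrite the commit message describes.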
whitequark
e89f6ae819 cxxrtl: allow customizing the root module path in the C API. 2020-12-03 01:58:02 +00:00
whitequark
3e13cfe53d Merge pull request #2468 from whitequark/cxxrtl-assert
cxxrtl: use CXXRTL_ASSERT for RTL contract violations instead of assert
2020-12-02 23:36:22 +00:00
whitequark
3cb109f54b Merge pull request #2469 from whitequark/cxxrtl-no-clk
cxxrtl: fix crashes caused by a floating or constant clock input
2020-12-02 23:36:03 +00:00
whitequark
7067f0d788 cxxrtl: fix crashes caused by a floating or constant clock input.
E.g. in:

    module test;
        wire clk = 0;
        reg data;
        always @(posedge clk)
            data <= 0;
    endmodule
2020-12-02 21:43:25 +00:00
whitequark
aa0a15a42c cxxrtl: use CXXRTL_ASSERT for RTL contract violations instead of assert.
RTL contract violations and C++ contract violations are different:
the former depend on the netlist and will never violate memory safety
whereas the latter may. When loading a CXXRTL simulation into another
process, RTL contract violations should generally not crash it, while
C++ contract violations should.
2020-12-02 19:41:00 +00:00
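One way to picture the two kinds of contract is a host application that turns RTL contract violations into a recoverable error while leaving C++ contract violations fatal. The sketch below uses a hypothetical macro, exception type, and function; it is not CXXRTL's actual definition of `CXXRTL_ASSERT`.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical exception type for violations that depend on the netlist.
struct rtl_contract_violation : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical host-side policy: report the violation, do not abort.
#define SKETCH_RTL_ASSERT(cond) \
    do { if (!(cond)) throw rtl_contract_violation(#cond); } while (0)

// Toy one-hot multiplexer. Whether `sel` is one-hot depends purely on the
// simulated design, and a bad value cannot corrupt host memory, so it is
// checked with the RTL flavor. The null check guards the C++ API itself,
// so it stays a plain assert().
inline unsigned onehot_mux(const unsigned *inputs, unsigned sel) {
    assert(inputs != nullptr);                // C++ contract
    SKETCH_RTL_ASSERT(sel == 1 || sel == 2);  // RTL contract
    return sel == 1 ? inputs[0] : inputs[1];
}
```

A process that loads several simulations can catch `rtl_contract_violation`, report the faulty design, and keep running.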
whitequark
5beab5bc17 cxxrtl: provide a way to perform unobtrusive power-on reset.
Although it is always possible to destroy and recreate the design to
simulate a power-on reset, this has two drawbacks:
  * Black boxes are also destroyed and recreated, which causes them
    to reacquire their resources, which might be costly and/or erase
    important state.
  * Pointers into the design are invalidated and have to be acquired
    again, which is costly and might be very inconvenient if they are
    captured elsewhere (especially through the C API).
2020-12-02 08:25:27 +00:00
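The two drawbacks listed above suggest resetting state in place instead of recreating the object. The following sketch shows that idea on a toy module; the struct and its members are hypothetical and much simpler than a generated CXXRTL design.

```cpp
#include <cassert>

// Toy design with one state register and a handle to an external resource,
// standing in for something a black box acquired at construction time.
struct CounterSketch {
    unsigned count = 0;
    int *external_resource;

    explicit CounterSketch(int *res) : external_resource(res) {}

    void step() { count++; }

    // Power-on reset in place: design state returns to its initial value,
    // while the object, its address, and the external resource are kept.
    void reset() { count = 0; }
};
```

Because the object is never destroyed, a pointer to `count` taken before the reset (for example by a C API client) still refers to the same storage afterwards.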
whitequark
65083e9520 cxxrtl: run hierarchy -auto-top if no top module is present.
In most cases, a CXXRTL simulation would use a top module, either
because this module serves as an entry point to the CXXRTL C API,
or because the outputs of a top module are unbuffered, improving
performance. Taking this into account, the CXXRTL backend now runs
`hierarchy -auto-top` if there is no top module. For the few cases
where this behavior is unwanted, it now accepts a `-nohierarchy`
option.

Fixes #2373.
2020-11-02 19:18:56 +00:00
whitequark
2ba05f5c31 cxxrtl: don't assert on non-constant $meminit inputs.
Fixes #2129.
2020-11-01 15:57:20 +00:00