The ID(OVERFLOW) IdString isn't used widely enough that we require a
statically allocated IdString, but I think it's good to have an example
workaround in place in case more collisions come up.
I've used this shell command to obtain the list:
rg -I -t cpp -t yacc -o \
'ID\((\$?[a-zA-Z0-9_]+)\)|ID::($?[a-zA-Z0-9_]+)' -r 'X($1$2)' \
| LC_ALL=C sort -u
This removed the entries X(_TECHMAP_FAIL_) and X(nomem2init).
The vast majority of ID(...) uses are in a context that is overloaded
for StaticIdString or will cause implicit conversion to an IdString
constant reference. For some sufficiently overloaded contexts, implicit
conversion may fail, so it's useful to have a method to force obtaining
an `IdString const &` from an ID(...) use.
When turning all literal IdStrings of the codebase into StaticIdStrings
this was needed in exactly one place, for which this commit adds an
`id_string()` call.
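As a rough illustration of why an explicit accessor helps, here is a minimal sketch. The two structs are hypothetical stand-ins, not the real `IdString` and `StaticIdString` classes, and `name_of()`, `describe()` and `id_string_demo()` are names made up for this example. It uses template argument deduction as one case where implicit conversion is not applied; the one place in the codebase that needed the call may fail for a different reason.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; the real classes differ.
struct IdString {
    std::string str;
};

struct StaticIdString {
    IdString value;
    // Implicit conversion covers most call sites.
    operator const IdString &() const { return value; }
    // Explicit accessor for contexts where implicit conversion is not applied.
    const IdString &id_string() const { return value; }
};

// Non-template parameter: the implicit conversion kicks in.
inline std::string name_of(const IdString &id) { return id.str; }

// Template parameter: T is deduced from the argument, so no conversion happens.
template <typename T>
std::string describe(const T &id) { return id.str; }

inline void id_string_demo() {
    StaticIdString add{IdString{"$add"}};
    assert(name_of(add) == "$add");               // implicit conversion
    assert(describe(add.id_string()) == "$add");  // forced conversion
    // describe(add) would not compile here: T deduces to StaticIdString,
    // which has no `str` member in this sketch.
}
```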
The simple XOR `commutative_eat()` implementation produces a lot of collisions.
https://www.preprints.org/manuscript/201710.0192/v1/download is a useful reference on this topic.
Running the included `hashTest.cc` without the hashlib changes, I get 49,580,349 collisions.
The 49,995,000 (i,j) pairs (0 <= i < 10000, i < j < 10000) hash into only 414,651 unique hash values.
We get simple collisions like (0,1) colliding with (2,3).
With the hashlib changes, we get only 707,099 collisions and 49,287,901 unique hash values.
Much better! The `commutative_hash` implementation corresponds to `Sum(4)` in the paper
mentioned above.
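To make the (0,1) versus (2,3) collision concrete, here is a small sketch of an XOR-only combiner. It assumes, purely for illustration, that each integer hashes to itself, which is not how hashlib hashes elements, so the counts it produces will not match the `hashTest.cc` numbers above. `xor_combine()` and `count_unique_xor()` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Combine two element hashes with plain XOR (order-independent, but weak).
inline std::uint32_t xor_combine(std::uint32_t a, std::uint32_t b) {
    return a ^ b;
}

// Count distinct combined values over all pairs (i, j) with 0 <= i < j < n,
// assuming (for illustration only) that each integer hashes to itself.
inline std::size_t count_unique_xor(std::uint32_t n) {
    assert(n >= 2);
    std::unordered_set<std::uint32_t> seen;
    for (std::uint32_t i = 0; i < n; i++)
        for (std::uint32_t j = i + 1; j < n; j++)
            seen.insert(xor_combine(i, j));
    return seen.size();
}
```

With n = 4 there are six pairs but only three distinct combined values (1, 2 and 3), because 0^1 == 2^3, 0^2 == 1^3 and 0^3 == 1^2.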
`CellTypes::eval()` is more generic but also more limited. `ConstEval::eval()` requires more setup (both in code and at runtime) but has more complete support.
Still unsupported:
- wide muxes (`$_MUX16_` and friends)
Partially supported types have comments in `test_cell.cc`.
Fix `CellTypes::eval()` for `$_NMUX_`.
Fix `RTLIL::Cell::fixup_parameters()` for `$concat`, `$bwmux` and `$bweqx`.
Conditionally include help source tracking to preserve ABI.
Docs builds can (and should) use `ENABLE_HELP_SOURCE` so that the generated sphinx docs can perform default grouping and link to source files.
Regular user-builds don't need the source tracking.
The `_content` vector owns its elements, so deleting a `ContentListing` also deletes its content.
Remove `get_content()` method in favour of `begin()` and `end()` const iterators.
More `const` in general, and iterations over `ContentListing` use `&content` to avoid copying data.
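The ownership and iteration pattern described here could look roughly like the sketch below. `Listing`, `total_length()` and `listing_demo()` are hypothetical simplifications, not the real `ContentListing` interface; in particular, storing children by value is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplification of a listing that owns its children.
class Listing {
    std::string _body;
    std::vector<Listing> _content;  // owned by value: destroyed with the parent
public:
    explicit Listing(std::string body) : _body(std::move(body)) {}

    // Take ownership of a child listing.
    void add(Listing child) { _content.push_back(std::move(child)); }

    const std::string &body() const { return _body; }

    // Read-only iteration replaces a getter that exposes the vector itself.
    auto begin() const { return _content.begin(); }
    auto end() const { return _content.end(); }
};

// Sum the body lengths of a listing and everything below it.
inline std::size_t total_length(const Listing &root) {
    std::size_t n = root.body().size();
    for (const auto &content : root)  // reference: no copy of the subtree
        n += total_length(content);
    return n;
}

inline void listing_demo() {
    Listing child("cde");
    child.add(Listing("f"));
    Listing root("ab");
    root.add(std::move(child));
    assert(root.end() - root.begin() == 1);  // one direct child
    assert(total_length(root) == 6);         // "ab" + "cde" + "f"
}
```

Writing `for (auto content : root)` instead would copy each child together with its whole subtree on every iteration.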
dict is pretty slow when you don't ever need to iterate the container in
order. And the hash function for char* in dict hashes every single
byte in the string, likely doing significantly more work than std::hash.
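One thing to keep in mind when comparing the two: `std::hash` for a pointer type hashes the pointer value, not the bytes it points to, and the default key equality compares addresses as well. The sketch below shows that behaviour with `std::unordered_map`; it is an assumption on my part that this is the container being compared against, and `distinct_pointer_keys()` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Returns the number of entries after inserting two keys that hold the same
// bytes but live at different addresses.
inline std::size_t distinct_pointer_keys() {
    static const char a[] = "opt";
    static const char b[] = "opt";  // same contents, distinct object
    std::unordered_map<const char *, int> counts;
    counts[a] = 1;
    counts[b] = 2;
    assert(counts.at(a) == 1);  // `b` did not overwrite the entry for `a`
    return counts.size();
}
```

So this kind of map is only a drop-in replacement when every lookup uses the same pointer that was used for insertion.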
Checking only happens at compile time if -std=c++20 (or greater) is enabled. Otherwise
the checking happens at run time.
This requires the format string to be a compile-time constant (when compiling with
C++20), so fix a few places where that isn't true.
The format string behavior is a bit more lenient than C printf. For %d/%u
you can pass any integer type and it will be converted and output without
truncating bits, i.e. any length specifier is ignored and the conversion is
always treated as 'll'. Any truncation needs to be done by casting the argument itself.
For %f/%g you can pass anything that converts to double, including integers.
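The integer rule can be mimicked with a small helper. This is only a sketch of the described behaviour, not the actual `stringf()` implementation, and `format_integer()` is a hypothetical name: every integer argument is widened to 64 bits and printed in full, so truncation only happens if the caller casts the argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <type_traits>

// Hypothetical helper mimicking the described %d/%u rule: widen any integer
// to 64 bits and print it in full, as if the conversion were always 'll'.
template <typename T>
std::string format_integer(T value) {
    static_assert(std::is_integral<T>::value, "integer argument expected");
    char buf[32];  // enough for a sign, 20 digits and the terminator
    int len;
    if (std::is_signed<T>::value)
        len = std::snprintf(buf, sizeof buf, "%lld", static_cast<long long>(value));
    else
        len = std::snprintf(buf, sizeof buf, "%llu", static_cast<unsigned long long>(value));
    assert(len > 0 && len < static_cast<int>(sizeof buf));
    return std::string(buf, static_cast<std::size_t>(len));
}
```

For example, `format_integer(std::int64_t{1} << 40)` yields "1099511627776", while `format_integer(static_cast<std::uint8_t>(300))` yields "44" because the cast truncates the value before it is formatted.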
Performance results with clang 19 -O3 on Linux:
```
hyperfine './yosys -dp "read_rtlil /usr/local/google/home/rocallahan/Downloads/jpeg.synth.il; dump"'
```
C++17 before: Time (mean ± σ): 101.3 ms ± 0.8 ms [User: 85.6 ms, System: 15.6 ms]
C++17 after: Time (mean ± σ): 98.4 ms ± 1.2 ms [User: 82.1 ms, System: 16.1 ms]
C++20 before: Time (mean ± σ): 100.9 ms ± 1.1 ms [User: 87.0 ms, System: 13.8 ms]
C++20 after: Time (mean ± σ): 97.8 ms ± 1.4 ms [User: 83.1 ms, System: 14.7 ms]
The generated code is reasonably efficient. E.g. with clang 19, `stringf()` with a format
with no %% escapes and no other parameters (a weirdly common case) often compiles to a fully
inlined `std::string` construction. In general the format string parsing is often (not always)
compiled away.
If you have a large design with a lot of modules and you use the Verilog
backend to emit modules one at a time to separate files, performance is
very poor. The problem is that the Verilog backend calls `design->sort()`
every time, which sorts the contents of all modules, and this is slow
even when everything is already sorted.
We can easily fix this by only sorting the contents of modules that
we're actually going to emit.