Currently the order of extraction can vary based on which ABC runs finish first. That's
nondeterministic, therefore bad. Instead, force the processing to happen in the same order
as `assigned_cells`, i.e. the same order we use when not using parallelism. This should
make everything deterministic.
Note that we still allow ABC runs to complete out of order. Out-of-order results are
just not extracted until all the previous runs have completed and their results
extracted.
1) Change token from ABC_DONE to YOSYS_ABC_DONE to be a bit more robust against false matches.
2) Emit the token from the sourced script so that we don't have to worry about it showing up in the echoing
of the command as it executes. It will only appear in ABC stdout when it executes, i.e. when
our script has completed.
3) `set abcout` doesn't actually switch ABC to line buffering on stdout, since HAVE_SETVBUF is not actually
set in ABC builds in general. So stop using that. ABC does the necessary flushing when
`source` has finished.
With the updated bufnorm code, buffered 'z drivers are used as anchor
points for undirected connections. These are currently not supported by
read/write_xaiger2, so we temporarily replace those by roughly
equivalent $tribuf cells which will be handled as blackboxes that
properly roundtrip through the xaiger2 front and backend.
This ensures that entering and leaving bufnorm followed by `opt_clean`
is equivalent to just running `opt_clean`.
Also make sure that 'z-$buf cells get techmapped in a compatible way.
Doing ABC runs in parallel can actually make things slower when every ABC run requires
spawning an ABC subprocess --- especially when using popen(), which on glibc does not
use vfork(). What seems to happen is that constant fork()ing keeps making the main
process data pages copy-on-write, so the main process code that is setting up each ABC
call takes a lot of minor page-faults, slowing it down.
The solution is pretty straightforward although a little tricky to implement.
We just reuse ABC subprocesses. Instead of passing the ABC script name on the command
line, we spawn an ABC REPL and pipe a command into it to source the script. When that's
done we echo an `ABC_DONE` token instead of exiting. Yosys then puts the ABC process
onto a stack which we can pull from the next time we do an ABC run.
For one of our large designs, this is an additional 5x speedup of the primary AbcPass.
It does 5155 ABC runs, all very small; runtime of the AbcPass goes from 760s to 149s
(not very scientific benchmarking but the effect size is large).
Large circuits can run hundreds or thousands of ABCs in a single AbcPass.
For some circuits, some of those ABC runs can run for hundreds of seconds.
Running ABCs in parallel with each other and in parallel with main-thread
processing (reading and writing BLIF files, copying ABC BLIF output into
the design) can give large speedups.
When FILTERLIB is defined (attempts to compile libparse more or less standalone,) mark the `LibertyParser::error()` as weak so utilities using libparse as a library can override its behavior (the default behavior being exit(1)). As the code is quite performance-critical, I've elected to not modify it to raise an exception or have a callback or similar and simply allow for a link-time replacement.
Currently `assign_map` is rebuilt from the module from scratch every time we invoke ABC.
That doesn't scale when we do thousands of ABC runs over large modules. Instead,
create it once and then maintain incrementally it as we update the module.
- Make IdString parameter pass by const ref to avoid IdString ref counting in the constructor
- Remove extra std::string allocation to remove prefix
- Avoid using `stringf` for concatenation