## Summary
`regex_bisim::collect_leaves` used to descend through `re.union` and
`re.antimirov_union` at the top of each leaf of the transition regex,
splitting a single bisimulation state into multiple states before they
were merged into the union-find. This contradicts the bisimulation
invariant: **each leaf of a t-regex represents one state, regardless of
its top-level shape**. The fix descends into `ite` only (which is the
actual structural splitter of guarded transitions).
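
A minimal sketch of the difference, assuming a toy t-regex AST. `Ite`, `Union`, `Atom` and the `split_unions` flag are hypothetical stand-ins for illustration, not the Z3 classes or the real `collect_leaves` signature:

```python
from dataclasses import dataclass

# Hypothetical toy AST for a transition regex (not Z3's representation).
@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Union:
    left: object
    right: object

@dataclass(frozen=True)
class Ite:
    cond: str
    then: object
    els: object

def split_top_unions(t):
    """Pre-fix behaviour (approximated): break a leaf apart at top-level unions."""
    if isinstance(t, Union):
        return split_top_unions(t.left) + split_top_unions(t.right)
    return [t]

def collect_leaves(t, split_unions=False):
    """Descend through ite only; each leaf is one bisimulation state.

    split_unions=True mimics the old behaviour for comparison.
    """
    if isinstance(t, Ite):
        return (collect_leaves(t.then, split_unions)
                + collect_leaves(t.els, split_unions))
    return split_top_unions(t) if split_unions else [t]

# if c == 'a' then (A | B) else C
t = Ite("c == 'a'", Union(Atom("A"), Atom("B")), Atom("C"))
post_fix = collect_leaves(t)                     # 2 states: A|B and C
pre_fix = collect_leaves(t, split_unions=True)   # 3 states: A, B and C
```

In the post-fix mode the union `A | B` stays a single state, which is what the invariant above asks for.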
## Why it matters
The split happens to be *sound* for the current algorithm when the goal
is to assert `L(union(A, B)) = empty` (which is equivalent to `L(A) =
empty AND L(B) = empty`), but it:
1. Adds spurious merges to the union-find that distort state-class
identities.
2. Slows convergence on hard equivalence queries (and causes early
timeouts in practice).
3. Creates latent unsoundness risk for any extension that interprets
leaves more semantically (XOR pair handling, classical-flag propagation,
future antimirov re-enable, etc.).
## Empirical validation
Run on the 1523-file regex-equivalence corpus, 5s/file timeout, 8
workers:
| metric | pre-fix master | post-fix |
|---|---|---|
| sat | 1008 | 1014 |
| unsat | 368 | 368 |
| timeout | 145 | 139 |
| unknown | 2 | 2 |
| SAT↔UNSAT verdict flips | — | **0** |
| timeout→sat flips | — | 6 |
| commonly-solved wall ratio | 1.000x | **0.902x** |
The 6 `timeout` → `sat` cases all return the *same* `sat` under
pre-fix master if given 60s; they are previously-slow cases, not
previously-wrong ones.
Z3 unit tests: 89/89 pass (`test-z3 /a`).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements the algorithm `Eq(p,q) = Empty(p XOR q)` using a union-find
driven bisimulation closure (per the CAV'26 ERE paper).
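
As a rough illustration of the idea, here is a Hopcroft-Karp style equivalence check over an explicit toy DFA. This is a simplified sketch, not the `regex_bisim` code: the real implementation works on regex derivatives, while the `delta` table, the state names and the `equivalent` function below are assumptions made for the example. An acceptance mismatch on a merged pair plays the role of a nullable `p XOR q` state.

```python
def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def equivalent(delta, accepting, alphabet, p, q, max_steps=50000):
    """Return True/False, or None if the step bound is exceeded."""
    parent = {}

    def root(x):
        parent.setdefault(x, x)
        return find(parent, x)

    worklist = [(p, q)]
    steps = 0
    while worklist:
        steps += 1
        if steps > max_steps:
            return None                      # analogous to l_undef
        a, b = worklist.pop()
        ra, rb = root(a), root(b)
        if ra == rb:
            continue                         # already assumed equivalent
        if (a in accepting) != (b in accepting):
            return False                     # XOR state would be nullable
        parent[ra] = rb                      # merge the two classes
        for c in alphabet:
            worklist.append((delta[a, c], delta[b, c]))
    return True

# Toy automata: P0 ~ (a|b)*, Q0 ~ (a*b*)*, R0 ~ a*
delta = {
    ("P0", "a"): "P0", ("P0", "b"): "P0",
    ("Q0", "a"): "Q1", ("Q0", "b"): "Q2",
    ("Q1", "a"): "Q1", ("Q1", "b"): "Q2",
    ("Q2", "a"): "Q1", ("Q2", "b"): "Q2",
    ("R0", "a"): "R0", ("R0", "b"): "Rdead",
    ("Rdead", "a"): "Rdead", ("Rdead", "b"): "Rdead",
}
accepting = {"P0", "Q0", "Q1", "Q2", "R0"}

same = equivalent(delta, accepting, "ab", "P0", "Q0")
different = equivalent(delta, accepting, "ab", "R0", "P0")
gave_up = equivalent(delta, accepting, "ab", "P0", "Q0", max_steps=0)
```

The union-find lets the search skip pairs that are already in the same class, which is why the closure terminates once every reachable pair has been merged.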
### What's added
* **New primitive OP_RE_XOR (re.xor)** wired through seq_decl_plugin:
parser signature, info propagation (nullable, min_length), and
pretty-printer.
* **seq_rewriter**: structural XOR rewrites (r XOR r = empty, r XOR empty =
r, full XOR r = comp(r), comp/comp absorption, complement push, AC
normalisation), nullability (Null(p XOR q) = Null(p) != Null(q)),
derivative (D_a(p XOR q) = D_a(p) XOR D_a(q)), reverse, antimirov
derivative, and `check_deriv_normal_form` coverage.
* **New class seq::regex_bisim** in
`src/ast/rewriter/seq_regex_bisim.{h,cpp}` to keep the bisim logic out
of the already-large `seq_rewriter.cpp`. Uses `basic_union_find` from
`util/union_find.h`, an `obj_map` for the node assignment, and a
50000-step bound (returns `l_undef` on overrun).
* **Integration** in `seq_rewriter::reduce_re_eq` (with a re-entry
guard) and in `seq_regex::propagate_eq` / `propagate_ne` for ground
regexes; on `l_undef` we fall back to the existing axiomatisation.
* **`sls_seq_plugin`**: extend `OP_RE_DIFF` switch arms to also cover
`OP_RE_XOR`.
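
The XOR rewrite, nullability and derivative rules listed above can be sketched on a toy regex encoding. Everything below (the tuple encoding and the `xor`, `comp`, `nullable`, `deriv` helpers) is hypothetical and only mirrors the identities stated in the list; it is not the `seq_rewriter` code. Reading "comp/comp absorption" as `comp(p) XOR comp(q) = p XOR q` is an assumption, and sorting operands by `repr` is just one simple way to get a commutative normal form.

```python
# Toy regexes as tagged tuples (hypothetical encoding, not Z3's AST).
EMPTY = ("empty",)   # no strings
FULL = ("full",)     # all strings
EPS = ("eps",)       # only the empty string

def char(c):
    return ("char", c)

def comp(r):
    if r == EMPTY:
        return FULL
    if r == FULL:
        return EMPTY
    if r[0] == "comp":
        return r[1]                      # comp(comp(r)) = r
    return ("comp", r)

def xor(p, q):
    if p == q:
        return EMPTY                     # r XOR r = empty
    if p == EMPTY:
        return q                         # empty XOR r = r
    if q == EMPTY:
        return p                         # r XOR empty = r
    if p == FULL:
        return comp(q)                   # full XOR r = comp(r)
    if q == FULL:
        return comp(p)
    if p[0] == "comp" and q[0] == "comp":
        return xor(p[1], q[1])           # assumed reading of comp/comp
    if repr(q) < repr(p):
        p, q = q, p                      # fixed operand order (commutativity)
    return ("xor", p, q)

def nullable(r):
    tag = r[0]
    if tag in ("eps", "full"):
        return True
    if tag in ("empty", "char"):
        return False
    if tag == "comp":
        return not nullable(r[1])
    if tag == "xor":
        return nullable(r[1]) != nullable(r[2])   # Null(p) != Null(q)
    raise ValueError(tag)

def deriv(a, r):
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "full":
        return FULL
    if tag == "char":
        return EPS if r[1] == a else EMPTY
    if tag == "comp":
        return comp(deriv(a, r[1]))
    if tag == "xor":
        return xor(deriv(a, r[1]), deriv(a, r[2]))  # D_a(p) XOR D_a(q)
    raise ValueError(tag)

a_xor_b = xor(char("a"), char("b"))
d = deriv("a", a_xor_b)                  # xor(EPS, EMPTY), which simplifies to EPS
```

Because `d` is nullable, the string `"a"` is in the language of `a XOR b`, so `a` and `b` are not equivalent.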
### Validation
* Full release build with MSVC + Ninja.
* `./test-z3 /a` -- 89/89 tests passing.
* `./test-z3 /seq smt2print_parse` -- PASS.
* Smoke tests with `(a|b)*` vs `(a*b*)*` (equal) and `a*` vs `(a|b)*`
(not equal) return the expected `sat`/`unsat` quickly.
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>