3
0
Fork 0
mirror of https://github.com/Z3Prover/z3 synced 2026-06-16 05:45:42 +00:00

Treat each transition-regex leaf as a single bisimulation state (#9871)

## Summary

egex_bisim::collect_leaves used to descend through `re.union` and
`re.antimirov_union` at the top of each leaf of the transition regex,
splitting a single bisimulation state into multiple states before they
were merged into the union-find. This contradicts the bisimulation
invariant: **each leaf of a t-regex represents one state, regardless of
its top-level shape**. The fix descends into `ite` only (which is the
actual structural splitter of guarded transitions).

## Why it matters

The split happens to be *sound* for the current algorithm when the goal
is asserting `L(union(A, B)) = empty` (since `L(A) = empty AND L(B) =
empty` is equivalent), but it:

1. Adds spurious merges to the union-find that distort state-class
identities.
2. Slows convergence on hard equivalence queries (and causes early
timeouts in practice).
3. Creates latent unsoundness risk for any extension that interprets
leaves more semantically (XOR pair handling, classical-flag propagation,
future antimirov re-enable, etc.).

## Empirical validation

Run on the 1523-file regex-equivalence corpus, 5s/file timeout, 8
workers:

| metric | pre-fix master | post-fix |
|---|---|---|
| sat | 1008 | 1014 |
| unsat | 368 | 368 |
| timeout | 145 | 139 |
| unknown | 2 | 2 |
| SAT↔UNSAT verdict flips | — | **0** |
| timeout→sat flips | — | 6 |
| commonly-solved wall ratio | 1.000x | **0.902x** |

The 6 `timeout` → `sat` cases all return the *same* `sat` under
pre-fix master if given 60s; they are previously-slow cases not
previously-wrong ones.

Z3 unit tests: 89/89 pass (`test-z3 /a`).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
Margus Veanes 2026-06-15 10:59:41 -07:00 committed by GitHub
parent f508854fe5
commit c67bb140dc
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -86,9 +86,15 @@ namespace seq {
}
/*
Collect the leaves of a t-regex der (an ITE / antimirov union /
union-tree with regex leaves) into the output vector. Empty
(re.empty) leaves are dropped.
Collect the leaves of a t-regex der (an ITE-tree whose leaves are
regex expressions) into the output vector. Empty (re.empty) leaves
are dropped.
Each leaf is treated as a single bisimulation state regardless of
its top-level shape (including re.union and re.antimirov_union):
descending into a union at the leaf would split one state into
several, which is semantically unsound for the bisimulation /
union-find merging that follows.
Returns false if we encountered an unexpected node (e.g. a free
variable creeping in) in that case the caller should bail out.
@ -102,9 +108,7 @@ namespace seq {
expr* e = work.back();
work.pop_back();
expr* c = nullptr, * t = nullptr, * f = nullptr;
if (m.is_ite(e, c, t, f) ||
m_util.re.is_union(e, t, f) ||
m_util.re.is_antimirov_union(e, t, f)) {
if (m.is_ite(e, c, t, f)) {
if (seen.insert_if_not_there(t))
work.push_back(t);
if (seen.insert_if_not_there(f))