block_if_empty relied solely on the AST-identity state graph for dead-state
detection, which never closes on intersections/complements whose derivative
product states do not canonicalize. For ground regexes, fall back to the
antimirov NFA reachability check (re_is_empty), the same procedure propagate_eq
already uses for re.none equalities. Resolves str.in_re emptiness timeouts on
inter(., comp(.)) regexes (e.g. z3test 5728, parts of 5721).
Also drop a stale is_antimirov_union reference left in seq_regex_bisim.cpp after
the operator removal (test-z3 did not compile) and update a now-stale comment.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
mk_re_range read clo/chi in the singleton/ordering checks before the
is_empty guard, but an early is_empty (from the length checks) leaves
them uninitialized. Initialize to 0 and move the is_empty return ahead
of the singleton check so uninitialized values are never used.
Also remove the disabled if(false) range-complement block in the
seq_rewriter unit test.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The symbolic derivative of a left-associated concat (a*b)*r2 recurses
through the entire left spine, exceeding m_max_depth and emitting stuck
symbolic re.derivative terms that accumulate across NFA states and cause
an exponential blow-up on contains-pattern intersections. Right-
associate the concat (via derive::mk_concat, a single linear pass that
does not touch the derivative depth counter) before deriving, so the head
stays atomic and recursion is shallow.
Also restore the L(a)subset L(b) subsumption in mk_union_core (collapses
semantically-equal union states in antimirov intersection derivatives),
and fix a latent sort bug in mk_regex_concat where the '..* = .+'
rewrite passed the element sort instead of the regex sort to
mk_full_char, triggering a 're.+' sort-mismatch exception once exercised
by the derivative path.
Result on QF_S/20250410-matching wildcard-matching-regex set: 02 15s->0.6s,
04 15s->0.7s, 05 timeout->4.2s, 59/62 timeout->0.1-0.3s (vs master).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Summary
Improves the Diophantine (`dio`) integer-feasibility controller in
`int_solver`, and fixes a latent bug where the user's Gomory-cut
configuration could be silently overridden at runtime. Also includes the
earlier `lia_w` work: randomized hammer gates, the `int_hammer_period` /
`random_hammers` parameters, and the linear `dio_calls_period` recovery.
## Motivation
The controller used a **single field** both as the static
`lp.dio_cuts_enable_gomory` parameter and as the live "is Gomory
running" flag. It started running Gomory (and the gcd test) once
`dio_calls_period` crossed a hard-coded `16`. Because `dio_calls_period`
is also driven by the randomized hammer gate, on instances where `dio`
is only intermittently productive the period could be ratcheted past 16
*by chance*, turning on Gomory + gcd and thrashing — e.g. `dillig/20-14`
went from a 100s solve (deterministic) to a 600s timeout (randomized)
purely from this spurious activation.
## Changes
- **Separate config from runtime state.** Split the shared field into
`m_dio_cuts_enable_gomory` (static config, never mutated) and
`m_run_gomory_with_dio` (runtime flag). Toggling the runtime state can
no longer clobber the user's `dio_cuts_enable_gomory` parameter.
- **Trigger on genuine dio failures, not the period proxy.** Running
Gomory-with-dio now starts after a count of **consecutive `undef` dio
returns** (reset on a dio conflict) rather than when the
randomization-inflated period crosses a threshold — robust to
`random_hammers` gate variance.
- **Parameterize the threshold.** New `lp.dio_gomory_enable_period`
(default 16). Set it very large to never auto-start Gomory, so Gomory
follows `dio_cuts_enable_gomory` only.
- **Try `dio` before Gomory** in `check()` so a productive dio conflict
preempts Gomory on dio-dominated instances.
## Evaluation (QF_LIA, full set, 600s, seed 555 paired)
- Dio-before-Gomory: **+33** problems across the 6 `random_hammers x
int_hammer_period` cells (5/6 cells improve).
- New trigger (`dio_gomory_enable_period=32`, random): **6417** vs the
period-16 baseline **6409**; no short-cutoff regression.
- Linear `dio_calls_period` recovery: keeping it on is worth ~+20 vs
off; `decrease=1` slightly ahead of the default 2.
Default behavior (`dio_gomory_enable_period=16`) is byte-for-byte
equivalent to the previous threshold logic.
## Notes
Debug-only tracing used during analysis (the `dio_calls_period_trace`
parameter plus per-hammer / period-evolution verbose output) is **not**
included.
---------
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
mk_re_concat only merged Sigma* Sigma* when both stars were direct
siblings. Once a literal nests the concatenation, e.g. (Sigma* x Sigma*) ++ Sigma*,
the trailing duplicate Sigma* survived because the left operand is a concat
that ends with Sigma* rather than a bare full_seq. Redundant adjacent stars
then multiply derivative states during bisimulation.
Add two grouping-insensitive rules using the existing
starts_with_full_seq/ends_with_full_seq helpers:
R ++ Sigma* -> R when R ends with Sigma*
Sigma* ++ R -> R when R starts with Sigma*
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Guard regex_to_range_predicate so it only collapses regexes whose
element sort is a character sort. Previously a regex over a non-char
sequence sort (e.g. (re (Seq Int))) could be silently mis-collapsed
into bogus [0, max_char] ranges. Add a negative unit test covering
re.empty/re.full_char over (Seq Int).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
In the bisimulation equivalence path, derivative leaves are now
extracted via the new seq_rewriter::brz_derivative_cofactors
(derive::derivative_cofactors), which computes the symbolic
derivative and enumerates its reachable leaves in fully ITE-hoisted
normal form: every if-then-else over the input character (including
ones previously buried under a concat or union) is hoisted to the top
via decompose_ite, infeasible minterms are pruned, and unions are kept
intact as single states. Each leaf is therefore a ground regex free of
(:var 0), so its nullability is always decidable.
This replaces collect_leaves (which only split top-level ITEs and left
buried (:var 0) ITEs inside leaves), the root cause of bisim returning
l_undef and falling through to the slow theory solver.
Validation on the regex-equivalence corpus (1523 files, -T:5, 8 workers):
- vs master: total solved 1394 vs 1378 (+16), soft_timeouts 129 vs 145,
0 soundness disagreements (was 18 -> 5 -> 0).
- vs derive: +242 solved (1394 vs 1152), 25.4% faster on commonly-solved
files, fixes 18 soundness disagreements, only 6 regressions.
- corpus wall time halved (172s vs 332s/349s).
- All 91 unit tests pass, including seq_regex_bisim.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce src/ast/rewriter/regex_range_collapse.{h,cpp}, a translator
between the boolean-combination-of-character-class fragment of regexes
and the range_predicate value type added in Stage 2.
Recognized fragment (translates to range_predicate):
re.empty, re.full_char, re.range, re.union, re.intersection, re.diff
of operands recursively in the fragment. Range bounds are accepted in
three encodings: string constant ("a"), seq.unit of a const char
(seq.unit (Char 97)), and length-1 zstring literal.
NOT translated:
re.complement -- this is sequence-level complement (Sigma* \ L), not
character-class complement. Translating it would incorrectly turn
re.comp(re.range "a" "z") into the character class [^a-z], which would
drop the empty string and all length>=2 strings.
Hook the translator into seq_rewriter at mk_re_union0, mk_re_union,
mk_re_inter0, mk_re_inter, and mk_re_diff so that boolean combinations
of character classes always reduce to a single canonical range-set
form. mk_re_complement is intentionally not hooked.
Materialization uses the canonical (seq.unit (Char N)) bound form
(matching the rest of seq_rewriter) and right-associates the union
with operands sorted by expr_id so the result matches the invariant
expected by merge_regex_sets.
Unit tests in src/test/regex_range_collapse.cpp cover the recognized
fragment, the non-translatable cases, and round-trip identity for
multi-range predicates.
Corpus validation on bench/inputs/regex-equivalence (1523 .smt2):
- 0 soundness regressions vs derive baseline.
- Resolves 4 previously-soft-timeout files (now solved correctly).
- Resolves 1 pre-existing wrong answer (mut_0404: master/derive say
unsat, ground-truth annotation and Stage 3 say sat).
- Wall-time: -2.2% vs Stage-3 starting point, -1.5% vs derive.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce a specialized range-algebra over the unsigned character
domain [0, max_char], with canonical sorted-disjoint-non-adjacent
representation and linear-time union, intersection, complement,
difference, and symmetric difference operations.
This is Stage 1 of the derive-with-ranges plan: the value type only,
with unit tests covering factories, ordering, hashing, hand-picked
instances, and exhaustive de-Morgan / lattice laws over a small
domain (verified by enumerating all 64 subsets).
Integration with seq::derive's path conditions, the OneStep cache,
and the R&psi smart-constructor rewrite are deferred to later stages.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Summary
Fixes a Z3 snapshot-regression divergence tracked in Z3Prover/bench
discussion
https://github.com/Z3Prover/bench/discussions/2662.
- **Benchmark:** `iss-5753/small-3.smt2` (Z3Prover/bench corpus,
original issue z3#5753)
- **Regression:** recorded oracle `unsat` (captured 2026-06-05) →
current `master` produces `timeout` at `-T:20`
- **bench workflow run:**
https://github.com/Z3Prover/bench/actions/runs/28078176175
### Divergence
```diff
--- small-3.expected.out (expected)
+++ produced (current z3)
@@ -1 +1 @@
-unsat
+timeout
```
The answer `unsat` is still correct — z3 simply can no longer prove it
within the
budget. This is a **performance regression**, not a wrong/unsound
result.
## Root cause
The benchmark is a quantified Array/Real/Int problem solved via MBQI.
The model
finder's quantifier analyzer (`src/smt/smt_model_finder.cpp`) now
creates a
`ho_var` qinfo for **array-sorted quantified variables used as terms**
(the new
higher-order term-enumeration path introduced in #9908, "Term
enumeration",
commit `5699142f5`). Before that commit such an occurrence set
`m_info->m_is_auf = false` (a safe fallback that left no extra
instantiation
hints), which is the behaviour that was active when the `unsat` oracle
was
captured.
`ho_var::populate_inst_sets` builds instantiation candidates with:
```cpp
for (auto t : tn.enum_terms(srt)) {
...
S->insert(t, generation);
}
```
`term_enumeration::enum_terms` is a **lazy, increasing-cost generator
with no
inherent bound** (see `ast/rewriter/term_enumeration.h`). For this
benchmark the
relevant array sort `(Array Int (Array Int Real))` has many array-valued
uninterpreted productions (`tptpummul`, `trans`, `inv`, ...), so
iterating the
generator to exhaustion inserts an explosively large number of terms
into the
instantiation set, blowing up MBQI instantiation and exhausting the 20s
budget.
Confirmed empirically: running the benchmark with `-v:3` shows
`ho_var::populate_inst_sets: 622 (Array Int (Array Int Real))`, i.e.
this exact
path fires for the divergent benchmark.
Notably, the original commit `5699142f5` declared `unsigned max_count =
20;` for
this very loop but **never applied it**, and that dead variable was
subsequently
removed — leaving the enumeration completely unbounded.
## Fix
Restore the intended bound so at most 20 of the cheapest enumerated
terms are
added to the instantiation set:
```cpp
unsigned max_count = 20;
for (auto t : tn.enum_terms(srt)) {
if (max_count == 0)
break;
--max_count;
...
S->insert(t, generation);
}
```
This is sound (instantiation only ever adds valid ground instances of
the
quantified formula) and is strictly more information than the pre-#9908
baseline
(which added no `ho_var` terms at all), so it does not regress problems
the
feature was meant to help. The change is confined to a single loop in
`src/smt/smt_model_finder.cpp`.
## Validation
Built the checkout in Release mode and re-ran the failing benchmark with
the
same options the snapshot harness uses:
```
$ build/z3 -T:20 inputs/issues/iss-5753/small-3.smt2
unsat # was: timeout
real 0m0.120s
```
The combined output now matches the recorded `small-3.expected.out`
oracle
**byte-for-byte** (`unsat\n`). A smoke run over a sample of other corpus
benchmarks showed no crashes or new divergences, and a basic
`(check-sat)/(get-model)` sanity check still works.
---
This PR is opened as a **draft** for human review by the automated
snapshot-regression fixer.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
> Generated by [Fix a Z3 snapshot-regression
divergence](https://github.com/Z3Prover/bench/actions/runs/28078343154)
· 2.8K AIC · ⌖ 150.7 AIC · ⊞ 39K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Fix a Z3 snapshot-regression divergence,
engine: copilot, version: 1.0.60, model: claude-opus-4.8, id:
28078343154, workflow_id: snapshot-regression-fixer, run:
https://github.com/Z3Prover/bench/actions/runs/28078343154 -->
<!-- gh-aw-workflow-id: snapshot-regression-fixer -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/snapshot-regression-fixer
-->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This is another PR towards the goal of getting Z3 to compile cleanly
when included via FetchContents into clang-tidy, which uses a pretty
strict set of warnings.
This PR adds
```
"-Wcast-qual"
```
to the set of warnings enabled in the build. This gives warnings like:
```
/Users/daviddetlefs/z3/src/ast/ast.cpp:2897:38: warning: cast from 'app *const *' to 'expr **' drops const qualifier [-Wcast-qual]
```
I fixed these by inserting consts. In some cases, a "const_cast<T>(...)"
was necessary.
This is an **automated workflow self-test** of the `fixer-selftest` /
`snapshot-regression-fixer` pipeline running on the self-hosted
**`rise-runner-1`** runner pool. Its sole purpose is to prove the
end-to-end agentic pipeline works: Copilot inference runs, and the
`create-pull-request` safe output can open a real **draft** pull request
on `Z3Prover/z3` using the configured PAT. It is intentionally
build-free.
### Change
- **File:** `src/ast/simplifiers/elim_unconstrained.cpp` (module-header
`/*++ ... */` comment)
- **Before:** ` - it does not accomodate side constraints.`
- **After:** ` - it does not accommodate side constraints.`
A plain spelling correction in a comment: `accomodate` → `accommodate`.
The diff is a single line.
### Why this is safe
The change is **comment-only** — it touches no code, identifiers, string
literals, build files, or tests, and therefore **cannot affect z3's
behaviour or output**. Because of that, no compilation or testing was
performed and **no rebuild is needed** to be confident it is correct.
### For maintainers
This is a genuine, correct fix, so you are welcome to **merge** it.
Equally, you may simply **close** it — the success of this self-test
does not depend on the PR being merged.
> Generated by [Self-test the agentic PR pipeline with a tiny z3 comment
fix](https://github.com/Z3Prover/bench/actions/runs/27987685681) · 299.1
AIC · ⌖ 53.6 AIC · ⊞ 35.5K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+fixer-selftest%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Self-test the agentic PR pipeline with a
tiny z3 comment fix, engine: copilot, version: 1.0.60, model:
claude-opus-4.8, id: 27987685681, workflow_id: fixer-selftest, run:
https://github.com/Z3Prover/bench/actions/runs/27987685681 -->
<!-- gh-aw-workflow-id: fixer-selftest -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/fixer-selftest -->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Automated workflow self-test
This is an **automated workflow self-test** of the `fixer-selftest` /
`snapshot-regression-fixer` pipeline running on the self-hosted
**`rise-runner-1`** pool. Its sole purpose is to verify the end-to-end
agentic pipeline: that Copilot inference runs and that the
`create-pull-request` safe output can open a real **draft** PR on
`Z3Prover/z3` using the configured PAT.
## The fix
A single, objectively-correct spelling fix in a **code comment** (the
header comment block of the file — not code, identifiers, or string
literals).
- **File:** `src/ast/simplifiers/solve_eqs.cpp` (line 23, inside the
`/*++ ... --*/` header comment)
- **Before:** `... where bitset reprsents set of free variables.`
- **After:** `... where bitset represents set of free variables.`
The diff is a single line:
```diff
-1. maintain map FV: term -> bit-set where bitset reprsents set of free variables. Assume the number of variables is bounded.
+1. maintain map FV: term -> bit-set where bitset represents set of free variables. Assume the number of variables is bounded.
```
## Why this is safe
The change is **comment-only** and therefore **cannot affect z3's
behaviour or output** — so it needs **no rebuild and no testing** to be
confident it is correct. Nothing outside the comment text was touched
(no code, no string literals, no identifiers, no whitespace elsewhere).
## For maintainers
This is a genuine, correct fix, so feel free to **merge** it — but you
may also simply **close** it. The self-test's success does not depend on
this PR being merged; it only depends on the PR having been opened.
> Generated by [Self-test the agentic PR pipeline with a tiny z3 comment
fix](https://github.com/Z3Prover/bench/actions/runs/27986232656) · 208.5
AIC · ⌖ 75.1 AIC · ⊞ 35.5K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+fixer-selftest%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Self-test the agentic PR pipeline with a
tiny z3 comment fix, engine: copilot, version: 1.0.60, model:
claude-opus-4.8, id: 27986232656, workflow_id: fixer-selftest, run:
https://github.com/Z3Prover/bench/actions/runs/27986232656 -->
<!-- gh-aw-workflow-id: fixer-selftest -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/fixer-selftest -->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The `build` job in the Pyodide workflow was failing at wheel smoke-test
import time because `libz3.so` was not produced as a proper wasm side
module (`need the dylink section to be first`, missing exported
symbols). This PR restores the required Pyodide linker mode.
- **Root cause**
- Recent Pyodide linker flag changes enabled wasm EH/longjmp but omitted
`-sSIDE_MODULE=1`, so the generated `libz3.so` was not loadable by
Pyodide’s dynamic loader.
- **Changes**
- **`src/api/python/pyproject.toml`**
- Added `-sSIDE_MODULE=1` to `[tool.pyodide.build].ldflags`.
- **`src/api/python/setup.py`**
- Added `-sSIDE_MODULE=1` to the Pyodide `LDFLAGS` path to keep direct
`setup.py`-driven builds aligned with `pyproject.toml` behavior.
- **Flag delta (core fix)**
```toml
# src/api/python/pyproject.toml
[tool.pyodide.build]
ldflags = "-fwasm-exceptions -sSUPPORT_LONGJMP=wasm -sWASM_BIGINT -sSIDE_MODULE=1"
```
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Z3 incorrectly returns SAT for formulas like `store(store(const(x), i0,
e0), i1, e1) = store(store(const(y), i0, e0), i1, e1) ∧ x ≠ y` when the
index sort has a small finite domain (e.g. `(_ BitVec 2)` = 4 elements).
The quantifier-based encoding of the same constraint correctly returns
UNSAT.
## Changes
### `src/sat/smt/array_solver.cpp` — `add_parent_lambda()`
When a store is added to an array's `m_parent_lambdas` after that array
already has `m_has_default = true` (i.e., a `default(array)` term
already exists in the e-graph), the default axiom for the new store was
silently dropped. This breaks the propagation chain:
```
default(const(x)) = x
↓ [via store default axiom]
default(store(const(x), i, v)) = x
↓ [via store default axiom]
default(store(store(const(x), ...), ...)) = x
↓ [EUF congruence from store equality]
x = y → contradiction
```
Fix: push `default_axiom(lambda)` in `add_parent_lambda` when
`m_has_default` is already set, so late-arriving stores still get their
default axioms instantiated.
```cpp
void solver::add_parent_lambda(theory_var v_child, euf::enode* lambda) {
auto& d = get_var_data(find(v_child));
ctx.push_vec(d.m_parent_lambdas, lambda);
if (should_prop_upward(d))
propagate_select_axioms(d, lambda);
if (d.m_has_default) // new: fire default axiom retroactively
push_axiom(default_axiom(lambda));
}
```
### Known remaining gap
`mk_eq_core` in `array_rewriter.cpp` also has a related issue: the
unconditional store-chain expansion fires only when `domain_size >
num_lhs + num_rhs` (line 893), but the correct threshold is `domain_size
> max(num_lhs, num_rhs)`. With 4-element domain and 2 stores per side,
`2+2 = 4` equals the domain size so the expansion is skipped. The
variant gated by `m_expand_store_eq` already uses `max()` correctly
(line 911); the unconditional path needs the same fix.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Nikolaj Bjorner <nbjorner@microsoft.com>
## Summary
Fix a use-after-free in `func_interp::compress()`.
When a function interpretation had previously grown large enough to
allocate `m_entry_table`, `compress()` could deallocate entries whose
result matched the else-case but leave the hash table intact. Later
`get_entry()` lookups could then return freed `func_entry*` values,
which showed up during model checking as a corrupted expression result
from `model_evaluator`.
## Root cause
`func_interp::compress()` compacted `m_entries` and freed removed
entries, but it did not rebuild or clear `m_entry_table`.
This left stale pointers in the lookup table whenever:
- the table had already been allocated on a larger interpretation, and
- compression removed some entries.
In the reported case, model evaluation rewrote `stack_s!1041` through
`BR_REWRITE1`, fetched a freed `func_entry` result from the stale table,
and then tripped an assertion in `expr::get_sort()` during quantifier
model checking.
## Fix
After compression removes entries, rebuild `m_entry_table` from the
surviving `m_entries`, or clear it when the surviving interpretation is
small.
## Regression coverage
Added a unit regression in `src/test/model_evaluator.cpp` that:
- creates a `func_interp` large enough to allocate `m_entry_table`,
- compresses away almost all entries,
- checks that removed keys no longer resolve, and
- checks that the surviving key still resolves to the correct result.
## Validation
- `../build/z3 ebso-115.smt2` previously hit an assertion in
`rewriter_def.h` / `ast.cpp`; after the fix it no longer asserts.
- `./test-z3 model_evaluator` passes with the new regression.
## Reproducer
I did not produce a smaller SMT2 benchmark in this change. The original
reproducer I used was `ebso-115.smt2`, and the new unit regression
directly exercises the stale-entry-table path in-process.
Co-authored-by: Can Cebeci <t-cancebeci@microsoft.com>