`qe-lite` could produce malformed formulas when expanding bounded
quantifiers under nested binders, leaving outer de Bruijn indices
unshifted after eliminating an inner quantifier (e.g., `(:var 1)`
escaping capture). This change fixes index normalization in that rewrite
path and adds a regression for the reported forall/exists arithmetic
case.
- **Rewrite correctness in bounded quantifier expansion**
- In `src/qe/lite/qe_lite_tactic.cpp`, after substituting bounded
variables in payload conjuncts, apply `inv_var_shifter(num_decls)` so
outer bound variables are reindexed relative to the removed binder.
- This preserves quantifier structure correctness when
`try_expand_bounded_quantifier` eliminates an inner quantifier.
- **Regression coverage for the reported pattern**
- In `src/test/smt_context.cpp`, add a focused quantified arithmetic
formula matching the bug shape:
- outer `forall (x, x4)`
- inner `exists (y)`
- mixed inequalities that trigger qe-lite bounded expansion
- Assert the formula is unsatisfiable, preventing reintroduction of
invalid index handling in this path.
```c++
inst = vs(p, subst_map.size(), subst_map.data());
shift(inst, num_decls, inst); // reindex outer de Bruijn vars after eliminating inner quantifier
```
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
## Summary
Fixes a hang (wall-clock timeout) in the native parallel SMT solver when
a cube is incomplete for a reason that cannot change. Originating
discussion: https://github.com/Z3Prover/bench/discussions/2746
Benchmark: `iss-3707/bug-1.smt2` (`QF_NRA`, runs with
`parallel.enable=true`).
## Divergence
The recorded oracle vs. current z3 (`z3 -T:20`):
```diff
-(incomplete (theory difference-logic))
-unknown
+timeout
```
z3 should terminate with `unknown` (incomplete theory) but instead spins
until the 20s timeout.
## Root cause
In `src/smt/smt_parallel.cpp` the per-cube worker handled an `l_undef`
cube by unconditionally calling `update_max_thread_conflicts()` and
re-splitting/re-checking. That only helps when the cube was abandoned at
the per-cube conflict limit (`max-conflicts-reached`). When the cube is
incomplete for a permanent reason (incomplete theory, quantifiers,
resource limits), the verdict never changes, so the worker re-checks the
same cube forever. The `batch_manager` had no `unknown` terminal state,
so `get_result()` could only end as sat/unsat/exception — there was no
way to settle on `unknown`, hence the hang. This is the `smt_parallel`
analogue of the `parallel_tactical.cpp` regression fixed earlier.
## Fix
Minimal, mirroring the tactic-side fix:
- add an `is_unknown` batch-manager state + `m_reason_unknown`;
- a worker reporting `l_undef` whose `last_failure` is not
`max-conflicts-reached` calls `set_unknown(reason)` and stops
re-splitting;
- `set_sat`/`set_unsat` may still override `is_unknown` so a definitive
answer wins;
- `get_result()` maps `is_unknown -> l_undef` and the reason propagates
to the parent context.
## Validation
Rebuilt z3 (`make -C build -j16`) and re-ran the benchmark 5× with
`-T:20`. Every run finished in well under the timeout with output
matching the oracle byte-for-byte:
```
(incomplete (theory difference-logic))
unknown
```
Created as a **draft** for human review.
> Generated by [Fix a Z3 snapshot-regression
divergence](https://github.com/Z3Prover/bench/actions/runs/28358375255)
· 553.9 AIC · ⌖ 27.2 AIC · ⊞ 9K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Fix a Z3 snapshot-regression divergence,
engine: copilot, version: 1.0.63, model: claude-opus-4.8, id:
28358375255, workflow_id: snapshot-regression-fixer, run:
https://github.com/Z3Prover/bench/actions/runs/28358375255 -->
<!-- gh-aw-workflow-id: snapshot-regression-fixer -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/snapshot-regression-fixer
-->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
`batch_manager::set_unknown()` in the parallel SMT tactic changed
`m_state` to `is_unknown` but never notified backbone workers or the
core-minimizer worker waiting on `m_bb_cv` / `m_core_min_cv`. Those
threads blocked indefinitely, deadlocking `solve()` at `t.join()`.
### Root cause
```
(declare-fun a (Int) Bool)
(declare-fun b (Int) Bool)
(assert (distinct a b))
(check-sat-using psmt)
```
Every CDCL worker returns `l_undef` with reason `(incomplete (theory
array))`. The first worker calls `set_unknown()` (a soft verdict — other
workers may still find sat/unsat) and exits. Other CDCL workers exit
when `get_cube()` checks `m_state != is_running`. Meanwhile, backbone
workers and the core minimizer are already blocked in
`wait_for_backbone_job()` / `wait_for_core_min_job()`, both of which
condition-wait on CVs that `set_unknown()` never signals. Their
predicates check `m_state != is_running`, but a CV predicate only
re-evaluates on notification or spurious wakeup.
### Fix
- **`src/solver/parallel_tactical.cpp`** — `set_unknown()` now calls
`m_bb_cv.notify_all()` and `m_core_min_cv.notify_all()` after setting
the terminal state, so waiting helper threads observe the change and
exit via the existing `m_state != is_running` guard in their wait
predicates.
### Test
- **`src/test/psmt.cpp`** — new regression covering SAT, UNSAT, and the
theory-incomplete (deadlock) path using `(as-array f)` terms to
reproduce the exact array-theory incompleteness that triggers
`set_unknown()`.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Summary
Fixes a `psmt` (parallel SMT tactic) regression where the solver hangs
to a wall-clock timeout instead of returning `unknown` on formulas whose
root cube is genuinely undetermined by an incomplete theory.
- **Originating discussion:**
https://github.com/Z3Prover/bench/discussions/2735
- **Benchmark:** `iss-3044/bug-1.smt2` (from [Z3 issue
#3044](https://github.com/Z3Prover/z3/issues/3044))
```smt2
(declare-fun a (Int) Bool)
(declare-fun b (Int) Bool)
(assert (distinct a b))
(check-sat-using psmt)
```
## Divergence
The recorded oracle (expected) vs. current z3 (combined stdout+stderr,
`-T:20`):
```diff
-(incomplete (theory array))
-unknown
+timeout
```
## Root cause
The rewritten parallel tactic (`src/solver/parallel_tactical.cpp`,
introduced in #9824/#9825) hangs on this input.
In the worker `run()` loop, every `l_undef` cube result was treated as
if the per-cube **conflict limit** had been reached: the worker
escalated the per-thread conflict budget (`update_max_thread_conflicts`)
and re-checked / re-split the same cube. When the `l_undef` actually
comes from **theory incompleteness** (here, the array theory cannot
decide `(distinct a b)` over `Int -> Bool`) rather than the conflict
limit, the verdict never changes, so the worker re-checks the same cube
forever.
Compounding this, the `batch_manager` state machine had **no terminal
`unknown` state** — the only way to finish was for some worker to prove
`sat`/`unsat`, which is impossible for a root-level theory-incomplete
formula. The combination produced an infinite loop and a wall-clock
timeout.
The pre-rewrite parallel tactic avoided this: its `giveup()` detected
reasons starting with `(incomplete` / `(sat.giveup`, reported a soft
undef, and echoed the reason to `verbose_stream()`.
## Fix
All changes are confined to `src/solver/parallel_tactical.cpp` (47
insertions, 4 deletions):
1. **Distinguish genuine incompleteness from conflict-limit
exhaustion.** In the worker `l_undef` case, only `reason_unknown() ==
"max-conflicts-reached"` benefits from escalating the budget /
splitting. For any other reason (incomplete theory, quantifiers,
lambdas, resource limits, ...) re-checking is futile, so the worker
records a sound `unknown` and stops working the branch.
2. **Add a terminal `is_unknown` batch-manager state** (`set_unknown`,
`get_result() -> l_undef`, reason storage). It is a *soft* result: it
does not cancel the other workers, and a definitive `sat`/`unsat`
verdict from another branch may still override it (the
`set_sat`/`set_unsat` guards now permit overriding `is_unknown`). All
`set_unsat` call sites are global formula-unsat (core ⊆ assumptions, or
independent of the tested backbone literal), so the override is sound;
tree-closure unsat remains guarded by `is_running` and cannot fire
because the undef leaf stays open.
3. **Restore the reason output.** The captured `reason_unknown` is
propagated to the result goal and echoed to `verbose_stream()`,
reproducing the `(incomplete (theory array))` line that the sequential
path / old parallel tactic emitted.
## Validation
Rebuilt the `./z3` checkout (`./configure && make -C build -j16`) and
re-ran the benchmark with the freshly built binary using the same
options the snapshot capture uses (`-T:20`, combined stdout+stderr):
```
$ z3 inputs/issues/iss-3044/bug-1.smt2 -T:20
(incomplete (theory array))
unknown
```
This matches the recorded `bug-1.expected.out` oracle **byte-for-byte**,
and the benchmark now completes in ~0.5s (was: timeout). Verified stable
across 8 consecutive runs. Basic `psmt` `sat`/`unsat` checks continue to
produce correct results.
Opened as a **draft** for human review.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
> Generated by [Fix a Z3 snapshot-regression
divergence](https://github.com/Z3Prover/bench/actions/runs/28313246856)
· 5.7K AIC · ⌖ 85.8 AIC · ⊞ 41.2K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Fix a Z3 snapshot-regression divergence,
engine: copilot, version: 1.0.60, model: claude-opus-4.8, id:
28313246856, workflow_id: snapshot-regression-fixer, run:
https://github.com/Z3Prover/bench/actions/runs/28313246856 -->
<!-- gh-aw-workflow-id: snapshot-regression-fixer -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/snapshot-regression-fixer
-->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Problem
The [master WebAssembly
Build](https://github.com/Z3Prover/z3/actions/runs/28306680131) fails
with:
```
../src/solver/parallel_tactical.cpp:59:9: error: redefinition of 'mk_parallel_tactic'
59 | tactic* mk_parallel_tactic(solver* s, params_ref const& /* p */) {
../src/solver/parallel_tactical.cpp:55:9: note: previous definition is here
```
## Cause
Commit 7564ccc3f (an unrelated lar_solver change) accidentally renamed
the dead `mk_parallel_tactic2` stub to `mk_parallel_tactic`, leaving two
identical definitions inside the `#ifdef SINGLE_THREAD` block. The WASM
build defines `SINGLE_THREAD`, so it hits the redefinition.
## Fix
`mk_parallel_tactic2` and its `non_parallel_tactic2` class were never
referenced anywhere. This removes the dead stub and orphaned class,
keeping the single `mk_parallel_tactic` that degrades to
`mk_solver2tactic(s)` in single-threaded mode (added in #9977).
Verified both `SINGLE_THREAD` and multi-threaded paths pass
`-fsyntax-only`.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Capture row as a pointer as lambda strips the reference and the vector was copied by value in lar_solver!
---------
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The `Ubuntu build - cmake - debugGcc` job was failing because the solver
could emit an unexpected `check-assignment` line before normal
satisfiability output. This change removes that stray output so debug
GCC runs no longer contaminate expected CLI/results streams.
- **Root cause**
- `src/math/lp/nra_solver.cpp` printed `check-assignment` from
`solver::check_assignment()` via `IF_VERBOSE(0, ...)`.
- Verbosity level `0` made this effectively unconditional in the failing
path, so debug builds could leak internal diagnostics into user-visible
output.
- **Change**
- Remove the `check-assignment` print from the exception path in
`lp::solver::check_assignment()`.
- Preserve all existing control flow and error handling; only the
unintended output side effect is removed.
- **Effect**
- Debug GCC CMake builds keep their normal `sat`/`unsat` output shape.
- Internal solver diagnostics no longer interfere with output-sensitive
CI checks.
```c++
catch (z3_exception &) {
statistics &st = m_imp->m_nla_core.lp_settings().stats().m_st;
m_imp->m_nlsat->collect_statistics(st);
if (m_imp->m_limit.is_canceled()) {
return l_undef;
}
else {
throw;
}
}
```
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
## Summary
Fixes the `iss-7027/small-30` snapshot regression (`Z3Prover/bench`
discussion #2705) **at its root**, instead of working around it by
retuning the LP heuristics.
- **Benchmark:** `inputs/issues/iss-7027/small-30.smt2` —
`(check-sat-using qe2)` over a single `(distinct ...)` of 33 mixed
Int/Real terms.
- The recorded oracle was `unknown`; current `master` produces
`timeout`.
## Root cause
`unknown`/`timeout` are both wrong here: the formula is a `distinct`
over 33 terms (free Int/Real constants plus the literals `0`/`1`), which
is **trivially `sat`** — there are infinitely many distinct reals.
The real bug is in the `qsat` tactic that backs `qe2`. Running
quantifier elimination on a **quantifier-free** formula has nothing to
eliminate, so `qsat` left an undecided residual goal and
`check-sat-using` reported `unknown`. This reproduces on any ground
formula with free variables, e.g.:
```
(declare-fun a () Int)(assert (> a 0))(check-sat-using qe2) ; -> unknown (should be sat)
```
For `small-30` the QE alternation additionally drove `theory_lra`
integer branch-and-bound down a non-terminating path, surfacing as a
`timeout` under the capture budget (the symptom the `random_hammers`
schedule change happened to expose).
## Fix
Under `check-sat` semantics, top-level free variables are implicitly
existentially quantified. So when the `qsat` input has no quantifiers,
decide satisfiability directly (route through the existing `qsat_sat`
path) instead of producing a residual goal. `qe2`/`qe` now return
`sat`/`unsat` for ground formulas.
QE of genuinely-quantified formulas is unchanged: `apply qe2` on a
quantified goal produces the same projected formula as before (verified
identical to `master`). Only the degenerate quantifier-free case is
affected.
This supersedes the previous approach in this PR (reverting the
`lp.random_hammers` default). That default is left **unchanged**
(`true`), preserving #9958's aggregate QF_LIA benefit. `small-30` now
returns `sat` in ~0.01s regardless of the heuristic schedule, because
the QE machinery no longer runs on this ground instance.
Two changes:
- `src/qe/qsat.cpp`: short-circuit quantifier-free input to the
satisfiability decision path.
- `Z3Prover/bench` `inputs/issues/iss-7027/small-30.expected.out`:
oracle updated `unknown` -> `sat` (to be committed alongside this fix).
## Validation
```
$ z3 small-30.smt2
sat # ~0.01s
$ echo '(declare-fun a () Int)(assert (> a 0))(check-sat-using qe2)' | z3 -in
sat
$ echo '(declare-fun a () Int)(assert (and (> a 0)(< a 0)))(check-sat-using qe2)' | z3 -in
unsat
```
Full unit-test suite (`test-z3 /a`) passes (92/92). Quantified `qe2`
round-trips (`apply qe2`) match `master` byte-for-byte.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add new parallel algorithm as a tactic (parallel_tactical2.cpp)
Don't port over old experiments from smt_parallel that we aren't using
(sls, inprocessing, failed_literal_mode for bb detection)
Fix bugs: lease cancellation/reslimit race condition, involves changing
lease epoch to simple boolean flag
Also, now there is a single shared set of params for the tactic and
smt_parallel
**Test runs for the parallel_tactical2 vs old smt_parallel version:**
run-2747-Z3-threads-4-qflia-30s-stats.md
run-2746-Z3-threads-4-qflia-30s-parallel_tactic-stats.md
run-2745-Z3-threads-1-qfbv-30s-stats.md
run-3013-Z3-threads-4-qfbv-30s-parallel_tactic-stats.md --> note this is
indeed run-3013, I reran after a bugfix in inc_sat_solver
run-2743-Z3-threads-4-qfnia-30s-stats.md
run-2742-Z3-threads-4-qfnia-30s-parallel_tactic-stats.md
**Test runs for the new smt_parallel with bugfixes:**
run-2801-Z3-threads-4-qflia-30s-smtparallel-bugfixes-stats.md,
run-2800-Z3-threads-4-qflia-30s-smtparallel-bugfixes-stats.md
run-2797-Z3-threads-4-qfnia-30s-smtparallel-bugfixes-stats.md
compare to old smt_parallel:
run-2747-Z3-threads-4-qflia-30s-stats.md
run-2743-Z3-threads-4-qfnia-30s-stats.md
Note that there is a slight regression on lia in run-2800. The source of
this appears to be the new new LP largest-cube LIA heuristic param,
which is enabled by default. disabling this param in run-2801 restored
performance (I didn't change this in this PR though, just something to
note)
http://mtzguido.tplinkdns.com:8081/z3/compare_stats.html
---------
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
Co-authored-by: Ilana Shapiro <ilanashapiro@Ilanas-MacBook-Pro.local>
Co-authored-by: Ilana Shapiro <ilanashapiro@Ilanas-MBP.localdomain>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Fixes a Z3 snapshot-regression divergence reported in `Z3Prover/bench`
discussion: https://github.com/Z3Prover/bench/discussions/2667
## Divergence
- **benchmark:** `iss-6615/original.smt2` (lives at
`inputs/issues/iss-6615/` in `Z3Prover/bench`)
- **kind:** `diff`
- **z3 under test:** `z3-4.17.0-x64-glibc-2.39` (Nightly)
- **budget:** per-file `20s` — the snapshot capture runs `z3 -T:20
original.smt2`
The recorded oracle is 13× `unknown` (one per `check-sat`, each preceded
by an in-file `(set-option :timeout 100)` soft timeout). Current z3
instead prints a single `timeout`:
```diff
--- original.expected.out (expected)
+++ produced (current z3)
@@ -1,13 +1 @@
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
+timeout
```
## Root cause
The benchmark uses `(set-logic ALL)` with quantifiers over higher-order
(array / lambda) sorts, so MBQI drives `ho_var::populate_inst_sets`
(`src/smt/smt_model_finder.cpp`), which enumerates candidate ground
terms with the bottom-up term-enumeration engine added in #9908
(`src/ast/rewriter/term_enumeration.cpp`):
```cpp
unsigned max_count = 20;
for (auto t : tn.enum_terms(srt)) { // each ++ runs find_next()
if (max_count == 0)
break;
--max_count;
S->insert(t, generation);
}
```
`max_count = 20` bounds the number of **inserted** terms, but it does
**not** bound the work the generator performs to find the *next*
target-sort term. For sorts that admit few cheap target-sort terms but a
large intermediate term space (here `(Array enc_val Int)` and `(Array
String (option enc_val))`), a single advance of the iterator can explore
an explosive number of intermediate terms, each rewritten through
`th_rewriter`.
Crucially, the three driving loops of the engine —
`bottom_up_enumerator::find_next`,
`bottom_up_enumerator::enumerate_operators`, and
`children_iterator::has_next` — never check the resource limit /
cancellation flag. The per-query soft timeout (`:timeout 100`) *does*
fire and cancels `m.limit()` (via `cmd_context`'s `cancel_eh<reslimit>`
+ `scoped_timer`), but the enumeration never observes it, so the query
cannot be interrupted at 100 ms. It spins until the hard *process*
timeout `-T:20` fires, which prints `timeout` for the whole run and
aborts — instead of the solver returning `unknown` per query.
## Fix
Make the enumeration honor cancellation by checking
`m.limit().is_canceled()` at the head of each of the three unbounded
loops in `src/ast/rewriter/term_enumeration.cpp`. When a query is
cancelled (soft timeout / rlimit / Ctrl-C) the enumeration stops
promptly and the solver returns `unknown`, as it did before #9908. When
nothing is cancelled `is_canceled()` is `false`, so the set of
enumerated terms is unchanged — this only adds an interrupt point, it
does not alter which terms are produced.
```diff
bool has_next(unsigned cost) {
while (!m_done) {
+ if (m.limit().is_canceled())
+ return false;
if (has_child_at_cost(cost))
return true;
advance();
}
@@ find_next()
while (true) {
+ if (m.limit().is_canceled()) {
+ m_state = State::Done;
+ return nullptr;
+ }
switch (m_state) {
@@ enumerate_operators()
while (true) {
+ if (m.limit().is_canceled())
+ return nullptr;
```
## Validation
Built this branch in Release mode (base `6fd303c4b`) and ran the exact
snapshot-capture command:
```
$ z3 -T:20 inputs/issues/iss-6615/original.smt2
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
real 0m1.49s
```
- Output is **byte-identical** to the recorded
`inputs/issues/iss-6615/original.expected.out` oracle (13× `unknown`).
- The isolated first `check-sat` returns `unknown` in 0.14 s (previously
it did not terminate within 30 s under only the in-file `:timeout 100`).
- Trivial sanity check (`(assert (> x 0)) (check-sat)` → `sat`) is
unaffected.
Opened as a draft for human review.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
> Generated by [Fix a Z3 snapshot-regression
divergence](https://github.com/Z3Prover/bench/actions/runs/28155155541)
· 3.5K AIC · ⌖ 85.5 AIC · ⊞ 41.2K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Fix a Z3 snapshot-regression divergence,
engine: copilot, version: 1.0.60, model: claude-opus-4.8, id:
28155155541, workflow_id: snapshot-regression-fixer, run:
https://github.com/Z3Prover/bench/actions/runs/28155155541 -->
<!-- gh-aw-workflow-id: snapshot-regression-fixer -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/snapshot-regression-fixer
-->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Summary
Improves the Diophantine (`dio`) integer-feasibility controller in
`int_solver`, and fixes a latent bug where the user's Gomory-cut
configuration could be silently overridden at runtime. Also includes the
earlier `lia_w` work: randomized hammer gates, the `int_hammer_period` /
`random_hammers` parameters, and the linear `dio_calls_period` recovery.
## Motivation
The controller used a **single field** both as the static
`lp.dio_cuts_enable_gomory` parameter and as the live "is Gomory
running" flag. It started running Gomory (and the gcd test) once
`dio_calls_period` crossed a hard-coded `16`. Because `dio_calls_period`
is also driven by the randomized hammer gate, on instances where `dio`
is only intermittently productive the period could be ratcheted past 16
*by chance*, turning on Gomory + gcd and thrashing — e.g. `dillig/20-14`
went from a 100s solve (deterministic) to a 600s timeout (randomized)
purely from this spurious activation.
## Changes
- **Separate config from runtime state.** Split the shared field into
`m_dio_cuts_enable_gomory` (static config, never mutated) and
`m_run_gomory_with_dio` (runtime flag). Toggling the runtime state can
no longer clobber the user's `dio_cuts_enable_gomory` parameter.
- **Trigger on genuine dio failures, not the period proxy.** Running
Gomory-with-dio now starts after a count of **consecutive `undef` dio
returns** (reset on a dio conflict) rather than when the
randomization-inflated period crosses a threshold — robust to
`random_hammers` gate variance.
- **Parameterize the threshold.** New `lp.dio_gomory_enable_period`
(default 16). Set it very large to never auto-start Gomory, so Gomory
follows `dio_cuts_enable_gomory` only.
- **Try `dio` before Gomory** in `check()` so a productive dio conflict
preempts Gomory on dio-dominated instances.
## Evaluation (QF_LIA, full set, 600s, seed 555 paired)
- Dio-before-Gomory: **+33** problems across the 6 `random_hammers x
int_hammer_period` cells (5/6 cells improve).
- New trigger (`dio_gomory_enable_period=32`, random): **6417** vs the
period-16 baseline **6409**; no short-cutoff regression.
- Linear `dio_calls_period` recovery: keeping it on is worth ~+20 vs
off; `decrease=1` slightly ahead of the default 2.
Default behavior (`dio_gomory_enable_period=16`) is byte-for-byte
equivalent to the previous threshold logic.
## Notes
Debug-only tracing used during analysis (the `dio_calls_period_trace`
parameter plus per-hammer / period-evolution verbose output) is **not**
included.
---------
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Summary
Fixes a Z3 snapshot-regression divergence tracked in Z3Prover/bench
discussion
https://github.com/Z3Prover/bench/discussions/2662.
- **Benchmark:** `iss-5753/small-3.smt2` (Z3Prover/bench corpus,
original issue z3#5753)
- **Regression:** recorded oracle `unsat` (captured 2026-06-05) →
current `master` produces `timeout` at `-T:20`
- **bench workflow run:**
https://github.com/Z3Prover/bench/actions/runs/28078176175
### Divergence
```diff
--- small-3.expected.out (expected)
+++ produced (current z3)
@@ -1 +1 @@
-unsat
+timeout
```
The answer `unsat` is still correct — z3 simply can no longer prove it
within the
budget. This is a **performance regression**, not a wrong/unsound
result.
## Root cause
The benchmark is a quantified Array/Real/Int problem solved via MBQI.
The model
finder's quantifier analyzer (`src/smt/smt_model_finder.cpp`) now
creates a
`ho_var` qinfo for **array-sorted quantified variables used as terms**
(the new
higher-order term-enumeration path introduced in #9908, "Term
enumeration",
commit `5699142f5`). Before that commit such an occurrence set
`m_info->m_is_auf = false` (a safe fallback that left no extra
instantiation
hints), which is the behaviour that was active when the `unsat` oracle
was
captured.
`ho_var::populate_inst_sets` builds instantiation candidates with:
```cpp
for (auto t : tn.enum_terms(srt)) {
...
S->insert(t, generation);
}
```
`term_enumeration::enum_terms` is a **lazy, increasing-cost generator
with no
inherent bound** (see `ast/rewriter/term_enumeration.h`). For this
benchmark the
relevant array sort `(Array Int (Array Int Real))` has many array-valued
uninterpreted productions (`tptpummul`, `trans`, `inv`, ...), so
iterating the
generator to exhaustion inserts an explosively large number of terms
into the
instantiation set, blowing up MBQI instantiation and exhausting the 20s
budget.
Confirmed empirically: running the benchmark with `-v:3` shows
`ho_var::populate_inst_sets: 622 (Array Int (Array Int Real))`, i.e.
this exact
path fires for the divergent benchmark.
Notably, the original commit `5699142f5` declared `unsigned max_count =
20;` for
this very loop but **never applied it**, and that dead variable was
subsequently
removed — leaving the enumeration completely unbounded.
## Fix
Restore the intended bound so at most 20 of the cheapest enumerated
terms are
added to the instantiation set:
```cpp
unsigned max_count = 20;
for (auto t : tn.enum_terms(srt)) {
if (max_count == 0)
break;
--max_count;
...
S->insert(t, generation);
}
```
This is sound (instantiation only ever adds valid ground instances of
the
quantified formula) and is strictly more information than the pre-#9908
baseline
(which added no `ho_var` terms at all), so it does not regress problems
the
feature was meant to help. The change is confined to a single loop in
`src/smt/smt_model_finder.cpp`.
## Validation
Built the checkout in Release mode and re-ran the failing benchmark with
the
same options the snapshot harness uses:
```
$ build/z3 -T:20 inputs/issues/iss-5753/small-3.smt2
unsat # was: timeout
real 0m0.120s
```
The combined output now matches the recorded `small-3.expected.out`
oracle
**byte-for-byte** (`unsat\n`). A smoke run over a sample of other corpus
benchmarks showed no crashes or new divergences, and a basic
`(check-sat)/(get-model)` sanity check still works.
---
This PR is opened as a **draft** for human review by the automated
snapshot-regression fixer.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
> Generated by [Fix a Z3 snapshot-regression
divergence](https://github.com/Z3Prover/bench/actions/runs/28078343154)
· 2.8K AIC · ⌖ 150.7 AIC · ⊞ 39K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Fix a Z3 snapshot-regression divergence,
engine: copilot, version: 1.0.60, model: claude-opus-4.8, id:
28078343154, workflow_id: snapshot-regression-fixer, run:
https://github.com/Z3Prover/bench/actions/runs/28078343154 -->
<!-- gh-aw-workflow-id: snapshot-regression-fixer -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/snapshot-regression-fixer
-->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This is another PR towards the goal of getting Z3 to compile cleanly
when included via FetchContents into clang-tidy, which uses a pretty
strict set of warnings.
This PR adds
```
"-Wcast-qual"
```
to the set of warnings enabled in the build. This gives warnings like:
```
/Users/daviddetlefs/z3/src/ast/ast.cpp:2897:38: warning: cast from 'app *const *' to 'expr **' drops const qualifier [-Wcast-qual]
```
I fixed these by inserting consts. In some cases, a "const_cast<T>(...)"
was necessary.
This is an **automated workflow self-test** of the `fixer-selftest` /
`snapshot-regression-fixer` pipeline running on the self-hosted
**`rise-runner-1`** runner pool. Its sole purpose is to prove the
end-to-end agentic pipeline works: Copilot inference runs, and the
`create-pull-request` safe output can open a real **draft** pull request
on `Z3Prover/z3` using the configured PAT. It is intentionally
build-free.
### Change
- **File:** `src/ast/simplifiers/elim_unconstrained.cpp` (module-header
`/*++ ... */` comment)
- **Before:** ` - it does not accomodate side constraints.`
- **After:** ` - it does not accommodate side constraints.`
A plain spelling correction in a comment: `accomodate` → `accommodate`.
The diff is a single line.
### Why this is safe
The change is **comment-only** — it touches no code, identifiers, string
literals, build files, or tests, and therefore **cannot affect z3's
behaviour or output**. Because of that, no compilation or testing was
performed and **no rebuild is needed** to be confident it is correct.
### For maintainers
This is a genuine, correct fix, so you are welcome to **merge** it.
Equally, you may simply **close** it — the success of this self-test
does not depend on the PR being merged.
> Generated by [Self-test the agentic PR pipeline with a tiny z3 comment
fix](https://github.com/Z3Prover/bench/actions/runs/27987685681) · 299.1
AIC · ⌖ 53.6 AIC · ⊞ 35.5K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+fixer-selftest%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Self-test the agentic PR pipeline with a
tiny z3 comment fix, engine: copilot, version: 1.0.60, model:
claude-opus-4.8, id: 27987685681, workflow_id: fixer-selftest, run:
https://github.com/Z3Prover/bench/actions/runs/27987685681 -->
<!-- gh-aw-workflow-id: fixer-selftest -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/fixer-selftest -->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
## Automated workflow self-test
This is an **automated workflow self-test** of the `fixer-selftest` /
`snapshot-regression-fixer` pipeline running on the self-hosted
**`rise-runner-1`** pool. Its sole purpose is to verify the end-to-end
agentic pipeline: that Copilot inference runs and that the
`create-pull-request` safe output can open a real **draft** PR on
`Z3Prover/z3` using the configured PAT.
## The fix
A single, objectively-correct spelling fix in a **code comment** (the
header comment block of the file — not code, identifiers, or string
literals).
- **File:** `src/ast/simplifiers/solve_eqs.cpp` (line 23, inside the
`/*++ ... --*/` header comment)
- **Before:** `... where bitset reprsents set of free variables.`
- **After:** `... where bitset represents set of free variables.`
The diff is a single line:
```diff
-1. maintain map FV: term -> bit-set where bitset reprsents set of free variables. Assume the number of variables is bounded.
+1. maintain map FV: term -> bit-set where bitset represents set of free variables. Assume the number of variables is bounded.
```
## Why this is safe
The change is **comment-only** and therefore **cannot affect z3's
behaviour or output** — so it needs **no rebuild and no testing** to be
confident it is correct. Nothing outside the comment text was touched
(no code, no string literals, no identifiers, no whitespace elsewhere).
## For maintainers
This is a genuine, correct fix, so feel free to **merge** it — but you
may also simply **close** it. The self-test's success does not depend on
this PR being merged; it only depends on the PR having been opened.
> Generated by [Self-test the agentic PR pipeline with a tiny z3 comment
fix](https://github.com/Z3Prover/bench/actions/runs/27986232656) · 208.5
AIC · ⌖ 75.1 AIC · ⊞ 35.5K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+fixer-selftest%22&type=pullrequests)
<!-- gh-aw-agentic-workflow: Self-test the agentic PR pipeline with a
tiny z3 comment fix, engine: copilot, version: 1.0.60, model:
claude-opus-4.8, id: 27986232656, workflow_id: fixer-selftest, run:
https://github.com/Z3Prover/bench/actions/runs/27986232656 -->
<!-- gh-aw-workflow-id: fixer-selftest -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/fixer-selftest -->
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The `build` job in the Pyodide workflow was failing at wheel smoke-test
import time because `libz3.so` was not produced as a proper wasm side
module (`need the dylink section to be first`, missing exported
symbols). This PR restores the required Pyodide linker mode.
- **Root cause**
- Recent Pyodide linker flag changes enabled wasm EH/longjmp but omitted
`-sSIDE_MODULE=1`, so the generated `libz3.so` was not loadable by
Pyodide’s dynamic loader.
- **Changes**
- **`src/api/python/pyproject.toml`**
- Added `-sSIDE_MODULE=1` to `[tool.pyodide.build].ldflags`.
- **`src/api/python/setup.py`**
- Added `-sSIDE_MODULE=1` to the Pyodide `LDFLAGS` path to keep direct
`setup.py`-driven builds aligned with `pyproject.toml` behavior.
- **Flag delta (core fix)**
```toml
# src/api/python/pyproject.toml
[tool.pyodide.build]
ldflags = "-fwasm-exceptions -sSUPPORT_LONGJMP=wasm -sWASM_BIGINT -sSIDE_MODULE=1"
```
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Z3 incorrectly returns SAT for formulas like `store(store(const(x), i0,
e0), i1, e1) = store(store(const(y), i0, e0), i1, e1) ∧ x ≠ y` when the
index sort has a small finite domain (e.g. `(_ BitVec 2)` = 4 elements).
The quantifier-based encoding of the same constraint correctly returns
UNSAT.
## Changes
### `src/sat/smt/array_solver.cpp` — `add_parent_lambda()`
When a store is added to an array's `m_parent_lambdas` after that array
already has `m_has_default = true` (i.e., a `default(array)` term
already exists in the e-graph), the default axiom for the new store was
silently dropped. This breaks the propagation chain:
```
default(const(x)) = x
↓ [via store default axiom]
default(store(const(x), i, v)) = x
↓ [via store default axiom]
default(store(store(const(x), ...), ...)) = x
↓ [EUF congruence from store equality]
x = y → contradiction
```
Fix: push `default_axiom(lambda)` in `add_parent_lambda` when
`m_has_default` is already set, so late-arriving stores still get their
default axioms instantiated.
```cpp
void solver::add_parent_lambda(theory_var v_child, euf::enode* lambda) {
auto& d = get_var_data(find(v_child));
ctx.push_vec(d.m_parent_lambdas, lambda);
if (should_prop_upward(d))
propagate_select_axioms(d, lambda);
if (d.m_has_default) // new: fire default axiom retroactively
push_axiom(default_axiom(lambda));
}
```
### Known remaining gap
`mk_eq_core` in `array_rewriter.cpp` also has a related issue: the
unconditional store-chain expansion fires only when `domain_size >
num_lhs + num_rhs` (line 893), but the correct threshold is `domain_size
> max(num_lhs, num_rhs)`. With 4-element domain and 2 stores per side,
`2+2 = 4` equals the domain size so the expansion is skipped. The
variant gated by `m_expand_store_eq` already uses `max()` correctly
(line 911); the unconditional path needs the same fix.
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Nikolaj Bjorner <nbjorner@microsoft.com>
## Summary
Fix a use-after-free in `func_interp::compress()`.
When a function interpretation had previously grown large enough to
allocate `m_entry_table`, `compress()` could deallocate entries whose
result matched the else-case but leave the hash table intact. Later
`get_entry()` lookups could then return freed `func_entry*` values,
which showed up during model checking as a corrupted expression result
from `model_evaluator`.
## Root cause
`func_interp::compress()` compacted `m_entries` and freed removed
entries, but it did not rebuild or clear `m_entry_table`.
This left stale pointers in the lookup table whenever:
- the table had already been allocated on a larger interpretation, and
- compression removed some entries.
In the reported case, model evaluation rewrote `stack_s!1041` through
`BR_REWRITE1`, fetched a freed `func_entry` result from the stale table,
and then tripped an assertion in `expr::get_sort()` during quantifier
model checking.
## Fix
After compression removes entries, rebuild `m_entry_table` from the
surviving `m_entries`, or clear it when the surviving interpretation is
small.
## Regression coverage
Added a unit regression in `src/test/model_evaluator.cpp` that:
- creates a `func_interp` large enough to allocate `m_entry_table`,
- compresses away almost all entries,
- checks that removed keys no longer resolve, and
- checks that the surviving key still resolves to the correct result.
## Validation
- `../build/z3 ebso-115.smt2` previously hit an assertion in
`rewriter_def.h` / `ast.cpp`; after the fix it no longer asserts.
- `./test-z3 model_evaluator` passes with the new regression.
## Reproducer
I did not produce a smaller SMT2 benchmark in this change. The original
reproducer I used was `ebso-115.smt2`, and the new unit regression
directly exercises the stale-entry-table path in-process.
Co-authored-by: Can Cebeci <t-cancebeci@microsoft.com>
This is another PR towards the goal of getting Z3 to compile cleanly
when included via FetchContents into clang-tidy, which uses a pretty
strict set of warnings.
The PR adds
```
"-Wsuggest-override"
"-Winconsistent-missing-override"
```
to the CLANG_ONLY_WARNINGS. This exposes a relatively small number of places where method overrides did not use the "override" keyword. The PR fixes those.
(In cmd_util.h, I also made the *_CMD macros be uniform in not ending the class they define with a semicolon; the invocation of the macro can add the semicolon.)
## Summary
z3 already builds under Pyodide (there is a `pyodide.yml` workflow and
an
`IS_PYODIDE` path in `setup.py`), but that path uses `pyodide build` and
produces
a wheel tagged `emscripten_<pyodide-version>_wasm32`, which is pinned to
a single
Pyodide release and rejected by PyPI — so today it's only usable as a CI
artifact.
[PEP 783](https://peps.python.org/pep-0783/) introduced the portable
`pyemscripten_<date>_wasm32` platform tag that **PyPI accepts** and
`micropip` can
install at runtime. This makes `z3-solver` build that wheel via
`cibuildwheel --platform pyodide`.
## Changes
- **`setup.py`** — for the emscripten target, use the wheel platform tag
that
pyodide-build provides verbatim via `_PYTHON_HOST_PLATFORM` (e.g.
`pyemscripten_2026_0_wasm32`) instead of reconstructing an
`emscripten_*` tag.
Falls back to the previous behaviour when that env var is absent.
- **`setup.py` / `CMakeLists.txt` / `pyproject.toml`** — switch the
Pyodide build
from JS-based exceptions (`-fexceptions`,
`-sDISABLE_EXCEPTION_CATCHING=0`) to
**native wasm exception handling** (`-fwasm-exceptions
-sSUPPORT_LONGJMP=wasm`),
matching the ABI of the Pyodide 314 / emscripten 5 main module. With the
old
flags `libz3.so` imports `invoke_*` trampolines the runtime no longer
provides,
so the wheel builds but the first `Z3_mk_config` call fails at runtime
with
`Dynamic linking error: cannot resolve symbol invoke_vi`.
- **`pyproject.toml`** — add `[tool.cibuildwheel]` /
`[tool.pyodide.build]` so
`cibuildwheel --platform pyodide` builds and tests the wheel (`cp314`).
- **`.github/workflows/pyodide-pypi.yml`** (new) — build with
cibuildwheel and
publish to PyPI (trusted publishing) on tags. Existing `pyodide.yml`
unchanged.
## Verification
Built with `cibuildwheel 4.1.0` / `pyodide-build 0.35.0` / `emscripten
5.0.3`,
CPython 3.14 / Pyodide 314:
- Produces `z3_solver-4.17.0.0-py3-none-pyemscripten_2026_0_wasm32.whl`.
- `z3test.py` passes in the Pyodide runtime (node + wasm32).
- Installed via `micropip` and solves SMT problems both under node and
in a
browser (`sat` with a model, `unsat` for a contradiction).
🤖 Generated with [Claude Code](https://claude.com/claude-code)
🕵️♂️ Reviewed by [Alcides Fonseca](https://wiki.alcidesfonseca.com)
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This is another PR towards the goal of getting a clean build with clang,
using the compiler options used in building clang-tidy.
In https://github.com/Z3Prover/z3/pull/9800, we changed the build flags
to eliminate a warning for zero-length arrays. In that PR, I missed this
one: there was one instance of m_arr[] instead of m_arr[0]. In the
clang-tidy build, that gives warnings like:
```
/Users/daviddetlefs/llvm-project/build_dbg/_deps/z3-src/src/model/func_interp.h:43:12: warning: flexible array members are a C99 feature [-Wc99-extensions]
```
The PR fixes this, making it a zero-length array, the idiom used in all
the other similar cases. I also added the compiler flag that produced
this warning, so we can notice such changes in the future.