3
0
Fork 0
mirror of https://github.com/Z3Prover/z3 synced 2026-06-29 11:58:51 +00:00

[snapshot-regression-fix] Fix psmt infinite loop on theory-incomplete cubes (#3044) (#9983)

## Summary

Fixes a `psmt` (parallel SMT tactic) regression where the solver hangs
to a wall-clock timeout instead of returning `unknown` on formulas whose
root cube is genuinely undetermined by an incomplete theory.

- **Originating discussion:**
https://github.com/Z3Prover/bench/discussions/2735
- **Benchmark:** `iss-3044/bug-1.smt2` (from [Z3 issue
#3044](https://github.com/Z3Prover/z3/issues/3044))

```smt2
(declare-fun a (Int) Bool)
(declare-fun b (Int) Bool)
(assert (distinct a b))
(check-sat-using psmt)
```

## Divergence

The recorded oracle (expected) vs. current z3 (combined stdout+stderr,
`-T:20`):

```diff
-(incomplete (theory array))
-unknown
+timeout
```

## Root cause

The rewritten parallel tactic (`src/solver/parallel_tactical.cpp`,
introduced in #9824/#9825) hangs on this input.

In the worker `run()` loop, every `l_undef` cube result was treated as
if the per-cube **conflict limit** had been reached: the worker
escalated the per-thread conflict budget (`update_max_thread_conflicts`)
and re-checked / re-split the same cube. When the `l_undef` actually
comes from **theory incompleteness** (here, the array theory cannot
decide `(distinct a b)` over `Int -> Bool`) rather than the conflict
limit, the verdict never changes, so the worker re-checks the same cube
forever.

Compounding this, the `batch_manager` state machine had **no terminal
`unknown` state** — the only way to finish was for some worker to prove
`sat`/`unsat`, which is impossible for a root-level theory-incomplete
formula. The combination produced an infinite loop and a wall-clock
timeout.

The pre-rewrite parallel tactic avoided this: its `giveup()` detected
reasons starting with `(incomplete` / `(sat.giveup`, reported a soft
undef, and echoed the reason to `verbose_stream()`.

## Fix

All changes are confined to `src/solver/parallel_tactical.cpp` (47
insertions, 4 deletions):

1. **Distinguish genuine incompleteness from conflict-limit
exhaustion.** In the worker `l_undef` case, only `reason_unknown() ==
"max-conflicts-reached"` benefits from escalating the budget /
splitting. For any other reason (incomplete theory, quantifiers,
lambdas, resource limits, ...) re-checking is futile, so the worker
records a sound `unknown` and stops working the branch.
2. **Add a terminal `is_unknown` batch-manager state** (`set_unknown`,
`get_result() -> l_undef`, reason storage). It is a *soft* result: it
does not cancel the other workers, and a definitive `sat`/`unsat`
verdict from another branch may still override it (the
`set_sat`/`set_unsat` guards now permit overriding `is_unknown`). All
`set_unsat` call sites are global formula-unsat (core ⊆ assumptions, or
independent of the tested backbone literal), so the override is sound;
tree-closure unsat remains guarded by `is_running` and cannot fire
because the undef leaf stays open.
3. **Restore the reason output.** The captured `reason_unknown` is
propagated to the result goal and echoed to `verbose_stream()`,
reproducing the `(incomplete (theory array))` line that the sequential
path / old parallel tactic emitted.

## Validation

Rebuilt the `./z3` checkout (`./configure && make -C build -j16`) and
re-ran the benchmark with the freshly built binary using the same
options the snapshot capture uses (`-T:20`, combined stdout+stderr):

```
$ z3 inputs/issues/iss-3044/bug-1.smt2 -T:20
(incomplete (theory array))
unknown
```

This matches the recorded `bug-1.expected.out` oracle **byte-for-byte**,
and the benchmark now completes in ~0.5s (was: timeout). Verified stable
across 8 consecutive runs. Basic `psmt` `sat`/`unsat` checks continue to
produce correct results.

Opened as a **draft** for human review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>




> Generated by [Fix a Z3 snapshot-regression
divergence](https://github.com/Z3Prover/bench/actions/runs/28313246856)
· 5.7K AIC · ⌖ 85.8 AIC · ⊞ 41.2K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests)

<!-- gh-aw-agentic-workflow: Fix a Z3 snapshot-regression divergence,
engine: copilot, version: 1.0.60, model: claude-opus-4.8, id:
28313246856, workflow_id: snapshot-regression-fixer, run:
https://github.com/Z3Prover/bench/actions/runs/28313246856 -->

<!-- gh-aw-workflow-id: snapshot-regression-fixer -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/snapshot-regression-fixer
-->

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
Lev Nachmanson 2026-06-28 10:20:32 -07:00 committed by GitHub
parent 56bb49f8dc
commit e87aaa6924
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -133,6 +133,7 @@ class parallel_solver {
is_running,
is_sat,
is_unsat,
is_unknown,
is_exception_msg,
is_exception_code
};
@ -187,6 +188,7 @@ class parallel_solver {
unsigned m_exception_code = 0;
std::string m_exception_msg;
std::string m_reason_unknown;
model_ref m_model;
expr_ref_vector m_unsat_core;
std::atomic<bool> m_canceled = false;
@ -485,13 +487,14 @@ class parallel_solver {
m_core_min_jobs.reset();
m_model = nullptr;
m_unsat_core.reset();
m_reason_unknown.clear();
m_canceled = false;
}
void set_sat(ast_translation& l2g, model& mdl) {
std::scoped_lock lock(mux);
IF_VERBOSE(1, verbose_stream() << "Batch manager setting SAT.\n");
if (m_state != state::is_running) return;
if (m_state != state::is_running && m_state != state::is_unknown) return;
m_state = state::is_sat;
m_model = mdl.translate(l2g);
cancel_background_threads();
@ -506,14 +509,27 @@ class parallel_solver {
expr_ref_vector const& core) {
std::scoped_lock lock(mux);
IF_VERBOSE(1, verbose_stream() << "Batch manager setting UNSAT.\n");
if (m_state != state::is_running) return;
if (m_state != state::is_running && m_state != state::is_unknown) return;
m_state = state::is_unsat;
SASSERT(m_unsat_core.empty());
m_unsat_core.reset();
for (expr* c : core)
m_unsat_core.push_back(l2g(c));
cancel_background_threads();
}
// A worker found a cube whose result is genuinely undetermined (e.g. the
// theory solver is incomplete) and that cannot be refined any further.
// Record a sound 'unknown' verdict. This is a soft result: it does not
// cancel the remaining workers, so a definitive sat/unsat verdict from
// another branch may still arrive and override it.
void set_unknown(std::string const& reason) {
std::scoped_lock lock(mux);
IF_VERBOSE(1, verbose_stream() << "Batch manager setting UNKNOWN: " << reason << ".\n");
if (m_state != state::is_running) return;
m_state = state::is_unknown;
m_reason_unknown = reason;
}
void set_exception(std::string const& msg) {
std::scoped_lock lock(mux);
IF_VERBOSE(1, verbose_stream() << "Batch manager setting exception msg: " << msg << ".\n");
@ -941,6 +957,7 @@ class parallel_solver {
throw default_exception("par2: inconsistent end state");
case state::is_sat: return l_true;
case state::is_unsat: return l_false;
case state::is_unknown: return l_undef;
case state::is_exception_msg:
throw default_exception(m_exception_msg.c_str());
case state::is_exception_code:
@ -951,6 +968,8 @@ class parallel_solver {
}
}
std::string const& get_reason_unknown() const { return m_reason_unknown; }
model_ref& get_model() { return m_model; }
expr_ref_vector const& get_unsat_core() const { return m_unsat_core; }
@ -1215,8 +1234,21 @@ class parallel_solver {
switch (r) {
case l_undef: {
update_max_thread_conflicts();
LOG_WORKER(1, " found undef cube\n");
// Escalating the conflict budget (update_max_thread_conflicts)
// or splitting the cube only helps when the cube was abandoned
// because the per-cube conflict limit was reached. For any other
// source of incompleteness (an incomplete theory, quantifiers,
// lambdas, resource limits, ...) the verdict will not change, so
// re-checking the same cube would spin forever. Record a sound
// 'unknown' verdict and stop working this branch instead.
std::string reason = s->reason_unknown();
if (reason != "max-conflicts-reached") {
LOG_WORKER(1, " undef cube is not conflict-limited (" << reason << "); reporting unknown\n");
b.set_unknown(reason);
return;
}
update_max_thread_conflicts();
if (m_config.m_max_cube_depth <= cube.size())
goto check_cube_start;
expr_ref atom = get_split_atom(lease, cube);
@ -2046,6 +2078,10 @@ public:
st.copy(m_stats);
}
std::string reason_unknown() const {
return m_batch_manager.get_reason_unknown();
}
void reset_statistics() {
m_stats.reset();
}
@ -2128,6 +2164,13 @@ public:
case l_undef:
if (!m.inc())
throw tactic_exception(Z3_CANCELED_MSG);
{
std::string reason = ps.reason_unknown();
if (!reason.empty()) {
g->set_reason_unknown(reason);
IF_VERBOSE(0, verbose_stream() << reason << "\n");
}
}
break;
}