3
0
Fork 0
mirror of https://github.com/Z3Prover/z3 synced 2026-06-25 18:10:35 +00:00

[snapshot-regression-fix] Bound ho_var term enumeration to fix MBQI timeout regression (iss-5753/small-3.smt2) (#9939)

## Summary

Fixes a Z3 snapshot-regression divergence tracked in Z3Prover/bench
discussion
https://github.com/Z3Prover/bench/discussions/2662.

- **Benchmark:** `iss-5753/small-3.smt2` (Z3Prover/bench corpus,
original issue z3#5753)
- **Regression:** recorded oracle `unsat` (captured 2026-06-05) →
current `master` produces `timeout` at `-T:20`
- **bench workflow run:**
https://github.com/Z3Prover/bench/actions/runs/28078176175

### Divergence

```diff
--- small-3.expected.out (expected)
+++ produced (current z3)
@@ -1 +1 @@
-unsat
+timeout
```

The answer `unsat` is still correct — z3 simply can no longer prove it
within the
budget. This is a **performance regression**, not a wrong/unsound
result.

## Root cause

The benchmark is a quantified Array/Real/Int problem solved via MBQI.
The model
finder's quantifier analyzer (`src/smt/smt_model_finder.cpp`) now
creates a
`ho_var` qinfo for **array-sorted quantified variables used as terms**
(the new
higher-order term-enumeration path introduced in #9908, "Term
enumeration",
commit `5699142f5`). Before that commit such an occurrence set
`m_info->m_is_auf = false` (a safe fallback that left no extra
instantiation
hints), which is the behaviour that was active when the `unsat` oracle
was
captured.

`ho_var::populate_inst_sets` builds instantiation candidates with:

```cpp
for (auto t : tn.enum_terms(srt)) {
    ...
    S->insert(t, generation);
}
```

`term_enumeration::enum_terms` is a **lazy, increasing-cost generator
with no
inherent bound** (see `ast/rewriter/term_enumeration.h`). For this
benchmark the
relevant array sort `(Array Int (Array Int Real))` has many array-valued
uninterpreted productions (`tptpummul`, `trans`, `inv`, ...), so
iterating the
generator to exhaustion inserts an explosively large number of terms
into the
instantiation set, blowing up MBQI instantiation and exhausting the 20s
budget.

Confirmed empirically: running the benchmark with `-v:3` shows
`ho_var::populate_inst_sets: 622 (Array Int (Array Int Real))`, i.e.
this exact
path fires for the divergent benchmark.

Notably, the original commit `5699142f5` declared `unsigned max_count =
20;` for
this very loop but **never applied it**, and that dead variable was
subsequently
removed — leaving the enumeration completely unbounded.

## Fix

Restore the intended bound so at most 20 of the cheapest enumerated
terms are
added to the instantiation set:

```cpp
unsigned max_count = 20;
for (auto t : tn.enum_terms(srt)) {
    if (max_count == 0)
        break;
    --max_count;
    ...
    S->insert(t, generation);
}
```

This is sound (instantiation only ever adds valid ground instances of
the
quantified formula) and is strictly more information than the pre-#9908
baseline
(which added no `ho_var` terms at all), so it does not regress problems
the
feature was meant to help. The change is confined to a single loop in
`src/smt/smt_model_finder.cpp`.

## Validation

Built the checkout in Release mode and re-ran the failing benchmark with
the
same options the snapshot harness uses:

```
$ build/z3 -T:20 inputs/issues/iss-5753/small-3.smt2
unsat        # was: timeout

real    0m0.120s
```

The combined output now matches the recorded `small-3.expected.out`
oracle
**byte-for-byte** (`unsat\n`). A smoke run over a sample of other corpus
benchmarks showed no crashes or new divergences, and a basic
`(check-sat)/(get-model)` sanity check still works.

---

This PR is opened as a **draft** for human review by the automated
snapshot-regression fixer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>




> Generated by [Fix a Z3 snapshot-regression
divergence](https://github.com/Z3Prover/bench/actions/runs/28078343154)
· 2.8K AIC · ⌖ 150.7 AIC · ⊞ 39K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests)

<!-- gh-aw-agentic-workflow: Fix a Z3 snapshot-regression divergence,
engine: copilot, version: 1.0.60, model: claude-opus-4.8, id:
28078343154, workflow_id: snapshot-regression-fixer, run:
https://github.com/Z3Prover/bench/actions/runs/28078343154 -->

<!-- gh-aw-workflow-id: snapshot-regression-fixer -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/snapshot-regression-fixer
-->

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
Lev Nachmanson 2026-06-24 10:58:32 -07:00 committed by GitHub
parent 500ca46434
commit d1170d19bd
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -1435,7 +1435,11 @@ namespace smt {
}
}
unsigned max_count = 20;
for (auto t : tn.enum_terms(srt)) {
if (max_count == 0)
break;
--max_count;
unsigned generation = 0; // todo - inherited from sub-term of t?
TRACE(model_finder, tout << "ho_var: adding term " << mk_ismt2_pp(t, m)
<< " to instantiation set of S" << std::endl;);