3
0
Fork 0
mirror of https://github.com/Z3Prover/z3 synced 2026-06-27 19:08:49 +00:00

[snapshot-regression-fix] Honor cancellation/timeout in bottom-up term enumeration (MBQI) (#9956)

Fixes a Z3 snapshot-regression divergence reported in `Z3Prover/bench`
discussion: https://github.com/Z3Prover/bench/discussions/2667

## Divergence

- **benchmark:** `iss-6615/original.smt2` (lives at
`inputs/issues/iss-6615/` in `Z3Prover/bench`)
- **kind:** `diff`
- **z3 under test:** `z3-4.17.0-x64-glibc-2.39` (Nightly)
- **budget:** per-file `20s` — the snapshot capture runs `z3 -T:20
original.smt2`

The recorded oracle is 13× `unknown` (one per `check-sat`, each preceded
by an in-file `(set-option :timeout 100)` soft timeout). Current z3
instead prints a single `timeout`:

```diff
--- original.expected.out (expected)
+++ produced (current z3)
@@ -1,13 +1 @@
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
-unknown
+timeout
```

## Root cause

The benchmark uses `(set-logic ALL)` with quantifiers over higher-order
(array / lambda) sorts, so MBQI drives `ho_var::populate_inst_sets`
(`src/smt/smt_model_finder.cpp`), which enumerates candidate ground
terms with the bottom-up term-enumeration engine added in #9908
(`src/ast/rewriter/term_enumeration.cpp`):

```cpp
unsigned max_count = 20;
for (auto t : tn.enum_terms(srt)) {   // each ++ runs find_next()
    if (max_count == 0)
        break;
    --max_count;
    S->insert(t, generation);
}
```

`max_count = 20` bounds the number of **inserted** terms, but it does
**not** bound the work the generator performs to find the *next*
target-sort term. For sorts that admit few cheap target-sort terms but a
large intermediate term space (here `(Array enc_val Int)` and `(Array
String (option enc_val))`), a single advance of the iterator can explore
an explosive number of intermediate terms, each rewritten through
`th_rewriter`.

Crucially, the three driving loops of the engine —
`bottom_up_enumerator::find_next`,
`bottom_up_enumerator::enumerate_operators`, and
`children_iterator::has_next` — never check the resource limit /
cancellation flag. The per-query soft timeout (`:timeout 100`) *does*
fire and cancels `m.limit()` (via `cmd_context`'s `cancel_eh<reslimit>`
+ `scoped_timer`), but the enumeration never observes it, so the query
cannot be interrupted at 100 ms. It spins until the hard *process*
timeout `-T:20` fires, which prints `timeout` for the whole run and
aborts — instead of the solver returning `unknown` per query.

## Fix

Make the enumeration honor cancellation by checking
`m.limit().is_canceled()` at the head of each of the three unbounded
loops in `src/ast/rewriter/term_enumeration.cpp`. When a query is
cancelled (soft timeout / rlimit / Ctrl-C) the enumeration stops
promptly and the solver returns `unknown`, as it did before #9908. When
nothing is cancelled `is_canceled()` is `false`, so the set of
enumerated terms is unchanged — this only adds an interrupt point, it
does not alter which terms are produced.

```diff
     bool has_next(unsigned cost) {
         while (!m_done) {
+            if (m.limit().is_canceled())
+                return false;
             if (has_child_at_cost(cost))
                 return true;
             advance();
         }
@@ find_next()
         while (true) {
+            if (m.limit().is_canceled()) {
+                m_state = State::Done;
+                return nullptr;
+            }
             switch (m_state) {
@@ enumerate_operators()
         while (true) {
+            if (m.limit().is_canceled())
+                return nullptr;
```

## Validation

Built this branch in Release mode (base `6fd303c4b`) and ran the exact
snapshot-capture command:

```
$ z3 -T:20 inputs/issues/iss-6615/original.smt2
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
unknown
real 0m1.49s
```

- Output is **byte-identical** to the recorded
`inputs/issues/iss-6615/original.expected.out` oracle (13× `unknown`).
- The isolated first `check-sat` returns `unknown` in 0.14 s (previously
it did not terminate within 30 s under only the in-file `:timeout 100`).
- Trivial sanity check (`(assert (> x 0)) (check-sat)` → `sat`) is
unaffected.

Opened as a draft for human review.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>




> Generated by [Fix a Z3 snapshot-regression
divergence](https://github.com/Z3Prover/bench/actions/runs/28155155541)
· 3.5K AIC · ⌖ 85.5 AIC · ⊞ 41.2K ·
[◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests)

<!-- gh-aw-agentic-workflow: Fix a Z3 snapshot-regression divergence,
engine: copilot, version: 1.0.60, model: claude-opus-4.8, id:
28155155541, workflow_id: snapshot-regression-fixer, run:
https://github.com/Z3Prover/bench/actions/runs/28155155541 -->

<!-- gh-aw-workflow-id: snapshot-regression-fixer -->
<!-- gh-aw-workflow-call-id: Z3Prover/bench/snapshot-regression-fixer
-->

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This commit is contained in:
Lev Nachmanson 2026-06-25 20:36:06 -07:00 committed by GitHub
parent dcd2c70042
commit e76239ceda
No known key found for this signature in database
GPG key ID: B5690EEEBB952194

View file

@ -255,6 +255,8 @@ public:
bool has_next(unsigned cost) {
while (!m_done) {
if (m.limit().is_canceled())
return false;
if (has_child_at_cost(cost))
return true;
advance();
@ -384,6 +386,10 @@ private:
expr* find_next() {
while (true) {
if (m.limit().is_canceled()) {
m_state = State::Done;
return nullptr;
}
switch (m_state) {
case State::Leaves:
while (m_leaf_idx < m_grammar.leaves().size()) {
@ -438,6 +444,8 @@ private:
expr *enumerate_operators() {
auto const &ops = m_grammar.operators();
while (true) {
if (m.limit().is_canceled())
return nullptr;
// first find terms at m_cost that were already created
if (m_bank_idx < m_bank_size) {