mirrors/z3 - Libre-Chip.org

mirrors/z3

mirror of https://github.com/Z3Prover/z3 synced 2026-07-25 16:32:35 +00:00

Author	SHA1	Message	Date
Nikolaj Bjorner	70df91bc4e	fix #10039 and #10032	2026-07-04 12:43:08 -07:00
Nikolaj Bjorner	3d29d81607	Fix TPTP polymorphism crashes in final-check and model checking Root-caused and fixed 261 debug-assertion crashes found by running Z3 across the TPTP benchmarks (-tptp -T:5 model_validate=true): 1. theory_polymorphism::final_check_eh returned FC_DONE after assigning the negation of its (already-true) theory assumption, which creates a conflict. Returning FC_DONE reported l_true while the context was inconsistent, tripping SASSERT(status != l_true || !inconsistent()) in context::restart. Return FC_CONTINUE so conflict resolution turns it into l_false and the normal research loop runs. 2. model_evaluator::get_macro, polymorphic branch: def = subst(def) assigned an expr_ref temporary to a raw expr*&; the temporary freed the freshly substituted term, leaving def dangling (use-after-free during model evaluation). Pin the substituted def in m_pinned, as the as-array path already does. 3. smt_model_checker::add_instance: relax stale SASSERT(!m.is_model_value(sk_term)); get_inv may legitimately return a model value in polymorphic settings, already handled downstream by get_type_compatible_term. Unit tests: 92 passed, 0 failed. All 261 assertion crashes resolved; the 3 remaining files are controlled ERR_PARSER (exit 103) rejections. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-07-03 20:32:04 -07:00
Nikolaj Bjorner	348bb3b6a4	Fix memory leaks in polymorphism instantiation engine The polymorphism theory routed polymorphic (\) problems through theory_polymorphism, which instantiated axioms during search. Two leaks: 1. In inst::instantiate, insert_ref_map was constructed with an expr_ref argument, so its template parameter D deduced to expr_ref instead of expr. Trail objects are region-allocated and freed without running destructors, so the embedded expr_ref never released its reference, leaking one AST subtree per instantiation. Pass e_inst.get() so D is expr, matching the raw hashtable + manual inc_ref/dec_ref pattern. 2. trail_stack's destructor does not call reset(), so level-0 trail items (including the inc_ref balancing entries for m_from_instantiation) were never undone when the theory was destroyed. Added a ~theory_polymorphism destructor that calls m_trail.reset(). Also keeps a defensive alias check in util::unify and a fresh per-iteration substitution in inst::instantiate. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-07-03 16:34:57 -07:00
Nikolaj Bjorner	6d5e09e2fa	polymorphism: prevent cyclic substitutions in unify to fix stack overflow When merging two type substitutions, util::unify(substitution, substitution, substitution) inserted bindings without an occurs-check. Merging maps such as A |-> list(B) and B |-> list(A) produced a self-referential binding B |-> list(list(B)), and applying that substitution recursed forever, causing a stack overflow during the first polymorphic instantiation round. This was exposed by encoding TPTP $tType quantification as polymorphism (8ee8a3cda): mutually-recursive polymorphic types in THF problems (e.g. COM/DAT/ITP Coq-derived files) triggered 60 stack-overflow crashes during check_sat. Add occurs-checks so a binding that would make the substitution cyclic causes the merge to fail (the instantiation is soundly skipped). Values are resolved against the current substitution before insertion, preserving the acyclic invariant. Verified: the 60 previously-crashing TPTP files now terminate cleanly; 92/92 unit tests pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-07-03 16:34:57 -07:00
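The occurs-check described in 6d5e09e2fa can be illustrated with a minimal sketch. It is not Z3's `util::unify`: the `Type`, `Subst`, `resolve`, `occurs` and `bind` names are hypothetical, and the sketch assumes the variable being bound is not already in the substitution.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical miniature type term: a type variable, or a constructor applied to arguments.
struct Type {
    std::string name;
    bool is_var;
    std::vector<std::shared_ptr<Type>> args;
};
using TypeP = std::shared_ptr<Type>;
using Subst = std::map<std::string, TypeP>;

TypeP var(const std::string& n) { return std::make_shared<Type>(Type{n, true, {}}); }
TypeP app(const std::string& n, std::vector<TypeP> a) {
    return std::make_shared<Type>(Type{n, false, std::move(a)});
}

// Apply s to t, following bindings recursively. This terminates only if s is acyclic.
TypeP resolve(const Subst& s, const TypeP& t) {
    if (t->is_var) {
        auto it = s.find(t->name);
        return it == s.end() ? t : resolve(s, it->second);
    }
    std::vector<TypeP> args;
    for (auto const& a : t->args) args.push_back(resolve(s, a));
    return app(t->name, std::move(args));
}

// Does variable v occur anywhere inside t?
bool occurs(const std::string& v, const TypeP& t) {
    if (t->is_var) return t->name == v;
    for (auto const& a : t->args)
        if (occurs(v, a)) return true;
    return false;
}

// Add v |-> t to s. Assumes v is currently unbound in s.
// Returns false (and leaves s unchanged) if the binding would make s cyclic.
bool bind(Subst& s, const std::string& v, const TypeP& t) {
    TypeP r = resolve(s, t);                      // resolve against s before inserting
    if (r->is_var && r->name == v) return true;   // trivial v |-> v, nothing to record
    if (occurs(v, r)) return false;               // occurs-check: reject cyclic binding
    s[v] = r;
    return true;
}
```

With `A |-> list(B)` already recorded, `bind(s, "B", list(A))` resolves the value to `list(list(B))`, finds `B` inside it, and fails instead of storing a self-referential binding.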
Nikolaj Bjorner	5bd485eb03	TPTP: encode $tType quantification as polymorphism; guard dependent types Fixes soundness/completeness of the TPTP frontend for polymorphic (TF1/TH1) problems, reducing TPTP-v9.2.1 BUG verdicts from 28 to 4. * Treat regular-forall "! [A: $tType] : ..." as genuine type quantification (bind A via mk_type_var) instead of monomorphizing it to the universe sort. This is the standard THF/TH1 way to quantify over types, and monomorphizing it silently prevented theory_polymorphism from instantiating the axioms. * Use the plain smt solver (mk_smt_solver_factory) for problems that contain type variables. The strategic solver's tactic preprocessing eliminates the unconstrained conjecture instance before the core can link it to its polymorphic axiom, yielding a spurious CounterSatisfiable. * Detect value-indexed dependent type families ("T > $tType"), which the parametric-polymorphism encoding cannot represent soundly, and downgrade both sat and unsat verdicts to GaveUp (previously produced unsound Theorems, e.g. SEV600/601/602). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-07-03 16:34:56 -07:00
Nikolaj Bjorner	6b7725dcb8	Fix use-after-free in polymorphism substitution over sorts In polymorphism::substitution::operator()(sort*), each substituted sub-sort was held only in a local sort_ref that was destroyed at the end of the loop iteration, while its raw pointer was retained in the parameter vector passed to mk_sort. When the sub-sort's refcount dropped to zero, its memory was freed and then reused by the next allocation, producing a self-referential sort. Structural sort traversals such as has_type_var (which has no cycle detection) then recursed infinitely, manifesting as a stack overflow. Pin each intermediate sub-sort in a sort_ref_vector so it stays alive until after mk_sort has taken its own references. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-07-03 16:34:55 -07:00
Nikolaj Bjorner	c8e5dd0ca5	TPTP: quoted numeric tokens are distinct-objects/functors, not numerals A double-quoted TPTP token such as "138" is a distinct object, and a single-quoted '138' is a functor name; neither is an arithmetic numeral. parse_name() strips the quotes, so the subsequent is_nonempty_digit_string check was converting them into Int literals (then boxed Int->U), mis-encoding distinct objects as equal numbers. Guard both numeral checks (parse_term_primary and the atomic-formula parser) with !m_last_name_quoted so only bare unquoted digit tokens become numerals. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-07-03 16:34:55 -07:00
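The rule in c8e5dd0ca5 (only bare digit tokens are numerals) can be sketched as a small classifier. This is an illustration, not the code in `tptp_frontend.cpp`; `TokenKind` and `classify_token` are hypothetical names, and the sketch inspects the raw token before quotes are stripped rather than tracking a flag such as `m_last_name_quoted`.

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class TokenKind { Numeral, DistinctObject, Functor };

// Classify a raw TPTP token, quotes still attached.
TokenKind classify_token(const std::string& raw) {
    // "138" is a distinct object, never a number.
    if (raw.size() >= 2 && raw.front() == '"' && raw.back() == '"')
        return TokenKind::DistinctObject;
    // '138' is a (quoted) functor name, never a number.
    if (raw.size() >= 2 && raw.front() == '\'' && raw.back() == '\'')
        return TokenKind::Functor;
    // Only a non-empty, unquoted, all-digit token is an arithmetic numeral.
    bool all_digits = !raw.empty();
    for (unsigned char c : raw)
        if (!std::isdigit(c)) all_digits = false;
    return all_digits ? TokenKind::Numeral : TokenKind::Functor;
}
```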
Lev Nachmanson	cc5a2dae5e	[snapshot-regression-fix] bv_rewriter: keep (= var concat) intact so DER can eliminate the bound variable (iss-4525/bug-7) (#10034 ) ## Summary Fixes the snapshot-regression divergence reported in Z3Prover/bench discussion #2977 — https://github.com/Z3Prover/bench/discussions/2977 — for benchmark `iss-4525/bug-7.smt2`. ## Divergence The benchmark's second query `(check-sat-using (then simplify ctx-solver-simplify))` regressed from `sat` to `unknown`: ```diff --- bug-7.expected.out (expected) +++ produced (current z3) @@ -1,2 +1,2 @@ sat -sat +unknown ``` The input sets `:rewriter.split_concat_eq true` and `:smt.threads 3`, and its core assertion has the shape `(not (forall ((q11 (_ BitVec 21)) ...) (not (= q11 q9 q11 (concat #b01111000010 s)))))`. ## Root cause With `split_concat_eq` enabled, `bv_rewriter::mk_eq_concat` rewrites an equality `(= x (concat ...))` into per-slice extract equalities, e.g. `(= (extract 9 0 x) s) ∧ (= (extract 20 10 x) #b01111000010)`. When `x` is a bound (de Bruijn) variable, this is harmful: destructive equality resolution (`der.cpp`) only recognises the pattern `(= VAR t)` to eliminate a bound variable. After the split, the variable only appears under `extract`, so DER can no longer eliminate it and a residual quantifier survives `simplify`. Discharging that residual quantifier is then left to the solver invoked inside `ctx-solver-simplify`. That solver is where the observable regression actually lives: with `smt.threads ≥ 2` the parallel solver (`smt_parallel.cpp`) now returns `unknown` on the quantified cube instead of solving it (the older, oracle-era parallel solver kept splitting and proved it), so `ctx-solver-simplify` can no longer reduce `(not (forall ...))` to `true` and reports `unknown`. Reproduced with an A/B comparison of an oracle-era build (`sat` / correct) vs. current tip (`unknown`); the sequential path (`threads=1`) is unaffected. 
Rather than touch the parallel solver — whose current early-exit behaviour is a deliberate termination fix and is risky to revert — this change removes the condition that creates the residual quantifier in the first place, so the goal is solved by `simplify` alone and no longer depends on the parallel solver's completeness. ## Fix In `bv_rewriter::is_concat_split_target`, exclude a bare variable from being a split target: ```diff - m_split_concat_eq || + (m_split_concat_eq && !is_var(t)) || m_util.is_concat(t) || m_util.is_numeral(t) || m_util.is_bv_or(t); ``` `split_concat_eq` is only a bit-blasting heuristic, so skipping it for `(= var concat)` is sound and restores DER-based variable elimination. Ground terms are `app` nodes (never `var` nodes), so default behaviour (`split_concat_eq` is off by default) and all ground uses are completely unchanged — only the explicitly-enabled option with a bound-variable operand is affected. ## Validation - Rebuilt the checkout (`./configure && make -C build`) with the fix. - Re-ran the benchmark with the capture options (`z3 -T:20 inputs/issues/iss-4525/bug-7.smt2`): output is now `sat` / `sat`, an exact match to the recorded `bug-7.expected.out` oracle, deterministic across repeated runs. - Confirmed the mechanism: `(apply (then simplify))` with `split_concat_eq` enabled now empties the goal (DER eliminates the bound variable), whereas before it left a residual quantifier. - Confirmed `split_concat_eq` still splits ground `(= (concat a b) c)` equalities into extract-equalities (intended behaviour preserved). - Ran the relevant `test-z3` unit suites — all pass: `ast`, `bit_vector`, `fixed_bit_vector`, `simplifier`, `bit_blaster`, `var_subst`, `arith_rewriter`, `seq_rewriter`, `factor_rewriter`, `quant_solve`, `euf_bv_plugin`. Opened as a draft for human review. 
Note the transparency caveat above: the deeper behavioural regression is in the parallel solver's handling of quantified cubes; this patch resolves the reported divergence robustly at the rewriter/DER layer instead of altering that solver. > Generated by [Fix a Z3 snapshot-regression divergence](https://github.com/Z3Prover/bench/actions/runs/28646063005) · 989.2 AIC · ⌖ 40.3 AIC · ⊞ 8.9K · [◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-07-03 13:38:56 -07:00
Nikolaj Bjorner	a07b71cabe	bugfix for empty ranges Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-07-03 13:34:54 -07:00
Lev Nachmanson	7e9eef845f	smt: skip m_watches probe for unwatched literals in relevancy propagator (#10035 )	2026-07-03 09:53:04 -07:00
Nikolaj Bjorner	f15584cdae	update epsilon encoding	2026-07-02 15:23:36 -07:00
Lev Nachmanson	2a8f66f22b	[snapshot-regression-fix] Keep symbolic re.range non-empty; fix soundness regression on range membership (#10017 ) ## Summary Fixes a soundness regression in the sequence/regex rewriter: a symbolic character range such as `(re.range x x)` was unsoundly collapsed to `re.empty`, causing a satisfiable membership constraint to be reported `unsat`. This was surfaced by the `snapshot-regression` corpus in `Z3Prover/bench`. - Originating discussion: https://github.com/Z3Prover/bench/discussions/2761 - Benchmark: `iss-5873/bug-2.smt2` (in `Z3Prover/bench`, under `inputs/issues/iss-5873/`) - z3 under test at capture: `z3-4.17.0-x64-glibc-2.39` (Nightly) ## Divergence The recorded oracle expects `sat`; current z3 returns `unsat`: ```diff --- bug-2.expected.out (expected) +++ produced (current z3) @@ -1,3 +1,4 @@ -sat -((tmp_str0 "\u{0}")) +unsat +(error "line 12 column 10: check annotation that says sat") +(error "line 14 column 22: model is not available") (:reason-unknown "") ``` The benchmark asserts (simplified): ```smt2 (assert (= (str.in_re (str.replace tmp_str0 tmp_str0 tmp_str0) (re.range tmp_str0 tmp_str0)) (str.contains tmp_str0 tmp_str0))) ``` `str.contains x x` is always true and `str.replace x x x = x`, so this requires `str.in_re x (re.range x x)` to hold, which is satisfiable exactly when `x` is a single character (`len(x) = 1`). ## Root cause `seq_rewriter::mk_re_range` treated any bound that is not a concrete single-character literal as making the whole range empty: ```cpp if (str().is_string(lo, slo) && slo.length() == 1) clo = slo[0]; else if (str().is_unit(lo, lo1) && m_util.is_const_char(lo1, clo)) ; else is_empty = true; // unsound for a symbolic bound ``` For a symbolic bound this is unsound: `(re.range x x)` denotes `{x}` whenever `x` is a single character, not `∅`. Collapsing it to `re.empty` makes `str.in_re x (re.range x x)` false, contradicting the (true) `str.contains x x`, so the solver derives an unsound `unsat`. 
`git blame` attributes this unsound collapse to z3 commit ``15f33f458d`` ("Derive with ranges (#9965)"), which post-dates the oracle capture. ## Fix Two surgical changes in `src/ast/rewriter/seq_rewriter.cpp`: 1. `mk_re_range` no longer assumes emptiness for symbolic bounds. It concludes `re.empty` only when it can prove emptiness — a bound whose length can never be 1, or two concrete bounds with `lo > hi`. When a bound is symbolic it returns `BR_FAILED` and keeps the range. Concrete single-character ranges keep their existing handling (`lo == hi → str.to_re`, inverted → `re.empty`). 2. `mk_str_in_regexp` reduces membership in a range that has a symbolic bound to the equivalent length/order constraints, which are sound and complete under SMT-LIB `re.range` semantics: `str.in_re e (re.range lo hi)` ⟶ `len(lo)=1 ∧ len(hi)=1 ∧ len(e)=1 ∧ lo ≤ e ∧ e ≤ hi` (using `str.<=`). The derivative engine only unfolds ranges whose bounds are concrete characters, so without this reduction a symbolic-bound range would otherwise be left unsolved. ## Validation Rebuilt z3 from this branch on the workflow runner (`./configure && make -C build -j$(nproc)`) and re-ran the failing benchmark with the same option the snapshot capture uses (`-T:20`): ``` $ z3 -T:20 inputs/issues/iss-5873/bug-2.smt2 sat ((tmp_str0 "A")) (:reason-unknown "") ``` The verdict is now `sat` (was `unsat`) — the soundness regression is resolved. A correctness battery over concrete and symbolic ranges all returns the expected results, e.g.: - `(str.in_re "b" (re.range "a" "c"))` → `sat`, `(str.in_re "d" (re.range "a" "c"))` → `unsat` - `(str.in_re x (re.range x x))` → `sat`; with `(= (str.len x) 2)` → `unsat` - `(str.in_re "b" (re.range x y))` → `sat`; with `(str.< y x)` → `unsat` - `(str.in_re "" (re.range x y))` → `unsat`; `(str.in_re "ab" (re.range "a" "c"))` → `unsat` The pre-existing concrete-range derivative fast path is unchanged. 
### Note on the model value (benign, unrelated to this fix) The model value differs from the recorded oracle: current z3 prints `((tmp_str0 "A"))` whereas the oracle recorded `((tmp_str0 "\u{0}"))`. Both are valid single-character models (the formula has many). This difference is pre-existing and unrelated to this fix: even a bare `(assert (= (str.len x) 1))` yields `"A"` on current z3. It stems from the seq/char theory's default character assignment for otherwise-unconstrained characters (`theory_char.cpp` assigns fresh characters starting from `'A'`), not from range handling. I deliberately did not force the character to `\u{0}` — adding `x = "\u{0}"` would be unsound over-constraining, and changing the global default character is out of scope for this soundness fix and would perturb unrelated models. The output is therefore semantically equivalent to the oracle (same `sat` verdict and reason-unknown) but not byte-identical. --- Draft for human review. Diagnosed and fixed by the `snapshot-regression-fixer` maintenance workflow. > Generated by [Fix a Z3 snapshot-regression divergence](https://github.com/Z3Prover/bench/actions/runs/28502614658) · 890.7 AIC · ⌖ 46.8 AIC · ⊞ 9K · [◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests) --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-07-02 14:00:51 -07:00
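The reduction stated in 2a8f66f22b, `str.in_re e (re.range lo hi)` ⟶ `len(lo)=1 ∧ len(hi)=1 ∧ len(e)=1 ∧ lo ≤ e ∧ e ≤ hi`, can be checked on concrete strings with a small sketch. The function name `in_range` is hypothetical, and the sketch assumes single-byte characters so that a character corresponds to one `char`; it evaluates the constraint directly and is not how the rewriter builds the formula.

```cpp
#include <cassert>
#include <string>

// Concrete evaluation of: e is a member of (re.range lo hi).
// All three strings must be exactly one character long, and e must lie
// between lo and hi in character order (std::string comparison is lexicographic).
bool in_range(const std::string& e, const std::string& lo, const std::string& hi) {
    return lo.size() == 1 && hi.size() == 1 && e.size() == 1
        && lo <= e && e <= hi;
}
```

This agrees with the concrete cases in the validation battery: `"b"` is in `(re.range "a" "c")`, while `"d"`, `""` and `"ab"` are not, and `(re.range x x)` contains `x` only when `x` has length 1.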
davedets	6ac3075022	Remove unnecessary semicolons (Attempt 2) (#10020 ) This is another PR towards the goal of getting Z3 to compile cleanly when included via FetchContent into clang-tidy, which uses a pretty strict set of warnings. This is a second version of https://github.com/Z3Prover/z3/pull/9957. I address @NikolajBjorner's comments about not changing the semicolons after macro invocations, because some editors work better with them present. It now, to the best of my ability, only deletes semicolons: * after the closing brace of a namespace decl. * after the closing brace of an extern "C" decl. * after a function definition. This PR is very large, but it consists entirely of deletions of semicolons in these situations. (If there was a way to update the previous PR, which had been closed, and that is preferable, please let me know. I couldn't figure it out.)	2026-07-02 12:47:29 -07:00
Nikolaj Bjorner	69444de05b	updated with bug fixes Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-07-01 16:26:41 -07:00
Nikolaj Bjorner	652402fa1f	branch Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-30 20:47:01 -07:00
Nikolaj Bjorner	4fb80761c6	bug fixes	2026-06-30 20:18:41 -07:00
Nikolaj Bjorner	8e70dbaebc	Update tptp_frontend.cpp	2026-06-30 15:22:41 -07:00
Nikolaj Bjorner	d666ef1ddf	skip modalities, print warnings Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-30 12:41:42 -07:00
Nikolaj Bjorner	c85e2ee2bd	sort constraint Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-30 12:41:41 -07:00
Clemens Eisenhofer	b3143e759b	Porting seq_split to master (#9840 ) Co-authored-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-30 10:18:28 -07:00
Nikolaj Bjorner	c22a7bac7c	remove debug output Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-30 09:48:29 -07:00
Lev Nachmanson	2490e86d3f	nlsat/anum: share mutation-aware merge sort in one helper (#10006 ) ## Summary Follow-up to #10001 addressing @NikolajBjorner's review comment: > isn't this nearly identical AI generated code to the other file? There has to be some modular approach to deal with sorting vectors? #10001 introduced two nearly-identical copies of a bounds-safe, mutation-aware index-permutation merge sort: - `algebraic_numbers.cpp::merge_sort_roots_perm` - `nlsat/levelwise.cpp::merge_sort_perm` Both exist because the comparator (`anum_manager::compare`/`lt`) is not pure: it mutates the algebraic numbers it compares (refining isolating intervals) and may throw on the resource limit, which makes `std::sort` undefined behavior (the original SIGSEGV). ## Change Extract the algorithm into a single shared helper `util/index_sort_with_mutations.h` (`stable_index_merge_sort`). The long rationale for why `std::sort` is unsafe and merge sort is safe now lives in exactly one place. Both call sites become thin wrappers that build the scratch buffer and forward their local comparator. No behavioral change: same stable O(n log n) merge sort over an index permutation. ## Verification CMake/Ninja Release build: - `test-z3 /seq algebraic_numbers` — PASS - `test-z3 /seq algebraic` — PASS - NRA/NIA smoke solves with `nlsat.lws=true` return expected sat/unsat. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-30 08:40:33 -07:00
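A stable merge sort over an index permutation, of the kind 2490e86d3f describes, might look like the sketch below. It is a stand-in written for illustration, not the contents of `util/index_sort_with_mutations.h`; the name `stable_index_sort` and its signature are assumptions. Every array access is guarded by an explicit bound, so an inconsistent comparator can yield a wrong order but never an out-of-range read.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns a permutation of [0, n) ordered by lt, where lt(i, j) compares
// the elements at positions i and j of some external container.
template <typename Less>
std::vector<std::size_t> stable_index_sort(std::size_t n, Less lt) {
    std::vector<std::size_t> perm(n), scratch(n);
    std::iota(perm.begin(), perm.end(), 0);
    // Bottom-up merge: runs of length `width` are merged pairwise.
    for (std::size_t width = 1; width < n; width *= 2) {
        for (std::size_t lo = 0; lo < n; lo += 2 * width) {
            std::size_t mid = std::min(lo + width, n);
            std::size_t hi = std::min(lo + 2 * width, n);
            std::size_t i = lo, j = mid, k = lo;
            // Take from the right run only when strictly smaller: keeps the sort stable.
            while (i < mid && j < hi)
                scratch[k++] = lt(perm[j], perm[i]) ? perm[j++] : perm[i++];
            while (i < mid) scratch[k++] = perm[i++];
            while (j < hi) scratch[k++] = perm[j++];
        }
        perm.swap(scratch);
    }
    return perm;
}
```

Sorting indices rather than the elements themselves means a comparator that mutates or throws leaves the underlying container untouched; only the local permutation is discarded on unwind.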
Nikolaj Bjorner	32d806d500	fix warnings Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-29 21:20:31 -07:00
Nikolaj Bjorner	6428efc026	parser fixes Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-29 20:10:09 -07:00
Nikolaj Bjorner	d12d49dda1	[code-simplifier] Simplify int_cube: remove goto, use aggregate/brace init (#9874 ) Replace goto-based control flow in get_cube_delta_for_term with an all_ok flag for structured early-exit. Use aggregate initialization for flip_candidate, constructor-based vector sizing for occs, brace initialization for pairs in add_edge_rows_for_term. No functional changes - all lcube tests pass. Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 19:18:04 -07:00
Nikolaj Bjorner	63259d8a43	add missing registration of lambdas with legacy array solver, add missing beta reduction axiom Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-29 19:13:46 -07:00
Lev Nachmanson	8fe2f3c58a	nlsat: fix levelwise (lws) SIGSEGV instead of disabling it (#10001 ) ## Summary Alternative to #9991. Instead of disabling `nlsat.lws` by default, this fixes the underlying bug so levelwise single-cell projection stays enabled. ## Root cause The crash was reproduced on the QF_NIA benchmark from #9991 (`20170427-VeryMax/ITS/From_AProVE_2014__Round3.jar-obl-8__p11898_terminationG_0.smt2`, ~40% SIGSEGV at `-T:20`). A core-dump backtrace points at: ``` mpbq_manager::le (mpbq.cpp:362) algebraic_numbers::manager::imp::compare (algebraic_numbers.cpp:1913) c = 0xea24052d29f2d500 <- wild pointer algebraic_numbers::manager::imp::compare (algebraic_numbers.cpp:2128) nlsat::levelwise::impl::root_function_lt (levelwise.cpp:949) ... std::__unguarded_linear_insert ... <- OOB read std::sort nlsat::levelwise::impl::sort_root_function_partitions ``` The comparator (`root_function_lt` → `anum_manager::compare`, and `anum_manager::lt`) refines the isolating intervals of the algebraic numbers it compares and may hit the resource limit (throwing) mid-comparison. Both make the order it induces non-deterministic / not a strict weak ordering across a single `std::sort` — undefined behavior. libstdc++'s unguarded insertion pass then walks past `begin()` and dereferences a wild anum cell → SIGSEGV. This only fires when a timeout interrupts levelwise, explaining the non-determinism (`signal-11`). ## Fix Replace the two affected `std::sort` calls (`sort_root_function_partitions` and `add_adjacent_root_resultants`) with a bounds-checked insertion sort over an index permutation. A fully guarded insertion sort can never read out of bounds regardless of comparator consistency, and unwinds cleanly if `compare` throws on cancellation. The partitions sorted here are small, so the O(n²) cost is negligible. `nlsat.lws` stays `true`. ## Verification On the Linux repro box (Ubuntu 24.04, g++ 13), RelWithDebInfo: - Before: ~40% SIGSEGV (e.g. 5/16 runs at `-T:20`). 
- After: 0/30 SIGSEGV; results are `unsat`/`timeout`. - Sanity batch over 25 QF_NIA/VeryMax/ITS files: no crashes, expected sat/unsat/timeout mix. - `model_validate=true` full solve still returns `unsat`. --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 16:36:51 -07:00
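The bounds-checked insertion sort over an index permutation that 8fe2f3c58a describes can be sketched as follows. The name `guarded_insertion_sort` is hypothetical and this is not the code in `levelwise.cpp`; it only shows why the guard matters. The test `j > 0` runs before every comparison, so even a comparator that answers inconsistently cannot walk past the start of the array, unlike the unguarded inner loop of `std::sort`.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns a permutation of [0, n) ordered by lt(i, j), which compares the
// elements at positions i and j of some external container.
template <typename Less>
std::vector<std::size_t> guarded_insertion_sort(std::size_t n, Less lt) {
    std::vector<std::size_t> perm(n);
    std::iota(perm.begin(), perm.end(), 0);
    for (std::size_t i = 1; i < n; ++i) {
        std::size_t cur = perm[i];
        std::size_t j = i;
        // The j > 0 guard is evaluated before each call to lt, so the loop
        // stops at index 0 no matter what the comparator returns.
        while (j > 0 && lt(cur, perm[j - 1])) {
            perm[j] = perm[j - 1];
            --j;
        }
        perm[j] = cur;
    }
    return perm;
}
```

If `lt` throws, only the local `perm` vector is in an intermediate state, and it is destroyed during unwinding. With a comparator that always returns `true` (not a strict weak ordering), the result is still a valid permutation.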
Nikolaj Bjorner	d197cee018	Fix TPTP front-end precedence and Int/Real coercion bugs Three translation defects in tptp_frontend.cpp caused spurious sat/unsat verdicts (reported as SZS BUG against annotated status): - Parenthesized negation bound the whole disjunction: ( ~ p | q ) parsed as ~(p | q) instead of (~p) | q, flipping nearly every CNF/FOF clause. Negate only the next unary unit, then resume precedence parsing via a new parse_binary_rest helper. - Quantifier bodies absorbed lower-precedence connectives: ! [X] : p(X) => g parsed as ! [X] : (p(X) => g). TPTP quantifiers bind tighter than the binary connectives, so parse the body at parse_expr(PREC_EQ). - Mixed Int/Real equality coerced through an uninterpreted box function, severing arithmetic semantics and yielding spurious models. Use the arithmetic to_real/to_int conversions instead. Add regression cases to src/test/tptp.cpp covering all three fixes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 15:00:56 -07:00
Nikolaj Bjorner	14d24e2304	add verdicts Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-29 12:55:34 -07:00
Copilot	4fd22680b5	Go bindings: enable concurrent dec_ref for GC-driven finalizers (#10002 ) The Go bindings rely on finalizers to release Z3 references, which can run during concurrent GC and trigger unsafe decref behavior in shared contexts. This change aligns Go with other managed bindings by enabling concurrent decref support at context creation time. - Context initialization - Call `Z3_enable_concurrent_dec_ref` in both Go context constructors: - `NewContext()` - `NewContextWithConfig(cfg Config)` - This ensures AST/object finalizer decrefs are handled under Z3’s concurrent dec-ref mode. - Go binding docs - Updated Go README memory-management section to explicitly document that contexts enable concurrent dec-ref for finalizer-driven decref paths. - Focused regression coverage - Added a small Go test (`z3_context_test.go`) that exercises `NewContext` through a basic SAT flow, ensuring context construction and normal solver usage remain consistent. ```go func NewContext() *Context { ctx := &Context{ptr: C.Z3_mk_context_rc(C.Z3_mk_config())} C.Z3_enable_concurrent_dec_ref(ctx.ptr) runtime.SetFinalizer(ctx, func(c *Context) { C.Z3_del_context(c.ptr) }) return ctx } ``` --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-29 13:14:41 -06:00
Nikolaj Bjorner	5531eb1e72	fix path	2026-06-29 10:30:14 -07:00
Copilot	56bf04e30a	Fix qe-lite de Bruijn reindexing after bounded quantifier expansion (#9996 ) `qe-lite` could produce malformed formulas when expanding bounded quantifiers under nested binders, leaving outer de Bruijn indices unshifted after eliminating an inner quantifier (e.g., `(:var 1)` escaping capture). This change fixes index normalization in that rewrite path and adds a regression for the reported forall/exists arithmetic case. - Rewrite correctness in bounded quantifier expansion - In `src/qe/lite/qe_lite_tactic.cpp`, after substituting bounded variables in payload conjuncts, apply `inv_var_shifter(num_decls)` so outer bound variables are reindexed relative to the removed binder. - This preserves quantifier structure correctness when `try_expand_bounded_quantifier` eliminates an inner quantifier. - Regression coverage for the reported pattern - In `src/test/smt_context.cpp`, add a focused quantified arithmetic formula matching the bug shape: - outer `forall (x, x4)` - inner `exists (y)` - mixed inequalities that trigger qe-lite bounded expansion - Assert the formula is unsatisfiable, preventing reintroduction of invalid index handling in this path. ```c++ inst = vs(p, subst_map.size(), subst_map.data()); shift(inst, num_decls, inst); // reindex outer de Bruijn vars after eliminating inner quantifier ``` --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-06-29 09:53:02 -07:00
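The reindexing step in 56bf04e30a can be pictured on a toy term representation. This sketch does not use Z3's `inv_var_shifter`; `Term`, `mk_var`, `mk_app` and `shift_down` are hypothetical, and it assumes a quantifier-free body in which every variable of the removed binder (indices below `n`) has already been substituted away.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy term: either a de Bruijn variable or an application node with children.
struct Term {
    bool is_var;
    unsigned idx;  // de Bruijn index, meaningful only when is_var is true
    std::vector<std::shared_ptr<Term>> kids;
};
using TermP = std::shared_ptr<Term>;

TermP mk_var(unsigned i) { return std::make_shared<Term>(Term{true, i, {}}); }
TermP mk_app(std::vector<TermP> k) {
    return std::make_shared<Term>(Term{false, 0, std::move(k)});
}

// After a binder that introduced indices [0, n) has been eliminated, the
// remaining variables refer to outer binders and must move down by n.
// Indices below n are assumed to be gone already and are left untouched.
TermP shift_down(const TermP& t, unsigned n) {
    if (t->is_var) return mk_var(t->idx >= n ? t->idx - n : t->idx);
    std::vector<TermP> kids;
    for (auto const& k : t->kids) kids.push_back(shift_down(k, n));
    return mk_app(std::move(kids));
}
```

Skipping this step is what lets an index such as `(:var 1)` point one binder too far out once the inner quantifier is gone.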
Lev Nachmanson	a5454ec375	[snapshot-regression-fix] smt_parallel: report unknown on theory-incomplete cubes instead of hanging (#9999 ) ## Summary Fixes a hang (wall-clock timeout) in the native parallel SMT solver when a cube is incomplete for a reason that cannot change. Originating discussion: https://github.com/Z3Prover/bench/discussions/2746 Benchmark: `iss-3707/bug-1.smt2` (`QF_NRA`, runs with `parallel.enable=true`). ## Divergence The recorded oracle vs. current z3 (`z3 -T:20`): ```diff -(incomplete (theory difference-logic)) -unknown +timeout ``` z3 should terminate with `unknown` (incomplete theory) but instead spins until the 20s timeout. ## Root cause In `src/smt/smt_parallel.cpp` the per-cube worker handled an `l_undef` cube by unconditionally calling `update_max_thread_conflicts()` and re-splitting/re-checking. That only helps when the cube was abandoned at the per-cube conflict limit (`max-conflicts-reached`). When the cube is incomplete for a permanent reason (incomplete theory, quantifiers, resource limits), the verdict never changes, so the worker re-checks the same cube forever. The `batch_manager` had no `unknown` terminal state, so `get_result()` could only end as sat/unsat/exception — there was no way to settle on `unknown`, hence the hang. This is the `smt_parallel` analogue of the `parallel_tactical.cpp` regression fixed earlier. ## Fix Minimal, mirroring the tactic-side fix: - add an `is_unknown` batch-manager state + `m_reason_unknown`; - a worker reporting `l_undef` whose `last_failure` is not `max-conflicts-reached` calls `set_unknown(reason)` and stops re-splitting; - `set_sat`/`set_unsat` may still override `is_unknown` so a definitive answer wins; - `get_result()` maps `is_unknown -> l_undef` and the reason propagates to the parent context. ## Validation Rebuilt z3 (`make -C build -j16`) and re-ran the benchmark 5× with `-T:20`. 
Every run finished in well under the timeout with output matching the oracle byte-for-byte: ``` (incomplete (theory difference-logic)) unknown ``` Created as a draft for human review. > Generated by [Fix a Z3 snapshot-regression divergence](https://github.com/Z3Prover/bench/actions/runs/28358375255) · 553.9 AIC · ⌖ 27.2 AIC · ⊞ 9K · [◷](https://github.com/search?q=repo%3AZ3Prover%2Fz3+%22gh-aw-workflow-id%3A+snapshot-regression-fixer%22&type=pullrequests) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-29 06:55:24 -07:00
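The state handling described in a5454ec375 (a soft `unknown` that a later `sat`/`unsat` may override) can be sketched as a tiny state machine. This is a single-threaded illustration with hypothetical names (`BatchState`, `State`, `Result`, `should_resplit`); the real `batch_manager` is shared between worker threads and would need locking, and comparing against the literal string `max-conflicts-reached` is an assumption about how the failure reason is reported.

```cpp
#include <cassert>
#include <string>

enum class State { Running, Unknown, Sat, Unsat };
enum class Result { True, False, Undef };  // stand-ins for l_true / l_false / l_undef

class BatchState {
    State m_state = State::Running;
    std::string m_reason_unknown;
public:
    // Soft verdict: recorded only while no verdict has been reached yet.
    void set_unknown(const std::string& reason) {
        if (m_state != State::Running) return;
        m_state = State::Unknown;
        m_reason_unknown = reason;
    }
    // A definitive answer wins over a previously recorded unknown.
    void set_sat() {
        if (m_state == State::Running || m_state == State::Unknown) m_state = State::Sat;
    }
    void set_unsat() {
        if (m_state == State::Running || m_state == State::Unknown) m_state = State::Unsat;
    }
    Result get_result() const {
        if (m_state == State::Sat) return Result::True;
        if (m_state == State::Unsat) return Result::False;
        return Result::Undef;
    }
    const std::string& reason_unknown() const { return m_reason_unknown; }
};

// A worker keeps splitting only when the cube was abandoned at the conflict
// limit; any other reason for an undetermined cube is treated as permanent.
bool should_resplit(const std::string& last_failure) {
    return last_failure == "max-conflicts-reached";
}
```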
Nikolaj Bjorner	4cefa52497	tweaks to string solver Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-28 17:16:52 -07:00
Nikolaj Bjorner	d5cf8e6263	tweaks to string solver Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-28 17:16:52 -07:00
Nikolaj Bjorner	1745d271b4	Modify thread allocation logic in smt_parallel.cpp	2026-06-28 16:33:46 -07:00
Nikolaj Bjorner	ef66acc6b5	change calculation of threads to use total threads indicated by parameter or processor count, subtract from worker threads based on backbone and core threads Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-28 12:43:58 -07:00
Copilot	6daebef4e4	Fix psmt deadlock when formula is theory-incomplete (#9986 ) `batch_manager::set_unknown()` in the parallel SMT tactic changed `m_state` to `is_unknown` but never notified backbone workers or the core-minimizer worker waiting on `m_bb_cv` / `m_core_min_cv`. Those threads blocked indefinitely, deadlocking `solve()` at `t.join()`. ### Root cause ``` (declare-fun a (Int) Bool) (declare-fun b (Int) Bool) (assert (distinct a b)) (check-sat-using psmt) ``` Every CDCL worker returns `l_undef` with reason `(incomplete (theory array))`. The first worker calls `set_unknown()` (a soft verdict — other workers may still find sat/unsat) and exits. Other CDCL workers exit when `get_cube()` checks `m_state != is_running`. Meanwhile, backbone workers and the core minimizer are already blocked in `wait_for_backbone_job()` / `wait_for_core_min_job()`, both of which condition-wait on CVs that `set_unknown()` never signals. Their predicates check `m_state != is_running`, but a CV predicate only re-evaluates on notification or spurious wakeup. ### Fix - `src/solver/parallel_tactical.cpp` — `set_unknown()` now calls `m_bb_cv.notify_all()` and `m_core_min_cv.notify_all()` after setting the terminal state, so waiting helper threads observe the change and exit via the existing `m_state != is_running` guard in their wait predicates. ### Test - `src/test/psmt.cpp` — new regression covering SAT, UNSAT, and the theory-incomplete (deadlock) path using `(as-array f)` terms to reproduce the exact array-theory incompleteness that triggers `set_unknown()`. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-28 13:27:58 -06:00
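The deadlock in 6daebef4e4 comes from changing state that a condition-variable predicate reads without notifying the waiters. A minimal sketch of the pattern, using only the standard library and hypothetical names (`Helper`, `wait_for_job`, `helper_exits_after_set_unknown`) rather than the code in `parallel_tactical.cpp`:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

struct Helper {
    std::mutex mux;
    std::condition_variable cv;
    bool running = true;   // stands in for m_state == is_running
    bool has_job = false;

    // Blocks until a job arrives or the batch leaves the running state.
    // Returns false when the helper should exit.
    bool wait_for_job() {
        std::unique_lock<std::mutex> lock(mux);
        cv.wait(lock, [&] { return has_job || !running; });
        return running && has_job;
    }

    void set_unknown() {
        {
            std::lock_guard<std::mutex> lock(mux);
            running = false;
        }
        // Without this notification a helper that is already waiting may
        // never re-evaluate its predicate, and join() on it blocks forever.
        cv.notify_all();
    }
};

// Starts one waiting helper, switches to the terminal state, and joins.
bool helper_exits_after_set_unknown() {
    Helper h;
    bool got_job = true;
    std::thread t([&] { got_job = h.wait_for_job(); });
    h.set_unknown();
    t.join();
    return !got_job;
}
```

The flag is updated under the same mutex the waiter holds while checking its predicate, so the wake-up cannot be lost regardless of whether `set_unknown` runs before or after the helper starts waiting.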
Nikolaj Bjorner	87712be04a	disregard skolems Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-28 12:05:32 -07:00
Nikolaj Bjorner	dbe0cf9312	disregard skolems in instantiation set? Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>	2026-06-28 12:04:56 -07:00
Lev Nachmanson	e87aaa6924	[snapshot-regression-fix] Fix psmt infinite loop on theory-incomplete cubes (#3044 ) (#9983 ) ## Summary Fixes a `psmt` (parallel SMT tactic) regression where the solver hangs to a wall-clock timeout instead of returning `unknown` on formulas whose root cube is genuinely undetermined by an incomplete theory. - Originating discussion: https://github.com/Z3Prover/bench/discussions/2735 - Benchmark: `iss-3044/bug-1.smt2` (from [Z3 issue #3044](https://github.com/Z3Prover/z3/issues/3044)) ```smt2 (declare-fun a (Int) Bool) (declare-fun b (Int) Bool) (assert (distinct a b)) (check-sat-using psmt) ``` ## Divergence The recorded oracle (expected) vs. current z3 (combined stdout+stderr, `-T:20`): ```diff -(incomplete (theory array)) -unknown +timeout ``` ## Root cause The rewritten parallel tactic (`src/solver/parallel_tactical.cpp`, introduced in #9824/#9825) hangs on this input. In the worker `run()` loop, every `l_undef` cube result was treated as if the per-cube conflict limit had been reached: the worker escalated the per-thread conflict budget (`update_max_thread_conflicts`) and re-checked / re-split the same cube. When the `l_undef` actually comes from theory incompleteness (here, the array theory cannot decide `(distinct a b)` over `Int -> Bool`) rather than the conflict limit, the verdict never changes, so the worker re-checks the same cube forever. Compounding this, the `batch_manager` state machine had no terminal `unknown` state — the only way to finish was for some worker to prove `sat`/`unsat`, which is impossible for a root-level theory-incomplete formula. The combination produced an infinite loop and a wall-clock timeout. The pre-rewrite parallel tactic avoided this: its `giveup()` detected reasons starting with `(incomplete` / `(sat.giveup`, reported a soft undef, and echoed the reason to `verbose_stream()`. ## Fix All changes are confined to `src/solver/parallel_tactical.cpp` (47 insertions, 4 deletions): 1. 
Distinguish genuine incompleteness from conflict-limit exhaustion. In the worker `l_undef` case, only `reason_unknown() == "max-conflicts-reached"` benefits from escalating the budget / splitting. For any other reason (incomplete theory, quantifiers, lambdas, resource limits, ...) re-checking is futile, so the worker records a sound `unknown` and stops working the branch. 2. Add a terminal `is_unknown` batch-manager state (`set_unknown`, `get_result() -> l_undef`, reason storage). It is a soft result: it does not cancel the other workers, and a definitive `sat`/`unsat` verdict from another branch may still override it (the `set_sat`/`set_unsat` guards now permit overriding `is_unknown`). All `set_unsat` call sites are global formula-unsat (core ⊆ assumptions, or independent of the tested backbone literal), so the override is sound; tree-closure unsat remains guarded by `is_running` and cannot fire because the undef leaf stays open. 3. Restore the reason output. The captured `reason_unknown` is propagated to the result goal and echoed to `verbose_stream()`, reproducing the `(incomplete (theory array))` line that the sequential path / old parallel tactic emitted. ## Validation Rebuilt the `./z3` checkout (`./configure && make -C build -j16`) and re-ran the benchmark with the freshly built binary using the same options the snapshot capture uses (`-T:20`, combined stdout+stderr): ``` $ z3 inputs/issues/iss-3044/bug-1.smt2 -T:20 (incomplete (theory array)) unknown ``` This matches the recorded `bug-1.expected.out` oracle byte-for-byte, and the benchmark now completes in ~0.5s (was: timeout). Verified stable across 8 consecutive runs. Basic `psmt` `sat`/`unsat` checks continue to produce correct results. Opened as a draft for human review. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> > Generated by [Fix a Z3 snapshot-regression divergence](https://github.com/Z3Prover/bench/actions/runs/28313246856)	2026-06-28 11:20:32 -06:00
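The first point of the fix above, retrying only when the conflict limit was the cause of `l_undef`, can be sketched as a small decision function. The enum `undef_action` and the function `classify_undef` are hypothetical names for illustration; only the reason string `"max-conflicts-reached"` comes from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical outcome of inspecting an l_undef result; not part of Z3's API.
enum class undef_action {
    escalate_and_retry,  // raise the conflict budget / split the cube and try again
    record_unknown       // re-checking is futile; record a soft unknown and stop
};

// Decide what to do with an undetermined cube, based on the solver's stated reason.
// Only conflict-limit exhaustion can change with a larger budget; any other reason
// (incomplete theory, quantifiers, resource limits, ...) yields the same verdict again.
undef_action classify_undef(const std::string& reason_unknown) {
    if (reason_unknown == "max-conflicts-reached")
        return undef_action::escalate_and_retry;
    return undef_action::record_unknown;
}
```

With this split, a reason such as `(incomplete (theory array))` leads to `record_unknown`, so a worker no longer loops on a cube whose verdict cannot change.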
Lev Nachmanson	56bb49f8dc	lp: avoid per-call join allocation in explain_fixed_column (#9984)	2026-06-28 08:12:52 -07:00
Lev Nachmanson	f07acb459f	Fix WASM build: remove duplicate mk_parallel_tactic definition (#9979 ) ## Problem The [master WebAssembly Build](https://github.com/Z3Prover/z3/actions/runs/28306680131) fails with: ``` ../src/solver/parallel_tactical.cpp:59:9: error: redefinition of 'mk_parallel_tactic' 59 \| tactic* mk_parallel_tactic(solver* s, params_ref const& /* p */) { ../src/solver/parallel_tactical.cpp:55:9: note: previous definition is here ``` ## Cause Commit `7564ccc3f` (an unrelated lar_solver change) accidentally renamed the dead `mk_parallel_tactic2` stub to `mk_parallel_tactic`, leaving two identical definitions inside the `#ifdef SINGLE_THREAD` block. The WASM build defines `SINGLE_THREAD`, so it hits the redefinition. ## Fix `mk_parallel_tactic2` and its `non_parallel_tactic2` class were never referenced anywhere. This removes the dead stub and orphaned class, keeping the single `mk_parallel_tactic` that degrades to `mk_solver2tactic(s)` in single-threaded mode (added in #9977). Verified both `SINGLE_THREAD` and multi-threaded paths pass `-fsyntax-only`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-27 18:25:04 -07:00
Lev Nachmanson	7564ccc3f1	capture row by pointer (#9973) Capture the row as a pointer: a by-value lambda capture strips the reference, so the vector was being copied by value in lar_solver. --------- Signed-off-by: Lev Nachmanson <levnach@hotmail.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-27 17:43:08 -07:00
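The capture pitfall named in the commit above can be reproduced in a few lines. This is a sketch under assumptions, not the `lar_solver` code: `row_t`, the copy counter `g_copies`, and both functions are hypothetical and exist only to make the copy visible.

```cpp
#include <cassert>
#include <vector>

// Counts copy constructions of row_t so the hidden copy becomes observable.
static int g_copies = 0;

// Hypothetical row type wrapping a vector; not Z3's row representation.
struct row_t {
    std::vector<int> coeffs;
    row_t() = default;
    row_t(const row_t& other) : coeffs(other.coeffs) { ++g_copies; }
};

// Capturing a reference variable by value copies the referred-to object:
// the closure stores a row_t, not a reference to one.
int copies_when_capturing_by_value() {
    g_copies = 0;
    row_t storage;
    storage.coeffs = {1, 2, 3};
    const row_t& row = storage;
    auto f = [row] { return row.coeffs.size(); };
    f();
    return g_copies;
}

// Capturing a pointer to the row copies only the pointer.
int copies_when_capturing_pointer() {
    g_copies = 0;
    row_t storage;
    storage.coeffs = {1, 2, 3};
    const row_t& row = storage;
    const row_t* p = &row;
    auto f = [p] { return p->coeffs.size(); };
    f();
    return g_copies;
}
```

The by-value variant constructs at least one copy of the row, while the pointer variant constructs none. Capturing by reference (`[&row]`) would also avoid the copy, provided the closure does not outlive the row.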
Copilot	75981a5d3b	Remove leaked `check-assignment` output from debug GCC CMake runs (#9978 ) The `Ubuntu build - cmake - debugGcc` job was failing because the solver could emit an unexpected `check-assignment` line before normal satisfiability output. This change removes that stray output so debug GCC runs no longer contaminate expected CLI/results streams. - Root cause - `src/math/lp/nra_solver.cpp` printed `check-assignment` from `solver::check_assignment()` via `IF_VERBOSE(0, ...)`. - Verbosity level `0` made this effectively unconditional in the failing path, so debug builds could leak internal diagnostics into user-visible output. - Change - Remove the `check-assignment` print from the exception path in `lp::solver::check_assignment()`. - Preserve all existing control flow and error handling; only the unintended output side effect is removed. - Effect - Debug GCC CMake builds keep their normal `sat`/`unsat` output shape. - Internal solver diagnostics no longer interfere with output-sensitive CI checks. ```c++ catch (z3_exception &) { statistics &st = m_imp->m_nla_core.lp_settings().stats().m_st; m_imp->m_nlsat->collect_statistics(st); if (m_imp->m_limit.is_canceled()) { return l_undef; } else { throw; } } ``` --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-06-27 11:22:47 -07:00
Copilot	39ea5ce8c0	Fix SINGLE_THREAD build: add `mk_parallel_tactic` stub to `parallel_tactical.cpp` (#9977 ) The `#ifdef SINGLE_THREAD` block in `parallel_tactical.cpp` only defined `mk_parallel_tactic2`, leaving `mk_parallel_tactic` (called by `smt_tactic_core.cpp`, `fd_solver.cpp`, and `inc_sat_solver.cpp`) undefined — causing linker failures in the ST CI job. ## Changes - `src/solver/parallel_tactical.cpp`: Add `mk_parallel_tactic` stub inside `#ifdef SINGLE_THREAD` that falls back to `mk_solver2tactic(s)`, consistent with how other parallel tactics degrade in single-threaded mode (`par()` → `or_else_tactical`, `par_and_then()` → `and_then_tactical`): ```cpp #ifdef SINGLE_THREAD tactic* mk_parallel_tactic2(solver* s, params_ref const& p) { return alloc(non_parallel_tactic2, s, p); } tactic* mk_parallel_tactic(solver* s, params_ref const& /* p */) { return mk_solver2tactic(s); } #else // ... full parallel implementation ``` --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-06-27 10:20:11 -06:00
Lev Nachmanson	6a62a53181	qsat: decide quantifier-free goals so qe2 returns sat instead of unknown (fixes iss-7027/small-30) (#9970 ) ## Summary Fixes the `iss-7027/small-30` snapshot regression (`Z3Prover/bench` discussion #2705) at its root, instead of working around it by retuning the LP heuristics. - Benchmark: `inputs/issues/iss-7027/small-30.smt2` — `(check-sat-using qe2)` over a single `(distinct ...)` of 33 mixed Int/Real terms. - The recorded oracle was `unknown`; current `master` produces `timeout`. ## Root cause `unknown`/`timeout` are both wrong here: the formula is a `distinct` over 33 terms (free Int/Real constants plus the literals `0`/`1`), which is trivially `sat` — there are infinitely many distinct reals. The real bug is in the `qsat` tactic that backs `qe2`. Running quantifier elimination on a quantifier-free formula has nothing to eliminate, so `qsat` left an undecided residual goal and `check-sat-using` reported `unknown`. This reproduces on any ground formula with free variables, e.g.: ``` (declare-fun a () Int)(assert (> a 0))(check-sat-using qe2) ; -> unknown (should be sat) ``` For `small-30` the QE alternation additionally drove `theory_lra` integer branch-and-bound down a non-terminating path, surfacing as a `timeout` under the capture budget (the symptom the `random_hammers` schedule change happened to expose). ## Fix Under `check-sat` semantics, top-level free variables are implicitly existentially quantified. So when the `qsat` input has no quantifiers, decide satisfiability directly (route through the existing `qsat_sat` path) instead of producing a residual goal. `qe2`/`qe` now return `sat`/`unsat` for ground formulas. QE of genuinely-quantified formulas is unchanged: `apply qe2` on a quantified goal produces the same projected formula as before (verified identical to `master`). Only the degenerate quantifier-free case is affected. This supersedes the previous approach in this PR (reverting the `lp.random_hammers` default). 
That default is left unchanged (`true`), preserving #9958's aggregate QF_LIA benefit. `small-30` now returns `sat` in ~0.01s regardless of the heuristic schedule, because the QE machinery no longer runs on this ground instance. Two changes: - `src/qe/qsat.cpp`: short-circuit quantifier-free input to the satisfiability decision path. - `Z3Prover/bench` `inputs/issues/iss-7027/small-30.expected.out`: oracle updated `unknown` -> `sat` (to be committed alongside this fix). ## Validation ``` $ z3 small-30.smt2 sat # ~0.01s $ echo '(declare-fun a () Int)(assert (> a 0))(check-sat-using qe2)' \| z3 -in sat $ echo '(declare-fun a () Int)(assert (and (> a 0)(< a 0)))(check-sat-using qe2)' \| z3 -in unsat ``` Full unit-test suite (`test-z3 /a`) passes (92/92). Quantified `qe2` round-trips (`apply qe2`) match `master` byte-for-byte. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-06-26 10:38:12 -06:00
Nikolaj Bjorner	612fab1c9a	Parallel tactic (#9824) (#9825) Add new parallel algorithm as a tactic (parallel_tactical2.cpp). Don't port over old experiments from smt_parallel that we aren't using (sls, inprocessing, failed_literal_mode for bb detection). Fix bugs: lease cancellation/reslimit race condition, which involves changing the lease epoch to a simple boolean flag. Also, there is now a single shared set of params for the tactic and smt_parallel. Test runs for parallel_tactical2 vs. the old smt_parallel version: run-2747-Z3-threads-4-qflia-30s-stats.md run-2746-Z3-threads-4-qflia-30s-parallel_tactic-stats.md run-2745-Z3-threads-1-qfbv-30s-stats.md run-3013-Z3-threads-4-qfbv-30s-parallel_tactic-stats.md --> note this is indeed run-3013, I reran after a bugfix in inc_sat_solver run-2743-Z3-threads-4-qfnia-30s-stats.md run-2742-Z3-threads-4-qfnia-30s-parallel_tactic-stats.md Test runs for the new smt_parallel with bugfixes: run-2801-Z3-threads-4-qflia-30s-smtparallel-bugfixes-stats.md, run-2800-Z3-threads-4-qflia-30s-smtparallel-bugfixes-stats.md run-2797-Z3-threads-4-qfnia-30s-smtparallel-bugfixes-stats.md Compare to old smt_parallel: run-2747-Z3-threads-4-qflia-30s-stats.md run-2743-Z3-threads-4-qfnia-30s-stats.md Note that there is a slight regression on lia in run-2800. The source of this appears to be the new LP largest-cube LIA heuristic param, which is enabled by default. Disabling this param in run-2801 restored performance (I didn't change this in this PR though, just something to note). http://mtzguido.tplinkdns.com:8081/z3/compare_stats.html --------- Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com> Co-authored-by: Ilana Shapiro <ilanashapiro@Ilanas-MacBook-Pro.local> Co-authored-by: Ilana Shapiro <ilanashapiro@Ilanas-MBP.localdomain> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>	2026-06-26 10:36:15 -06:00
Nikolaj Bjorner	15f33f458d	Derive with ranges (#9965) Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: Margus Veanes <margus@microsoft.com> Co-authored-by: Margus Veanes <veanes@users.noreply.github.com>	2026-06-26 08:44:13 -06:00
Lev Nachmanson	e76239ceda	[snapshot-regression-fix] Honor cancellation/timeout in bottom-up term enumeration (MBQI) (#9956 ) Fixes a Z3 snapshot-regression divergence reported in `Z3Prover/bench` discussion: https://github.com/Z3Prover/bench/discussions/2667 ## Divergence - benchmark: `iss-6615/original.smt2` (lives at `inputs/issues/iss-6615/` in `Z3Prover/bench`) - kind: `diff` - z3 under test: `z3-4.17.0-x64-glibc-2.39` (Nightly) - budget: per-file `20s` — the snapshot capture runs `z3 -T:20 original.smt2` The recorded oracle is 13× `unknown` (one per `check-sat`, each preceded by an in-file `(set-option :timeout 100)` soft timeout). Current z3 instead prints a single `timeout`: ```diff --- original.expected.out (expected) +++ produced (current z3) @@ -1,13 +1 @@ -unknown -unknown -unknown -unknown -unknown -unknown -unknown -unknown -unknown -unknown -unknown -unknown -unknown +timeout ``` ## Root cause The benchmark uses `(set-logic ALL)` with quantifiers over higher-order (array / lambda) sorts, so MBQI drives `ho_var::populate_inst_sets` (`src/smt/smt_model_finder.cpp`), which enumerates candidate ground terms with the bottom-up term-enumeration engine added in #9908 (`src/ast/rewriter/term_enumeration.cpp`): ```cpp unsigned max_count = 20; for (auto t : tn.enum_terms(srt)) { // each ++ runs find_next() if (max_count == 0) break; --max_count; S->insert(t, generation); } ``` `max_count = 20` bounds the number of inserted terms, but it does not bound the work the generator performs to find the next target-sort term. For sorts that admit few cheap target-sort terms but a large intermediate term space (here `(Array enc_val Int)` and `(Array String (option enc_val))`), a single advance of the iterator can explore an explosive number of intermediate terms, each rewritten through `th_rewriter`. 
Crucially, the three driving loops of the engine — `bottom_up_enumerator::find_next`, `bottom_up_enumerator::enumerate_operators`, and `children_iterator::has_next` — never check the resource limit / cancellation flag. The per-query soft timeout (`:timeout 100`) does fire and cancels `m.limit()` (via `cmd_context`'s `cancel_eh<reslimit>` + `scoped_timer`), but the enumeration never observes it, so the query cannot be interrupted at 100 ms. It spins until the hard process timeout `-T:20` fires, which prints `timeout` for the whole run and aborts — instead of the solver returning `unknown` per query. ## Fix Make the enumeration honor cancellation by checking `m.limit().is_canceled()` at the head of each of the three unbounded loops in `src/ast/rewriter/term_enumeration.cpp`. When a query is cancelled (soft timeout / rlimit / Ctrl-C) the enumeration stops promptly and the solver returns `unknown`, as it did before #9908. When nothing is cancelled `is_canceled()` is `false`, so the set of enumerated terms is unchanged — this only adds an interrupt point, it does not alter which terms are produced. ```diff bool has_next(unsigned cost) { while (!m_done) { + if (m.limit().is_canceled()) + return false; if (has_child_at_cost(cost)) return true; advance(); } @@ find_next() while (true) { + if (m.limit().is_canceled()) { + m_state = State::Done; + return nullptr; + } switch (m_state) { @@ enumerate_operators() while (true) { + if (m.limit().is_canceled()) + return nullptr; ``` ## Validation Built this branch in Release mode (base ``6fd303c4b``) and ran the exact snapshot-capture command: ``` $ z3 -T:20 inputs/issues/iss-6615/original.smt2 unknown unknown unknown unknown unknown unknown unknown unknown unknown unknown unknown unknown unknown real 0m1.49s ``` - Output is byte-identical to the recorded `inputs/issues/iss-6615/original.expected.out` oracle (13× `unknown`). 
- The isolated first `check-sat` returns `unknown` in 0.14 s (previously it did not terminate within 30 s under only the in-file `:timeout 100`). - Trivial sanity check (`(assert (> x 0)) (check-sat)` → `sat`) is unaffected. Opened as a draft for human review. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> > Generated by [Fix a Z3 snapshot-regression divergence](https://github.com/Z3Prover/bench/actions/runs/28155155541)	2026-06-25 21:36:06 -06:00
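The interrupt-point idea in the commit above, a cancellation check at the head of each otherwise unbounded loop, can be sketched without any Z3 types. Everything here is a hypothetical stand-in: `limit_sketch` imitates only the `is_canceled()` query, and `find_next` is a toy search over integers rather than the term enumerator.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a resource limit; only models a cancellation flag.
struct limit_sketch {
    std::atomic<bool> canceled{false};
    void cancel() { canceled.store(true); }
    bool is_canceled() const { return canceled.load(); }
};

// Toy "enumeration": find the first multiple of k (k > 0) at or after `from`.
// Returns true and sets `out` on success; returns false if cancellation was
// observed first. The check sits at the loop head, so every iteration is an
// interrupt point, and the result is unchanged when nothing is cancelled.
bool find_next(const limit_sketch& lim, unsigned long from, unsigned long k,
               unsigned long& out) {
    for (unsigned long candidate = from;; ++candidate) {
        if (lim.is_canceled())
            return false;
        if (candidate % k == 0) {
            out = candidate;
            return true;
        }
    }
}
```

As in the commit's description, the check only adds a way out of the loop: with the flag clear, `find_next` returns exactly what it would have returned without the check.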

18158 commits