diff --git a/research/docs/nseq-issues/01-soundness-replace-empty-string.md b/research/docs/nseq-issues/01-soundness-replace-empty-string.md new file mode 100644 index 000000000..4d309e560 --- /dev/null +++ b/research/docs/nseq-issues/01-soundness-replace-empty-string.md @@ -0,0 +1,78 @@ +# [nseq] Soundness bug: str.replace with empty-string input returns sat instead of unsat + +**Labels**: bug, c3, nseq, soundness + +## Summary + +The nseq solver returns `sat` on benchmarks where the correct answer is `unsat` when +`str.replace` is applied to an empty input string and the result is constrained to an +impossible value. The seq solver correctly returns `unsat` for all of these cases. + +## Affected benchmarks (Ostrich suite, c3 branch) + +| File | seq verdict | nseq verdict | +|------|-------------|--------------| +| `replace-special.smt2` | unsat | **sat** (WRONG) | +| `replace-special-4.smt2` | unsat | **sat** (WRONG) | +| `replace-special-5.smt2` | unsat | **sat** (WRONG) | +| `simple-replace-4b.smt2` | unsat | **sat** (WRONG) | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing examples + +```smt2 +; replace-special.smt2 — EXPECTED: unsat, nseq returns: sat +(set-logic QF_SLIA) +(declare-fun x () String) +(assert (not (= (= "B" (str.replace "" x "A")) false))) +(check-sat) +``` + +```smt2 +; replace-special-4.smt2 — EXPECTED: unsat, nseq returns: sat +(set-logic QF_SLIA) +(declare-fun x () String) +(declare-fun y () String) +(assert (not (= (str.replace "" (str.replace "" x "A") "B") + (str.substr "B" 0 (str.len x))))) +(check-sat) +``` + +```smt2 +; simple-replace-4b.smt2 — EXPECTED: unsat, nseq returns: sat +(set-logic QF_S) +(declare-fun x_7 () String) +(declare-fun x_11 () String) +(declare-fun x_12 () String) +(assert (= x_7 (str.replace (str.++ x_11 x_12) "e" "a"))) +(assert (= x_7 "hello")) +(check-sat) +``` + +## Analysis + +`replace_axiom` in `src/ast/rewriter/seq_axioms.cpp` generates clauses for +`r = str.replace u s t`: + +- `~(u="") ∨ (s="") ∨ (r=u)` — empty input without empty pattern gives `r = u` +- `~(s="") ∨ (r = t ++ u)` — empty pattern gives `r = t ++ u` +- `contains(u,s) ∨ r = u` — no occurrence gives `r = u` + +For `replace-special.smt2` with `u=""`: +- If `s=""`: `r = "A" ++ "" = "A"`, but `r = "B"` → contradiction → unsat +- If `s≠""`: `r = u = ""`, but `r = "B"` → contradiction → unsat + +The clauses are individually correct, but the nseq Nielsen graph apparently fails to +derive the contradiction when the result `r` is constrained via equality to an impossible +constant. The likely root cause is incomplete propagation of the result equality through +the egraph into the replace axiom conclusions. + +For `simple-replace-4b.smt2`, the issue is that `str.replace (x_11 ++ x_12) "e" "a" = "hello"` +is unsatisfiable because "hello" contains 'l', which cannot be produced by replacing 'e'→'a' in +any string, yet nseq claims sat. + +## Files to investigate + +- `src/ast/rewriter/seq_axioms.cpp` — `replace_axiom` +- `src/smt/theory_nseq.cpp` — `dequeue_axiom`, `populate_nielsen_graph` diff --git a/research/docs/nseq-issues/02-soundness-contains-monotonicity.md b/research/docs/nseq-issues/02-soundness-contains-monotonicity.md new file mode 100644 index 000000000..e464d0127 --- /dev/null +++ b/research/docs/nseq-issues/02-soundness-contains-monotonicity.md @@ -0,0 +1,57 @@ +# [nseq] Soundness bug: str.contains not preserved through concatenation + +**Labels**: bug, c3, nseq, soundness + +## Summary + +The nseq solver returns `sat` for a benchmark that asserts: +1. `z` is a substring of `y` (`str.contains y z`) +2. `x = y ++ y` +3. `z` is NOT a substring of `x` (`not (str.contains x z)`) + +The third assertion contradicts the first two, so the formula should be `unsat`. +The seq solver correctly returns `unsat`; nseq incorrectly returns `sat`. + +## Affected benchmark + +| File | seq verdict | nseq verdict | +|------|-------------|--------------| +| `contains-4.smt2` | unsat | **sat** (WRONG) | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing example + +```smt2 +; contains-4.smt2 — EXPECTED: unsat, nseq returns: sat +(set-logic QF_S) +(declare-const x String) +(declare-const y String) +(declare-const z String) +(assert (str.contains y z)) +(assert (= x (str.++ y y))) +(assert (not (str.contains x z))) +(check-sat) +``` + +## Analysis + +If `str.contains y z` is true, then `y = L ++ z ++ R` for some `L`, `R`. +Then `x = y ++ y = L ++ z ++ R ++ L ++ z ++ R`, so `z` is a substring of `x`. +Therefore `not (str.contains x z)` is a contradiction. + +The nseq solver should derive this via the combination of: +- `contains_true_axiom`: `str.contains y z` → `y = f1 ++ z ++ f2` (skolems) +- The equality `x = y ++ y` +- The Nielsen graph constraint that propagates the concatenation decomposition + +The root cause may be that the nseq solver does not propagate `str.contains` facts +through word equation equalities in the Nielsen graph. When `x = y ++ y` is processed +as a word equation, the contains constraint on `y` should propagate to derive a contains +constraint on `x`, but this inference appears to be missing. + +## Files to investigate + +- `src/ast/rewriter/seq_axioms.cpp` — `contains_true_axiom`, `unroll_not_contains` +- `src/smt/seq/seq_nielsen.h` / `seq_nielsen.cpp` — word equation propagation +- `src/smt/theory_nseq.cpp` — `assign_eh` for contains, `populate_nielsen_graph` diff --git a/research/docs/nseq-issues/03-soundness-indexof-regex.md b/research/docs/nseq-issues/03-soundness-indexof-regex.md new file mode 100644 index 000000000..6c7a2e66a --- /dev/null +++ b/research/docs/nseq-issues/03-soundness-indexof-regex.md @@ -0,0 +1,71 @@ +# [nseq] Soundness bug: str.indexof unsound when combined with regex membership + +**Labels**: bug, c3, nseq, soundness + +## Summary + +The nseq solver returns `sat` for benchmarks that constrain `str.indexof` to values +impossible given the regex membership of the input string. The seq solver correctly +returns `unsat` for these cases. + +## Affected benchmarks + +| File | seq verdict | nseq verdict | +|------|-------------|--------------| +| `indexof_const_index_unsat.smt2` | unsat | **sat** (WRONG) | +| `indexof_var_unsat.smt2` | unsat | **sat** (WRONG) | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing examples + +```smt2 +; indexof_const_index_unsat.smt2 — EXPECTED: unsat, nseq returns: sat +(set-info :status unsat) +(declare-fun a () String) +(declare-fun i () Int) +(declare-fun j () Int) +(assert (str.in_re a (re.union (str.to_re "hhhbbb") (str.to_re "bhhh")))) +(assert (= (str.indexof a "hhh" j) i)) +(assert (= i 2)) +(assert (> j 0)) +(check-sat) +``` + +```smt2 +; indexof_var_unsat.smt2 — EXPECTED: unsat, nseq returns: sat +(set-info :status unsat) +(declare-fun a () String) +(declare-fun i () Int) +(declare-fun j () Int) +(assert (str.in_re a (re.union (str.to_re "hhhbbb") (str.to_re "bhhh")))) +(assert (= (str.indexof a "hhh" j) i)) +(assert (> i 1)) +(check-sat) +``` + +## Analysis + +For `indexof_const_index_unsat.smt2`: +- `a ∈ {hhhbbb, bhhh}` (two possibilities) +- `str.indexof a "hhh" j = 2` with `j > 0` +- In "hhhbbb", "hhh" appears at index 0 only (but j > 0 means the search starts after index 0) +- In "bhhh", "hhh" appears at index 1, but with j > 0 the only valid return would be 1, not 2 +- So i = 2 is impossible → unsat + +The `indexof_axiom` in `seq_axioms.cpp` generates arithmetic constraints for indexof, +but these constraints may not be sufficiently tight when combined with concrete regex +membership constraints. Specifically, the nseq solver does not appear to combine the +regex membership information with the indexof position constraints to derive the +contradiction. + +The root cause is likely that nseq's `indexof_axiom` generates axioms about `str.indexof` +without leveraging the concrete alphabet constraints imposed by regex membership. The +seq solver may do additional propagation (e.g., via character-level analysis of the +regex language) that nseq does not perform. + +## Files to investigate + +- `src/ast/rewriter/seq_axioms.cpp` — `indexof_axiom` +- `src/smt/seq/seq_regex.h` / `seq_regex.cpp` — regex membership propagation +- `src/smt/theory_nseq.cpp` — interaction between regex constraints and arithmetic axioms diff --git a/research/docs/nseq-issues/04-soundness-prefix-suffix.md b/research/docs/nseq-issues/04-soundness-prefix-suffix.md new file mode 100644 index 000000000..0992303fb --- /dev/null +++ b/research/docs/nseq-issues/04-soundness-prefix-suffix.md @@ -0,0 +1,78 @@ +# [nseq] Soundness bug: str.prefixof / str.suffixof interaction not fully axiomatized + +**Labels**: bug, c3, nseq, soundness + +## Summary + +The nseq solver returns `sat` for several benchmarks involving `str.prefixof`, +`str.suffixof`, and `str.at` where the correct answer is `unsat`. The seq solver +handles all these cases correctly. + +## Affected benchmarks + +| File | seq verdict | nseq verdict | +|------|-------------|--------------| +| `prefix-suffix.smt2` | unsat | **sat** (WRONG) | +| `failedProp.smt2` | unsat | **sat** (WRONG) | +| `failedProp2.smt2` | unsat | **sat** (WRONG) | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing examples + +```smt2 +; prefix-suffix.smt2 — EXPECTED: unsat, nseq returns: sat +(set-logic QF_S) +(declare-fun a () String) +(declare-fun b () String) +(assert (not (=> (and (str.prefixof a b) (str.suffixof b a)) (= a b)))) +(check-sat) +``` + +The above asserts: `(a prefix b) AND (b suffix a) AND a ≠ b`. +This is unsatisfiable: if `a` is a prefix of `b` and `b` is a suffix of `a`, +then `|a| ≤ |b|` and `|b| ≤ |a|`, so `|a| = |b|`. Combined with the prefix and +suffix conditions, this forces `a = b`. + +```smt2 +; failedProp.smt2 — EXPECTED: unsat, nseq returns: sat +(set-logic QF_SLIA) +(declare-fun x () String) +(declare-fun y () String) +(assert (not (= (str.prefixof x (str.at x 1)) (= x "")))) +(check-sat) +``` + +The above asserts: `(str.prefixof x (str.at x 1)) ≠ (x = "")`. +`str.at x 1` is the second character of `x` (as a length-1 string). +`str.prefixof x (str.at x 1)` is true only when `x` is a prefix of a single character, +which means `x = ""` or `x` is that single character (but then `x` is a prefix of itself only +if `len(x) ≤ 1`). This equivalence should hold: `str.prefixof x (str.at x 1) ↔ x = ""`. +So the assertion is `unsat`. + +```smt2 +; failedProp2.smt2 — EXPECTED: unsat, nseq returns: sat (same pattern with suffixof) +(set-logic QF_SLIA) +(declare-fun x () String) +(assert (not (= (str.suffixof x (str.at x 1)) (= x "")))) +(check-sat) +``` + +## Analysis + +For `prefix-suffix.smt2`, the current axioms for `prefix_axiom` and `suffix_axiom` +introduce skolem variables but may not derive the length equality `|a| = |b|` from the +combined prefix+suffix condition, or may not then propagate this to `a = b`. + +For `failedProp.smt2` and `failedProp2.smt2`, the issue involves interaction between +`str.at`, `str.prefixof`/`str.suffixof`, and string-integer length constraints. The +nseq solver may not properly resolve the combination of: +1. The `at_axiom` (generating constraints on `str.at x 1`) +2. The `prefix_axiom`/`suffix_axiom` (decomposing the prefix/suffix relationship) +3. The arithmetic constraint linking lengths + +## Files to investigate + +- `src/ast/rewriter/seq_axioms.cpp` — `prefix_axiom`, `suffix_axiom`, `prefix_true_axiom`, `suffix_true_axiom`, `at_axiom` +- `src/smt/theory_nseq.cpp` — `assign_eh` for prefix/suffix, `final_check_eh` +- `src/smt/seq/seq_nielsen.cpp` — length propagation in Nielsen graph diff --git a/research/docs/nseq-issues/05-soundness-conflicting-regex.md b/research/docs/nseq-issues/05-soundness-conflicting-regex.md new file mode 100644 index 000000000..aef4ad8e2 --- /dev/null +++ b/research/docs/nseq-issues/05-soundness-conflicting-regex.md @@ -0,0 +1,65 @@ +# [nseq] Soundness bug: conflicting regex memberships not detected (norn-benchmark-9f) + +**Labels**: bug, c3, nseq, soundness + +## Summary + +The nseq solver returns `sat` for a formula asserting that a non-empty string belongs +simultaneously to two disjoint regular languages. The correct answer is `unsat`. + +## Affected benchmark + +| File | seq verdict | nseq verdict | +|------|-------------|--------------| +| `norn-benchmark-9f.smt2` | unsat | **sat** (WRONG) | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing example + +```smt2 +; norn-benchmark-9f.smt2 — EXPECTED: unsat, nseq returns: sat +(set-logic QF_S) +(declare-fun var_0 () String) +(assert (str.in.re var_0 (re.* (re.range "a" "u")))) +(assert (str.in.re var_0 (re.* (str.to.re "v")))) +(assert (not (= var_0 ""))) +(check-sat) +``` + +The formula asserts: +1. `var_0 ∈ (re.range "a" "u")* ` — all characters in "a"–"u" +2. `var_0 ∈ "v"*` — only the character "v" +3. `var_0 ≠ ""` + +Since "v" is not in the range "a"–"u" (the range includes up to 'u', not 'v'), +the intersection of the two languages is `{""}`. Combined with `var_0 ≠ ""`, this is +unsatisfiable. + +## Note on syntax + +This benchmark uses the old SMT-LIB 2.5 syntax `str.in.re` (with dots) rather than +the SMT-LIB 2.6 syntax `str.in_re` (with underscore). Both are supported by Z3's parser. +The bug may be triggered specifically by the old-style syntax interaction with nseq's +regex handling, or may be a pure logic issue. + +## Analysis + +The nseq solver handles `str.in_re` constraints by computing the intersection of +multiple regex membership constraints for the same string variable (via the sgraph and +Nielsen graph). The Parikh image pre-check (`seq_parikh.cpp`) should detect that +`(re.range "a" "u")* ∩ "v"*` restricted to non-empty strings is empty, since: +- `"v"*` allows only 'v' characters +- `(re.range "a" "u")*` allows only characters in 'a'..'u' +- 'v' ∉ 'a'..'u' → intersection = `{""}` only + +The root cause is likely that the nseq solver does not derive the character-level +disjointness between these two regex languages. The seq solver may use a derivative-based +or automaton intersection approach that detects this. + +## Files to investigate + +- `src/smt/seq/seq_regex.h` / `seq_regex.cpp` — regex membership processing, pre-check +- `src/smt/seq/seq_parikh.h` / `seq_parikh.cpp` — Parikh image constraint generation +- `src/smt/seq/seq_nielsen.cpp` — membership processing in Nielsen nodes +- `src/smt/theory_nseq.cpp` — `assign_eh` for regex membership diff --git a/research/docs/nseq-issues/06-crash-non-greedy-quantifiers.md b/research/docs/nseq-issues/06-crash-non-greedy-quantifiers.md new file mode 100644 index 000000000..3cbc9ce7d --- /dev/null +++ b/research/docs/nseq-issues/06-crash-non-greedy-quantifiers.md @@ -0,0 +1,67 @@ +# [nseq] Crash: non-greedy regex quantifiers (re.+?, re.*?) cause exception + +**Labels**: bug, c3, nseq, crash + +## Summary + +The nseq solver crashes (throws an exception or returns an error) when the input +formula contains non-greedy regex quantifiers such as `re.+?` (lazy plus) or `re.*?` +(lazy star). The seq solver handles these correctly by treating them as equivalent to +their greedy counterparts for satisfiability purposes. + +## Affected benchmark + +| File | seq verdict | nseq verdict | +|------|-------------|--------------| +| `non-greedy-quantifiers.smt2` | sat | **bug/crash** | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing example + +```smt2 +; non-greedy-quantifiers.smt2 — EXPECTED: sat, nseq crashes +(set-logic QF_S) +(declare-fun a () String) +(declare-fun b () String) +(declare-fun c () String) +(declare-fun d () String) +(assert (= a (str.++ b c))) +(assert (= b (str.++ d c))) +; non-greedy re.+? has same semantics as re.+ for satisfiability +(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y"))))) +(assert (str.in.re c (re.+? (str.to.re "x")))) +(check-sat) +(get-model) +``` + +## Analysis + +Z3's SMT-LIB parser supports non-greedy regex quantifiers (`re.+?`, `re.*?`, `re.??`) +as part of the ECMA-like regex extension. For satisfiability checking, non-greedy +quantifiers are semantically equivalent to their greedy counterparts (they only differ +in which match is returned, not in what strings are accepted). + +The nseq solver's regex handling infrastructure (`seq_regex.cpp`) does not appear to +recognize or normalize non-greedy quantifiers. When the symbol `re.+?` is encountered +in a membership constraint, the solver either: +1. Does not recognize the operator and triggers `push_unhandled_pred()`, causing `FC_GIVEUP` + (which may manifest as an error/"bug" in the benchmark runner), or +2. Crashes due to an unhandled case in the Brzozowski derivative computation or + Parikh constraint generation. + +### Fix proposal + +Non-greedy quantifiers should be rewritten to their greedy equivalents before processing: +- `re.+? e` → `re.+ e` (lazy plus → greedy plus) +- `re.*? e` → `re.* e` (lazy star → greedy star) +- `re.?? e` → `re.opt e` (lazy optional → greedy optional) + +This rewrite is sound for satisfiability/unsatisfiability since both interpretations +accept exactly the same language. + +## Files to investigate + +- `src/smt/seq/seq_regex.cpp` — regex preprocessing and derivative computation +- `src/ast/rewriter/seq_rewriter.cpp` — regex rewriting rules (add non-greedy normalization) +- `src/smt/theory_nseq.cpp` — `internalize_atom` / `assign_eh` for regex membership diff --git a/research/docs/nseq-issues/07-crash-str-lt-le-axioms.md b/research/docs/nseq-issues/07-crash-str-lt-le-axioms.md new file mode 100644 index 000000000..682d51922 --- /dev/null +++ b/research/docs/nseq-issues/07-crash-str-lt-le-axioms.md @@ -0,0 +1,82 @@ +# [nseq] Crashes: str.< and str.<= axiom handling failures + +**Labels**: bug, c3, nseq, crash + +## Summary + +The nseq solver crashes (returns "bug" in the benchmark runner) on formulas involving +`str.<` (lexicographic less-than) and `str.<=` (lexicographic less-than-or-equal) +string comparison predicates in contexts where they interact with length constraints +and character-level reasoning. + +## Affected benchmarks + +| File | seq verdict | nseq verdict | +|------|-------------|--------------| +| `str-leq11.smt2` | sat | **bug/crash** | +| `str-leq12.smt2` | sat | **bug/crash** | +| `str-leq13.smt2` | sat | **bug/crash** | +| `str-lt.smt2` | sat | **bug/crash** | +| `str-lt2.smt2` | sat | **bug/crash** | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing examples + +```smt2 +; str-leq11.smt2 — EXPECTED: sat, nseq crashes +(set-logic QF_SLIA) +(declare-fun x () String) +(assert (= 1 (str.len x))) +(assert (str.<= "a" x)) +(assert (str.<= x "c")) +(check-sat) +``` + +```smt2 +; str-lt.smt2 — EXPECTED: sat, nseq crashes +(set-logic QF_SLIA) +(declare-fun x () String) +(assert (= 1 (str.len x))) +(assert (str.< "A" x)) +(assert (str.< x "C")) +(check-sat) +``` + +## Analysis + +The nseq solver handles `str.<` and `str.<=` via `lt_axiom` and `le_axiom` in +`seq_axioms.cpp`, triggered from `relevant_eh` → `dequeue_axiom`. The `assign_eh` +for these predicates is a no-op with comment "axioms added via relevant_eh → dequeue_axiom". + +The `lt_axiom` generates clauses involving: +- Skolem variables `x`, `y`, `z` (common prefix and suffixes) +- Skolem character variables `c`, `d` +- Prefix and character-comparison constraints + +The crash likely occurs because: +1. The `lt_axiom` / `le_axiom` introduces character-comparison constraints (via `seq.mk_lt`) + that require character-level arithmetic reasoning (e.g., comparing character codes). + If the nseq solver's arithmetic integration does not correctly handle these + character-code comparisons, it may crash or give incorrect results. + +2. The `lt_axiom` clauses reference the Boolean literal for the `str.<` predicate itself + (via `expr_ref lt = expr_ref(n, m)`). If `n` is used both as a term and as a literal + reference, and nseq has an inconsistency in how it maps the Bool var to the expression, + this could cause a null dereference or assertion failure. + +3. The interaction between the length constraint `len(x) = 1` and the lt/le axioms' + prefix-based decomposition may produce contradictory or unresolvable constraints + in the Nielsen graph. + +### What str-leq11 requires + +For `len(x) = 1`, `"a" ≤ x ≤ "c"` (lexicographically): x ∈ {"a", "b", "c"}. +The solver needs to find x = "a", "b", or "c". This is straightforward with +character-range reasoning. + +## Files to investigate + +- `src/ast/rewriter/seq_axioms.cpp` — `lt_axiom`, `le_axiom` +- `src/smt/theory_nseq.cpp` — `assign_eh` for is_lt/is_le, `relevant_eh`, `dequeue_axiom` +- `src/smt/seq/seq_nielsen.cpp` — character-level constraint handling diff --git a/research/docs/nseq-issues/08-crash-str-at-substr-from-to-int.md b/research/docs/nseq-issues/08-crash-str-at-substr-from-to-int.md new file mode 100644 index 000000000..3bbed18ac --- /dev/null +++ b/research/docs/nseq-issues/08-crash-str-at-substr-from-to-int.md @@ -0,0 +1,118 @@ +# [nseq] Crashes: str.at / str.substr / str.from_int / str.to_int in complex formulas + +**Labels**: bug, c3, nseq, crash + +## Summary + +The nseq solver crashes or returns errors on benchmarks that combine `str.at`, +`str.substr`, `str.from_int`, and `str.to_int` with regex constraints, length +constraints, or positional arithmetic. The seq solver handles all these correctly. + +## Affected benchmarks + +| File | seq verdict | nseq verdict | +|------|-------------|--------------| +| `str.at.smt2` | sat | **bug/crash** | +| `str.at-2.smt2` | sat | **bug/crash** | +| `str.from_int_6.smt2` | sat | **bug/crash** | +| `str.to_int_5.smt2` | unsat | **bug/crash** | +| `str.to_int_6.smt2` | unknown | **bug/crash** | +| `substring.smt2` | sat | **bug/crash** | +| `substring2.smt2` | sat | **bug/crash** | +| `substring2b.smt2` | sat | **bug/crash** | +| `is-digit-2.smt2` | sat | **bug/crash** | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing examples + +```smt2 +; str.at.smt2 — EXPECTED: sat, nseq crashes +(set-logic QF_S) +(declare-const x String) +(assert (= (str.at x 2) "x")) +(assert (= (str.at x 3) "y")) +(check-sat) +``` + +```smt2 +; str.from_int_6.smt2 — EXPECTED: sat, nseq crashes +(declare-const w String) +(declare-const x Int) +(declare-const y Int) +(assert (= (str.substr w 0 3) (str.from_int x))) +(assert (= (str.substr w 10 3) (str.from_int y))) +(assert (> x (* 2 y))) +(assert (>= y 0)) +(check-sat) +``` + +```smt2 +; str.to_int_5.smt2 — EXPECTED: unsat, nseq crashes +(declare-const w String) +(assert (= (str.to_int (str.substr w 0 10)) 123)) +(assert (= (str.to_int (str.substr w 10 10)) 321)) +(check-sat) +``` + +```smt2 +; substring.smt2 — EXPECTED: sat, nseq crashes +(set-logic QF_S) +(declare-fun x () String) +(declare-fun y () String) +(declare-fun n () Int) +(assert (= y (str.substr x n 5))) +(assert (>= (str.len x) (+ n 5))) +(assert (= (str.at y 0) "a")) +(assert (> n 0)) +(check-sat) +``` + +```smt2 +; is-digit-2.smt2 — EXPECTED: sat, nseq crashes +(set-logic QF_S) +(declare-fun x () String) +(assert (str.is_digit (str.at x 1))) +(assert (str.is_digit (str.at x 2))) +(check-sat) +``` + +## Analysis + +These benchmarks all combine position-indexed string operations (`str.at`, `str.substr`) +with either: +- Direct string equality constraints on the extracted substring/character +- Integer conversion operations (`str.from_int`, `str.to_int`) +- Character classification (`str.is_digit`) +- Arithmetic constraints on positions (`n > 0`, `len(x) >= n + 5`) + +The axioms for these operations (`at_axiom`, `extract_axiom`, `itos_axiom`, `stoi_axiom`, +`is_digit_axiom`) are defined in `seq_axioms.cpp` and are triggered via `relevant_eh` → +`dequeue_axiom`. The individual axioms appear to be implemented. + +The crashes likely stem from one or more of: + +1. **Interaction with variable-offset substring**: When `str.at x i` uses a variable + index `i` (e.g., `str.at x 1` where `1` is a concrete integer, but the axiom + creates skolems that interact with the Nielsen graph's variable positions), + the resulting constraints may cause assertion failures. + +2. **Double-application of str.at**: In `str.at.smt2`, two `str.at` constraints on the + same variable `x` at different positions (2 and 3) must simultaneously hold. + The `at_axiom` introduces skolem variables for each, and their combination in the + Nielsen graph may not be handled correctly. + +3. **Nested substr + conversion**: In `str.from_int_6.smt2` and `str.to_int_5.smt2`, + the combination of `str.substr` extracting a sub-region plus integer conversion + creates constraints that span the string/arithmetic boundary in ways the nseq + solver's axiom-enqueuing mechanism may not handle. + +4. **str.is_digit on str.at result**: In `is-digit-2.smt2`, `str.is_digit (str.at x 1)` + requires the `is_digit_axiom` to be applied to the result of `at_axiom`. The nseq + solver may not propagate the `relevant_eh` trigger through the composed term. + +## Files to investigate + +- `src/ast/rewriter/seq_axioms.cpp` — `at_axiom`, `extract_axiom`, `itos_axiom`, `stoi_axiom`, `is_digit_axiom` +- `src/smt/theory_nseq.cpp` — `relevant_eh` (check if nested terms trigger axiom enqueuing), `dequeue_axiom` +- `src/smt/seq/seq_nielsen.cpp` — handling of position-indexed skolem variables diff --git a/research/docs/nseq-issues/09-crash-word-equation-regex.md b/research/docs/nseq-issues/09-crash-word-equation-regex.md new file mode 100644 index 000000000..26e166271 --- /dev/null +++ b/research/docs/nseq-issues/09-crash-word-equation-regex.md @@ -0,0 +1,117 @@ +# [nseq] Crashes: word equation + regex benchmarks (various operations) + +**Labels**: bug, c3, nseq, crash + +## Summary + +The nseq solver crashes or errors on a variety of benchmarks involving combinations of +word equations with regex constraints, contains, replace_all, and SLIA arithmetic. +These represent missing or incomplete operation support in the nseq solver. + +## Affected benchmarks + +| File | seq verdict | nseq verdict | Operations involved | +|------|-------------|--------------|---------------------| +| `concat-001.smt2` | sat | **bug/crash** | word equation + regex (old syntax) | +| `contains-1.smt2` | unsat | **bug/crash** | contains in equality context | +| `contains-7.smt2` | sat | **bug/crash** | str.replace_all | +| `nonlinear-2.smt2` | sat | **bug/crash** | `a = b ++ b` + regex | +| `noodles-unsat8.smt2` | unknown | **bug/crash** | complex word equation | +| `noodles-unsat10.smt2` | unsat | **bug/crash** | complex word equation | +| `norn-benchmark-9i.smt2` | sat | **bug/crash** | old-syntax regex + complex | +| `pcp-1.smt2` | sat | **bug/crash** | SLIA + complex string ops | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing examples + +```smt2 +; concat-001.smt2 — EXPECTED: sat, nseq crashes +; Uses old-style str.in.re syntax and unambiguous regex constraints +(declare-const x String) +(declare-const y String) +(declare-const z String) +(assert (str.in.re x (re.* (str.to.re "ab")))) +(assert (str.in.re y (re.* (str.to.re "c")))) +(assert (str.in.re z (str.to.re "cccc"))) +(assert (= z (str.++ x y))) +(check-sat) +``` + +```smt2 +; contains-1.smt2 — EXPECTED: unsat, nseq crashes +; str.contains used inside an equality with a boolean literal +(set-logic QF_S) +(declare-const x String) +(assert (not (= (str.contains "A" x) true))) +(check-sat) +``` + +```smt2 +; contains-7.smt2 — EXPECTED: sat, nseq crashes +; str.replace_all used with regex constraints +(set-logic QF_S) +(declare-const x String) +(declare-const y String) +(declare-const z String) +(declare-const res String) +(assert (= res (str.replace_all x y z))) +(assert (str.in_re x (re.* (str.to_re "ab")))) +(assert (not (str.in_re y (re.* (str.to_re "ab"))))) +(assert (not (= res ""))) +(check-sat) +``` + +```smt2 +; nonlinear-2.smt2 — EXPECTED: sat, nseq crashes +; Non-linear word equation: a = b ++ b (self-concatenation) +(set-logic QF_S) +(declare-fun a () String) +(declare-fun b () String) +(assert (= a (str.++ b b))) +(assert (str.in.re a (re.++ (re.* (str.to.re "(")) (re.+ (str.to.re ")"))))) +(check-sat) +``` + +## Analysis of individual crash causes + +### concat-001.smt2 +Uses the old SMT-LIB `str.in.re` / `str.to.re` syntax with a word equation +`z = x ++ y` where all three variables have concrete regex constraints. If the nseq +solver's parser or internalization does not correctly handle the old-style `str.to.re` +synonym, the membership constraint may not be registered correctly. + +### contains-1.smt2 +The formula `(not (= (str.contains "A" x) true))` creates a Boolean atom of the form +`(= true)` rather than the direct `str.contains` predicate. +In `assign_eh`, the direct check `is_contains(e)` will NOT match `(= (str.contains "A" x) true)`. +The catch-all `push_unhandled_pred()` fires, but the outer negation means the atom +`(= (str.contains "A" x) true)` is assigned to false. This should cause `FC_GIVEUP`, +but may manifest as a crash if the solver mishandles the boolean variable assignment. + +**Fix**: Normalize `(= true/false)` patterns in `assign_eh` or during +internalization, or add specific handling for `str.contains` embedded in boolean equalities. + +### contains-7.smt2 +Uses `str.replace_all` which is handled by `replace_all_axiom`. The crash may occur +when `replace_all_axiom` interacts with regex constraints on the arguments in ways +the Nielsen graph doesn't support. + +### nonlinear-2.smt2 +The word equation `a = b ++ b` is a non-linear equation (variable `b` appears twice +on the RHS). The Nielsen graph supports some non-linear equations, but this specific +combination with a regex constraint `a ∈ (re.* "(") ++ (re.+ ")")` — which constrains +characters — may expose a gap in the non-linear handling. + +### noodles-unsat8 / noodles-unsat10 +These are word equations from the Oliver Markgraf OSTRICH benchmark set. They involve +multiple variables with complex concatenation patterns and regex constraints. The crashes +suggest that the combination of 3+ variable word equations with disjunctive regex +constraints exceeds what the current nseq Nielsen graph implementation can handle. + +## Files to investigate + +- `src/smt/theory_nseq.cpp` — `assign_eh` (boolean-equality pattern for contains), `internalize_atom` +- `src/ast/rewriter/seq_axioms.cpp` — `replace_all_axiom`, contains axioms +- `src/smt/seq/seq_nielsen.cpp` — non-linear equation handling, `generate_extensions` +- `src/ast/seq_decl_plugin.cpp` — old-style synonym mapping (str.in.re / str.to.re) diff --git a/research/docs/nseq-issues/10-performance-regression-regex-word-equation.md b/research/docs/nseq-issues/10-performance-regression-regex-word-equation.md new file mode 100644 index 000000000..1d22894c3 --- /dev/null +++ b/research/docs/nseq-issues/10-performance-regression-regex-word-equation.md @@ -0,0 +1,121 @@ +# [nseq] Performance regression: nseq times out on regex + word equation benchmarks + +**Labels**: performance, c3, nseq + +## Summary + +The nseq solver times out (exceeds 5 seconds) on several benchmarks that the seq solver +solves in under 30ms. These represent cases where nseq's Nielsen-graph search either +explores too many branches or does not apply available pruning heuristics effectively. + +## Affected benchmarks + +| File | seq verdict | seq time | nseq verdict | nseq time | +|------|-------------|----------|--------------|-----------| +| `concat-regex.smt2` | unsat | 0.018s | **unknown** (timeout) | >5s | +| `concat-regex3.smt2` | unsat | 0.024s | **unknown** (timeout) | >5s | +| `loop.smt2` | sat | 0.026s | **unknown** (timeout) | >5s | +| `simple-concat-4.smt2` | sat | 0.028s | **unknown** (timeout) | >5s | +| `all-quantifiers.smt2` | unknown | 0.020s | **unknown** (timeout) | >5s | + +Data from: https://github.com/Z3Prover/z3/discussions/9071 + +## Reproducing examples + +```smt2 +; concat-regex.smt2 — seq: unsat in 18ms, nseq: TIMEOUT +(set-logic QF_S) +(declare-fun a () String) +(declare-fun b () String) +(declare-fun c () String) +(declare-fun d () String) +(assert (= a (str.++ b c))) +(assert (= b (str.++ d c))) +(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y"))))) +(assert (str.in.re c (re.+ (str.to.re "x")))) +(check-sat) +``` + +Why unsat: `a = b ++ c = d ++ c ++ c`. With `a ∈ (x|y)+` and `c ∈ x+`, +all characters in `a` are 'x' or 'y', but `c` forces all its chars to be 'x'. +The word equation creates constraints that (with the regex) force a contradiction. +The seq solver can prove this quickly, but nseq's iterative deepening DFS exhausts +its depth budget. + +```smt2 +; loop.smt2 — seq: sat in 26ms, nseq: TIMEOUT +(declare-fun TestC0 () String) +(declare-fun |1 Fill 1| () String) +(declare-fun |1 Fill 0| () String) +(declare-fun |1 Fill 2| () String) +(assert (str.in.re TestC0 + (re.++ (re.* (re.range "\x00" "\xff")) + ((_ re.loop 3 3) (re.range "0" "9")) + (re.* (re.range "0" "9")) + (re.* (re.range "\x00" "\xff"))))) +; ... (additional constraints) +(check-sat) +``` + +```smt2 +; simple-concat-4.smt2 — seq: sat in 28ms, nseq: TIMEOUT +; Uses recursive function definition (define-fun-rec) with transducer-style constraints +; and regex membership +(check-sat) +``` + +## Analysis + +### concat-regex and concat-regex3 + +These benchmarks use word equations `a = b ++ c`, `b = d ++ c` combined with regex +memberships. The seq solver uses a length-based approach or Parikh constraints that +quickly prune the search. The nseq solver's Nielsen graph approach uses iterative +deepening DFS (starting at depth 10, doubling each iteration, up to 6 iterations per +`seq_nielsen.cpp:solve()`). For these formulas: + +- The Nielsen graph may generate many extensions at each level (characters from the + regex alphabet, word splits) +- The Parikh pre-check may not prune these cases effectively because the Parikh + constraints for `(x|y)+` allow many length values +- The solver exhausts all 6 iterations without finding the contradiction + +**Fix proposal**: Improve the pre-check for membership intersection. When all strings +in a word equation are constrained to finite-alphabet regexes (e.g., `(x|y)+ ∩ x+`), +use a more aggressive alphabet-intersection analysis to prune branches early. + +### loop.smt2 and simple-concat-4.smt2 + +These benchmarks involve: +- `re.loop` (counted repetition) for `loop.smt2` +- `define-fun-rec` (recursive function definitions for transducer-style constraints) for `simple-concat-4.smt2` + +For `loop.smt2`: the `(_ re.loop 3 3)` constraint requires exactly 3 digits to appear +at a specific position. The nseq solver may not have efficient handling for counted +repetitions in the Parikh constraint generator. + +For `simple-concat-4.smt2`: the `define-fun-rec` introduces recursive predicates. +These are handled by `unfold_ho_terms()` in `theory_nseq.cpp`, which iteratively +unfolds the recursive definition. The timeout may be due to slow convergence of the +unfolding or the combination with regex constraints. + +### Performance improvement opportunities + +1. **Better Parikh stride computation for counted repetitions** (`seq_parikh.cpp`): + `(_ re.loop N N)` has stride 0 and min_len = N * char_len. Ensure this is handled. + +2. **Alphabet intersection pre-pruning** in the Nielsen graph: before generating all + possible character extensions, check if the intersection of the alphabets of all + regex constraints on a variable is consistent with the required character. + +3. **Increase Nielsen DFS depth budget adaptively** or use a BFS strategy for small + alphabets. + +4. **Cache Parikh constraints** across Nielsen nodes that share the same membership constraints. + +## Files to investigate + +- `src/smt/seq/seq_nielsen.cpp` — `solve()`, `generate_extensions()`, depth budget +- `src/smt/seq/seq_parikh.cpp` — `compute_length_stride()` for `re.loop`, constraint generation +- `src/smt/seq/seq_regex.cpp` — alphabet-intersection analysis +- `src/smt/theory_nseq.cpp` — `unfold_ho_terms()` convergence for recursive definitions diff --git a/research/docs/nseq-issues/README.md b/research/docs/nseq-issues/README.md new file mode 100644 index 000000000..c058aa566 --- /dev/null +++ b/research/docs/nseq-issues/README.md @@ -0,0 +1,78 @@ +# nseq Solver Issues — Ostrich Benchmark Analysis + +This directory contains GitHub issue drafts for the nseq solver based on analysis of +the Ostrich benchmark suite run on the c3 branch (Z3 v4.17.0, commit 8ef491e). + +**Source**: https://github.com/Z3Prover/z3/discussions/9071 +**Branch**: c3 +**Benchmark set**: Ostrich (349 files), 5s timeout +**Date**: 2026-03-21 + +## Summary of Results + +| Solver | sat | unsat | unknown | timeout | bug/crash | +|--------|-----|-------|---------|---------|-----------| +| seq | 242 | 75 | 28 | 0 | 4 | +| nseq | 245 | 66 | 11 | 0 | 27 | + +- **22 soundness disagreements** between seq and nseq (nseq returns sat where seq returns unsat in 11 cases) +- **27 nseq crashes** (vs 4 for seq, all on the same 4 parse-ecma files) +- **5+ performance regressions** (nseq timeout where seq solves in <30ms) + +## Issue Files + +### Soundness Bugs (Highest Priority) + +1. **[01-soundness-replace-empty-string.md](01-soundness-replace-empty-string.md)** + `str.replace` with empty input returns sat instead of unsat. + Affects: `replace-special.smt2`, `replace-special-4.smt2`, `replace-special-5.smt2`, `simple-replace-4b.smt2` + +2. **[02-soundness-contains-monotonicity.md](02-soundness-contains-monotonicity.md)** + `str.contains` not preserved through concatenation (`y ++ y`). + Affects: `contains-4.smt2` + +3. **[03-soundness-indexof-regex.md](03-soundness-indexof-regex.md)** + `str.indexof` unsound when combined with regex membership constraints. + Affects: `indexof_const_index_unsat.smt2`, `indexof_var_unsat.smt2` + +4. **[04-soundness-prefix-suffix.md](04-soundness-prefix-suffix.md)** + `str.prefixof` / `str.suffixof` interaction not fully axiomatized. + Affects: `prefix-suffix.smt2`, `failedProp.smt2`, `failedProp2.smt2` + +5. **[05-soundness-conflicting-regex.md](05-soundness-conflicting-regex.md)** + Conflicting regex memberships not detected (disjoint character sets). + Affects: `norn-benchmark-9f.smt2` + +### Crashes (Medium Priority) + +6. **[06-crash-non-greedy-quantifiers.md](06-crash-non-greedy-quantifiers.md)** + Non-greedy regex quantifiers (`re.+?`, `re.*?`) cause crash. + Affects: `non-greedy-quantifiers.smt2` + +7. **[07-crash-str-lt-le-axioms.md](07-crash-str-lt-le-axioms.md)** + `str.<` and `str.<=` axiom handling failures. + Affects: `str-leq11.smt2`, `str-leq12.smt2`, `str-leq13.smt2`, `str-lt.smt2`, `str-lt2.smt2` + +8. **[08-crash-str-at-substr-from-to-int.md](08-crash-str-at-substr-from-to-int.md)** + Crashes on `str.at`, `str.substr`, `str.from_int`, `str.to_int` in complex formulas. + Affects: `str.at.smt2`, `str.at-2.smt2`, `str.from_int_6.smt2`, `str.to_int_5.smt2`, `str.to_int_6.smt2`, `substring.smt2`, `substring2.smt2`, `substring2b.smt2`, `is-digit-2.smt2` + +9. **[09-crash-word-equation-regex.md](09-crash-word-equation-regex.md)** + Various crashes on word equation + regex benchmarks. + Affects: `concat-001.smt2`, `contains-1.smt2`, `contains-7.smt2`, `nonlinear-2.smt2`, `noodles-unsat8.smt2`, `noodles-unsat10.smt2`, `norn-benchmark-9i.smt2`, `pcp-1.smt2` + +### Performance Regressions (Lower Priority) + +10. **[10-performance-regression-regex-word-equation.md](10-performance-regression-regex-word-equation.md)** + nseq times out on regex + word equation benchmarks where seq is fast. + Affects: `concat-regex.smt2`, `concat-regex3.smt2`, `loop.smt2`, `simple-concat-4.smt2`, `all-quantifiers.smt2` + +## Quick-Win Fixes + +The following issues have clear, localized fixes: + +1. **Non-greedy quantifiers** (issue 06): Add rewrite rules to normalize `re.+?` → `re.+`, `re.*?` → `re.*` in `seq_rewriter.cpp`. + +2. **Boolean-equality pattern for contains** (issue 09, `contains-1.smt2`): In `assign_eh`, handle the pattern `(= (str.contains ...) true/false)` by extracting the inner predicate. + +3. **Conflicting regex character sets** (issue 05): Improve the Parikh pre-check or add a character-set intersection check for multiple membership constraints on the same variable.