Add nseq issue drafts from Ostrich benchmark analysis (discussion #9071) (#9073)

Agent-Logs-Url: https://github.com/Z3Prover/z3/sessions/04321ea7-2a53-4ed5-9f43-816dc6f7476b Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
2026-07-06 07:16:13 +00:00 · 2026-03-20 19:07:51 -07:00 · 2026-03-20 19:07:51 -07:00 · f518faac9b
commit f518faac9b
parent 8ef491edea
11 changed files with 932 additions and 0 deletions
--- a/research/docs/nseq-issues/01-soundness-replace-empty-string.md
+++ b/research/docs/nseq-issues/01-soundness-replace-empty-string.md
@ -0,0 +1,78 @@
+# [nseq] Soundness bug: str.replace with empty-string input returns sat instead of unsat
+
+**Labels**: bug, c3, nseq, soundness
+
+## Summary
+
+The nseq solver returns `sat` on benchmarks where the correct answer is `unsat` when
+`str.replace` is applied to an empty input string and the result is constrained to an
+impossible value. The seq solver correctly returns `unsat` for all of these cases.
+
+## Affected benchmarks (Ostrich suite, c3 branch)
+
+| File | seq verdict | nseq verdict |
+|------|-------------|--------------|
+| `replace-special.smt2` | unsat | **sat** (WRONG) |
+| `replace-special-4.smt2` | unsat | **sat** (WRONG) |
+| `replace-special-5.smt2` | unsat | **sat** (WRONG) |
+| `simple-replace-4b.smt2` | unsat | **sat** (WRONG) |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing examples
+
+```smt2
+; replace-special.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-logic QF_SLIA)
+(declare-fun x () String)
+(assert (not (= (= "B" (str.replace "" x "A")) false)))
+(check-sat)
+```
+
+```smt2
+; replace-special-4.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-logic QF_SLIA)
+(declare-fun x () String)
+(declare-fun y () String)
+(assert (not (= (str.replace "" (str.replace "" x "A") "B")
+               (str.substr "B" 0 (str.len x)))))
+(check-sat)
+```
+
+```smt2
+; simple-replace-4b.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-logic QF_S)
+(declare-fun x_7 () String)
+(declare-fun x_11 () String)
+(declare-fun x_12 () String)
+(assert (= x_7 (str.replace (str.++ x_11 x_12) "e" "a")))
+(assert (= x_7 "hello"))
+(check-sat)
+```
+
+## Analysis
+
+`replace_axiom` in `src/ast/rewriter/seq_axioms.cpp` generates clauses for
+`r = str.replace u s t`:
+
+- `~(u="") ∨ (s="") ∨ (r=u)` — empty input without empty pattern gives `r = u`
+- `~(s="") ∨ (r = t ++ u)` — empty pattern gives `r = t ++ u`
+- `contains(u,s) ∨ r = u` — no occurrence gives `r = u`
+
+For `replace-special.smt2` with `u=""`:
+- If `s=""`: `r = "A" ++ "" = "A"`, but `r = "B"` → contradiction → unsat
+- If `s≠""`: `r = u = ""`, but `r = "B"` → contradiction → unsat
+
+The clauses are individually correct, but the nseq Nielsen graph apparently fails to
+derive the contradiction when the result `r` is constrained via equality to an impossible
+constant. The likely root cause is incomplete propagation of the result equality through
+the egraph into the replace axiom conclusions.
+
+For `simple-replace-4b.smt2`, the issue is that `str.replace (x_11 ++ x_12) "e" "a" = "hello"`
+is unsatisfiable because "hello" contains 'l', which cannot be produced by replacing 'e'→'a' in
+any string, yet nseq claims sat.
+
+## Files to investigate
+
+- `src/ast/rewriter/seq_axioms.cpp` — `replace_axiom`
+- `src/smt/theory_nseq.cpp` — `dequeue_axiom`, `populate_nielsen_graph`
--- a/research/docs/nseq-issues/02-soundness-contains-monotonicity.md
+++ b/research/docs/nseq-issues/02-soundness-contains-monotonicity.md
@ -0,0 +1,57 @@
+# [nseq] Soundness bug: str.contains not preserved through concatenation
+
+**Labels**: bug, c3, nseq, soundness
+
+## Summary
+
+The nseq solver returns `sat` for a benchmark that asserts:
+1. `z` is a substring of `y` (`str.contains y z`)
+2. `x = y ++ y`
+3. `z` is NOT a substring of `x` (`not (str.contains x z)`)
+
+The third assertion contradicts the first two, so the formula should be `unsat`.
+The seq solver correctly returns `unsat`; nseq incorrectly returns `sat`.
+
+## Affected benchmark
+
+| File | seq verdict | nseq verdict |
+|------|-------------|--------------|
+| `contains-4.smt2` | unsat | **sat** (WRONG) |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing example
+
+```smt2
+; contains-4.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-logic QF_S)
+(declare-const x String)
+(declare-const y String)
+(declare-const z String)
+(assert (str.contains y z))
+(assert (= x (str.++ y y)))
+(assert (not (str.contains x z)))
+(check-sat)
+```
+
+## Analysis
+
+If `str.contains y z` is true, then `y = L ++ z ++ R` for some `L`, `R`.
+Then `x = y ++ y = L ++ z ++ R ++ L ++ z ++ R`, so `z` is a substring of `x`.
+Therefore `not (str.contains x z)` is a contradiction.
+
+The nseq solver should derive this via the combination of:
+- `contains_true_axiom`: `str.contains y z` → `y = f1 ++ z ++ f2` (skolems)
+- The equality `x = y ++ y`
+- The Nielsen graph constraint that propagates the concatenation decomposition
+
+The root cause may be that the nseq solver does not propagate `str.contains` facts
+through word equation equalities in the Nielsen graph. When `x = y ++ y` is processed
+as a word equation, the contains constraint on `y` should propagate to derive a contains
+constraint on `x`, but this inference appears to be missing.
+
+## Files to investigate
+
+- `src/ast/rewriter/seq_axioms.cpp` — `contains_true_axiom`, `unroll_not_contains`
+- `src/smt/seq/seq_nielsen.h` / `seq_nielsen.cpp` — word equation propagation
+- `src/smt/theory_nseq.cpp` — `assign_eh` for contains, `populate_nielsen_graph`
--- a/research/docs/nseq-issues/03-soundness-indexof-regex.md
+++ b/research/docs/nseq-issues/03-soundness-indexof-regex.md
@ -0,0 +1,71 @@
+# [nseq] Soundness bug: str.indexof unsound when combined with regex membership
+
+**Labels**: bug, c3, nseq, soundness
+
+## Summary
+
+The nseq solver returns `sat` for benchmarks that constrain `str.indexof` to values
+impossible given the regex membership of the input string. The seq solver correctly
+returns `unsat` for these cases.
+
+## Affected benchmarks
+
+| File | seq verdict | nseq verdict |
+|------|-------------|--------------|
+| `indexof_const_index_unsat.smt2` | unsat | **sat** (WRONG) |
+| `indexof_var_unsat.smt2` | unsat | **sat** (WRONG) |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing examples
+
+```smt2
+; indexof_const_index_unsat.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-info :status unsat)
+(declare-fun a () String)
+(declare-fun i () Int)
+(declare-fun j () Int)
+(assert (str.in_re a (re.union (str.to_re "hhhbbb") (str.to_re "bhhh"))))
+(assert (= (str.indexof a "hhh" j) i))
+(assert (= i 2))
+(assert (> j 0))
+(check-sat)
+```
+
+```smt2
+; indexof_var_unsat.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-info :status unsat)
+(declare-fun a () String)
+(declare-fun i () Int)
+(declare-fun j () Int)
+(assert (str.in_re a (re.union (str.to_re "hhhbbb") (str.to_re "bhhh"))))
+(assert (= (str.indexof a "hhh" j) i))
+(assert (> i 1))
+(check-sat)
+```
+
+## Analysis
+
+For `indexof_const_index_unsat.smt2`:
+- `a ∈ {hhhbbb, bhhh}` (two possibilities)
+- `str.indexof a "hhh" j = 2` with `j > 0`
+- In "hhhbbb", "hhh" appears at index 0 only (but j > 0 means the search starts after index 0)
+- In "bhhh", "hhh" appears at index 1, but with j > 0 the only valid return would be 1, not 2
+- So i = 2 is impossible → unsat
+
+The `indexof_axiom` in `seq_axioms.cpp` generates arithmetic constraints for indexof,
+but these constraints may not be sufficiently tight when combined with concrete regex
+membership constraints. Specifically, the nseq solver does not appear to combine the
+regex membership information with the indexof position constraints to derive the
+contradiction.
+
+The root cause is likely that nseq's `indexof_axiom` generates axioms about `str.indexof`
+without leveraging the concrete alphabet constraints imposed by regex membership. The
+seq solver may do additional propagation (e.g., via character-level analysis of the
+regex language) that nseq does not perform.
+
+## Files to investigate
+
+- `src/ast/rewriter/seq_axioms.cpp` — `indexof_axiom`
+- `src/smt/seq/seq_regex.h` / `seq_regex.cpp` — regex membership propagation
+- `src/smt/theory_nseq.cpp` — interaction between regex constraints and arithmetic axioms
--- a/research/docs/nseq-issues/04-soundness-prefix-suffix.md
+++ b/research/docs/nseq-issues/04-soundness-prefix-suffix.md
@ -0,0 +1,78 @@
+# [nseq] Soundness bug: str.prefixof / str.suffixof interaction not fully axiomatized
+
+**Labels**: bug, c3, nseq, soundness
+
+## Summary
+
+The nseq solver returns `sat` for several benchmarks involving `str.prefixof`,
+`str.suffixof`, and `str.at` where the correct answer is `unsat`. The seq solver
+handles all these cases correctly.
+
+## Affected benchmarks
+
+| File | seq verdict | nseq verdict |
+|------|-------------|--------------|
+| `prefix-suffix.smt2` | unsat | **sat** (WRONG) |
+| `failedProp.smt2` | unsat | **sat** (WRONG) |
+| `failedProp2.smt2` | unsat | **sat** (WRONG) |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing examples
+
+```smt2
+; prefix-suffix.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-logic QF_S)
+(declare-fun a () String)
+(declare-fun b () String)
+(assert (not (=> (and (str.prefixof a b) (str.suffixof b a)) (= a b))))
+(check-sat)
+```
+
+The above asserts: `(a prefix b) AND (b suffix a) AND a ≠ b`.
+This is unsatisfiable: if `a` is a prefix of `b` and `b` is a suffix of `a`,
+then `|a| ≤ |b|` and `|b| ≤ |a|`, so `|a| = |b|`. Combined with the prefix and
+suffix conditions, this forces `a = b`.
+
+```smt2
+; failedProp.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-logic QF_SLIA)
+(declare-fun x () String)
+(declare-fun y () String)
+(assert (not (= (str.prefixof x (str.at x 1)) (= x ""))))
+(check-sat)
+```
+
+The above asserts: `(str.prefixof x (str.at x 1)) ≠ (x = "")`.
+`str.at x 1` is the second character of `x` (as a length-1 string).
+`str.prefixof x (str.at x 1)` is true only when `x` is a prefix of a single character,
+which means `x = ""` or `x` is that single character (but then `x` is a prefix of itself only
+if `len(x) ≤ 1`). This equivalence should hold: `str.prefixof x (str.at x 1) ↔ x = ""`.
+So the assertion is `unsat`.
+
+```smt2
+; failedProp2.smt2 — EXPECTED: unsat, nseq returns: sat (same pattern with suffixof)
+(set-logic QF_SLIA)
+(declare-fun x () String)
+(assert (not (= (str.suffixof x (str.at x 1)) (= x ""))))
+(check-sat)
+```
+
+## Analysis
+
+For `prefix-suffix.smt2`, the current axioms for `prefix_axiom` and `suffix_axiom`
+introduce skolem variables but may not derive the length equality `|a| = |b|` from the
+combined prefix+suffix condition, or may not then propagate this to `a = b`.
+
+For `failedProp.smt2` and `failedProp2.smt2`, the issue involves interaction between
+`str.at`, `str.prefixof`/`str.suffixof`, and string-integer length constraints. The
+nseq solver may not properly resolve the combination of:
+1. The `at_axiom` (generating constraints on `str.at x 1`)
+2. The `prefix_axiom`/`suffix_axiom` (decomposing the prefix/suffix relationship)
+3. The arithmetic constraint linking lengths
+
+## Files to investigate
+
+- `src/ast/rewriter/seq_axioms.cpp` — `prefix_axiom`, `suffix_axiom`, `prefix_true_axiom`, `suffix_true_axiom`, `at_axiom`
+- `src/smt/theory_nseq.cpp` — `assign_eh` for prefix/suffix, `final_check_eh`
+- `src/smt/seq/seq_nielsen.cpp` — length propagation in Nielsen graph
--- a/research/docs/nseq-issues/05-soundness-conflicting-regex.md
+++ b/research/docs/nseq-issues/05-soundness-conflicting-regex.md
@ -0,0 +1,65 @@
+# [nseq] Soundness bug: conflicting regex memberships not detected (norn-benchmark-9f)
+
+**Labels**: bug, c3, nseq, soundness
+
+## Summary
+
+The nseq solver returns `sat` for a formula asserting that a non-empty string belongs
+simultaneously to two disjoint regular languages. The correct answer is `unsat`.
+
+## Affected benchmark
+
+| File | seq verdict | nseq verdict |
+|------|-------------|--------------|
+| `norn-benchmark-9f.smt2` | unsat | **sat** (WRONG) |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing example
+
+```smt2
+; norn-benchmark-9f.smt2 — EXPECTED: unsat, nseq returns: sat
+(set-logic QF_S)
+(declare-fun var_0 () String)
+(assert (str.in.re var_0 (re.* (re.range "a" "u"))))
+(assert (str.in.re var_0 (re.* (str.to.re "v"))))
+(assert (not (= var_0 "")))
+(check-sat)
+```
+
+The formula asserts:
+1. `var_0 ∈ (re.range "a" "u")* ` — all characters in "a"–"u"
+2. `var_0 ∈ "v"*` — only the character "v"
+3. `var_0 ≠ ""`
+
+Since "v" is not in the range "a"–"u" (the range includes up to 'u', not 'v'),
+the intersection of the two languages is `{""}`. Combined with `var_0 ≠ ""`, this is
+unsatisfiable.
+
+## Note on syntax
+
+This benchmark uses the old SMT-LIB 2.5 syntax `str.in.re` (with dots) rather than
+the SMT-LIB 2.6 syntax `str.in_re` (with underscore). Both are supported by Z3's parser.
+The bug may be triggered specifically by the old-style syntax interaction with nseq's
+regex handling, or may be a pure logic issue.
+
+## Analysis
+
+The nseq solver handles `str.in_re` constraints by computing the intersection of
+multiple regex membership constraints for the same string variable (via the sgraph and
+Nielsen graph). The Parikh image pre-check (`seq_parikh.cpp`) should detect that
+`(re.range "a" "u")* ∩ "v"*` restricted to non-empty strings is empty, since:
+- `"v"*` allows only 'v' characters
+- `(re.range "a" "u")*` allows only characters in 'a'..'u'
+- 'v' ∉ 'a'..'u' → intersection = `{""}` only
+
+The root cause is likely that the nseq solver does not derive the character-level
+disjointness between these two regex languages. The seq solver may use a derivative-based
+or automaton intersection approach that detects this.
+
+## Files to investigate
+
+- `src/smt/seq/seq_regex.h` / `seq_regex.cpp` — regex membership processing, pre-check
+- `src/smt/seq/seq_parikh.h` / `seq_parikh.cpp` — Parikh image constraint generation
+- `src/smt/seq/seq_nielsen.cpp` — membership processing in Nielsen nodes
+- `src/smt/theory_nseq.cpp` — `assign_eh` for regex membership
--- a/research/docs/nseq-issues/06-crash-non-greedy-quantifiers.md
+++ b/research/docs/nseq-issues/06-crash-non-greedy-quantifiers.md
@ -0,0 +1,67 @@
+# [nseq] Crash: non-greedy regex quantifiers (re.+?, re.*?) cause exception
+
+**Labels**: bug, c3, nseq, crash
+
+## Summary
+
+The nseq solver crashes (throws an exception or returns an error) when the input
+formula contains non-greedy regex quantifiers such as `re.+?` (lazy plus) or `re.*?`
+(lazy star). The seq solver handles these correctly by treating them as equivalent to
+their greedy counterparts for satisfiability purposes.
+
+## Affected benchmark
+
+| File | seq verdict | nseq verdict |
+|------|-------------|--------------|
+| `non-greedy-quantifiers.smt2` | sat | **bug/crash** |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing example
+
+```smt2
+; non-greedy-quantifiers.smt2 — EXPECTED: sat, nseq crashes
+(set-logic QF_S)
+(declare-fun a () String)
+(declare-fun b () String)
+(declare-fun c () String)
+(declare-fun d () String)
+(assert (= a (str.++ b c)))
+(assert (= b (str.++ d c)))
+; non-greedy re.+? has same semantics as re.+ for satisfiability
+(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y")))))
+(assert (str.in.re c (re.+? (str.to.re "x"))))
+(check-sat)
+(get-model)
+```
+
+## Analysis
+
+Z3's SMT-LIB parser supports non-greedy regex quantifiers (`re.+?`, `re.*?`, `re.??`)
+as part of the ECMA-like regex extension. For satisfiability checking, non-greedy
+quantifiers are semantically equivalent to their greedy counterparts (they only differ
+in which match is returned, not in what strings are accepted).
+
+The nseq solver's regex handling infrastructure (`seq_regex.cpp`) does not appear to
+recognize or normalize non-greedy quantifiers. When the symbol `re.+?` is encountered
+in a membership constraint, the solver either:
+1. Does not recognize the operator and triggers `push_unhandled_pred()`, causing `FC_GIVEUP`
+   (which may manifest as an error/"bug" in the benchmark runner), or
+2. Crashes due to an unhandled case in the Brzozowski derivative computation or
+   Parikh constraint generation.
+
+### Fix proposal
+
+Non-greedy quantifiers should be rewritten to their greedy equivalents before processing:
+- `re.+? e` → `re.+ e` (lazy plus → greedy plus)
+- `re.*? e` → `re.* e` (lazy star → greedy star)
+- `re.?? e` → `re.opt e` (lazy optional → greedy optional)
+
+This rewrite is sound for satisfiability/unsatisfiability since both interpretations
+accept exactly the same language.
+
+## Files to investigate
+
+- `src/smt/seq/seq_regex.cpp` — regex preprocessing and derivative computation
+- `src/ast/rewriter/seq_rewriter.cpp` — regex rewriting rules (add non-greedy normalization)
+- `src/smt/theory_nseq.cpp` — `internalize_atom` / `assign_eh` for regex membership
--- a/research/docs/nseq-issues/07-crash-str-lt-le-axioms.md
+++ b/research/docs/nseq-issues/07-crash-str-lt-le-axioms.md
@ -0,0 +1,82 @@
+# [nseq] Crashes: str.< and str.<= axiom handling failures
+
+**Labels**: bug, c3, nseq, crash
+
+## Summary
+
+The nseq solver crashes (returns "bug" in the benchmark runner) on formulas involving
+`str.<` (lexicographic less-than) and `str.<=` (lexicographic less-than-or-equal)
+string comparison predicates in contexts where they interact with length constraints
+and character-level reasoning.
+
+## Affected benchmarks
+
+| File | seq verdict | nseq verdict |
+|------|-------------|--------------|
+| `str-leq11.smt2` | sat | **bug/crash** |
+| `str-leq12.smt2` | sat | **bug/crash** |
+| `str-leq13.smt2` | sat | **bug/crash** |
+| `str-lt.smt2` | sat | **bug/crash** |
+| `str-lt2.smt2` | sat | **bug/crash** |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing examples
+
+```smt2
+; str-leq11.smt2 — EXPECTED: sat, nseq crashes
+(set-logic QF_SLIA)
+(declare-fun x () String)
+(assert (= 1 (str.len x)))
+(assert (str.<= "a" x))
+(assert (str.<= x "c"))
+(check-sat)
+```
+
+```smt2
+; str-lt.smt2 — EXPECTED: sat, nseq crashes
+(set-logic QF_SLIA)
+(declare-fun x () String)
+(assert (= 1 (str.len x)))
+(assert (str.< "A" x))
+(assert (str.< x "C"))
+(check-sat)
+```
+
+## Analysis
+
+The nseq solver handles `str.<` and `str.<=` via `lt_axiom` and `le_axiom` in
+`seq_axioms.cpp`, triggered from `relevant_eh` → `dequeue_axiom`. The `assign_eh`
+for these predicates is a no-op with comment "axioms added via relevant_eh → dequeue_axiom".
+
+The `lt_axiom` generates clauses involving:
+- Skolem variables `x`, `y`, `z` (common prefix and suffixes)
+- Skolem character variables `c`, `d`
+- Prefix and character-comparison constraints
+
+The crash likely occurs because:
+1. The `lt_axiom` / `le_axiom` introduces character-comparison constraints (via `seq.mk_lt`)
+   that require character-level arithmetic reasoning (e.g., comparing character codes).
+   If the nseq solver's arithmetic integration does not correctly handle these
+   character-code comparisons, it may crash or give incorrect results.
+
+2. The `lt_axiom` clauses reference the Boolean literal for the `str.<` predicate itself
+   (via `expr_ref lt = expr_ref(n, m)`). If `n` is used both as a term and as a literal
+   reference, and nseq has an inconsistency in how it maps the Bool var to the expression,
+   this could cause a null dereference or assertion failure.
+
+3. The interaction between the length constraint `len(x) = 1` and the lt/le axioms'
+   prefix-based decomposition may produce contradictory or unresolvable constraints
+   in the Nielsen graph.
+
+### What str-leq11 requires
+
+For `len(x) = 1`, `"a" ≤ x ≤ "c"` (lexicographically): x ∈ {"a", "b", "c"}.
+The solver needs to find x = "a", "b", or "c". This is straightforward with
+character-range reasoning.
+
+## Files to investigate
+
+- `src/ast/rewriter/seq_axioms.cpp` — `lt_axiom`, `le_axiom`
+- `src/smt/theory_nseq.cpp` — `assign_eh` for is_lt/is_le, `relevant_eh`, `dequeue_axiom`
+- `src/smt/seq/seq_nielsen.cpp` — character-level constraint handling
--- a/research/docs/nseq-issues/08-crash-str-at-substr-from-to-int.md
+++ b/research/docs/nseq-issues/08-crash-str-at-substr-from-to-int.md
@ -0,0 +1,118 @@
+# [nseq] Crashes: str.at / str.substr / str.from_int / str.to_int in complex formulas
+
+**Labels**: bug, c3, nseq, crash
+
+## Summary
+
+The nseq solver crashes or returns errors on benchmarks that combine `str.at`,
+`str.substr`, `str.from_int`, and `str.to_int` with regex constraints, length
+constraints, or positional arithmetic. The seq solver handles all these correctly.
+
+## Affected benchmarks
+
+| File | seq verdict | nseq verdict |
+|------|-------------|--------------|
+| `str.at.smt2` | sat | **bug/crash** |
+| `str.at-2.smt2` | sat | **bug/crash** |
+| `str.from_int_6.smt2` | sat | **bug/crash** |
+| `str.to_int_5.smt2` | unsat | **bug/crash** |
+| `str.to_int_6.smt2` | unknown | **bug/crash** |
+| `substring.smt2` | sat | **bug/crash** |
+| `substring2.smt2` | sat | **bug/crash** |
+| `substring2b.smt2` | sat | **bug/crash** |
+| `is-digit-2.smt2` | sat | **bug/crash** |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing examples
+
+```smt2
+; str.at.smt2 — EXPECTED: sat, nseq crashes
+(set-logic QF_S)
+(declare-const x String)
+(assert (= (str.at x 2) "x"))
+(assert (= (str.at x 3) "y"))
+(check-sat)
+```
+
+```smt2
+; str.from_int_6.smt2 — EXPECTED: sat, nseq crashes
+(declare-const w String)
+(declare-const x Int)
+(declare-const y Int)
+(assert (= (str.substr w 0 3) (str.from_int x)))
+(assert (= (str.substr w 10 3) (str.from_int y)))
+(assert (> x (* 2 y)))
+(assert (>= y 0))
+(check-sat)
+```
+
+```smt2
+; str.to_int_5.smt2 — EXPECTED: unsat, nseq crashes
+(declare-const w String)
+(assert (= (str.to_int (str.substr w 0 10)) 123))
+(assert (= (str.to_int (str.substr w 10 10)) 321))
+(check-sat)
+```
+
+```smt2
+; substring.smt2 — EXPECTED: sat, nseq crashes
+(set-logic QF_S)
+(declare-fun x () String)
+(declare-fun y () String)
+(declare-fun n () Int)
+(assert (= y (str.substr x n 5)))
+(assert (>= (str.len x) (+ n 5)))
+(assert (= (str.at y 0) "a"))
+(assert (> n 0))
+(check-sat)
+```
+
+```smt2
+; is-digit-2.smt2 — EXPECTED: sat, nseq crashes
+(set-logic QF_S)
+(declare-fun x () String)
+(assert (str.is_digit (str.at x 1)))
+(assert (str.is_digit (str.at x 2)))
+(check-sat)
+```
+
+## Analysis
+
+These benchmarks all combine position-indexed string operations (`str.at`, `str.substr`)
+with either:
+- Direct string equality constraints on the extracted substring/character
+- Integer conversion operations (`str.from_int`, `str.to_int`)
+- Character classification (`str.is_digit`)
+- Arithmetic constraints on positions (`n > 0`, `len(x) >= n + 5`)
+
+The axioms for these operations (`at_axiom`, `extract_axiom`, `itos_axiom`, `stoi_axiom`,
+`is_digit_axiom`) are defined in `seq_axioms.cpp` and are triggered via `relevant_eh` →
+`dequeue_axiom`. The individual axioms appear to be implemented.
+
+The crashes likely stem from one or more of:
+
+1. **Interaction with variable-offset substring**: When `str.at x i` uses a variable
+   index `i` (e.g., `str.at x 1` where `1` is a concrete integer, but the axiom
+   creates skolems that interact with the Nielsen graph's variable positions),
+   the resulting constraints may cause assertion failures.
+
+2. **Double-application of str.at**: In `str.at.smt2`, two `str.at` constraints on the
+   same variable `x` at different positions (2 and 3) must simultaneously hold.
+   The `at_axiom` introduces skolem variables for each, and their combination in the
+   Nielsen graph may not be handled correctly.
+
+3. **Nested substr + conversion**: In `str.from_int_6.smt2` and `str.to_int_5.smt2`,
+   the combination of `str.substr` extracting a sub-region plus integer conversion
+   creates constraints that span the string/arithmetic boundary in ways the nseq
+   solver's axiom-enqueuing mechanism may not handle.
+
+4. **str.is_digit on str.at result**: In `is-digit-2.smt2`, `str.is_digit (str.at x 1)`
+   requires the `is_digit_axiom` to be applied to the result of `at_axiom`. The nseq
+   solver may not propagate the `relevant_eh` trigger through the composed term.
+
+## Files to investigate
+
+- `src/ast/rewriter/seq_axioms.cpp` — `at_axiom`, `extract_axiom`, `itos_axiom`, `stoi_axiom`, `is_digit_axiom`
+- `src/smt/theory_nseq.cpp` — `relevant_eh` (check if nested terms trigger axiom enqueuing), `dequeue_axiom`
+- `src/smt/seq/seq_nielsen.cpp` — handling of position-indexed skolem variables
--- a/research/docs/nseq-issues/09-crash-word-equation-regex.md
+++ b/research/docs/nseq-issues/09-crash-word-equation-regex.md
@ -0,0 +1,117 @@
+# [nseq] Crashes: word equation + regex benchmarks (various operations)
+
+**Labels**: bug, c3, nseq, crash
+
+## Summary
+
+The nseq solver crashes or errors on a variety of benchmarks involving combinations of
+word equations with regex constraints, contains, replace_all, and SLIA arithmetic.
+These represent missing or incomplete operation support in the nseq solver.
+
+## Affected benchmarks
+
+| File | seq verdict | nseq verdict | Operations involved |
+|------|-------------|--------------|---------------------|
+| `concat-001.smt2` | sat | **bug/crash** | word equation + regex (old syntax) |
+| `contains-1.smt2` | unsat | **bug/crash** | contains in equality context |
+| `contains-7.smt2` | sat | **bug/crash** | str.replace_all |
+| `nonlinear-2.smt2` | sat | **bug/crash** | `a = b ++ b` + regex |
+| `noodles-unsat8.smt2` | unknown | **bug/crash** | complex word equation |
+| `noodles-unsat10.smt2` | unsat | **bug/crash** | complex word equation |
+| `norn-benchmark-9i.smt2` | sat | **bug/crash** | old-syntax regex + complex |
+| `pcp-1.smt2` | sat | **bug/crash** | SLIA + complex string ops |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing examples
+
+```smt2
+; concat-001.smt2 — EXPECTED: sat, nseq crashes
+; Uses old-style str.in.re syntax and unambiguous regex constraints
+(declare-const x String)
+(declare-const y String)
+(declare-const z String)
+(assert (str.in.re x (re.* (str.to.re "ab"))))
+(assert (str.in.re y (re.* (str.to.re "c"))))
+(assert (str.in.re z (str.to.re "cccc")))
+(assert (= z (str.++ x y)))
+(check-sat)
+```
+
+```smt2
+; contains-1.smt2 — EXPECTED: unsat, nseq crashes
+; str.contains used inside an equality with a boolean literal
+(set-logic QF_S)
+(declare-const x String)
+(assert (not (= (str.contains "A" x) true)))
+(check-sat)
+```
+
+```smt2
+; contains-7.smt2 — EXPECTED: sat, nseq crashes
+; str.replace_all used with regex constraints
+(set-logic QF_S)
+(declare-const x String)
+(declare-const y String)
+(declare-const z String)
+(declare-const res String)
+(assert (= res (str.replace_all x y z)))
+(assert (str.in_re x (re.* (str.to_re "ab"))))
+(assert (not (str.in_re y (re.* (str.to_re "ab")))))
+(assert (not (= res "")))
+(check-sat)
+```
+
+```smt2
+; nonlinear-2.smt2 — EXPECTED: sat, nseq crashes
+; Non-linear word equation: a = b ++ b (self-concatenation)
+(set-logic QF_S)
+(declare-fun a () String)
+(declare-fun b () String)
+(assert (= a (str.++ b b)))
+(assert (str.in.re a (re.++ (re.* (str.to.re "(")) (re.+ (str.to.re ")")))))
+(check-sat)
+```
+
+## Analysis of individual crash causes
+
+### concat-001.smt2
+Uses the old SMT-LIB `str.in.re` / `str.to.re` syntax with a word equation
+`z = x ++ y` where all three variables have concrete regex constraints. If the nseq
+solver's parser or internalization does not correctly handle the old-style `str.to.re`
+synonym, the membership constraint may not be registered correctly.
+
+### contains-1.smt2
+The formula `(not (= (str.contains "A" x) true))` creates a Boolean atom of the form
+`(= <bool-expr> true)` rather than the direct `str.contains` predicate.
+In `assign_eh`, the direct check `is_contains(e)` will NOT match `(= (str.contains "A" x) true)`.
+The catch-all `push_unhandled_pred()` fires, but the outer negation means the atom
+`(= (str.contains "A" x) true)` is assigned to false. This should cause `FC_GIVEUP`,
+but may manifest as a crash if the solver mishandles the boolean variable assignment.
+
+**Fix**: Normalize `(= <bool-expr> true/false)` patterns in `assign_eh` or during
+internalization, or add specific handling for `str.contains` embedded in boolean equalities.
+
+### contains-7.smt2
+Uses `str.replace_all` which is handled by `replace_all_axiom`. The crash may occur
+when `replace_all_axiom` interacts with regex constraints on the arguments in ways
+the Nielsen graph doesn't support.
+
+### nonlinear-2.smt2
+The word equation `a = b ++ b` is a non-linear equation (variable `b` appears twice
+on the RHS). The Nielsen graph supports some non-linear equations, but this specific
+combination with a regex constraint `a ∈ (re.* "(") ++ (re.+ ")")` — which constrains
+characters — may expose a gap in the non-linear handling.
+
+### noodles-unsat8 / noodles-unsat10
+These are word equations from the Oliver Markgraf OSTRICH benchmark set. They involve
+multiple variables with complex concatenation patterns and regex constraints. The crashes
+suggest that the combination of 3+ variable word equations with disjunctive regex
+constraints exceeds what the current nseq Nielsen graph implementation can handle.
+
+## Files to investigate
+
+- `src/smt/theory_nseq.cpp` — `assign_eh` (boolean-equality pattern for contains), `internalize_atom`
+- `src/ast/rewriter/seq_axioms.cpp` — `replace_all_axiom`, contains axioms
+- `src/smt/seq/seq_nielsen.cpp` — non-linear equation handling, `generate_extensions`
+- `src/ast/seq_decl_plugin.cpp` — old-style synonym mapping (str.in.re / str.to.re)
--- a/research/docs/nseq-issues/10-performance-regression-regex-word-equation.md
+++ b/research/docs/nseq-issues/10-performance-regression-regex-word-equation.md
@ -0,0 +1,121 @@
+# [nseq] Performance regression: nseq times out on regex + word equation benchmarks
+
+**Labels**: performance, c3, nseq
+
+## Summary
+
+The nseq solver times out (exceeds 5 seconds) on several benchmarks that the seq solver
+solves in under 30ms. These represent cases where nseq's Nielsen-graph search either
+explores too many branches or does not apply available pruning heuristics effectively.
+
+## Affected benchmarks
+
+| File | seq verdict | seq time | nseq verdict | nseq time |
+|------|-------------|----------|--------------|-----------|
+| `concat-regex.smt2` | unsat | 0.018s | **unknown** (timeout) | >5s |
+| `concat-regex3.smt2` | unsat | 0.024s | **unknown** (timeout) | >5s |
+| `loop.smt2` | sat | 0.026s | **unknown** (timeout) | >5s |
+| `simple-concat-4.smt2` | sat | 0.028s | **unknown** (timeout) | >5s |
+| `all-quantifiers.smt2` | unknown | 0.020s | **unknown** (timeout) | >5s |
+
+Data from: https://github.com/Z3Prover/z3/discussions/9071
+
+## Reproducing examples
+
+```smt2
+; concat-regex.smt2 — seq: unsat in 18ms, nseq: TIMEOUT
+(set-logic QF_S)
+(declare-fun a () String)
+(declare-fun b () String)
+(declare-fun c () String)
+(declare-fun d () String)
+(assert (= a (str.++ b c)))
+(assert (= b (str.++ d c)))
+(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y")))))
+(assert (str.in.re c (re.+ (str.to.re "x"))))
+(check-sat)
+```
+
+Why unsat: `a = b ++ c = d ++ c ++ c`. With `a ∈ (x|y)+` and `c ∈ x+`,
+all characters in `a` are 'x' or 'y', but `c` forces all its chars to be 'x'.
+The word equation creates constraints that (with the regex) force a contradiction.
+The seq solver can prove this quickly, but nseq's iterative deepening DFS exhausts
+its depth budget.
+
+```smt2
+; loop.smt2 — seq: sat in 26ms, nseq: TIMEOUT
+(declare-fun TestC0 () String)
+(declare-fun |1 Fill 1| () String)
+(declare-fun |1 Fill 0| () String)
+(declare-fun |1 Fill 2| () String)
+(assert (str.in.re TestC0
+           (re.++ (re.* (re.range "\x00" "\xff"))
+                  ((_ re.loop 3 3) (re.range "0" "9"))
+                  (re.* (re.range "0" "9"))
+                  (re.* (re.range "\x00" "\xff")))))
+; ... (additional constraints)
+(check-sat)
+```
+
+```smt2
+; simple-concat-4.smt2 — seq: sat in 28ms, nseq: TIMEOUT
+; Uses recursive function definition (define-fun-rec) with transducer-style constraints
+; and regex membership
+(check-sat)
+```
+
+## Analysis
+
+### concat-regex and concat-regex3
+
+These benchmarks use word equations `a = b ++ c`, `b = d ++ c` combined with regex
+memberships. The seq solver uses a length-based approach or Parikh constraints that
+quickly prune the search. The nseq solver's Nielsen graph approach uses iterative
+deepening DFS (starting at depth 10, doubling each iteration, up to 6 iterations per
+`seq_nielsen.cpp:solve()`). For these formulas:
+
+- The Nielsen graph may generate many extensions at each level (characters from the
+  regex alphabet, word splits)
+- The Parikh pre-check may not prune these cases effectively because the Parikh
+  constraints for `(x|y)+` allow many length values
+- The solver exhausts all 6 iterations without finding the contradiction
+
+**Fix proposal**: Improve the pre-check for membership intersection. When all strings
+in a word equation are constrained to finite-alphabet regexes (e.g., `(x|y)+ ∩ x+`),
+use a more aggressive alphabet-intersection analysis to prune branches early.
+
+### loop.smt2 and simple-concat-4.smt2
+
+These benchmarks involve:
+- `re.loop` (counted repetition) for `loop.smt2`
+- `define-fun-rec` (recursive function definitions for transducer-style constraints) for `simple-concat-4.smt2`
+
+For `loop.smt2`: the `(_ re.loop 3 3)` constraint requires exactly 3 digits to appear
+at a specific position. The nseq solver may not have efficient handling for counted
+repetitions in the Parikh constraint generator.
+
+For `simple-concat-4.smt2`: the `define-fun-rec` introduces recursive predicates.
+These are handled by `unfold_ho_terms()` in `theory_nseq.cpp`, which iteratively
+unfolds the recursive definition. The timeout may be due to slow convergence of the
+unfolding or the combination with regex constraints.
+
+### Performance improvement opportunities
+
+1. **Better Parikh stride computation for counted repetitions** (`seq_parikh.cpp`):
+   `(_ re.loop N N)` has stride 0 and min_len = N * char_len. Ensure this is handled.
+
+2. **Alphabet intersection pre-pruning** in the Nielsen graph: before generating all
+   possible character extensions, check if the intersection of the alphabets of all
+   regex constraints on a variable is consistent with the required character.
+
+3. **Increase Nielsen DFS depth budget adaptively** or use a BFS strategy for small
+   alphabets.
+
+4. **Cache Parikh constraints** across Nielsen nodes that share the same membership constraints.
+
+## Files to investigate
+
+- `src/smt/seq/seq_nielsen.cpp` — `solve()`, `generate_extensions()`, depth budget
+- `src/smt/seq/seq_parikh.cpp` — `compute_length_stride()` for `re.loop`, constraint generation
+- `src/smt/seq/seq_regex.cpp` — alphabet-intersection analysis
+- `src/smt/theory_nseq.cpp` — `unfold_ho_terms()` convergence for recursive definitions
--- a/research/docs/nseq-issues/README.md
+++ b/research/docs/nseq-issues/README.md
@ -0,0 +1,78 @@
+# nseq Solver Issues — Ostrich Benchmark Analysis
+
+This directory contains GitHub issue drafts for the nseq solver based on analysis of
+the Ostrich benchmark suite run on the c3 branch (Z3 v4.17.0, commit 8ef491e).
+
+**Source**: https://github.com/Z3Prover/z3/discussions/9071  
+**Branch**: c3  
+**Benchmark set**: Ostrich (349 files), 5s timeout  
+**Date**: 2026-03-21
+
+## Summary of Results
+
+| Solver | sat | unsat | unknown | timeout | bug/crash |
+|--------|-----|-------|---------|---------|-----------|
+| seq    | 242 | 75    | 28      | 0       | 4         |
+| nseq   | 245 | 66    | 11      | 0       | 27        |
+
+- **22 soundness disagreements** between seq and nseq (nseq returns sat where seq returns unsat in 11 cases)
+- **27 nseq crashes** (vs 4 for seq, all on the same 4 parse-ecma files)
+- **5+ performance regressions** (nseq timeout where seq solves in <30ms)
+
+## Issue Files
+
+### Soundness Bugs (Highest Priority)
+
+1. **[01-soundness-replace-empty-string.md](01-soundness-replace-empty-string.md)**  
+   `str.replace` with empty input returns sat instead of unsat.  
+   Affects: `replace-special.smt2`, `replace-special-4.smt2`, `replace-special-5.smt2`, `simple-replace-4b.smt2`
+
+2. **[02-soundness-contains-monotonicity.md](02-soundness-contains-monotonicity.md)**  
+   `str.contains` not preserved through concatenation (`y ++ y`).  
+   Affects: `contains-4.smt2`
+
+3. **[03-soundness-indexof-regex.md](03-soundness-indexof-regex.md)**  
+   `str.indexof` unsound when combined with regex membership constraints.  
+   Affects: `indexof_const_index_unsat.smt2`, `indexof_var_unsat.smt2`
+
+4. **[04-soundness-prefix-suffix.md](04-soundness-prefix-suffix.md)**  
+   `str.prefixof` / `str.suffixof` interaction not fully axiomatized.  
+   Affects: `prefix-suffix.smt2`, `failedProp.smt2`, `failedProp2.smt2`
+
+5. **[05-soundness-conflicting-regex.md](05-soundness-conflicting-regex.md)**  
+   Conflicting regex memberships not detected (disjoint character sets).  
+   Affects: `norn-benchmark-9f.smt2`
+
+### Crashes (Medium Priority)
+
+6. **[06-crash-non-greedy-quantifiers.md](06-crash-non-greedy-quantifiers.md)**  
+   Non-greedy regex quantifiers (`re.+?`, `re.*?`) cause crash.  
+   Affects: `non-greedy-quantifiers.smt2`
+
+7. **[07-crash-str-lt-le-axioms.md](07-crash-str-lt-le-axioms.md)**  
+   `str.<` and `str.<=` axiom handling failures.  
+   Affects: `str-leq11.smt2`, `str-leq12.smt2`, `str-leq13.smt2`, `str-lt.smt2`, `str-lt2.smt2`
+
+8. **[08-crash-str-at-substr-from-to-int.md](08-crash-str-at-substr-from-to-int.md)**  
+   Crashes on `str.at`, `str.substr`, `str.from_int`, `str.to_int` in complex formulas.  
+   Affects: `str.at.smt2`, `str.at-2.smt2`, `str.from_int_6.smt2`, `str.to_int_5.smt2`, `str.to_int_6.smt2`, `substring.smt2`, `substring2.smt2`, `substring2b.smt2`, `is-digit-2.smt2`
+
+9. **[09-crash-word-equation-regex.md](09-crash-word-equation-regex.md)**  
+   Various crashes on word equation + regex benchmarks.  
+   Affects: `concat-001.smt2`, `contains-1.smt2`, `contains-7.smt2`, `nonlinear-2.smt2`, `noodles-unsat8.smt2`, `noodles-unsat10.smt2`, `norn-benchmark-9i.smt2`, `pcp-1.smt2`
+
+### Performance Regressions (Lower Priority)
+
+10. **[10-performance-regression-regex-word-equation.md](10-performance-regression-regex-word-equation.md)**  
+    nseq times out on regex + word equation benchmarks where seq is fast.  
+    Affects: `concat-regex.smt2`, `concat-regex3.smt2`, `loop.smt2`, `simple-concat-4.smt2`, `all-quantifiers.smt2`
+
+## Quick-Win Fixes
+
+The following issues have clear, localized fixes:
+
+1. **Non-greedy quantifiers** (issue 06): Add rewrite rules to normalize `re.+?` → `re.+`, `re.*?` → `re.*` in `seq_rewriter.cpp`.
+
+2. **Boolean-equality pattern for contains** (issue 09, `contains-1.smt2`): In `assign_eh`, handle the pattern `(= (str.contains ...) true/false)` by extracting the inner predicate.
+
+3. **Conflicting regex character sets** (issue 05): Improve the Parikh pre-check or add a character-set intersection check for multiple membership constraints on the same variable.