3
0
Fork 0
mirror of https://github.com/Z3Prover/z3 synced 2026-03-23 04:49:11 +00:00

Add nseq issue drafts from Ostrich benchmark analysis (discussion #9071) (#9073)

Agent-Logs-Url: https://github.com/Z3Prover/z3/sessions/04321ea7-2a53-4ed5-9f43-816dc6f7476b

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
This commit is contained in:
Copilot 2026-03-20 19:07:51 -07:00 committed by GitHub
parent 8ef491edea
commit f518faac9b
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
11 changed files with 932 additions and 0 deletions

View file

@ -0,0 +1,78 @@
# [nseq] Soundness bug: str.replace with empty-string input returns sat instead of unsat
**Labels**: bug, c3, nseq, soundness
## Summary
The nseq solver returns `sat` on benchmarks where the correct answer is `unsat` when
`str.replace` is applied to an empty input string and the result is constrained to an
impossible value. The seq solver correctly returns `unsat` for all of these cases.
## Affected benchmarks (Ostrich suite, c3 branch)
| File | seq verdict | nseq verdict |
|------|-------------|--------------|
| `replace-special.smt2` | unsat | **sat** (WRONG) |
| `replace-special-4.smt2` | unsat | **sat** (WRONG) |
| `replace-special-5.smt2` | unsat | **sat** (WRONG) |
| `simple-replace-4b.smt2` | unsat | **sat** (WRONG) |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing examples
```smt2
; replace-special.smt2 — EXPECTED: unsat, nseq returns: sat
(set-logic QF_SLIA)
(declare-fun x () String)
(assert (not (= (= "B" (str.replace "" x "A")) false)))
(check-sat)
```
```smt2
; replace-special-4.smt2 — EXPECTED: unsat, nseq returns: sat
(set-logic QF_SLIA)
(declare-fun x () String)
(declare-fun y () String)
(assert (not (= (str.replace "" (str.replace "" x "A") "B")
(str.substr "B" 0 (str.len x)))))
(check-sat)
```
```smt2
; simple-replace-4b.smt2 — EXPECTED: unsat, nseq returns: sat
(set-logic QF_S)
(declare-fun x_7 () String)
(declare-fun x_11 () String)
(declare-fun x_12 () String)
(assert (= x_7 (str.replace (str.++ x_11 x_12) "e" "a")))
(assert (= x_7 "hello"))
(check-sat)
```
## Analysis
`replace_axiom` in `src/ast/rewriter/seq_axioms.cpp` generates clauses for
`r = str.replace u s t`:
- `~(u="") (s="") (r=u)` — empty input without empty pattern gives `r = u`
- `~(s="") (r = t ++ u)` — empty pattern gives `r = t ++ u`
- `contains(u,s) r = u` — no occurrence gives `r = u`
For `replace-special.smt2` with `u=""`:
- If `s=""`: `r = "A" ++ "" = "A"`, but `r = "B"` → contradiction → unsat
- If `s≠""`: `r = u = ""`, but `r = "B"` → contradiction → unsat
The clauses are individually correct, but the nseq Nielsen graph apparently fails to
derive the contradiction when the result `r` is constrained via equality to an impossible
constant. The likely root cause is incomplete propagation of the result equality through
the egraph into the replace axiom conclusions.
For `simple-replace-4b.smt2`, the issue is that `str.replace (x_11 ++ x_12) "e" "a" = "hello"`
is unsatisfiable because "hello" contains 'l', which cannot be produced by replacing 'e'→'a' in
any string, yet nseq claims sat.
## Files to investigate
- `src/ast/rewriter/seq_axioms.cpp``replace_axiom`
- `src/smt/theory_nseq.cpp``dequeue_axiom`, `populate_nielsen_graph`

View file

@ -0,0 +1,57 @@
# [nseq] Soundness bug: str.contains not preserved through concatenation
**Labels**: bug, c3, nseq, soundness
## Summary
The nseq solver returns `sat` for a benchmark that asserts:
1. `z` is a substring of `y` (`str.contains y z`)
2. `x = y ++ y`
3. `z` is NOT a substring of `x` (`not (str.contains x z)`)
The third assertion contradicts the first two, so the formula should be `unsat`.
The seq solver correctly returns `unsat`; nseq incorrectly returns `sat`.
## Affected benchmark
| File | seq verdict | nseq verdict |
|------|-------------|--------------|
| `contains-4.smt2` | unsat | **sat** (WRONG) |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing example
```smt2
; contains-4.smt2 — EXPECTED: unsat, nseq returns: sat
(set-logic QF_S)
(declare-const x String)
(declare-const y String)
(declare-const z String)
(assert (str.contains y z))
(assert (= x (str.++ y y)))
(assert (not (str.contains x z)))
(check-sat)
```
## Analysis
If `str.contains y z` is true, then `y = L ++ z ++ R` for some `L`, `R`.
Then `x = y ++ y = L ++ z ++ R ++ L ++ z ++ R`, so `z` is a substring of `x`.
Therefore `not (str.contains x z)` is a contradiction.
The nseq solver should derive this via the combination of:
- `contains_true_axiom`: `str.contains y z``y = f1 ++ z ++ f2` (skolems)
- The equality `x = y ++ y`
- The Nielsen graph constraint that propagates the concatenation decomposition
The root cause may be that the nseq solver does not propagate `str.contains` facts
through word equation equalities in the Nielsen graph. When `x = y ++ y` is processed
as a word equation, the contains constraint on `y` should propagate to derive a contains
constraint on `x`, but this inference appears to be missing.
## Files to investigate
- `src/ast/rewriter/seq_axioms.cpp``contains_true_axiom`, `unroll_not_contains`
- `src/smt/seq/seq_nielsen.h` / `seq_nielsen.cpp` — word equation propagation
- `src/smt/theory_nseq.cpp``assign_eh` for contains, `populate_nielsen_graph`

View file

@ -0,0 +1,71 @@
# [nseq] Soundness bug: str.indexof unsound when combined with regex membership
**Labels**: bug, c3, nseq, soundness
## Summary
The nseq solver returns `sat` for benchmarks that constrain `str.indexof` to values
impossible given the regex membership of the input string. The seq solver correctly
returns `unsat` for these cases.
## Affected benchmarks
| File | seq verdict | nseq verdict |
|------|-------------|--------------|
| `indexof_const_index_unsat.smt2` | unsat | **sat** (WRONG) |
| `indexof_var_unsat.smt2` | unsat | **sat** (WRONG) |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing examples
```smt2
; indexof_const_index_unsat.smt2 — EXPECTED: unsat, nseq returns: sat
(set-info :status unsat)
(declare-fun a () String)
(declare-fun i () Int)
(declare-fun j () Int)
(assert (str.in_re a (re.union (str.to_re "hhhbbb") (str.to_re "bhhh"))))
(assert (= (str.indexof a "hhh" j) i))
(assert (= i 2))
(assert (> j 0))
(check-sat)
```
```smt2
; indexof_var_unsat.smt2 — EXPECTED: unsat, nseq returns: sat
(set-info :status unsat)
(declare-fun a () String)
(declare-fun i () Int)
(declare-fun j () Int)
(assert (str.in_re a (re.union (str.to_re "hhhbbb") (str.to_re "bhhh"))))
(assert (= (str.indexof a "hhh" j) i))
(assert (> i 1))
(check-sat)
```
## Analysis
For `indexof_const_index_unsat.smt2`:
- `a ∈ {hhhbbb, bhhh}` (two possibilities)
- `str.indexof a "hhh" j = 2` with `j > 0`
- In "hhhbbb", "hhh" appears at index 0 only (but j > 0 means the search starts after index 0)
- In "bhhh", "hhh" appears at index 1, but with j > 0 the only valid return would be 1, not 2
- So i = 2 is impossible → unsat
The `indexof_axiom` in `seq_axioms.cpp` generates arithmetic constraints for indexof,
but these constraints may not be sufficiently tight when combined with concrete regex
membership constraints. Specifically, the nseq solver does not appear to combine the
regex membership information with the indexof position constraints to derive the
contradiction.
The root cause is likely that nseq's `indexof_axiom` generates axioms about `str.indexof`
without leveraging the concrete alphabet constraints imposed by regex membership. The
seq solver may do additional propagation (e.g., via character-level analysis of the
regex language) that nseq does not perform.
## Files to investigate
- `src/ast/rewriter/seq_axioms.cpp``indexof_axiom`
- `src/smt/seq/seq_regex.h` / `seq_regex.cpp` — regex membership propagation
- `src/smt/theory_nseq.cpp` — interaction between regex constraints and arithmetic axioms

View file

@ -0,0 +1,78 @@
# [nseq] Soundness bug: str.prefixof / str.suffixof interaction not fully axiomatized
**Labels**: bug, c3, nseq, soundness
## Summary
The nseq solver returns `sat` for several benchmarks involving `str.prefixof`,
`str.suffixof`, and `str.at` where the correct answer is `unsat`. The seq solver
handles all these cases correctly.
## Affected benchmarks
| File | seq verdict | nseq verdict |
|------|-------------|--------------|
| `prefix-suffix.smt2` | unsat | **sat** (WRONG) |
| `failedProp.smt2` | unsat | **sat** (WRONG) |
| `failedProp2.smt2` | unsat | **sat** (WRONG) |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing examples
```smt2
; prefix-suffix.smt2 — EXPECTED: unsat, nseq returns: sat
(set-logic QF_S)
(declare-fun a () String)
(declare-fun b () String)
(assert (not (=> (and (str.prefixof a b) (str.suffixof b a)) (= a b))))
(check-sat)
```
The above asserts: `(a prefix b) AND (b suffix a) AND a ≠ b`.
This is unsatisfiable: if `a` is a prefix of `b` and `b` is a suffix of `a`,
then `|a| ≤ |b|` and `|b| ≤ |a|`, so `|a| = |b|`. Combined with the prefix and
suffix conditions, this forces `a = b`.
```smt2
; failedProp.smt2 — EXPECTED: unsat, nseq returns: sat
(set-logic QF_SLIA)
(declare-fun x () String)
(declare-fun y () String)
(assert (not (= (str.prefixof x (str.at x 1)) (= x ""))))
(check-sat)
```
The above asserts: `(str.prefixof x (str.at x 1)) ≠ (x = "")`.
`str.at x 1` is the second character of `x` (as a length-1 string).
`str.prefixof x (str.at x 1)` is true only when `x` is a prefix of a single character,
which means `x = ""` or `x` is that single character (but then `x` is a prefix of itself only
if `len(x) ≤ 1`). This equivalence should hold: `str.prefixof x (str.at x 1) ↔ x = ""`.
So the assertion is `unsat`.
```smt2
; failedProp2.smt2 — EXPECTED: unsat, nseq returns: sat (same pattern with suffixof)
(set-logic QF_SLIA)
(declare-fun x () String)
(assert (not (= (str.suffixof x (str.at x 1)) (= x ""))))
(check-sat)
```
## Analysis
For `prefix-suffix.smt2`, the current axioms for `prefix_axiom` and `suffix_axiom`
introduce skolem variables but may not derive the length equality `|a| = |b|` from the
combined prefix+suffix condition, or may not then propagate this to `a = b`.
For `failedProp.smt2` and `failedProp2.smt2`, the issue involves interaction between
`str.at`, `str.prefixof`/`str.suffixof`, and string-integer length constraints. The
nseq solver may not properly resolve the combination of:
1. The `at_axiom` (generating constraints on `str.at x 1`)
2. The `prefix_axiom`/`suffix_axiom` (decomposing the prefix/suffix relationship)
3. The arithmetic constraint linking lengths
## Files to investigate
- `src/ast/rewriter/seq_axioms.cpp``prefix_axiom`, `suffix_axiom`, `prefix_true_axiom`, `suffix_true_axiom`, `at_axiom`
- `src/smt/theory_nseq.cpp``assign_eh` for prefix/suffix, `final_check_eh`
- `src/smt/seq/seq_nielsen.cpp` — length propagation in Nielsen graph

View file

@ -0,0 +1,65 @@
# [nseq] Soundness bug: conflicting regex memberships not detected (norn-benchmark-9f)
**Labels**: bug, c3, nseq, soundness
## Summary
The nseq solver returns `sat` for a formula asserting that a non-empty string belongs
simultaneously to two disjoint regular languages. The correct answer is `unsat`.
## Affected benchmark
| File | seq verdict | nseq verdict |
|------|-------------|--------------|
| `norn-benchmark-9f.smt2` | unsat | **sat** (WRONG) |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing example
```smt2
; norn-benchmark-9f.smt2 — EXPECTED: unsat, nseq returns: sat
(set-logic QF_S)
(declare-fun var_0 () String)
(assert (str.in.re var_0 (re.* (re.range "a" "u"))))
(assert (str.in.re var_0 (re.* (str.to.re "v"))))
(assert (not (= var_0 "")))
(check-sat)
```
The formula asserts:
1. `var_0 ∈ (re.range "a" "u")* ` — all characters in "a""u"
2. `var_0 ∈ "v"*` — only the character "v"
3. `var_0 ≠ ""`
Since "v" is not in the range "a""u" (the range includes up to 'u', not 'v'),
the intersection of the two languages is `{""}`. Combined with `var_0 ≠ ""`, this is
unsatisfiable.
## Note on syntax
This benchmark uses the old SMT-LIB 2.5 syntax `str.in.re` (with dots) rather than
the SMT-LIB 2.6 syntax `str.in_re` (with underscore). Both are supported by Z3's parser.
The bug may be triggered specifically by the old-style syntax interaction with nseq's
regex handling, or may be a pure logic issue.
## Analysis
The nseq solver handles `str.in_re` constraints by computing the intersection of
multiple regex membership constraints for the same string variable (via the sgraph and
Nielsen graph). The Parikh image pre-check (`seq_parikh.cpp`) should detect that
`(re.range "a" "u")* ∩ "v"*` restricted to non-empty strings is empty, since:
- `"v"*` allows only 'v' characters
- `(re.range "a" "u")*` allows only characters in 'a'..'u'
- 'v' ∉ 'a'..'u' → intersection = `{""}` only
The root cause is likely that the nseq solver does not derive the character-level
disjointness between these two regex languages. The seq solver may use a derivative-based
or automaton intersection approach that detects this.
## Files to investigate
- `src/smt/seq/seq_regex.h` / `seq_regex.cpp` — regex membership processing, pre-check
- `src/smt/seq/seq_parikh.h` / `seq_parikh.cpp` — Parikh image constraint generation
- `src/smt/seq/seq_nielsen.cpp` — membership processing in Nielsen nodes
- `src/smt/theory_nseq.cpp``assign_eh` for regex membership

View file

@ -0,0 +1,67 @@
# [nseq] Crash: non-greedy regex quantifiers (re.+?, re.*?) cause exception
**Labels**: bug, c3, nseq, crash
## Summary
The nseq solver crashes (throws an exception or returns an error) when the input
formula contains non-greedy regex quantifiers such as `re.+?` (lazy plus) or `re.*?`
(lazy star). The seq solver handles these correctly by treating them as equivalent to
their greedy counterparts for satisfiability purposes.
## Affected benchmark
| File | seq verdict | nseq verdict |
|------|-------------|--------------|
| `non-greedy-quantifiers.smt2` | sat | **bug/crash** |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing example
```smt2
; non-greedy-quantifiers.smt2 — EXPECTED: sat, nseq crashes
(set-logic QF_S)
(declare-fun a () String)
(declare-fun b () String)
(declare-fun c () String)
(declare-fun d () String)
(assert (= a (str.++ b c)))
(assert (= b (str.++ d c)))
; non-greedy re.+? has same semantics as re.+ for satisfiability
(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y")))))
(assert (str.in.re c (re.+? (str.to.re "x"))))
(check-sat)
(get-model)
```
## Analysis
Z3's SMT-LIB parser supports non-greedy regex quantifiers (`re.+?`, `re.*?`, `re.??`)
as part of the ECMA-like regex extension. For satisfiability checking, non-greedy
quantifiers are semantically equivalent to their greedy counterparts (they only differ
in which match is returned, not in what strings are accepted).
The nseq solver's regex handling infrastructure (`seq_regex.cpp`) does not appear to
recognize or normalize non-greedy quantifiers. When the symbol `re.+?` is encountered
in a membership constraint, the solver either:
1. Does not recognize the operator and triggers `push_unhandled_pred()`, causing `FC_GIVEUP`
(which may manifest as an error/"bug" in the benchmark runner), or
2. Crashes due to an unhandled case in the Brzozowski derivative computation or
Parikh constraint generation.
### Fix proposal
Non-greedy quantifiers should be rewritten to their greedy equivalents before processing:
- `re.+? e``re.+ e` (lazy plus → greedy plus)
- `re.*? e``re.* e` (lazy star → greedy star)
- `re.?? e``re.opt e` (lazy optional → greedy optional)
This rewrite is sound for satisfiability/unsatisfiability since both interpretations
accept exactly the same language.
## Files to investigate
- `src/smt/seq/seq_regex.cpp` — regex preprocessing and derivative computation
- `src/ast/rewriter/seq_rewriter.cpp` — regex rewriting rules (add non-greedy normalization)
- `src/smt/theory_nseq.cpp``internalize_atom` / `assign_eh` for regex membership

View file

@ -0,0 +1,82 @@
# [nseq] Crashes: str.< and str.<= axiom handling failures
**Labels**: bug, c3, nseq, crash
## Summary
The nseq solver crashes (returns "bug" in the benchmark runner) on formulas involving
`str.<` (lexicographic less-than) and `str.<=` (lexicographic less-than-or-equal)
string comparison predicates in contexts where they interact with length constraints
and character-level reasoning.
## Affected benchmarks
| File | seq verdict | nseq verdict |
|------|-------------|--------------|
| `str-leq11.smt2` | sat | **bug/crash** |
| `str-leq12.smt2` | sat | **bug/crash** |
| `str-leq13.smt2` | sat | **bug/crash** |
| `str-lt.smt2` | sat | **bug/crash** |
| `str-lt2.smt2` | sat | **bug/crash** |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing examples
```smt2
; str-leq11.smt2 — EXPECTED: sat, nseq crashes
(set-logic QF_SLIA)
(declare-fun x () String)
(assert (= 1 (str.len x)))
(assert (str.<= "a" x))
(assert (str.<= x "c"))
(check-sat)
```
```smt2
; str-lt.smt2 — EXPECTED: sat, nseq crashes
(set-logic QF_SLIA)
(declare-fun x () String)
(assert (= 1 (str.len x)))
(assert (str.< "A" x))
(assert (str.< x "C"))
(check-sat)
```
## Analysis
The nseq solver handles `str.<` and `str.<=` via `lt_axiom` and `le_axiom` in
`seq_axioms.cpp`, triggered from `relevant_eh``dequeue_axiom`. The `assign_eh`
for these predicates is a no-op with comment "axioms added via relevant_eh → dequeue_axiom".
The `lt_axiom` generates clauses involving:
- Skolem variables `x`, `y`, `z` (common prefix and suffixes)
- Skolem character variables `c`, `d`
- Prefix and character-comparison constraints
The crash likely occurs because:
1. The `lt_axiom` / `le_axiom` introduces character-comparison constraints (via `seq.mk_lt`)
that require character-level arithmetic reasoning (e.g., comparing character codes).
If the nseq solver's arithmetic integration does not correctly handle these
character-code comparisons, it may crash or give incorrect results.
2. The `lt_axiom` clauses reference the Boolean literal for the `str.<` predicate itself
(via `expr_ref lt = expr_ref(n, m)`). If `n` is used both as a term and as a literal
reference, and nseq has an inconsistency in how it maps the Bool var to the expression,
this could cause a null dereference or assertion failure.
3. The interaction between the length constraint `len(x) = 1` and the lt/le axioms'
prefix-based decomposition may produce contradictory or unresolvable constraints
in the Nielsen graph.
### What str-leq11 requires
For `len(x) = 1`, `"a" ≤ x ≤ "c"` (lexicographically): x ∈ {"a", "b", "c"}.
The solver needs to find x = "a", "b", or "c". This is straightforward with
character-range reasoning.
## Files to investigate
- `src/ast/rewriter/seq_axioms.cpp``lt_axiom`, `le_axiom`
- `src/smt/theory_nseq.cpp``assign_eh` for is_lt/is_le, `relevant_eh`, `dequeue_axiom`
- `src/smt/seq/seq_nielsen.cpp` — character-level constraint handling

View file

@ -0,0 +1,118 @@
# [nseq] Crashes: str.at / str.substr / str.from_int / str.to_int in complex formulas
**Labels**: bug, c3, nseq, crash
## Summary
The nseq solver crashes or returns errors on benchmarks that combine `str.at`,
`str.substr`, `str.from_int`, and `str.to_int` with regex constraints, length
constraints, or positional arithmetic. The seq solver handles all these correctly.
## Affected benchmarks
| File | seq verdict | nseq verdict |
|------|-------------|--------------|
| `str.at.smt2` | sat | **bug/crash** |
| `str.at-2.smt2` | sat | **bug/crash** |
| `str.from_int_6.smt2` | sat | **bug/crash** |
| `str.to_int_5.smt2` | unsat | **bug/crash** |
| `str.to_int_6.smt2` | unknown | **bug/crash** |
| `substring.smt2` | sat | **bug/crash** |
| `substring2.smt2` | sat | **bug/crash** |
| `substring2b.smt2` | sat | **bug/crash** |
| `is-digit-2.smt2` | sat | **bug/crash** |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing examples
```smt2
; str.at.smt2 — EXPECTED: sat, nseq crashes
(set-logic QF_S)
(declare-const x String)
(assert (= (str.at x 2) "x"))
(assert (= (str.at x 3) "y"))
(check-sat)
```
```smt2
; str.from_int_6.smt2 — EXPECTED: sat, nseq crashes
(declare-const w String)
(declare-const x Int)
(declare-const y Int)
(assert (= (str.substr w 0 3) (str.from_int x)))
(assert (= (str.substr w 10 3) (str.from_int y)))
(assert (> x (* 2 y)))
(assert (>= y 0))
(check-sat)
```
```smt2
; str.to_int_5.smt2 — EXPECTED: unsat, nseq crashes
(declare-const w String)
(assert (= (str.to_int (str.substr w 0 10)) 123))
(assert (= (str.to_int (str.substr w 10 10)) 321))
(check-sat)
```
```smt2
; substring.smt2 — EXPECTED: sat, nseq crashes
(set-logic QF_S)
(declare-fun x () String)
(declare-fun y () String)
(declare-fun n () Int)
(assert (= y (str.substr x n 5)))
(assert (>= (str.len x) (+ n 5)))
(assert (= (str.at y 0) "a"))
(assert (> n 0))
(check-sat)
```
```smt2
; is-digit-2.smt2 — EXPECTED: sat, nseq crashes
(set-logic QF_S)
(declare-fun x () String)
(assert (str.is_digit (str.at x 1)))
(assert (str.is_digit (str.at x 2)))
(check-sat)
```
## Analysis
These benchmarks all combine position-indexed string operations (`str.at`, `str.substr`)
with either:
- Direct string equality constraints on the extracted substring/character
- Integer conversion operations (`str.from_int`, `str.to_int`)
- Character classification (`str.is_digit`)
- Arithmetic constraints on positions (`n > 0`, `len(x) >= n + 5`)
The axioms for these operations (`at_axiom`, `extract_axiom`, `itos_axiom`, `stoi_axiom`,
`is_digit_axiom`) are defined in `seq_axioms.cpp` and are triggered via `relevant_eh`
`dequeue_axiom`. The individual axioms appear to be implemented.
The crashes likely stem from one or more of:
1. **Interaction with variable-offset substring**: When `str.at x i` uses a variable
index `i` (e.g., `str.at x 1` where `1` is a concrete integer, but the axiom
creates skolems that interact with the Nielsen graph's variable positions),
the resulting constraints may cause assertion failures.
2. **Double-application of str.at**: In `str.at.smt2`, two `str.at` constraints on the
same variable `x` at different positions (2 and 3) must simultaneously hold.
The `at_axiom` introduces skolem variables for each, and their combination in the
Nielsen graph may not be handled correctly.
3. **Nested substr + conversion**: In `str.from_int_6.smt2` and `str.to_int_5.smt2`,
the combination of `str.substr` extracting a sub-region plus integer conversion
creates constraints that span the string/arithmetic boundary in ways the nseq
solver's axiom-enqueuing mechanism may not handle.
4. **str.is_digit on str.at result**: In `is-digit-2.smt2`, `str.is_digit (str.at x 1)`
requires the `is_digit_axiom` to be applied to the result of `at_axiom`. The nseq
solver may not propagate the `relevant_eh` trigger through the composed term.
## Files to investigate
- `src/ast/rewriter/seq_axioms.cpp``at_axiom`, `extract_axiom`, `itos_axiom`, `stoi_axiom`, `is_digit_axiom`
- `src/smt/theory_nseq.cpp``relevant_eh` (check if nested terms trigger axiom enqueuing), `dequeue_axiom`
- `src/smt/seq/seq_nielsen.cpp` — handling of position-indexed skolem variables

View file

@ -0,0 +1,117 @@
# [nseq] Crashes: word equation + regex benchmarks (various operations)
**Labels**: bug, c3, nseq, crash
## Summary
The nseq solver crashes or errors on a variety of benchmarks involving combinations of
word equations with regex constraints, contains, replace_all, and SLIA arithmetic.
These represent missing or incomplete operation support in the nseq solver.
## Affected benchmarks
| File | seq verdict | nseq verdict | Operations involved |
|------|-------------|--------------|---------------------|
| `concat-001.smt2` | sat | **bug/crash** | word equation + regex (old syntax) |
| `contains-1.smt2` | unsat | **bug/crash** | contains in equality context |
| `contains-7.smt2` | sat | **bug/crash** | str.replace_all |
| `nonlinear-2.smt2` | sat | **bug/crash** | `a = b ++ b` + regex |
| `noodles-unsat8.smt2` | unknown | **bug/crash** | complex word equation |
| `noodles-unsat10.smt2` | unsat | **bug/crash** | complex word equation |
| `norn-benchmark-9i.smt2` | sat | **bug/crash** | old-syntax regex + complex |
| `pcp-1.smt2` | sat | **bug/crash** | SLIA + complex string ops |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing examples
```smt2
; concat-001.smt2 — EXPECTED: sat, nseq crashes
; Uses old-style str.in.re syntax and unambiguous regex constraints
(declare-const x String)
(declare-const y String)
(declare-const z String)
(assert (str.in.re x (re.* (str.to.re "ab"))))
(assert (str.in.re y (re.* (str.to.re "c"))))
(assert (str.in.re z (str.to.re "cccc")))
(assert (= z (str.++ x y)))
(check-sat)
```
```smt2
; contains-1.smt2 — EXPECTED: unsat, nseq crashes
; str.contains used inside an equality with a boolean literal
(set-logic QF_S)
(declare-const x String)
(assert (not (= (str.contains "A" x) true)))
(check-sat)
```
```smt2
; contains-7.smt2 — EXPECTED: sat, nseq crashes
; str.replace_all used with regex constraints
(set-logic QF_S)
(declare-const x String)
(declare-const y String)
(declare-const z String)
(declare-const res String)
(assert (= res (str.replace_all x y z)))
(assert (str.in_re x (re.* (str.to_re "ab"))))
(assert (not (str.in_re y (re.* (str.to_re "ab")))))
(assert (not (= res "")))
(check-sat)
```
```smt2
; nonlinear-2.smt2 — EXPECTED: sat, nseq crashes
; Non-linear word equation: a = b ++ b (self-concatenation)
(set-logic QF_S)
(declare-fun a () String)
(declare-fun b () String)
(assert (= a (str.++ b b)))
(assert (str.in.re a (re.++ (re.* (str.to.re "(")) (re.+ (str.to.re ")")))))
(check-sat)
```
## Analysis of individual crash causes
### concat-001.smt2
Uses the old SMT-LIB `str.in.re` / `str.to.re` syntax with a word equation
`z = x ++ y` where all three variables have concrete regex constraints. If the nseq
solver's parser or internalization does not correctly handle the old-style `str.to.re`
synonym, the membership constraint may not be registered correctly.
### contains-1.smt2
The formula `(not (= (str.contains "A" x) true))` creates a Boolean atom of the form
`(= <bool-expr> true)` rather than the direct `str.contains` predicate.
In `assign_eh`, the direct check `is_contains(e)` will NOT match `(= (str.contains "A" x) true)`.
The catch-all `push_unhandled_pred()` fires, but the outer negation means the atom
`(= (str.contains "A" x) true)` is assigned to false. This should cause `FC_GIVEUP`,
but may manifest as a crash if the solver mishandles the boolean variable assignment.
**Fix**: Normalize `(= <bool-expr> true/false)` patterns in `assign_eh` or during
internalization, or add specific handling for `str.contains` embedded in boolean equalities.
### contains-7.smt2
Uses `str.replace_all` which is handled by `replace_all_axiom`. The crash may occur
when `replace_all_axiom` interacts with regex constraints on the arguments in ways
the Nielsen graph doesn't support.
### nonlinear-2.smt2
The word equation `a = b ++ b` is a non-linear equation (variable `b` appears twice
on the RHS). The Nielsen graph supports some non-linear equations, but this specific
combination with a regex constraint `a ∈ (re.* "(") ++ (re.+ ")")` — which constrains
characters — may expose a gap in the non-linear handling.
### noodles-unsat8 / noodles-unsat10
These are word equations from the Oliver Markgraf OSTRICH benchmark set. They involve
multiple variables with complex concatenation patterns and regex constraints. The crashes
suggest that the combination of 3+ variable word equations with disjunctive regex
constraints exceeds what the current nseq Nielsen graph implementation can handle.
## Files to investigate
- `src/smt/theory_nseq.cpp``assign_eh` (boolean-equality pattern for contains), `internalize_atom`
- `src/ast/rewriter/seq_axioms.cpp``replace_all_axiom`, contains axioms
- `src/smt/seq/seq_nielsen.cpp` — non-linear equation handling, `generate_extensions`
- `src/ast/seq_decl_plugin.cpp` — old-style synonym mapping (str.in.re / str.to.re)

View file

@ -0,0 +1,121 @@
# [nseq] Performance regression: nseq times out on regex + word equation benchmarks
**Labels**: performance, c3, nseq
## Summary
The nseq solver times out (exceeds 5 seconds) on several benchmarks that the seq solver
solves in under 30ms. These represent cases where nseq's Nielsen-graph search either
explores too many branches or does not apply available pruning heuristics effectively.
## Affected benchmarks
| File | seq verdict | seq time | nseq verdict | nseq time |
|------|-------------|----------|--------------|-----------|
| `concat-regex.smt2` | unsat | 0.018s | **unknown** (timeout) | >5s |
| `concat-regex3.smt2` | unsat | 0.024s | **unknown** (timeout) | >5s |
| `loop.smt2` | sat | 0.026s | **unknown** (timeout) | >5s |
| `simple-concat-4.smt2` | sat | 0.028s | **unknown** (timeout) | >5s |
| `all-quantifiers.smt2` | unknown | 0.020s | **unknown** (timeout) | >5s |
Data from: https://github.com/Z3Prover/z3/discussions/9071
## Reproducing examples
```smt2
; concat-regex.smt2 — seq: unsat in 18ms, nseq: TIMEOUT
(set-logic QF_S)
(declare-fun a () String)
(declare-fun b () String)
(declare-fun c () String)
(declare-fun d () String)
(assert (= a (str.++ b c)))
(assert (= b (str.++ d c)))
(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y")))))
(assert (str.in.re c (re.+ (str.to.re "x"))))
(check-sat)
```
Why unsat: `a = b ++ c = d ++ c ++ c`. With `a ∈ (x|y)+` and `c ∈ x+`,
all characters in `a` are 'x' or 'y', but `c` forces all its chars to be 'x'.
The word equation creates constraints that (with the regex) force a contradiction.
The seq solver can prove this quickly, but nseq's iterative deepening DFS exhausts
its depth budget.
```smt2
; loop.smt2 — seq: sat in 26ms, nseq: TIMEOUT
(declare-fun TestC0 () String)
(declare-fun |1 Fill 1| () String)
(declare-fun |1 Fill 0| () String)
(declare-fun |1 Fill 2| () String)
(assert (str.in.re TestC0
(re.++ (re.* (re.range "\x00" "\xff"))
((_ re.loop 3 3) (re.range "0" "9"))
(re.* (re.range "0" "9"))
(re.* (re.range "\x00" "\xff")))))
; ... (additional constraints)
(check-sat)
```
```smt2
; simple-concat-4.smt2 — seq: sat in 28ms, nseq: TIMEOUT
; Uses recursive function definition (define-fun-rec) with transducer-style constraints
; and regex membership
(check-sat)
```
## Analysis
### concat-regex and concat-regex3
These benchmarks use word equations `a = b ++ c`, `b = d ++ c` combined with regex
memberships. The seq solver uses a length-based approach or Parikh constraints that
quickly prune the search. The nseq solver's Nielsen graph approach uses iterative
deepening DFS (starting at depth 10, doubling each iteration, up to 6 iterations per
`seq_nielsen.cpp:solve()`). For these formulas:
- The Nielsen graph may generate many extensions at each level (characters from the
regex alphabet, word splits)
- The Parikh pre-check may not prune these cases effectively because the Parikh
constraints for `(x|y)+` allow many length values
- The solver exhausts all 6 iterations without finding the contradiction
**Fix proposal**: Improve the pre-check for membership intersection. When all strings
in a word equation are constrained to finite-alphabet regexes (e.g., `(x|y)+ ∩ x+`),
use a more aggressive alphabet-intersection analysis to prune branches early.
### loop.smt2 and simple-concat-4.smt2
These benchmarks involve:
- `re.loop` (counted repetition) for `loop.smt2`
- `define-fun-rec` (recursive function definitions for transducer-style constraints) for `simple-concat-4.smt2`
For `loop.smt2`: the `(_ re.loop 3 3)` constraint requires exactly 3 digits to appear
at a specific position. The nseq solver may not have efficient handling for counted
repetitions in the Parikh constraint generator.
For `simple-concat-4.smt2`: the `define-fun-rec` introduces recursive predicates.
These are handled by `unfold_ho_terms()` in `theory_nseq.cpp`, which iteratively
unfolds the recursive definition. The timeout may be due to slow convergence of the
unfolding or the combination with regex constraints.
### Performance improvement opportunities
1. **Better Parikh stride computation for counted repetitions** (`seq_parikh.cpp`):
`(_ re.loop N N)` has stride 0 and min_len = N * char_len. Ensure this is handled.
2. **Alphabet intersection pre-pruning** in the Nielsen graph: before generating all
possible character extensions, check if the intersection of the alphabets of all
regex constraints on a variable is consistent with the required character.
3. **Increase Nielsen DFS depth budget adaptively** or use a BFS strategy for small
alphabets.
4. **Cache Parikh constraints** across Nielsen nodes that share the same membership constraints.
## Files to investigate
- `src/smt/seq/seq_nielsen.cpp``solve()`, `generate_extensions()`, depth budget
- `src/smt/seq/seq_parikh.cpp``compute_length_stride()` for `re.loop`, constraint generation
- `src/smt/seq/seq_regex.cpp` — alphabet-intersection analysis
- `src/smt/theory_nseq.cpp``unfold_ho_terms()` convergence for recursive definitions

View file

@ -0,0 +1,78 @@
# nseq Solver Issues — Ostrich Benchmark Analysis
This directory contains GitHub issue drafts for the nseq solver based on analysis of
the Ostrich benchmark suite run on the c3 branch (Z3 v4.17.0, commit 8ef491e).
**Source**: https://github.com/Z3Prover/z3/discussions/9071
**Branch**: c3
**Benchmark set**: Ostrich (349 files), 5s timeout
**Date**: 2026-03-21
## Summary of Results
| Solver | sat | unsat | unknown | timeout | bug/crash |
|--------|-----|-------|---------|---------|-----------|
| seq | 242 | 75 | 28 | 0 | 4 |
| nseq | 245 | 66 | 11 | 0 | 27 |
- **22 soundness disagreements** between seq and nseq (nseq returns sat where seq returns unsat in 11 cases)
- **27 nseq crashes** (vs 4 for seq, all on the same 4 parse-ecma files)
- **5+ performance regressions** (nseq timeout where seq solves in <30ms)
## Issue Files
### Soundness Bugs (Highest Priority)
1. **[01-soundness-replace-empty-string.md](01-soundness-replace-empty-string.md)**
`str.replace` with empty input returns sat instead of unsat.
Affects: `replace-special.smt2`, `replace-special-4.smt2`, `replace-special-5.smt2`, `simple-replace-4b.smt2`
2. **[02-soundness-contains-monotonicity.md](02-soundness-contains-monotonicity.md)**
`str.contains` not preserved through concatenation (`y ++ y`).
Affects: `contains-4.smt2`
3. **[03-soundness-indexof-regex.md](03-soundness-indexof-regex.md)**
`str.indexof` unsound when combined with regex membership constraints.
Affects: `indexof_const_index_unsat.smt2`, `indexof_var_unsat.smt2`
4. **[04-soundness-prefix-suffix.md](04-soundness-prefix-suffix.md)**
`str.prefixof` / `str.suffixof` interaction not fully axiomatized.
Affects: `prefix-suffix.smt2`, `failedProp.smt2`, `failedProp2.smt2`
5. **[05-soundness-conflicting-regex.md](05-soundness-conflicting-regex.md)**
Conflicting regex memberships not detected (disjoint character sets).
Affects: `norn-benchmark-9f.smt2`
### Crashes (Medium Priority)
6. **[06-crash-non-greedy-quantifiers.md](06-crash-non-greedy-quantifiers.md)**
Non-greedy regex quantifiers (`re.+?`, `re.*?`) cause crash.
Affects: `non-greedy-quantifiers.smt2`
7. **[07-crash-str-lt-le-axioms.md](07-crash-str-lt-le-axioms.md)**
`str.<` and `str.<=` axiom handling failures.
Affects: `str-leq11.smt2`, `str-leq12.smt2`, `str-leq13.smt2`, `str-lt.smt2`, `str-lt2.smt2`
8. **[08-crash-str-at-substr-from-to-int.md](08-crash-str-at-substr-from-to-int.md)**
Crashes on `str.at`, `str.substr`, `str.from_int`, `str.to_int` in complex formulas.
Affects: `str.at.smt2`, `str.at-2.smt2`, `str.from_int_6.smt2`, `str.to_int_5.smt2`, `str.to_int_6.smt2`, `substring.smt2`, `substring2.smt2`, `substring2b.smt2`, `is-digit-2.smt2`
9. **[09-crash-word-equation-regex.md](09-crash-word-equation-regex.md)**
Various crashes on word equation + regex benchmarks.
Affects: `concat-001.smt2`, `contains-1.smt2`, `contains-7.smt2`, `nonlinear-2.smt2`, `noodles-unsat8.smt2`, `noodles-unsat10.smt2`, `norn-benchmark-9i.smt2`, `pcp-1.smt2`
### Performance Regressions (Lower Priority)
10. **[10-performance-regression-regex-word-equation.md](10-performance-regression-regex-word-equation.md)**
nseq times out on regex + word equation benchmarks where seq is fast.
Affects: `concat-regex.smt2`, `concat-regex3.smt2`, `loop.smt2`, `simple-concat-4.smt2`, `all-quantifiers.smt2`
## Quick-Win Fixes
The following issues have clear, localized fixes:
1. **Non-greedy quantifiers** (issue 06): Add rewrite rules to normalize `re.+?``re.+`, `re.*?``re.*` in `seq_rewriter.cpp`.
2. **Boolean-equality pattern for contains** (issue 09, `contains-1.smt2`): In `assign_eh`, handle the pattern `(= (str.contains ...) true/false)` by extracting the inner predicate.
3. **Conflicting regex character sets** (issue 05): Improve the Parikh pre-check or add a character-set intersection check for multiple membership constraints on the same variable.