mirror of
https://github.com/Z3Prover/z3
synced 2026-03-23 04:49:11 +00:00
Agent-Logs-Url: https://github.com/Z3Prover/z3/sessions/04321ea7-2a53-4ed5-9f43-816dc6f7476b Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
This commit is contained in:
parent
8ef491edea
commit
f518faac9b
11 changed files with 932 additions and 0 deletions
|
|
@ -0,0 +1,78 @@
|
|||
# [nseq] Soundness bug: str.replace with empty-string input returns sat instead of unsat
|
||||
|
||||
**Labels**: bug, c3, nseq, soundness
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver returns `sat` on benchmarks where the correct answer is `unsat` when
|
||||
`str.replace` is applied to an empty input string and the result is constrained to an
|
||||
impossible value. The seq solver correctly returns `unsat` for all of these cases.
|
||||
|
||||
## Affected benchmarks (Ostrich suite, c3 branch)
|
||||
|
||||
| File | seq verdict | nseq verdict |
|
||||
|------|-------------|--------------|
|
||||
| `replace-special.smt2` | unsat | **sat** (WRONG) |
|
||||
| `replace-special-4.smt2` | unsat | **sat** (WRONG) |
|
||||
| `replace-special-5.smt2` | unsat | **sat** (WRONG) |
|
||||
| `simple-replace-4b.smt2` | unsat | **sat** (WRONG) |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing examples
|
||||
|
||||
```smt2
|
||||
; replace-special.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-logic QF_SLIA)
|
||||
(declare-fun x () String)
|
||||
(assert (not (= (= "B" (str.replace "" x "A")) false)))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; replace-special-4.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-logic QF_SLIA)
|
||||
(declare-fun x () String)
|
||||
(declare-fun y () String)
|
||||
(assert (not (= (str.replace "" (str.replace "" x "A") "B")
|
||||
(str.substr "B" 0 (str.len x)))))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; simple-replace-4b.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-logic QF_S)
|
||||
(declare-fun x_7 () String)
|
||||
(declare-fun x_11 () String)
|
||||
(declare-fun x_12 () String)
|
||||
(assert (= x_7 (str.replace (str.++ x_11 x_12) "e" "a")))
|
||||
(assert (= x_7 "hello"))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
## Analysis
|
||||
|
||||
`replace_axiom` in `src/ast/rewriter/seq_axioms.cpp` generates clauses for
|
||||
`r = str.replace u s t`:
|
||||
|
||||
- `~(u="") ∨ (s="") ∨ (r=u)` — empty input without empty pattern gives `r = u`
|
||||
- `~(s="") ∨ (r = t ++ u)` — empty pattern gives `r = t ++ u`
|
||||
- `contains(u,s) ∨ r = u` — no occurrence gives `r = u`
|
||||
|
||||
For `replace-special.smt2` with `u=""`:
|
||||
- If `s=""`: `r = "A" ++ "" = "A"`, but `r = "B"` → contradiction → unsat
|
||||
- If `s≠""`: `r = u = ""`, but `r = "B"` → contradiction → unsat
|
||||
|
||||
The clauses are individually correct, but the nseq Nielsen graph apparently fails to
|
||||
derive the contradiction when the result `r` is constrained via equality to an impossible
|
||||
constant. The likely root cause is incomplete propagation of the result equality through
|
||||
the egraph into the replace axiom conclusions.
|
||||
|
||||
For `simple-replace-4b.smt2`, the issue is that `str.replace (x_11 ++ x_12) "e" "a" = "hello"`
|
||||
is unsatisfiable because "hello" contains 'l', which cannot be produced by replacing 'e'→'a' in
|
||||
any string, yet nseq claims sat.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/ast/rewriter/seq_axioms.cpp` — `replace_axiom`
|
||||
- `src/smt/theory_nseq.cpp` — `dequeue_axiom`, `populate_nielsen_graph`
|
||||
|
|
@ -0,0 +1,57 @@
|
|||
# [nseq] Soundness bug: str.contains not preserved through concatenation
|
||||
|
||||
**Labels**: bug, c3, nseq, soundness
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver returns `sat` for a benchmark that asserts:
|
||||
1. `z` is a substring of `y` (`str.contains y z`)
|
||||
2. `x = y ++ y`
|
||||
3. `z` is NOT a substring of `x` (`not (str.contains x z)`)
|
||||
|
||||
The third assertion contradicts the first two, so the formula should be `unsat`.
|
||||
The seq solver correctly returns `unsat`; nseq incorrectly returns `sat`.
|
||||
|
||||
## Affected benchmark
|
||||
|
||||
| File | seq verdict | nseq verdict |
|
||||
|------|-------------|--------------|
|
||||
| `contains-4.smt2` | unsat | **sat** (WRONG) |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing example
|
||||
|
||||
```smt2
|
||||
; contains-4.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-logic QF_S)
|
||||
(declare-const x String)
|
||||
(declare-const y String)
|
||||
(declare-const z String)
|
||||
(assert (str.contains y z))
|
||||
(assert (= x (str.++ y y)))
|
||||
(assert (not (str.contains x z)))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
## Analysis
|
||||
|
||||
If `str.contains y z` is true, then `y = L ++ z ++ R` for some `L`, `R`.
|
||||
Then `x = y ++ y = L ++ z ++ R ++ L ++ z ++ R`, so `z` is a substring of `x`.
|
||||
Therefore `not (str.contains x z)` is a contradiction.
|
||||
|
||||
The nseq solver should derive this via the combination of:
|
||||
- `contains_true_axiom`: `str.contains y z` → `y = f1 ++ z ++ f2` (skolems)
|
||||
- The equality `x = y ++ y`
|
||||
- The Nielsen graph constraint that propagates the concatenation decomposition
|
||||
|
||||
The root cause may be that the nseq solver does not propagate `str.contains` facts
|
||||
through word equation equalities in the Nielsen graph. When `x = y ++ y` is processed
|
||||
as a word equation, the contains constraint on `y` should propagate to derive a contains
|
||||
constraint on `x`, but this inference appears to be missing.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/ast/rewriter/seq_axioms.cpp` — `contains_true_axiom`, `unroll_not_contains`
|
||||
- `src/smt/seq/seq_nielsen.h` / `seq_nielsen.cpp` — word equation propagation
|
||||
- `src/smt/theory_nseq.cpp` — `assign_eh` for contains, `populate_nielsen_graph`
|
||||
71
research/docs/nseq-issues/03-soundness-indexof-regex.md
Normal file
71
research/docs/nseq-issues/03-soundness-indexof-regex.md
Normal file
|
|
@ -0,0 +1,71 @@
|
|||
# [nseq] Soundness bug: str.indexof unsound when combined with regex membership
|
||||
|
||||
**Labels**: bug, c3, nseq, soundness
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver returns `sat` for benchmarks that constrain `str.indexof` to values
|
||||
impossible given the regex membership of the input string. The seq solver correctly
|
||||
returns `unsat` for these cases.
|
||||
|
||||
## Affected benchmarks
|
||||
|
||||
| File | seq verdict | nseq verdict |
|
||||
|------|-------------|--------------|
|
||||
| `indexof_const_index_unsat.smt2` | unsat | **sat** (WRONG) |
|
||||
| `indexof_var_unsat.smt2` | unsat | **sat** (WRONG) |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing examples
|
||||
|
||||
```smt2
|
||||
; indexof_const_index_unsat.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-info :status unsat)
|
||||
(declare-fun a () String)
|
||||
(declare-fun i () Int)
|
||||
(declare-fun j () Int)
|
||||
(assert (str.in_re a (re.union (str.to_re "hhhbbb") (str.to_re "bhhh"))))
|
||||
(assert (= (str.indexof a "hhh" j) i))
|
||||
(assert (= i 2))
|
||||
(assert (> j 0))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; indexof_var_unsat.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-info :status unsat)
|
||||
(declare-fun a () String)
|
||||
(declare-fun i () Int)
|
||||
(declare-fun j () Int)
|
||||
(assert (str.in_re a (re.union (str.to_re "hhhbbb") (str.to_re "bhhh"))))
|
||||
(assert (= (str.indexof a "hhh" j) i))
|
||||
(assert (> i 1))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
## Analysis
|
||||
|
||||
For `indexof_const_index_unsat.smt2`:
|
||||
- `a ∈ {hhhbbb, bhhh}` (two possibilities)
|
||||
- `str.indexof a "hhh" j = 2` with `j > 0`
|
||||
- In "hhhbbb", "hhh" appears at index 0 only (but j > 0 means the search starts after index 0)
|
||||
- In "bhhh", "hhh" appears at index 1, but with j > 0 the only valid return would be 1, not 2
|
||||
- So i = 2 is impossible → unsat
|
||||
|
||||
The `indexof_axiom` in `seq_axioms.cpp` generates arithmetic constraints for indexof,
|
||||
but these constraints may not be sufficiently tight when combined with concrete regex
|
||||
membership constraints. Specifically, the nseq solver does not appear to combine the
|
||||
regex membership information with the indexof position constraints to derive the
|
||||
contradiction.
|
||||
|
||||
The root cause is likely that nseq's `indexof_axiom` generates axioms about `str.indexof`
|
||||
without leveraging the concrete alphabet constraints imposed by regex membership. The
|
||||
seq solver may do additional propagation (e.g., via character-level analysis of the
|
||||
regex language) that nseq does not perform.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/ast/rewriter/seq_axioms.cpp` — `indexof_axiom`
|
||||
- `src/smt/seq/seq_regex.h` / `seq_regex.cpp` — regex membership propagation
|
||||
- `src/smt/theory_nseq.cpp` — interaction between regex constraints and arithmetic axioms
|
||||
78
research/docs/nseq-issues/04-soundness-prefix-suffix.md
Normal file
78
research/docs/nseq-issues/04-soundness-prefix-suffix.md
Normal file
|
|
@ -0,0 +1,78 @@
|
|||
# [nseq] Soundness bug: str.prefixof / str.suffixof interaction not fully axiomatized
|
||||
|
||||
**Labels**: bug, c3, nseq, soundness
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver returns `sat` for several benchmarks involving `str.prefixof`,
|
||||
`str.suffixof`, and `str.at` where the correct answer is `unsat`. The seq solver
|
||||
handles all these cases correctly.
|
||||
|
||||
## Affected benchmarks
|
||||
|
||||
| File | seq verdict | nseq verdict |
|
||||
|------|-------------|--------------|
|
||||
| `prefix-suffix.smt2` | unsat | **sat** (WRONG) |
|
||||
| `failedProp.smt2` | unsat | **sat** (WRONG) |
|
||||
| `failedProp2.smt2` | unsat | **sat** (WRONG) |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing examples
|
||||
|
||||
```smt2
|
||||
; prefix-suffix.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-logic QF_S)
|
||||
(declare-fun a () String)
|
||||
(declare-fun b () String)
|
||||
(assert (not (=> (and (str.prefixof a b) (str.suffixof b a)) (= a b))))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
The above asserts: `(a prefix b) AND (b suffix a) AND a ≠ b`.
|
||||
This is unsatisfiable: if `a` is a prefix of `b` and `b` is a suffix of `a`,
|
||||
then `|a| ≤ |b|` and `|b| ≤ |a|`, so `|a| = |b|`. Combined with the prefix and
|
||||
suffix conditions, this forces `a = b`.
|
||||
|
||||
```smt2
|
||||
; failedProp.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-logic QF_SLIA)
|
||||
(declare-fun x () String)
|
||||
(declare-fun y () String)
|
||||
(assert (not (= (str.prefixof x (str.at x 1)) (= x ""))))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
The above asserts: `(str.prefixof x (str.at x 1)) ≠ (x = "")`.
|
||||
`str.at x 1` is the second character of `x` (as a length-1 string).
|
||||
`str.prefixof x (str.at x 1)` is true only when `x` is a prefix of a single character,
|
||||
which means `x = ""` or `x` is that single character (but then `x` is a prefix of itself only
|
||||
if `len(x) ≤ 1`). This equivalence should hold: `str.prefixof x (str.at x 1) ↔ x = ""`.
|
||||
So the assertion is `unsat`.
|
||||
|
||||
```smt2
|
||||
; failedProp2.smt2 — EXPECTED: unsat, nseq returns: sat (same pattern with suffixof)
|
||||
(set-logic QF_SLIA)
|
||||
(declare-fun x () String)
|
||||
(assert (not (= (str.suffixof x (str.at x 1)) (= x ""))))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
## Analysis
|
||||
|
||||
For `prefix-suffix.smt2`, the current axioms for `prefix_axiom` and `suffix_axiom`
|
||||
introduce skolem variables but may not derive the length equality `|a| = |b|` from the
|
||||
combined prefix+suffix condition, or may not then propagate this to `a = b`.
|
||||
|
||||
For `failedProp.smt2` and `failedProp2.smt2`, the issue involves interaction between
|
||||
`str.at`, `str.prefixof`/`str.suffixof`, and string-integer length constraints. The
|
||||
nseq solver may not properly resolve the combination of:
|
||||
1. The `at_axiom` (generating constraints on `str.at x 1`)
|
||||
2. The `prefix_axiom`/`suffix_axiom` (decomposing the prefix/suffix relationship)
|
||||
3. The arithmetic constraint linking lengths
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/ast/rewriter/seq_axioms.cpp` — `prefix_axiom`, `suffix_axiom`, `prefix_true_axiom`, `suffix_true_axiom`, `at_axiom`
|
||||
- `src/smt/theory_nseq.cpp` — `assign_eh` for prefix/suffix, `final_check_eh`
|
||||
- `src/smt/seq/seq_nielsen.cpp` — length propagation in Nielsen graph
|
||||
65
research/docs/nseq-issues/05-soundness-conflicting-regex.md
Normal file
65
research/docs/nseq-issues/05-soundness-conflicting-regex.md
Normal file
|
|
@ -0,0 +1,65 @@
|
|||
# [nseq] Soundness bug: conflicting regex memberships not detected (norn-benchmark-9f)
|
||||
|
||||
**Labels**: bug, c3, nseq, soundness
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver returns `sat` for a formula asserting that a non-empty string belongs
|
||||
simultaneously to two disjoint regular languages. The correct answer is `unsat`.
|
||||
|
||||
## Affected benchmark
|
||||
|
||||
| File | seq verdict | nseq verdict |
|
||||
|------|-------------|--------------|
|
||||
| `norn-benchmark-9f.smt2` | unsat | **sat** (WRONG) |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing example
|
||||
|
||||
```smt2
|
||||
; norn-benchmark-9f.smt2 — EXPECTED: unsat, nseq returns: sat
|
||||
(set-logic QF_S)
|
||||
(declare-fun var_0 () String)
|
||||
(assert (str.in.re var_0 (re.* (re.range "a" "u"))))
|
||||
(assert (str.in.re var_0 (re.* (str.to.re "v"))))
|
||||
(assert (not (= var_0 "")))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
The formula asserts:
|
||||
1. `var_0 ∈ (re.range "a" "u")* ` — all characters in "a"–"u"
|
||||
2. `var_0 ∈ "v"*` — only the character "v"
|
||||
3. `var_0 ≠ ""`
|
||||
|
||||
Since "v" is not in the range "a"–"u" (the range includes up to 'u', not 'v'),
|
||||
the intersection of the two languages is `{""}`. Combined with `var_0 ≠ ""`, this is
|
||||
unsatisfiable.
|
||||
|
||||
## Note on syntax
|
||||
|
||||
This benchmark uses the old SMT-LIB 2.5 syntax `str.in.re` (with dots) rather than
|
||||
the SMT-LIB 2.6 syntax `str.in_re` (with underscore). Both are supported by Z3's parser.
|
||||
The bug may be triggered specifically by the old-style syntax interaction with nseq's
|
||||
regex handling, or may be a pure logic issue.
|
||||
|
||||
## Analysis
|
||||
|
||||
The nseq solver handles `str.in_re` constraints by computing the intersection of
|
||||
multiple regex membership constraints for the same string variable (via the sgraph and
|
||||
Nielsen graph). The Parikh image pre-check (`seq_parikh.cpp`) should detect that
|
||||
`(re.range "a" "u")* ∩ "v"*` restricted to non-empty strings is empty, since:
|
||||
- `"v"*` allows only 'v' characters
|
||||
- `(re.range "a" "u")*` allows only characters in 'a'..'u'
|
||||
- 'v' ∉ 'a'..'u' → intersection = `{""}` only
|
||||
|
||||
The root cause is likely that the nseq solver does not derive the character-level
|
||||
disjointness between these two regex languages. The seq solver may use a derivative-based
|
||||
or automaton intersection approach that detects this.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/smt/seq/seq_regex.h` / `seq_regex.cpp` — regex membership processing, pre-check
|
||||
- `src/smt/seq/seq_parikh.h` / `seq_parikh.cpp` — Parikh image constraint generation
|
||||
- `src/smt/seq/seq_nielsen.cpp` — membership processing in Nielsen nodes
|
||||
- `src/smt/theory_nseq.cpp` — `assign_eh` for regex membership
|
||||
67
research/docs/nseq-issues/06-crash-non-greedy-quantifiers.md
Normal file
67
research/docs/nseq-issues/06-crash-non-greedy-quantifiers.md
Normal file
|
|
@ -0,0 +1,67 @@
|
|||
# [nseq] Crash: non-greedy regex quantifiers (re.+?, re.*?) cause exception
|
||||
|
||||
**Labels**: bug, c3, nseq, crash
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver crashes (throws an exception or returns an error) when the input
|
||||
formula contains non-greedy regex quantifiers such as `re.+?` (lazy plus) or `re.*?`
|
||||
(lazy star). The seq solver handles these correctly by treating them as equivalent to
|
||||
their greedy counterparts for satisfiability purposes.
|
||||
|
||||
## Affected benchmark
|
||||
|
||||
| File | seq verdict | nseq verdict |
|
||||
|------|-------------|--------------|
|
||||
| `non-greedy-quantifiers.smt2` | sat | **bug/crash** |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing example
|
||||
|
||||
```smt2
|
||||
; non-greedy-quantifiers.smt2 — EXPECTED: sat, nseq crashes
|
||||
(set-logic QF_S)
|
||||
(declare-fun a () String)
|
||||
(declare-fun b () String)
|
||||
(declare-fun c () String)
|
||||
(declare-fun d () String)
|
||||
(assert (= a (str.++ b c)))
|
||||
(assert (= b (str.++ d c)))
|
||||
; non-greedy re.+? has same semantics as re.+ for satisfiability
|
||||
(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y")))))
|
||||
(assert (str.in.re c (re.+? (str.to.re "x"))))
|
||||
(check-sat)
|
||||
(get-model)
|
||||
```
|
||||
|
||||
## Analysis
|
||||
|
||||
Z3's SMT-LIB parser supports non-greedy regex quantifiers (`re.+?`, `re.*?`, `re.??`)
|
||||
as part of the ECMA-like regex extension. For satisfiability checking, non-greedy
|
||||
quantifiers are semantically equivalent to their greedy counterparts (they only differ
|
||||
in which match is returned, not in what strings are accepted).
|
||||
|
||||
The nseq solver's regex handling infrastructure (`seq_regex.cpp`) does not appear to
|
||||
recognize or normalize non-greedy quantifiers. When the symbol `re.+?` is encountered
|
||||
in a membership constraint, the solver either:
|
||||
1. Does not recognize the operator and triggers `push_unhandled_pred()`, causing `FC_GIVEUP`
|
||||
(which may manifest as an error/"bug" in the benchmark runner), or
|
||||
2. Crashes due to an unhandled case in the Brzozowski derivative computation or
|
||||
Parikh constraint generation.
|
||||
|
||||
### Fix proposal
|
||||
|
||||
Non-greedy quantifiers should be rewritten to their greedy equivalents before processing:
|
||||
- `re.+? e` → `re.+ e` (lazy plus → greedy plus)
|
||||
- `re.*? e` → `re.* e` (lazy star → greedy star)
|
||||
- `re.?? e` → `re.opt e` (lazy optional → greedy optional)
|
||||
|
||||
This rewrite is sound for satisfiability/unsatisfiability since both interpretations
|
||||
accept exactly the same language.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/smt/seq/seq_regex.cpp` — regex preprocessing and derivative computation
|
||||
- `src/ast/rewriter/seq_rewriter.cpp` — regex rewriting rules (add non-greedy normalization)
|
||||
- `src/smt/theory_nseq.cpp` — `internalize_atom` / `assign_eh` for regex membership
|
||||
82
research/docs/nseq-issues/07-crash-str-lt-le-axioms.md
Normal file
82
research/docs/nseq-issues/07-crash-str-lt-le-axioms.md
Normal file
|
|
@ -0,0 +1,82 @@
|
|||
# [nseq] Crashes: str.< and str.<= axiom handling failures
|
||||
|
||||
**Labels**: bug, c3, nseq, crash
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver crashes (returns "bug" in the benchmark runner) on formulas involving
|
||||
`str.<` (lexicographic less-than) and `str.<=` (lexicographic less-than-or-equal)
|
||||
string comparison predicates in contexts where they interact with length constraints
|
||||
and character-level reasoning.
|
||||
|
||||
## Affected benchmarks
|
||||
|
||||
| File | seq verdict | nseq verdict |
|
||||
|------|-------------|--------------|
|
||||
| `str-leq11.smt2` | sat | **bug/crash** |
|
||||
| `str-leq12.smt2` | sat | **bug/crash** |
|
||||
| `str-leq13.smt2` | sat | **bug/crash** |
|
||||
| `str-lt.smt2` | sat | **bug/crash** |
|
||||
| `str-lt2.smt2` | sat | **bug/crash** |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing examples
|
||||
|
||||
```smt2
|
||||
; str-leq11.smt2 — EXPECTED: sat, nseq crashes
|
||||
(set-logic QF_SLIA)
|
||||
(declare-fun x () String)
|
||||
(assert (= 1 (str.len x)))
|
||||
(assert (str.<= "a" x))
|
||||
(assert (str.<= x "c"))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; str-lt.smt2 — EXPECTED: sat, nseq crashes
|
||||
(set-logic QF_SLIA)
|
||||
(declare-fun x () String)
|
||||
(assert (= 1 (str.len x)))
|
||||
(assert (str.< "A" x))
|
||||
(assert (str.< x "C"))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
## Analysis
|
||||
|
||||
The nseq solver handles `str.<` and `str.<=` via `lt_axiom` and `le_axiom` in
|
||||
`seq_axioms.cpp`, triggered from `relevant_eh` → `dequeue_axiom`. The `assign_eh`
|
||||
for these predicates is a no-op with comment "axioms added via relevant_eh → dequeue_axiom".
|
||||
|
||||
The `lt_axiom` generates clauses involving:
|
||||
- Skolem variables `x`, `y`, `z` (common prefix and suffixes)
|
||||
- Skolem character variables `c`, `d`
|
||||
- Prefix and character-comparison constraints
|
||||
|
||||
The crash likely occurs because:
|
||||
1. The `lt_axiom` / `le_axiom` introduces character-comparison constraints (via `seq.mk_lt`)
|
||||
that require character-level arithmetic reasoning (e.g., comparing character codes).
|
||||
If the nseq solver's arithmetic integration does not correctly handle these
|
||||
character-code comparisons, it may crash or give incorrect results.
|
||||
|
||||
2. The `lt_axiom` clauses reference the Boolean literal for the `str.<` predicate itself
|
||||
(via `expr_ref lt = expr_ref(n, m)`). If `n` is used both as a term and as a literal
|
||||
reference, and nseq has an inconsistency in how it maps the Bool var to the expression,
|
||||
this could cause a null dereference or assertion failure.
|
||||
|
||||
3. The interaction between the length constraint `len(x) = 1` and the lt/le axioms'
|
||||
prefix-based decomposition may produce contradictory or unresolvable constraints
|
||||
in the Nielsen graph.
|
||||
|
||||
### What str-leq11 requires
|
||||
|
||||
For `len(x) = 1`, `"a" ≤ x ≤ "c"` (lexicographically): x ∈ {"a", "b", "c"}.
|
||||
The solver needs to find x = "a", "b", or "c". This is straightforward with
|
||||
character-range reasoning.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/ast/rewriter/seq_axioms.cpp` — `lt_axiom`, `le_axiom`
|
||||
- `src/smt/theory_nseq.cpp` — `assign_eh` for is_lt/is_le, `relevant_eh`, `dequeue_axiom`
|
||||
- `src/smt/seq/seq_nielsen.cpp` — character-level constraint handling
|
||||
118
research/docs/nseq-issues/08-crash-str-at-substr-from-to-int.md
Normal file
118
research/docs/nseq-issues/08-crash-str-at-substr-from-to-int.md
Normal file
|
|
@ -0,0 +1,118 @@
|
|||
# [nseq] Crashes: str.at / str.substr / str.from_int / str.to_int in complex formulas
|
||||
|
||||
**Labels**: bug, c3, nseq, crash
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver crashes or returns errors on benchmarks that combine `str.at`,
|
||||
`str.substr`, `str.from_int`, and `str.to_int` with regex constraints, length
|
||||
constraints, or positional arithmetic. The seq solver handles all these correctly.
|
||||
|
||||
## Affected benchmarks
|
||||
|
||||
| File | seq verdict | nseq verdict |
|
||||
|------|-------------|--------------|
|
||||
| `str.at.smt2` | sat | **bug/crash** |
|
||||
| `str.at-2.smt2` | sat | **bug/crash** |
|
||||
| `str.from_int_6.smt2` | sat | **bug/crash** |
|
||||
| `str.to_int_5.smt2` | unsat | **bug/crash** |
|
||||
| `str.to_int_6.smt2` | unknown | **bug/crash** |
|
||||
| `substring.smt2` | sat | **bug/crash** |
|
||||
| `substring2.smt2` | sat | **bug/crash** |
|
||||
| `substring2b.smt2` | sat | **bug/crash** |
|
||||
| `is-digit-2.smt2` | sat | **bug/crash** |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing examples
|
||||
|
||||
```smt2
|
||||
; str.at.smt2 — EXPECTED: sat, nseq crashes
|
||||
(set-logic QF_S)
|
||||
(declare-const x String)
|
||||
(assert (= (str.at x 2) "x"))
|
||||
(assert (= (str.at x 3) "y"))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; str.from_int_6.smt2 — EXPECTED: sat, nseq crashes
|
||||
(declare-const w String)
|
||||
(declare-const x Int)
|
||||
(declare-const y Int)
|
||||
(assert (= (str.substr w 0 3) (str.from_int x)))
|
||||
(assert (= (str.substr w 10 3) (str.from_int y)))
|
||||
(assert (> x (* 2 y)))
|
||||
(assert (>= y 0))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; str.to_int_5.smt2 — EXPECTED: unsat, nseq crashes
|
||||
(declare-const w String)
|
||||
(assert (= (str.to_int (str.substr w 0 10)) 123))
|
||||
(assert (= (str.to_int (str.substr w 10 10)) 321))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; substring.smt2 — EXPECTED: sat, nseq crashes
|
||||
(set-logic QF_S)
|
||||
(declare-fun x () String)
|
||||
(declare-fun y () String)
|
||||
(declare-fun n () Int)
|
||||
(assert (= y (str.substr x n 5)))
|
||||
(assert (>= (str.len x) (+ n 5)))
|
||||
(assert (= (str.at y 0) "a"))
|
||||
(assert (> n 0))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; is-digit-2.smt2 — EXPECTED: sat, nseq crashes
|
||||
(set-logic QF_S)
|
||||
(declare-fun x () String)
|
||||
(assert (str.is_digit (str.at x 1)))
|
||||
(assert (str.is_digit (str.at x 2)))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
## Analysis
|
||||
|
||||
These benchmarks all combine position-indexed string operations (`str.at`, `str.substr`)
|
||||
with either:
|
||||
- Direct string equality constraints on the extracted substring/character
|
||||
- Integer conversion operations (`str.from_int`, `str.to_int`)
|
||||
- Character classification (`str.is_digit`)
|
||||
- Arithmetic constraints on positions (`n > 0`, `len(x) >= n + 5`)
|
||||
|
||||
The axioms for these operations (`at_axiom`, `extract_axiom`, `itos_axiom`, `stoi_axiom`,
|
||||
`is_digit_axiom`) are defined in `seq_axioms.cpp` and are triggered via `relevant_eh` →
|
||||
`dequeue_axiom`. The individual axioms appear to be implemented.
|
||||
|
||||
The crashes likely stem from one or more of:
|
||||
|
||||
1. **Interaction with variable-offset substring**: When `str.at x i` uses a variable
|
||||
index `i` (e.g., `str.at x 1` where `1` is a concrete integer, but the axiom
|
||||
creates skolems that interact with the Nielsen graph's variable positions),
|
||||
the resulting constraints may cause assertion failures.
|
||||
|
||||
2. **Double-application of str.at**: In `str.at.smt2`, two `str.at` constraints on the
|
||||
same variable `x` at different positions (2 and 3) must simultaneously hold.
|
||||
The `at_axiom` introduces skolem variables for each, and their combination in the
|
||||
Nielsen graph may not be handled correctly.
|
||||
|
||||
3. **Nested substr + conversion**: In `str.from_int_6.smt2` and `str.to_int_5.smt2`,
|
||||
the combination of `str.substr` extracting a sub-region plus integer conversion
|
||||
creates constraints that span the string/arithmetic boundary in ways the nseq
|
||||
solver's axiom-enqueuing mechanism may not handle.
|
||||
|
||||
4. **str.is_digit on str.at result**: In `is-digit-2.smt2`, `str.is_digit (str.at x 1)`
|
||||
requires the `is_digit_axiom` to be applied to the result of `at_axiom`. The nseq
|
||||
solver may not propagate the `relevant_eh` trigger through the composed term.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/ast/rewriter/seq_axioms.cpp` — `at_axiom`, `extract_axiom`, `itos_axiom`, `stoi_axiom`, `is_digit_axiom`
|
||||
- `src/smt/theory_nseq.cpp` — `relevant_eh` (check if nested terms trigger axiom enqueuing), `dequeue_axiom`
|
||||
- `src/smt/seq/seq_nielsen.cpp` — handling of position-indexed skolem variables
|
||||
117
research/docs/nseq-issues/09-crash-word-equation-regex.md
Normal file
117
research/docs/nseq-issues/09-crash-word-equation-regex.md
Normal file
|
|
@ -0,0 +1,117 @@
|
|||
# [nseq] Crashes: word equation + regex benchmarks (various operations)
|
||||
|
||||
**Labels**: bug, c3, nseq, crash
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver crashes or errors on a variety of benchmarks involving combinations of
|
||||
word equations with regex constraints, contains, replace_all, and SLIA arithmetic.
|
||||
These represent missing or incomplete operation support in the nseq solver.
|
||||
|
||||
## Affected benchmarks
|
||||
|
||||
| File | seq verdict | nseq verdict | Operations involved |
|
||||
|------|-------------|--------------|---------------------|
|
||||
| `concat-001.smt2` | sat | **bug/crash** | word equation + regex (old syntax) |
|
||||
| `contains-1.smt2` | unsat | **bug/crash** | contains in equality context |
|
||||
| `contains-7.smt2` | sat | **bug/crash** | str.replace_all |
|
||||
| `nonlinear-2.smt2` | sat | **bug/crash** | `a = b ++ b` + regex |
|
||||
| `noodles-unsat8.smt2` | unknown | **bug/crash** | complex word equation |
|
||||
| `noodles-unsat10.smt2` | unsat | **bug/crash** | complex word equation |
|
||||
| `norn-benchmark-9i.smt2` | sat | **bug/crash** | old-syntax regex + complex |
|
||||
| `pcp-1.smt2` | sat | **bug/crash** | SLIA + complex string ops |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing examples
|
||||
|
||||
```smt2
|
||||
; concat-001.smt2 — EXPECTED: sat, nseq crashes
|
||||
; Uses old-style str.in.re syntax and unambiguous regex constraints
|
||||
(declare-const x String)
|
||||
(declare-const y String)
|
||||
(declare-const z String)
|
||||
(assert (str.in.re x (re.* (str.to.re "ab"))))
|
||||
(assert (str.in.re y (re.* (str.to.re "c"))))
|
||||
(assert (str.in.re z (str.to.re "cccc")))
|
||||
(assert (= z (str.++ x y)))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; contains-1.smt2 — EXPECTED: unsat, nseq crashes
|
||||
; str.contains used inside an equality with a boolean literal
|
||||
(set-logic QF_S)
|
||||
(declare-const x String)
|
||||
(assert (not (= (str.contains "A" x) true)))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; contains-7.smt2 — EXPECTED: sat, nseq crashes
|
||||
; str.replace_all used with regex constraints
|
||||
(set-logic QF_S)
|
||||
(declare-const x String)
|
||||
(declare-const y String)
|
||||
(declare-const z String)
|
||||
(declare-const res String)
|
||||
(assert (= res (str.replace_all x y z)))
|
||||
(assert (str.in_re x (re.* (str.to_re "ab"))))
|
||||
(assert (not (str.in_re y (re.* (str.to_re "ab")))))
|
||||
(assert (not (= res "")))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; nonlinear-2.smt2 — EXPECTED: sat, nseq crashes
|
||||
; Non-linear word equation: a = b ++ b (self-concatenation)
|
||||
(set-logic QF_S)
|
||||
(declare-fun a () String)
|
||||
(declare-fun b () String)
|
||||
(assert (= a (str.++ b b)))
|
||||
(assert (str.in.re a (re.++ (re.* (str.to.re "(")) (re.+ (str.to.re ")")))))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
## Analysis of individual crash causes
|
||||
|
||||
### concat-001.smt2
|
||||
Uses the old SMT-LIB `str.in.re` / `str.to.re` syntax with a word equation
|
||||
`z = x ++ y` where all three variables have concrete regex constraints. If the nseq
|
||||
solver's parser or internalization does not correctly handle the old-style `str.to.re`
|
||||
synonym, the membership constraint may not be registered correctly.
|
||||
|
||||
### contains-1.smt2
|
||||
The formula `(not (= (str.contains "A" x) true))` creates a Boolean atom of the form
|
||||
`(= <bool-expr> true)` rather than the direct `str.contains` predicate.
|
||||
In `assign_eh`, the direct check `is_contains(e)` will NOT match `(= (str.contains "A" x) true)`.
|
||||
The catch-all `push_unhandled_pred()` fires, but the outer negation means the atom
|
||||
`(= (str.contains "A" x) true)` is assigned to false. This should cause `FC_GIVEUP`,
|
||||
but may manifest as a crash if the solver mishandles the boolean variable assignment.
|
||||
|
||||
**Fix**: Normalize `(= <bool-expr> true/false)` patterns in `assign_eh` or during
|
||||
internalization, or add specific handling for `str.contains` embedded in boolean equalities.
|
||||
|
||||
### contains-7.smt2
|
||||
Uses `str.replace_all` which is handled by `replace_all_axiom`. The crash may occur
|
||||
when `replace_all_axiom` interacts with regex constraints on the arguments in ways
|
||||
the Nielsen graph doesn't support.
|
||||
|
||||
### nonlinear-2.smt2
|
||||
The word equation `a = b ++ b` is a non-linear equation (variable `b` appears twice
|
||||
on the RHS). The Nielsen graph supports some non-linear equations, but this specific
|
||||
combination with a regex constraint `a ∈ (re.* "(") ++ (re.+ ")")` — which constrains
|
||||
characters — may expose a gap in the non-linear handling.
|
||||
|
||||
### noodles-unsat8 / noodles-unsat10
|
||||
These are word equations from the Oliver Markgraf OSTRICH benchmark set. They involve
|
||||
multiple variables with complex concatenation patterns and regex constraints. The crashes
|
||||
suggest that the combination of 3+ variable word equations with disjunctive regex
|
||||
constraints exceeds what the current nseq Nielsen graph implementation can handle.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/smt/theory_nseq.cpp` — `assign_eh` (boolean-equality pattern for contains), `internalize_atom`
|
||||
- `src/ast/rewriter/seq_axioms.cpp` — `replace_all_axiom`, contains axioms
|
||||
- `src/smt/seq/seq_nielsen.cpp` — non-linear equation handling, `generate_extensions`
|
||||
- `src/ast/seq_decl_plugin.cpp` — old-style synonym mapping (str.in.re / str.to.re)
|
||||
|
|
@ -0,0 +1,121 @@
|
|||
# [nseq] Performance regression: nseq times out on regex + word equation benchmarks
|
||||
|
||||
**Labels**: performance, c3, nseq
|
||||
|
||||
## Summary
|
||||
|
||||
The nseq solver times out (exceeds 5 seconds) on several benchmarks that the seq solver
|
||||
solves in under 30ms. These represent cases where nseq's Nielsen-graph search either
|
||||
explores too many branches or does not apply available pruning heuristics effectively.
|
||||
|
||||
## Affected benchmarks
|
||||
|
||||
| File | seq verdict | seq time | nseq verdict | nseq time |
|
||||
|------|-------------|----------|--------------|-----------|
|
||||
| `concat-regex.smt2` | unsat | 0.018s | **unknown** (timeout) | >5s |
|
||||
| `concat-regex3.smt2` | unsat | 0.024s | **unknown** (timeout) | >5s |
|
||||
| `loop.smt2` | sat | 0.026s | **unknown** (timeout) | >5s |
|
||||
| `simple-concat-4.smt2` | sat | 0.028s | **unknown** (timeout) | >5s |
|
||||
| `all-quantifiers.smt2` | unknown | 0.020s | **unknown** (timeout) | >5s |
|
||||
|
||||
Data from: https://github.com/Z3Prover/z3/discussions/9071
|
||||
|
||||
## Reproducing examples
|
||||
|
||||
```smt2
|
||||
; concat-regex.smt2 — seq: unsat in 18ms, nseq: TIMEOUT
|
||||
(set-logic QF_S)
|
||||
(declare-fun a () String)
|
||||
(declare-fun b () String)
|
||||
(declare-fun c () String)
|
||||
(declare-fun d () String)
|
||||
(assert (= a (str.++ b c)))
|
||||
(assert (= b (str.++ d c)))
|
||||
(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y")))))
|
||||
(assert (str.in.re c (re.+ (str.to.re "x"))))
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
Why unsat: `a = b ++ c = d ++ c ++ c`. With `a ∈ (x|y)+` and `c ∈ x+`,
|
||||
all characters in `a` are 'x' or 'y', but `c` forces all its chars to be 'x'.
|
||||
The word equation creates constraints that (with the regex) force a contradiction.
|
||||
The seq solver can prove this quickly, but nseq's iterative deepening DFS exhausts
|
||||
its depth budget.
|
||||
|
||||
```smt2
|
||||
; loop.smt2 — seq: sat in 26ms, nseq: TIMEOUT
|
||||
(declare-fun TestC0 () String)
|
||||
(declare-fun |1 Fill 1| () String)
|
||||
(declare-fun |1 Fill 0| () String)
|
||||
(declare-fun |1 Fill 2| () String)
|
||||
(assert (str.in.re TestC0
|
||||
(re.++ (re.* (re.range "\x00" "\xff"))
|
||||
((_ re.loop 3 3) (re.range "0" "9"))
|
||||
(re.* (re.range "0" "9"))
|
||||
(re.* (re.range "\x00" "\xff")))))
|
||||
; ... (additional constraints)
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
```smt2
|
||||
; simple-concat-4.smt2 — seq: sat in 28ms, nseq: TIMEOUT
|
||||
; Uses recursive function definition (define-fun-rec) with transducer-style constraints
|
||||
; and regex membership
|
||||
(check-sat)
|
||||
```
|
||||
|
||||
## Analysis
|
||||
|
||||
### concat-regex and concat-regex3
|
||||
|
||||
These benchmarks use word equations `a = b ++ c`, `b = d ++ c` combined with regex
|
||||
memberships. The seq solver uses a length-based approach or Parikh constraints that
|
||||
quickly prune the search. The nseq solver's Nielsen graph approach uses iterative
|
||||
deepening DFS (starting at depth 10, doubling each iteration, up to 6 iterations per
|
||||
`seq_nielsen.cpp:solve()`). For these formulas:
|
||||
|
||||
- The Nielsen graph may generate many extensions at each level (characters from the
|
||||
regex alphabet, word splits)
|
||||
- The Parikh pre-check may not prune these cases effectively because the Parikh
|
||||
constraints for `(x|y)+` allow many length values
|
||||
- The solver exhausts all 6 iterations without finding the contradiction
|
||||
|
||||
**Fix proposal**: Improve the pre-check for membership intersection. When all strings
|
||||
in a word equation are constrained to finite-alphabet regexes (e.g., `(x|y)+ ∩ x+`),
|
||||
use a more aggressive alphabet-intersection analysis to prune branches early.
|
||||
|
||||
### loop.smt2 and simple-concat-4.smt2
|
||||
|
||||
These benchmarks involve:
|
||||
- `re.loop` (counted repetition) for `loop.smt2`
|
||||
- `define-fun-rec` (recursive function definitions for transducer-style constraints) for `simple-concat-4.smt2`
|
||||
|
||||
For `loop.smt2`: the `(_ re.loop 3 3)` constraint requires exactly 3 digits to appear
|
||||
at a specific position. The nseq solver may not have efficient handling for counted
|
||||
repetitions in the Parikh constraint generator.
|
||||
|
||||
For `simple-concat-4.smt2`: the `define-fun-rec` introduces recursive predicates.
|
||||
These are handled by `unfold_ho_terms()` in `theory_nseq.cpp`, which iteratively
|
||||
unfolds the recursive definition. The timeout may be due to slow convergence of the
|
||||
unfolding or the combination with regex constraints.
|
||||
|
||||
### Performance improvement opportunities
|
||||
|
||||
1. **Better Parikh stride computation for counted repetitions** (`seq_parikh.cpp`):
|
||||
`(_ re.loop N N)` has stride 0 and min_len = N * char_len. Ensure this is handled.
|
||||
|
||||
2. **Alphabet intersection pre-pruning** in the Nielsen graph: before generating all
|
||||
possible character extensions, check if the intersection of the alphabets of all
|
||||
regex constraints on a variable is consistent with the required character.
|
||||
|
||||
3. **Increase Nielsen DFS depth budget adaptively** or use a BFS strategy for small
|
||||
alphabets.
|
||||
|
||||
4. **Cache Parikh constraints** across Nielsen nodes that share the same membership constraints.
|
||||
|
||||
## Files to investigate
|
||||
|
||||
- `src/smt/seq/seq_nielsen.cpp` — `solve()`, `generate_extensions()`, depth budget
|
||||
- `src/smt/seq/seq_parikh.cpp` — `compute_length_stride()` for `re.loop`, constraint generation
|
||||
- `src/smt/seq/seq_regex.cpp` — alphabet-intersection analysis
|
||||
- `src/smt/theory_nseq.cpp` — `unfold_ho_terms()` convergence for recursive definitions
|
||||
78
research/docs/nseq-issues/README.md
Normal file
78
research/docs/nseq-issues/README.md
Normal file
|
|
@ -0,0 +1,78 @@
|
|||
# nseq Solver Issues — Ostrich Benchmark Analysis
|
||||
|
||||
This directory contains GitHub issue drafts for the nseq solver based on analysis of
|
||||
the Ostrich benchmark suite run on the c3 branch (Z3 v4.17.0, commit 8ef491e).
|
||||
|
||||
**Source**: https://github.com/Z3Prover/z3/discussions/9071
|
||||
**Branch**: c3
|
||||
**Benchmark set**: Ostrich (349 files), 5s timeout
|
||||
**Date**: 2026-03-21
|
||||
|
||||
## Summary of Results
|
||||
|
||||
| Solver | sat | unsat | unknown | timeout | bug/crash |
|
||||
|--------|-----|-------|---------|---------|-----------|
|
||||
| seq | 242 | 75 | 28 | 0 | 4 |
|
||||
| nseq | 245 | 66 | 11 | 0 | 27 |
|
||||
|
||||
- **22 soundness disagreements** between seq and nseq (nseq returns sat where seq returns unsat in 11 cases)
|
||||
- **27 nseq crashes** (vs 4 for seq, all on the same 4 parse-ecma files)
|
||||
- **5+ performance regressions** (nseq timeout where seq solves in <30ms)
|
||||
|
||||
## Issue Files
|
||||
|
||||
### Soundness Bugs (Highest Priority)
|
||||
|
||||
1. **[01-soundness-replace-empty-string.md](01-soundness-replace-empty-string.md)**
|
||||
`str.replace` with empty input returns sat instead of unsat.
|
||||
Affects: `replace-special.smt2`, `replace-special-4.smt2`, `replace-special-5.smt2`, `simple-replace-4b.smt2`
|
||||
|
||||
2. **[02-soundness-contains-monotonicity.md](02-soundness-contains-monotonicity.md)**
|
||||
`str.contains` not preserved through concatenation (`y ++ y`).
|
||||
Affects: `contains-4.smt2`
|
||||
|
||||
3. **[03-soundness-indexof-regex.md](03-soundness-indexof-regex.md)**
|
||||
`str.indexof` unsound when combined with regex membership constraints.
|
||||
Affects: `indexof_const_index_unsat.smt2`, `indexof_var_unsat.smt2`
|
||||
|
||||
4. **[04-soundness-prefix-suffix.md](04-soundness-prefix-suffix.md)**
|
||||
`str.prefixof` / `str.suffixof` interaction not fully axiomatized.
|
||||
Affects: `prefix-suffix.smt2`, `failedProp.smt2`, `failedProp2.smt2`
|
||||
|
||||
5. **[05-soundness-conflicting-regex.md](05-soundness-conflicting-regex.md)**
|
||||
Conflicting regex memberships not detected (disjoint character sets).
|
||||
Affects: `norn-benchmark-9f.smt2`
|
||||
|
||||
### Crashes (Medium Priority)
|
||||
|
||||
6. **[06-crash-non-greedy-quantifiers.md](06-crash-non-greedy-quantifiers.md)**
|
||||
Non-greedy regex quantifiers (`re.+?`, `re.*?`) cause crash.
|
||||
Affects: `non-greedy-quantifiers.smt2`
|
||||
|
||||
7. **[07-crash-str-lt-le-axioms.md](07-crash-str-lt-le-axioms.md)**
|
||||
`str.<` and `str.<=` axiom handling failures.
|
||||
Affects: `str-leq11.smt2`, `str-leq12.smt2`, `str-leq13.smt2`, `str-lt.smt2`, `str-lt2.smt2`
|
||||
|
||||
8. **[08-crash-str-at-substr-from-to-int.md](08-crash-str-at-substr-from-to-int.md)**
|
||||
Crashes on `str.at`, `str.substr`, `str.from_int`, `str.to_int` in complex formulas.
|
||||
Affects: `str.at.smt2`, `str.at-2.smt2`, `str.from_int_6.smt2`, `str.to_int_5.smt2`, `str.to_int_6.smt2`, `substring.smt2`, `substring2.smt2`, `substring2b.smt2`, `is-digit-2.smt2`
|
||||
|
||||
9. **[09-crash-word-equation-regex.md](09-crash-word-equation-regex.md)**
|
||||
Various crashes on word equation + regex benchmarks.
|
||||
Affects: `concat-001.smt2`, `contains-1.smt2`, `contains-7.smt2`, `nonlinear-2.smt2`, `noodles-unsat8.smt2`, `noodles-unsat10.smt2`, `norn-benchmark-9i.smt2`, `pcp-1.smt2`
|
||||
|
||||
### Performance Regressions (Lower Priority)
|
||||
|
||||
10. **[10-performance-regression-regex-word-equation.md](10-performance-regression-regex-word-equation.md)**
|
||||
nseq times out on regex + word equation benchmarks where seq is fast.
|
||||
Affects: `concat-regex.smt2`, `concat-regex3.smt2`, `loop.smt2`, `simple-concat-4.smt2`, `all-quantifiers.smt2`
|
||||
|
||||
## Quick-Win Fixes
|
||||
|
||||
The following issues have clear, localized fixes:
|
||||
|
||||
1. **Non-greedy quantifiers** (issue 06): Add rewrite rules to normalize `re.+?` → `re.+`, `re.*?` → `re.*` in `seq_rewriter.cpp`.
|
||||
|
||||
2. **Boolean-equality pattern for contains** (issue 09, `contains-1.smt2`): In `assign_eh`, handle the pattern `(= (str.contains ...) true/false)` by extracting the inner predicate.
|
||||
|
||||
3. **Conflicting regex character sets** (issue 05): Improve the Parikh pre-check or add a character-set intersection check for multiple membership constraints on the same variable.
|
||||
Loading…
Add table
Add a link
Reference in a new issue