3
0
Fork 0
mirror of https://github.com/Z3Prover/z3 synced 2026-04-03 18:31:08 +00:00
z3/research/docs/nseq-issues/06-crash-non-greedy-quantifiers.md
Copilot f518faac9b
Add nseq issue drafts from Ostrich benchmark analysis (discussion #9071) (#9073)
Agent-Logs-Url: https://github.com/Z3Prover/z3/sessions/04321ea7-2a53-4ed5-9f43-816dc6f7476b

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
2026-03-20 19:07:51 -07:00

2.6 KiB

[nseq] Crash: non-greedy regex quantifiers (re.+?, re.*?) cause exception

Labels: bug, c3, nseq, crash

Summary

The nseq solver crashes (throws an exception or returns an error) when the input formula contains non-greedy regex quantifiers such as re.+? (lazy plus) or re.*? (lazy star). The seq solver handles these correctly by treating them as equivalent to their greedy counterparts for satisfiability purposes.

Affected benchmark

File seq verdict nseq verdict
non-greedy-quantifiers.smt2 sat bug/crash

Data from: https://github.com/Z3Prover/z3/discussions/9071

Reproducing example

; non-greedy-quantifiers.smt2 — EXPECTED: sat, nseq crashes
(set-logic QF_S)
(declare-fun a () String)
(declare-fun b () String)
(declare-fun c () String)
(declare-fun d () String)
(assert (= a (str.++ b c)))
(assert (= b (str.++ d c)))
; non-greedy re.+? has same semantics as re.+ for satisfiability
(assert (str.in.re a (re.+ (re.union (str.to.re "x") (str.to.re "y")))))
(assert (str.in.re c (re.+? (str.to.re "x"))))
(check-sat)
(get-model)

Analysis

Z3's SMT-LIB parser supports non-greedy regex quantifiers (re.+?, re.*?, re.??) as part of the ECMA-like regex extension. For satisfiability checking, non-greedy quantifiers are semantically equivalent to their greedy counterparts (they only differ in which match is returned, not in what strings are accepted).

The nseq solver's regex handling infrastructure (seq_regex.cpp) does not appear to recognize or normalize non-greedy quantifiers. When the symbol re.+? is encountered in a membership constraint, the solver either:

  1. Does not recognize the operator and triggers push_unhandled_pred(), causing FC_GIVEUP (which may manifest as an error/"bug" in the benchmark runner), or
  2. Crashes due to an unhandled case in the Brzozowski derivative computation or Parikh constraint generation.

Fix proposal

Non-greedy quantifiers should be rewritten to their greedy equivalents before processing:

  • re.+? ere.+ e (lazy plus → greedy plus)
  • re.*? ere.* e (lazy star → greedy star)
  • re.?? ere.opt e (lazy optional → greedy optional)

This rewrite is sound for satisfiability/unsatisfiability since both interpretations accept exactly the same language.

Files to investigate

  • src/smt/seq/seq_regex.cpp — regex preprocessing and derivative computation
  • src/ast/rewriter/seq_rewriter.cpp — regex rewriting rules (add non-greedy normalization)
  • src/smt/theory_nseq.cppinternalize_atom / assign_eh for regex membership