3
0
Fork 0
mirror of https://github.com/Z3Prover/z3 synced 2026-06-19 07:06:28 +00:00
Commit graph

1 commit

Author SHA1 Message Date
Copilot
8c2a425e4b
Smart constructors for regex ranges: canonical form at construction time (#9814)
Regex range expressions (`re.range`) and Boolean operations over them
were left in unsimplified form, defeating downstream optimisations
(bisimulation classical fast-path, derivative engine) and producing
semantically-empty terms not syntactically equal to `re.none`.

## Changes

### `seq_decl_plugin.h` / `seq_decl_plugin.cpp`

- **`seq_util::rex::mk_range(sort*, unsigned lo, unsigned hi)`** — new
smart constructor that normalises at call time:
  - `lo > hi` → `re.empty`
  - `lo == hi` → `str.to_re` (singleton string)
  - `lo < hi` → `re.range`
- **`mk_info_rec` `OP_RE_RANGE`** — concrete non-empty ranges (both
bounds are single-char literals with `lo ≤ hi`) now return `classical =
true`, enabling the XOR-bisimulation `classical_distinguishing`
fast-path on character-predicate leaves. Symbolic/unknown ranges retain
`classical = false`.

### `seq_rewriter.cpp`

- **`mk_re_range`** — singleton collapse: `(re.range "a" "a")` →
`(str.to_re "a")`
- **`mk_regex_inter_normalize`** — range × range intersection: `[a,b] ∩
[c,d]` → `[max(a,c), min(b,d)]`, or `re.none` (disjoint), or `str.to_re`
(boundary singleton); now delegates to `re().mk_range(sort*, lo, hi)`
- **`mk_regex_union_normalize`** — range × range union for
overlapping/adjacent ranges: `[a,b] ∪ [c,d]` → `[min(a,c), max(b,d)]`;
disjoint ranges fall through to existing `merge_regex_sets`; now
delegates to `re().mk_range(sort*, lo, hi)`
- **`mk_re_complement`** — range complement expands to one or two
concrete ranges instead of an opaque `re.comp` node; now delegates to
`re().mk_range(sort*, lo, hi)`:
  - `comp([0, b])` → `[b+1, max]`
  - `comp([a, max])` → `[0, a-1]`
  - `comp([a, b])` → `[0, a-1] ∪ [b+1, max]`

```
(simplify (re.range "z" "a"))                              ; → re.none
(simplify (re.range "a" "a"))                              ; → (str.to_re "a")
(simplify (re.inter (re.range "a" "z") (re.range "f" "k"))); → (re.range "f" "k")
(simplify (re.union (re.range "a" "f") (re.range "g" "k"))); → (re.range "a" "k")
(simplify (re.comp  (re.range "b" "y")))                   ; → (re.union [0,a] [z,max])
```

### Tests

New `src/test/seq_rewriter.cpp` with 14 cases covering all the above
reductions plus downstream propagation (star/concat/union/inter
absorbing empty ranges).

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2026-06-16 13:58:56 -06:00