- Cache now indexes by (ele, r) pair using obj_pair_map
- Remove eval() function; operator()(ele, r) handles all cases
- Rewrite simplify_ite_rec with path vector of signed conditions
- Add range-based simplification: (lo <= x, false) + (x <= hi, false)
eliminates ite(x = v, t, e) when v is outside [lo, hi]
- Add is_itos case in derive_to_re: guards on n >= 0, digit range,
and first character match
- Port is_reverse normalization (previous commit)
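
The range-based simplification above can be illustrated with a small standalone sketch. It is not the code in seq_derive.cpp: `SignedCond`, `Kind`, `range_of_path`, and `eq_test_is_dead` are hypothetical names, and it assumes the `false` flag in `(lo <= x, false)` means "not negated".

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical model of one signed path condition on a character variable x.
enum class Kind { LowerBound, UpperBound };  // "bound <= x" or "x <= bound"

struct SignedCond {
    Kind     kind;
    unsigned bound;
    bool     negated;  // assumption: false means the condition holds as written
};

struct Range {
    unsigned lo = 0;
    unsigned hi = std::numeric_limits<unsigned>::max();
};

// Intersect all positive bounds found on the path.
// Negated conditions are ignored here, which only loses precision.
Range range_of_path(const std::vector<SignedCond>& path) {
    Range r;
    for (const SignedCond& c : path) {
        if (c.negated) continue;
        if (c.kind == Kind::LowerBound) r.lo = std::max(r.lo, c.bound);
        else                            r.hi = std::min(r.hi, c.bound);
    }
    return r;
}

// True when ite(x = v, t, e) can be replaced by e on this path,
// because v lies outside the interval [lo, hi] implied by the path.
bool eq_test_is_dead(const std::vector<SignedCond>& path, unsigned v) {
    Range r = range_of_path(path);
    return v < r.lo || v > r.hi;
}
```

With the path `'a' <= x` and `x <= 'z'`, a test `x = '0'` is dead and the ITE collapses to its else branch, while `x = 'm'` must be kept.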
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement a new seq::derive class (seq_derive.h/cpp) that computes
symbolic derivatives of regular expressions using ITE-trees, based on
the RE# approach (Varatalu, Veanes, Ernits - POPL 2025).
Key features:
- Two-argument operator()(ele, r): computes derivative of regex r w.r.t.
element ele (concrete character or de Bruijn variable for symbolic mode)
- ACI canonicalization (flatten, stable_sort, dedup) for union/intersection
- ITE-tree combinators for binary/unary operations
- Info-based nullability with recursive fallback
- Complement absorption rules
- Depth-bounded recursion to prevent stack overflow
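
As a rough illustration of the ACI canonicalization step (flatten, stable_sort, dedup), here is a minimal sketch over a toy union tree. The `Re` type and the `leaf`, `uni`, and `canonical_union` helpers are hypothetical; the real class works on Z3 expressions, not on integer ids.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy regex node: a leaf carries an id, a union carries its arguments.
struct Re {
    unsigned        id = 0;
    std::vector<Re> args;  // non-empty means "union of args"
    bool is_union() const { return !args.empty(); }
};

Re leaf(unsigned id) { Re r; r.id = id; return r; }
Re uni(std::vector<Re> args) { Re r; r.args = std::move(args); return r; }

// Associativity: collect the leaves of arbitrarily nested unions.
void flatten(const Re& r, std::vector<unsigned>& out) {
    if (!r.is_union()) { out.push_back(r.id); return; }
    for (const Re& a : r.args) flatten(a, out);
}

// Commutativity via sorting, idempotence via removing adjacent duplicates.
std::vector<unsigned> canonical_union(const Re& r) {
    std::vector<unsigned> leaves;
    flatten(r, leaves);
    std::stable_sort(leaves.begin(), leaves.end());
    leaves.erase(std::unique(leaves.begin(), leaves.end()), leaves.end());
    return leaves;
}
```

Under this sketch, `r2 | (r1 | r2)` and `(r2 | r1) | r2` both canonicalize to the leaf list `{1, 2}`, so syntactically different but ACI-equal unions share one representation.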
Integration with seq_rewriter:
- mk_derivative(ele, r) and mk_derivative(r) now delegate to m_derive
- Removed dead mk_derivative_rec function
- Added ITE hoisting in mk_re_star, mk_re_concat, mk_re_union0,
mk_re_inter0, mk_re_complement
- Added depth limiting in Antimirov derivative helpers
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implements the algorithm Eq(p, q) = Empty(p XOR q) using a union-find
driven bisimulation closure (per the CAV'26 ERE paper).
### What's added
* **New primitive OP_RE_XOR (re.xor)** wired through seq_decl_plugin:
parser signature, info propagation (nullable, min_length), and
pretty-printer.
* **seq_rewriter**: structural XOR rewrites (r XOR r = empty, r XOR empty =
r, full XOR r = comp(r), comp/comp absorption, complement push, AC
normalisation), nullability (Null(p XOR q) = Null(p) != Null(q)),
derivative (D_a(p XOR q) = D_a(p) XOR D_a(q)), reverse, Antimirov
derivative, and `check_deriv_normal_form` coverage.
* **New class seq::regex_bisim** in
`src/ast/rewriter/seq_regex_bisim.{h,cpp}` to keep the bisim logic out
of the already-large `seq_rewriter.cpp`. Uses `basic_union_find` from
`util/union_find.h`, an `obj_map` for the node assignment, and a
50000-step bound (returns `l_undef` on overrun).
* **Integration** in `seq_rewriter::reduce_re_eq` (with a re-entry
guard) and in `seq_regex::propagate_eq` / `propagate_ne` for ground
regexes; on `l_undef` we fall back to the existing axiomatisation.
* **`sls_seq_plugin`**: extend `OP_RE_DIFF` switch arms to also cover
`OP_RE_XOR`.
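
To make the idea concrete, the following is a minimal sketch of a union-find driven bisimulation check with a step bound. It runs on a small explicit DFA over a two-letter alphabet rather than on regex derivatives, and `Dfa`, `UnionFind`, `equivalent`, and `demo_dfa` are hypothetical names, not the `seq::regex_bisim` interface. A `std::nullopt` result plays the role of `l_undef`.

```cpp
#include <array>
#include <cassert>
#include <numeric>
#include <optional>
#include <utility>
#include <vector>

struct Dfa {
    std::vector<std::array<int, 2>> delta;      // delta[state][symbol], symbols 0='a', 1='b'
    std::vector<bool>               accepting;  // analogue of nullability
};

struct UnionFind {
    std::vector<int> parent;
    explicit UnionFind(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) {
        while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }
        return x;
    }
    // Returns false when a and b were already in the same class.
    bool merge(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return false;
        parent[a] = b;
        return true;
    }
};

// true: equivalent, false: a distinguishing word exists, nullopt: bound exceeded.
std::optional<bool> equivalent(const Dfa& d, int p, int q, unsigned max_steps) {
    UnionFind uf(static_cast<int>(d.delta.size()));
    std::vector<std::pair<int, int>> todo{{p, q}};
    unsigned steps = 0;
    while (!todo.empty()) {
        if (++steps > max_steps) return std::nullopt;
        auto [s, t] = todo.back();
        todo.pop_back();
        if (!uf.merge(s, t)) continue;                       // pair already assumed equal
        if (d.accepting[s] != d.accepting[t]) return false;  // "p XOR q" is nullable here
        for (int c = 0; c < 2; ++c)
            todo.push_back({d.delta[s][c], d.delta[t][c]});  // follow both derivatives
    }
    return true;
}

// States: 0 = (a|b)*, 1 and 2 = (a*b*)*, 3 = a*, 4 = dead state.
Dfa demo_dfa() {
    Dfa d;
    d.delta     = {{0, 0}, {1, 2}, {1, 2}, {3, 4}, {4, 4}};
    d.accepting = {true, true, true, true, false};
    return d;
}
```

On the demo automaton, states 0 and 1 are reported equivalent, states 3 and 0 are not, and a step bound of 0 yields `std::nullopt`, mirroring the fallback to the existing axiomatisation.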
### Validation
* Full release build with MSVC + Ninja.
* `./test-z3 /a` -- 89/89 tests passing.
* `./test-z3 /seq smt2print_parse` -- PASS.
* Smoke tests with `(a|b)*` vs `(a*b*)*` (equal) and `a*` vs `(a|b)*`
(not equal) return the expected `sat`/`unsat` quickly.
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
`seq_rewriter.cpp` was missing several regex-concat normalizations
around `re.all` (`Σ*`), causing avoidable growth and missed
simplifications. This update fills the four gaps: nullable absorption,
guarded union distribution, intersection suffix elimination, and
nested-star collapse.
- **Nullable/full-seq absorption (A1)**
- Generalizes `Σ*·R → Σ*` and `R·Σ* → Σ*` beyond `Σ*·Σ*`.
- Applies when `R` is interpreted, nullable, and has `min_length = 0`.
- **Guarded distribution over union (A2)**
- Adds `Σ*·(R1 ∪ R2)` distribution when at least one arm is already
`Σ*`-headed.
- Rebuilds via normalized union so the redundant arm collapses to `Σ*`.
- **Intersection + full-seq tail elimination (A3)**
- Adds `(R1 ∩ … ∩ Rn)·Σ* → (R1 ∩ … ∩ Rn)` when every intersection leaf
already ends in `Σ*`.
- **Nested star concat collapse (A4)**
- Adds `R*·(R*·X) → R*·X`, covering non-adjacent star patterns not
handled by the prior adjacent-only rewrite.
```cpp
if (re().is_full_seq(a) && accepts_empty_word(b)) result = a; // A1
if (re().is_full_seq(a) && re().is_union(b, u1, u2) && ...) ... // A2
if (re().is_intersection(a, u1, u2) && re().is_full_seq(b) && ...) result=a; // A3
if (re().is_star(a, a1) && re().is_concat(b, b1, b2) && re().is_star(b1,b3) && a1==b3) result=b; // A4
```
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
add simplification rule for at(x, offset) = ""
Introducing j only postpones some rewrites, which prevents useful simplifications. Z3 already shares common sub-expressions.
The example highlights some opportunities for simplification, notably at(..) = "".
With this simplification added, the example is solved in both versions.
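
The commit does not spell out the rule itself. One plausible reading, assuming SMT-LIB `str.at` semantics (the result is empty exactly when the offset is out of bounds), is that `at(x, offset) = ""` reduces to a bounds condition. The helper names below are hypothetical and only model that semantics:

```cpp
#include <cassert>
#include <string>

// Model of SMT-LIB str.at: one character at index i, or "" when i is out of bounds.
std::string str_at(const std::string& x, long long i) {
    if (i < 0 || i >= static_cast<long long>(x.size())) return "";
    return std::string(1, x[static_cast<std::size_t>(i)]);
}

// Assumed simplified form of at(x, i) = "": a pure arithmetic condition.
bool at_is_empty(const std::string& x, long long i) {
    return i < 0 || i >= static_cast<long long>(x.size());
}
```

For every string and offset, `str_at(x, i).empty()` agrees with `at_is_empty(x, i)`, which is why the equation can be replaced by a length constraint.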
* Introduce X-macro-based trace tag definition
- Created trace_tags.def to centralize TRACE tag definitions
- Each tag includes a symbolic name and description
- Set up enum class TraceTag for type-safe usage in TRACE macros
* Add script to generate Markdown documentation from trace_tags.def
- Python script parses trace_tags.def and outputs trace_tags.md
* Refactor TRACE_NEW to prepend TraceTag and pass enum to is_trace_enabled
* trace: improve trace tag handling system with hierarchical tagging
- Introduce hierarchical tag-class structure: enabling a tag class activates all child tags
- Unify TRACE, STRACE, SCTRACE, and CTRACE under enum TraceTag
- Implement initial version of trace_tags.def using X(tag, tag_class, description)
(class names and descriptions to be refined in a future update)
* trace: replace all string-based TRACE tags with enum TraceTag
- Migrated all TRACE, STRACE, SCTRACE, and CTRACE macros to use enum TraceTag values instead of raw string literals
* trace: add cstring header
* trace: Add Markdown documentation generation from trace_tags.def via mk_api_doc.py
* trace: rename macro parameter 'class' to 'tag_class' and remove Unicode comment in trace_tags.h.
* trace: Add TODO comment for future implementation of tag_class activation
* trace: Disable code related to tag_class until implementation is ready (#7663).
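
The X-macro pattern described above can be sketched as follows. The tag list is inlined as a macro so the example is self-contained; the actual change keeps the entries in trace_tags.def. The three tags, their classes, and the `tag_name` / `tag_description` helpers are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the contents of trace_tags.def: X(tag, tag_class, description)
#define TRACE_TAG_LIST(X)                              \
    X(seq,       solver, "sequence theory solver")     \
    X(seq_regex, solver, "regex derivatives")          \
    X(rewriter,  ast,    "term rewriting")

// Expand the list once into a type-safe enum ...
enum class TraceTag {
#define X(tag, tag_class, desc) tag,
    TRACE_TAG_LIST(X)
#undef X
    Count  // number of tags, not a tag itself
};

// ... and again into lookup tables that stay in sync with the enum.
inline const char* tag_name(TraceTag t) {
    static const char* const names[] = {
#define X(tag, tag_class, desc) #tag,
        TRACE_TAG_LIST(X)
#undef X
    };
    return names[static_cast<unsigned>(t)];
}

inline const char* tag_description(TraceTag t) {
    static const char* const descs[] = {
#define X(tag, tag_class, desc) desc,
        TRACE_TAG_LIST(X)
#undef X
    };
    return descs[static_cast<unsigned>(t)];
}
```

Adding a tag then means adding one `X(...)` line; the enum, the name table, and the description table all pick it up, and a misspelled tag at a TRACE call site becomes a compile error instead of a silently ignored string.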