mirror of https://github.com/Z3Prover/z3 synced 2026-06-19 15:16:29 +00:00
Commit graph

414 commits

Author SHA1 Message Date
Nikolaj Bjorner
c9cd5147be merge
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2026-06-10 15:31:01 -07:00
copilot-swe-agent[bot]
1e906ba585 Remove is_nullable_rec from seq_rewriter, delegate to derive::nullable 2026-06-10 15:27:42 -07:00
copilot-swe-agent[bot]
bf9707a316 Address PR feedback on derive, nullability, and requested reverts 2026-06-10 15:26:40 -07:00
Nikolaj Bjorner
0e29a35da5 updates
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2026-06-10 15:25:05 -07:00
Nikolaj Bjorner
2e3dd32b90 Address PR review comments: cache, simplify_ite_rec, itos
- Cache now indexes by (ele, r) pair using obj_pair_map
- Remove eval() function; operator()(ele, r) handles all cases
- Rewrite simplify_ite_rec with path vector of signed conditions
- Add range-based simplification: (lo <= x, false) + (x <= hi, false)
  eliminates ite(x = v, t, e) when v is outside [lo, hi]
- Add is_itos case in derive_to_re: guards on n >= 0, digit range,
  and first character match
- Port is_reverse normalization (previous commit)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-10 15:23:56 -07:00
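One way to read the range-based rule in the message above: once the path conditions pin `x` to an interval `[lo, hi]`, the test `x = v` is decided whenever `v` falls outside that interval, so `ite(x = v, t, e)` collapses to `e`. The sketch below illustrates only that reading. `Interval`, `decide_eq`, and `simplify_ite` are hypothetical names, not functions from `seq_derive`, and the encoding of signed conditions in the real path vector may differ.

```cpp
#include <cassert>
#include <optional>

// Hypothetical summary of the path conditions on x: lo <= x <= hi (assumes lo <= hi).
struct Interval { unsigned lo, hi; };

// Try to decide `x == v` from the interval alone.
// false   : v lies outside [lo, hi]
// true    : the interval is the single point v
// nullopt : the interval does not decide the test
std::optional<bool> decide_eq(const Interval& path, unsigned v) {
    if (v < path.lo || v > path.hi) return false;
    if (path.lo == path.hi) return true;  // here lo == v == hi
    return std::nullopt;
}

// Reduce ite(x == v, t, e) to one branch when the test is decided.
template <typename T>
std::optional<T> simplify_ite(const Interval& path, unsigned v, const T& t, const T& e) {
    std::optional<bool> d = decide_eq(path, v);
    if (!d) return std::nullopt;          // keep the ite as it is
    return *d ? t : e;
}
```

With `x` restricted to the digit range `['0', '9']`, a test against `'a'` is decided as false and the else branch is returned.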
Nikolaj Bjorner
2b06d6ddb2 Add simplify_ite_rec and eval for two-phase derivative
- Add simplify_ite post-processing in operator() to simplify ITE conditions
- Add simplify_ite_rec(cond, sign, r) for propagating condition truth values
- Handles c == cond, x=ch1 vs x=ch2 with different constants
- Add eval(ele, d) for efficient two-phase: symbolic derivative + concrete eval
- mk_derivative uses two-phase pattern: m_derive(r) then m_derive.eval(ele, d)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-10 15:23:55 -07:00
Nikolaj Bjorner
e74d2d2151 move seq_derive and fix include paths, remove antimirov code 2026-06-10 15:23:46 -07:00
Nikolaj Bjorner
52c7e89c31 Add seq::derive class for symbolic regex derivatives
Implement a new seq::derive class (seq_derive.h/cpp) that computes
symbolic derivatives of regular expressions using ITE-trees, based on
the RE# approach (Varatalu, Veanes, Ernits - POPL 2025).

Key features:
- Two-argument operator()(ele, r): computes derivative of regex r w.r.t.
  element ele (concrete character or de Bruijn variable for symbolic mode)
- ACI canonicalization (flatten, stable_sort, dedup) for union/intersection
- ITE-tree combinators for binary/unary operations
- Info-based nullability with recursive fallback
- Complement absorption rules
- Depth-bounded recursion to prevent stack overflow

Integration with seq_rewriter:
- mk_derivative(ele, r) and mk_derivative(r) now delegate to m_derive
- Removed dead mk_derivative_rec function
- Added ITE hoisting in mk_re_star, mk_re_concat, mk_re_union0,
  mk_re_inter0, mk_re_complement
- Added depth limiting in Antimirov derivative helpers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-10 15:21:48 -07:00
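For readers unfamiliar with regex derivatives, the following is a minimal sketch of the classical (concrete-character) Brzozowski derivative together with nullability, on a toy AST. It is not the ITE-tree based symbolic derivative that the commit describes, it performs no ACI canonicalization, and every name in it (`Re`, `mk`, `derive`, `nullable`, `matches`) is hypothetical rather than taken from `seq_derive.h`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy regex AST; unrelated to Z3's expr / seq_util types.
struct Re;
using ReP = std::shared_ptr<const Re>;
struct Re {
    enum Kind { Empty, Eps, Chr, Cat, Alt, Star } kind;
    char c;      // used by Chr
    ReP l, r;    // children; Star uses only l
};

ReP mk(Re::Kind k, char c = 0, ReP l = nullptr, ReP r = nullptr) {
    return std::make_shared<const Re>(Re{k, c, std::move(l), std::move(r)});
}

// Does the language of r contain the empty word?
bool nullable(const ReP& r) {
    switch (r->kind) {
    case Re::Eps: case Re::Star: return true;
    case Re::Cat: return nullable(r->l) && nullable(r->r);
    case Re::Alt: return nullable(r->l) || nullable(r->r);
    default:      return false;   // Empty, Chr
    }
}

// D_a(r): the words w such that a.w is in the language of r.
ReP derive(char a, const ReP& r) {
    switch (r->kind) {
    case Re::Chr:  return mk(r->c == a ? Re::Eps : Re::Empty);
    case Re::Alt:  return mk(Re::Alt, 0, derive(a, r->l), derive(a, r->r));
    case Re::Star: return mk(Re::Cat, 0, derive(a, r->l), r);
    case Re::Cat: {
        ReP d = mk(Re::Cat, 0, derive(a, r->l), r->r);
        // If the left factor accepts the empty word, a may also start the right factor.
        return nullable(r->l) ? mk(Re::Alt, 0, d, derive(a, r->r)) : d;
    }
    default:       return mk(Re::Empty);   // Empty, Eps
    }
}

// Membership by repeated derivation followed by a nullability check.
bool matches(ReP r, const std::string& s) {
    for (char ch : s) r = derive(ch, r);
    return nullable(r);
}
```

Without simplification the derivative terms grow with every step, which is one reason canonicalization and caching matter in a production implementation.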
Margus Veanes
513b81253b
Add OP_RE_XOR and union-find bisimulation for ground regex equivalence (#9804)
Implements the algorithm `Eq(p,q) = Empty(p XOR q)` using a union-find
driven bisimulation closure (per the CAV'26 ERE paper).

### What's added

* **New primitive OP_RE_XOR (re.xor)** wired through seq_decl_plugin:
parser signature, info propagation (nullable, min_length), and
pretty-printer.
* **seq_rewriter**: structural XOR rewrites (r XOR r = empty, r XOR empty =
r, full XOR r = comp(r), comp/comp absorption, complement push, AC
normalisation), nullability (Null(p XOR q) = Null(p) != Null(q)),
derivative (D_a(p XOR q) = D_a(p) XOR D_a(q)), reverse, antimirov
derivative, and `check_deriv_normal_form` coverage.
* **New class seq::regex_bisim** in
`src/ast/rewriter/seq_regex_bisim.{h,cpp}` to keep the bisim logic out
of the already-large `seq_rewriter.cpp`. Uses `basic_union_find` from
`util/union_find.h`, an `obj_map` for the node assignment, and a
50000-step bound (returns `l_undef` on overrun).
* **Integration** in `seq_rewriter::reduce_re_eq` (with a re-entry
guard) and in `seq_regex::propagate_eq` / `propagate_ne` for ground
regexes; on `l_undef` we fall back to the existing axiomatisation.
* **`sls_seq_plugin`**: extend `OP_RE_DIFF` switch arms to also cover
`OP_RE_XOR`.

### Validation

* Full release build with MSVC + Ninja.
* `./test-z3 /a` -- 89/89 tests passing.
* `./test-z3 /seq smt2print_parse` -- PASS.
* Smoke tests with `(a|b)*` vs `(a*b*)*` (equal) and `a*` vs `(a|b)*`
(not equal) return the expected `sat`/`unsat` quickly.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-10 14:58:20 -07:00
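The idea of a union-find driven equivalence check with a step bound can be illustrated on explicit DFAs, in the style of the Hopcroft-Karp algorithm. This is a simplified sketch, not the code in `seq_regex_bisim`: the commit works on regex derivatives rather than transition tables, and `Dfa`, `UnionFind`, `equivalent`, and the three example automata are hypothetical names. A pair of states whose acceptance differs corresponds to `p XOR q` being nullable, which refutes equivalence. An empty `optional` plays the role of `l_undef`.

```cpp
#include <cassert>
#include <numeric>
#include <optional>
#include <utility>
#include <vector>

// Complete DFA over symbols 0..k-1; both automata must use the same k.
struct Dfa {
    std::vector<std::vector<int>> next;  // next[state][symbol]
    std::vector<bool> accepting;
};

struct UnionFind {
    std::vector<int> parent;
    explicit UnionFind(int n) : parent(n) { std::iota(parent.begin(), parent.end(), 0); }
    int find(int x) {
        while (parent[x] != x) { parent[x] = parent[parent[x]]; x = parent[x]; }
        return x;
    }
    bool merge(int a, int b) {           // true if two classes were joined
        a = find(a); b = find(b);
        if (a == b) return false;
        parent[a] = b;
        return true;
    }
};

// true/false when decided, nullopt when max_steps is exceeded.
std::optional<bool> equivalent(const Dfa& a, const Dfa& b, int sa, int sb, int max_steps) {
    const int na = static_cast<int>(a.next.size());
    UnionFind uf(na + static_cast<int>(b.next.size()));  // states of b are offset by na
    std::vector<std::pair<int, int>> todo;
    uf.merge(sa, na + sb);
    todo.push_back({sa, sb});
    int steps = 0;
    while (!todo.empty()) {
        if (++steps > max_steps) return std::nullopt;
        auto [p, q] = todo.back();
        todo.pop_back();
        if (a.accepting[p] != b.accepting[q]) return false;  // XOR accepts the empty word
        for (int s = 0; s < static_cast<int>(a.next[p].size()); ++s) {
            int p2 = a.next[p][s], q2 = b.next[q][s];
            // Only successor pairs not already identified need to be explored.
            if (uf.merge(p2, na + q2)) todo.push_back({p2, q2});
        }
    }
    return true;
}

// Example automata over {a = 0, b = 1}.
Dfa all_words() { Dfa d; d.next = {{0, 0}}; d.accepting = {true}; return d; }
Dfa all_words_two_states() { Dfa d; d.next = {{0, 1}, {0, 1}}; d.accepting = {true, true}; return d; }
Dfa only_as() { Dfa d; d.next = {{0, 1}, {1, 1}}; d.accepting = {true, false}; return d; }
```

The first two automata mirror the `(a|b)*` vs `(a*b*)*` smoke test and the third mirrors `a*`. Merging classes before exploring successors is what keeps the number of explored pairs small.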
copilot-swe-agent[bot]
b6a29b800b
Remove is_nullable_rec from seq_rewriter, delegate to derive::nullable 2026-06-10 18:53:55 +00:00
copilot-swe-agent[bot]
00fcd3a36d
Address PR feedback on derive, nullability, and requested reverts 2026-06-10 18:18:46 +00:00
Nikolaj Bjorner
77ac58484f updates
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2026-06-09 17:42:11 -07:00
Nikolaj Bjorner
f02391a01f Merge branch 'master' of https://github.com/z3prover/z3 into derive 2026-06-09 15:18:29 -07:00
Copilot
e093be8b60
seq_rewriter: add missing concat rewrites for nullable/full-seq/star cases (#9782)
`seq_rewriter.cpp` was missing several regex-concat normalizations
around `re.all` (`Σ*`), causing avoidable growth and missed
simplifications. This update fills the four gaps: nullable absorption,
guarded union distribution, intersection suffix elimination, and
nested-star collapse.

- **Nullable/full-seq absorption (A1)**
  - Generalizes `Σ*·R → Σ*` and `R·Σ* → Σ*` beyond `Σ*·Σ*`.
  - Applies when `R` is interpreted, nullable, and has `min_length = 0`.

- **Guarded distribution over union (A2)**
  - Adds `Σ*·(R1 ∪ R2)` distribution when at least one arm is already
    `Σ*`-headed.
  - Rebuilds via normalized union so the redundant arm collapses to `Σ*`.

- **Intersection + full-seq tail elimination (A3)**
  - Adds `(R1 ∩ … ∩ Rn)·Σ* → (R1 ∩ … ∩ Rn)` when every intersection leaf
    already ends in `Σ*`.

- **Nested star concat collapse (A4)**
  - Adds `R*·(R*·X) → R*·X`, covering non-adjacent star patterns not
    handled by the prior adjacent-only rewrite.

```cpp
if (re().is_full_seq(a) && accepts_empty_word(b)) result = a;               // A1
if (re().is_full_seq(a) && re().is_union(b, u1, u2) && ...) ...             // A2
if (re().is_intersection(a, u1, u2) && re().is_full_seq(b) && ...) result=a; // A3
if (re().is_star(a, a1) && re().is_concat(b, b1, b2) && re().is_star(b1,b3) && a1==b3) result=b; // A4
```

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
2026-06-09 14:38:38 -07:00
Copilot
f0956a622f
Refactor regex subset logic into seq_subset with depth-bounded recursion and optimized concat traversal (#9777)
`seq_rewriter::is_subset` was too localized and missed key subset
implications for regex concatenations. This change extracts subset
reasoning into a dedicated component and adds heuristic
closure/monotonicity rules, then tunes the recursion strategy based on
profiling feedback.

- **Architecture: isolate subset reasoning**
  - Introduce `seq_subset` in `src/ast/rewriter` (`seq_subset.h/.cpp`).
  - Add `seq_subset` as an attribute on `seq_rewriter` and route
    `seq_rewriter::is_subset` through it.
  - Keep `seq_rewriter` focused on rewrite orchestration while subset
    logic evolves independently.

- **Subset rules: broaden inferable cases**
  - Add derive-style subset decomposition across `union`, `intersection`,
    `complement`, `concat`, and bounded `loop`.
  - Add E3-style closure rules:
    - `R ⊆ R*`
    - `R1* ⊆ R2*  ⇐  R1 ⊆ R2`
    - `R1+ ⊆ R2+  ⇐  R1 ⊆ R2`
  - Add missing cheap cases:
    - `ε ⊆ R` when `R` is nullable
    - `R ⊆ R+`
    - `R+ ⊆ R*`
    - Range containment: `[c1–c2] ⊆ [c3–c4]` when `c3 ≤ c1 ∧ c2 ≤ c4`
    - `to_re(s) ⊆ range` for single-character string constants
    - Difference monotonicity: `a1 \ a2 ⊆ b` when `a1 ⊆ b`
  - Star absorption checks for concat/star combinations (`R·R* ⊆ R*`,
    `R*·R ⊆ R*`)
  - Preserve nullable-based `. +` handling and top/bottom regular-language
    shortcuts.

- **Concatenation reasoning and traversal tuning**
  - Remove `flatten_concat` and assume right-associative concatenation
    traversal.
  - Keep containment shortcuts for both `R ⊆ Σ*·R'` and `R ⊆ R'·Σ*` when
    `R ⊆ R'`.
  - Make concat/concat handling tail-recursive on second arguments.

- **Depth-bounded recursion (profiling follow-up)**
  - Replace visited-pair hash-table recursion state with an explicit depth
    parameter in `is_subset_rec`.
  - Add `m_max_depth = 3` and return `false` when the bound is reached.
  - Increment depth on recursive calls, except for the tail-recursive
    concat-second-argument step.

- **Build integration**
  - Register `seq_subset.cpp` in `src/ast/rewriter/CMakeLists.txt`.

```cpp
// seq_rewriter.cpp
bool seq_rewriter::is_subset(expr* r1, expr* r2) const {
    return m_subset.is_subset(r1, r2);
}
```

---------

Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2026-06-09 13:42:28 -07:00
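A few of the subset rules listed above, together with the depth bound, can be sketched on a toy AST as follows. This is an illustration under simplifying assumptions, not the code in `seq_subset.cpp`: only ranges, star, and plus are modelled, the rules are sufficient conditions only, and `Re`, `range`, `star`, `plus`, and `max_depth` are hypothetical names. As in the commit message, hitting the bound answers `false`, meaning "could not show inclusion" rather than "not included".

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Re;
using ReP = std::shared_ptr<const Re>;
struct Re {
    enum Kind { Range, Star, Plus } kind;
    char lo, hi;   // used by Range, assumes lo <= hi
    ReP arg;       // used by Star and Plus
};

ReP range(char lo, char hi) { return std::make_shared<const Re>(Re{Re::Range, lo, hi, nullptr}); }
ReP star(ReP r) { return std::make_shared<const Re>(Re{Re::Star, 0, 0, std::move(r)}); }
ReP plus(ReP r) { return std::make_shared<const Re>(Re{Re::Plus, 0, 0, std::move(r)}); }

constexpr unsigned max_depth = 3;

// Sound but incomplete: true implies L(a) is a subset of L(b).
bool is_subset_rec(const ReP& a, const ReP& b, unsigned depth) {
    if (depth >= max_depth) return false;            // bound reached: give up
    if (a == b) return true;                         // identical node
    if (a->kind == Re::Range && b->kind == Re::Range)
        return b->lo <= a->lo && a->hi <= b->hi;     // [c1-c2] in [c3-c4]
    if (a->kind == Re::Star && b->kind == Re::Star)  // R1* in R2* if R1 in R2
        return is_subset_rec(a->arg, b->arg, depth + 1);
    if (a->kind == Re::Plus && b->kind != Re::Range) // R1+ in R2+ or R2* if R1 in R2
        return is_subset_rec(a->arg, b->arg, depth + 1);
    if (b->kind != Re::Range)                        // R in R'* or R'+ if R in R'
        return is_subset_rec(a, b->arg, depth + 1);
    return false;
}

bool is_subset(const ReP& a, const ReP& b) { return is_subset_rec(a, b, 0); }
```

Three nested stars on each side already exhaust the bound in this sketch, which shows the trade-off the commit accepts: cheap, predictable cost in exchange for missing some deep inclusions.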
Nikolaj Bjorner
3afd83103a Address PR review comments: cache, simplify_ite_rec, itos
- Cache now indexes by (ele, r) pair using obj_pair_map
- Remove eval() function; operator()(ele, r) handles all cases
- Rewrite simplify_ite_rec with path vector of signed conditions
- Add range-based simplification: (lo <= x, false) + (x <= hi, false)
  eliminates ite(x = v, t, e) when v is outside [lo, hi]
- Add is_itos case in derive_to_re: guards on n >= 0, digit range,
  and first character match
- Port is_reverse normalization (previous commit)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-03 17:16:23 -07:00
Nikolaj Bjorner
f8925ca6fa Add simplify_ite_rec and eval for two-phase derivative
- Add simplify_ite post-processing in operator() to simplify ITE conditions
- Add simplify_ite_rec(cond, sign, r) for propagating condition truth values
- Handles c == cond, x=ch1 vs x=ch2 with different constants
- Add eval(ele, d) for efficient two-phase: symbolic derivative + concrete eval
- mk_derivative uses two-phase pattern: m_derive(r) then m_derive.eval(ele, d)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-03 14:25:03 -07:00
Nikolaj Bjorner
cb2cf913e3 move seq_derive and fix include paths, remove antimirov code 2026-06-03 11:04:19 -07:00
Nikolaj Bjorner
1f28fd0e6b Add seq::derive class for symbolic regex derivatives
Implement a new seq::derive class (seq_derive.h/cpp) that computes
symbolic derivatives of regular expressions using ITE-trees, based on
the RE# approach (Varatalu, Veanes, Ernits - POPL 2025).

Key features:
- Two-argument operator()(ele, r): computes derivative of regex r w.r.t.
  element ele (concrete character or de Bruijn variable for symbolic mode)
- ACI canonicalization (flatten, stable_sort, dedup) for union/intersection
- ITE-tree combinators for binary/unary operations
- Info-based nullability with recursive fallback
- Complement absorption rules
- Depth-bounded recursion to prevent stack overflow

Integration with seq_rewriter:
- mk_derivative(ele, r) and mk_derivative(r) now delegate to m_derive
- Removed dead mk_derivative_rec function
- Added ITE hoisting in mk_re_star, mk_re_concat, mk_re_union0,
  mk_re_inter0, mk_re_complement
- Added depth limiting in Antimirov derivative helpers

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-03 10:36:19 -07:00
Nikolaj Bjorner
1137d23725 fix bug reported in API coherence report
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2026-03-19 23:20:55 -07:00
Nikolaj Bjorner
ffd9207bc5 fix #8572
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2026-02-10 20:22:27 -08:00
Copilot
c4f75bc85a
Refactor seq_rewriter to use C++17 structured bindings (#8381)
* Initial plan

* Refactor seq_rewriter.cpp to use C++17 structured bindings

Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

* Address code review feedback - move pair declaration inside loop

Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
2026-01-27 12:06:11 -08:00
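The kind of change this refactor describes can be illustrated with a generic before/after pair. The functions below are hypothetical and are not taken from `seq_rewriter.cpp`; they only show how a C++17 structured binding declared inside the loop replaces `.first`/`.second` access.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, unsigned>;

// Before: members of the pair are reached through .first / .second.
unsigned total_length_pre17(const std::vector<Entry>& entries) {
    unsigned sum = 0;
    for (const auto& kv : entries)
        sum += kv.second;
    return sum;
}

// After: the pair is unpacked into named bindings in the loop header.
unsigned total_length_cpp17(const std::vector<Entry>& entries) {
    unsigned sum = 0;
    for (const auto& [name, len] : entries) {
        (void)name;   // only the length is needed here
        sum += len;
    }
    return sum;
}
```

Both functions compute the same result; the second form names each component at the point of use.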
Copilot
2436943794
Standardize for-loop increments to prefix form (++i) (#8199)
* Initial plan

* Convert postfix to prefix increment in for loops

Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

* Fix member variable increment conversion bug

Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

* Update API generator to produce prefix increments

Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
2026-01-14 19:55:31 -08:00
Lev Nachmanson
641741f3a8 parameter eval order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-07 10:30:58 -07:00
Lev Nachmanson
8af9a20e01 parameter eval order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-07 10:26:40 -07:00
Lev Nachmanson
6a9520bdc2 parameter eval order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-07 10:21:09 -07:00
Lev Nachmanson
8ccf4cd8f7 parameter eval order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-07 10:19:24 -07:00
Lev Nachmanson
40b980079b parameter eval order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-07 10:14:02 -07:00
Lev Nachmanson
a41549eee6 parameter eval order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-07 10:06:43 -07:00
Lev Nachmanson
2b3068d85f parameter eval order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-07 09:17:12 -07:00
Lev Nachmanson
3a2bbf4802 param eval order 2025-10-07 09:13:21 -07:00
Lev Nachmanson
6e52b9584c param eval 2025-10-07 09:04:24 -07:00
Lev Nachmanson
93ff8c76db parameter evaluation order 2025-10-07 08:53:49 -07:00
Lev Nachmanson
00f1e6af7e parameter eval order 2025-10-07 08:40:24 -07:00
Lev Nachmanson
c154b9df90 param order evaluation 2025-10-07 08:34:56 -07:00
Lev Nachmanson
77c70bf812 param order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-06 15:52:09 -07:00
Lev Nachmanson
63bb367a10 param order
Signed-off-by: Lev Nachmanson <levnach@hotmail.com>
2025-10-06 15:52:09 -07:00
Nikolaj Bjorner
6df8b39718 Update seq_rewriter.cpp 2025-08-14 14:40:26 -07:00
Nikolaj Bjorner
fcd3a70c92 remove theory_str and classes that are only used by it 2025-08-07 21:05:12 -07:00
Nikolaj Bjorner
1f8b08108c #7739 optimization
add simplification rule for at(x, offset) = ""

Introducing j just postpones some rewrites that prevent useful simplifications. Z3 already uses common sub-expressions.
The example highlights some opportunities for simplification, noteworthy at(..) = "".
The example is solved in both versions after adding this simplification.
2025-07-26 14:02:34 -07:00
LeeYoungJoon
0a93ff515d
Centralize and document TRACE tags using X-macros (#7657)
* Introduce X-macro-based trace tag definition
- Created trace_tags.def to centralize TRACE tag definitions
- Each tag includes a symbolic name and description
- Set up enum class TraceTag for type-safe usage in TRACE macros

* Add script to generate Markdown documentation from trace_tags.def
- Python script parses trace_tags.def and outputs trace_tags.md

* Refactor TRACE_NEW to prepend TraceTag and pass enum to is_trace_enabled

* trace: improve trace tag handling system with hierarchical tagging

- Introduce hierarchical tag-class structure: enabling a tag class activates all child tags
- Unify TRACE, STRACE, SCTRACE, and CTRACE under enum TraceTag
- Implement initial version of trace_tag.def using X(tag, tag_class, description)
  (class names and descriptions to be refined in a future update)

* trace: replace all string-based TRACE tags with enum TraceTag
- Migrated all TRACE, STRACE, SCTRACE, and CTRACE macros to use enum TraceTag values instead of raw string literals

* trace : add cstring header

* trace : Add Markdown documentation generation from trace_tags.def via mk_api_doc.py

* trace : rename macro parameter 'class' to 'tag_class' and remove Unicode comment in trace_tags.h.

* trace : Add TODO comment for future implementation of tag_class activation

* trace : Disable code related to tag_class until implementation is ready (#7663).
2025-05-28 14:31:25 +01:00
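The X-macro pattern named in this commit can be sketched as follows. To stay self-contained, the sketch keeps the tag list in a macro instead of a separate `trace_tags.def` file, and the three tags, their classes, and the helper functions `tag_name` and `tag_description` are invented for illustration. Only the `X(tag, tag_class, description)` shape and the `enum class TraceTag` come from the message above.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical tag list; one X(...) entry per tag.
#define TRACE_TAG_LIST(X)                                   \
    X(seq,         seq_class,   "sequence theory")          \
    X(seq_verbose, seq_class,   "verbose sequence output")  \
    X(arith,       arith_class, "arithmetic")

// Expansion 1: the enumerators.
enum class TraceTag {
#define X(tag, tag_class, desc) tag,
    TRACE_TAG_LIST(X)
#undef X
    count
};

// Expansion 2: enumerator -> printable name.
inline const char* tag_name(TraceTag t) {
    switch (t) {
#define X(tag, tag_class, desc) case TraceTag::tag: return #tag;
    TRACE_TAG_LIST(X)
#undef X
    default: return "unknown";
    }
}

// Expansion 3: enumerator -> description, usable for generated documentation.
inline const char* tag_description(TraceTag t) {
    switch (t) {
#define X(tag, tag_class, desc) case TraceTag::tag: return desc;
    TRACE_TAG_LIST(X)
#undef X
    default: return "";
    }
}
```

Because every expansion reads the same list, adding a tag in one place updates the enum, the name table, and the descriptions together, and a misspelled tag becomes a compile error instead of a silently ignored string.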
Nikolaj Bjorner
99ec42c0d7 additional simplifications to seq 2025-03-19 08:57:31 -10:00
Nikolaj Bjorner
80f00f191a fix #7572 and fix #7574
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2025-03-07 10:46:29 -08:00
Nikolaj Bjorner
fb0eb029a8 use lifted bool 2025-01-21 09:13:38 -08:00
Clemens Eisenhofer
1553bae20c
Performance improvements for seq-sls (#7519)
* Improve length repair

* Fixed arguments

* Special case regex membership with constant string

* Trying hybrid eq-repair strategy

* Different heuristic

* Fixed stoi
2025-01-21 08:01:59 -08:00
Nikolaj Bjorner
c6f58c8bf7 updates to some_string_in_re per code review comments
Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>
2025-01-11 17:47:27 -08:00
Clemens Eisenhofer
c572fc2e4f
Regex membership (#7506)
* Make finding a word in the regex iterative

* Fixed gc problem
2025-01-11 17:41:37 -08:00
Nikolaj Bjorner
c1a62d346c add missing return 2025-01-07 21:02:02 -08:00
Nikolaj Bjorner
1cb126f3dd remove assertion that doesn't build 2025-01-07 17:16:33 -08:00
Nikolaj Bjorner
2dd4faf598 sketch expr_inverter approach for eliminating unconstrained regex containment. 2025-01-07 16:53:57 -08:00