Add OP_RE_XOR and union-find bisimulation for ground regex equivalence (#9804)

Implements the algorithm Eq(p,q) = Empty(p XOR q) using a union-find driven bisimulation closure (per the CAV'26 ERE paper).

### What's added

* **New primitive OP_RE_XOR (re.xor)** wired through seq_decl_plugin: parser signature, info propagation (nullable, min_length), and pretty-printer.
* **seq_rewriter**: structural XOR rewrites (r XOR r = empty, r XOR empty = r, full XOR r = comp(r), comp/comp absorption, complement push, AC normalisation), nullability (Null(p XOR q) = Null(p) != Null(q)), derivative (D_a(p XOR q) = D_a(p) XOR D_a(q)), reverse, antimirov derivative, and `check_deriv_normal_form` coverage.
* **New class seq::regex_bisim** in `src/ast/rewriter/seq_regex_bisim.{h,cpp}` to keep the bisim logic out of the already-large `seq_rewriter.cpp`. Uses `basic_union_find` from `util/union_find.h`, an `obj_map` for the node assignment, and a 50000-step bound (returns `l_undef` on overrun).
* **Integration** in `seq_rewriter::reduce_re_eq` (with a re-entry guard) and in `seq_regex::propagate_eq` / `propagate_ne` for ground regexes; on `l_undef` we fall back to the existing axiomatisation.
* **`sls_seq_plugin`**: extend `OP_RE_DIFF` switch arms to also cover `OP_RE_XOR`.

### Validation

* Full release build with MSVC + Ninja.
* `./test-z3 /a` -- 89/89 tests passing.
* `./test-z3 /seq smt2print_parse` -- PASS.
* Smoke tests with `(a|b)*` vs `(a*b*)*` (equal) and `a*` vs `(a|b)*` (not equal) return the expected `sat`/`unsat` quickly.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-06-27 10:58:48 +00:00 · 2026-06-10 14:58:20 -07:00
commit 513b81253b
parent 589bd9e6f5
9 changed files with 664 additions and 20 deletions
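The union-find bisimulation idea named in the commit message can be illustrated with a minimal, self-contained sketch. This is not the code of `seq::regex_bisim`: it runs on explicit complete DFAs over a two-letter alphabet rather than on regex derivatives, and the names `Dfa`, `UnionFind`, `Verdict`, `are_equivalent`, and `max_steps` are hypothetical, chosen only for this illustration. It assumes state 0 is the initial state of each automaton. The three outcomes play the role of `l_true`, `l_false`, and `l_undef`.

```cpp
#include <array>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Three-valued answer, analogous to l_true / l_false / l_undef.
enum class Verdict { Equal, NotEqual, Unknown };

// Hypothetical complete DFA over {a, b}; state 0 is the initial state.
struct Dfa {
    std::vector<std::array<int, 2>> next;  // next[state][letter]
    std::vector<bool> accepting;           // accepting[state]
};

// Plain union-find with path halving.
struct UnionFind {
    std::vector<int> parent;
    explicit UnionFind(int n) : parent(n) {
        std::iota(parent.begin(), parent.end(), 0);
    }
    int find(int x) {
        while (parent[x] != x) x = parent[x] = parent[parent[x]];
        return x;
    }
    // Returns true only if x and y were in different classes before the call.
    bool merge(int x, int y) {
        x = find(x);
        y = find(y);
        if (x == y) return false;
        parent[x] = y;
        return true;
    }
};

// Hopcroft-Karp style check: states of q are numbered after those of p.
Verdict are_equivalent(const Dfa& p, const Dfa& q, int max_steps) {
    assert(!p.next.empty() && !q.next.empty());
    const int off = static_cast<int>(p.next.size());
    UnionFind uf(off + static_cast<int>(q.next.size()));
    std::vector<std::pair<int, int>> todo;
    uf.merge(0, off);
    todo.push_back({0, 0});
    for (int steps = 0; !todo.empty(); ++steps) {
        if (steps >= max_steps) return Verdict::Unknown;  // bound exceeded
        const int s = todo.back().first;
        const int t = todo.back().second;
        todo.pop_back();
        // Both states are reached by the same word, so a mismatch here
        // means that word distinguishes the two languages.
        if (p.accepting[s] != q.accepting[t]) return Verdict::NotEqual;
        for (int c = 0; c < 2; ++c) {
            const int s2 = p.next[s][c];
            const int t2 = q.next[t][c];
            // Only pairs not already identified need to be explored.
            if (uf.merge(s2, off + t2)) todo.push_back({s2, t2});
        }
    }
    return Verdict::Equal;
}
```

Merging successor classes before exploring them is what keeps the closure small: a pair whose states already share a class is never pushed again. In this sketch an exhausted step budget yields `Verdict::Unknown`, which corresponds to the point where the commit falls back to the existing axiomatisation.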
--- a/src/smt/seq_regex.cpp
+++ b/src/smt/seq_regex.cpp
@@ -21,6 +21,7 @@ Author:
 #include "ast/expr_abstract.h"
 #include "ast/ast_util.h"
 #include "ast/for_each_expr.h"
+#include "ast/rewriter/seq_regex_bisim.h"
 #include <ast/rewriter/expr_safe_replace.h>

 namespace smt {
@@ -460,6 +461,23 @@ namespace smt {
        if (re().is_empty(r))
            //trivially true
            return;
+        // Try the bisimulation procedure on ground regexes first.  If it
+        // returns a definite answer, dispatch the corresponding axiom and
+        // bypass the symbolic emptiness/derivative closure.
+        if (is_ground(r1) && is_ground(r2)) {
+            seq::regex_bisim bisim(seq_rw());
+            switch (bisim.are_equivalent(r1, r2)) {
+            case l_true:
+                STRACE(seq_regex_brief, tout << "bisim:eq ";);
+                return; // trivially true
+            case l_false:
+                STRACE(seq_regex_brief, tout << "bisim:neq ";);
+                th.add_axiom(~th.mk_eq(r1, r2, false), false_literal);
+                return;
+            case l_undef:
+                break;
+            }
+        }
        expr_ref emp(re().mk_empty(r->get_sort()), m);
        expr_ref f(m.mk_fresh_const("re.char", seq_sort), m); 
        expr_ref is_empty = sk().mk_is_empty(r, r, f);
@@ -478,6 +496,20 @@ namespace smt {
        sort* seq_sort = nullptr;
        VERIFY(u().is_re(r1, seq_sort));
        expr_ref r = symmetric_diff(r1, r2);
+        if (is_ground(r1) && is_ground(r2)) {
+            seq::regex_bisim bisim(seq_rw());
+            switch (bisim.are_equivalent(r1, r2)) {
+            case l_true:
+                STRACE(seq_regex_brief, tout << "bisim:eq ";);
+                th.add_axiom(th.mk_eq(r1, r2, false), false_literal);
+                return;
+            case l_false:
+                STRACE(seq_regex_brief, tout << "bisim:neq ";);
+                return; // trivially satisfied
+            case l_undef:
+                break;
+            }
+        }
        expr_ref emp(re().mk_empty(r->get_sort()), m);
        expr_ref n(m.mk_fresh_const("re.char", seq_sort), m);
        expr_ref is_non_empty = sk().mk_is_non_empty(r, r, n);