From 7f2f66d47c705da489ea6c1b3344c74f7b80f0d2 Mon Sep 17 00:00:00 2001 From: Margus Veanes Date: Sun, 14 Jun 2026 22:30:25 -0700 Subject: [PATCH] Document why CHARCLASS derivative uses nested ITE instead of single OR The shallower single-ITE-with-OR-of-ranges form fails to reduce under constant-string evaluation because elim_condition (and the path simplifier upstream of it) only handle conjunctions via flatten_and. A disjunction at the top of the condition is opaque, so nullability checks after each derivative step cannot resolve. The nested-ITE form keeps each cond as a conjunction of two <= checks, which elim_condition can collapse to true/false when the input character is concrete. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- src/ast/rewriter/seq_rewriter.cpp | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/src/ast/rewriter/seq_rewriter.cpp b/src/ast/rewriter/seq_rewriter.cpp index c8b68bad2..11624698c 100644 --- a/src/ast/rewriter/seq_rewriter.cpp +++ b/src/ast/rewriter/seq_rewriter.cpp @@ -3089,7 +3089,10 @@ void seq_rewriter::mk_antimirov_deriv_rec(expr* e, expr* r, expr* path, expr_ref // CHARCLASS = disjoint union of ranges [lo_0,hi_0] | ... | [lo_{N-1},hi_{N-1}]. // D(e, CHARCLASS) = ite(P(e), epsilon, []) where P(e) is the disjunction // of (lo_i <= e <= hi_i). We build a chain of nested ITEs, one per range, - // so the result is in antimirov BDD normal form. + // so the result is in antimirov BDD normal form. A single ITE with an + // OR-of-ranges condition would be shallower but elim_condition only + // simplifies conjunctions, so constant-string evaluation (which probes + // nullability after each derivative) would no longer reduce. app const* a = to_app(r); func_decl* d = a->get_decl(); unsigned np = d->get_num_parameters();