mirror of https://github.com/Z3Prover/z3 synced 2025-08-08 12:11:23 +00:00

Integrate new regex solver (#4602)

* std::cout debugging statements

* comment out std::cout debugging as this is now a shared fork

* convert std::cout to TRACE statements for seq_rewriter and seq_regex

* add cases to min_length and max_length for regexes

* bug fix

* update min_length and max_length functions for REs

* initial pass on simplifying derivative normal forms by eliminating redundant predicates locally

* add seq_regex_brief trace statements

* working on debugging ref count issue

* fix ref count bug and convert trace statements to seq_regex_brief

* add compact tracing for cache hits/misses

* seq_regex fix cache hit/miss tracing and wrapper around is_nullable

* minor

* label and disable more experimental changes for testing

* minor documentation / tracing

* a few more @EXP annotations

* dead state elimination skeleton code

* progress on dead state elimination

* more progress on dead state elimination

* refactor dead state class to separate self-contained state_graph class

* finish factoring state_graph to only work with unsigned values, and implement separate functionality for expr* logic

* implement get_all_derivatives, add debug tracing

* trace statements for debugging is_nullable loop bug

* fix is_nullable loop bug

* comment out local nullable change and mark experimental

* pretty printing for state_graph

* rewrite state graph to remove the fragile assumption that all edges from a state are added at a time

* start of general cycle detection check + fix some comments

* implement full cycle detection procedure

* normalize derivative conditions to form 'ele <= a'

* order derivative conditions by character code

* fix confusing names m_to and m_from

* assign increasing state IDs from 1 instead of using get_id on AST node

* remove elim_condition call in get_all_derivatives

* use u_map instead of uint_map to avoid memory leak

* remove unnecessary call to is_ground

* debugging

* small improvements to seq_regex_brief tracing

* fix bug on evil2 example

* save work

* new propagate code

* work in progress on using same seq sort for deriv calls

* avoid re-computing derivatives: use same head var for every derivative call

* use min_length on regexes to prune search

* simple implementation of can_be_in_cycle using rank function idea

* add a disabled experimental change

* minor cleanup comments, etc.

* seq_rewriter cleanup for PR

* typo noticed by Nikolaj

* move state graph to util/state_graph

* re-add accidentally removed line

* clean up seq_regex code removing obsolete functions and comments

* a few more cleanup items

* remove experimental functionality for integration

* fix compilation

* remove some tracing and TODOs

* remove old comment

* update copyright dates to 2020

* feedback from Nikolaj

* use [] for map access

* make state_graph methods constant

* avoid recursion in mark_dead_recursive and mark_live_recursive

* a possible bug fix in propagate_nonempty

* write down list of invariants in state_graph

* implement partial invariant check and insert CASSERT statements

* expand on invariant check and tracing

* finish state graph invariant check

* minor tweaks

* regex propagation: convert first two axioms to propagations

* remove obsolete regex solver functionality

Co-authored-by: calebstanford-msr <t-casta@microsoft.com>
Caleb Stanford 2020-07-30 16:54:49 -04:00 committed by GitHub
parent 293b0b8cc2
commit 976e4c91b0
8 changed files with 922 additions and 257 deletions

src/util/state_graph.h Normal file

@@ -0,0 +1,182 @@
/*++
Copyright (c) 2020 Microsoft Corporation
Module Name:
state_graph.h
Abstract:
Data structure for incrementally tracking "live" and "dead" states in an
abstract transition system.
Author:
Caleb Stanford (calebstanford-msr / cdstanford) 2020-7
--*/
#pragma once
#include "util/map.h"
#include "util/uint_set.h"
#include "util/union_find.h"
#include "util/vector.h"
/*
state_graph
Data structure which is capable of incrementally tracking
live states and dead states.
"States" are integers. States and edges are added to the data
structure incrementally.
- States can be marked as "live" or "done".
"Done" signals that (1) no more outgoing edges will be
added and (2) the state will not be marked as live. The data
structure then tracks which other states are live (can reach
a live state), dead (can't reach a live state), or neither.
- Some edges are labeled as not contained in a cycle. This is to
optimize search if it is known by the user of the structure
that no cycle will ever contain this edge.
Internally, we use union_find to identify states within an SCC,
and incrementally update SCCs, while propagating backwards
live and dead SCCs.
*/
class state_graph {
public:
typedef unsigned state;
typedef uint_set state_set;
typedef u_map<state_set> edge_rel;
typedef basic_union_find state_ufind;
private:
/*
All states are internally exactly one of:
- live: known to reach a live state
- dead: known to never reach a live state
- unknown: all outgoing edges have been added, but the
state is not known to be live or dead
- unexplored: not all outgoing edges have been added
As SCCs are merged, some states become aliases, and a
union find data structure collapses a now obsolete
state to its current representative. m_seen keeps track
of states we have seen, including obsolete states.
*/
state_set m_live;
state_set m_dead;
state_set m_unknown;
state_set m_unexplored;
state_set m_seen;
state_ufind m_state_ufind;
/*
Edges are saved in both from and to maps.
A subset of edges are also marked as possibly being
part of a cycle by being stored in m_sources_maybecycle.
*/
edge_rel m_sources;
edge_rel m_targets;
edge_rel m_sources_maybecycle;
/*
CLASS INVARIANTS
*** To enable checking invariants, run z3 with -dbg:state_graph
(must also be in debug mode) ***
State invariants:
- live, dead, unknown, and unexplored form a partition of
the set of roots in m_state_ufind
- all of these are subsets of m_seen
- everything in m_seen is an integer less than the number of variables
in m_state_ufind
Edge invariants:
- all edges are between roots of m_state_ufind
- m_sources and m_targets are converses of each other
- no self-loops
- m_sources_maybecycle is a subrelation of m_sources
Relationship between states and edges:
- every state with a live target is live
- every state with a dead source is dead
- every state with only dead targets is dead
- there are no cycles of unknown states on maybecycle edges
*/
#ifdef Z3DEBUG
bool is_subset(state_set set1, state_set set2) const;
bool is_disjoint(state_set set1, state_set set2) const;
bool check_invariant() const;
#endif
/*
'Core' functions that modify the plain graph, without
updating SCCs or propagating live/dead state information.
These are for internal use only.
*/
void add_state_core(state s); // unexplored + seen
void remove_state_core(state s); // unknown + seen -> seen
void mark_unknown_core(state s); // unexplored -> unknown
void mark_live_core(state s); // unknown -> live
void mark_dead_core(state s); // unknown -> dead
void add_edge_core(state s1, state s2, bool maybecycle);
void remove_edge_core(state s1, state s2);
void rename_edge_core(state old1, state old2, state new1, state new2);
state merge_states(state s1, state s2);
state merge_states(state_set& s_set);
/*
Algorithmic search routines
- live state propagation
- dead state propagation
- cycle / strongly-connected component detection
*/
void mark_live_recursive(state s);
bool all_targets_dead(state s);
void mark_dead_recursive(state s);
state merge_all_cycles(state s);
public:
state_graph():
m_live(), m_dead(), m_unknown(), m_unexplored(), m_seen(),
m_state_ufind(), m_sources(), m_targets(), m_sources_maybecycle()
{
CASSERT("state_graph", check_invariant());
}
/*
Exposed methods
These methods may be called in any order, as long as:
- states are added before edges are added between them
- outgoing edges are not added from a done state
- a done state is not marked as live
- edges are not added creating a cycle containing an edge with
maybecycle = false (violating this does not affect soundness,
but dead states may then go undetected)
*/
void add_state(state s);
void add_edge(state s1, state s2, bool maybecycle);
void mark_live(state s);
void mark_done(state s);
bool is_seen(state s) const;
bool is_live(state s) const;
bool is_dead(state s) const;
bool is_done(state s) const;
unsigned get_size() const;
/*
Pretty printing
*/
std::ostream& display(std::ostream& o) const;
};
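
The live/dead bookkeeping declared above can be illustrated with a much smaller model. The sketch below is not the Z3 implementation: toy_state_graph and toy_demo are hypothetical names, and the sketch assumes std::set and std::map in place of uint_set and u_map, omits the union-find SCC merging, and ignores the maybecycle flag. It only shows the two backward propagations: a state becomes live as soon as one of its targets is live, and a done state becomes dead once all of its targets are dead.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy model for illustration only; NOT the Z3 state_graph.
// No union-find, no SCC merging, no maybecycle edges.
class toy_state_graph {
public:
    typedef unsigned state;

    void add_state(state s) { m_seen.insert(s); }

    // Assumed precondition (mirrors the header): both states were
    // added before, and s1 has not been marked done.
    void add_edge(state s1, state s2) {
        if (s1 == s2) return;            // a self-loop never changes liveness
        m_targets[s1].insert(s2);
        m_sources[s2].insert(s1);
        if (is_live(s2)) propagate_live(s1);
    }

    void mark_live(state s) { propagate_live(s); }

    void mark_done(state s) {
        m_done.insert(s);
        propagate_dead(s);
    }

    bool is_seen(state s) const { return m_seen.count(s) > 0; }
    bool is_live(state s) const { return m_live.count(s) > 0; }
    bool is_dead(state s) const { return m_dead.count(s) > 0; }

private:
    std::set<state> m_seen, m_live, m_dead, m_done;
    std::map<state, std::set<state>> m_sources;  // m_sources[t]: states with an edge into t
    std::map<state, std::set<state>> m_targets;  // m_targets[s]: states that s has an edge to

    // Worklist version of "every state with a live target is live".
    void propagate_live(state s) {
        std::vector<state> todo{s};
        while (!todo.empty()) {
            state cur = todo.back();
            todo.pop_back();
            if (!m_live.insert(cur).second) continue;  // already live
            for (state src : m_sources[cur]) todo.push_back(src);
        }
    }

    // Worklist version of "every done state with only dead targets is dead".
    void propagate_dead(state s) {
        std::vector<state> todo{s};
        while (!todo.empty()) {
            state cur = todo.back();
            todo.pop_back();
            if (is_live(cur) || is_dead(cur) || m_done.count(cur) == 0) continue;
            bool all_dead = true;
            for (state t : m_targets[cur])
                if (!is_dead(t)) { all_dead = false; break; }
            if (!all_dead) continue;
            m_dead.insert(cur);
            for (state src : m_sources[cur]) todo.push_back(src);
        }
    }
};

// Small scenario exercising both propagations and the known gap of this toy.
bool toy_demo() {
    toy_state_graph g;
    for (unsigned s = 1; s <= 6; ++s) g.add_state(s);
    g.add_edge(1, 2);
    g.add_edge(2, 3);
    g.mark_done(3);   // no targets, so 3 is dead
    g.mark_done(2);   // its only target is dead, so 2 is dead
    bool chain_ok = g.is_dead(3) && g.is_dead(2) && !g.is_dead(1) && !g.is_live(1);
    g.add_edge(1, 4);
    g.mark_live(4);   // liveness flows backwards to 1
    bool live_ok = g.is_live(4) && g.is_live(1) && g.is_dead(2);
    g.add_edge(5, 6);
    g.add_edge(6, 5);
    g.mark_done(5);
    g.mark_done(6);   // a done cycle with no exit: the toy cannot tell it is dead
    bool cycle_missed = !g.is_dead(5) && !g.is_dead(6);
    return chain_ok && live_ok && cycle_missed;
}
```

The last case in toy_demo is the reason the header describes SCC tracking: states 5 and 6 can never reach a live state, yet each one waits for the other to be marked dead first. Collapsing a cycle into a single representative, as the comments above describe for m_state_ufind and merge_all_cycles, would presumably leave that representative with no targets, at which point the all-targets-dead rule applies.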