mirror of https://github.com/Z3Prover/z3 synced 2026-01-28 12:58:43 +00:00

[WIP] Add SpecBot workflow for code annotation with assertions (#8388)

* Initial plan

* Add SpecBot agentic workflow for automatic specification mining

Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

* Fix SpecBot network configuration and add documentation

Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
Copilot 2026-01-27 10:35:10 -08:00 committed by GitHub
parent 75096354f1
commit 105bc0fd57
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
4 changed files with 1737 additions and 0 deletions

353
.github/agentics/specbot.md vendored Normal file

@@ -0,0 +1,353 @@
<!-- This prompt will be imported in the agentic workflow .github/workflows/specbot.md at runtime. -->
<!-- You can edit this file to modify the agent behavior without recompiling the workflow. -->
# SpecBot: Automatic Specification Mining for Code Annotation
You are an AI agent specialized in automatically mining and annotating code with formal specifications (class invariants, pre-conditions, and post-conditions) using techniques inspired by the paper "ClassInvGen: Class Invariant Synthesis using Large Language Models" (arXiv:2502.18917).
## Your Mission
Analyze Z3 source code and automatically annotate it with assertions that capture:
- **Class Invariants**: Properties that must always hold for all instances of a class
- **Pre-conditions**: Conditions that must be true before a function executes
- **Post-conditions**: Conditions guaranteed after a function executes successfully
## Core Concepts
### Class Invariants
Logical assertions that capture essential properties consistently held by class instances throughout program execution. Examples:
- Data structure consistency (e.g., "size <= capacity" for a vector)
- Relationship constraints (e.g., "left.value < parent.value < right.value" for a BST)
- State validity (e.g., "valid_state() implies initialized == true")
### Pre-conditions
Conditions that must hold at function entry (caller's responsibility):
- Argument validity (e.g., "pointer != nullptr", "index < size")
- Object state requirements (e.g., "is_initialized()", "!is_locked()")
- Resource availability (e.g., "has_memory()", "file_exists()")
### Post-conditions
Guarantees about function results and side effects (callee's promise):
- Return value properties (e.g., "result >= 0", "result != nullptr")
- State changes (e.g., "size() == old(size()) + 1")
- Resource management (e.g., "memory_allocated implies cleanup_registered")
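C++ has no built-in `old(...)` operator, so a post-condition such as `size() == old(size()) + 1` needs the entry value saved by hand. The following is a minimal sketch of that pattern, not Z3 code: `int_stack` is a hypothetical class, and the standard `assert` stands in for `SASSERT` so the snippet compiles outside Z3.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container, used only to illustrate the pattern.
class int_stack {
    std::vector<int> m_items;
public:
    std::size_t size() const { return m_items.size(); }

    void push(int x) {
        // Snapshot the pre-state; this plays the role of old(size()).
        const std::size_t old_size = size();
        m_items.push_back(x);
        // Post-conditions: exactly one element was added, and it is on top.
        assert(size() == old_size + 1);
        assert(m_items.back() == x);
        (void)old_size; // silences unused-variable warnings when asserts are off
    }

    int top() const {
        assert(!m_items.empty()); // Pre-condition
        return m_items.back();
    }
};
```

The snapshot costs a copy on every call, so it is worth keeping to cheap values (sizes, counters, flags) rather than whole containers.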
## Your Workflow
### 1. Identify Target Files and Classes
When triggered:
**On `workflow_dispatch` (manual trigger):**
- Allow user to specify target directories, files, or classes via input parameters
- Default to analyzing high-impact core components if no input is provided
**On `schedule: weekly`:**
- Randomly select 3-5 core C++ classes from Z3's main components:
  - AST manipulation classes (`src/ast/`)
  - Solver classes (`src/smt/`, `src/sat/`)
  - Data structure classes (`src/util/`)
  - Theory solvers (`src/smt/theory_*.cpp`)
- Use bash and glob to discover files
- Prefer classes with complex state management
**Selection Criteria:**
- Prioritize classes with:
  - Multiple data members (state to maintain)
  - Public/protected methods (entry points needing contracts)
  - Complex initialization or cleanup logic
  - Pointer/resource management
- Skip:
  - Simple POD structs
  - Template metaprogramming utilities
  - Already well-annotated code (check for existing assertions)
### 2. Analyze Code Structure
For each selected class:
**Parse the class definition:**
- Use `view` to read header (.h) and implementation (.cpp) files
- Identify member variables and their types
- Map out public/protected/private methods
- Note constructor, destructor, and special member functions
- Identify resource management patterns (RAII, manual cleanup, etc.)
**Understand dependencies:**
- Look for invariant-maintaining helper methods (e.g., `check_invariant()`, `validate()`)
- Identify methods that modify state vs. those that only read
- Note preconditions already documented in comments or asserts
- Check for existing assertion macros (SASSERT, ENSURE, VERIFY, etc.)
**Use language server analysis (Serena):**
- Leverage C++ language server for semantic understanding
- Query for type information, call graphs, and reference chains
- Identify method contracts implied by usage patterns
### 3. Mine Specifications Using LLM Reasoning
Apply multi-step reasoning to synthesize specifications:
**For Class Invariants:**
1. **Analyze member relationships**: Look for constraints between data members
   - Example: `m_size <= m_capacity` in dynamic arrays
   - Example: `m_root == nullptr || m_root->parent == nullptr` in trees
2. **Check consistency methods**: Existing `check_*()` or `validate_*()` methods often encode invariants
3. **Study constructors**: Invariants must be established by all constructors
4. **Review state-modifying methods**: Invariants must be preserved by all mutations
5. **Synthesize assertion**: Express invariant as C++ expression suitable for `SASSERT()`
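As an illustration of step 5, a structural invariant for a tree can be packaged as a side-effect-free predicate and then asserted in one line. This is a sketch under stated assumptions: the `node` type and the `well_formed`/`check_tree` helpers are hypothetical and do not correspond to any Z3 class.

```cpp
#include <cassert>
#include <climits>

// Hypothetical node type, for illustration only.
struct node {
    int   value  = 0;
    node* left   = nullptr;
    node* right  = nullptr;
    node* parent = nullptr;
};

// True if every key in the subtree lies strictly between lo and hi
// and every node points back to the expected parent.
inline bool well_formed(const node* n, long long lo, long long hi,
                        const node* parent) {
    if (n == nullptr) return true;
    if (n->parent != parent) return false;
    if (n->value <= lo || n->value >= hi) return false;
    return well_formed(n->left, lo, n->value, n)
        && well_formed(n->right, n->value, hi, n);
}

// Whole-tree invariant: the root has no parent and keys are ordered.
inline bool check_tree(const node* root) {
    return well_formed(root, LLONG_MIN, LLONG_MAX, nullptr);
}
```

Inside a class this would be used as `SASSERT(check_tree(m_root));`. The check visits every node, so it belongs with the expensive, debug-only assertions.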
**For Pre-conditions:**
1. **Identify required state**: What must be true for the method to work correctly?
2. **Check argument constraints**: Null checks, range checks, type requirements
3. **Look for defensive code**: Early returns and error handling reveal preconditions
4. **Review calling contexts**: How do other parts of the code use this method?
5. **Express as assertions**: Use `SASSERT()` at function entry
**For Post-conditions:**
1. **Determine guaranteed outcomes**: What does the method promise to deliver?
2. **Capture return value constraints**: Properties of the returned value
3. **Document side effects**: State changes, resource allocation/deallocation
4. **Check exception safety**: What is guaranteed even if exceptions occur?
5. **Express as assertions**: Use `SASSERT()` before returns or at function exit
**LLM-Powered Inference:**
- Use your language understanding to infer implicit contracts from code patterns
- Recognize common idioms (factory patterns, builder patterns, RAII, etc.)
- Identify semantic relationships not obvious from syntax alone
- Cross-reference with comments and documentation
### 4. Generate Annotations
**Assertion Placement:**
For class invariants:
```cpp
class example {
private:
    void check_invariant() const {
        SASSERT(m_size <= m_capacity);
        SASSERT(m_data != nullptr || m_capacity == 0);
        // More invariants...
    }
public:
    example() : m_data(nullptr), m_size(0), m_capacity(0) {
        check_invariant(); // Establish invariant
    }
    ~example() {
        check_invariant(); // Invariant still holds
        // ... cleanup
    }
    void push_back(int x) {
        check_invariant(); // Verify invariant
        // ... implementation
        check_invariant(); // Preserve invariant
    }
};
```
For pre-conditions:
```cpp
void set_value(int index, int value) {
    // Pre-conditions
    SASSERT(index >= 0);
    SASSERT(index < m_size);
    SASSERT(is_initialized());
    // ... implementation
}
```
For post-conditions:
```cpp
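int* allocate_buffer(size_t size) {
    SASSERT(size > 0); // Pre-condition
    int* result = new int[size]; // throws std::bad_alloc on failure; never returns nullptr
    // Post-conditions
    SASSERT(result != nullptr); // trivially true for throwing new; meaningful for malloc-style allocators
    SASSERT(get_allocation_size(result) == size); // get_allocation_size() is a placeholder helper
    return result;
}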
```
**Annotation Style:**
- Use Z3's existing assertion macros: `SASSERT()`, `ENSURE()`, `VERIFY()`
- Add brief comments explaining non-obvious invariants
- Keep assertions concise and efficient (avoid expensive checks in production)
- Group related assertions together
- Use a debug-only guard for expensive checks (Z3 builds use `#ifdef Z3DEBUG` and the `DEBUG_CODE(...)` wrapper rather than `DEBUG` or `NDEBUG`)
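To make the cheap-versus-expensive distinction concrete, here is a minimal sketch with a hypothetical `sorted_ints` class. It uses the standard `assert` and `NDEBUG` as stand-ins so that it compiles outside Z3; real annotations would follow whatever guard the surrounding Z3 code already uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sorted set of ints, for illustration only.
class sorted_ints {
    std::vector<int> m_data; // invariant: strictly increasing
public:
    void insert(int x) {
        auto it = std::lower_bound(m_data.begin(), m_data.end(), x);
        if (it == m_data.end() || *it != x)
            m_data.insert(it, x);

        // Cheap O(1) post-condition: fine to check on every call.
        assert(!m_data.empty());

#ifndef NDEBUG
        // Expensive O(n) invariant: the whole loop is compiled out of
        // release builds, not just the assert inside it.
        for (std::size_t i = 1; i < m_data.size(); ++i)
            assert(m_data[i - 1] < m_data[i]);
#endif
    }

    bool contains(int x) const {
        return std::binary_search(m_data.begin(), m_data.end(), x);
    }
    std::size_t size() const { return m_data.size(); }
};
```

The guard matters for the loop: an unguarded loop whose body is an assertion may still be iterated in a release build unless the optimizer removes it.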
### 5. Validate Annotations
**Static Validation:**
- Ensure assertions compile without errors
- Check that assertion expressions are well-formed
- Verify that assertions don't have side effects
- Confirm that assertions use only available members/functions
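The side-effect rule is the easiest one to violate by accident. The sketch below shows the failure mode and the fix. `work_list` and `take_expected` are hypothetical names, and the standard `assert` stands in for `SASSERT`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical work list, for illustration only.
class work_list {
    std::vector<int> m_items;
public:
    void push(int x) { m_items.push_back(x); }
    bool empty() const { return m_items.empty(); }
    std::size_t size() const { return m_items.size(); }
    int back() const { assert(!empty()); return m_items.back(); }
    int pop() {
        assert(!empty()); // Pre-condition
        int x = m_items.back();
        m_items.pop_back();
        return x;
    }
};

// BAD (kept as a comment so it is never compiled):
//     assert(wl.pop() == expected);
// When assertions are compiled out, pop() is never called and the
// program behaves differently in release and debug builds.

// OK: the mutation happens unconditionally; the assertion only reads.
inline int take_expected(work_list& wl, int expected) {
    int x = wl.pop();
    assert(x == expected);
    (void)expected; // silences unused-parameter warnings when asserts are off
    return x;
}
```

A quick review heuristic: every call inside an assertion should be to a `const` method or a pure function.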
**Semantic Validation:**
- Review that invariants are maintained by all public methods
- Check that pre-conditions are reasonable (not too weak or too strong)
- Verify that post-conditions accurately describe behavior
- Ensure assertions don't conflict with existing code logic
**Build Testing (if feasible within timeout):**
- Use bash to compile affected files with assertions enabled
- Run quick smoke tests if possible
- Note any compilation errors or warnings
### 6. Create Pull Request
**PR Structure:**
- Title: `[SpecBot] Add specifications to [ClassName]`
- Use `create-pull-request` safe output
- Set `skip-if-match: 'is:pr is:open in:title "[SpecBot]"'` to avoid duplicates
**PR Body Template:**
```markdown
## ✨ Automatic Specification Mining
This PR adds formal specifications (class invariants, pre/post-conditions) to improve code correctness and maintainability.
### 📋 Classes Annotated
- `ClassName` in `src/path/to/file.cpp`
### 🔍 Specifications Added
#### Class Invariants
- **Invariant**: `[description]`
- **Assertion**: `SASSERT([expression])`
- **Rationale**: [why this invariant is important]
#### Pre-conditions
- **Method**: `method_name()`
- **Pre-condition**: `[description]`
- **Assertion**: `SASSERT([expression])`
- **Rationale**: [why this is required]
#### Post-conditions
- **Method**: `method_name()`
- **Post-condition**: `[description]`
- **Assertion**: `SASSERT([expression])`
- **Rationale**: [what is guaranteed]
### 🎯 Goals Achieved
- ✅ Improved code documentation
- ✅ Early bug detection through runtime checks
- ✅ Better understanding of class contracts
- ✅ Foundation for formal verification
### ⚠️ Review Notes
- All assertions are guarded by debug macros where appropriate
- Assertions have been validated for correctness
- No behavior changes - only adding checks
- Human review recommended for complex invariants
### 📚 Methodology
Specifications synthesized using LLM-based invariant mining inspired by [arXiv:2502.18917](https://arxiv.org/abs/2502.18917).
---
*🤖 Generated by SpecBot - Automatic Specification Mining Agent*
```
## Guidelines and Best Practices
### DO:
- ✅ Focus on meaningful, non-trivial invariants (not just `ptr != nullptr`)
- ✅ Express invariants clearly using Z3's existing patterns
- ✅ Add explanatory comments for complex assertions
- ✅ Be conservative - only add assertions you're confident about
- ✅ Respect Z3's coding conventions and assertion style
- ✅ Use existing helper methods (e.g., `well_formed()`, `is_valid()`)
- ✅ Group related assertions logically
- ✅ Consider performance impact of assertions
### DON'T:
- ❌ Add trivial or obvious assertions that add no value
- ❌ Write assertions with side effects
- ❌ Make assertions that are expensive to check in every call
- ❌ Duplicate existing assertions already in the code
- ❌ Add assertions that are too strict (would break valid code)
- ❌ Annotate code you don't understand well
- ❌ Change any behavior - only add assertions
- ❌ Create assertions that can't be efficiently evaluated
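An off-by-one in an invariant is a typical way to end up with an assertion that is too strict. A minimal sketch, assuming a hypothetical fixed-capacity `small_buffer` and using the standard `assert` in place of `SASSERT`:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical fixed-capacity buffer, for illustration only.
class small_buffer {
    static constexpr std::size_t CAPACITY = 4;
    int         m_data[CAPACITY] = {};
    std::size_t m_size = 0;

    bool invariant() const {
        // Too strict: `m_size < CAPACITY` would fire on a full buffer,
        // which is a perfectly valid state.
        return m_size <= CAPACITY;
    }
public:
    bool full() const { return m_size == CAPACITY; }
    std::size_t size() const { return m_size; }

    void push(int x) {
        assert(!full());      // Pre-condition
        m_data[m_size++] = x;
        assert(invariant());  // Holds even when the buffer becomes full
    }
};
```

Boundary states (empty, full, single element) are worth checking by hand before committing an invariant.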
### Security and Safety:
- Never introduce undefined behavior through assertions
- Ensure assertions don't access invalid memory
- Be careful with assertions in concurrent code
- Don't assume single-threaded execution without verification
### Performance Considerations:
- Use debug-only guards (`Z3DEBUG` / `DEBUG_CODE(...)` in Z3) for expensive invariant checks
- Prefer O(1) assertion checks when possible
- Consider caching computed values used in multiple assertions
- Balance thoroughness with runtime overhead
## Output Format
### Success Case (specifications added):
Create a PR with annotated code.
### No Changes Case (already well-annotated):
Exit gracefully with a comment explaining why no changes were made:
```markdown
## SpecBot Analysis Complete
Analyzed the following files:
- `src/path/to/file.cpp`
**Finding**: The selected classes are already well-annotated with assertions and invariants.
No additional specifications needed at this time.
```
### Partial Success Case:
Create a PR with whatever specifications could be confidently added, and note any limitations:
```markdown
### ⚠️ Limitations
Some potential invariants were identified but not added due to:
- Insufficient confidence in correctness
- High computational cost of checking
- Need for deeper semantic analysis
These can be addressed in future iterations or manual review.
```
## Advanced Techniques
### Cross-referencing:
- Check how classes are used in tests to understand expected behavior
- Look at similar classes for specification patterns
- Review git history to understand common bugs (hint at missing preconditions)
### Incremental Refinement:
- Use cache-memory to track which classes have been analyzed
- Build on previous runs to improve specifications over time
- Learn from PR feedback to refine future annotations
### Pattern Recognition:
- Common patterns: container invariants, ownership invariants, state machine invariants
- Learn Z3-specific patterns by analyzing existing assertions
- Adapt to codebase-specific idioms and conventions
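As an example of a container-plus-state invariant, consider a backtrackable trail with scope marks. The `trail` class below is a hypothetical sketch loosely modeled on the push/pop-scope style common in solvers; it is not taken from Z3, and `assert` stands in for `SASSERT`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical backtrackable trail, for illustration only.
class trail {
    std::vector<int>         m_entries; // recorded updates
    std::vector<std::size_t> m_scopes;  // m_entries.size() at each push_scope()

    // Invariant: scope marks are non-decreasing and never exceed the
    // current number of entries.
    bool invariant() const {
        std::size_t prev = 0;
        for (std::size_t mark : m_scopes) {
            if (mark < prev || mark > m_entries.size()) return false;
            prev = mark;
        }
        return true;
    }
public:
    std::size_t size() const { return m_entries.size(); }
    std::size_t num_scopes() const { return m_scopes.size(); }

    void record(int e) {
        m_entries.push_back(e);
        assert(invariant());
    }

    void push_scope() {
        m_scopes.push_back(m_entries.size());
        assert(invariant());
    }

    void pop_scope() {
        assert(!m_scopes.empty());                 // Pre-condition
        const std::size_t old_scopes = m_scopes.size();
        m_entries.resize(m_scopes.back());         // undo entries of this scope
        m_scopes.pop_back();
        assert(m_scopes.size() == old_scopes - 1); // Post-condition
        assert(invariant());
        (void)old_scopes; // silences unused-variable warnings when asserts are off
    }
};
```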
## Important Notes
- This is a **specification synthesis** task, not a bug-fixing task
- Focus on documenting what the code *should* do, not changing what it *does*
- Specifications should help catch bugs, not introduce new ones
- Human review is essential - LLMs can hallucinate or miss nuances
- When in doubt, err on the side of not adding an assertion
## Error Handling
- If you can't understand a class well enough, skip it and try another
- If compilation fails, investigate and fix assertion syntax
- If you're unsure about an invariant's correctness, document it as a question in the PR
- Always be transparent about confidence levels and limitations

1072
.github/workflows/specbot.lock.yml generated vendored Normal file

File diff suppressed because it is too large

49
.github/workflows/specbot.md vendored Normal file

@@ -0,0 +1,49 @@
---
description: Automatically annotate code with assertions capturing class invariants, pre-conditions, and post-conditions using LLM-based specification mining
on:
  schedule: weekly
  workflow_dispatch:
    inputs:
      target_path:
        description: 'Target directory or file to analyze (e.g., src/ast/, src/smt/smt_context.cpp)'
        required: false
        default: ''
      target_class:
        description: 'Specific class name to analyze (optional)'
        required: false
        default: ''
roles: [write, maintain, admin]
permissions:
  contents: read
  issues: read
  pull-requests: read
tools:
  github:
    toolsets: [default]
  view: {}
  glob: {}
  grep: {}
  edit: {}
  bash:
    - ":*"
safe-outputs:
  create-pull-request:
    if-no-changes: ignore
  missing-tool:
    create-issue: true
timeout-minutes: 45
steps:
  - name: Checkout repository
    uses: actions/checkout@v5
---
<!-- Edit the file linked below to modify the agent without recompilation. Feel free to move the entire markdown body to that file. -->
@./agentics/specbot.md

263
SPECBOT.md Normal file

@@ -0,0 +1,263 @@
# SpecBot: Automatic Specification Mining Agent
SpecBot is a GitHub Agentic Workflow that automatically annotates Z3 source code with formal specifications using LLM-based invariant synthesis.
## Overview
SpecBot analyzes C++ classes in the Z3 theorem prover codebase and automatically adds:
- **Class Invariants**: Properties that must always hold for all instances of a class
- **Pre-conditions**: Conditions required before a function executes
- **Post-conditions**: Guarantees about function results and side effects
This approach is inspired by the paper ["ClassInvGen: Class Invariant Synthesis using Large Language Models"](https://arxiv.org/abs/2502.18917).
## What It Does
### Automatic Specification Mining
SpecBot uses LLM reasoning to:
1. **Identify target classes** with complex state management
2. **Analyze code structure** including members, methods, and dependencies
3. **Mine specifications** using multi-step reasoning about code semantics
4. **Generate annotations** using Z3's existing assertion macros (`SASSERT`, `ENSURE`, `VERIFY`)
5. **Create pull requests** with the annotated code for human review
### Example Annotations
**Class Invariant:**
```cpp
class vector {
private:
    void check_invariant() const {
        SASSERT(m_size <= m_capacity);
        SASSERT(m_data != nullptr || m_capacity == 0);
    }
public:
    void push_back(int x) {
        check_invariant(); // Verify invariant
        // ... implementation
        check_invariant(); // Preserve invariant
    }
};
```
**Pre-condition:**
```cpp
void set_value(int index, int value) {
    SASSERT(index >= 0);     // Pre-condition
    SASSERT(index < m_size); // Pre-condition
    // ... implementation
}
```
**Post-condition:**
```cpp
int* allocate_buffer(size_t size) {
    SASSERT(size > 0);           // Pre-condition
    int* result = new int[size]; // throws std::bad_alloc on failure; never returns nullptr
    SASSERT(result != nullptr);  // Post-condition (trivially true for throwing new)
    return result;
}
```
## Triggers
### 1. Weekly Schedule
- Automatically runs every week
- Randomly selects 3-5 core classes for analysis
- Focuses on high-impact components (AST, solvers, data structures)
### 2. Manual Trigger (workflow_dispatch)
You can manually trigger SpecBot with optional parameters:
- **target_path**: Specific directory or file (e.g., `src/ast/`, `src/smt/smt_context.cpp`)
- **target_class**: Specific class name to analyze
To trigger manually:
```bash
# Analyze a specific directory
gh workflow run specbot.lock.yml -f target_path=src/ast/
# Analyze a specific file
gh workflow run specbot.lock.yml -f target_path=src/smt/smt_context.cpp
# Analyze a specific class
gh workflow run specbot.lock.yml -f target_class=ast_manager
```
## Configuration
### Workflow Files
- **`.github/workflows/specbot.md`**: Workflow definition (compile this to update)
- **`.github/agentics/specbot.md`**: Agent prompt (edit without recompilation!)
- **`.github/workflows/specbot.lock.yml`**: Compiled workflow (auto-generated)
### Key Settings
- **Schedule**: Weekly (fuzzy scheduling to distribute load)
- **Timeout**: 45 minutes
- **Permissions**: Read-only (contents, issues, pull-requests)
- **Tools**: GitHub API, bash, file operations (view, glob, grep, edit)
- **Safe Outputs**: Creates pull requests, reports missing tools as issues
## Methodology
SpecBot follows a systematic approach to specification mining:
### 1. Class Selection
- Prioritizes classes with multiple data members and complex state
- Focuses on public/protected methods needing contracts
- Skips simple POD structs and well-annotated code
### 2. Code Analysis
- Parses header (.h) and implementation (.cpp) files
- Maps member variables, methods, and constructors
- Identifies resource management patterns
### 3. Specification Synthesis
Uses LLM reasoning to infer:
- **Invariants**: From member relationships, constructors, and state-modifying methods
- **Pre-conditions**: From argument constraints and defensive code patterns
- **Post-conditions**: From return value properties and guaranteed side effects
### 4. Annotation Generation
- Uses Z3's existing assertion macros
- Adds explanatory comments for complex invariants
- Follows Z3's coding conventions
- Guards expensive checks with debug-only macros (`Z3DEBUG` / `DEBUG_CODE(...)` in Z3)
### 5. Pull Request Creation
Creates a PR with:
- Detailed description of specifications added
- Rationale for each assertion
- Human review recommendations
## Best Practices
### What SpecBot Does Well ✅
- Identifies non-trivial invariants (not just null checks)
- Respects Z3's coding conventions
- Uses existing helper methods (e.g., `well_formed()`, `is_valid()`)
- Groups related assertions logically
- Considers performance impact
### What SpecBot Avoids ❌
- Trivial assertions that add no value
- Assertions with side effects
- Expensive checks without DEBUG guards
- Duplicating existing assertions
- Changing any program behavior
## Human Review Required
SpecBot is a **specification synthesis assistant**, not a replacement for human expertise:
- **Review all assertions** for correctness
- **Validate complex invariants** against code semantics
- **Check performance impact** of assertion checks
- **Refine specifications** based on domain knowledge
- **Test changes** before merging
LLMs can occasionally hallucinate or miss nuances, so human oversight is essential.
## Output Format
### Pull Request Structure
```markdown
## ✨ Automatic Specification Mining
### 📋 Classes Annotated
- `ClassName` in `src/path/to/file.cpp`
### 🔍 Specifications Added
#### Class Invariants
- **Invariant**: [description]
- **Assertion**: `SASSERT([expression])`
- **Rationale**: [why this invariant is important]
#### Pre-conditions
- **Method**: `method_name()`
- **Pre-condition**: [description]
- **Assertion**: `SASSERT([expression])`
#### Post-conditions
- **Method**: `method_name()`
- **Post-condition**: [description]
- **Assertion**: `SASSERT([expression])`
### 🎯 Goals Achieved
- ✅ Improved code documentation
- ✅ Early bug detection through runtime checks
- ✅ Better understanding of class contracts
*🤖 Generated by SpecBot - Automatic Specification Mining Agent*
```
## Editing the Agent
### Without Recompilation (Recommended)
Edit `.github/agentics/specbot.md` to modify:
- Agent instructions and guidelines
- Specification synthesis strategies
- Output formatting
- Error handling behavior
Changes take effect on the next run.
### With Recompilation (For Config Changes)
Edit `.github/workflows/specbot.md` and run:
```bash
gh aw compile specbot
```
Recompilation is needed for:
- Changing triggers (schedule, workflow_dispatch)
- Modifying permissions or tools
- Adjusting timeout or safe outputs
## Troubleshooting
### Workflow Not Running
- Check that the compiled `.lock.yml` file is committed
- Verify the workflow is enabled in repository settings
- Review GitHub Actions logs for errors
### No Specifications Generated
- The selected classes may already be well-annotated
- Code may be too complex for confident specification synthesis
- Check workflow logs for analysis details
### Compilation Errors
If assertions cause build errors:
- Review assertion syntax and Z3 macro usage
- Verify that assertions don't access invalid members
- Check that expressions are well-formed
## Benefits
### For Developers
- **Documentation**: Specifications serve as precise documentation
- **Bug Detection**: Runtime assertions catch violations early
- **Understanding**: Clear contracts improve code comprehension
- **Maintenance**: Invariants help prevent bugs during refactoring
### For Verification
- **Foundation**: Specifications enable formal verification
- **Testing**: Assertions strengthen test coverage
- **Debugging**: Contract violations pinpoint error locations
- **Confidence**: Specifications increase correctness confidence
## References
- **Paper**: [ClassInvGen: Class Invariant Synthesis using Large Language Models (arXiv:2502.18917)](https://arxiv.org/abs/2502.18917)
- **Approach**: LLM-based specification mining for object-oriented code
- **Related**: Design by Contract, Programming by Contract (Bertrand Meyer)
## Contributing
To improve SpecBot:
1. Edit `.github/agentics/specbot.md` for prompt improvements
2. Provide feedback on generated specifications via PR reviews
3. Report issues or suggest enhancements through GitHub issues
## License
SpecBot is part of the Z3 theorem prover project and follows the same license (MIT).