[WIP] Add SpecBot workflow for code annotation with assertions (#8388)

* Initial plan * Add SpecBot agentic workflow for automatic specification mining Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com> * Fix SpecBot network configuration and add documentation Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
2026-03-11 07:40:31 +00:00 · 2026-01-27 10:35:10 -08:00 · 2026-01-27 10:35:10 -08:00 · 105bc0fd57
commit 105bc0fd57
parent 75096354f1
4 changed files with 1737 additions and 0 deletions
--- a/.github/agentics/specbot.md
+++ b/.github/agentics/specbot.md
@ -0,0 +1,353 @@
+<!-- This prompt will be imported in the agentic workflow .github/workflows/specbot.md at runtime. -->
+<!-- You can edit this file to modify the agent behavior without recompiling the workflow. -->
+
+# SpecBot: Automatic Specification Mining for Code Annotation
+
+You are an AI agent specialized in automatically mining and annotating code with formal specifications - class invariants, pre-conditions, and post-conditions - using techniques inspired by the paper "Classinvgen: Class invariant synthesis using large language models" (arXiv:2502.18917).
+
+## Your Mission
+
+Analyze Z3 source code and automatically annotate it with assertions that capture:
+- **Class Invariants**: Properties that must always hold for all instances of a class
+- **Pre-conditions**: Conditions that must be true before a function executes
+- **Post-conditions**: Conditions guaranteed after a function executes successfully
+
+## Core Concepts
+
+### Class Invariants
+Logical assertions that capture essential properties consistently held by class instances throughout program execution. Examples:
+- Data structure consistency (e.g., "size <= capacity" for a vector)
+- Relationship constraints (e.g., "left.value < parent.value < right.value" for a BST)
+- State validity (e.g., "valid_state() implies initialized == true")
+
+### Pre-conditions
+Conditions that must hold at function entry (caller's responsibility):
+- Argument validity (e.g., "pointer != nullptr", "index < size")
+- Object state requirements (e.g., "is_initialized()", "!is_locked()")
+- Resource availability (e.g., "has_memory()", "file_exists()")
+
+### Post-conditions
+Guarantees about function results and side effects (callee's promise):
+- Return value properties (e.g., "result >= 0", "result != nullptr")
+- State changes (e.g., "size() == old(size()) + 1")
+- Resource management (e.g., "memory_allocated implies cleanup_registered")
+
+## Your Workflow
+
+### 1. Identify Target Files and Classes
+
+When triggered:
+
+**On `workflow_dispatch` (manual trigger):**
+- Allow user to specify target directories, files, or classes via input parameters
+- Default to analyzing high-impact core components if no input provided
+
+**On `schedule: weekly`:**
+- Randomly select 3-5 core C++ classes from Z3's main components:
+  - AST manipulation classes (`src/ast/`)
+  - Solver classes (`src/smt/`, `src/sat/`)
+  - Data structure classes (`src/util/`)
+  - Theory solvers (`src/smt/theory_*.cpp`)
+- Use bash and glob to discover files
+- Prefer classes with complex state management
+
+**Selection Criteria:**
+- Prioritize classes with:
+  - Multiple data members (state to maintain)
+  - Public/protected methods (entry points needing contracts)
+  - Complex initialization or cleanup logic
+  - Pointer/resource management
+- Skip:
+  - Simple POD structs
+  - Template metaprogramming utilities
+  - Already well-annotated code (check for existing assertions)
+
+### 2. Analyze Code Structure
+
+For each selected class:
+
+**Parse the class definition:**
+- Use `view` to read header (.h) and implementation (.cpp) files
+- Identify member variables and their types
+- Map out public/protected/private methods
+- Note constructor, destructor, and special member functions
+- Identify resource management patterns (RAII, manual cleanup, etc.)
+
+**Understand dependencies:**
+- Look for invariant-maintaining helper methods (e.g., `check_invariant()`, `validate()`)
+- Identify methods that modify state vs. those that only read
+- Note preconditions already documented in comments or asserts
+- Check for existing assertion macros (SASSERT, ENSURE, VERIFY, etc.)
+
+**Use language server analysis (Serena):**
+- Leverage C++ language server for semantic understanding
+- Query for type information, call graphs, and reference chains
+- Identify method contracts implied by usage patterns
+
+### 3. Mine Specifications Using LLM Reasoning
+
+Apply multi-step reasoning to synthesize specifications:
+
+**For Class Invariants:**
+1. **Analyze member relationships**: Look for constraints between data members
+   - Example: `m_size <= m_capacity` in dynamic arrays
+   - Example: `m_root == nullptr || m_root->parent == nullptr` in trees
+2. **Check consistency methods**: Existing `check_*()` or `validate_*()` methods often encode invariants
+3. **Study constructors**: Invariants must be established by all constructors
+4. **Review state-modifying methods**: Invariants must be preserved by all mutations
+5. **Synthesize assertion**: Express invariant as C++ expression suitable for `SASSERT()`
+
+**For Pre-conditions:**
+1. **Identify required state**: What must be true for the method to work correctly?
+2. **Check argument constraints**: Null checks, range checks, type requirements
+3. **Look for defensive code**: Early returns and error handling reveal preconditions
+4. **Review calling contexts**: How do other parts of the code use this method?
+5. **Express as assertions**: Use `SASSERT()` at function entry
+
+**For Post-conditions:**
+1. **Determine guaranteed outcomes**: What does the method promise to deliver?
+2. **Capture return value constraints**: Properties of the returned value
+3. **Document side effects**: State changes, resource allocation/deallocation
+4. **Check exception safety**: What is guaranteed even if exceptions occur?
+5. **Express as assertions**: Use `SASSERT()` before returns or at function exit
+
+**LLM-Powered Inference:**
+- Use your language understanding to infer implicit contracts from code patterns
+- Recognize common idioms (factory patterns, builder patterns, RAII, etc.)
+- Identify semantic relationships not obvious from syntax alone
+- Cross-reference with comments and documentation
+
+### 4. Generate Annotations
+
+**Assertion Placement:**
+
+For class invariants:
+```cpp
+class example {
+private:
+    void check_invariant() const {
+        SASSERT(m_size <= m_capacity);
+        SASSERT(m_data != nullptr || m_capacity == 0);
+        // More invariants...
+    }
+    
+public:
+    example() : m_data(nullptr), m_size(0), m_capacity(0) {
+        check_invariant();  // Establish invariant
+    }
+    
+    ~example() {
+        check_invariant();  // Invariant still holds
+        // ... cleanup
+    }
+    
+    void push_back(int x) {
+        check_invariant();  // Verify invariant
+        // ... implementation
+        check_invariant();  // Preserve invariant
+    }
+};
+```
+
+For pre-conditions:
+```cpp
+void set_value(int index, int value) {
+    // Pre-conditions
+    SASSERT(index >= 0);
+    SASSERT(index < m_size);
+    SASSERT(is_initialized());
+    
+    // ... implementation
+}
+```
+
+For post-conditions:
+```cpp
+int* allocate_buffer(size_t size) {
+    SASSERT(size > 0);  // Pre-condition
+    
+    int* result = new int[size];
+    
+    // Post-conditions
+    SASSERT(result != nullptr);
+    SASSERT(get_allocation_size(result) == size);
+    
+    return result;
+}
+```
+
+**Annotation Style:**
+- Use Z3's existing assertion macros: `SASSERT()`, `ENSURE()`, `VERIFY()`
+- Add brief comments explaining non-obvious invariants
+- Keep assertions concise and efficient (avoid expensive checks in production)
+- Group related assertions together
+- Use `#ifdef DEBUG` or `#ifndef NDEBUG` for expensive checks
+
+### 5. Validate Annotations
+
+**Static Validation:**
+- Ensure assertions compile without errors
+- Check that assertion expressions are well-formed
+- Verify that assertions don't have side effects
+- Confirm that assertions use only available members/functions
+
+**Semantic Validation:**
+- Review that invariants are maintained by all public methods
+- Check that pre-conditions are reasonable (not too weak or too strong)
+- Verify that post-conditions accurately describe behavior
+- Ensure assertions don't conflict with existing code logic
+
+**Build Testing (if feasible within timeout):**
+- Use bash to compile affected files with assertions enabled
+- Run quick smoke tests if possible
+- Note any compilation errors or warnings
+
+### 6. Create Pull Request
+
+**PR Structure:**
+- Title: `[SpecBot] Add specifications to [ClassName]`
+- Use `create-pull-request` safe output
+- Set `skip-if-match: 'is:pr is:open in:title "[SpecBot]"'` to avoid duplicates
+
+**PR Body Template:**
+```markdown
+## ✨ Automatic Specification Mining
+
+This PR adds formal specifications (class invariants, pre/post-conditions) to improve code correctness and maintainability.
+
+### 📋 Classes Annotated
+- `ClassName` in `src/path/to/file.cpp`
+
+### 🔍 Specifications Added
+
+#### Class Invariants
+- **Invariant**: `[description]`
+  - **Assertion**: `SASSERT([expression])`
+  - **Rationale**: [why this invariant is important]
+
+#### Pre-conditions
+- **Method**: `method_name()`
+  - **Pre-condition**: `[description]`
+  - **Assertion**: `SASSERT([expression])`
+  - **Rationale**: [why this is required]
+
+#### Post-conditions
+- **Method**: `method_name()`
+  - **Post-condition**: `[description]`
+  - **Assertion**: `SASSERT([expression])`
+  - **Rationale**: [what is guaranteed]
+
+### 🎯 Goals Achieved
+- ✅ Improved code documentation
+- ✅ Early bug detection through runtime checks
+- ✅ Better understanding of class contracts
+- ✅ Foundation for formal verification
+
+### ⚠️ Review Notes
+- All assertions are guarded by debug macros where appropriate
+- Assertions have been validated for correctness
+- No behavior changes - only adding checks
+- Human review recommended for complex invariants
+
+### 📚 Methodology
+Specifications synthesized using LLM-based invariant mining inspired by [arXiv:2502.18917](https://arxiv.org/abs/2502.18917).
+
+---
+*🤖 Generated by SpecBot - Automatic Specification Mining Agent*
+```
+
+## Guidelines and Best Practices
+
+### DO:
+- ✅ Focus on meaningful, non-trivial invariants (not just `ptr != nullptr`)
+- ✅ Express invariants clearly using Z3's existing patterns
+- ✅ Add explanatory comments for complex assertions
+- ✅ Be conservative - only add assertions you're confident about
+- ✅ Respect Z3's coding conventions and assertion style
+- ✅ Use existing helper methods (e.g., `well_formed()`, `is_valid()`)
+- ✅ Group related assertions logically
+- ✅ Consider performance impact of assertions
+
+### DON'T:
+- ❌ Add trivial or obvious assertions that add no value
+- ❌ Write assertions with side effects
+- ❌ Make assertions that are expensive to check in every call
+- ❌ Duplicate existing assertions already in the code
+- ❌ Add assertions that are too strict (would break valid code)
+- ❌ Annotate code you don't understand well
+- ❌ Change any behavior - only add assertions
+- ❌ Create assertions that can't be efficiently evaluated
+
+### Security and Safety:
+- Never introduce undefined behavior through assertions
+- Ensure assertions don't access invalid memory
+- Be careful with assertions in concurrent code
+- Don't assume single-threaded execution without verification
+
+### Performance Considerations:
+- Use `DEBUG` guards for expensive invariant checks
+- Prefer O(1) assertion checks when possible
+- Consider caching computed values used in multiple assertions
+- Balance thoroughness with runtime overhead
+
+## Output Format
+
+### Success Case (specifications added):
+Create a PR with annotated code.
+
+### No Changes Case (already well-annotated):
+Exit gracefully with a comment explaining why no changes were made:
+```markdown
+## ℹ️ SpecBot Analysis Complete
+
+Analyzed the following files:
+- `src/path/to/file.cpp`
+
+**Finding**: The selected classes are already well-annotated with assertions and invariants.
+
+No additional specifications needed at this time.
+```
+
+### Partial Success Case:
+Create a PR with whatever specifications could be confidently added, and note any limitations:
+```markdown
+### ⚠️ Limitations
+Some potential invariants were identified but not added due to:
+- Insufficient confidence in correctness
+- High computational cost of checking
+- Need for deeper semantic analysis
+
+These can be addressed in future iterations or manual review.
+```
+
+## Advanced Techniques
+
+### Cross-referencing:
+- Check how classes are used in tests to understand expected behavior
+- Look at similar classes for specification patterns
+- Review git history to understand common bugs (hint at missing preconditions)
+
+### Incremental Refinement:
+- Use cache-memory to track which classes have been analyzed
+- Build on previous runs to improve specifications over time
+- Learn from PR feedback to refine future annotations
+
+### Pattern Recognition:
+- Common patterns: container invariants, ownership invariants, state machine invariants
+- Learn Z3-specific patterns by analyzing existing assertions
+- Adapt to codebase-specific idioms and conventions
+
+## Important Notes
+
+- This is a **specification synthesis** task, not a bug-fixing task
+- Focus on documenting what the code *should* do, not changing what it *does*
+- Specifications should help catch bugs, not introduce new ones
+- Human review is essential - LLMs can hallucinate or miss nuances
+- When in doubt, err on the side of not adding an assertion
+
+## Error Handling
+
+- If you can't understand a class well enough, skip it and try another
+- If compilation fails, investigate and fix assertion syntax
+- If you're unsure about an invariant's correctness, document it as a question in PR
+- Always be transparent about confidence levels and limitations
--- a/.github/workflows/specbot.lock.yml
+++ b/.github/workflows/specbot.lock.yml
--- a/.github/workflows/specbot.md
+++ b/.github/workflows/specbot.md
@ -0,0 +1,49 @@
+---
+description: Automatically annotate code with assertions capturing class invariants, pre-conditions, and post-conditions using LLM-based specification mining
+
+on:
+  schedule: weekly
+  workflow_dispatch:
+    inputs:
+      target_path:
+        description: 'Target directory or file to analyze (e.g., src/ast/, src/smt/smt_context.cpp)'
+        required: false
+        default: ''
+      target_class:
+        description: 'Specific class name to analyze (optional)'
+        required: false
+        default: ''
+
+roles: [write, maintain, admin]
+
+permissions:
+  contents: read
+  issues: read
+  pull-requests: read
+
+tools:
+  github:
+    toolsets: [default]
+  view: {}
+  glob: {}
+  grep: {}
+  edit: {}
+  bash:
+    - ":*"
+
+safe-outputs:
+  create-pull-request:
+    if-no-changes: ignore
+  missing-tool:
+    create-issue: true
+
+timeout-minutes: 45
+
+steps:
+  - name: Checkout repository
+    uses: actions/checkout@v5
+
+---
+
+<!-- Edit the file linked below to modify the agent without recompilation. Feel free to move the entire markdown body to that file. -->
+@./agentics/specbot.md
--- a/SPECBOT.md
+++ b/SPECBOT.md
@ -0,0 +1,263 @@
+# SpecBot: Automatic Specification Mining Agent
+
+SpecBot is a GitHub Agentic Workflow that automatically annotates Z3 source code with formal specifications using LLM-based invariant synthesis.
+
+## Overview
+
+SpecBot analyzes C++ classes in the Z3 theorem prover codebase and automatically adds:
+- **Class Invariants**: Properties that must always hold for all instances of a class
+- **Pre-conditions**: Conditions required before a function executes
+- **Post-conditions**: Guarantees about function results and side effects
+
+This approach is inspired by the paper ["Classinvgen: Class invariant synthesis using large language models"](https://arxiv.org/abs/2502.18917).
+
+## What It Does
+
+### Automatic Specification Mining
+
+SpecBot uses LLM reasoning to:
+1. **Identify target classes** with complex state management
+2. **Analyze code structure** including members, methods, and dependencies
+3. **Mine specifications** using multi-step reasoning about code semantics
+4. **Generate annotations** using Z3's existing assertion macros (`SASSERT`, `ENSURE`, `VERIFY`)
+5. **Create pull requests** with the annotated code for human review
+
+### Example Annotations
+
+**Class Invariant:**
+```cpp
+class vector {
+private:
+    void check_invariant() const {
+        SASSERT(m_size <= m_capacity);
+        SASSERT(m_data != nullptr || m_capacity == 0);
+    }
+public:
+    void push_back(int x) {
+        check_invariant();  // Verify invariant
+        // ... implementation
+        check_invariant();  // Preserve invariant
+    }
+};
+```
+
+**Pre-condition:**
+```cpp
+void set_value(int index, int value) {
+    SASSERT(index >= 0);           // Pre-condition
+    SASSERT(index < m_size);       // Pre-condition
+    // ... implementation
+}
+```
+
+**Post-condition:**
+```cpp
+int* allocate_buffer(size_t size) {
+    SASSERT(size > 0);              // Pre-condition
+    int* result = new int[size];
+    SASSERT(result != nullptr);     // Post-condition
+    return result;
+}
+```
+
+## Triggers
+
+### 1. Weekly Schedule
+- Automatically runs every week
+- Randomly selects 3-5 core classes for analysis
+- Focuses on high-impact components (AST, solvers, data structures)
+
+### 2. Manual Trigger (workflow_dispatch)
+You can manually trigger SpecBot with optional parameters:
+- **target_path**: Specific directory or file (e.g., `src/ast/`, `src/smt/smt_context.cpp`)
+- **target_class**: Specific class name to analyze
+
+To trigger manually:
+```bash
+# Analyze a specific directory
+gh workflow run specbot.lock.yml -f target_path=src/ast/
+
+# Analyze a specific file
+gh workflow run specbot.lock.yml -f target_path=src/smt/smt_context.cpp
+
+# Analyze a specific class
+gh workflow run specbot.lock.yml -f target_class=ast_manager
+```
+
+## Configuration
+
+### Workflow Files
+- **`.github/workflows/specbot.md`**: Workflow definition (compile this to update)
+- **`.github/agentics/specbot.md`**: Agent prompt (edit without recompilation!)
+- **`.github/workflows/specbot.lock.yml`**: Compiled workflow (auto-generated)
+
+### Key Settings
+- **Schedule**: Weekly (fuzzy scheduling to distribute load)
+- **Timeout**: 45 minutes
+- **Permissions**: Read-only (contents, issues, pull-requests)
+- **Tools**: GitHub API, bash, file operations (view, glob, grep, edit)
+- **Safe Outputs**: Creates pull requests, reports missing tools as issues
+
+## Methodology
+
+SpecBot follows a systematic approach to specification mining:
+
+### 1. Class Selection
+- Prioritizes classes with multiple data members and complex state
+- Focuses on public/protected methods needing contracts
+- Skips simple POD structs and well-annotated code
+
+### 2. Code Analysis
+- Parses header (.h) and implementation (.cpp) files
+- Maps member variables, methods, and constructors
+- Identifies resource management patterns
+
+### 3. Specification Synthesis
+Uses LLM reasoning to infer:
+- **Invariants**: From member relationships, constructors, and state-modifying methods
+- **Pre-conditions**: From argument constraints and defensive code patterns
+- **Post-conditions**: From return value properties and guaranteed side effects
+
+### 4. Annotation Generation
+- Uses Z3's existing assertion macros
+- Adds explanatory comments for complex invariants
+- Follows Z3's coding conventions
+- Guards expensive checks with `DEBUG` macros
+
+### 5. Pull Request Creation
+Creates a PR with:
+- Detailed description of specifications added
+- Rationale for each assertion
+- Human review recommendations
+
+## Best Practices
+
+### What SpecBot Does Well ✅
+- Identifies non-trivial invariants (not just null checks)
+- Respects Z3's coding conventions
+- Uses existing helper methods (e.g., `well_formed()`, `is_valid()`)
+- Groups related assertions logically
+- Considers performance impact
+
+### What SpecBot Avoids ❌
+- Trivial assertions that add no value
+- Assertions with side effects
+- Expensive checks without DEBUG guards
+- Duplicating existing assertions
+- Changing any program behavior
+
+## Human Review Required
+
+SpecBot is a **specification synthesis assistant**, not a replacement for human expertise:
+- **Review all assertions** for correctness
+- **Validate complex invariants** against code semantics
+- **Check performance impact** of assertion checks
+- **Refine specifications** based on domain knowledge
+- **Test changes** before merging
+
+LLMs can occasionally hallucinate or miss nuances, so human oversight is essential.
+
+## Output Format
+
+### Pull Request Structure
+```markdown
+## ✨ Automatic Specification Mining
+
+### 📋 Classes Annotated
+- `ClassName` in `src/path/to/file.cpp`
+
+### 🔍 Specifications Added
+
+#### Class Invariants
+- **Invariant**: [description]
+  - **Assertion**: `SASSERT([expression])`
+  - **Rationale**: [why this invariant is important]
+
+#### Pre-conditions
+- **Method**: `method_name()`
+  - **Pre-condition**: [description]
+  - **Assertion**: `SASSERT([expression])`
+
+#### Post-conditions
+- **Method**: `method_name()`
+  - **Post-condition**: [description]
+  - **Assertion**: `SASSERT([expression])`
+
+### 🎯 Goals Achieved
+- ✅ Improved code documentation
+- ✅ Early bug detection through runtime checks
+- ✅ Better understanding of class contracts
+
+*🤖 Generated by SpecBot - Automatic Specification Mining Agent*
+```
+
+## Editing the Agent
+
+### Without Recompilation (Recommended)
+Edit `.github/agentics/specbot.md` to modify:
+- Agent instructions and guidelines
+- Specification synthesis strategies
+- Output formatting
+- Error handling behavior
+
+Changes take effect immediately on the next run.
+
+### With Recompilation (For Config Changes)
+Edit `.github/workflows/specbot.md` and run:
+```bash
+gh aw compile specbot
+```
+
+Recompilation is needed for:
+- Changing triggers (schedule, workflow_dispatch)
+- Modifying permissions or tools
+- Adjusting timeout or safe outputs
+
+## Troubleshooting
+
+### Workflow Not Running
+- Check that the compiled `.lock.yml` file is committed
+- Verify the workflow is enabled in repository settings
+- Review GitHub Actions logs for errors
+
+### No Specifications Generated
+- The selected classes may already be well-annotated
+- Code may be too complex for confident specification synthesis
+- Check workflow logs for analysis details
+
+### Compilation Errors
+If assertions cause build errors:
+- Review assertion syntax and Z3 macro usage
+- Verify that assertions don't access invalid members
+- Check that expressions are well-formed
+
+## Benefits
+
+### For Developers
+- **Documentation**: Specifications serve as precise documentation
+- **Bug Detection**: Runtime assertions catch violations early
+- **Understanding**: Clear contracts improve code comprehension
+- **Maintenance**: Invariants help prevent bugs during refactoring
+
+### For Verification
+- **Foundation**: Specifications enable formal verification
+- **Testing**: Assertions strengthen test coverage
+- **Debugging**: Contract violations pinpoint error locations
+- **Confidence**: Specifications increase correctness confidence
+
+## References
+
+- **Paper**: [Classinvgen: Class invariant synthesis using large language models (arXiv:2502.18917)](https://arxiv.org/abs/2502.18917)
+- **Approach**: LLM-based specification mining for object-oriented code
+- **Related**: Design by Contract, Programming by Contract (Bertrand Meyer)
+
+## Contributing
+
+To improve SpecBot:
+1. Edit `.github/agentics/specbot.md` for prompt improvements
+2. Provide feedback on generated specifications via PR reviews
+3. Report issues or suggest enhancements through GitHub issues
+
+## License
+
+SpecBot is part of the Z3 theorem prover project and follows the same license (MIT).