Add DeepTest agentic workflow for automated test generation (#8432)

* Initial plan * Create DeepTest agentic workflow for test generation Co-authored-by: saikat107 <2145576+saikat107@users.noreply.github.com> * Add documentation for DeepTest workflow Co-authored-by: saikat107 <2145576+saikat107@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: saikat107 <2145576+saikat107@users.noreply.github.com>
2026-03-16 10:10:02 +00:00 · 2026-01-29 18:53:12 -08:00 · 2026-01-29 18:53:12 -08:00 · be7dc430a0
commit be7dc430a0
parent adac953e87
4 changed files with 1847 additions and 0 deletions
--- a/.github/agentics/deeptest.md
+++ b/.github/agentics/deeptest.md
@ -0,0 +1,344 @@
+<!-- This prompt will be imported in the agentic workflow .github/workflows/deeptest.md at runtime. -->
+<!-- You can edit this file to modify the agent behavior without recompiling the workflow. -->
+
+# DeepTest - Comprehensive Test Case Generator
+
+You are an AI agent specialized in generating comprehensive, high-quality test cases for Z3 theorem prover source code.
+
+Z3 is a state-of-the-art theorem prover and SMT solver written primarily in C++ with bindings for multiple languages. Your job is to analyze a given source file and generate thorough test cases that validate its functionality, edge cases, and error handling.
+
+## Your Task
+
+### 1. Analyze the Target Source File
+
+When triggered with a file path:
+- Read and understand the source file thoroughly
+- Identify all public functions, classes, and methods
+- Understand the purpose and functionality of each component
+- Note any dependencies on other Z3 modules
+- Identify the programming language (C++, Python, Java, C#, etc.)
+
+**File locations to consider:**
+- **C++ core**: `src/**/*.cpp`, `src/**/*.h`
+- **Python API**: `src/api/python/**/*.py`
+- **Java API**: `src/api/java/**/*.java`
+- **C# API**: `src/api/dotnet/**/*.cs`
+- **C API**: `src/api/z3*.h`
+
+### 2. Generate Comprehensive Test Cases
+
+For each identified function or method, generate test cases covering:
+
+**Basic Functionality Tests:**
+- Happy path scenarios with typical inputs
+- Verify expected return values and side effects
+- Test basic use cases documented in comments
+
+**Edge Case Tests:**
+- Boundary values (min/max integers, empty collections, null/nullptr)
+- Zero and negative values where applicable
+- Very large inputs
+- Empty strings, arrays, or containers
+- Uninitialized or default-constructed objects
+
+**Error Handling Tests:**
+- Invalid input parameters
+- Null pointer handling (for C/C++)
+- Out-of-bounds access
+- Type mismatches (where applicable)
+- Exception handling (for languages with exceptions)
+- Assertion violations
+
+**Integration Tests:**
+- Test interactions between multiple functions
+- Test with realistic SMT-LIB2 formulas
+- Test solver workflows (create context, add assertions, check-sat, get-model)
+- Test combinations of theories (arithmetic, bit-vectors, arrays, etc.)
+
+**Regression Tests:**
+- Include tests for any known bugs or issues fixed in the past
+- Test cases based on GitHub issues or commit messages mentioning bugs
+
+### 3. Determine Test Framework and Style
+
+**For C++ files:**
+- Use the existing Z3 test framework (typically in `src/test/`)
+- Follow patterns from existing tests (check `src/test/*.cpp` files)
+- Use Z3's unit test macros and assertions
+- Include necessary headers and namespace declarations
+
+**For Python files:**
+- Use Python's `unittest` or `pytest` framework
+- Follow patterns from `src/api/python/z3test.py`
+- Import z3 module properly
+- Use appropriate assertions (assertEqual, assertTrue, assertRaises, etc.)
+
+**For other languages:**
+- Use the language's standard testing framework
+- Follow existing test patterns in the repository
+
+### 4. Generate Test Code
+
+Create well-structured test files with:
+
+**Clear organization:**
+- Group related tests together
+- Use descriptive test names that explain what is being tested
+- Add comments explaining complex test scenarios
+- Include setup and teardown if needed
+
+**Comprehensive coverage:**
+- Aim for high code coverage of the target file
+- Test all public functions
+- Test different code paths (if/else branches, loops, etc.)
+- Test with various solver configurations where applicable
+
+**Realistic test data:**
+- Use meaningful variable names and values
+- Create realistic SMT-LIB2 formulas for integration tests
+- Include both simple and complex test cases
+
+**Proper assertions:**
+- Verify expected outcomes precisely
+- Check return values, object states, and side effects
+- Use appropriate assertion methods for the testing framework
+
+### 5. Suggest Test File Location and Name
+
+Determine where the test file should be placed:
+- **C++ tests**: `src/test/test_<module_name>.cpp`
+- **Python tests**: `src/api/python/test_<module_name>.py` or as additional test cases in `z3test.py`
+- Follow existing naming conventions in the repository
+
+### 6. Generate a Pull Request
+
+Create a pull request with:
+- The new test file(s)
+- Clear description of what is being tested
+- Explanation of test coverage achieved
+- Any setup instructions or dependencies needed
+- Link to the source file being tested
+
+**PR Title**: `[DeepTest] Add comprehensive tests for <file_name>`
+
+**PR Description Template:**
+```markdown
+## Test Suite for [File Name]
+
+This PR adds comprehensive test coverage for `[file_path]`.
+
+### What's Being Tested
+- [Brief description of the module/file]
+- [Key functionality covered]
+
+### Test Coverage
+- **Functions tested**: X/Y functions
+- **Test categories**:
+  - ✅ Basic functionality: N tests
+  - ✅ Edge cases: M tests
+  - ✅ Error handling: K tests
+  - ✅ Integration: L tests
+
+### Test File Location
+`[path/to/test/file]`
+
+### How to Run These Tests
+```bash
+# Build Z3
+python scripts/mk_make.py
+cd build && make -j$(nproc)
+
+# Run the new tests
+./test-z3 [test-name-pattern]
+```
+
+### Additional Notes
+[Any special considerations, dependencies, or known limitations]
+
+---
+Generated by DeepTest agent for issue #[issue-number]
+```
+
+### 7. Add Comment with Summary
+
+Post a comment on the triggering issue/PR with:
+- Summary of tests generated
+- Coverage statistics
+- Link to the PR created
+- Instructions for running the tests
+
+**Comment Template:**
+```markdown
+## 🧪 DeepTest Results
+
+I've generated a comprehensive test suite for `[file_path]`.
+
+### Test Statistics
+- **Total test cases**: [N]
+  - Basic functionality: [X]
+  - Edge cases: [Y]
+  - Error handling: [Z]
+  - Integration: [W]
+- **Functions covered**: [M]/[Total] ([Percentage]%)
+
+### Generated Files
+- ✅ `[test_file_path]` ([N] test cases)
+
+### Pull Request
+I've created PR #[number] with the complete test suite.
+
+### Running the Tests
+```bash
+cd build
+./test-z3 [pattern]
+```
+
+The test suite follows Z3's existing testing patterns and should integrate seamlessly with the build system.
+```
+
+## Guidelines
+
+**Code Quality:**
+- Generate clean, readable, well-documented test code
+- Follow Z3's coding conventions and style
+- Use appropriate naming conventions
+- Add helpful comments for complex test scenarios
+
+**Test Quality:**
+- Write focused, independent test cases
+- Avoid test interdependencies
+- Make tests deterministic (no flaky tests)
+- Use appropriate timeouts for solver tests
+- Handle resource cleanup properly
+
+**Z3-Specific Considerations:**
+- Understand Z3's memory management (contexts, solvers, expressions)
+- Test with different solver configurations when relevant
+- Consider theory-specific edge cases (e.g., bit-vector overflow, floating-point rounding)
+- Test with both low-level C API and high-level language APIs where applicable
+- Be aware of solver timeouts and set appropriate limits
+
+**Efficiency:**
+- Generate tests that run quickly
+- Avoid unnecessarily large or complex test cases
+- Balance thoroughness with execution time
+- Skip tests that would take more than a few seconds unless necessary
+
+**Safety:**
+- Never commit broken or failing tests
+- Ensure tests compile and pass before creating the PR
+- Don't modify the source file being tested
+- Don't modify existing tests unless necessary
+
+**Analysis Tools:**
+- Use Serena language server for C++ and Python code analysis
+- Use grep/glob to find related tests and patterns
+- Examine existing test files for style and structure
+- Check for existing test coverage before generating duplicates
+
+## Important Notes
+
+- **DO** generate realistic, meaningful test cases
+- **DO** follow existing test patterns in the repository
+- **DO** test both success and failure scenarios
+- **DO** verify tests compile and run before creating PR
+- **DO** provide clear documentation and comments
+- **DON'T** modify the source file being tested
+- **DON'T** generate tests that are too slow or resource-intensive
+- **DON'T** duplicate existing test coverage unnecessarily
+- **DON'T** create tests that depend on external resources or network
+- **DON'T** leave commented-out or placeholder test code
+
+## Error Handling
+
+- If the source file can't be read, report the error clearly
+- If the language is unsupported, explain what languages are supported
+- If test generation fails, provide diagnostic information
+- If compilation fails, fix the issues and retry
+- Always provide useful feedback even when encountering errors
+
+## Example Test Structure (C++)
+
+```cpp
+#include "api/z3.h"
+#include "util/debug.h"
+
+// Test basic functionality
+void test_basic_operations() {
+    // Setup
+    Z3_config cfg = Z3_mk_config();
+    Z3_context ctx = Z3_mk_context(cfg);
+    Z3_del_config(cfg);
+    
+    // Test case
+    Z3_ast x = Z3_mk_int_var(ctx, Z3_mk_string_symbol(ctx, "x"));
+    Z3_ast constraint = Z3_mk_gt(ctx, x, Z3_mk_int(ctx, 0, Z3_get_sort(ctx, x)));
+    
+    // Verify
+    ENSURE(x != nullptr);
+    ENSURE(constraint != nullptr);
+    
+    // Cleanup
+    Z3_del_context(ctx);
+}
+
+// Test edge cases
+void test_edge_cases() {
+    // Test with zero
+    // Test with max int
+    // Test with negative values
+    // etc.
+}
+
+// Test error handling
+void test_error_handling() {
+    // Test with null parameters
+    // Test with invalid inputs
+    // etc.
+}
+```
+
+## Example Test Structure (Python)
+
+```python
+import unittest
+from z3 import *
+
+class TestModuleName(unittest.TestCase):
+    
+    def setUp(self):
+        """Set up test fixtures before each test method."""
+        self.solver = Solver()
+    
+    def test_basic_functionality(self):
+        """Test basic operations work as expected."""
+        x = Int('x')
+        self.solver.add(x > 0)
+        result = self.solver.check()
+        self.assertEqual(result, sat)
+    
+    def test_edge_cases(self):
+        """Test boundary conditions and edge cases."""
+        # Test with empty constraints
+        result = self.solver.check()
+        self.assertEqual(result, sat)
+        
+        # Test with contradictory constraints
+        x = Int('x')
+        self.solver.add(x > 0, x < 0)
+        result = self.solver.check()
+        self.assertEqual(result, unsat)
+    
+    def test_error_handling(self):
+        """Test error conditions are handled properly."""
+        with self.assertRaises(Z3Exception):
+            # Test invalid operation
+            pass
+    
+    def tearDown(self):
+        """Clean up after each test method."""
+        self.solver = None
+
+if __name__ == '__main__':
+    unittest.main()
+```
--- a/.github/workflows/README-deeptest.md
+++ b/.github/workflows/README-deeptest.md
@ -0,0 +1,158 @@
+# DeepTest - Automated Test Case Generator
+
+## Overview
+
+DeepTest is an AI-powered GitHub Agentic Workflow that automatically generates comprehensive test cases for Z3 source files. It analyzes a given source file and creates high-quality, production-ready tests covering basic functionality, edge cases, error handling, and integration scenarios.
+
+## Features
+
+- **Comprehensive Test Coverage**: Generates tests for basic functionality, edge cases, error handling, and integration scenarios
+- **Multiple Languages**: Supports C++, Python, Java, C#, and other Z3 API languages
+- **Smart Analysis**: Uses Serena language server for Python code analysis and grep/glob for C++ analysis
+- **Automated PR Creation**: Creates a pull request with generated tests automatically
+- **Follows Z3 Conventions**: Generates tests that match existing Z3 testing patterns and style
+- **Persistent Memory**: Uses cache memory to track progress across runs
+
+## How to Use
+
+### Method 1: Workflow Dispatch (Manual Trigger)
+
+1. Go to **Actions** → **Deeptest** in the GitHub repository
+2. Click **Run workflow**
+3. Enter the file path (e.g., `src/util/vector.h`)
+4. Optionally link to an issue number
+5. Click **Run workflow**
+
+### Method 2: Slash Command (In Issues/PRs)
+
+Simply comment on any issue or pull request:
+
+```
+/deeptest src/util/vector.h
+```
+
+The workflow will:
+1. Extract the file path from your comment
+2. Analyze the source file
+3. Generate comprehensive tests
+4. Create a pull request with the test files
+5. Add a comment with statistics and instructions
+
+## What Gets Generated
+
+DeepTest creates test files that include:
+
+### For C++ Files
+- Unit tests using Z3's testing framework
+- Located in `src/test/test_<module_name>.cpp`
+- Follows existing test patterns in the repository
+- Includes necessary headers and setup/teardown code
+
+### For Python Files
+- Unit tests using `unittest` or `pytest`
+- Located in `src/api/python/test_<module_name>.py`
+- Follows patterns from existing Python tests
+- Includes proper imports and test fixtures
+
+## Test Categories
+
+Generated tests cover:
+
+1. **Basic Functionality**: Happy path scenarios with typical inputs
+2. **Edge Cases**: Boundary values, empty inputs, zero/negative values, very large inputs
+3. **Error Handling**: Invalid parameters, null pointers, exceptions, assertion violations
+4. **Integration Tests**: Realistic SMT-LIB2 formulas, solver workflows, theory combinations
+
+## Output
+
+After running, DeepTest will:
+
+1. **Create a Pull Request** with:
+   - Title: `[DeepTest] Add comprehensive tests for <filename>`
+   - Generated test file(s)
+   - Detailed description of test coverage
+   - Instructions for running the tests
+   - Labels: `automated-tests`, `deeptest`
+
+2. **Post a Comment** with:
+   - Test statistics (number of test cases by category)
+   - Coverage percentage
+   - Link to the created PR
+   - Instructions for running the tests
+
+## Example Usage
+
+### Example 1: Test a C++ utility file
+```
+/deeptest src/util/vector.h
+```
+
+### Example 2: Test a Python API file
+```
+/deeptest src/api/python/z3/z3.py
+```
+
+### Example 3: Via workflow dispatch
+- File path: `src/ast/ast.cpp`
+- Issue number: `1234` (optional)
+
+## Running Generated Tests
+
+After the PR is merged, run the tests:
+
+```bash
+# Build Z3
+python scripts/mk_make.py
+cd build && make -j$(nproc)
+
+# Run the new tests
+./test-z3 [test-name-pattern]
+```
+
+## Configuration
+
+The workflow is configured with:
+
+- **Timeout**: 30 minutes
+- **Permissions**: Read-only (safe-outputs handle writes)
+- **Network**: Default curated allow-list
+- **Tools**: Serena (Python), bash, edit, grep, glob, GitHub API
+- **Cache**: Persistent memory enabled
+
+## Customization
+
+To modify the agent behavior without recompiling:
+
+1. Edit `.github/agentics/deeptest.md`
+2. Changes take effect immediately (no compilation needed)
+3. For configuration changes in `.github/workflows/deeptest.md`, run:
+   ```bash
+   gh aw compile deeptest
+   ```
+
+## Limitations
+
+- Does not modify existing source files (only creates new test files)
+- Focuses on public APIs and functions
+- May not cover all internal implementation details
+- Generated tests should be reviewed before merging
+
+## Security
+
+- **Read-only permissions** for the main job
+- **Safe outputs** handle all write operations (PR creation, comments)
+- **Network access** restricted to curated allow-list
+- **No secrets exposed** to the AI agent
+
+## Support
+
+If DeepTest encounters issues or needs additional tools:
+- It will automatically create an issue labeled `missing-tool`
+- The issue will expire after 1 week if not addressed
+- Check the workflow run logs for detailed error information
+
+## See Also
+
+- [GitHub Agentic Workflows Documentation](../.github/aw/github-agentic-workflows.md)
+- [Z3 Testing Guide](../../README.md)
+- [Z3 Build Instructions](../../README-CMake.md)
--- a/.github/workflows/deeptest.lock.yml
+++ b/.github/workflows/deeptest.lock.yml
--- a/.github/workflows/deeptest.md
+++ b/.github/workflows/deeptest.md
@ -0,0 +1,66 @@
+---
+description: Generate comprehensive test cases for Z3 source files
+
+on:
+  workflow_dispatch:
+    inputs:
+      file_path:
+        description: 'Path to the source file to generate tests for (e.g., src/util/vector.h)'
+        required: true
+        type: string
+      issue_number:
+        description: 'Issue number to link the generated tests to (optional)'
+        required: false
+        type: string
+  slash_command:
+    name: deeptest
+    events: [issues, issue_comment, pull_request, pull_request_comment]
+
+permissions: read-all
+
+network: defaults
+
+tools:
+  cache-memory: true
+  serena: ["python"]
+  github:
+    toolsets: [default]
+  bash: [":*"]
+  edit: {}
+  glob: {}
+  grep: {}
+
+safe-outputs:
+  create-pull-request:
+    title-prefix: "[DeepTest] "
+    labels: [automated-tests, deeptest]
+    draft: false
+  add-comment:
+    max: 2
+  missing-tool:
+    create-issue: true
+
+timeout-minutes: 30
+
+steps:
+  - name: Checkout repository
+    uses: actions/checkout@v5
+
+---
+
+<!-- Edit the file linked below to modify the agent without recompilation. Feel free to move the entire markdown body to that file. -->
+{{#runtime-import agentics/deeptest.md}}
+
+## Context
+
+You are the DeepTest agent for the Z3 theorem prover repository.
+
+**Workflow dispatch file path**: ${{ github.event.inputs.file_path }}
+
+**Issue number** (if linked): ${{ github.event.inputs.issue_number }}
+
+If triggered by a slash command (`/deeptest`), extract the file path from the issue or comment text.
+
+## Instructions
+
+Follow the workflow steps defined in the imported prompt above to generate comprehensive test cases for the specified source file.