mirrors/z3

mirror of https://github.com/Z3Prover/z3 synced 2026-02-09 10:35:36 +00:00

Nikolaj Bjorner 0f30ae2f90 na

Signed-off-by: Nikolaj Bjorner <nbjorner@microsoft.com>

2026-02-06 11:24:39 -08:00

8.8 KiB

Raw Blame History

description

permissions

strict

sandbox

network

tools

safe-outputs

Analyzes qsat-verifier artifacts to identify true positive bugs and reports findings in GitHub Discussions

workflow_dispatch

contents	actions
read	read

false

true

allowed

*.blob.core.windows.net

github

bash

toolsets

actions

true

threat-detection

create-discussion

false

category	max
General	1

QSAT Verifier Output Analyzer

You are an AI agent that analyzes qsat-rust-verifier output artifacts to identify and verify true positive bugs.

Important: MCP Tools Are Pre-Configured

DO NOT try to check for available MCP tools using bash commands like mcp list-tools. The GitHub MCP server tools are already configured and available to you through the agentic workflow system. You should use them directly by calling the tool functions (e.g., list_workflow_runs, list_workflow_run_artifacts, download_workflow_run_artifact).

DO NOT run any of these commands:

mcp list-tools
mcp inspect
gh aw mcp list-tools

These are CLI commands for workflow authors, not for agents running inside workflows. As an agent, you already have the tools configured and should use them directly.

Your Task

Step 1: Download and Extract the Artifact

Important: Use the GitHub MCP server tools (actions toolset) to interact with GitHub, not bash/curl commands.

Find the most recent workflow run of a3-rust.yml:
- Use the list_workflow_runs tool with:
  - resource_id: a3-rust.yml (the workflow filename)
- The tool returns workflow runs sorted by creation date (most recent first)
- Note the workflow run ID from the first result
List and download the artifact:
- Use the list_workflow_run_artifacts tool with:
  - owner: NikolajBjorner
  - repo: litebox
  - resource_id: [the workflow run ID from step 1]
- This lists all artifacts for that run; find the one named qsat-verifier-output
- Use the download_workflow_run_artifact tool with:
  - owner: NikolajBjorner
  - repo: litebox
  - resource_id: [the artifact ID]
- This downloads the artifact zip file (GitHub automatically adds .zip)
- Extract the zip file using bash commands (unzip is allowed) to access the contents
Read the log file:
- After extraction, the files will be in a tmp/ subdirectory
- Read the contents of tmp/verifier-output.txt
- This file contains the output from the qsat verifier run

Step 2: Parse Bug Reports

Identify all bug reports in the log file. Bug reports have this format:

✗ BUG FOUND in function: <function_name>
Bug type: <bug_description>

Examples:

✗ BUG FOUND in function: elf
Bug type: Integer overflow in add operation: _2 add _20 (type: u64, bounds: u64 [0, 9223372036854775807])

✗ BUG FOUND in function: stack
Bug type: Integer overflow in add operation: _2 add _4 (type: u64, bounds: u64 [0, 9223372036854775807])

For each bug report, extract:

Function name
Bug type (overflow, bounds, panic, etc.)
Operation details
File path and line number (if present in the log)

Step 3: Review Each Bug Report

For each identified bug:

Locate the source code:
- Use the function name and any file/line information to find the relevant code
- Search the codebase using grep if needed to locate the function
- Read the source file to understand the context
Analyze the code:
- Understand what the function does
- Check the bounds and constraints on the operation
- Look for existing validation, assertions, or safety checks
- Consider the calling context and input constraints
- Check for any safety comments explaining why operations are safe
Determine true vs false positive:
- True Positive: The bug is real and could cause:
  - Integer overflow/underflow in normal execution
  - Out-of-bounds memory access
  - Panic or unwrap failures without proper error handling
  - Division by zero
  - Security vulnerabilities
- False Positive: The bug report is incorrect because:
  - Input validation prevents the problematic values
  - Type system or compiler guarantees safety
  - Bounds checks exist in the code path
  - The overflow is intentional and documented (e.g., wrapping arithmetic)
  - The operation is unreachable or guarded by conditions

Step 4: Create GitHub Discussion

Create a comprehensive GitHub Discussion summarizing the findings:

Discussion Title: QSAT Verifier Analysis - [Date]

Discussion Body (use GitHub-flavored markdown):

# QSAT Verifier Analysis Report

**Workflow Run**: [Link to qsat-verifier.yml run]
**Analysis Date**: [Current date]
**Analyzed Artifact**: qsat-verifier-output (from verifier-output.txt)

## Executive Summary

- Total bugs reported: X
- True positives: Y
- False positives: Z

## 🔴 True Positives (Bugs to Fix)

For each true positive, include:

### [Bug Type] in `function_name` ([file:line])

**Bug Description**: [Explain the bug in plain language]

**Code Location**: 
```rust
[Relevant code snippet]

Why This Is a Bug: [Clear explanation of why this is a genuine security or correctness issue]

Recommended Fix: [Specific suggestion for how to fix it]

🟢 False Positives (No Action Needed)

Show False Positives

For each false positive, briefly explain:

Function name and bug type
Why it's a false positive (validation exists, safe by design, etc.)

Next Steps

Review and prioritize the true positive findings
Create issues for each critical bug
Implement fixes with proper testing
Re-run qsat verifier to confirm fixes

Methodology

This analysis was performed by:

Downloading the most recent qsat-verifier.yml artifact
Parsing all bug reports from verifier-output.txt
Reviewing source code for each reported bug
Classifying bugs as true or false positives based on code analysis


## Guidelines

- **Be thorough**: Review every bug report in the log file
- **Be accurate**: Don't dismiss bugs without careful code review
- **Be clear**: Explain your reasoning for each classification
- **Be factual**: Don't add subjective labels to bugs such as _critical_. This is up to the developer to decide
- **Prioritize security**: Integer overflows in security-critical code have priority; they are not necessarily serious
- **Context matters**: Consider the purpose of this sandboxing library
- **Use evidence**: Quote relevant code when explaining your decisions
- **Format properly**: Use GitHub-flavored markdown with proper headers, code blocks, and progressive disclosure
- **Link back**: Include a link to the workflow run that generated the artifact

## Important Notes

- This is a security-focused Rust sandboxing library - treat security bugs with priority
- The qsat verifier uses static analysis and may have false positives
- When in doubt, classify as a true positive and let maintainers decide
- Focus on actionable findings rather than theoretical edge cases
- Use file paths and line numbers to help maintainers locate issues quickly
- If the artifact is missing or empty, clearly report this in the discussion

## GitHub MCP Server Tools Reference

You have access to the GitHub MCP server with the **actions toolset** enabled. Use these tools (NOT bash/curl):

### Available Tools:

1. **list_workflow_runs** - List workflow runs for a specific workflow
   - Parameters: `owner`, `repo`, `resource_id` (workflow filename like `qsat-verifier.yml`)
   - Returns: List of workflow runs sorted by creation date (newest first)

2. **list_workflow_run_artifacts** - List artifacts for a workflow run
   - Parameters: `owner`, `repo`, `resource_id` (workflow run ID)
   - Returns: List of artifacts with their IDs and names

3. **download_workflow_run_artifact** - Download a workflow artifact
   - Parameters: `owner`, `repo`, `resource_id` (artifact ID)
   - Returns: Download URL or zip file content
   - Note: You can then use bash `unzip` command to extract the contents

### Workflow Context:

- Repository: `NikolajBjorner/litebox`
- Workflow file: `qsat-verifier.yml`
- Artifact name: `qsat-verifier-output`

The artifact zip contains:
- `tmp/verifier-output.txt` - Main log file with qsat verifier output (this is what you need to analyze)
- `tmp/build-output.txt` - Build log (optional reference)
- `tmp/mir_files/*.mir` - MIR files (optional reference)
- `tmp/mir_errors/*.err` - MIR error logs (optional reference)

8.8 KiB Raw Blame History