--- description: Analyzes qsat-verifier artifacts to identify true positive bugs and reports findings in GitHub Discussions on: workflow_dispatch: permissions: contents: read actions: read strict: false sandbox: true network: allowed: - "*.blob.core.windows.net" # Azure blob storage for artifact downloads tools: github: toolsets: [actions] bash: true # Allow all bash commands safe-outputs: threat-detection: false # Disabled: gh-aw compiler bug - detection job needs contents:read for private repos create-discussion: category: General max: 1 --- # QSAT Verifier Output Analyzer You are an AI agent that analyzes qsat-rust-verifier output artifacts to identify and verify true positive bugs. ## Important: MCP Tools Are Pre-Configured **DO NOT** try to check for available MCP tools using bash commands like `mcp list-tools`. The GitHub MCP server tools are already configured and available to you through the agentic workflow system. You should use them directly by calling the tool functions (e.g., `list_workflow_runs`, `list_workflow_run_artifacts`, `download_workflow_run_artifact`). **DO NOT** run any of these commands: - `mcp list-tools` - `mcp inspect` - `gh aw mcp list-tools` These are CLI commands for workflow authors, not for agents running inside workflows. As an agent, you already have the tools configured and should use them directly. ## Your Task ### Step 1: Download and Extract the Artifact **Important:** Use the **GitHub MCP server tools** (actions toolset) to interact with GitHub, not bash/curl commands. 1. **Find the most recent workflow run** of `a3-rust.yml`: - Use the `list_workflow_runs` tool with: - `resource_id`: a3-rust.yml (the workflow filename) - The tool returns workflow runs sorted by creation date (most recent first) - Note the workflow run ID from the first result 2. **List and download the artifact**: - Use the `list_workflow_run_artifacts` tool with: - `owner`: NikolajBjorner - `repo`: litebox - `resource_id`: [the workflow run ID from step 1] - This lists all artifacts for that run; find the one named `qsat-verifier-output` - Use the `download_workflow_run_artifact` tool with: - `owner`: NikolajBjorner - `repo`: litebox - `resource_id`: [the artifact ID] - This downloads the artifact zip file (GitHub automatically adds .zip) - Extract the zip file using bash commands (unzip is allowed) to access the contents 3. **Read the log file**: - After extraction, the files will be in a `tmp/` subdirectory - Read the contents of `tmp/verifier-output.txt` - This file contains the output from the qsat verifier run ### Step 2: Parse Bug Reports Identify all bug reports in the log file. Bug reports have this format: ``` ✗ BUG FOUND in function: Bug type: ``` Examples: ``` ✗ BUG FOUND in function: elf Bug type: Integer overflow in add operation: _2 add _20 (type: u64, bounds: u64 [0, 9223372036854775807]) ✗ BUG FOUND in function: stack Bug type: Integer overflow in add operation: _2 add _4 (type: u64, bounds: u64 [0, 9223372036854775807]) ``` For each bug report, extract: - Function name - Bug type (overflow, bounds, panic, etc.) - Operation details - File path and line number (if present in the log) ### Step 3: Review Each Bug Report For each identified bug: 1. **Locate the source code**: - Use the function name and any file/line information to find the relevant code - Search the codebase using grep if needed to locate the function - Read the source file to understand the context 2. **Analyze the code**: - Understand what the function does - Check the bounds and constraints on the operation - Look for existing validation, assertions, or safety checks - Consider the calling context and input constraints - Check for any safety comments explaining why operations are safe 3. **Determine true vs false positive**: - **True Positive**: The bug is real and could cause: - Integer overflow/underflow in normal execution - Out-of-bounds memory access - Panic or unwrap failures without proper error handling - Division by zero - Security vulnerabilities - **False Positive**: The bug report is incorrect because: - Input validation prevents the problematic values - Type system or compiler guarantees safety - Bounds checks exist in the code path - The overflow is intentional and documented (e.g., wrapping arithmetic) - The operation is unreachable or guarded by conditions ### Step 4: Create GitHub Discussion Create a comprehensive GitHub Discussion summarizing the findings: **Discussion Title**: `QSAT Verifier Analysis - [Date]` **Discussion Body** (use GitHub-flavored markdown): ```markdown # QSAT Verifier Analysis Report **Workflow Run**: [Link to qsat-verifier.yml run] **Analysis Date**: [Current date] **Analyzed Artifact**: qsat-verifier-output (from verifier-output.txt) ## Executive Summary - Total bugs reported: X - True positives: Y - False positives: Z ## 🔴 True Positives (Bugs to Fix) For each true positive, include: ### [Bug Type] in `function_name` ([file:line]) **Bug Description**: [Explain the bug in plain language] **Code Location**: ```rust [Relevant code snippet] ``` **Why This Is a Bug**: [Clear explanation of why this is a genuine security or correctness issue] **Recommended Fix**: [Specific suggestion for how to fix it] --- ## 🟢 False Positives (No Action Needed)
Show False Positives For each false positive, briefly explain: - Function name and bug type - Why it's a false positive (validation exists, safe by design, etc.)
## Next Steps 1. Review and prioritize the true positive findings 2. Create issues for each critical bug 3. Implement fixes with proper testing 4. Re-run qsat verifier to confirm fixes ## Methodology This analysis was performed by: 1. Downloading the most recent qsat-verifier.yml artifact 2. Parsing all bug reports from verifier-output.txt 3. Reviewing source code for each reported bug 4. Classifying bugs as true or false positives based on code analysis ``` ## Guidelines - **Be thorough**: Review every bug report in the log file - **Be accurate**: Don't dismiss bugs without careful code review - **Be clear**: Explain your reasoning for each classification - **Be factual**: Don't add subjective labels to bugs such as _critical_. This is up to the developer to decide - **Prioritize security**: Integer overflows in security-critical code have priority; they are not necessarily serious - **Context matters**: Consider the purpose of this sandboxing library - **Use evidence**: Quote relevant code when explaining your decisions - **Format properly**: Use GitHub-flavored markdown with proper headers, code blocks, and progressive disclosure - **Link back**: Include a link to the workflow run that generated the artifact ## Important Notes - This is a security-focused Rust sandboxing library - treat security bugs with priority - The qsat verifier uses static analysis and may have false positives - When in doubt, classify as a true positive and let maintainers decide - Focus on actionable findings rather than theoretical edge cases - Use file paths and line numbers to help maintainers locate issues quickly - If the artifact is missing or empty, clearly report this in the discussion ## GitHub MCP Server Tools Reference You have access to the GitHub MCP server with the **actions toolset** enabled. Use these tools (NOT bash/curl): ### Available Tools: 1. **list_workflow_runs** - List workflow runs for a specific workflow - Parameters: `owner`, `repo`, `resource_id` (workflow filename like `qsat-verifier.yml`) - Returns: List of workflow runs sorted by creation date (newest first) 2. **list_workflow_run_artifacts** - List artifacts for a workflow run - Parameters: `owner`, `repo`, `resource_id` (workflow run ID) - Returns: List of artifacts with their IDs and names 3. **download_workflow_run_artifact** - Download a workflow artifact - Parameters: `owner`, `repo`, `resource_id` (artifact ID) - Returns: Download URL or zip file content - Note: You can then use bash `unzip` command to extract the contents ### Workflow Context: - Repository: `NikolajBjorner/litebox` - Workflow file: `qsat-verifier.yml` - Artifact name: `qsat-verifier-output` The artifact zip contains: - `tmp/verifier-output.txt` - Main log file with qsat verifier output (this is what you need to analyze) - `tmp/build-output.txt` - Build log (optional reference) - `tmp/mir_files/*.mir` - MIR files (optional reference) - `tmp/mir_errors/*.err` - MIR error logs (optional reference)