fix memory-safety-report to download artifacts via MCP tools (#8979)

gh CLI is not available inside AWF so the agent could not download artifacts. Switch to GitHub MCP actions toolset for artifact URLs and add helper scripts for download and parsing.
2026-05-14 06:05:28 +00:00 · 2026-03-15 10:12:49 -07:00 · 2026-03-15 10:12:49 -07:00 · db46d52056
commit db46d52056
parent 6fb68ac010
18 changed files with 734 additions and 700 deletions
--- a/.github/workflows/memory-safety-report.md
+++ b/.github/workflows/memory-safety-report.md
@ -1,13 +1,14 @@
 ---
 description: >
-  Generates a detailed Memory Safety report for Z3 by analyzing ASan/UBSan
-  sanitizer logs from the memory-safety workflow, posting findings as a
-  GitHub Discussion.
+  Analyze ASan/UBSan sanitizer logs from the memory-safety workflow
+  and post findings as a GitHub Discussion.

 on:
  workflow_run:
    workflows: ["Memory Safety Analysis"]
    types: [completed]
+    branches:
+      - master
  workflow_dispatch:

 timeout-minutes: 30
@ -16,6 +17,8 @@ permissions:
  actions: read
  contents: read
  discussions: read
+  issues: read
+  pull-requests: read

 env:
  GH_TOKEN: ${{ github.token }}
@ -25,16 +28,20 @@ network: defaults
 tools:
  cache-memory: true
  github:
-    toolsets: [default]
+    toolsets: [default, actions]
  bash: [":*"]
  glob: {}
  view: {}

 safe-outputs:
+  mentions: false
+  allowed-github-references: []
+  max-bot-mentions: 1
  create-discussion:
    title-prefix: "[Memory Safety] "
    category: "Agentic Workflows"
    close-older-discussions: true
+    expires: 7
  missing-tool:
    create-issue: true
  noop:
@ -54,34 +61,30 @@ steps:

 Your name is ${{ github.workflow }}. You are an expert memory safety analyst for the Z3 theorem prover repository `${{ github.repository }}`. Your task is to download, analyze, and report on the results from the Memory Safety Analysis workflow, covering runtime sanitizer (ASan/UBSan) findings.

+**The `gh` CLI is not authenticated inside AWF.** Use GitHub MCP tools for all GitHub API interaction. Do not use `gh run download` or any other `gh` command.
+
 ## Your Task

 ### 1. Download Artifacts from the Triggering Workflow Run

-If triggered by `workflow_run`, download the artifacts from the completed Memory Safety Analysis run:
+If triggered by `workflow_run`, the run ID is `${{ github.event.workflow_run.id }}`. If manual dispatch (empty run ID), call `github-mcp-server-actions_list` with method `list_workflow_runs` for the "Memory Safety Analysis" workflow and pick the latest completed run.
+
+Get the artifact list and download URLs:
+
+1. Call `github-mcp-server-actions_list` with method `list_workflow_run_artifacts` and the run ID. The run produces two artifacts: `asan-reports` and `ubsan-reports`.
+2. For each artifact, call `github-mcp-server-actions_get` with method `download_workflow_run_artifact` and the artifact ID. This returns a temporary download URL.
+3. Run the helper scripts to download, extract, and parse:

 ```bash
-# Get the triggering run ID
-RUN_ID="${{ github.event.workflow_run.id }}"
-
-# If manual dispatch, find the latest Memory Safety Analysis run
-if [ -z "$RUN_ID" ] || [ "$RUN_ID" = "" ]; then
-  echo "Manual dispatch — finding latest Memory Safety Analysis run..."
-  gh run list --workflow="Memory Safety Analysis" --limit=1 --json databaseId --jq '.[0].databaseId'
-fi
+bash .github/scripts/fetch-artifacts.sh "$ASAN_URL" "$UBSAN_URL"
+python3 .github/scripts/parse_sanitizer_reports.py /tmp/reports
 ```

-Download all artifacts:
-
-```bash
-mkdir -p /tmp/reports
-gh run download "$RUN_ID" --dir /tmp/reports 2>&1 || echo "Some artifacts may not be available"
-ls -la /tmp/reports/
-```
+After this, `/tmp/reports/{asan,ubsan}-reports/` contain the extracted files, `/tmp/parsed-report.json` has structured findings, and `/tmp/fetch-artifacts.log` has the download log.

 ### 2. Analyze Sanitizer Reports

-Parse the ASan and UBSan report files:
+Read `/tmp/parsed-report.json` for structured data. Also inspect the raw files if needed:

 ```bash
 # Check ASan results
@ -112,17 +115,16 @@ Check cache memory for previous run results:

 ### 4. Generate the Discussion Report

-Create a comprehensive GitHub Discussion with this structure:
+Create a GitHub Discussion. Use `###` or lower for section headers, never `##` or `#`. Wrap verbose sections in `<details>` tags to keep the report scannable.

 ```markdown
-# Memory Safety Analysis Report
-
 **Date**: YYYY-MM-DD
-**Commit**: `<short SHA>` on branch `<branch>`
-**Triggered by**: push / workflow_dispatch
-**Workflow Run**: [#<run_id>](link)
+**Commit**: `<short SHA>` ([full_sha](link)) on branch `<branch>`
+**Commit message**: first line of commit message
+**Triggered by**: push / workflow_dispatch (Memory Safety Analysis run [#<run_id>](link))
+**Report run**: [#<run_id>](link)

-## Executive Summary
+### Executive Summary

 | Category | ASan | UBSan | Total |
 |----------|------|-------|-------|
@ -135,51 +137,63 @@ Create a comprehensive GitHub Discussion with this structure:
 | Other | Y | Z | Z |
 | **Total** | **Y** | **Z** | **N** |

-## Trend
+### Trend

 - New findings since last run: N
 - Resolved since last run: N
 - Unchanged: N

-## Critical Findings (Immediate Action Needed)
+### Critical Findings (Immediate Action Needed)

 [List any high-severity findings: buffer overflows, use-after-free, double-free]

-## Important Findings (Should Fix)
+### Important Findings (Should Fix)

 [List medium-severity: null derefs, integer overflows]

-## Low-Severity / Informational
+### Low-Severity / Informational

 [List warnings: potential issues]

-## ASan Findings
+<details>
+<summary><b>ASan Findings</b></summary>

 [Each finding with error type, location, and stack trace snippet]

-## UBSan Findings
+</details>
+
+<details>
+<summary><b>UBSan Findings</b></summary>

 [Each finding with error type, location, and explanation]

-## Top Affected Files
+</details>
+
+### Top Affected Files

 | File | Findings |
 |------|----------|
 | src/... | N |

-## Recommendations
+### Known Suppressions
+
+[List from parsed-report.json suppressions field]
+
+### Recommendations

 1. [Actionable recommendations based on the findings]
 2. [Patterns to address]

 <details>
-<summary>Raw Data</summary>
+<summary><b>Raw Data</b></summary>

 [Compressed summary of all data for future reference]

 </details>
 ```

+If zero findings across all tools, create a discussion noting a clean run with the commit and workflow run link.
+
 ### 5. Update Cache Memory

 Store the current run's results in cache memory for future comparison:
@ -191,20 +205,20 @@ Store the current run's results in cache memory for future comparison:

 - If the triggering workflow failed entirely, report that analysis could not complete and include any partial results.
 - If no artifacts are available, report that and suggest running the workflow manually.
- If zero findings across all tools, create a discussion noting the clean bill of health.
+- If the helper scripts fail, report the error in the discussion body and stop.

 ## Guidelines

- **Be thorough**: Analyze every available artifact and log file.
- **Be accurate**: Distinguish between ASan and UBSan findings.
- **Be actionable**: For each finding, include enough context to locate and understand the issue.
- **Track trends**: Use cache memory to identify regressions and improvements over time.
- **Prioritize**: Critical memory safety issues (buffer overflow, UAF, double-free) should be prominently highlighted.
+- Be thorough: analyze every available artifact and log file.
+- Be accurate: distinguish between ASan and UBSan findings.
+- Be actionable: for each finding, include enough context to locate and understand the issue.
+- Track trends: use cache memory to identify regressions and improvements over time.
+- Prioritize: critical memory safety issues (buffer overflow, UAF, double-free) should be prominently highlighted.

 ## Important Notes

- **DO NOT** create pull requests or modify source files.
- **DO NOT** attempt to fix the findings automatically.
- **DO** close older Memory Safety discussions automatically (configured via `close-older-discussions: true`).
- **DO** always report the commit SHA so findings can be correlated with specific code versions.
- **DO** use cache memory to track trends over multiple runs.
+- DO NOT create pull requests or modify source files.
+- DO NOT attempt to fix the findings automatically.
+- DO close older Memory Safety discussions automatically (configured via `close-older-discussions: true`).
+- DO always report the commit SHA so findings can be correlated with specific code versions.
+- DO use cache memory to track trends over multiple runs.