Merge pull request #8856 from angelica-moreira/memory-safety-pipeline

Add ASan/UBSan memory safety CI workflow
2026-06-02 15:17:54 +00:00 · 2026-03-04 13:05:57 -08:00 · 2026-03-04 13:05:57 -08:00 · 51ae6eeccc
commit 51ae6eeccc
parent d40c4721d8 123bb620d4
2 changed files with 457 additions and 0 deletions
--- a/.github/workflows/memory-safety-report.md
+++ b/.github/workflows/memory-safety-report.md
@ -0,0 +1,205 @@
 ---
 description: >
  Generates a detailed Memory Safety report for Z3 by analyzing ASan/UBSan
  sanitizer logs from the memory-safety workflow, posting findings as a
  GitHub Discussion.
 on:
  workflow_run:
    workflows: ["Memory Safety Analysis"]
    types: [completed]
  workflow_dispatch:
 timeout-minutes: 30
 permissions:
  actions: read
  contents: read
  discussions: write
 network: defaults
 tools:
  cache-memory: true
  github:
    toolsets: [default]
  bash: [":*"]
  glob: {}
  view: {}
 safe-outputs:
  create-discussion:
    title-prefix: "[Memory Safety] "
    category: "Agentic Workflows"
    close-older-discussions: true
  missing-tool:
    create-issue: true
 steps:
  - name: Checkout repository
    uses: actions/checkout@v5
    with:
      persist-credentials: false
 ---
 # Memory Safety Analysis Report Generator
 ## Job Description
 Your name is ${{ github.workflow }}. You are an expert memory safety analyst for the Z3 theorem prover repository `${{ github.repository }}`. Your task is to download, analyze, and report on the results from the Memory Safety Analysis workflow, covering runtime sanitizer (ASan/UBSan) findings.
 ## Your Task
 ### 1. Download Artifacts from the Triggering Workflow Run
 If triggered by `workflow_run`, download the artifacts from the completed Memory Safety Analysis run:
 ```bash
 # Get the triggering run ID
 RUN_ID="${{ github.event.workflow_run.id }}"
 # If manual dispatch, find the latest Memory Safety Analysis run
 if [ -z "$RUN_ID" ] || [ "$RUN_ID" = "" ]; then
  echo "Manual dispatch — finding latest Memory Safety Analysis run..."
  gh run list --workflow="Memory Safety Analysis" --limit=1 --json databaseId --jq '.[0].databaseId'
 fi
 ```
 Download all artifacts:
 ```bash
 mkdir -p /tmp/reports
 gh run download "$RUN_ID" --dir /tmp/reports 2>&1 || echo "Some artifacts may not be available"
 ls -la /tmp/reports/
 ```
 ### 2. Analyze Sanitizer Reports
 Parse the ASan and UBSan report files:
 ```bash
 # Check ASan results
 if [ -d /tmp/reports/asan-reports ]; then
  cat /tmp/reports/asan-reports/summary.md
  ls /tmp/reports/asan-reports/
 fi
 # Check UBSan results
 if [ -d /tmp/reports/ubsan-reports ]; then
  cat /tmp/reports/ubsan-reports/summary.md
  ls /tmp/reports/ubsan-reports/
 fi
 ```
 For each sanitizer finding, extract:
 - **Error type** (heap-buffer-overflow, heap-use-after-free, stack-buffer-overflow, signed-integer-overflow, null-pointer-dereference, etc.)
 - **Source location** (file, line, column)
 - **Stack trace** (first 5 frames)
 - **Allocation/deallocation site** (for memory errors)
 ### 3. Compare with Previous Results
 Check cache memory for previous run results:
 - Total findings from last run (ASan + UBSan)
 - List of previously known issues
 - Identify new findings (regressions) vs. resolved findings (improvements)
 ### 4. Generate the Discussion Report
 Create a comprehensive GitHub Discussion with this structure:
 ```markdown
 # Memory Safety Analysis Report
 **Date**: YYYY-MM-DD
 **Commit**: `<short SHA>` on branch `<branch>`
 **Triggered by**: push / workflow_dispatch
 **Workflow Run**: [#<run_id>](link)
 ## Executive Summary
 | Category | ASan | UBSan | Total |
 |----------|------|-------|-------|
 | Buffer Overflow | Y | - | Z |
 | Use-After-Free | Y | - | Z |
 | Double-Free | Y | - | Z |
 | Null Dereference | - | - | Z |
 | Integer Overflow | - | Y | Z |
 | Undefined Behavior | - | Y | Z |
 | Other | Y | Z | Z |
 | **Total** | **Y** | **Z** | **N** |
 ## Trend
 - New findings since last run: N
 - Resolved since last run: N
 - Unchanged: N
 ## Critical Findings (Immediate Action Needed)
 [List any high-severity findings: buffer overflows, use-after-free, double-free]
 ## Important Findings (Should Fix)
 [List medium-severity: null derefs, integer overflows]
 ## Low-Severity / Informational
 [List warnings: potential issues]
 ## ASan Findings
 [Each finding with error type, location, and stack trace snippet]
 ## UBSan Findings
 [Each finding with error type, location, and explanation]
 ## Top Affected Files
 | File | Findings |
 |------|----------|
 | src/... | N |
 ## Recommendations
 1. [Actionable recommendations based on the findings]
 2. [Patterns to address]
 <details>
 <summary>Raw Data</summary>
 [Compressed summary of all data for future reference]
 </details>
 ```
 ### 5. Update Cache Memory
 Store the current run's results in cache memory for future comparison:
 - Total count by category
 - List of file:line pairs with findings
 - Run metadata (commit SHA, date, run ID)
 ### 6. Handle Edge Cases
 - If the triggering workflow failed entirely, report that analysis could not complete and include any partial results.
 - If no artifacts are available, report that and suggest running the workflow manually.
 - If zero findings across all tools, create a discussion noting the clean bill of health.
 ## Guidelines
 - **Be thorough**: Analyze every available artifact and log file.
 - **Be accurate**: Distinguish between ASan and UBSan findings.
 - **Be actionable**: For each finding, include enough context to locate and understand the issue.
 - **Track trends**: Use cache memory to identify regressions and improvements over time.
 - **Prioritize**: Critical memory safety issues (buffer overflow, UAF, double-free) should be prominently highlighted.
 ## Important Notes
 - **DO NOT** create pull requests or modify source files.
 - **DO NOT** attempt to fix the findings automatically.
 - **DO** close older Memory Safety discussions automatically (configured via `close-older-discussions: true`).
 - **DO** always report the commit SHA so findings can be correlated with specific code versions.
 - **DO** use cache memory to track trends over multiple runs.
--- a/.github/workflows/memory-safety.yml
+++ b/.github/workflows/memory-safety.yml
@ -0,0 +1,252 @@
 name: Memory Safety Analysis
 on:
  push:
    branches: ["**"]
    paths:
      - 'src/**'
      - '.github/workflows/memory-safety.yml'
  workflow_dispatch:
    inputs:
      full_scan:
        description: 'Run full codebase scan (not just changed files)'
        required: false
        default: 'false'
        type: boolean
 permissions:
  contents: read
  actions: read
 concurrency:
  group: memory-safety-${{ github.ref }}
  cancel-in-progress: true
 jobs:
  # ============================================================================
  # Job 1: AddressSanitizer Build and Tests
  # ============================================================================
  asan-test:
    name: "ASan Build & Test"
    runs-on: ubuntu-latest
    timeout-minutes: 120
    env:
      ASAN_OPTIONS: "detect_leaks=1:halt_on_error=0:print_stats=1:log_path=/tmp/asan"
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: sudo apt-get update && sudo apt-get install -y ninja-build clang
      - name: Configure with ASan
        run: |
          mkdir -p build-asan
          cd build-asan
          CC=clang CXX=clang++ cmake \
            -DCMAKE_BUILD_TYPE=Debug \
            -DCMAKE_C_FLAGS="-fsanitize=address -fno-omit-frame-pointer -fno-optimize-sibling-calls" \
            -DCMAKE_CXX_FLAGS="-fsanitize=address -fno-omit-frame-pointer -fno-optimize-sibling-calls" \
            -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=address" \
            -DCMAKE_SHARED_LINKER_FLAGS="-fsanitize=address" \
            -G Ninja ../
      - name: Build Z3 with ASan
        run: |
          cd build-asan
          ninja -j$(nproc)
          ninja test-z3
      - name: Run unit tests under ASan
        run: |
          cd build-asan
          ./test-z3 -a 2>&1 | tee /tmp/asan-unit-test.log
        continue-on-error: true
      - name: Run SMT-LIB2 benchmarks under ASan
        run: |
          cd build-asan
          for f in ../examples/SMT-LIB2/bounded\ model\ checking/*.smt2; do
            echo "=== Testing: $f ==="
            timeout 60 ./z3 "$f" 2>&1 || true
          done | tee /tmp/asan-benchmark.log
        continue-on-error: true
      - name: Run regression tests under ASan
        run: |
          git clone --depth=1 https://github.com/z3prover/z3test z3test
          python z3test/scripts/test_benchmarks.py build-asan/z3 z3test/regressions/smt2 2>&1 | tee /tmp/asan-regression.log
        continue-on-error: true
      - name: Collect ASan reports
        if: always()
        run: |
          mkdir -p /tmp/asan-reports
          cp /tmp/asan* /tmp/asan-reports/ 2>/dev/null || true
          if ls /tmp/asan.* 1>/dev/null 2>&1; then
            cp /tmp/asan.* /tmp/asan-reports/
          fi
          echo "# ASan Summary" > /tmp/asan-reports/summary.md
          echo "" >> /tmp/asan-reports/summary.md
          if ls /tmp/asan-reports/asan.* 1>/dev/null 2>&1; then
            echo "## Errors Found" >> /tmp/asan-reports/summary.md
            for f in /tmp/asan-reports/asan.*; do
              echo '```' >> /tmp/asan-reports/summary.md
              head -50 "$f" >> /tmp/asan-reports/summary.md
              echo '```' >> /tmp/asan-reports/summary.md
              echo "" >> /tmp/asan-reports/summary.md
            done
          else
            echo "No ASan errors detected." >> /tmp/asan-reports/summary.md
          fi
      - name: Upload ASan reports
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: asan-reports
          path: /tmp/asan-reports/
          retention-days: 30
  # ============================================================================
  # Job 2: UndefinedBehaviorSanitizer Build and Tests
  # ============================================================================
  ubsan-test:
    name: "UBSan Build & Test"
    runs-on: ubuntu-latest
    timeout-minutes: 120
    env:
      UBSAN_OPTIONS: "print_stacktrace=1:halt_on_error=0:log_path=/tmp/ubsan"
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: sudo apt-get update && sudo apt-get install -y ninja-build clang
      - name: Configure with UBSan
        run: |
          mkdir -p build-ubsan
          cd build-ubsan
          CC=clang CXX=clang++ cmake \
            -DCMAKE_BUILD_TYPE=Debug \
            -DCMAKE_C_FLAGS="-fsanitize=undefined -fno-omit-frame-pointer -fno-sanitize-recover=all" \
            -DCMAKE_CXX_FLAGS="-fsanitize=undefined -fno-omit-frame-pointer -fno-sanitize-recover=all" \
            -DCMAKE_EXE_LINKER_FLAGS="-fsanitize=undefined" \
            -DCMAKE_SHARED_LINKER_FLAGS="-fsanitize=undefined" \
            -G Ninja ../
      - name: Build Z3 with UBSan
        run: |
          cd build-ubsan
          ninja -j$(nproc)
          ninja test-z3
      - name: Run unit tests under UBSan
        run: |
          cd build-ubsan
          ./test-z3 -a 2>&1 | tee /tmp/ubsan-unit-test.log
        continue-on-error: true
      - name: Run SMT-LIB2 benchmarks under UBSan
        run: |
          cd build-ubsan
          for f in ../examples/SMT-LIB2/bounded\ model\ checking/*.smt2; do
            echo "=== Testing: $f ==="
            timeout 60 ./z3 "$f" 2>&1 || true
          done | tee /tmp/ubsan-benchmark.log
        continue-on-error: true
      - name: Run regression tests under UBSan
        run: |
          git clone --depth=1 https://github.com/z3prover/z3test z3test
          python z3test/scripts/test_benchmarks.py build-ubsan/z3 z3test/regressions/smt2 2>&1 | tee /tmp/ubsan-regression.log
        continue-on-error: true
      - name: Collect UBSan reports
        if: always()
        run: |
          mkdir -p /tmp/ubsan-reports
          cp /tmp/ubsan* /tmp/ubsan-reports/ 2>/dev/null || true
          if ls /tmp/ubsan.* 1>/dev/null 2>&1; then
            cp /tmp/ubsan.* /tmp/ubsan-reports/
          fi
          echo "# UBSan Summary" > /tmp/ubsan-reports/summary.md
          echo "" >> /tmp/ubsan-reports/summary.md
          if ls /tmp/ubsan-reports/ubsan.* 1>/dev/null 2>&1; then
            echo "## Errors Found" >> /tmp/ubsan-reports/summary.md
            for f in /tmp/ubsan-reports/ubsan.*; do
              echo '```' >> /tmp/ubsan-reports/summary.md
              head -50 "$f" >> /tmp/ubsan-reports/summary.md
              echo '```' >> /tmp/ubsan-reports/summary.md
              echo "" >> /tmp/ubsan-reports/summary.md
            done
          else
            echo "No UBSan errors detected." >> /tmp/ubsan-reports/summary.md
          fi
      - name: Upload UBSan reports
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: ubsan-reports
          path: /tmp/ubsan-reports/
          retention-days: 30
  # ============================================================================
  # Job 3: Summary Report
  # ============================================================================
  summary:
    name: "Memory Safety Summary"
    runs-on: ubuntu-latest
    needs: [asan-test, ubsan-test]
    if: always()
    steps:
      - name: Download all artifacts
        uses: actions/download-artifact@v4
        with:
          path: reports/
      - name: Generate summary
        run: |
          echo "# Memory Safety Analysis Report" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "**Commit**: \`${{ github.sha }}\`" >> $GITHUB_STEP_SUMMARY
          echo "**Branch**: \`${{ github.ref_name }}\`" >> $GITHUB_STEP_SUMMARY
          echo "**Trigger**: \`${{ github.event_name }}\`" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "## Job Results" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "| Analysis | Status |" >> $GITHUB_STEP_SUMMARY
          echo "|----------|--------|" >> $GITHUB_STEP_SUMMARY
          echo "| AddressSanitizer | \`${{ needs.asan-test.result }}\` |" >> $GITHUB_STEP_SUMMARY
          echo "| UndefinedBehaviorSanitizer | \`${{ needs.ubsan-test.result }}\` |" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          if [ -f reports/asan-reports/summary.md ]; then
            echo "## ASan Results" >> $GITHUB_STEP_SUMMARY
            cat reports/asan-reports/summary.md >> $GITHUB_STEP_SUMMARY
            echo "" >> $GITHUB_STEP_SUMMARY
          fi
          if [ -f reports/ubsan-reports/summary.md ]; then
            echo "## UBSan Results" >> $GITHUB_STEP_SUMMARY
            cat reports/ubsan-reports/summary.md >> $GITHUB_STEP_SUMMARY
            echo "" >> $GITHUB_STEP_SUMMARY
          fi
          echo "## Artifacts" >> $GITHUB_STEP_SUMMARY
          echo "" >> $GITHUB_STEP_SUMMARY
          echo "- Sanitizer logs are available as workflow artifacts" >> $GITHUB_STEP_SUMMARY
          echo "- Run with \`workflow_dispatch\` and \`full_scan: true\` for complete codebase analysis" >> $GITHUB_STEP_SUMMARY