From d29a1ebd38113e77915abb8cc917c54f05bb0a94 Mon Sep 17 00:00:00 2001
From: Copilot <198982749+Copilot@users.noreply.github.com>
Date: Sat, 11 Apr 2026 11:16:00 -0700
Subject: [PATCH] Fix Ostrich Benchmark workflow: allow NuGet and guarantee safe output (#9267)

* Initial plan

* fix ostrich-benchmark: add api.nuget.org to network allowlist and ensure safe output is always produced

Agent-Logs-Url: https://github.com/Z3Prover/z3/sessions/7eb3a93e-e81b-4b79-b84b-080a7bacfec0
Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>
Co-authored-by: Nikolaj Bjorner
---
 .github/workflows/ostrich-benchmark.lock.yml |  2 +-
 .github/workflows/ostrich-benchmark.md       | 16 +++++++++++++++-
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/ostrich-benchmark.lock.yml b/.github/workflows/ostrich-benchmark.lock.yml
index 14a1a177e..e8585ca99 100644
--- a/.github/workflows/ostrich-benchmark.lock.yml
+++ b/.github/workflows/ostrich-benchmark.lock.yml
@@ -90,7 +90,7 @@ jobs:
 GH_AW_INFO_EXPERIMENTAL: "false"
 GH_AW_INFO_SUPPORTS_TOOLS_ALLOWLIST: "true"
 GH_AW_INFO_STAGED: "false"
-GH_AW_INFO_ALLOWED_DOMAINS: '["defaults"]'
+GH_AW_INFO_ALLOWED_DOMAINS: '["defaults","api.nuget.org"]'
 GH_AW_INFO_FIREWALL_ENABLED: "true"
 GH_AW_INFO_AWF_VERSION: "v0.25.18"
 GH_AW_INFO_AWMG_VERSION: ""
diff --git a/.github/workflows/ostrich-benchmark.md b/.github/workflows/ostrich-benchmark.md
index 6e7c450e4..22cbc4cff 100644
--- a/.github/workflows/ostrich-benchmark.md
+++ b/.github/workflows/ostrich-benchmark.md
@@ -8,7 +8,10 @@ on:
 
 permissions: read-all
 
-network: defaults
+network:
+  allowed:
+    - defaults
+    - api.nuget.org
 
 tools:
   bash: true
@@ -402,3 +405,14 @@ Post the Markdown report as a new GitHub Discussion using the `create-discussion
 - **Handle build failures gracefully**: If Z3 fails to build, report the error and create a brief discussion noting the build failure. If ZIPT fails to build, continue with only the seq/nseq columns and note `n/a` for ZIPT results.
 - **Large report**: Always put the per-file table in a `<details>` collapsible section since there may be many files.
 - **Progress logging**: Print a line per file as you run it (e.g., `[N] [filename] seq=...`) so the workflow log shows progress even for large benchmark sets.
+
+## Safe Output Guarantee
+
+You **MUST** call either `create_discussion` or `noop` before the workflow ends, regardless of what happened during execution:
+
+- **Build succeeded, benchmarks ran**: Call `create_discussion` with the full report.
+- **Build succeeded, benchmarks partially ran**: Call `create_discussion` with whatever results were collected and a note about what could not be completed.
+- **Z3 build failed**: Call `noop` with a brief message describing the build error.
+- **No benchmarks could be run**: Call `noop` with a summary of what failed and why.
+
+Failing to produce any safe output triggers an automatic workflow-failure issue that clutters the repository.
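
The four cases in the Safe Output Guarantee section amount to a small decision table. The sketch below is only an illustration of that table, not part of the workflow: `RunState`, its fields, and `choose_safe_output` are hypothetical names, and only the `create_discussion` and `noop` tool names come from the patch. It assumes the outcome of a run can be summarized by whether Z3 built and how many benchmark files completed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RunState:
    """Hypothetical summary of one benchmark run (not defined by the workflow)."""
    z3_built: bool
    files_total: int
    files_completed: int


def choose_safe_output(state: RunState) -> Tuple[str, str]:
    """Return (tool_name, note) following the four cases listed in the patch.

    Every branch returns a tool name, so some safe output is always chosen.
    """
    if not state.z3_built:
        # "Z3 build failed": noop with a brief description of the error.
        return ("noop", "Z3 build failed")
    if state.files_completed == 0:
        # "No benchmarks could be run": noop with a summary of what failed.
        return ("noop", "no benchmarks could be run")
    if state.files_completed < state.files_total:
        # "Benchmarks partially ran": report what was collected.
        return ("create_discussion", "partial results")
    # "Build succeeded, benchmarks ran": full report.
    return ("create_discussion", "full report")
```

For example, `choose_safe_output(RunState(z3_built=True, files_total=50, files_completed=12))` selects `create_discussion` with a partial-results note. Because the function has no path that returns nothing, it mirrors the requirement that one of the two tools is called in every outcome.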