From 578db6643d50f0401e12f1394cafbabcdfbb37b9 Mon Sep 17 00:00:00 2001 From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com> Date: Sat, 11 Apr 2026 00:54:14 +0000 Subject: [PATCH] Fix Issue Backlog Processor: limit batch size and require safe output - Add CRITICAL instruction requiring create-discussion or noop before finishing - Limit processing to 30 issues per run to avoid context exhaustion - Add lazy comment fetching with concrete criteria - Add batch cursor tracking in cache memory for pagination across runs - Add explicit MANDATORY note before create-discussion step - Improve guidelines with always-produce-output, batch processing rules Agent-Logs-Url: https://github.com/Z3Prover/z3/sessions/8c4e23ac-8562-423b-baf7-62986f23abe0 Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com> --- .github/workflows/issue-backlog-processor.md | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/.github/workflows/issue-backlog-processor.md b/.github/workflows/issue-backlog-processor.md index c7831cd62..0feeeaf6b 100644 --- a/.github/workflows/issue-backlog-processor.md +++ b/.github/workflows/issue-backlog-processor.md @@ -32,6 +32,8 @@ timeout-minutes: 60 Your name is ${{ github.workflow }}. You are an expert AI agent tasked with processing the backlog of open issues in the Z3 theorem prover repository `${{ github.repository }}`. Your mission is to analyze open issues systematically and help maintainers manage the backlog effectively by surfacing actionable insights and providing helpful comments. +> **CRITICAL**: You MUST call either `create-discussion` or `noop` before finishing, under all circumstances. Even if you only analyzed a small number of issues, always produce output. Never exit without calling one of these tools. + ## Your Task ### 1. 
Initialize or Resume Progress (Cache Memory) @@ -40,25 +42,28 @@ Check your cache memory for: - List of issue numbers already processed and commented on in previous runs - Issues previously flagged for closure, duplication, or merge - Date of last run +- The batch cursor: the last issue number processed (used for pagination across runs) If cache data exists: - Skip re-commenting on issues already commented in a recent run (within the last 4 days) - Re-evaluate previously flagged issues to see if their status has changed - Note any new issues that opened since the last run +- Resume from where the previous run left off (use the stored batch cursor) If this is the first run or memory is empty, initialize a fresh tracking structure. -### 2. Fetch Open Issues +### 2. Fetch Open Issues (Batched) -Use the GitHub API to list all open issues in the repository: -- Retrieve all open issues (paginate through all pages to get the full list) +Use the GitHub API to list open issues in the repository. 
**Process at most 30 issues per run** to stay within context limits (this limit reflects the typical size of Z3 issues, including body text and inline code snippets; process fewer when individual issues are unusually large): +- Retrieve one page (30 issues) of open issues - Exclude pull requests (filter where `pull_request` is not present) - Sort by last updated date (most recently updated first) +- If the cache has a batch cursor from the last run, fetch the next page after that cursor; otherwise start from the most recently updated issues - For each issue, collect: - Issue number, title, body, labels, author - Date created and last updated - Number of comments - - All comments (for issues with comments) + - **Do NOT fetch comments for every issue up front.** Only fetch comments for a specific issue when at least one of the following is true: the body mentions a version number (potential closure), the title contains words like "duplicate", "same as", or "related to" (potential duplicate), or the issue has labels such as "question", "help wanted", or "wontfix" (potential closure/status change). Fetch comments lazily, one issue at a time. - Any referenced pull requests, commits, or other issues ### 3. Analyze Each Issue @@ -110,6 +115,8 @@ Add a comment to an issue if you have **genuinely useful and specific informatio ### 4. Create a Discussion with Findings +**MANDATORY**: You MUST call `create-discussion` now, even if you only analyzed a few issues or found nothing actionable. If there is genuinely nothing to report, call `noop` instead. Do not skip this step. + Create a GitHub Discussion summarizing the analysis results.
**Title:** "[Issue Backlog] Backlog Analysis - [Date]" @@ -224,9 +231,13 @@ After completing the analysis, update cache memory with: - Issues flagged for closure, duplication, or merge - Date and timestamp of this run - Count of total issues analyzed +- Batch cursor: the issue number of the last issue processed in this run, so the next run can continue from where this one left off ## Guidelines +- **Always produce output**: You MUST call `create-discussion` or `noop` before finishing — never exit silently. If in doubt about whether there is enough to report, call `create-discussion` with a brief summary. +- **Batch processing**: Analyze at most 30 issues per run. Store a cursor in cache memory so each subsequent run picks up where the previous one left off. +- **Lazy comment fetching**: Do NOT bulk-fetch all comments for all issues. Only fetch comments for a specific issue when one of these criteria is met: the body mentions a version number, the title contains duplicate/related keywords, or the issue has status-relevant labels (e.g., "question", "help wanted", "wontfix"). - **Prioritize accuracy over coverage**: It is better to analyze 20 issues well than 200 issues poorly - **Be conservative on closures**: Incorrectly closing a valid issue is harmful; when in doubt, keep it open - **Respect the community**: Z3 is used by researchers, security engineers, and developers — treat all issues respectfully
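For reference, the lazy comment-fetching criteria this patch introduces can be sketched as a small predicate over the issue objects returned by the GitHub REST "list repository issues" endpoint (`title`, `body`, `labels` fields). This is an illustrative sketch, not part of the patch: the helper name `should_fetch_comments` and the version-number regex are assumptions for the example, and a real agent run would apply its own judgment on top of these heuristics.

```python
import re

# Labels the workflow treats as status-relevant (potential closure/status change).
STATUS_LABELS = {"question", "help wanted", "wontfix"}
# Title keywords the workflow treats as duplicate hints.
DUPLICATE_KEYWORDS = ("duplicate", "same as", "related to")
# Loose, assumed pattern for a version mention such as "4.12.2" or "4.8".
VERSION_RE = re.compile(r"\b\d+\.\d+(\.\d+)?\b")

def should_fetch_comments(issue: dict) -> bool:
    """Return True only when one of the workflow's lazy-fetch criteria holds."""
    body = issue.get("body") or ""
    title = (issue.get("title") or "").lower()
    labels = {label["name"].lower() for label in issue.get("labels", [])}
    return (
        bool(VERSION_RE.search(body))                     # version mention in body
        or any(kw in title for kw in DUPLICATE_KEYWORDS)  # duplicate-style title
        or bool(labels & STATUS_LABELS)                   # status-relevant label
    )
```

Gating each per-issue comments request on such a predicate keeps most of the 30-issue batch at one API call per page, fetching comment threads only for the issues the workflow may actually act on.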