mirror of
https://github.com/Z3Prover/z3
synced 2026-05-28 21:06:29 +00:00
Monthly agentic workflow that discovers community-contributed SMT-LIB benchmark repositories on GitHub, excluding official distributions from smtlib.org and Zenodo, and posts a categorised summary as a GitHub Discussion. ## Workflow steps - **Exclusion list** — scrapes `smtlib.cs.uiowa.edu` + paginates the [Zenodo SMT-LIB community](https://zenodo.org/communities/smt-lib/) (up to 300 records) to extract official repo URLs; hard-excludes `SMT-LIB/*` and `SMT-Competition/*` orgs - **GitHub search** — 7 search strategies (topic tags `smtlib`/`smt-lib`/`smt2`, filename/path patterns, README mentions) scoped to repos pushed since the last run (90-day window on first run) - **Filter + classify** — removes official sets and noise (homework repos, empty repos); categorises survivors into: solver evaluation, verification, security/CTF, theory/logic, tool output, education, other - **Discussion report** — posts `[SMT-LIB Benchmarks] Community Benchmark Repository Survey — [Month YYYY]`; older discussions auto-closed after 90 days - **Cache** — stores classified repo sets and run date so subsequent runs are incremental ## Files - `.github/workflows/smtlib-benchmark-finder.md` — workflow definition - `.github/workflows/smtlib-benchmark-finder.lock.yml` — compiled GitHub Actions YAML (`gh aw compile`) --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
342 lines
12 KiB
Markdown
342 lines
12 KiB
Markdown
---
|
||
description: >
|
||
Monthly SMTLIB Benchmark Finder.
|
||
Searches GitHub for repositories containing SMT-LIB benchmarks (.smt2 files),
|
||
excludes repositories that belong to the official SMT-LIB benchmark sets
|
||
(linked from smtlib.org and hosted on Zenodo), and posts a curated summary
|
||
of community-contributed benchmark links as a GitHub Discussion.
|
||
|
||
on:
|
||
schedule:
|
||
- cron: "0 8 1 * *"
|
||
workflow_dispatch:
|
||
|
||
timeout-minutes: 60
|
||
|
||
permissions: read-all
|
||
|
||
network:
|
||
allowed:
|
||
- defaults
|
||
- github
|
||
- smtlib.cs.uiowa.edu
|
||
- zenodo.org
|
||
|
||
tools:
|
||
cache-memory: true
|
||
web-fetch: {}
|
||
github:
|
||
toolsets: [default, repos]
|
||
bash: [":*"]
|
||
|
||
safe-outputs:
|
||
mentions: false
|
||
allowed-github-references: []
|
||
max-bot-mentions: 1
|
||
create-discussion:
|
||
title-prefix: "[SMT-LIB Benchmarks] "
|
||
category: "Agentic Workflows"
|
||
close-older-discussions: true
|
||
expires: 90d
|
||
missing-tool:
|
||
create-issue: true
|
||
noop:
|
||
report-as-issue: false
|
||
|
||
---
|
||
|
||
# SMTLIB Benchmark Finder
|
||
|
||
## Job Description
|
||
|
||
Your name is ${{ github.workflow }}. You are a research analyst for the Z3 theorem
|
||
prover repository `${{ github.repository }}`. Your mission is to discover GitHub
|
||
repositories that host SMT-LIB benchmarks, exclude the ones that are already part
|
||
of the official SMT-LIB benchmark distribution (linked from smtlib.org and published
|
||
on Zenodo), and post a curated summary of community-contributed benchmark links as a
|
||
GitHub Discussion.
|
||
|
||
## Step 1: Load Cache and Determine Run Mode
|
||
|
||
Check cache memory for:
|
||
- `official_repos`: set of GitHub repository full names (`owner/repo`) and Zenodo
|
||
record IDs already identified as official SMT-LIB benchmark sets
|
||
- `known_community_repos`: set of repo full names already listed in a previous report
|
||
- `last_run_date`: ISO-8601 date string of the previous run
|
||
|
||
Use the cache to skip repos already classified. On the very first run (no cache),
|
||
perform a full discovery pass. On subsequent runs focus on repos pushed or created
|
||
since `last_run_date`.
|
||
|
||
## Step 2: Collect Official SMT-LIB Benchmark Sets to Exclude
|
||
|
||
### 2.1 Scrape smtlib.org
|
||
|
||
Fetch the SMT-LIB benchmarks page and extract all linked Zenodo DOIs and GitHub URLs:
|
||
|
||
```bash
|
||
curl -s "https://smtlib.cs.uiowa.edu/benchmarks.shtml" -o /tmp/smtlib-benchmarks.html
|
||
# Also try the main page and any mirror
|
||
curl -s "https://smtlib.cs.uiowa.edu/" -o /tmp/smtlib-home.html
|
||
```
|
||
|
||
Parse both files for:
|
||
- Zenodo DOI links (`doi.org/10.5281/zenodo.*` or `zenodo.org/record/*`)
|
||
- GitHub repository URLs (`github.com/...`)
|
||
- Any other hosted benchmark archive links
|
||
|
||
### 2.2 Enumerate Zenodo SMT-LIB Community Records
|
||
|
||
Query the Zenodo API for all records in the SMT-LIB community:
|
||
|
||
```bash
|
||
curl -s "https://zenodo.org/api/records?communities=smt-lib&size=100&page=1" \
|
||
-o /tmp/zenodo-smtlib-page1.json
|
||
|
||
# Check if there are more pages (paginate until empty)
|
||
curl -s "https://zenodo.org/api/records?communities=smt-lib&size=100&page=2" \
|
||
-o /tmp/zenodo-smtlib-page2.json 2>/dev/null || true
|
||
|
||
curl -s "https://zenodo.org/api/records?communities=smt-lib&size=100&page=3" \
|
||
-o /tmp/zenodo-smtlib-page3.json 2>/dev/null || true
|
||
```
|
||
|
||
For each Zenodo record extract:
|
||
- Record ID (e.g. `5827900`)
|
||
- Title
|
||
- Any GitHub repository URLs listed in the description or related identifiers
|
||
|
||
```bash
|
||
python3 - <<'PYEOF'
|
||
import json, re
|
||
|
||
def extract_github_repos(text):
|
||
pattern = r'github\.com/([A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+)'
|
||
return set(re.findall(pattern, text or ''))
|
||
|
||
official_repos = set()
|
||
official_zenodo_ids = set()
|
||
|
||
for fname in ['/tmp/zenodo-smtlib-page1.json', '/tmp/zenodo-smtlib-page2.json',
|
||
'/tmp/zenodo-smtlib-page3.json']:
|
||
try:
|
||
data = json.load(open(fname))
|
||
except Exception:
|
||
continue
|
||
for hit in data.get('hits', {}).get('hits', []):
|
||
rid = str(hit.get('id', ''))
|
||
official_zenodo_ids.add(rid)
|
||
metadata = hit.get('metadata', {})
|
||
description = metadata.get('description', '')
|
||
related = ' '.join(
|
||
r.get('identifier', '')
|
||
for r in metadata.get('related_identifiers', [])
|
||
)
|
||
title = metadata.get('title', '')
|
||
for repo in extract_github_repos(description + ' ' + related + ' ' + title):
|
||
official_repos.add(repo.lower().rstrip('.'))
|
||
|
||
with open('/tmp/smtlib-benchmarks.html') as f:
|
||
html = f.read()
|
||
for repo in extract_github_repos(html):
|
||
official_repos.add(repo.lower().rstrip('.'))
|
||
|
||
print("OFFICIAL_ZENODO_IDS:", ','.join(sorted(official_zenodo_ids)) or '(none)')
|
||
print("OFFICIAL_GITHUB_REPOS:", ','.join(sorted(official_repos)) or '(none)')
|
||
PYEOF
|
||
```
|
||
|
||
### 2.3 Well-Known Official Repository Patterns
|
||
|
||
Regardless of the above scrape, always exclude:
|
||
- Any repo under the `SMT-LIB` GitHub organization (`SMT-LIB/*`)
|
||
- Any repo whose name matches `smt-comp-*` that is under `SMT-Competition` org
|
||
- The Z3 repo itself (`Z3Prover/z3`) and its forks
|
||
|
||
Combine all official sources into a single exclusion set stored at
|
||
`/tmp/official_exclusions.txt` (one `owner/repo` pattern per line, lowercase).
|
||
|
||
## Step 3: Search GitHub for Community SMT-LIB Benchmark Repositories
|
||
|
||
Use multiple GitHub search strategies to find repos containing `.smt2` benchmark
|
||
files that are NOT part of the official distribution. Search for repos updated
|
||
since the last run date (or the last 90 days for the initial run).
|
||
|
||
Compute the cutoff date first:
|
||
```bash
|
||
CUTOFF=$(date -d "90 days ago" +%Y-%m-%d 2>/dev/null || date -v-90d +%Y-%m-%d)
|
||
echo "Using cutoff date: $CUTOFF"
|
||
```
|
||
|
||
### Search Strategies
|
||
|
||
Use the GitHub MCP server tools to run these searches:
|
||
|
||
1. **Topic search**: `topic:smtlib pushed:>$CUTOFF`
|
||
2. **Topic search variant**: `topic:smt-lib pushed:>$CUTOFF`
|
||
3. **Topic search variant**: `topic:smt2 pushed:>$CUTOFF`
|
||
4. **Benchmark filename pattern**: `filename:*.smt2 pushed:>$CUTOFF` (limit to top results)
|
||
5. **Benchmarks directory pattern**: `path:benchmarks *.smt2 in:path pushed:>$CUTOFF`
|
||
6. **README mention**: `SMT-LIB benchmarks in:readme stars:>2 pushed:>$CUTOFF`
|
||
7. **Organization-level search**: repos under `SMT-Competition` org, if any are not already excluded
|
||
|
||
For each search, collect: `full_name`, `html_url`, `description`, `stargazers_count`,
|
||
`updated_at`, `pushed_at`, `default_branch`, `topics`.
|
||
|
||
Deduplicate by `full_name`. Limit to 200 total candidates before filtering.
|
||
|
||
## Step 4: Filter Out Official Benchmark Sets
|
||
|
||
For each candidate repo:
|
||
|
||
1. Check if `full_name.lower()` is in the exclusion set from Step 2.
|
||
2. Check if the repo is owned by `SMT-LIB` or `SMT-Competition` org (case-insensitive).
|
||
3. Check if the repo is a fork of a known official repo (if `fork: true` and parent is
|
||
in the exclusion set).
|
||
|
||
Discard repos that match any of the above. Keep the rest as community benchmarks.
|
||
|
||
Also apply quality filters to reduce noise — skip repos that:
|
||
- Have 0 stars and fewer than 3 `.smt2` files (likely a student homework or test repo
|
||
with minimal public value); use your judgement — if the repo description clearly
|
||
describes a research benchmark, keep it regardless of star count.
|
||
- Were created but never pushed after creation (empty repos).
|
||
- Have names that are clearly course assignment repositories
|
||
(e.g. contain `homework`, `assignment`, `hw[0-9]`, `cs[0-9]{3}`).
|
||
|
||
## Step 5: Classify Remaining Repos
|
||
|
||
For each repo that survives filtering, classify it into one of these categories:
|
||
|
||
| Category | Description |
|
||
|----------|-------------|
|
||
| **Solver evaluation** | Benchmarks used to evaluate or compare SMT solvers |
|
||
| **Verification** | Benchmarks from program verification or model checking |
|
||
| **Security / CTF** | Benchmarks from security research or CTF challenges |
|
||
| **Theory / logic** | Benchmarks exploring specific SMT theories or logics |
|
||
| **Tool output** | Benchmarks generated by another tool (e.g. a compiler, fuzzer) |
|
||
| **Education** | Course materials or tutorials with benchmark examples |
|
||
| **Other / unknown** | Does not fit another category |
|
||
|
||
Base the classification on: repo description, README (fetch if web-fetch is available),
|
||
topics, and directory structure. A single brief `web-fetch` of the repo's README is
|
||
sufficient; do not fetch individual `.smt2` files.
|
||
|
||
Note the dominant SMT logic(s) present, if discernible from the description or topics
|
||
(e.g. QF_BV, QF_LIA, QF_S, NIA, …).
|
||
|
||
## Step 6: Generate the Discussion Report
|
||
|
||
Create a GitHub Discussion. Use heading level 3 or deeper (`###`, `####`, …) for all
|
||
section headers; never use `##` or `#` in the body.
|
||
Wrap long tables in `<details>` tags to keep the report scannable.
|
||
|
||
Title: `[SMT-LIB Benchmarks] Community Benchmark Repository Survey — [Month YYYY]`
|
||
|
||
Structure the report as follows:
|
||
|
||
```markdown
|
||
**Period covered**: [cutoff date] – [today's date]
|
||
**Repositories found**: N community repos (after excluding M official sets)
|
||
**New this run**: N (not listed in previous report)
|
||
|
||
### Overview
|
||
|
||
1–2 sentences summarising the breadth of community SMT-LIB benchmarks found.
|
||
|
||
### Community Benchmark Repositories
|
||
|
||
Use `###` for category headers. Within each category, list repos as a markdown table.
|
||
For each repo include:
|
||
- Repo link (`[owner/repo](html_url)`)
|
||
- Stars
|
||
- Last pushed date
|
||
- Dominant logic(s) (if known)
|
||
- Brief description (from repo description or README, max 120 chars)
|
||
|
||
#### Solver Evaluation
|
||
|
||
| Repository | ⭐ | Last pushed | Logic(s) | Description |
|
||
|------------|-----|------------|---------|-------------|
|
||
| [owner/repo](url) | N | YYYY-MM-DD | QF_BV, QF_LIA | … |
|
||
|
||
#### Verification
|
||
|
||
[same table structure]
|
||
|
||
#### Security / CTF
|
||
|
||
[same table structure]
|
||
|
||
#### Theory / Logic
|
||
|
||
[same table structure]
|
||
|
||
#### Tool Output
|
||
|
||
[same table structure]
|
||
|
||
#### Education
|
||
|
||
[same table structure]
|
||
|
||
#### Other / Unknown
|
||
|
||
[same table structure]
|
||
|
||
---
|
||
|
||
### Exclusions Applied
|
||
|
||
<details>
|
||
<summary>Official SMT-LIB sets excluded from this report</summary>
|
||
|
||
List Zenodo record IDs and GitHub repos identified as official distributions.
|
||
|
||
| Source | Identifier | Notes |
|
||
|--------|-----------|-------|
|
||
| Zenodo | [10.5281/zenodo.XXXXXXX](https://zenodo.org/record/XXXXXXX) | Official QF_BV benchmark set |
|
||
| GitHub | [SMT-LIB/benchmarks-non-incremental](https://github.com/SMT-LIB/benchmarks-non-incremental) | |
|
||
|
||
</details>
|
||
|
||
---
|
||
|
||
### Methodology
|
||
|
||
Brief note on search queries used, cutoff date, and any quality filters applied.
|
||
```
|
||
|
||
## Step 7: Update Cache Memory
|
||
|
||
After posting the discussion, update cache memory with:
|
||
- `official_repos`: updated exclusion set (union of previous + newly found)
|
||
- `known_community_repos`: union of previous + repos listed in this report
|
||
- `last_run_date`: today's ISO-8601 date
|
||
- `report_url`: URL of the GitHub Discussion created
|
||
|
||
## Guidelines
|
||
|
||
- **Be conservative with exclusions**: when in doubt whether a repo is "official",
|
||
keep it in the community list rather than silently dropping it.
|
||
- **Be accurate**: only include repos that genuinely contain SMT-LIB `.smt2` files
|
||
or clearly describe themselves as SMT-LIB benchmark collections.
|
||
- **Avoid noise**: student homework repos and trivially small repos add clutter;
|
||
apply the quality filters from Step 4 judiciously.
|
||
- **No source code changes**: DO NOT create pull requests or modify any source files.
|
||
- **No copyrighted content**: DO NOT reproduce benchmark file contents; only post
|
||
links and metadata.
|
||
- **Always cite sources**: include the full GitHub URL for every listed repository.
|
||
- **Use cache**: skip repos already classified in a previous run to keep runtime short.
|
||
- **Fail gracefully**: if GitHub search rate-limits the workflow, post whatever was
|
||
collected so far with a note that the search was incomplete.
|
||
|
||
## Important Notes
|
||
|
||
- DO NOT create pull requests or modify source files.
|
||
- DO close older SMT-LIB Benchmarks discussions automatically (configured).
|
||
- DO always call `create_discussion` or `noop` before the workflow ends.
|
||
Failing to produce any safe output triggers an automatic failure issue.
|
||
- DO use cache memory to avoid re-processing repos already surveyed.
|
||
- DO limit individual `web-fetch` calls (README fetches) to repos where the
|
||
description alone is insufficient for classification.
|