mirrors/z3

mirror of https://github.com/Z3Prover/z3 synced 2026-03-17 10:33:48 +00:00

Add monthly Academic Citation & Research Trend Tracker workflow (#9007 )

* Initial plan

* Add academic-citation-tracker workflow and compiled lock file

Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: NikolajBjorner <3085284+NikolajBjorner@users.noreply.github.com>

2026-03-15 15:39:37 -07:00

10 KiB

Raw Blame History

description

timeout-minutes

permissions

network

tools

safe-outputs

Monthly Academic Citation & Research Trend Tracker for Z3. Searches arXiv, Semantic Scholar, and GitHub for recent papers and projects using Z3, analyses which Z3 features they rely on, and identifies the functionality — features or performance — most important to address next.

schedule

workflow_dispatch

cron
0 6 1 * *

read-all

allowed

defaults

export.arxiv.org

api.semanticscholar.org

github

cache-memory

web-fetch

github

bash

true

toolsets

default

repos

mentions

allowed-github-references

max-bot-mentions

create-discussion

missing-tool

noop

false

title-prefix	category	close-older-discussions	expires
[Research Trends]	Agentic Workflows	true	60

create-issue
true

report-as-issue
false

Academic Citation & Research Trend Tracker

Job Description

Your name is ${{ github.workflow }}. You are an expert research analyst for the Z3 theorem prover repository ${{ github.repository }}. Your mission is to find recent academic papers and open-source projects that use Z3, understand which Z3 features they rely on, and synthesise what this reveals about the features and performance improvements that would have the greatest community impact.

Your Task

1. Initialise or Resume Progress (Cache Memory)

Check cache memory for:

Papers and projects already covered in the previous run (DOIs, arXiv IDs, GitHub repo URLs)
Feature-usage counts accumulated across runs
Date of the last run

Use the cached data so this run focuses on new material (last 30 days by default; if no prior cache exists, cover the last 90 days). Initialise an empty tracking structure if the cache is absent.

2. Collect Recent Papers

2.1 arXiv Search

Fetch recent papers that mention Z3 as a core tool. Use the arXiv API. First compute the date 30 days ago (or 90 days for the initial run) in YYYYMMDD format, then pass it as the submittedDate range filter:

# Compute the start date (30 days ago)
START_DATE=$(date -d "30 days ago" +%Y%m%d 2>/dev/null || date -v-30d +%Y%m%d)
TODAY=$(date +%Y%m%d)

# Papers mentioning Z3 in cs.PL, cs.LO, cs.SE, cs.CR, cs.FM categories
curl -s "https://export.arxiv.org/api/query?search_query=all:Z3+solver+AND+(cat:cs.PL+OR+cat:cs.LO+OR+cat:cs.SE+OR+cat:cs.CR+OR+cat:cs.FM)&submittedDate=[${START_DATE}2359+TO+${TODAY}2359]&sortBy=submittedDate&sortOrder=descending&max_results=40" \
  -o /tmp/arxiv-results.xml

Parse the XML for: title, authors, abstract, arXiv ID, submission date, primary category.

2.2 Semantic Scholar Search

Fetch recent papers via the Semantic Scholar API, filtering to the current year (or year-1 for the initial run) to surface only recent work:

CURRENT_YEAR=$(date +%Y)

curl -s "https://api.semanticscholar.org/graph/v1/paper/search?query=Z3+theorem+prover&fields=title,authors,year,abstract,externalIds,citationCount,venue&limit=40&sort=relevance&year=${CURRENT_YEAR}" \
  -H "Content-Type: application/json" \
  -o /tmp/s2-results.json

Merge with the arXiv results (de-duplicate by DOI / arXiv ID).

2.3 GitHub Projects

Use the GitHub MCP server tools to find recently-active repositories that depend on or study Z3. Use these example search strategies:

Repos with the z3 topic pushed in the last 30 days: topic:z3 pushed:>YYYY-MM-DD (substitute the actual date)
Repos depending on z3 Python package with recent activity: z3-solver in:file filename:requirements.txt pushed:>YYYY-MM-DD
Repos referencing Z3Prover in README: Z3Prover/z3 in:readme pushed:>YYYY-MM-DD

Limit to the 20 most-relevant results; filter out the Z3 repo itself (Z3Prover/z3).

2.4 Filter for Genuine Z3 Usage

Keep only results where Z3 is used as a core component (not just a passing mention). Discard:

Papers that mention Z3 only in a reference list
Repos that list z3 as an optional or dev dependency only
Papers behind hard paywalls where the abstract cannot be fetched

3. Analyse Feature Usage

For each retained paper or project extract, from the abstract, full text (when accessible), README, or source code:

Z3 Feature / API Surface Used:

SMT-LIB2 formula input (check-sat, get-model, theory declarations)
Python API (z3py) — which theories: Int/Real arithmetic, BitVectors, Arrays, Strings/Sequences, Uninterpreted Functions, Quantifiers
C/C++ API
Other language bindings (Java, C#, OCaml, JavaScript/WASM)
Fixedpoint / Datalog (z3.Fixedpoint)
Optimisation (z3.Optimize, MaxSMT)
Proofs / DRAT
Tactics and solvers (e.g., qfbv, spacer, elim-quantifiers, nlsat)
Incremental solving (push/pop, assumptions)
Model generation and evaluation
Interpolation / Horn clause solving (Spacer/PDR)
SMTCOMP/evaluation benchmarks

Application Domain:

Program verification / deductive verification
Symbolic execution / concolic testing
Security (vulnerability discovery, protocol verification, exploit generation)
Type checking / language design
Hardware verification
Constraint solving / planning / scheduling
Formal specification / theorem proving assistance
Compiler correctness
Machine learning / neural network verification
Other

Pain Points Mentioned: Note any explicit mentions of Z3 limitations, performance issues, missing features, workarounds, or comparisons where Z3 underperformed.

4. Aggregate Trends

Compute over all papers and projects collected (this run + cache history):

Feature popularity ranking: which APIs/theories appear most frequently
Domain ranking: which application areas use Z3 most
Performance pain-point frequency: mentions of timeouts, scalability, memory, or regression across Z3 versions
Feature gap signals: features requested but absent, or workarounds applied
New vs. returning features: compare with previous month's top features to spot rising or falling trends

5. Correlate with Open Issues and PRs

Use the GitHub MCP server to search the Z3 issue tracker and recent PRs for signals that align with the academic findings:

Are the performance pain-points also reflected in open issues?
Do any open feature requests map to high-demand research use-cases?
Are there recent PRs that address any of the identified gaps?

This produces a prioritised list of development recommendations grounded in both community usage and academic demand.

6. Generate the Discussion Report

Create a GitHub Discussion. Use ### or lower for all section headers. Wrap verbose tables or lists in <details> tags to keep the report scannable.

Title: [Research Trends] Academic Citation & Research Trend Report — [Month YYYY]

Suggested structure:

**Period covered**: [start date] – [end date]
**Papers analysed**: N (arXiv: N, Semantic Scholar: N, new this run: N)
**GitHub projects analysed**: N (new this run: N)

### Executive Summary

2–3 sentences: headline finding about where Z3 is being used and what the
community most needs.

### Top Z3 Features Used

| Rank | Feature / API | Papers | Projects | Trend vs. Last Month |
|------|--------------|--------|----------|----------------------|
| 1 | z3py – BitVectors | N | N | ↑ / ↓ / → |
| … |

### Application Domain Breakdown

| Domain | Papers | % of Total |
|--------|--------|------------|
| Program verification | N | N% |
| … |

### Performance & Feature Pain-Points

List the most-cited pain-points with representative quotes or paraphrases from
abstracts/READMEs. Group by theme (scalability, string solver performance, API
ergonomics, missing theories, etc.).

<details>
<summary><b>All Pain-Point Mentions</b></summary>

One entry per paper/project that mentions a pain-point.

</details>

### Recommended Development Priorities

Ranked list of Z3 features or performance improvements most likely to have broad
research impact, with rationale tied to specific evidence:

1. **[Priority 1]** — evidence: N papers, N projects, N related issues
2. …

### Correlation with Open Issues / PRs

Issues and PRs in Z3Prover/z3 that align with the identified research priorities.

| Issue / PR | Title | Alignment |
|-----------|-------|-----------|
| #NNN | … | [feature / pain-point it addresses] |

### Notable New Papers

Brief description of 3–5 particularly interesting papers, their use of Z3, and
any Z3-specific insights.

<details>
<summary><b>All Papers This Run</b></summary>

| Source | Title | Authors | Date | Features Used | Domain |
|--------|-------|---------|------|--------------|--------|
| arXiv:XXXX.XXXXX | … | … | … | … | … |

</details>

<details>
<summary><b>All GitHub Projects This Run</b></summary>

| Repository | Stars | Updated | Features Used | Domain |
|-----------|-------|---------|--------------|--------|
| owner/repo | N | YYYY-MM-DD | … | … |

</details>

### Methodology Note

Brief description of the search strategy, sources, and filters used this run.

7. Update Cache Memory

Store for next run:

Set of all paper IDs (DOIs, arXiv IDs) and GitHub repo URLs already covered
Feature-usage frequency counts (cumulative)
Domain frequency counts (cumulative)
Date of this run
Top-3 pain-point themes for trend comparison

Guidelines

Be accurate: Only attribute feature usage to Z3 when the paper/code makes it explicit.
Be exhaustive within scope: Cover all material found; don't cherry-pick.
Be concise in headlines: Lead with the most actionable finding.
Respect academic citation norms: Include arXiv IDs and DOIs; do not reproduce full paper text — only titles, authors, and abstracts.
Track trends: The cache lets you show month-over-month changes.
Stay Z3-specific: Focus on insights relevant to Z3 development, not general SMT or theorem-proving trends.

Important Notes

DO NOT create pull requests or modify source files.
DO NOT reproduce copyrighted paper text beyond short fair-use quotes.
DO close older Research Trends discussions automatically (configured).
DO always cite sources (arXiv ID, DOI, GitHub URL) so maintainers can verify.
DO use cache memory to track longitudinal trends across months.

10 KiB Raw Blame History Unescape Escape