Executive Summary
gh-aw-firewall is a highly mature agentic workflow repository, among the most advanced outside the gh-aw factory itself, with 28 agentic workflows spanning security, CI/CD, documentation, and multi-engine smoke testing. However, three actionable gaps stand out: no issue triage/labeling agent, no meta-agent to audit workflow health, and a missing Firewall Escape Test Agent that is referenced in the security review workflow but doesn't yet exist.
Patterns Learned from Pelis Agent Factory
From crawling the full Pelis blog series and exploring the githubnext/agentics reference repo, the key patterns are:
| Pattern | Description | Present here? |
|---|---|---|
| Specialization | Many focused workflows vs one monolithic agent | ✅ Yes (28 specialized workflows) |
| Multi-engine | Different AI models for different tasks | ✅ Yes (claude, codex, copilot) |
| Meta-agents | Agents that monitor other agents (Audit Workflows, Workflow Health Manager) | ❌ Missing |
| Cascade workflows | Issues → downstream PR chains via issue-monster | ⚠️ Partial (issue-monster exists) |
| Cache-memory | Cross-run persistent state (e.g., issue-duplication-detector) | ✅ Yes |
| skip-if-match | Preventing duplicate outputs | |
| Observability | Metrics Collector, Portfolio Analyst | ❌ Missing |
| Issue triage | Automated labeling + triage comments | ❌ Missing |
| Code quality agents | Continuous Simplicity, Refactoring, Style | ❌ Missing |
| Breaking change detection | Alerting on backward-incompatible changes | ❌ Missing |
| Daily malicious code scan | Supply chain defense | ❌ Missing |
Current Agentic Workflow Inventory
| Workflow | Purpose | Trigger | Engine | Assessment |
|---|---|---|---|---|
| build-test-{bun,cpp,deno,dotnet,go,java,node,rust} | Build & test PRs in 8 ecosystems | PR opened/sync | copilot | ✅ Excellent coverage |
| ci-cd-gaps-assessment | Daily CI/CD gap analysis | Schedule daily | copilot | ✅ Active, creating discussions |
| ci-doctor | Investigate CI failures, open issues | workflow_run failed | copilot | ✅ Core workflow |
| cli-flag-consistency-checker | Weekly CLI flag consistency check | Schedule weekly | copilot | ✅ Good hygiene |
| dependency-security-monitor | Daily CVE monitoring + dep PRs | Schedule daily | copilot | ✅ Very active (3 open PRs) |
| doc-maintainer | Daily docs sync with code changes | Schedule daily | copilot | ✅ Good coverage |
| issue-duplication-detector | Detect duplicate issues | Issue opened | copilot | ✅ Uses cache-memory |
| issue-monster | Dispatch issues to Copilot SWE agent | Issue opened + hourly | copilot | ✅ Core orchestrator |
| pelis-agent-factory-advisor | This workflow | Schedule daily | copilot | |
| plan | /plan slash command | Discussion/issue comment | copilot | ✅ Interactive |
| secret-digger-claude/codex/copilot | Hourly secret scanning (3 engines) | Hourly cron | all 3 | |
| security-guard | PR security review | PR opened/sync | claude | ✅ Excellent for this repo |
| security-review | Daily comprehensive security review | Schedule daily | copilot | ✅ Very thorough |
| smoke-{chroot,claude,codex,copilot} | End-to-end smoke tests | PR + schedule | all 3 + copilot | ✅ Multi-engine, excellent |
| test-coverage-improver | Weekly test coverage PRs | Schedule weekly | copilot | |
| update-release-notes | Enhance release notes on publish | Release published | copilot | ✅ Good |
Immediate Issues to Address
These are operational problems with existing workflows that need fixing now.
1. Two workflows are uncompiled (pelis-agent-factory-advisor, test-coverage-improver)
- These will not run because GitHub Actions executes the .lock.yml files, not the .md files
- Run gh aw compile .github/workflows/test-coverage-improver.md and gh aw compile .github/workflows/pelis-agent-factory-advisor.md, followed by the post-processing script
2. Duplicate discussions accumulating
- [CI/CD Assessment] has two open issues (#1113 "CI/CD Pipelines and Integration Tests Gap Assessment - March 2026" and #1109 "CI/CD Pipelines and Integration Tests Gap Assessment")
- [Security Review] has two (#1112 "Daily Security Review and Threat Modeling - 2026-03-01" and #1108 "Daily Security Review and Threat Modeling - 2026-02-28")
- [Pelis Agent Factory Advisor] has two (#1111 "Agentic Workflow Maturity Report - Mar 2026" and #1106 "Agentic Workflow Maturity Report - Feb 2026")
- The skip-if-match queries in these workflows may need tuning; discussion titles include date suffixes, which prevent deduplication
3. Secret Digger failing for Codex + Copilot engines (#1107, #1105)
- Of the three parallel hourly secret scanners, the codex and copilot variants are failing; investigate and fix
4. Three open Dependency PRs stacking without merge (#1114, #1110, #1104)
- dependency-security-monitor is creating PRs faster than they're being merged; consider adding auto-merge for patch-level safe updates or a stale-PR cleanup
Actionable Recommendations
P0: Implement Immediately
P0.1: Issue Triage Agent
What: Automatically label incoming issues with appropriate categories (bug, security, enhancement, documentation, question, good-first-issue)
Why: Currently 10 open issues have zero labels, making the issue tracker hard to navigate. The issue-monster dispatches issues but skips unlabeled/un-triaged ones. A triage agent feeds better quality issues into the cascade. From the factory: issue triage is the "hello world" of agentic workflows with immediate, clear value.
How: Add a new issue-triage.md workflow triggered on issues: [opened] with safe-outputs: add-labels and add-comment. Uses codebase context to label issues by analyzing title + body.
Effort: Low
---
on:
  issues:
    types: [opened]
permissions:
  issues: read
  contents: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, security, enhancement, documentation, question, good-first-issue, firewall, proxy, docker, ci]
  add-comment: {}
timeout-minutes: 5
---
# Issue Triage Agent
Analyze issue #${{ github.event.issue.number }} in ${{ github.repository }}...
P1: Plan for Near-Term
P1.1: Firewall Escape Test Agent
What: A dedicated daily agent that attempts to escape the AWF network firewall using known techniques and reports findings as a discussion
Why: The security-review.md workflow already references this agent ("Read the Firewall Escape Test Agent's Report"), but the agent doesn't exist, which leaves a gap in the security review pipeline. For a security firewall repository, continuous adversarial escape testing is uniquely domain-relevant. This workflow would try known bypass techniques (DNS tunneling, HTTP CONNECT abuse, IPv6 bypass, localhost tricks) and report on which ones are properly blocked.
How: A daily scheduled workflow using bash: true that runs actual awf commands with various bypass attempts inside the container, checks squid logs, and reports success/failure per technique.
Effort: Medium
Unique to this repo: No other repository type can benefit from this as directly as a network firewall tool. Each test run validates real security invariants.
P1.2: Workflow Health Monitor (Meta-Agent)
What: A weekly meta-agent that reviews all other agentic workflow runs and creates a health report with issues for unhealthy agents
Why: The factory learned that meta-agents are incredibly valuable. Currently there's no observability on the 28 workflows themselves β nobody is watching the watchers. The duplicate discussion problem (#1111/#1106, etc.) would be caught automatically. Secret Digger failures (#1107, #1105) linger as issues but there's no systematic health check.
How: Weekly scheduled workflow using agentic-workflows tool to inspect recent runs of all workflows, identify failure rates, duplicate outputs, and cost anomalies. Creates issues for unhealthy workflows.
Effort: Low to Medium
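The failure-rate part of such a health check could look roughly like the Python sketch below. It assumes run records have already been fetched as (workflow name, conclusion) pairs; the sample records, the `unhealthy` helper, and the 50% threshold are hypothetical and not taken from any existing workflow.

```python
from collections import Counter


def failure_rates(runs: list[tuple[str, str]]) -> dict[str, float]:
    """Map each workflow name to the fraction of its runs that failed."""
    totals: Counter = Counter()
    failures: Counter = Counter()
    for workflow, conclusion in runs:
        totals[workflow] += 1
        if conclusion == "failure":
            failures[workflow] += 1
    return {name: failures[name] / totals[name] for name in totals}


def unhealthy(runs: list[tuple[str, str]], threshold: float = 0.5) -> list[str]:
    """Workflows whose failure rate meets or exceeds the (assumed) threshold."""
    rates = failure_rates(runs)
    return sorted(name for name, rate in rates.items() if rate >= threshold)


# Hypothetical sample data for illustration only.
sample_runs = [
    ("secret-digger-codex", "failure"),
    ("secret-digger-codex", "failure"),
    ("secret-digger-claude", "success"),
    ("ci-doctor", "success"),
    ("ci-doctor", "failure"),
    ("ci-doctor", "success"),
]
print(unhealthy(sample_runs))
```

Each name returned by `unhealthy` would be a candidate for a "[Workflow Health]" issue; duplicate-output and cost checks would need their own inputs.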
---
on:
  schedule: weekly
tools:
  agentic-workflows:
  github:
    toolsets: [default, actions]
  cache-memory: true
safe-outputs:
  create-discussion:
    title-prefix: "[Workflow Health] "
  create-issue:
    title-prefix: "[Workflow Health] "
    labels: [agentic-workflows]
    max: 5
---
P1.3: Breaking Change Checker
What: On each PR, detect backward-incompatible CLI changes (removed flags, changed defaults, renamed options, Docker API changes)
Why: AWF is a distributed CLI tool consumed by users who script it. Breaking changes in --allow-domains semantics, flag names, or Docker compose configuration need early detection. The factory uses this pattern with a 100% causal chain merge rate. Recent PRs adding --build-local, changing --image-tag behavior, and adding API proxy ports are exactly the type of changes this catches.
How: PR-triggered workflow that diffs src/cli.ts, src/types.ts, and containers/ against base branch, identifies potentially breaking changes, and comments on the PR.
Effort: Low
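A minimal Python sketch of the flag-diffing step, assuming long options appear in the CLI source as string literals such as "--allow-domains". The regex and the two sample snippets are illustrative assumptions, not the actual contents of src/cli.ts.

```python
import re

# Assumption: long flags appear in the source text as literals like "--allow-domains".
FLAG_PATTERN = re.compile(r"--[a-z][a-z0-9-]*")


def extract_flags(source: str) -> set[str]:
    """Return the set of long-form flags mentioned in a source file's text."""
    return set(FLAG_PATTERN.findall(source))


def removed_flags(base_source: str, head_source: str) -> list[str]:
    """Flags present on the base branch but absent from the PR head."""
    return sorted(extract_flags(base_source) - extract_flags(head_source))


# Hypothetical before/after snippets standing in for the CLI definition on each branch.
base = '.option("--allow-domains <list>").option("--image-tag <tag>")'
head = '.option("--allow-domains <list>").option("--build-local")'
print(removed_flags(base, head))  # ['--image-tag']
```

A removed or renamed flag is only the simplest signal; changed defaults or altered semantics would still need the agent to read the diff itself.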
P2: Consider for Roadmap
P2.1: Daily Malicious Code Scan
What: Daily scan of recent commits for suspicious patterns: obfuscated code, unusual network calls, hardcoded credentials, suspicious shell commands
Why: AWF runs as root with NET_ADMIN capability and accesses docker.sock. A supply chain compromise here would be particularly dangerous. The factory runs this daily in gh-aw. For a security-critical tool, this defensive layer is especially important.
Effort: Low (based on existing secret-digger pattern, just different analysis focus)
P2.2: Sub Issue Closer
What: Automatically close sub-issues when parent issues are resolved
Why: As issue-monster creates more Copilot SWE agent tasks, sub-issue tracking will accumulate stale closed/merged items. From the factory: "keeps the issue tracker clean."
Effort: Low
P2.3: Changeset Generator
What: On merging to main, analyze commits since last release and auto-generate a PR with version bump + CHANGELOG entry
Why: update-release-notes improves notes after a release is published, but there's no automation for preparing releases. The factory's Changeset workflow had a 78% merge rate across 28 proposed PRs. Given AWF releases container images via GHCR, having well-tracked version bumps matters.
Effort: Medium
P2.4: Fix skip-if-match for Discussion-Creating Workflows
What: Update ci-cd-gaps-assessment, security-review, and pelis-agent-factory-advisor to use better deduplication to avoid accumulating stale duplicate discussions/issues
Why: There are currently 6 open duplicate issues in 3 pairs (#1113/#1109, #1112/#1108, #1111/#1106). The skip-if-match queries need to match on the title prefix regardless of the date suffix.
Effort: Low (just adjust the skip-if-match queries in each workflow)
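The effect of date suffixes on deduplication can be illustrated with a short Python sketch. It does not reproduce how skip-if-match evaluates its queries (that is not shown in this report); it only demonstrates grouping titles after stripping a trailing date. The suffix formats handled are assumptions based on the titles listed under Immediate Issues.

```python
import re
from collections import defaultdict

# Assumption: suffixes look like " - March 2026", " - Mar 2026" or " - 2026-03-01",
# with a hyphen, en dash, or em dash as the separator.
DATE_SUFFIX = re.compile(
    r"\s*[-\u2013\u2014]\s*(?:\d{4}-\d{2}-\d{2}|[A-Z][a-z]+ \d{4})\s*$"
)


def normalize_title(title: str) -> str:
    """Strip a trailing date suffix so recurring reports share one key."""
    return DATE_SUFFIX.sub("", title).strip()


def find_duplicates(open_items: dict[int, str]) -> dict[str, list[int]]:
    """Group item numbers by normalized title; keep groups with more than one item."""
    groups: dict[str, list[int]] = defaultdict(list)
    for number, title in sorted(open_items.items()):
        groups[normalize_title(title)].append(number)
    return {key: nums for key, nums in groups.items() if len(nums) > 1}


open_items = {
    1113: "[CI/CD Assessment] CI/CD Pipelines and Integration Tests Gap Assessment - March 2026",
    1109: "[CI/CD Assessment] CI/CD Pipelines and Integration Tests Gap Assessment",
    1112: "[Security Review] Daily Security Review and Threat Modeling - 2026-03-01",
    1108: "[Security Review] Daily Security Review and Threat Modeling - 2026-02-28",
}
for title, numbers in find_duplicates(open_items).items():
    print(title, numbers)
```

With the suffix stripped, each pair collapses to a single key, which is the behavior a prefix-only match would give.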
P3: Future Ideas
P3.1: Portfolio Analyst (Token Cost Optimizer)
What: Weekly analysis of workflow token usage and costs across all 28 workflows, identifying expensive agents and optimization opportunities
Why: With 28 workflows running daily/hourly/weekly, token costs accumulate. The factory found some agents were "way too chatty" with LLM calls. Secret-digger alone runs 3× per hour.
Effort: Low (read-only analysis)
P3.2: Weekly Issue & PR Summary
What: Weekly digest of repository activity (open issues, PR status, workflow health), posted as a discussion
Why: With automated agents creating many issues/PRs, maintainers need a curated weekly digest to stay informed without reading every individual output.
Effort: Low
P3.3: Contribution Guidelines Checker
What: On new PRs from external contributors, check that contribution guidelines (conventional commits, scope, PR title format) are followed and comment with guidance
Why: AWF enforces strict conventional commits (with a limited scope allowlist: cli, docker, squid, proxy, ci, deps). External contributors frequently get PR title check failures. An early-comment agent reduces frustration.
Effort: Low
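The title check such an agent would explain to contributors can be sketched in Python as follows. The scope allowlist is the one listed above; the set of allowed types and the exact title pattern are assumptions, since this report does not state the repository's real rules.

```python
import re

# Scope allowlist as listed in this report.
ALLOWED_SCOPES = {"cli", "docker", "squid", "proxy", "ci", "deps"}
# Assumption: common conventional-commit types; the repo's real list may differ.
ALLOWED_TYPES = {"feat", "fix", "docs", "chore", "refactor", "test", "ci"}

TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[a-z0-9-]+)\))?!?: (?P<subject>\S.*)$"
)


def check_pr_title(title: str) -> list[str]:
    """Return human-readable problems; an empty list means the title passes."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        return ["title must look like 'type(scope): subject'"]
    problems = []
    commit_type = match.group("type")
    scope = match.group("scope")
    if commit_type not in ALLOWED_TYPES:
        problems.append(f"unknown type '{commit_type}'")
    if scope is not None and scope not in ALLOWED_SCOPES:
        problems.append(f"scope '{scope}' is not in the allowlist")
    return problems


print(check_pr_title("fix(squid): handle empty allowlist"))
print(check_pr_title("fix(network): handle empty allowlist"))
print(check_pr_title("Update README"))
```

Returning a list of problems rather than a boolean lets the agent turn each entry into a line of guidance in its PR comment.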
Maturity Assessment
Current Level: 4/5 (Advanced Factory)
This is one of the most sophisticated agentic workflow setups outside the gh-aw factory itself. Strengths:
- ✅ 28 agentic workflows (26 of them compiled) across all major categories
- ✅ Multi-engine support (Claude, Codex, Copilot)
- ✅ Domain-specific workflows (security-guard, smoke tests, secret-digger × 3)
- ✅ Good cascade design (ci-doctor → issues → issue-monster → PRs)
- ✅ Cache-memory usage for stateful agents
Target Level: 4.5/5 (add meta-monitoring and triage)
Gap Analysis:
- Add issue triage (P0): improves issue quality entering the issue-monster cascade
- Add workflow health monitor (P1): closes the observability gap for 28 workflows
- Fix uncompiled workflows (operational): pelis-advisor and test-coverage-improver aren't running
- Build the escape test agent (P1): unique to this repo's security mission
Comparison with Best Practices
| Best Practice | This Repo | Notes |
|---|---|---|
| Issue triage | ❌ | Missing; all auto-created issues unlabeled |
| Fault investigation | ✅ | ci-doctor is excellent |
| Security compliance | ✅✅ | Above average: security-guard, security-review, secret-digger × 3 |
| Documentation sync | ✅ | doc-maintainer + cli-flag-consistency-checker |
| Meta-agent monitoring | ❌ | No workflow health manager or audit workflows |
| Release automation | ⚠️ | update-release-notes exists but no changeset generation |
| Code quality agents | ❌ | No simplicity/refactoring/style agents |
| Interactive/ChatOps | ✅ | /plan slash command |
| Multi-engine testing | ✅✅ | Unique strength: smoke tests on 4 configs |
| Observability/metrics | ❌ | No portfolio analyst or metrics collector |
What this repo does uniquely well: The triple-engine secret digger (running hourly on claude/codex/copilot) and the four-way smoke testing matrix are standout patterns not seen in the factory itself. The security-guard PR reviewer using Claude is particularly well-suited to this security-critical codebase.
Domain opportunity: A Firewall Escape Test Agent is uniquely valuable here, since no other repository type can leverage this pattern. It would turn the firewall into its own test subject, continuously verifying security invariants.
Generated by Pelis Agent Factory Advisor · 2026-03-02
Note: This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
Tip: Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.
Expires on Mar 9, 2026, 3:25 AM UTC