[Pelis Agent Factory Advisor] Agentic Workflow Maturity Report — Mar 2026 #1115

## 📊 Executive Summary

`gh-aw-firewall` is a **highly mature** agentic workflow repository, among the most advanced outside the gh-aw factory itself, with **28 agentic workflows** (26 compiled, 2 awaiting compilation) spanning security, CI/CD, documentation, and multi-engine smoke testing. Three actionable gaps stand out: no **issue triage/labeling** agent, no **meta-agent** to audit workflow health, and no **Firewall Escape Test Agent**, which the security review workflow references but which does not yet exist.

---

## 🎓 Patterns Learned from Pelis Agent Factory

From crawling the full Pelis blog series and exploring the `githubnext/agentics` reference repo, the key patterns are:

| Pattern | Description | Present here? |
|---------|-------------|---------------|
| **Specialization** | Many focused workflows vs one monolithic agent | ✅ Yes — 28 specialized workflows |
| **Multi-engine** | Different AI models for different tasks | ✅ Yes — claude, codex, copilot |
| **Meta-agents** | Agents that monitor other agents (Audit Workflows, Workflow Health Manager) | ❌ Missing |
| **Cascade workflows** | Issues → downstream PR chains via issue-monster | ✅ Partial — issue-monster exists |
| **Cache-memory** | Cross-run persistent state (e.g., issue-duplication-detector) | ✅ Yes |
| **skip-if-match** | Preventing duplicate outputs | ⚠️ Partially broken — duplicates observed |
| **Observability** | Metrics Collector, Portfolio Analyst | ❌ Missing |
| **Issue triage** | Automated labeling + triage comments | ❌ Missing |
| **Code quality agents** | Continuous Simplicity, Refactoring, Style | ❌ Missing |
| **Breaking change detection** | Alerting on backward-incompatible changes | ❌ Missing |
| **Daily malicious code scan** | Supply chain defense | ❌ Missing |

---

## 📋 Current Agentic Workflow Inventory

| Workflow | Purpose | Trigger | Engine | Assessment |
|----------|---------|---------|--------|------------|
| `build-test-{bun,cpp,deno,dotnet,go,java,node,rust}` | Build & test PRs in 8 ecosystems | PR opened/sync | copilot | ✅ Excellent coverage |
| `ci-cd-gaps-assessment` | Daily CI/CD gap analysis | Schedule daily | copilot | ✅ Active, creating discussions |
| `ci-doctor` | Investigate CI failures, open issues | workflow_run failed | copilot | ✅ Core workflow |
| `cli-flag-consistency-checker` | Weekly CLI flag consistency check | Schedule weekly | copilot | ✅ Good hygiene |
| `dependency-security-monitor` | Daily CVE monitoring + dep PRs | Schedule daily | copilot | ✅ Very active (3 open PRs) |
| `doc-maintainer` | Daily docs sync with code changes | Schedule daily | copilot | ✅ Good coverage |
| `issue-duplication-detector` | Detect duplicate issues | Issue opened | copilot | ✅ Uses cache-memory |
| `issue-monster` | Dispatch issues to Copilot SWE agent | Issue opened + hourly | copilot | ✅ Core orchestrator |
| `pelis-agent-factory-advisor` | This workflow | Schedule daily | copilot | ⚠️ **UNCOMPILED** |
| `plan` | `/plan` slash command | Discussion/issue comment | copilot | ✅ Interactive |
| `secret-digger-claude/codex/copilot` | Hourly secret scanning (3 engines) | Hourly cron | all 3 | ⚠️ Codex + Copilot failing |
| `security-guard` | PR security review | PR opened/sync | claude | ✅ Excellent for this repo |
| `security-review` | Daily comprehensive security review | Schedule daily | copilot | ✅ Very thorough |
| `smoke-{chroot,claude,codex,copilot}` | End-to-end smoke tests | PR + schedule | all 3 + copilot | ✅ Multi-engine, excellent |
| `test-coverage-improver` | Weekly test coverage PRs | Schedule weekly | copilot | ⚠️ **UNCOMPILED** |
| `update-release-notes` | Enhance release notes on publish | Release published | copilot | ✅ Good |

---

## 🚨 Immediate Issues to Address

> These are **operational problems** with existing workflows that need fixing now.

**1. Two workflows are uncompiled (`pelis-agent-factory-advisor`, `test-coverage-improver`)**
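- These will not run as written: GitHub Actions executes the compiled `.lock.yml` files, not the `.md` sources, so a missing or stale lock file means the workflow either does not run or runs an outdated definition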
- Run `gh aw compile .github/workflows/test-coverage-improver.md` and `gh aw compile .github/workflows/pelis-agent-factory-advisor.md` followed by the post-processing script
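
A check like the one below could catch uncompiled workflows before they go unnoticed. It is a minimal sketch, assuming each `name.md` source compiles to a sibling `name.lock.yml`; the helper name `find_uncompiled` and the sample file list are illustrative, not part of the repository.

```python
def find_uncompiled(workflow_files):
    """Return workflow .md sources that have no sibling .lock.yml file."""
    names = set(workflow_files)
    uncompiled = []
    for name in sorted(names):
        if not name.endswith(".md"):
            continue
        # Assumed convention: foo.md compiles to foo.lock.yml in the same directory.
        lock_name = name[: -len(".md")] + ".lock.yml"
        if lock_name not in names:
            uncompiled.append(name)
    return uncompiled


# Hypothetical directory listing used only for illustration.
files = [
    ".github/workflows/ci-doctor.md",
    ".github/workflows/ci-doctor.lock.yml",
    ".github/workflows/test-coverage-improver.md",
]
print(find_uncompiled(files))
```

Markdown files that are never meant to be compiled on their own (for example shared includes or a README in the same directory) would show up as false positives, so a real check would likely need an exclusion list.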

**2. Duplicate discussions accumulating**
- `[CI/CD Assessment]` has two open issues (#1113 and #1109), `[Security Review]` has two (#1112 and #1108), and `[Pelis Agent Factory Advisor]` has two (#1111 and #1106)
- The `skip-if-match` queries in these workflows may need tuning; the generated titles include date suffixes, which prevent deduplication

**3. Secret Digger failing for Codex + Copilot engines (#1107, #1105)**
- Three parallel hourly secret scanners — the codex and copilot variants are failing; investigate and fix

**4. Three open Dependency PRs stacking without merge (#1114, #1110, #1104)**
- `dependency-security-monitor` is creating PRs faster than they're being merged; consider adding auto-merge for patch-level safe updates or a stale-PR cleanup
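
One way to decide which dependency PRs qualify for auto-merge is to compare the old and new version numbers. The sketch below assumes plain `MAJOR.MINOR.PATCH` versions and treats only a patch increment as eligible; `is_patch_update` is a hypothetical helper, and pre-release or build-metadata versions are rejected rather than guessed at.

```python
def parse_version(version):
    """Parse 'MAJOR.MINOR.PATCH' (with an optional leading 'v') into ints."""
    parts = version.lstrip("v").split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"not a plain semantic version: {version!r}")
    return tuple(int(part) for part in parts)


def is_patch_update(old, new):
    """True when only the patch component increases."""
    old_v, new_v = parse_version(old), parse_version(new)
    return old_v[:2] == new_v[:2] and new_v[2] > old_v[2]


print(is_patch_update("1.2.3", "1.2.4"))  # True
print(is_patch_update("1.2.3", "1.3.0"))  # False
```

Anything that raises `ValueError` or returns `False` would stay in the manual review queue.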

---

## 🚀 Actionable Recommendations

### P0 — Implement Immediately

#### P0.1: Issue Triage Agent

**What**: Automatically label incoming issues with appropriate categories (`bug`, `security`, `enhancement`, `documentation`, `question`, `good-first-issue`)

**Why**: Currently 10 open issues have **zero labels**, making the issue tracker hard to navigate. The issue-monster dispatches issues but skips unlabeled/un-triaged ones. A triage agent feeds better quality issues into the cascade. From the factory: issue triage is the "hello world" of agentic workflows with immediate, clear value.

**How**: Add a new `issue-triage.md` workflow triggered on `issues: [opened]` with the `add-labels` and `add-comment` safe outputs. The agent analyzes each issue's title and body, with the codebase as context, to choose labels.

**Effort**: Low

````yaml
---
on:
  issues:
    types: [opened]
permissions:
  issues: read
  contents: read
tools:
  github:
    toolsets: [issues, labels]
safe-outputs:
  add-labels:
    allowed: [bug, security, enhancement, documentation, question, good-first-issue, firewall, proxy, docker, ci]
  add-comment: {}
timeout-minutes: 5
---
# Issue Triage Agent
Analyze issue #${{ github.event.issue.number }} in ${{ github.repository }}...
````

---

### P1 — Plan for Near-Term

#### P1.1: Firewall Escape Test Agent 🔥

**What**: A dedicated daily agent that attempts to escape the AWF network firewall using known techniques and reports findings as a discussion

**Why**: The `security-review.md` workflow already references this agent ("Read the Firewall Escape Test Agent's Report") but it **doesn't exist** — this is a gap in the security review pipeline. For a security firewall repository, continuous adversarial escape testing is uniquely domain-relevant. This workflow would try known bypass techniques (DNS tunneling, HTTP CONNECT abuse, IPv6 bypass, localhost tricks) and report on which ones are properly blocked.

**How**: A daily scheduled workflow using `bash: true` that runs actual `awf` commands with various bypass attempts inside the container, checks squid logs, and reports success/failure per technique.

**Effort**: Medium

**Unique to this repo**: No other repository type can benefit from this as directly as a network firewall tool. Each test run validates real security invariants.
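
The log-checking step could be sketched as below. This assumes Squid's default native `access.log` layout, where the fourth field holds the result code (such as `TCP_DENIED/403`) and the seventh holds the request target; the repository may configure a different log format, and the function names and sample lines are hypothetical.

```python
from collections import defaultdict


def summarize_squid_log(lines):
    """Count denied and non-denied requests per target."""
    summary = defaultdict(lambda: {"denied": 0, "other": 0})
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or truncated lines
        result_code = fields[3].split("/")[0]
        target = fields[6]
        key = "denied" if result_code == "TCP_DENIED" else "other"
        summary[target][key] += 1
    return dict(summary)


def not_blocked(summary, expected_blocked):
    """Targets that should be blocked but saw at least one non-denied request."""
    return sorted(
        target
        for target in expected_blocked
        if summary.get(target, {}).get("other", 0) > 0
    )


# Hypothetical log lines in the assumed native format.
sample = [
    "1709350000.123     45 172.30.0.20 TCP_DENIED/403 3900 CONNECT evil.example:443 - HIER_NONE/- text/html",
    "1709350001.456    120 172.30.0.20 TCP_TUNNEL/200 5120 CONNECT api.github.com:443 - HIER_DIRECT/140.82.112.6 -",
]
print(not_blocked(summarize_squid_log(sample), ["evil.example:443"]))
```

An empty result only shows that the proxy denied what it saw. Techniques that bypass the proxy entirely, such as DNS tunneling, would not appear in this log and need separate checks.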

---

#### P1.2: Workflow Health Monitor (Meta-Agent)

**What**: A weekly meta-agent that reviews all other agentic workflow runs and creates a health report with issues for unhealthy agents

**Why**: The factory learned that *meta-agents are incredibly valuable*. Currently there's no observability on the 28 workflows themselves — nobody is watching the watchers. The duplicate discussion problem (#1111/#1106, etc.) would be caught automatically. Secret Digger failures (#1107, #1105) linger as issues but there's no systematic health check.

**How**: Weekly scheduled workflow using `agentic-workflows` tool to inspect recent runs of all workflows, identify failure rates, duplicate outputs, and cost anomalies. Creates issues for unhealthy workflows.

**Effort**: Low–Medium

````yaml
---
on:
  schedule: weekly
tools:
  agentic-workflows:
  github:
    toolsets: [default, actions]
  cache-memory: true
safe-outputs:
  create-discussion:
    title-prefix: "[Workflow Health] "
  create-issue:
    title-prefix: "[Workflow Health] "
    labels: [agentic-workflows]
    max: 5
---
````
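
The failure-rate part of such a health check could look like the sketch below. It assumes run records shaped like the output of `gh run list --json workflowName,conclusion`; the 0.5 threshold, the function names, and the sample records are illustrative assumptions.

```python
from collections import Counter


def failure_rates(runs):
    """Map each workflow name to its share of completed runs that failed."""
    totals, failures = Counter(), Counter()
    for run in runs:
        if not run.get("conclusion"):
            continue  # still queued or in progress, no conclusion yet
        name = run["workflowName"]
        totals[name] += 1
        if run["conclusion"] == "failure":
            failures[name] += 1
    return {name: failures[name] / totals[name] for name in totals}


def unhealthy(runs, threshold=0.5):
    """Workflows whose failure rate meets or exceeds the threshold."""
    return sorted(
        name for name, rate in failure_rates(runs).items() if rate >= threshold
    )


# Hypothetical run records for illustration.
runs = [
    {"workflowName": "secret-digger-codex", "conclusion": "failure"},
    {"workflowName": "secret-digger-codex", "conclusion": "failure"},
    {"workflowName": "ci-doctor", "conclusion": "success"},
    {"workflowName": "ci-doctor", "conclusion": "failure"},
    {"workflowName": "ci-doctor", "conclusion": "success"},
    {"workflowName": "plan", "conclusion": ""},
]
print(unhealthy(runs))  # ['secret-digger-codex']
```

Duplicate outputs and cost anomalies would need other data sources; this covers only the failure-rate signal.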

---

#### P1.3: Breaking Change Checker

**What**: On each PR, detect backward-incompatible CLI changes (removed flags, changed defaults, renamed options, Docker API changes)

**Why**: AWF is a distributed CLI tool consumed by users who script it. Breaking changes in `--allow-domains` semantics, flag names, or Docker compose configuration need early detection. The factory uses this pattern with a 100% causal chain merge rate. Recent PRs adding `--build-local`, changing `--image-tag` behavior, and adding API proxy ports are exactly the type of changes this catches.

**How**: PR-triggered workflow that diffs `src/cli.ts`, `src/types.ts`, and `containers/` against base branch, identifies potentially breaking changes, and comments on the PR.

**Effort**: Low
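
A deterministic pre-check for removed flags could complement the agent's analysis. The sketch below extracts long-form flags with a regular expression and reports those present on the base branch but absent on the PR head. It is a heuristic under stated assumptions: the sample source strings are hypothetical and do not reflect how `src/cli.ts` actually declares its options.

```python
import re

# Long-form flags such as --allow-domains; a text heuristic, not a parser.
FLAG_PATTERN = re.compile(r"--[a-z][a-z0-9-]*")


def extract_flags(source):
    """Collect every long-form flag mentioned in the given source text."""
    return set(FLAG_PATTERN.findall(source))


def removed_flags(base_source, head_source):
    """Flags that exist in the base version but not in the head version."""
    return sorted(extract_flags(base_source) - extract_flags(head_source))


# Hypothetical before/after snippets for illustration.
base = ".option('--allow-domains <domains>').option('--image-tag <tag>')"
head = ".option('--allow-domains <domains>').option('--build-local')"
print(removed_flags(base, head))  # ['--image-tag']
```

Changed defaults and altered semantics do not show up in a name-level diff, so those cases still depend on the agent reading the change itself.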

---

### P2 — Consider for Roadmap

#### P2.1: Daily Malicious Code Scan

**What**: Daily scan of recent commits for suspicious patterns — obfuscated code, unusual network calls, hardcoded credentials, suspicious shell commands

**Why**: AWF runs as root with NET_ADMIN capability and accesses docker.sock. A supply chain compromise here would be particularly dangerous. The factory runs this daily in gh-aw. For a security-critical tool, this defensive layer is especially important.

**Effort**: Low (based on existing `secret-digger` pattern, just different analysis focus)

---

#### P2.2: Sub Issue Closer

**What**: Automatically close sub-issues when parent issues are resolved

**Why**: As issue-monster dispatches more tasks to the Copilot SWE agent, sub-issues whose work is already closed or merged will accumulate and go stale. From the factory: "keeps the issue tracker clean."

**Effort**: Low

---

#### P2.3: Changeset Generator

**What**: On merging to main, analyze commits since last release and auto-generate a PR with version bump + CHANGELOG entry

**Why**: `update-release-notes` improves notes *after* a release is published, but there's no automation for *preparing* releases. The factory's Changeset workflow had a 78% merge rate across 28 proposed PRs. Given AWF releases container images via GHCR, having well-tracked version bumps matters.

**Effort**: Medium

---

#### P2.4: Fix `skip-if-match` for Discussion-Creating Workflows

**What**: Update `ci-cd-gaps-assessment`, `security-review`, and `pelis-agent-factory-advisor` to use better deduplication to avoid accumulating stale duplicate discussions/issues

**Why**: There are currently three pairs of duplicate open issues (#1113/#1109, #1112/#1108, #1111/#1106), six issues in total. The `skip-if-match` queries need to account for both the title prefixes and the date patterns in the generated titles.

**Effort**: Low — just adjust the `skip-if-match` queries in each workflow
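
Grouping by the stable bracketed prefix, rather than the full dated title, is one way to reason about what the queries should match. The sketch below is illustrative only: the issue numbers come from this report, but the title text after each prefix is a hypothetical stand-in, and `duplicate_groups` is not an existing helper.

```python
import re
from collections import defaultdict

# Matches a leading "[Some Prefix]" and captures the text inside the brackets.
PREFIX_PATTERN = re.compile(r"^\[([^\]]+)\]")


def duplicate_groups(open_issues):
    """Group (number, title) pairs by title prefix; keep prefixes seen twice or more."""
    groups = defaultdict(list)
    for number, title in open_issues:
        match = PREFIX_PATTERN.match(title)
        if match:
            groups[match.group(1)].append(number)
    return {
        prefix: sorted(numbers)
        for prefix, numbers in groups.items()
        if len(numbers) > 1
    }


# Titles after the prefix are hypothetical examples with date suffixes.
issues = [
    (1113, "[CI/CD Assessment] Daily report 2026-03-02"),
    (1109, "[CI/CD Assessment] Daily report 2026-03-01"),
    (1112, "[Security Review] Daily report 2026-03-02"),
    (42, "Unrelated bug without a prefix"),
]
print(duplicate_groups(issues))  # {'CI/CD Assessment': [1109, 1113]}
```

Because the date suffix never enters the key, two reports from different days land in the same group, which is the behavior the deduplication query needs.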

---

### P3 — Future Ideas

#### P3.1: Portfolio Analyst (Token Cost Optimizer)

**What**: Weekly analysis of workflow token usage and costs across all 28 workflows, identifying expensive agents and optimization opportunities

**Why**: With 28 workflows running daily/hourly/weekly, token costs accumulate. The factory found some agents were "way too chatty" with LLM calls. Secret-digger alone runs 3× per hour.

**Effort**: Low (read-only analysis)

---

#### P3.2: Weekly Issue & PR Summary

**What**: Weekly digest of repository activity — open issues, PR status, workflow health — posted as a discussion

**Why**: With automated agents creating many issues/PRs, maintainers need a curated weekly digest to stay informed without reading every individual output.

**Effort**: Low

---

#### P3.3: Contribution Guidelines Checker

**What**: On new PRs from external contributors, check that contribution guidelines (conventional commits, scope, PR title format) are followed and comment with guidance

**Why**: AWF enforces strict conventional commits (with a limited scope allowlist — `cli, docker, squid, proxy, ci, deps`). External contributors frequently get PR title check failures. An early-comment agent reduces frustration.

**Effort**: Low
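
The title check itself could be sketched as follows. The scope allowlist is taken from the text above; the list of commit types and the choice to treat the scope as optional are assumptions, since the repository's exact rules are not stated here.

```python
import re

# Scope allowlist as stated in this report.
ALLOWED_SCOPES = {"cli", "docker", "squid", "proxy", "ci", "deps"}
# Assumption: a common set of Conventional Commits types; the repo's list may differ.
ALLOWED_TYPES = {"feat", "fix", "docs", "chore", "refactor", "test", "ci", "build", "perf"}

TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<subject>\S.*)$"
)
FORMAT_PROBLEM = "title does not follow 'type(scope): subject'"


def check_title(title):
    """Return a list of problems with a PR title; an empty list means it passes."""
    match = TITLE_PATTERN.match(title)
    if not match:
        return [FORMAT_PROBLEM]
    problems = []
    commit_type = match.group("type")
    scope = match.group("scope")
    if commit_type not in ALLOWED_TYPES:
        problems.append(f"unknown type '{commit_type}'")
    if scope is not None and scope not in ALLOWED_SCOPES:
        problems.append(f"scope '{scope}' is not in the allowlist")
    return problems


print(check_title("fix(squid): handle empty allowlist"))  # []
print(check_title("feat(firewall): add rule"))
```

Returning every problem at once lets the agent post a single comment with all the guidance a contributor needs.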

---

## 📈 Maturity Assessment

**Current Level: 4/5 — Advanced Factory**

This is one of the most sophisticated agentic workflow setups outside the gh-aw factory itself. Strengths:
- ✅ 28 agentic workflows (26 currently compiled) across security, CI/CD, documentation, and smoke testing
- ✅ Multi-engine support (Claude, Codex, Copilot)
- ✅ Domain-specific workflows (security-guard, smoke tests, secret-digger × 3)
- ✅ Good cascade design (ci-doctor → issues → issue-monster → PRs)
- ✅ Cache-memory usage for stateful agents

**Target Level: 4.5/5 — Add meta-monitoring and triage**

**Gap Analysis**:
1. Add issue triage (P0) → improves issue quality entering issue-monster cascade
2. Add workflow health monitor (P1) → closes the observability gap for 28 workflows
3. Fix uncompiled workflows (operational) → pelis-advisor and test-coverage-improver aren't running
4. Build the escape test agent (P1) → unique to this repo's security mission

---

## 🔄 Comparison with Best Practices

| Best Practice | This Repo | Notes |
|--------------|-----------|-------|
| Issue triage | ❌ | Missing; all auto-created issues unlabeled |
| Fault investigation | ✅ | ci-doctor is excellent |
| Security compliance | ✅✅ | Above average — security-guard, security-review, secret-digger×3 |
| Documentation sync | ✅ | doc-maintainer + cli-flag-consistency-checker |
| Meta-agent monitoring | ❌ | No workflow health manager or audit workflows |
| Release automation | ⚠️ | update-release-notes exists but no changeset generation |
| Code quality agents | ❌ | No simplicity/refactoring/style agents |
| Interactive/ChatOps | ✅ | `/plan` slash command |
| Multi-engine testing | ✅✅ | Unique strength — smoke tests on 4 configs |
| Observability/metrics | ❌ | No portfolio analyst or metrics collector |

**What this repo does uniquely well**: The triple-engine secret digger (running hourly on claude/codex/copilot) and the four-way smoke testing matrix are standout patterns not seen in the factory itself. The security-guard PR reviewer using Claude is particularly well-suited to this security-critical codebase.

**Domain opportunity**: A **Firewall Escape Test Agent** is uniquely valuable here — no other repository type can leverage this pattern. It would turn the firewall into its own test subject, continuously verifying security invariants.

---

_Generated by Pelis Agent Factory Advisor · 2026-03-02_

---

> **Note:** This was intended to be a discussion, but discussions could not be created due to permissions issues. This issue was created as a fallback.
>
> **Tip:** Discussion creation may fail if the specified category is not announcement-capable. Consider using the "Announcements" category or another announcement-capable category in your workflow configuration.




> Generated by [Pelis Agent Factory Advisor](https://github.com/github/gh-aw-firewall/actions/runs/22559977394)
> - [x] expires on Mar 9, 2026, 3:25 AM UTC
