A structured methodology for working with AI coding agents in enterprise environments.
This repository provides workflow implementations for AI-assisted development. The methodology addresses common failure modes when working with AI agents and provides a systematic approach to planning, implementing, and reviewing AI-generated code.
For background on the failure modes and design principles behind these workflows, see:
- [Designing Agentic Workflows: Where Agents Fail, and Where We Fail](https://dev.to/danielbutlerirl/designing-agentic-workflows-where-agents-fail-and-where-we-fail-4a95)
- [Designing Agentic Workflows: A Practical Example](https://dev.to/danielbutlerirl/designing-agentic-workflows-a-practical-example-291j)
For a detailed walkthrough of how the core loop in this repository is structured and why it is sequenced this way, see:
- [Designing Agentic Workflows: The Core Loop](https://dev.to/danielbutlerirl/designing-agentic-workflows-the-core-loop-166d)
- [Designing agentic workflows: supplementary commands and pressure valves](https://dev.to/danielbutlerirl/designing-agentic-workflows-supplementary-commands-and-pressure-valves-l51)
These workflows are designed for developers who:
- Are new to agentic coding and want a cautious, structured approach
- Need reviewable commits and clear verification gates
- Work in enterprise environments requiring accountability
- Want to prevent common AI failure modes (premature completion claims, test deletion, hardcoded solutions)
As AI coding tools improve and you gain experience, you should streamline these workflows and design approaches that work best for your team and projects. This is a starting point, not a permanent prescription.
Working with AI agents requires awareness of common failure patterns:
- Baby-Counting: Silently drops requirements or deletes tests to appear complete
- Cardboard Muffin: Delivers hollow implementations (hardcoded values, stubbed logic)
- Half-Assing: Delivers technically functional but poorly architected code
- Litterbug: Leaves debug code, temporary comments, and dead code behind
- Rubber-Stamping: Approving changes without actually reviewing them, usually because of review fatigue
- Intent Drift: Losing sight of the original goals as the conversation lengthens
- Decision Delegation: Letting the AI make architectural choices without explicit evaluation
- Review Fatigue: Becoming overwhelmed by large diffs or comprehensive specifications
The workflows in this repository provide structure and checkpoints to detect and prevent these patterns.
| Tool | Directory | Format | Setup |
|---|---|---|---|
| Claude Code | claude-code/ | SKILL.md in directories | Copy to .claude/skills/ |
| Codex CLI | codex-cli/ | SKILL.md in directories | Copy to ~/.codex/skills/ |
| Gemini CLI | gemini-cli/ | TOML commands | Copy to ~/.gemini/commands/ |
| Bob | bob/ | Markdown commands | Copy to .bob/commands/ |
Each implementation includes tool-specific setup instructions and an AGENTS file template for project-level requirements.
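The setup column in the table above can be sketched as a small install helper. This is a minimal sketch, not a script shipped with this repository: the names `DESTINATIONS`, `resolve_destination`, and `install` are hypothetical, and it assumes that paths starting with `~` are user-level while the others are relative to your project root. Which files inside each tool directory actually need copying is covered by that tool's own setup instructions.

```python
import shutil
from pathlib import Path

# Destinations taken from the table above; keys are the repository directories.
DESTINATIONS = {
    "claude-code": ".claude/skills",
    "codex-cli": "~/.codex/skills",
    "gemini-cli": "~/.gemini/commands",
    "bob": ".bob/commands",
}


def resolve_destination(tool: str, project_root: Path) -> Path:
    """Return the install path for a tool (hypothetical helper)."""
    raw = DESTINATIONS[tool]
    if raw.startswith("~"):
        # Assumption: "~" paths are user-level and live in the home directory.
        return Path(raw).expanduser()
    # Assumption: all other paths are relative to the project root.
    return project_root / raw


def install(source_dir: Path, destination: Path) -> None:
    """Copy a directory tree, merging into the destination if it already exists."""
    shutil.copytree(source_dir, destination, dirs_exist_ok=True)
```

For example, `install(Path("bob"), resolve_destination("bob", Path.cwd()))` would copy the contents of `bob/` into `.bob/commands/` in the current project. Check the tool's README first, since it may ask you to copy only part of the directory.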
- Investigate: Understand existing code or enrich sparse issues
- Design: Define interfaces and architecture (signatures only, no implementation)
- Issue Planning: Define an issue with clear success criteria
- Define Gates: Define verification gates for the issue
- Task Planning: Break issues into commit-sized tasks (50-200 lines each)
- Implementation: Execute a single task with explicit verification
- Cleanup: Three-phase audit before PR (audit, fix, validate)
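The 50-200 line guideline from Task Planning can be checked mechanically before a commit. The sketch below is illustrative only: it assumes task size is measured as added plus deleted lines in the output of `git diff --numstat`, and the function names and verdict strings are hypothetical rather than part of these workflows.

```python
from typing import Iterable

# Bounds taken from the Task Planning guideline above.
MIN_LINES = 50
MAX_LINES = 200


def count_changed_lines(numstat_lines: Iterable[str]) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat_lines:
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # not an "added<TAB>deleted<TAB>path" line
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of counts
        total += int(added) + int(deleted)
    return total


def task_size_verdict(changed: int) -> str:
    """Classify a change against the 50-200 line guideline."""
    if changed > MAX_LINES:
        return "too large: split the task"
    if changed < MIN_LINES:
        return "small: consider merging with a related task"
    return "ok"
```

Feed it the lines printed by `git diff --cached --numstat` to check staged changes. Whether deletions should count toward the limit is a judgment call for your team.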
- Fresh sessions per task: Prevents context pollution
- Hard verification gates: Clear pass/fail criteria, not subjective review
- One task, one commit: Reviewable units of work
- Human approval required: For scope changes, dependencies, and commits
- Test protection: Never modify tests to make them pass
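Two of the principles above, hard verification gates and test protection, lend themselves to mechanical checks. The sketch below assumes each gate is a command whose exit code decides pass or fail, and that test files live under a `tests/` directory or are named `test_*`. Both conventions, along with the `Gate`, `run_gate`, and `touched_tests` names, are hypothetical and not defined by this repository.

```python
import subprocess
import sys
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass
class Gate:
    """A verification gate: a name and the command that must exit with 0."""
    name: str
    command: list[str]


def run_gate(gate: Gate) -> bool:
    """Run the gate's command; pass means exit code 0, nothing subjective."""
    result = subprocess.run(gate.command, capture_output=True)
    return result.returncode == 0


def touched_tests(changed_paths: list[str]) -> list[str]:
    """Return the changed paths that look like test files (assumed layout)."""
    hits = []
    for path in changed_paths:
        p = PurePosixPath(path)
        if "tests" in p.parts or p.name.startswith("test_"):
            hits.append(path)
    return hits
```

A gate might be declared as `Gate("unit tests", [sys.executable, "-m", "pytest", "-q"])` if the project uses pytest. A non-empty result from `touched_tests` is a signal to stop and ask for human review, since adding tests is legitimate but editing them to force a pass is not.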
- Methodology Guide: Complete workflow explanation with rationale
- Tool-specific READMEs: Setup and usage instructions for each implementation
- Choose your AI coding tool from the implementations above
- Follow the setup instructions in that tool's directory
- Read the methodology guide to understand the workflow phases
- Start with Issue Planning for your first task
Apache 2.0 - See LICENSE for details
This is a reference implementation. Fork and adapt for your team's needs.