feat: consolidate workflow artifacts into fewer, well-named bundles #19452

@Mossaka

Problem

When users view a completed agentic workflow run in the GitHub Actions UI, they see 7 separate artifacts with confusing, inconsistent names:

| Artifact Name | Contents | Size (example) |
|---|---|---|
| prompt | Single file: prompt.txt | 5 KB |
| safe-output | Single file: outputs.jsonl | ~1 KB |
| agent-output | Single file: agent_output.json | ~1 KB |
| agent_outputs | Copilot CLI process logs, conversation.md | 44-105 KB |
| agent-artifacts | Unified bundle: agent-stdio.log, aw_info.json, MCP logs, firewall logs, patches | 150-217 KB |
| threat-detection.log | Single file: detection.log | ~3 KB |
| safe-output-items | Single file: safe-output-items.jsonl | ~0.4 KB |

User-facing problems

  1. Confusing naming: agent-output vs agent_outputs vs agent-artifacts — three artifacts with near-identical names that contain completely different things. Users have no way to know which one has the logs they need.

  2. Too many small artifacts: 5 out of 7 artifacts contain a single file (prompt, safe-output, agent-output, threat-detection.log, safe-output-items). Users must download and unzip each one separately from the UI to find what they need.

  3. Inconsistent naming conventions: Mix of kebab-case (agent-output, safe-output), snake_case (agent_outputs), and names with dots (threat-detection.log). The .log suffix on threat-detection.log makes the artifact name look like a filename, and creates a confusing directory when downloaded (threat-detection.log/detection.log).

  4. No discoverability: A user looking at the artifact list can't tell which one has the firewall/Squid logs, which has the MCP server logs, or which has the full agent conversation trace. They have to download them all and explore.

  5. gh aw audit has to heavily post-process: The audit command applies 4 separate flattening passes (flattenSingleFileArtifacts, flattenActivationArtifact, flattenUnifiedArtifact, flattenAgentOutputsArtifact) to make the structure usable. This complexity exists because the raw structure is hard to navigate.
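The naming inconsistency in point 3 can be checked mechanically. The following is a minimal sketch, assuming kebab-case means lowercase alphanumeric words joined by single hyphens; the `isKebabCase` helper and its regular expression are hypothetical and not part of the gh-aw codebase.

```go
package main

import (
	"fmt"
	"regexp"
)

// kebabCase matches lowercase alphanumeric words separated by single hyphens,
// e.g. "agent-output". This definition is an assumption made for illustration.
var kebabCase = regexp.MustCompile(`^[a-z0-9]+(-[a-z0-9]+)*$`)

// isKebabCase reports whether an artifact name follows the kebab-case pattern.
func isKebabCase(name string) bool {
	return kebabCase.MatchString(name)
}

func main() {
	// The seven artifact names from the table in the Problem section.
	names := []string{
		"prompt", "safe-output", "agent-output", "agent_outputs",
		"agent-artifacts", "threat-detection.log", "safe-output-items",
	}
	for _, n := range names {
		if !isKebabCase(n) {
			fmt.Printf("non-kebab-case artifact name: %s\n", n)
		}
	}
}
```

Under this definition only agent_outputs (underscore) and threat-detection.log (dot) are flagged, which matches the two outliers called out above.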

What each artifact actually is (for context)

  • prompt → The system prompt sent to the AI agent
  • safe-output → The JSONL of safe-output actions the agent declared (create issue, add comment, etc.)
  • agent-output → The sanitized agent output JSON (same content as safe-output but processed)
  • agent_outputs → Engine-specific logs: Copilot CLI process logs (process-*.log), conversation transcript (conversation.md)
  • agent-artifacts → The main bundle: agent-stdio.log (AWF container lifecycle), aw_info.json (run metadata), MCP server logs, firewall access/cache logs, patches, prompt copy
  • threat-detection.log → Threat detection result (prompt injection, secret leak, malicious patch checks)
  • safe-output-items → Manifest of safe-output items created by the safe_outputs job

Where these are defined

| Artifact | Source File | Function / definition |
|---|---|---|
| agent-artifacts | pkg/workflow/compiler_yaml_artifacts.go:27 | generateUnifiedArtifactUpload() |
| agent_outputs | pkg/workflow/engine_output.go:72 | generateEngineOutputCollection() |
| agent-output | pkg/workflow/compiler_yaml.go:630 | constant AgentOutputArtifactName |
| safe-output | pkg/workflow/compiler_yaml.go:620 | constant SafeOutputArtifactName |
| safe-output-items | pkg/workflow/compiler_safe_outputs_job.go:444 | in safe_outputs job |
| threat-detection.log | pkg/workflow/threat_detection.go:457 | inline detection step |
| prompt | pkg/workflow/compiler_yaml.go | prompt upload step |

Proposal

Option A: Consolidate into 2-3 well-named artifacts

Bundle related files into fewer artifacts with descriptive names:

run-logs (or agentic-workflow-logs)

  • Everything currently in agent-artifacts (agent-stdio.log, aw_info.json, MCP logs, firewall logs)
  • Everything currently in agent_outputs (engine process logs, conversation.md)
  • The prompt (prompt.txt)
  • Threat detection log (detection.log)

run-outputs (or agentic-workflow-outputs)

  • safe-output (outputs.jsonl)
  • agent-output (agent_output.json)
  • safe-output-items (safe-output-items.jsonl)

This gives users a clear mental model: logs = what happened during the run, outputs = what the agent produced.
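To make the grouping concrete, here is a rough sketch of Option A as a lookup table. The bundle names run-logs and run-outputs come from the proposal; the `bundleFor` map and `groupByBundle` helper are hypothetical names used only for illustration, not existing compiler code.

```go
package main

import (
	"fmt"
	"sort"
)

// bundleFor maps each current artifact name to the consolidated bundle
// proposed in Option A. Hypothetical illustration only.
var bundleFor = map[string]string{
	"agent-artifacts":      "run-logs",
	"agent_outputs":        "run-logs",
	"prompt":               "run-logs",
	"threat-detection.log": "run-logs",
	"safe-output":          "run-outputs",
	"agent-output":         "run-outputs",
	"safe-output-items":    "run-outputs",
}

// groupByBundle inverts the mapping: bundle name -> sorted current artifact names.
func groupByBundle(m map[string]string) map[string][]string {
	groups := make(map[string][]string)
	for artifact, bundle := range m {
		groups[bundle] = append(groups[bundle], artifact)
	}
	for _, artifacts := range groups {
		sort.Strings(artifacts) // deterministic order despite map iteration
	}
	return groups
}

func main() {
	groups := groupByBundle(bundleFor)
	for _, bundle := range []string{"run-logs", "run-outputs"} {
		fmt.Println(bundle, groups[bundle])
	}
}
```

The seven current artifacts collapse into four entries under run-logs and three under run-outputs.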

Option B: Keep separate but rename clearly

If consolidation is too risky, at minimum rename for clarity:

| Current | Proposed | Rationale |
|---|---|---|
| agent-artifacts | sandbox-logs | Contains AWF/firewall/MCP logs |
| agent_outputs | engine-logs | Contains engine-specific (Copilot CLI) logs |
| agent-output | agent-result | The sanitized agent output |
| safe-output | safe-output-actions | The safe-output JSONL |
| threat-detection.log | threat-detection | Drop the .log suffix |
| prompt | agent-prompt | More specific |
| safe-output-items | (keep or merge into safe-output-actions) | |

Considerations

  • Backward compatibility: gh aw audit already handles multiple naming schemes via flattening. Any rename would need the flattening logic updated to handle both old and new names during a transition period.
  • Artifact retention: Consolidating reduces the number of artifacts, which simplifies the UI, but users who need only one file must download a larger bundle. A clear internal directory structure would mitigate this.
  • Cross-engine differences: agent_outputs varies by engine (Copilot CLI vs Claude Code vs Codex). The bundle should have a consistent top-level structure regardless of engine.
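The backward-compatibility point could be handled with a small name-normalization step during the transition period. This is a sketch under stated assumptions, not how gh aw audit is implemented: it uses the Option B renames as sample data, and `legacyToNew` and `canonicalName` are hypothetical names.

```go
package main

import "fmt"

// legacyToNew lists the renames proposed in Option B. safe-output-items is
// omitted because the proposal leaves its fate open (keep or merge).
var legacyToNew = map[string]string{
	"agent-artifacts":      "sandbox-logs",
	"agent_outputs":        "engine-logs",
	"agent-output":         "agent-result",
	"safe-output":          "safe-output-actions",
	"threat-detection.log": "threat-detection",
	"prompt":               "agent-prompt",
}

// canonicalName returns the new name for a legacy artifact name and leaves
// any other name (including names that are already new) unchanged.
func canonicalName(name string) string {
	if renamed, ok := legacyToNew[name]; ok {
		return renamed
	}
	return name
}

func main() {
	// One legacy name, one already-new name, and one name without a rename.
	for _, n := range []string{"agent_outputs", "engine-logs", "safe-output-items"} {
		fmt.Printf("%s -> %s\n", n, canonicalName(n))
	}
}
```

Because none of the proposed names collides with a current name, runs produced before and after a rename can go through the same lookup.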
