feat: CLI and Agent Interface for OpenScreen #349

@lupuletic

Motivation

OpenScreen is a powerful desktop app for screen recording and video editing, but every operation currently requires the GUI. As AI agents increasingly drive developer tools, OpenScreen needs a programmatic interface that agents can call directly, without a display or mouse.

Projects like Remotion have proven this model: a CLI as the primary agent interface, agent skills for guidance, and lightweight MCP for discovery. This issue proposes bringing the same approach to OpenScreen.

What we're building

1. CLI (openscreen) — 20+ commands

Project management (pure Node.js, instant):

  • project create — Create .openscreen project from video recording
  • project info — Inspect project metadata, regions, settings
  • project validate — Validate project file integrity
  • project edit — Modify settings (wallpaper, padding, aspect ratio, etc.)

Region management (pure Node.js, instant):

  • zoom add/list/remove — Manage zoom effects (depth 1-6, focus coordinates)
  • trim add/list/remove — Cut sections from the video
  • speed add/list/remove — Change playback speed (0.25x - 2x)
  • annotate add/list/remove — Add text, image, or arrow overlays

Rendering (Electron headless, seconds to minutes):

  • render — Export as MP4 with all effects applied (WebCodecs + PixiJS)
  • gif — Export as animated GIF
  • still — Capture single frame as PNG/JPEG (planned)
  • frames — Export frame sequence (planned)

All commands support --json for machine-readable output.
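
As a rough illustration of how an agent or script might consume the --json output, here is a minimal TypeScript sketch. It assumes the openscreen binary is on PATH and that each command prints a single JSON document on stdout; the helper names (withJsonFlag, parseJsonOutput, runOpenscreen) are hypothetical and not part of the proposal.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Append --json unless the caller already passed it.
export function withJsonFlag(args: readonly string[]): string[] {
  return args.includes("--json") ? [...args] : [...args, "--json"];
}

// Parse stdout as JSON; raise a readable error if the command printed something else.
export function parseJsonOutput(stdout: string): unknown {
  const text = stdout.trim();
  try {
    return JSON.parse(text);
  } catch {
    throw new Error(`expected JSON on stdout, got: ${text.slice(0, 80)}`);
  }
}

// Run `openscreen <args> --json` and return the parsed result.
// Assumption: the `openscreen` binary is on PATH and prints one JSON document.
export async function runOpenscreen(args: readonly string[]): Promise<unknown> {
  const { stdout } = await execFileAsync("openscreen", withJsonFlag(args));
  return parseJsonOutput(stdout);
}
```

A caller would then write something like `await runOpenscreen(["project", "info", "demo.openscreen"])`, where passing the project path as a positional argument is itself an assumption. The result is typed as `unknown` because the output schema is not specified in this issue.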

2. Shared code extraction (src/shared/)

Pure TypeScript types and logic extracted from the Electron app so both CLI and GUI share the same validation, normalization, and project schema code. Zero breaking changes — original files become thin re-exports.
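
To make the idea of shared validation concrete, the sketch below shows what one such module could look like, using the zoom depth (1-6) and speed (0.25x - 2x) ranges from the command list. The field names (startMs, endMs), the integer-depth rule, and the function names are assumptions for illustration; the real project schema may differ.

```typescript
// Hypothetical shapes; the real project schema may differ.
export interface ZoomRegion {
  startMs: number;
  endMs: number;
  depth: number; // 1-6 per the CLI description (integer is an assumption)
}

export interface SpeedRegion {
  startMs: number;
  endMs: number;
  speed: number; // 0.25-2 per the CLI description
}

// Checks shared by every region type: a valid, non-empty time span.
function validateSpan(startMs: number, endMs: number): string[] {
  const errors: string[] = [];
  if (!Number.isFinite(startMs) || startMs < 0) {
    errors.push("startMs must be a non-negative number");
  }
  if (!Number.isFinite(endMs) || endMs <= startMs) {
    errors.push("endMs must be greater than startMs");
  }
  return errors;
}

// Returns a list of problems; an empty list means the region is valid.
export function validateZoomRegion(r: ZoomRegion): string[] {
  const errors = validateSpan(r.startMs, r.endMs);
  if (!Number.isInteger(r.depth) || r.depth < 1 || r.depth > 6) {
    errors.push("depth must be an integer from 1 to 6");
  }
  return errors;
}

export function validateSpeedRegion(r: SpeedRegion): string[] {
  const errors = validateSpan(r.startMs, r.endMs);
  if (!Number.isFinite(r.speed) || r.speed < 0.25 || r.speed > 2) {
    errors.push("speed must be between 0.25 and 2");
  }
  return errors;
}
```

Returning a list of messages rather than throwing lets the CLI report every problem at once in --json mode, while the GUI can surface the same messages inline.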

3. Electron headless bridge

A --cli-export mode in electron/main.ts that creates a hidden BrowserWindow, runs the full export pipeline (WebCodecs + PixiJS), and streams progress to stdout. This reuses 100% of the existing rendering code — no FFmpeg dependency, no reimplementation.
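
On the CLI side, the progress stream has to be read back from the child process. The sketch below shows one way to do that, assuming progress arrives as newline-delimited JSON on stdout; that wire format, the ProgressEvent shape, and createProgressParser are all assumptions, not the actual bridge protocol.

```typescript
// Hypothetical progress event; the real bridge may emit a different shape.
export interface ProgressEvent {
  type: "progress";
  percent: number;
}

function isProgressEvent(value: unknown): value is ProgressEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return v.type === "progress" && typeof v.percent === "number";
}

// Returns a function to feed with raw stdout chunks. Complete lines that parse
// as progress events are passed to `onProgress`; everything else is ignored.
export function createProgressParser(
  onProgress: (event: ProgressEvent) => void,
): (chunk: string) => void {
  let buffer = "";
  return (chunk: string): void => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the trailing partial line for the next chunk
    for (const line of lines) {
      const text = line.trim();
      if (text === "") continue;
      try {
        const value: unknown = JSON.parse(text);
        if (isProgressEvent(value)) onProgress(value);
      } catch {
        // Not JSON (for example a log line from Electron); skip it.
      }
    }
  };
}
```

Buffering matters because stdout chunks do not respect line boundaries: a single event can arrive split across two chunks, and unrelated log lines may be interleaved with progress events.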

4. Agent skills (SKILL.md + rules)

Following the agentskills.io protocol and Remotion's pattern: a master SKILL.md with 7 rule files covering project setup, zoom regions, trim/speed, annotations, export, frames, and troubleshooting.

5. Minimal MCP server

Single openscreen_help tool for documentation search. Agents with shell access call the CLI directly; the MCP server is just for discovery.
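
For a sense of how small the logic behind a documentation-search tool can be, here is a sketch of a keyword ranker that such a tool might wrap. It is not the server's actual implementation: the HelpDoc shape, the scoring weights, and searchDocs are hypothetical, and the MCP wiring itself is omitted.

```typescript
// Hypothetical document shape; the real server may index its docs differently.
export interface HelpDoc {
  title: string;
  body: string;
}

// Count non-overlapping occurrences of `term` in `text`.
function countOccurrences(text: string, term: string): number {
  let count = 0;
  let index = text.indexOf(term);
  while (index !== -1) {
    count += 1;
    index = text.indexOf(term, index + term.length);
  }
  return count;
}

// Rank docs by how often the query terms appear; title hits count triple.
export function searchDocs(query: string, docs: readonly HelpDoc[], limit = 3): HelpDoc[] {
  const terms = query.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
  return docs
    .map((doc) => {
      const title = doc.title.toLowerCase();
      const body = doc.body.toLowerCase();
      const score = terms.reduce(
        (sum, t) => sum + 3 * countOccurrences(title, t) + countOccurrences(body, t),
        0,
      );
      return { doc, score };
    })
    .filter((entry) => entry.score > 0) // drop docs that match no term
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((entry) => entry.doc);
}
```

With a corpus this small (one SKILL.md plus a handful of rule files), plain substring counting is likely enough, and it avoids adding a search dependency.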

Architecture

CLI (Node.js)                    Electron (headless)
┌─────────────────────┐         ┌───────────────────────────┐
│ project create/edit │         │ --cli-export mode         │
│ zoom/trim/speed add │         │ CliExportRenderer         │
│ annotate add/remove │         │ VideoExporter (WebCodecs) │
│ shortcuts get/set   │         │ GifExporter (PixiJS)      │
│ --json output       │ ──────▶ │ Progress via IPC          │
└─────────────────────┘ spawn   └───────────────────────────┘
        │                               │
        ▼                               ▼
  src/shared/                     Valid MP4/GIF files
  (types, validation,             with all effects applied
   project schema)

Current status

🚧 Work in progress — Core implementation complete and tested E2E. Draft PR linked below.

What's done

  • Shared code extraction (7 shared modules, zero breaking changes)
  • CLI core (20+ commands, --json dual output, input validation)
  • Electron headless bridge (MP4 + GIF export with progress)
  • Agent skills (SKILL.md + 7 rule files)
  • Minimal MCP server
  • Security review + code quality review applied
  • All 33 existing unit tests pass

What's next

  • still command — capture single frame as PNG/JPEG
  • frames command — export frame sequence
  • CLI unit tests
  • Contribution guide for CLI extensions
