feat: Fleet client #256

dzorlu · 2025-12-18T01:31:20Z

PR: Fleet environments (OpenEnv)

This PR documents and refines the Fleet runtime integration for OpenEnv.

What this enables

Run OpenEnv environments on Fleet (remote) with no local Docker.
Keep a strict split between:
- Orchestration (HTTP): reset / step / state
- Agent actions (MCP): tools/list + tools/call

What this is not

This is not the local “Dockerized env server + env container” setup.
There is no container/provider abstraction here; Fleet hosts the runtime remotely (HTTP env server + MCP service). The client only connects.

Main abstractions

FleetEnvClient (HTTP): orchestrator handle for reset/step/state.
FleetMCPTools (MCP): agent handle for listing/calling tools.
- Unions tools across Fleet’s MCP endpoints (today often api/v1/mcp and mcp)
- Returns tools in OpenAI “tools” dict format (via convert_tool_format)
- Routes tool calls to the owning endpoint (cached after discovery)

Quickstart

Install: pip install "openenv-core[fleet]"
Set: export FLEET_API_KEY="..."
Run: python examples/fleet_env_example.py <env_key>

Plea see the README for a deeper discussion and TODOs.

Darktex

Note: This is an automated review by Claude Code (alignment-reviewer agent), not a human review. The account posting this is shared with the human maintainer.

Now let me produce the comprehensive two-tier alignment review:

PR #256: Fleet Client - Alignment Review

Executive Summary

Tier 1 Status: ⚠️ 1 bug found (missing HTTP client dependency)
Tier 2 Status: ⚠️ 2 alignment flags (architectural boundary violations)

This PR introduces Fleet integration following RFC 003's MCP tool-call architecture. While the implementation demonstrates solid understanding of the dual-interface model (HTTP for orchestration, MCP for agent actions), there are critical issues that must be addressed before merge.

Tier 1: Bugs, Security, and Code Quality

🔴 Critical: Missing Dependency

Location: src/envs/fleet_env/client.py:20

Issue: The code imports HTTPEnvClient which doesn't exist in the codebase. Based on the existing pattern (EnvClient in src/openenv/core/env_client.py), this appears to be a non-existent module.

from openenv_core.http_env_client import HTTPEnvClient

Evidence: Grep search for HTTPEnvClient shows no definition:

$ grep -r "class HTTPEnvClient" src/
# No results

The existing client is EnvClient (WebSocket-based), not HTTPEnvClient.

Impact: The code will fail at import time. This suggests the PR was not tested in the current environment structure.

Fix Required:

Either implement HTTPEnvClient following existing EnvClient patterns, OR
Refactor FleetEnvClient to inherit from EnvClient and use WebSocket, OR
Document why HTTP-only is required for Fleet and implement the base class

⚠️ Lint Check Failed

The automated lint check failed due to missing uv:

Error: 'uv' is not installed or not in PATH

Required Action: The PR author needs to run:

uv run ruff format src/envs/fleet_env/ examples/fleet_env_example.py tests/envs/test_fleet_env.py

✅ Security: No Issues Found

No credentials in code
Proper authorization header handling
No command injection risks
Appropriate use of environment variables for API keys

✅ Debug Code: Acceptable

The check-debug.sh hook flagged print statements in examples/fleet_env_example.py, but these are acceptable as:

It's in examples/ (not src/)
The prints are intentional user-facing output for the example
They help users understand the flow

Tier 2: Alignment with Principles and Invariants

🚨 ALIGNMENT FLAG #1: Dual API Boundary Violation

Invariant at Risk: Architectural Invariants → Dual API boundary (INVARIANTS.md lines 41-58)

The Concern: FleetEnvClient.step() explicitly rejects MCP tool actions:

# src/envs/fleet_env/client.py:125-130
def step(self, action: Action) -> StepResult[Observation]:
    # Enforce separation: agent actions are MCP-only (use FleetMCPTools).
    if isinstance(action, (ListToolsAction, CallToolAction)):
        raise TypeError(
            "Agent tool actions are MCP-only. Use FleetMCPTools.list_tools()/call_tool()."
        )

Why This Violates the Invariant:

Per RFC 001 and INVARIANTS.md, the dual API boundary is:

Agent Interface (MCP): Tools the agent uses
Infrastructure Interface (Gym-like): reset(), step(), state() for orchestration

However, RFC 003 explicitly defines ListToolsAction and CallToolAction as Gym-style actions that wrap MCP calls:

# From RFC 003, lines 476-486
@dataclass
class ListToolsAction(Action):
    """Request list of available tools from MCP servers."""
    pass

@dataclass  
class CallToolAction(Action):
    """Call a specific tool via MCP."""
    tool_name: str
    parameters: Dict[str, Any]

The Design Intent (RFC 003, line 90):

# Training/Eval: Use step API
POST http://localhost:8000/step
{"action": {...}}

# Production/Inference: Use MCP API  
POST http://localhost:8000/mcp
{"jsonrpc": "2.0", "method": "tools/list", "id": 1}

The RFC shows both paths are valid: step-based (training) and direct MCP (inference).

The Problem: Fleet client forces users to use FleetMCPTools (direct MCP) and blocks the Gym-style action interface. This creates an environment that cannot be used in standard RL training loops that expect env.step(ListToolsAction()) to work.

Suggested Resolution:

Option A (Recommended): Support both interfaces as RFC 003 intends:
- env.step(ListToolsAction()) → calls MCP internally, returns observation
- tools.list_tools() → direct MCP access for inference
Option B: If Fleet truly cannot support Gym-style actions, document this as a known limitation and explain why Fleet environments are "inference-only" (violating OpenEnv's principle of identical training/production interfaces).

Reviewer: @Darktex (RFC 003 author)

🚨 ALIGNMENT FLAG #2: Missing Reward Computation

Principle at Risk: "Rewards inside environment" (PRINCIPLES.md line 31, RFC 002)

The Concern: FleetEnvClient._parse_result() constructs observations but doesn't compute rewards:

# src/envs/fleet_env/client.py:108-115
return StepResult(
    observation=Observation(
        metadata=obs_payload,
        reward=payload.get("reward"),  # ← Passthrough, not computed
        done=payload.get("done", False),
    ),
    reward=payload.get("reward"),
    done=payload.get("done", False),
)

Why This Matters:

Per PRINCIPLES.md (line 31) and INVARIANTS.md (lines 64-67):

Rewards in environment: Reward computation must stay inside environment boundary

The client is simply passing through whatever payload.get("reward") returns from Fleet. If Fleet doesn't compute rewards (or computes them incorrectly for the task), there's no mechanism to override or augment.

Questions for Clarification:

Does the Fleet server compute task-specific rewards?
If not, where should reward computation happen for Fleet environments?
Should FleetEnvClient support a reward transform pipeline?

Current State: The README (src/envs/fleet_env/README.md) doesn't mention reward computation at all, suggesting this is unresolved.

Suggested Resolution:

Document Fleet's reward computation model in the README
If Fleet doesn't compute rewards, add a reward_fn parameter to FleetEnvClient.__init__()
Consider whether Fleet environments are suitable for RL training without reward computation

Reviewer: @Darktex

⚠️ Minor: Client-Server Separation

Pattern Compliance: The code properly separates client and server concerns:

✅ models.py defines shared types
✅ No client imports from server/
✅ No server directory (Fleet is external)

However, there's a structural inconsistency: Traditional OpenEnv environments have:

my_env/
├── client.py
├── models.py
└── server/

Fleet has:

fleet_env/
├── client.py
├── mcp_tools.py
├── fleet_mcp_client.py
├── models.py
└── README.md  # No server/ because it's remote

Recommendation: Add a comment in __init__.py explaining why there's no server/ directory:

"""Fleet Environment - client-side adapter for Fleet-hosted MCP environments.

Note: Unlike standard OpenEnv environments, fleet_env has no server/ directory
because the environment runs remotely on Fleet infrastructure. This package
only provides client-side adapters.
"""

✅ Strengths

Excellent documentation: The README is comprehensive and well-structured
Type safety: Proper use of dataclasses and type hints
Error handling: Good separation of discovery vs execution errors in MCP tools
Testing: Thoughtful use of fakes to avoid external dependencies
Dual-plane awareness: The code clearly understands orchestration vs agent boundaries

Summary of Required Actions

Before Merge (Blockers):

Fix missing HTTPEnvClient (Tier 1 bug)
- Either implement it or refactor to use EnvClient
Resolve ALIGNMENT FLAG #1 (Dual API boundary)
- Decide: Support both Gym-style actions AND direct MCP, or document as inference-only
Run lint checks
- Execute: uv run ruff format ...

Strongly Recommended:

Resolve ALIGNMENT FLAG #2 (Reward computation)
- Document Fleet's reward model or implement client-side rewards
Add architectural note to __init__.py about missing server/

Test Results

The test file (tests/envs/test_fleet_env.py) uses appropriate faking strategies and covers:

✅ Provisioning from Fleet
✅ HTTP orchestration (reset)
✅ MCP tool discovery
✅ Tool routing across multiple endpoints

However, the tests will fail due to the missing HTTPEnvClient import.

Conclusion

This PR demonstrates strong architectural understanding and solid engineering practices. The two critical alignment flags stem from genuine design tensions between:

RFC 003's vision of unified Gym+MCP actions
Fleet's separate HTTP/MCP interface design

These tensions need human review to determine the right path forward. Once resolved, this will be a valuable addition to OpenEnv's runtime ecosystem.

Recommendation: Request clarification from RFC authors before merging.

Automated review by Claude Code | Learn more about OpenEnv's agentic workflow

- FleetTaskEnv wraps FleetEnvClient with task-oriented interface - Accepts task configs from export_training_tasks.py - Creates versioned environments on reset - Injects task prompt into observations - Executes verifier for reward computation on episode completion - Supports both sync and async step methods - Factory functions: make_fleet_task_env, from_json_file - Tests: 20 unit tests for init, specs, verifiers, factories 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

zkwentz · 2026-01-21T14:45:10Z

@greptile

greptile-apps · 2026-01-21T14:53:31Z

Greptile Summary

This PR implements Fleet runtime integration for OpenEnv, enabling remote environment execution without local Docker. The implementation maintains a clean separation between HTTP orchestration (FleetEnvClient) and MCP-based agent actions (FleetMCPTools), which aligns well with the architectural principles outlined in RFC 001 and RFC 003.

Key additions:

FleetEnvClient handles environment lifecycle (reset/step/state) via HTTP
FleetMCPTools aggregates tools from multiple MCP endpoints and converts them to OpenAI format
FleetTaskEnv provides a Gymnasium-compatible wrapper for RL training with task configs and verifier execution
Comprehensive tests and documentation

Critical issues:

Security vulnerability: task_env.py:301 uses exec() to execute arbitrary verifier code without sandboxing, creating a code injection risk
Missing parameter: task_env.py:138 builds data_spec but doesn't pass data_key to from_fleet()
Exception handling: mcp_tools.py catches BaseException which suppresses system exits and keyboard interrupts

Confidence Score: 3/5

This PR introduces significant new functionality but has a critical security issue with exec() usage that needs addressing before merge
Score reflects the security vulnerability from arbitrary code execution via exec(), the missing data_key parameter bug, and overly broad exception handling. The core architecture is sound and well-tested, but these issues need resolution.
Pay close attention to src/envs/fleet_env/task_env.py (exec security risk and missing parameter) and src/envs/fleet_env/mcp_tools.py (exception handling)

Important Files Changed

Filename	Overview
src/envs/fleet_env/task_env.py	Adds Gymnasium-compatible wrapper for Fleet tasks with verifier execution using exec() which poses security risks
src/envs/fleet_env/client.py	Implements HTTP orchestration client for Fleet environments with proper separation of concerns
src/envs/fleet_env/mcp_tools.py	Implements MCP tools client with broad exception handling that may hide issues
src/envs/fleet_env/fleet_mcp_client.py	Clean MCP client wrapper with proper async context management
src/envs/fleet_env/models.py	Provides tool format conversion between MCP and OpenAI formats with schema normalization

Sequence Diagram

sequenceDiagram
    participant Agent as Agent/Policy
    participant Client as FleetEnvClient<br/>(HTTP Orchestration)
    participant Tools as FleetMCPTools<br/>(MCP Agent Actions)
    participant Fleet as Fleet Runtime<br/>(Remote)
    participant MCP1 as MCP Endpoint 1<br/>(api/v1/mcp)
    participant MCP2 as MCP Endpoint 2<br/>(mcp)

    Note over Agent,Fleet: 1. Provision Environment
    Agent->>Client: FleetEnvClient.from_fleet(api_key, env_key)
    Client->>Fleet: Fleet.make(env_key, image_type="mcp")
    Fleet-->>Client: env handle + URLs
    Client->>Client: Create FleetMCPTools with MCP URLs
    Client-->>Agent: (orch_client, mcp_tools)

    Note over Agent,Fleet: 2. Reset Episode (HTTP)
    Agent->>Client: reset()
    Client->>Fleet: POST /reset
    Fleet-->>Client: initial observation
    Client-->>Agent: observation

    Note over Agent,MCP2: 3. Discover Tools (MCP)
    Agent->>Tools: list_tools()
    Tools->>MCP1: list_tools() via streamable HTTP
    MCP1-->>Tools: tools list A
    Tools->>MCP2: list_tools() via streamable HTTP
    MCP2-->>Tools: tools list B
    Tools->>Tools: Union tools + convert to OpenAI format
    Tools-->>Agent: ListToolsAction(tools)

    Note over Agent,MCP2: 4. Execute Tool (MCP)
    Agent->>Tools: call_tool(name, args)
    Tools->>Tools: Check owner cache for tool
    Tools->>MCP1: call_tool(name, args) via streamable HTTP
    MCP1-->>Tools: tool result
    Tools-->>Agent: result

    Note over Agent,Fleet: 5. Cleanup
    Agent->>Client: close()
    Client->>Fleet: env.close()
    Fleet-->>Client: terminated

greptile-apps

_{12 files reviewed, 7 comments}

_{Edit Code Review Agent Settings | Greptile}

greptile-apps · 2026-01-21T14:53:25Z

src/envs/fleet_env/task_env.py

+        namespace = {}
+
+        # Execute the verifier code to define the function
+        exec(verifier_code, namespace)


logic: Using exec() on arbitrary code strings is a significant security vulnerability. This allows execution of untrusted Python code without sandboxing.

Suggested change

exec(verifier_code, namespace)

# SECURITY: exec() allows arbitrary code execution

# Consider using ast.literal_eval for safe data structures

# or a sandboxed execution environment like RestrictedPython

exec(verifier_code, namespace)

Is the verifier_code always from trusted sources, or could it come from user-provided task configs?

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/fleet_env/task_env.py Line: 301:301 Comment: **logic:** Using `exec()` on arbitrary code strings is a significant security vulnerability. This allows execution of untrusted Python code without sandboxing. ```suggestion # SECURITY: exec() allows arbitrary code execution # Consider using ast.literal_eval for safe data structures # or a sandboxed execution environment like RestrictedPython exec(verifier_code, namespace) ``` Is the verifier_code always from trusted sources, or could it come from user-provided task configs? How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:53:26Z

src/envs/fleet_env/mcp_tools.py

+            except BaseException:  # noqa: BLE001
+                continue


logic: BaseException catches system exits and keyboard interrupts. Use Exception instead to allow proper cleanup and termination signals.

Suggested change

except BaseException: # noqa: BLE001

continue

except Exception: # noqa: S110

continue

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/fleet_env/mcp_tools.py Line: 57:58 Comment: **logic:** `BaseException` catches system exits and keyboard interrupts. Use `Exception` instead to allow proper cleanup and termination signals. ```suggestion except Exception: # noqa: S110 continue ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:53:27Z

src/envs/fleet_env/mcp_tools.py

+            except BaseException:
+                # Only suppress discovery/connection errors.
+                # If call_tool raised, it would have bubbled up above.
+                continue


logic: BaseException catches system exits and keyboard interrupts. Use Exception instead.

Suggested change

except BaseException:

# Only suppress discovery/connection errors.

# If call_tool raised, it would have bubbled up above.

continue

except Exception:

# Only suppress discovery/connection errors.

# If call_tool raised, it would have bubbled up above.

continue

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/fleet_env/mcp_tools.py Line: 76:79 Comment: **logic:** `BaseException` catches system exits and keyboard interrupts. Use `Exception` instead. ```suggestion except Exception: # Only suppress discovery/connection errors. # If call_tool raised, it would have bubbled up above. continue ``` How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:53:28Z

src/envs/fleet_env/task_env.py

+            env_key=env_spec,
+            data_key=data_spec,
+            ttl_seconds=self.ttl_seconds,


logic: data_key parameter not passed to from_fleet() - the built data_spec is unused.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/fleet_env/task_env.py Line: 138:140 Comment: **logic:** `data_key` parameter not passed to `from_fleet()` - the built `data_spec` is unused. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:53:29Z

src/envs/fleet_env/client.py

+        if self._fleet_env:
+            self._fleet_env.close()


style: Check self._fleet_env is not None instead of truthiness since the handle could be falsy but still valid.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/fleet_env/client.py Line: 136:137 Comment: **style:** Check `self._fleet_env is not None` instead of truthiness since the handle could be falsy but still valid. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:53:30Z

src/envs/fleet_env/task_env.py

+        import asyncio
+        return asyncio.get_event_loop().run_until_complete(self.step_async(action))


style: Calling asyncio.get_event_loop().run_until_complete() can fail if an event loop is already running. Consider documenting that step() should only be used from sync code, or handle the RuntimeError.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/fleet_env/task_env.py Line: 191:192 Comment: **style:** Calling `asyncio.get_event_loop().run_until_complete()` can fail if an event loop is already running. Consider documenting that `step()` should only be used from sync code, or handle the `RuntimeError`. How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-01-21T14:53:30Z

src/envs/fleet_env/task_env.py

+            try:
+                self._orch.close()
+            except Exception:
+                pass


style: Silently catching all exceptions during cleanup can hide issues. Consider logging the exception.

_{Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!}

Prompt To Fix With AI

This is a comment left during a code review. Path: src/envs/fleet_env/task_env.py Line: 324:327 Comment: **style:** Silently catching all exceptions during cleanup can hide issues. Consider logging the exception. <sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub> How can I resolve this? If you propose a fix, please make it concise.

The MCP images don't exist for all environment versions, causing FleetVersionNotFoundError when trying to create environments. Changing the default to None allows the Fleet SDK to use standard images which are available for all versions. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

FleetEnvClient.from_fleet() was not accepting data_key/data_version parameters, causing them to be passed through **kwargs to HTTPEnvClient which doesn't accept them. - Add data_key and data_version as explicit parameters - Pass them to fleet.make() - Update task_env.py to pass them separately 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Fleet SDK expects data_key in "key:version" format, not as separate parameters. Updated from_fleet() to combine them before calling fleet.make(). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

HTTPEnvClient.reset() doesn't support seed parameter yet. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Increases default timeout from 15s to 60s for Fleet API calls. This prevents timeouts during environment initialization. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Previously reset() did partial work and reset_async() added tool fetching. Now reset_async() does all the work (including fetching tools) and reset() is just a sync wrapper that calls it via run_until_complete(). This ensures both methods return identical results including tools. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

MCP's call_tool() returns a CallToolResult Pydantic object, not plain text. This was causing ugly repr strings to be passed to agents like: "meta=None content=[TextContent(type='text', text='...')] ..." Now properly extracts: - Text content from result.content[].text - Tries JSON parsing for structured results - Falls back to structuredContent if available - Handles isError cases 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Tests for: - FleetMCPClient._extract_tool_result(): - Single text content extraction - JSON parsing from text - Multiple text contents - Error result handling - Structured content fallback - Empty result handling - FleetTaskEnv reset: - reset_async() returns tools - reset() calls reset_async() (sync wrapper) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Move fleet.make() and list_tools() into FleetTaskEnv.__init__() - Tools are now fetched at env creation, not during reset - reset_async() calls _orch.reset() with error handling, returns cached tools - Use asyncio.run() for Python 3.13 compatibility - Update tests for new initialization pattern 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Log task_key and verifier code preview when verifier fails - Catch syntax errors separately with clear message - Show which functions were found if 'verify' is missing Helps debug issues like "Verifier code must define a 'verify' function" 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Replace custom _execute_verifier_local() with Fleet SDK's Task.verify_detailed() which properly sets up the verifier namespace with: - Environment type annotation - Helper functions (normalized_contains, etc.) - Proper function discovery (not just "verify" function) This fixes "name 'Environment' is not defined" errors during verifier execution. Changes: - _compute_reward: Create Fleet SDK Task and call verify_detailed() - Support both 'verifier_code' and 'verifier_func' field names - Add comprehensive logging for debugging - Remove broken _execute_verifier_local method Tests: - Update all verifier tests to mock Fleet SDK Task.verify_detailed() - Add tests for various edge cases (no verifier, no orch, exceptions) - Fix fixture to avoid asyncio.run() conflicts with pytest-asyncio 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

…context

- Add retry with exponential backoff (3 attempts, 1s/2s/4s delays) - Log errors instead of silently swallowing exceptions - Log warning when some clients fail but others succeed - Log error after all retries exhausted This fixes silent failures when MCP connections are flaky, which caused 'no tools found' errors in SkyRL training.

call_tool now retries with exponential backoff (3 attempts, 1s/2s/4s) on connection errors, similar to list_tools. ValueError (tool not found) is not retried. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Adds exponential backoff retry (3 attempts, 2s base delay) around fleet.make() to handle transient Fleet API errors like health check failures that can occur during instance provisioning. Only retries on transient errors (health check, timeout, connection). Permanent errors are raised immediately. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Add Toolathlon-style context management tools for long trajectories: - check_context: Check visible/total turn counts - manage_context: Drop old turns to free up context space - search_history: Search all history (including dropped) - search_tool_output: Search truncated tool output - view_tool_output: Paginate through truncated output The ContextManager class can be used by any training framework that maintains chat_history. It tracks full history and handles truncated tool outputs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Computer-use tasks require MCP-enabled container images (e.g., famazon:mcp0.0.7) which have scrot installed for screenshots and the MCP server with 'computer' tool for mouse/keyboard control.

Deniz added 5 commits December 12, 2025 12:14

fleet integartion step 0

38edfc2

updated README

f67bc43

readme update

164853c

another iteraton

935826f

readme

7c09d5b

meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Dec 18, 2025

dzorlu changed the title ~~Deniz/fleet client~~ feat: Fleet client Dec 18, 2025

Darktex reviewed Jan 13, 2026

View reviewed changes

Deniz and others added 2 commits January 16, 2026 22:07

conb

7efae22

greptile-apps bot reviewed Jan 21, 2026

View reviewed changes

Deniz and others added 17 commits January 21, 2026 19:11

Add __init__.py to envs package for pip install compatibility

791a071

fix: Remove seed parameter from reset() call

7852847

HTTPEnvClient.reset() doesn't support seed parameter yet. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>

Fix: fetch tools lazily in reset_async to avoid asyncio.run in async …

ced5eca

…context

debug: add logging for call_tool to trace success/failure paths

a2f3531

fix: unwrap ExceptionGroup to show actual error cause

a08cb6d

Deniz and others added 4 commits January 28, 2026 17:15

debug: add logging for Fleet instance creation timing

9806eb8

Use image_type='mcp' for computer_use tasks

a1ac1a7

Computer-use tasks require MCP-enabled container images (e.g., famazon:mcp0.0.7) which have scrot installed for screenshots and the MCP server with 'computer' tool for mouse/keyboard control.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: Fleet client #256

feat: Fleet client #256

Uh oh!

dzorlu commented Dec 18, 2025

Uh oh!

Darktex left a comment

Uh oh!

zkwentz commented Jan 21, 2026

Uh oh!

greptile-apps bot commented Jan 21, 2026

Uh oh!

greptile-apps bot left a comment

Uh oh!

greptile-apps bot Jan 21, 2026

Uh oh!

greptile-apps bot Jan 21, 2026

Uh oh!

greptile-apps bot Jan 21, 2026

Uh oh!

greptile-apps bot Jan 21, 2026

Uh oh!

greptile-apps bot Jan 21, 2026

Uh oh!

greptile-apps bot Jan 21, 2026

Uh oh!

greptile-apps bot Jan 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

-        exec(verifier_code, namespace)
+        # SECURITY: exec() allows arbitrary code execution
+        # Consider using ast.literal_eval for safe data structures
+        # or a sandboxed execution environment like RestrictedPython
+        exec(verifier_code, namespace)

		import asyncio
		return asyncio.get_event_loop().run_until_complete(self.step_async(action))

feat: Fleet client #256

Are you sure you want to change the base?

feat: Fleet client #256

Uh oh!

Conversation

dzorlu commented Dec 18, 2025

PR: Fleet environments (OpenEnv)

What this enables

What this is not

Main abstractions

Quickstart

Uh oh!

Darktex left a comment

Choose a reason for hiding this comment

PR #256: Fleet Client - Alignment Review

Executive Summary

Tier 1: Bugs, Security, and Code Quality

🔴 Critical: Missing Dependency

⚠️ Lint Check Failed

✅ Security: No Issues Found

✅ Debug Code: Acceptable

Tier 2: Alignment with Principles and Invariants

🚨 ALIGNMENT FLAG #1: Dual API Boundary Violation

🚨 ALIGNMENT FLAG #2: Missing Reward Computation

⚠️ Minor: Client-Server Separation

✅ Strengths

Summary of Required Actions

Before Merge (Blockers):

Strongly Recommended:

Test Results

Conclusion

Uh oh!

zkwentz commented Jan 21, 2026

Uh oh!

greptile-apps bot commented Jan 21, 2026

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Jan 21, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants