docs: document preflight sentinel, evaluator interfaces, and timeouts #2
- New "2b. Choose Your Evaluator Interface" section with decision table: Python API vs CLI command vs HTTP endpoint
- New "Preflight Behavior" section documenting the undocumented __optimize_anything_preflight__ sentinel and 10s timeout
- Document the 30s command_evaluator timeout (not configurable via CLI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Document __optimize_anything_preflight__ sentinel in Evaluator Contract
- Add step 7 to Workflow: test preflight fast-return

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Added preflight guard to shared I/O contract section
- Pattern 1 (eval_prompt.py): guard after candidate extraction
- Pattern 3 (eval_docs.py): guard after candidate extraction
- Pattern 4 (eval_agent.py): guard checks raw payload (before .lower())

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🟡 Pattern 2 (bash evaluator) missing preflight guard despite commit claiming all patterns updated
The commit message states "add preflight guard to all evaluator pattern templates" and the I/O contract section (skills/evaluator-patterns/SKILL.md:14) recommends a preflight guard for all evaluators, yet Pattern 2 (evaluator.sh, lines 138–208) is the only template that was not updated with one. Ironically, this is the bash evaluator that runs pytest — arguably the slowest pattern and the most likely to exceed the 10-second preflight timeout (src/optimize_anything/preflight.py:99). Users who copy this template verbatim will have their evaluator fail the preflight check, blocking optimization entirely (src/optimize_anything/cli.py:490-491).
(Refers to lines 138-209)
Prompt for agents
In skills/evaluator-patterns/SKILL.md, add a preflight guard to the Pattern 2 bash evaluator template (evaluator.sh). After the line that extracts the candidate (line 143, candidate="$(printf '%s' "$payload" | python3 -c ...)"), add a preflight check before the workdir/pytest logic, similar to:
# Preflight guard for optimize-anything CLI
if [ "$candidate" = "__optimize_anything_preflight__" ]; then
  printf '{"score":0.5}\n'
  exit 0
fi
This should be inserted around line 144, before the mktemp/workdir creation on line 145.
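Once the guard is in place, the fast return can be checked locally before running the CLI. The following is a minimal sketch, not the project's own preflight code: the helper name `passes_preflight`, the acceptance criteria (exit code 0 and a JSON object with a numeric `score`), and the demo evaluator script are all assumptions made for illustration. Only the sentinel value, the payload shape, and the 10-second limit come from this PR.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

PREFLIGHT_SENTINEL = "__optimize_anything_preflight__"  # value quoted in this PR
PREFLIGHT_TIMEOUT_S = 10  # timeout cited in this review

# Hypothetical guarded evaluator, used only to exercise the helper below.
GUARDED_EVALUATOR = """\
import json
import sys

payload = json.load(sys.stdin)
if payload.get("candidate") == "__optimize_anything_preflight__":
    print(json.dumps({"score": 0.5}))
    sys.exit(0)
print(json.dumps({"score": 0.0}))
"""


def passes_preflight(command, timeout=PREFLIGHT_TIMEOUT_S):
    """Pipe the sentinel payload to `command` and report whether it answers in time.

    Assumption: a response counts as passing when the process exits with
    status 0 and prints a JSON object containing a numeric "score".
    """
    payload = json.dumps({"candidate": PREFLIGHT_SENTINEL})
    try:
        proc = subprocess.run(
            command,
            input=payload,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    if proc.returncode != 0:
        return False
    try:
        result = json.loads(proc.stdout)
    except json.JSONDecodeError:
        return False
    return isinstance(result, dict) and isinstance(result.get("score"), (int, float))


with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "guarded_eval.py"
    script.write_text(GUARDED_EVALUATOR)
    guarded_ok = passes_preflight([sys.executable, str(script)])
    # An evaluator that does slow work before answering fails a short limit.
    slow_ok = passes_preflight(
        [sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5
    )

print("guarded evaluator passes:", guarded_ok)
print("slow evaluator passes:", slow_ok)
```

The same helper should work for the bash template by passing a command such as `["bash", "evaluator.sh"]`, assuming the script reads its payload from stdin as the template does.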
Summary
- Document the __optimize_anything_preflight__ sentinel sent by the CLI with a 10-second timeout, and how to handle it in evaluators
- Document the 30s command_evaluator timeout (not configurable via CLI)

Motivation
While integrating optimize-anything with a production evaluator that makes real Gemini API calls (~30-60s per eval), the CLI preflight timed out at 10s. The evaluator parsed the preflight payload as a real candidate because the sentinel value (__optimize_anything_preflight__) is undocumented. Additionally, the 30s command_evaluator timeout would have been the next blocker even after fixing preflight.

These docs prevent future users from hitting the same wall and guide them toward the Python API when their evaluator is in-process Python code.
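To make the failure mode concrete, here is a minimal sketch of an evaluator that returns early on the sentinel instead of treating it as a real candidate. The function names `evaluate`, `run`, and `expensive_score` are hypothetical, and `expensive_score` is a cheap stand-in for the slow API call. The sentinel string, the `candidate` key, and the `{"score": 0.5}` reply are taken from this PR.

```python
import io
import json

PREFLIGHT_SENTINEL = "__optimize_anything_preflight__"  # value quoted in this PR


def expensive_score(candidate: str) -> float:
    """Hypothetical stand-in for the real scoring call (e.g. a slow API request)."""
    return min(len(candidate) / 100.0, 1.0)


def evaluate(payload: dict) -> dict:
    """Score one payload, short-circuiting on the preflight sentinel."""
    candidate = payload.get("candidate", "")
    # Preflight guard: reply immediately, before any slow work starts.
    if candidate == PREFLIGHT_SENTINEL:
        return {"score": 0.5}
    return {"score": expensive_score(candidate)}


def run(stdin, stdout) -> None:
    """Read one JSON payload from stdin and write one JSON result to stdout."""
    payload = json.load(stdin)
    json.dump(evaluate(payload), stdout)
    stdout.write("\n")


# Demo with in-memory streams; a real evaluator script would call
# run(sys.stdin, sys.stdout) instead.
out = io.StringIO()
run(io.StringIO('{"candidate": "__optimize_anything_preflight__"}'), out)
print(out.getvalue(), end="")
```

Placing the guard directly after candidate extraction keeps the preflight reply independent of network latency, which is the property the 10-second limit depends on.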
Test plan
echo '{"candidate":"__optimize_anything_preflight__"}' | python3 eval_prompt.py returns {"score":0.5} instantly

🤖 Generated with Claude Code