Add live-docs-lookup skill for AI SDK doc grounding by dhegde11 · Pull Request #646 · obra/superpowers

dhegde11 · 2026-03-06T22:55:31Z

Motivation and Context

Adds a new skill that fetches live documentation from Anthropic, OpenAI, or Google before answering any AI SDK question so stale training data doesn't cause bugs.

Built after a real failure: I used superpowers (brainstorming → writing-plans → executing-plans) to build a research tool that calls the Anthropic Messages API for 1000 companies. The agent made three confident, silent mistakes, all caused by stale training data:

Missed the Messages Batches API — 50% cost reduction, purpose-built for this workload
Treated native server tools as custom implementations to build: it didn't know that Anthropic provides a built-in code execution tool along with web search (web_search_20260209) and web fetch (web_fetch_20260209)
Used deprecated tool name syntax throughout

No errors, no hedges — just wrong code.

How Has This Been Tested?

Three pressure tests with empirical baseline runs (March 2026):

test-pressure-1 (Anthropic): budget_tokens deprecation on Opus 4.6 — quantitative benchmark: 100% pass rate with skill vs 45% without
test-pressure-2 (OpenAI): unverified model IDs in Responses API guidance
test-pressure-3 (Google): gemini-1.5 recommended when gemini-2.0 is current

See CREATION-LOG.md for full details.

Breaking Changes

None — additive only.

Types of changes

New feature (non-breaking change which adds functionality)

Checklist

My code follows the repository's style guidelines
I have added or updated documentation as needed

Additional context

Supports Anthropic, OpenAI, and Google. Provider detected automatically from imports, env vars, or model names. Falls back to web search if a doc URL has moved.
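
As an illustration of the detection step described above, here is a minimal sketch in Python. It is not the skill's actual logic (the skill is documentation-only); the `PROVIDER_SIGNALS` table and the `detect_providers` helper are hypothetical names. Only the Google import patterns appear in this PR's review thread; the Anthropic and OpenAI imports, the environment variable names, and the model-name prefixes are assumptions chosen for the example.

```python
import re

# Hypothetical signal table. Only the three Google import patterns are taken
# from this PR; every other pattern is an illustrative assumption.
PROVIDER_SIGNALS = {
    "anthropic": [
        r"\bimport anthropic\b",
        r"\bfrom anthropic import\b",
        r"\bANTHROPIC_API_KEY\b",
        r"\bclaude-[\w.-]+",
    ],
    "openai": [
        r"\bimport openai\b",
        r"\bfrom openai import\b",
        r"\bOPENAI_API_KEY\b",
        r"\bgpt-[\w.-]+",
    ],
    "google": [
        r"\bfrom google import genai\b",
        r"\bimport google\.generativeai\b",
        r"\bimport vertexai\b",
        r"\bGEMINI_API_KEY\b",
        r"\bgemini-[\w.-]+",
    ],
}


def detect_providers(text: str) -> list[str]:
    """Return every provider whose import, env var, or model-name signal appears in text."""
    return [
        provider
        for provider, patterns in PROVIDER_SIGNALS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    ]
```

A prompt or source file can match more than one provider, so the sketch returns a list rather than a single name.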

coderabbitai · 2026-03-06T23:02:44Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Adds documentation-only "live-docs-lookup" skill: automatic provider detection and per-provider live documentation fetching (Anthropic, OpenAI, Google/Gemini), concise grounding summaries for models/APIs, a creation log, and academic/pressure tests illustrating expected agent behavior changes.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Core Skill Docs: `skills/live-docs-lookup/CREATION-LOG.md`, `skills/live-docs-lookup/SKILL.md` | New docs: creation log and detailed skill spec covering provider detection, per-provider URL mappings, parallel fetch + fallback strategy, 30s fetch workflow, verification, and 5–10 bullet grounding output. |
| Tests (Academic & Pressure): `skills/live-docs-lookup/test-academic.md`, `skills/live-docs-lookup/test-pressure-1.md`, `skills/live-docs-lookup/test-pressure-2.md`, `skills/live-docs-lookup/test-pressure-3.md` | Adds academic test and three pressure tests (Anthropic extended-thinking deprecation, OpenAI Responses vs Chat Completions, Google/Gemini SDK/model-ID selection) with scenarios, baseline, expected behavior with the skill, and key assertions. |
| Repository README: `README.md` | Adds Skills Library entry for the live-docs-lookup skill (appears twice in the same subsection). |

Sequence Diagram(s)

sequenceDiagram
  participant Agent
  participant ProviderDetector as ProviderDetector
  participant DocFetcher as DocFetcher
  participant WebSearchFallback as WebSearchFallback
  participant Summarizer as Summarizer

  Agent->>ProviderDetector: supply prompt / code / env
  ProviderDetector->>Agent: detected provider(s)
  ProviderDetector->>DocFetcher: request provider-specific docs
  DocFetcher->>DocFetcher: fetch model/docs pages (parallel)
  DocFetcher-->>WebSearchFallback: fallback restricted web search if missing
  WebSearchFallback-->>DocFetcher: return candidate pages
  DocFetcher->>Summarizer: deliver fetched docs
  Summarizer->>Agent: return 5–10 bullet grounding + suggested fixes
  Agent->>Agent: continue original task using grounded context
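
The fetch-then-fallback part of the diagram above can be sketched in Python as follows. This is a hedged illustration, not code from the PR: `fetch_with_fallback` is a hypothetical helper, and both the `fetch` and `fallback` callables are injected by the caller so the sketch makes no assumption about which HTTP client or search tool the agent actually uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

# A fetcher takes a URL and returns page text, or None when nothing was found.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(
    urls: list[str],
    fetch: Fetcher,
    fallback: Fetcher,
    max_workers: int = 4,
) -> dict[str, str]:
    """Fetch all URLs in parallel; route each failed URL through the fallback."""

    def fetch_one(url: str) -> tuple[str, Optional[str]]:
        try:
            text = fetch(url)
        except Exception:
            # Treat any fetch error (moved page, 404, timeout) as a miss.
            text = None
        if text is None:
            text = fallback(url)
        return url, text

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch_one, urls))

    # Drop URLs for which neither the direct fetch nor the fallback produced text.
    return {url: text for url, text in results if text is not None}
```

Passing the fetchers in as arguments keeps the parallelism and the fallback rule testable without network access.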

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 I hopped through docs both near and far,

Nosed out new IDs, chased each tiny star,
I fetched the live pages, flagged what broke and shone,
Brought back tidy notes so the agent’s not alone,
A rabbit’s grounding gift — fresh facts to take home.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly matches the main change: adding a new live-docs-lookup skill for grounding AI SDK documentation. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Description check | ✅ Passed | The PR description clearly explains the motivation (preventing stale training data bugs), provides concrete real-world failure examples, documents testing approach with empirical results, and describes the scope and implementation details. |

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (1)

skills/live-docs-lookup/SKILL.md (1)
1-9: Add this skill to the main skills index for discoverability.

README.md:88-115 enumerates the available skills, but this new one is missing there. Shipping the skill without updating the central list makes it much harder to discover than the rest of the library.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/live-docs-lookup/SKILL.md` around lines 1 - 9, The new skill
"live-docs-lookup" is missing from the central skills index, so add an entry for
it into the README skills list (the section that enumerates available skills)
matching the format used for other skills: include the skill name
"live-docs-lookup", a one-line description pulled from SKILL.md, and a
link/reference to the SKILL.md file so it appears in discoverability lists;
ensure the entry follows the same ordering/format and syntax as existing skill
entries in the README skills index.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/live-docs-lookup/CREATION-LOG.md`:
- Around line 18-28: Update the wording so the docs do not present
web_search_20260209 / web_fetch_20260209 as canonical tool identifiers; instead
state that Anthropic provides native server-hosted web search and web fetch
tools (e.g., historically referenced as web_search_20250305, web_fetch_20250910,
etc.) and add a note instructing developers to verify the exact current `type`
identifier in the official docs before implementation.

In `@skills/live-docs-lookup/SKILL.md`:
- Around line 46-49: Update the fallback instruction in the "If a URL fails or
returns a redirect/404" section to require domain-scoped searches: when a doc
URL fails, search only the vendor's official docs domain (e.g.,
platform.claude.com/docs, developers.openai.com/api/docs, ai.google.dev) using
the doc topic and provider name instead of a general web search; change the text
to explicitly instruct the agent to restrict queries to the provider's canonical
docs domain and include example domain patterns to enforce grounding.
- Around line 30-31: The provider detection list is missing the modern Google
Gen AI import path; add detection for the import pattern "from google import
genai" alongside the existing "import google.generativeai" and "import vertexai"
entries so the skill recognizes projects using the official google-genai SDK.
Update the Google row in SKILL.md to include the `from google import genai`
token (and any equivalent identifier like `genai`) so functions that parse these
strings (provider detection logic) will trigger for the new SDK import style.

In `@skills/live-docs-lookup/test-pressure-3.md`:
- Around line 21-30: The test hardcodes an outdated model ID
(`gemini-2.0-flash`) in the expectation block; update the test in
skills/live-docs-lookup/test-pressure-3.md to either replace that hardcoded ID
with the current recommended IDs (`gemini-3-flash` and `gemini-3.1-pro`)
or—preferably—change the assertion logic referenced by the "whichever is current
at fetch time" check to validate that the fetched model recommendation is newer
than `gemini-2.0-flash` (i.e., a freshness comparison) instead of exact string
matching, so the test enforces dynamic live-docs fetching rather than a fixed
model name.
- Around line 36-50: The test expects the deprecated Python SDK
`google-generativeai`; update it to the current GenAI SDK by replacing
references to `google-generativeai` with `google-genai` and change import
guidance to the new syntax `from google import genai`; also update any example
model IDs from legacy names like `gemini-pro` to current aliases such as
`gemini-2.0-flash` (or the live recommended alias) so the test reflects live-doc
grounding and non-deprecated usage.
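
The freshness comparison suggested for test-pressure-3 could look roughly like the sketch below. It is an illustration only: `gemini_version` and `is_newer` are hypothetical helpers, and the sketch assumes that model IDs embed a dotted version number directly after `gemini-`, which unversioned aliases such as `gemini-pro` do not.

```python
import re


def gemini_version(model_id: str) -> tuple[int, ...]:
    """Extract the numeric version from an ID such as 'gemini-2.0-flash'."""
    match = re.match(r"gemini-(\d+(?:\.\d+)*)", model_id)
    if match is None:
        # Unversioned aliases cannot be ordered, so fail loudly instead of guessing.
        raise ValueError(f"not a versioned Gemini model ID: {model_id!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_newer(candidate: str, baseline: str) -> bool:
    """True when candidate has a strictly higher version than baseline."""
    a, b = gemini_version(candidate), gemini_version(baseline)
    width = max(len(a), len(b))
    # Pad with zeros so that '3' and '3.0' compare as equal versions.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a > b
```

An assertion built on `is_newer(fetched_id, "gemini-2.0-flash")` would keep passing when the recommended model changes, whereas an exact string match would go stale.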

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: bb5b2716-5e80-474b-a38f-068d40fd5873

📥 Commits

Reviewing files that changed from the base of the PR and between e4a2375 and 581e46d.

📒 Files selected for processing (6)

skills/live-docs-lookup/CREATION-LOG.md
skills/live-docs-lookup/SKILL.md
skills/live-docs-lookup/test-academic.md
skills/live-docs-lookup/test-pressure-1.md
skills/live-docs-lookup/test-pressure-2.md
skills/live-docs-lookup/test-pressure-3.md

- Add `from google import genai` to Google provider detection (google-genai SDK)
- Domain-scope fallback URL search to official docs domains (platform.claude.com/docs, platform.openai.com/docs, ai.google.dev)
- test-pressure-3: update SDK references to recommend google-genai over legacy google-generativeai; remove pinned model ID expectation in favor of "whichever is current at fetch time"
- CREATION-LOG: clarify that tool type identifiers (e.g. web_search_20260209) are versioned and should be verified in official docs
- README: add live-docs-lookup to skills index under AI SDK Grounding category

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

skills/live-docs-lookup/test-pressure-3.md (1)

21-25: ⚠️ Potential issue | 🟠 Major

Remove the baked-in “current” Gemini model from the baseline narrative.

Line 23 hardcodes gemini-2.0-flash as the default “as of March 2026”, but Lines 47-48 and 55-56 say this test should validate whichever model is current at fetch time. That makes the doc stale again as soon as Google updates its recommendation and undercuts the PR’s goal of removing pinned model-ID expectations.

Suggested doc update

-**Confirmed failure — stale model IDs:**
-The baseline recommended `gemini-1.5-flash` and `gemini-1.5-pro` as current
-models. As of March 2026, `gemini-2.0-flash` is the recommended default and
-the 1.5 series has been superseded. The baseline even acknowledged its own
+**Confirmed failure — stale model IDs:**
+The baseline recommended `gemini-1.5-flash` and `gemini-1.5-pro` as current
+models instead of checking the live models page for the fetch-time
+recommendation. The baseline even acknowledged its own
 uncertainty ("check the docs for latest models") — but still gave 1.5-series
 IDs, which developers will copy-paste into production code.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@skills/live-docs-lookup/test-pressure-3.md` around lines 21 - 25, Remove the
baked-in “current” Gemini model string and date from the baseline narrative:
delete the hardcoded "gemini-2.0-flash" token and the "as of March 2026"
phrasing and replace with a short, neutral sentence that says the test will
validate whichever Gemini model is returned by the API at fetch time (i.e.,
follow live docs). Update the narrative text that currently contrasts with the
later lines (the passages referencing model validation) so it no longer asserts
a specific recommended model and instead directs readers to query the official
Google Gemini models page for the current recommendation.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/live-docs-lookup/SKILL.md`:
- Around line 132-154: The checklist in SKILL.md's "Common things to verify"
hardcodes rapidly changing provider facts (e.g., Anthropic bullets like
budget_tokens, beta header values, model suffix guidance, OpenAI/Google model
examples), so replace concrete assertions with directive guidance and doc
pointers: edit the Anthropic, OpenAI and Google subsections (the listed bullet
items) to state what to verify (e.g., "confirm current token/format flags,
required beta headers, and exact model IDs") and instruct the agent to consult
the provider's official docs or models page rather than embedding specific
values; remove or reword all specific deprecated flag names and header strings
into generic checks and add a short note like "see provider docs" for each
provider. Ensure the updated text preserves the "what to verify" intent and
removes mutable factual claims.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a746d69d-dcd6-416b-9618-8e6e6de2f7d5

📥 Commits

Reviewing files that changed from the base of the PR and between 581e46d and 0ef9250.

📒 Files selected for processing (4)

README.md
skills/live-docs-lookup/CREATION-LOG.md
skills/live-docs-lookup/SKILL.md
skills/live-docs-lookup/test-pressure-3.md

🚧 Files skipped from review as they are similar to previous changes (1)

skills/live-docs-lookup/CREATION-LOG.md

…line

SKILL.md: Replace specific deprecated-fact assertions in "Common things to verify" with durable watch-list items. This keeps named parameters as search anchors (budget_tokens, output_format, beta headers) but removes current-state claims that would age. Apply CodeRabbit's OpenAI and Google sections as-is.

test-pressure-3.md: Remove hardcoded gemini-2.0-flash from baseline narrative ("as of March 2026, gemini-2.0-flash is the recommended default"). It now says "at least one generation behind at time of testing", consistent with the test assertion that the skill should recommend whichever model is current at fetch time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/live-docs-lookup/SKILL.md`:
- Around line 46-55: Update the OpenAI docs domain used in the fallback search
guidance in SKILL.md: replace the `site:platform.openai.com/docs` entry with
`site:developers.openai.com` so fallback queries target the accessible canonical
documentation domain; ensure the example query (`site:platform.openai.com/docs
responses API`) is updated to use `site:developers.openai.com` as well and keep
the surrounding text/format intact.
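
A domain-scoped fallback query of the kind discussed here could be built as in the following sketch. The `DOCS_DOMAINS` table and `scoped_query` helper are hypothetical; the domains are the ones named in this review thread and should be treated as a snapshot that may change, not as a maintained list.

```python
# Docs domains as named in this PR's review thread (assumed current only at
# the time of the review).
DOCS_DOMAINS = {
    "anthropic": "platform.claude.com/docs",
    "openai": "developers.openai.com",
    "google": "ai.google.dev",
}


def scoped_query(provider: str, topic: str) -> str:
    """Build a search query restricted to the provider's official docs domain."""
    try:
        domain = DOCS_DOMAINS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    return f"site:{domain} {topic}"
```

For example, `scoped_query("openai", "responses API")` yields `site:developers.openai.com responses API`, which keeps the fallback search off third-party pages.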

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d5e7606a-5ee0-4e1e-a07b-a2e9274709fa

📥 Commits

Reviewing files that changed from the base of the PR and between 0ef9250 and aaa72a5.

📒 Files selected for processing (2)

skills/live-docs-lookup/SKILL.md
skills/live-docs-lookup/test-pressure-3.md

Use site:developers.openai.com instead — verified accessible (HTTP 200) and is OpenAI's canonical documentation domain. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The original failure #2 ("treated web search and web fetch as custom tools") was not accurate: the agent did use native server tools correctly. Reduced to two documented failures. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/live-docs-lookup/SKILL.md`:
- Around line 78-91: The OpenAI doc URLs under the "OpenAI" section in SKILL.md
are returning 403; update the listed links (the Model IDs, Chat completions,
Responses API, Function calling, Streaming, Structured outputs, Assistants API,
Batch API, and Rate limits entries) to use the accessible developers.openai.com
domain instead of platform.openai.com (e.g., replace
https://platform.openai.com/docs/... with the corresponding
https://developers.openai.com/docs/... URLs) so the skill can fetch those pages
directly rather than falling back to search.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 475ea557-21a4-4a67-a0e9-45d141745c47

📥 Commits

Reviewing files that changed from the base of the PR and between aaa72a5 and e510c8c.

📒 Files selected for processing (2)

skills/live-docs-lookup/CREATION-LOG.md
skills/live-docs-lookup/SKILL.md

platform.openai.com/docs returns HTTP 403 to non-browser fetches. Updated all OpenAI URLs to the accessible developers.openai.com domain with verified paths. Also removed Assistants API row (now labeled legacy). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

dhegde11 · 2026-03-07T05:45:26Z

Ready for review. Done with changes based on CodeRabbit's feedback.

Fetches live documentation from Anthropic/OpenAI/Google before answering AI SDK questions, preventing stale training data from causing bugs. Upstream PR: obra#646 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

obra · 2026-03-10T00:29:33Z

Hi! I'd encourage you to publish this skill as its own plugin.

Add live-docs-lookup skill for AI SDK doc grounding

581e46d

coderabbitai bot reviewed Mar 6, 2026

coderabbitai bot reviewed Mar 7, 2026

coderabbitai bot reviewed Mar 7, 2026

Fix OpenAI fallback search domain (platform.openai.com/docs returns 403)

0a7c1ea

dhegde11 closed this Mar 7, 2026

dhegde11 reopened this Mar 7, 2026

coderabbitai bot reviewed Mar 7, 2026

obra closed this Mar 10, 2026
