Conversation
… types

- Add rate-limiter utility with per-endpoint backoff and retry logic.
- Improve embedding handler with better error handling and fallback support.
- Refactor init, audio, image, object, and text model handlers for @elizaos/core API compatibility.
- Expand types and add banner.
- Update README with comprehensive documentation.

Co-authored-by: Cursor <cursoragent@cursor.com>
> **Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the repository settings, or use the following commands to manage reviews.
No actionable comments were generated in the recent review.

**🎉 Walkthrough**

Adds a process-wide rate limiter with per-category sliding-window RPMs and exponential backoff, billing-429 detection (fail-fast quota errors), startup banner and tier logging, synthetic embedding fast-paths, non-blocking init, streaming compatibility shims, and refactors model registration to use direct handler references.

**Changes**
```mermaid
sequenceDiagram
    participant Client as Client Request
    participant Handler as Model Handler
    participant Limiter as Rate Limiter
    participant API as OpenAI API
    participant Tier as Tier Logger
    Client->>Handler: invoke model handler
    Handler->>Limiter: withRateLimit / acquireRateLimit(category, fn)
    Limiter->>Limiter: check sliding window, backoff, RPM limits
    alt allowed
        Limiter->>Handler: execute fn
        Handler->>API: send HTTP request
        API-->>Handler: 200 OK + headers
        Handler->>Tier: logTierOnce(response)
        Tier-->>Tier: parse x-ratelimit-limit-* and log once
        Handler-->>Client: return result (or stream)
    else 429 / billing
        API-->>Handler: 429 response
        Handler->>Limiter: throwIfRateLimited -> classify (quota vs rate)
        alt quota (billing)
            Handler-->>Client: throw QuotaExceededError (fail-fast)
        else transient rate-limit
            Limiter->>Limiter: schedule retry (Retry-After / backoff)
            Limiter->>Handler: retry fn up to maxRetries
        end
    end
```
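The "check sliding window" step in the diagram can be sketched as a minimal per-category limiter. This is an illustrative stand-in, not the plugin's actual `rate-limiter.ts`; the class name, method names, and RPM defaults are assumptions:

```typescript
// Minimal sliding-window limiter sketch matching the flow above.
// Class/method names and RPM numbers are illustrative assumptions.
const WINDOW_MS = 60_000;

class SlidingWindowLimiter {
  private windows = new Map<string, number[]>();

  constructor(private rpm: Record<string, number>) {}

  /** Milliseconds the caller should wait before sending; 0 if under the limit. */
  waitTime(category: string, now = Date.now()): number {
    const limit = this.rpm[category] ?? 60;
    // Keep only timestamps inside the trailing one-minute window.
    const recent = (this.windows.get(category) ?? []).filter(
      (t) => t > now - WINDOW_MS,
    );
    this.windows.set(category, recent);
    if (recent.length < limit) return 0;
    // The oldest request leaves the window first; wait until it expires.
    return recent[0] + WINDOW_MS - now;
  }

  /** Record a request that was actually sent. */
  record(category: string, now = Date.now()): void {
    const recent = this.windows.get(category) ?? [];
    recent.push(now);
    this.windows.set(category, recent);
  }
}
```

A caller would check `waitTime`, sleep if positive, then `record` before issuing the request; the real implementation also layers exponential backoff and Retry-After handling on top.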
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/models/image.ts (1)
**269-271: ⚠️ Potential issue | 🟡 Minor — Consider future-proofing the model version check with a regex pattern.**

The current hardcoded checks for `gpt-5`, `o1`, and `o3` will miss future o-series models (e.g., `o4`, `o5`). Per OpenAI's API documentation, all o-series reasoning models and GPT-5 models require `max_completion_tokens` instead of the deprecated `max_tokens`. Update the check to use a pattern that matches all future versions:

```diff
-  const useMaxCompletionTokens = modelName.startsWith("gpt-5") ||
-    modelName.startsWith("o1") ||
-    modelName.startsWith("o3");
+  // Models requiring max_completion_tokens instead of max_tokens (o-series and gpt-5)
+  const useMaxCompletionTokens = /^(gpt-5|o[0-9]+)/.test(modelName);
```

src/index.ts (1)
**332-365: ⚠️ Potential issue | 🔴 Critical — Streaming test is broken: `stream: true` is missing, and the `onStreamChunk` callback is never invoked.**

`streamParams` at line 343 sets `prompt` and `onStreamChunk` but does not set `stream: true`. In `generateTextByModelType` (src/models/text.ts line 66), the check `if (params.stream)` gates the streaming path. Since `stream` is `undefined`, the non-streaming `generateText` path is taken, which never invokes `onStreamChunk`. Additionally, `onStreamChunk` is not forwarded to `generateParams` (lines 44–54), so it would not be passed to the underlying Vercel AI SDK functions even if the streaming path were entered.

The assertion at lines 356–358 (`chunks.length === 0` → throw) will always fail because the chunks array remains empty.

Proposed fix:

```diff
 const streamParams: StreamingTextParams = {
   prompt: "Count from 1 to 5.",
+  stream: true,
   onStreamChunk: (chunk: string) => {
     chunks.push(chunk);
   },
 };
```

With `stream: true` added, `result` will be a `TextStreamResult` (not a string), requiring the downstream code to consume `result.textStream` as an `AsyncIterable<string>` rather than using the `onStreamChunk` callback pattern. The current assertions casting to `string` and calling `.substring()` will fail.
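A minimal sketch of consuming a stream-shaped result as the comment describes. The `TextStreamResult` interface here is a local stand-in matching the described shape, not the actual `@elizaos/core` type:

```typescript
// Local stand-in for the stream result shape the review describes.
interface TextStreamResult {
  textStream: AsyncIterable<string>;
}

// Drains the stream and joins the chunks, failing if nothing was streamed,
// mirroring the test's intended "chunks must not be empty" assertion.
async function collectStream(result: TextStreamResult): Promise<string> {
  const chunks: string[] = [];
  for await (const chunk of result.textStream) {
    chunks.push(chunk);
  }
  if (chunks.length === 0) {
    throw new Error("Streaming produced no chunks");
  }
  return chunks.join("");
}
```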
🤖 Fix all issues with AI agents
In `@package.json`:
- Line 47: The package.json currently lists the dependency "@elizaos/core":
"workspace:*", which is a pnpm workspace-only spec and will break installs for
consumers—replace the "workspace:*" value with a concrete semver range (e.g.
"^1.7.2" or "^0.1.9") for the "@elizaos/core" dependency entry in package.json,
update the lockfile by reinstalling (npm install or bun install) and commit the
changed package.json and lockfile; ensure any CI/publish workflows reference the
updated version rather than a workspace protocol.
In `@src/models/embedding.ts`:
- Around line 153-164: The current check in the embedding validation
(variables/functions: embedding, embeddingDimension, logger.error,
syntheticEmbedding) logs a mismatch and returns a synthetic vector which
silently corrupts the embedding store; change this to fail fast by throwing a
clear, descriptive error (including actual length and expected
embeddingDimension) instead of returning synthetic data, or make this behavior
configurable (e.g., add an option/flag like throwOnDimensionMismatch to the
embedding creation/validation path) so callers can opt into strict validation;
update callers/tests to handle the thrown error or pass the new option as
appropriate.
- Around line 101-106: The current short-circuit "if (text === 'test')" in the
embedding creation path returns a synthetic embedding for any literal "test"
input; change this to only trigger for the probe sentinel by verifying the
original param is the special probe object shape (not just the string).
Concretely, update the condition around where text is extracted (the code that
handles string vs object params) to check that the incoming arg is an object and
contains the probe marker (e.g., a dedicated boolean/string sentinel property
used by ensureEmbeddingDimension), and only then call
syntheticEmbedding(embeddingDimension, 0.1); otherwise continue to call the real
embedding path. Reference symbols: syntheticEmbedding, embeddingDimension, and
ensureEmbeddingDimension probe.
🧹 Nitpick comments (7)
src/banner.ts (2)
**84-88: Long values will overflow the table row and break alignment.**

`padEnd` doesn't truncate — if `${value} ${status}` exceeds `COL2_WIDTH - 1` (41 chars), the row will be wider than the table borders. A custom base URL like `https://my-company-proxy.internal.example.com/v1` would break the layout. Consider truncating or ellipsizing the value to fit:

Proposed fix:

```diff
 function row(label: string, value: string, status: string): string {
-  const rightContent = `${value} ${status}`.padEnd(COL2_WIDTH - 1);
+  const maxValueLen = COL2_WIDTH - 1 - status.length - 1; // 1 for space before status
+  const truncatedValue = value.length > maxValueLen
+    ? value.slice(0, maxValueLen - 1) + "…"
+    : value;
+  const rightContent = `${truncatedValue} ${status}`.padEnd(COL2_WIDTH - 1);
   const leftContent = label.padEnd(COL1_WIDTH - 1);
   return `| ${leftContent}| ${rightContent}|`;
 }
```
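The key behavior behind this nitpick is that `String.prototype.padEnd` only ever lengthens a string. A quick standalone demonstration of a truncating fit helper (the `fit` name and width are illustrative, not from the plugin):

```typescript
// padEnd never shortens: "abc".padEnd(2) is still "abc".
// A fixed-width table cell therefore needs an explicit truncation step.
function fit(value: string, width: number): string {
  // Ellipsize when too long, pad when too short, so rows stay aligned.
  const out = value.length > width ? value.slice(0, width - 1) + "…" : value;
  return out.padEnd(width);
}
```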
**93-104: Consider using `logger` instead of `console.log` for consistency with the rest of the codebase.**

Direct `console.log` bypasses any structured logging, log-level filtering, or log routing the host application may have configured. If the logger adds unwanted prefixes that break table formatting, this is a reasonable trade-off — but worth a brief inline comment explaining why `console.log` is used intentionally.

src/models/embedding.ts (1)
**141-149: Type assertion on response JSON lacks full runtime validation.**

The `as { data: [...] }` cast at line 141 trusts the response shape. The check at line 146 only validates that `data.data[0].embedding` exists but doesn't verify it's actually a `number[]`. If the API returns an unexpected shape (e.g., embedding is a string), this would propagate garbage downstream silently.

This is acceptable risk given OpenAI's stable API, but worth noting.
src/models/image.ts (1)
**326-333: Logging full response headers in error context could leak sensitive data in some proxy configurations.**

`Object.fromEntries(response.headers.entries())` logs all response headers. While OpenAI response headers are typically safe, custom proxies or intermediaries might inject sensitive headers. Consider filtering to known-useful headers.

README.md (1)
**75-87: Add language specifiers to fenced code blocks flagged by markdownlint.**

Lines 75, 93, 118, and 269 have fenced code blocks without a language specifier (markdownlint MD040). Use `text` for plaintext output blocks:

Fix:

````diff
-```
+```text
 +----------------------------------------------------------------------+
-```
+```text
 [OpenAI] Account tier detected: 10000 RPM, 2000000 TPM
-```
+```text
 [OpenAI] Quota exceeded -- your API key has no remaining credits.
-```
+```text
 src/
````

Also applies to: 93-95, 118-121, 269-289
src/utils/rate-limiter.ts (2)
**154-188: Minor race condition in `acquire()` under concurrent calls for the same category.**

When multiple async calls to `acquire(category)` interleave (e.g., embeddings burst), each call creates its own `recent` array from `filter()` at line 170, checks capacity, pushes a timestamp, and writes back via `this.windows.set()`. The last writer wins, so earlier timestamps can be lost — allowing slightly more requests than the RPM limit.

For a client-side rate limiter this is acceptable (the real enforcement is server-side), but if you want tighter accuracy, consider mutating the stored array in-place rather than replacing it:

Optional tightening:

```diff
   const maxRPM = this.getRPM(category);
   const cutoff = Date.now() - WINDOW_MS;
-  const timestamps = this.windows.get(category) ?? [];
-  const recent = timestamps.filter((t) => t > cutoff);
+  let recent = this.windows.get(category);
+  if (!recent) {
+    recent = [];
+    this.windows.set(category, recent);
+  }
+  // Prune in-place
+  let writeIdx = 0;
+  for (let i = 0; i < recent.length; i++) {
+    if (recent[i] > cutoff) recent[writeIdx++] = recent[i];
+  }
+  recent.length = writeIdx;
   if (recent.length >= maxRPM) {
     const oldestInWindow = recent[0];
     const waitMs = oldestInWindow + WINDOW_MS - Date.now() + 50;
     if (waitMs > 0) {
       logger.warn(
         `[OpenAI:RateLimit] ${category}: ${recent.length}/${maxRPM} RPM, throttling ${waitMs}ms`,
       );
       await sleep(waitMs);
     }
   }
   recent.push(Date.now());
-  this.windows.set(category, recent);
```
**357-384: String-matching fallback in `extractRateLimitInfo` could produce false positives on unrelated errors containing "429".**

An error message like `"Server returned 4290 items"` would match `msg.includes("429")`. This is unlikely with OpenAI errors but could trip on custom proxy error messages. A word-boundary check would be more precise:

Optional tightening:

```diff
 if (
-  msg.includes("429") ||
+  /\b429\b/.test(msg) ||
   msg.includes("rate limit") ||
   msg.includes("too many requests") ||
   isBilling
 ) {
```
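A quick standalone check of the word-boundary suggestion, showing how `/\b429\b/` catches real 429 messages while rejecting the "4290" false positive that `includes("429")` would match:

```typescript
// Word-boundary matching avoids substring false positives like "4290"
// while still catching "HTTP 429" and "status 429:".
function mentions429(message: string): boolean {
  return /\b429\b/.test(message);
}
```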
Pull request overview
Adds a process-level OpenAI rate limiting utility and refactors model handlers/types/docs to improve resiliency (429 handling, tier logging, startup behavior) and compatibility across @elizaos/core versions.
Changes:
- Introduces a shared rate limiter with per-endpoint RPM windows, 429 retry/backoff, quota detection, and one-shot tier logging.
- Refactors text/object/image/audio/embedding handlers to use rate limiting and improved error handling (plus synthetic embedding fast-paths).
- Adds a startup configuration banner, expands local compatibility types, and updates README/CHANGELOG.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/utils/rate-limiter.ts | New rate limiter wrapper, 429/quota detection, per-category windows/backoff, tier logging |
| src/utils/index.ts | Re-exports the rate limiter utilities |
| src/types/index.ts | Adds shared types + local shims for streaming compatibility with older core versions |
| src/models/text.ts | Uses limiter for chat calls; adds streaming initiation throttling |
| src/models/object.ts | Wraps object generation in rate-limited retry wrapper |
| src/models/image.ts | Wraps image generation/description with rate limiting + 429 parsing + tier logging |
| src/models/embedding.ts | Adds synthetic embedding fast-paths and wraps API calls with rate limiting |
| src/models/audio.ts | Wraps TTS/transcription with rate limiting + 429 parsing + tier logging |
| src/init.ts | Removes eager API validation; prints banner and logs baseURL |
| src/index.ts | Simplifies model registration; adds streaming test and type shims usage |
| src/banner.ts | Adds startup configuration banner output |
| package.json | Switches @elizaos/core dependency to workspace:* |
| README.md | Documents banner, limiter behavior/config, compatibility notes, model table, architecture |
| CHANGELOG.md | Adds unreleased changelog describing new functionality |
```diff
   let text: string;
   if (typeof params === "string") {
     text = params;
   } else if (typeof params === "object" && params.text) {
     text = params.text;
   } else {
-    const errorMsg = "Invalid input format for embedding";
-    logger.warn(errorMsg);
-    const fallbackVector = Array(embeddingDimension).fill(0);
-    fallbackVector[0] = 0.2;
-    return fallbackVector;
+    logger.warn("Invalid input format for embedding");
+    return syntheticEmbedding(embeddingDimension, 0.2);
   }
```
`params.text` is checked for truthiness, so `{ text: "" }` is treated as "invalid input format" (marker 0.2) instead of the intended "empty text" path (marker 0.3). Check for the presence of the `text` property (e.g., `'text' in params`) and then handle empty-string via the existing `!text.trim()` branch.
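A sketch of the extraction this comment asks for, using a presence check so `{ text: "" }` reaches the empty-text branch. The function and type names are illustrative; the 0.2/0.3 marker semantics follow the review's description of the plugin's synthetic embeddings:

```typescript
// Illustrative input classification for the embedding handler.
type EmbedParams = string | { text?: string };

function classifyEmbeddingInput(
  params: EmbedParams,
): { kind: "ok"; text: string } | { kind: "empty" } | { kind: "invalid" } {
  let text: string;
  if (typeof params === "string") {
    text = params;
  } else if (typeof params === "object" && params !== null && "text" in params) {
    // Presence check, not truthiness: { text: "" } lands here, not in "invalid".
    text = params.text ?? "";
  } else {
    return { kind: "invalid" }; // would map to synthetic marker 0.2
  }
  if (!text.trim()) {
    return { kind: "empty" }; // would map to synthetic marker 0.3
  }
  return { kind: "ok", text };
}
```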
```diff
   const streamParams: StreamingTextParams = {
     prompt: "Count from 1 to 5.",
     onStreamChunk: (chunk: string) => {
       chunks.push(chunk);
     },
-  });
-  if (!result || result.length === 0) {
+  };
```
The streaming test sets onStreamChunk but never sets stream: true. In this PR’s text handler, streaming only activates when params.stream is truthy, so this test will run the non-streaming path and chunks will remain empty. Either set stream: true here (and adjust assertions/types to match the streaming return shape) or update the handler to treat onStreamChunk as a streaming request.
```diff
   "@ai-sdk/openai": "^2.0.32",
-  "@elizaos/core": "^1.7.0",
+  "@elizaos/core": "workspace:*",
   "ai": "^5.0.47",
```
Changing @elizaos/core dependency to "workspace:*" will break consumers installing this plugin from npm (the workspace protocol generally won’t resolve outside a monorepo). If the intent is to support both local workspace dev and published installs, consider restoring a semver range here and/or moving @elizaos/core to peerDependencies with an appropriate supported range.
Changes:
- rate-limiter.ts: parseFloat on `retry-after` can return NaN if header value is invalid, then m...
- package.json: _⚠️ Potential issue_ | _🔴 Critical_ <details> <summary>🧩 Analysis chain</s...
- embedding.ts: _⚠️ Potential issue_ | _🟠 Major_ <details> <summary>🧩 Analysis chain</summ...
- embedding.ts: _⚠️ Potential issue_ | _🟠 Major_ **Dimension mismatch returns a synthetic e...
- embedding.ts: `params.text` is checked for truthiness, so `{ text: "" }` is treated as "inv...
- index.ts: The streaming test sets `onStreamChunk` but never sets `stream: true`. In thi...
- README.md: README says the default `IMAGE_DESCRIPTION` model is `gpt-4o-mini`, but the c...
- package.json: Changing `@elizaos/core` dependency to `"workspace:*"` will break consumers i...
- rate-limiter.ts: `RateLimiter.acquire()` is not concurrency-safe: concurrent callers can read ...
- rate-limiter.ts: `OPENAI_RATE_LIMIT_MAX_RETRIES` set to `0` won't be respected because `parseI...
- banner.ts: Banner title and billing rows off-by-one width (Low Severity)
- rate-limiter.ts: Env var max retries zero treated as default (Low Severity)
Actionable comments posted: 1
🧹 Nitpick comments (5)
src/utils/rate-limiter.ts (2)
296-303: Environment variable re-read on every call.
`process.env.OPENAI_RATE_LIMIT_MAX_RETRIES` is parsed on every invocation of `withRateLimit`. This is a hot path for all API calls. Consider caching it at module scope (like `envRPM` on line 519) or lazily on first read, since it's unlikely to change at runtime.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 296 - 303, The code repeatedly reads and parses process.env.OPENAI_RATE_LIMIT_MAX_RETRIES inside withRateLimit (envMaxRetriesRaw/envMaxRetries); move that parsing out of the hot path by caching it at module scope (e.g., create a module-level constant like cachedMaxRetries or reuse envRPM pattern) or initialize it lazily on first access, then have withRateLimit reference the cached value instead of re-reading envMaxRetriesRaw each call; ensure the cached value uses the same validation (Number.isFinite and >= 0) and remains typed as number | undefined.
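A sketch of the lazy, cached parsing the prompt describes. The `getMaxRetries` helper and `cachedMaxRetries` names are illustrative (the prompt itself suggests similar names); the `>= 0` check is deliberate so an explicit `0` disables retries rather than falling back to the default:

```typescript
// Lazy, cached env parsing so the hot path doesn't re-read process.env.
// null means "not yet read"; undefined means "read, but unset or invalid".
let cachedMaxRetries: number | undefined | null = null;

function getMaxRetries(defaultRetries = 3): number {
  if (cachedMaxRetries === null) {
    const raw = process.env.OPENAI_RATE_LIMIT_MAX_RETRIES;
    const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
    // Validate with Number.isFinite and >= 0 (not truthiness),
    // so an explicit "0" is respected and disables retries.
    cachedMaxRetries = Number.isFinite(parsed) && parsed >= 0 ? parsed : undefined;
  }
  return cachedMaxRetries ?? defaultRetries;
}
```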
**372-406: `Retry-After` header: only seconds format is parsed, HTTP-date format is silently ignored.**

The `Retry-After` header can be either a number of seconds or an HTTP-date (RFC 7231 §7.1.3). `parseFloat` on an HTTP-date returns `NaN`, which is safely handled (falls back to `undefined`), so this isn't a bug. However, if OpenAI ever switches to date format, the optimal backoff timing would be lost. Documenting this assumption in a comment would be sufficient.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 372 - 406: The rate-limit handling in throwIfRateLimited currently only parses a numeric Retry-After seconds value (via parseFloat) and will silently ignore RFC-7231 HTTP-date formatted Retry-After headers; update the throwIfRateLimited function to document this assumption by adding a clear comment above the retry-after parsing that explains both allowed formats (seconds or HTTP-date) and that the current implementation only parses seconds, and optionally (preferred) add fallback parsing for HTTP-date by using Date.parse(response.headers.get("retry-after")) to compute retrySeconds when parseFloat yields NaN before converting to retryMs and passing it into RateLimitError.

src/banner.ts (2)
**75-78: Inconsistent indentation on multi-line assignments.**

Lines 75–78 and 95–98 have an extra leading space compared to adjacent lines (e.g., `const title` vs `const billingPrefix`). This appears to be an accidental whitespace inconsistency.

🧹 Proposed fix:

```diff
- const title =
-  titlePrefix +
-  " ".repeat(WIDTH - titlePrefix.length - 1) +
-  "|";
+ const title =
+   titlePrefix +
+   " ".repeat(WIDTH - titlePrefix.length - 1) +
+   "|";

- const billingPrefix = "| Billing: ";
- const billingRow = `${billingPrefix}${billingURL}${" ".repeat(
-  WIDTH - billingPrefix.length - billingURL.length - 1,
- )}|`;
+ const billingPrefix = "| Billing: ";
+ const billingRow = `${billingPrefix}${billingURL}${" ".repeat(
+   WIDTH - billingPrefix.length - billingURL.length - 1,
+ )}|`;
```

Also applies to: 95-98
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 75 - 78, The multi-line assignments for the constants title and billingPrefix have an extra leading space causing inconsistent indentation; fix both by aligning their continued lines with the declaration (remove the accidental leading space before the second and subsequent lines) so the concatenation lines for const title and const billingPrefix line up with their variable names and match other multi-line assignments that use WIDTH and titlePrefix/billingPrefix.
**100-111: Consider using `logger.info` instead of `console.log`.**

The rest of the codebase uses `logger` from `@elizaos/core` for all output. Using `console.log` directly bypasses any log-level filtering, structured logging, or output redirection that the framework may provide. If this is intentional (to ensure the banner is always visible), a brief comment explaining the choice would help.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 100 - 111: The banner currently uses console.log in src/banner.ts (calls to console.log around line, title, divider, row(...) and billingRow), which bypasses the project's logger; update the file to import and use logger.info from `@elizaos/core` for these outputs (replace each console.log(...) with logger.info(...)) so log-levels and sinks are respected, or if the intent is to always show the banner regardless of log-levels, add a short comment above the block explaining that intentional choice and why console.log is required; ensure references to the row(...) helper and variables (line, title, divider, billingRow) remain unchanged.

README.md (1)
75-87: Add language identifiers to fenced code blocks.

Several fenced code blocks in this file (lines 75, 93, 118, 269) lack a language specifier. For terminal/log output, use ` ```text ` or ` ```console ` to satisfy markdownlint MD040 and improve rendering.

Same applies to the blocks at lines 93, 118, and 269.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 75 - 87, The README contains several fenced code blocks that lack a language identifier (notably the ASCII table block starting with "+----------------------------------------------------------------------+" and other terminal/log output blocks); update each opening fence to include an appropriate language specifier such as ```text or ```console (for example change ``` to ```text) for the blocks referenced in the review so they satisfy markdownlint MD040 and render correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/models/embedding.ts`:
- Around line 101-107: Remove the accidental line-number decorations ("101 |",
"102 |", etc.) that were inserted into the if-block around the synthetic probe
check; locate the if statement that checks `if (text === "test" && typeof params
=== "object" && params !== null)` and the subsequent `logger.debug(...)` and
`return syntheticEmbedding(embeddingDimension, 0.1);` lines and delete the
numeric prefixes so the block is valid TypeScript again, ensuring
`logger.debug`, `syntheticEmbedding`, `embeddingDimension`, `text`, and `params`
remain unchanged otherwise.
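For reference, the probe fast-path that the prompt above reconstructs looks roughly like this once the stray numeric prefixes are removed. `probeFastPath` is a hypothetical wrapper and the body of `syntheticEmbedding` is an assumption; only the condition and the `0.1` fill value come from the review:

```typescript
// Constant-valued vector used as a stand-in for a real embedding.
function syntheticEmbedding(dimension: number, value: number): number[] {
  return new Array<number>(dimension).fill(value);
}

// Fast-path for the runtime's synthetic "test" probe: skip the API call
// and return a constant vector of the configured dimension.
function probeFastPath(
  text: unknown,
  params: unknown,
  embeddingDimension: number,
): number[] | null {
  if (text === "test" && typeof params === "object" && params !== null) {
    return syntheticEmbedding(embeddingDimension, 0.1);
  }
  return null; // not a probe: proceed with the real request
}
```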
---
Nitpick comments:
In `@README.md`:
- Around line 75-87: The README contains several fenced code blocks that lack a
language identifier (notably the ASCII table block starting with
"+----------------------------------------------------------------------+" and
other terminal/log output blocks); update each opening fence to include an
appropriate language specifier such as ```text or ```console (for example change
``` to ```text) for the blocks referenced in the review so they satisfy
markdownlint MD040 and render correctly.
In `@src/banner.ts`:
- Around line 75-78: The multi-line assignments for the constants title and
billingPrefix have an extra leading space causing inconsistent indentation; fix
both by aligning their continued lines with the declaration (remove the
accidental leading space before the second and subsequent lines) so the
concatenation lines for const title and const billingPrefix line up with their
variable names and match other multi-line assignments that use WIDTH and
titlePrefix/billingPrefix.
- Around line 100-111: The banner currently uses console.log in src/banner.ts
(calls to console.log around line, title, divider, row(...) and billingRow),
which bypasses the project's logger; update the file to import and use
logger.info from `@elizaos/core` for these outputs (replace each console.log(...)
with logger.info(...)) so log-levels and sinks are respected, or if the intent
is to always show the banner regardless of log-levels, add a short comment above
the block explaining that intentional choice and why console.log is required;
ensure references to the row(...) helper and variables (line, title, divider,
billingRow) remain unchanged.
In `@src/utils/rate-limiter.ts`:
- Around line 296-303: The code repeatedly reads and parses
process.env.OPENAI_RATE_LIMIT_MAX_RETRIES inside withRateLimit
(envMaxRetriesRaw/envMaxRetries); move that parsing out of the hot path by
caching it at module scope (e.g., create a module-level constant like
cachedMaxRetries or reuse envRPM pattern) or initialize it lazily on first
access, then have withRateLimit reference the cached value instead of re-reading
envMaxRetriesRaw each call; ensure the cached value uses the same validation
(Number.isFinite and >= 0) and remains typed as number | undefined.
- Around line 372-406: The rate-limit handling in throwIfRateLimited currently
only parses a numeric Retry-After seconds value (via parseFloat) and will
silently ignore RFC-7231 HTTP-date formatted Retry-After headers; update the
throwIfRateLimited function to document this assumption by adding a clear
comment above the retry-after parsing that explains both allowed formats
(seconds or HTTP-date) and that the current implementation only parses seconds,
and optionally (preferred) add fallback parsing for HTTP-date by using
Date.parse(response.headers.get("retry-after")) to compute retrySeconds when
parseFloat yields NaN before converting to retryMs and passing it into
RateLimitError.
Changes:
- banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** `<!-- DE...`
- rate-limiter.ts: ### Promise queue cleanup comparison is always false **Low Severity** `<!-- ...`

Changes:
- embedding.ts: `params.text` is checked for truthiness, so `{ text: "" }` is treated as “inv...
- banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** `<!-- DE...`
- embedding.ts: _⚠️ Potential issue_ | _🔴 Critical_ `<details> <summary>🧩 Analysis chain</s...`
🧹 Nitpick comments (5)
src/utils/rate-limiter.ts (5)
496-498: `logTierOnce` only triggers on status `200`, not the full 2xx range.

OpenAI's main endpoints (chat, embeddings) return 200, so this works in practice. If any future endpoint returns `201` or `204` with rate-limit headers, tier detection would silently skip it. A `response.ok` check would be more forward-compatible.

Proposed fix

```diff
- if (response.status !== 200) return; // Only log from successful responses
+ if (!response.ok) return; // Only log from successful responses
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 496 - 498, The logTierOnce function currently only considers responses with status === 200 and thus skips other successful 2xx responses; update the guard in logTierOnce to use response.ok (e.g., if (!response.ok) return) so any 2xx success will allow tier detection to run, leaving the tierLoggedOnce short-circuit unchanged to preserve single-time logging.
394-399: `Retry-After` header may contain an HTTP-date, not just seconds.

Per RFC 9110, `Retry-After` can be either a delay-seconds (`120`) or an HTTP-date (`Thu, 01 Dec 2025 16:00:00 GMT`). The current `parseFloat` correctly falls through to `undefined` for dates (NaN check on line 397), so this degrades gracefully — just noting that an HTTP-date value would discard a valid server hint. OpenAI currently uses numeric seconds, so this is low risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 394 - 399, The Retry-After header handling only parses numeric seconds (retryAfter -> retrySeconds -> retryMs) and ignores valid HTTP-date values; update the logic that reads response.headers.get("retry-after") so it first attempts parseFloat and if that is NaN, tries to parse an HTTP-date (e.g., via Date.parse(retryAfter)) and computes retryMs as the difference between the parsed date and Date.now() (clamped to >=0); keep existing behavior of using numeric seconds when present and ensure invalid parses result in undefined.
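The two-format parsing the prompt above asks for can be sketched as follows. `parseRetryAfterMs` is a hypothetical helper name, not a function in the plugin; only the fallback order (delay-seconds first, then HTTP-date) comes from the review:

```typescript
// Hypothetical helper: parse an RFC 9110 Retry-After header value into
// milliseconds. Accepts delay-seconds ("120") and HTTP-date
// ("Thu, 01 Dec 2025 16:00:00 GMT"); returns undefined when unparseable.
function parseRetryAfterMs(
  header: string | null,
  now: number = Date.now(),
): number | undefined {
  if (header === null) return undefined;
  const seconds = parseFloat(header);
  if (Number.isFinite(seconds) && seconds >= 0) {
    return seconds * 1000; // delay-seconds form
  }
  const dateMs = Date.parse(header); // HTTP-date form
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - now); // clamp past dates to zero
  }
  return undefined; // caller falls back to exponential backoff
}
```

A caller would feed `response.headers.get("retry-after")` in and fall back to its computed backoff whenever the result is `undefined`.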
514-526: Global RPM override only applies to categories known at module load time.
`Object.keys(DEFAULT_RPM)` iterates over the four predefined categories. If a caller later invokes `withRateLimit("some_new_category", ...)`, the override won't apply — `getRPM` falls back to `60`. This is fine for the current four OpenAI categories but is worth documenting if the set of categories may grow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 514 - 526, The global RPM override logic currently reads OPENAI_RATE_LIMIT_RPM into envRPM and only applies it to the prepopulated keys from DEFAULT_RPM via limiter.setRPM, so any later-created categories (used by withRateLimit or looked up via getRPM) won't inherit the override; change the implementation so the override is applied centrally — either have limiter.getRPM consult the parsed envRPM if no explicit RPM is set for a category, or ensure limiter.setRPM is invoked for newly-created categories by storing the parsed rpm globally and applying it when withRateLimit/createCategory is called; reference envRPM, DEFAULT_RPM, limiter.setRPM, getRPM and withRateLimit to locate where to add this fallback behavior.
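One way to apply the override centrally, as the prompt suggests, is to consult the parsed env value inside the RPM lookup itself. This is a sketch: the `DEFAULT_RPM` numbers are illustrative, and the `getRPM`/`parseEnvRPM` signatures are assumptions rather than the plugin's actual API.

```typescript
// Illustrative defaults; the plugin's real values may differ.
const DEFAULT_RPM: Record<string, number> = {
  text: 500,
  embedding: 3000,
  image: 50,
  audio: 100,
};

// Parse OPENAI_RATE_LIMIT_RPM once (positive integer, else undefined).
function parseEnvRPM(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const parsed = parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : undefined;
}

// Lookup order: explicit per-category setting, then the global env
// override, then the predefined default, then a conservative 60 RPM.
function getRPM(
  category: string,
  overrides: Map<string, number>,
  envRPM: number | undefined = parseEnvRPM(process.env.OPENAI_RATE_LIMIT_RPM),
): number {
  return overrides.get(category) ?? envRPM ?? DEFAULT_RPM[category] ?? 60;
}
```

With the override consulted at lookup time, categories created after startup inherit it automatically.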
294-301: Use `parseInt` for `OPENAI_RATE_LIMIT_MAX_RETRIES` to avoid fractional retry counts.

`Number(envMaxRetriesRaw)` parses `"3.5"` as `3.5`, which flows into the loop comparison `attempt <= 3.5` and yields 4 iterations — a surprising result for a "max retries" setting. Using `parseInt` (consistent with the RPM parsing on line 519) makes the intent explicit.

Proposed fix

```diff
-    const parsed = Number(envMaxRetriesRaw);
-    if (Number.isFinite(parsed) && parsed >= 0) {
+    const parsed = parseInt(envMaxRetriesRaw, 10);
+    if (!isNaN(parsed) && parsed >= 0) {
       envMaxRetries = parsed;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 294 - 301, The envMaxRetries parsing currently uses Number(envMaxRetriesRaw) which allows fractional values (e.g., "3.5") to slip through; update the logic that sets envMaxRetries from OPENAI_RATE_LIMIT_MAX_RETRIES to use parseInt(envMaxRetriesRaw, 10) instead, validate the result is a finite integer >= 0, and assign that integer to envMaxRetries (mirroring how RPM is parsed elsewhere); keep the same variable names envMaxRetriesRaw and envMaxRetries and ensure non-numeric or negative values leave envMaxRetries undefined.
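The difference is easy to demonstrate; `parseMaxRetries` here is a hypothetical wrapper around the validation the prompt describes, not the plugin's own function:

```typescript
// Number("3.5") yields 3.5, which would drive `attempt <= 3.5` through
// four iterations; parseInt truncates to an integer up front.
function parseMaxRetries(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const parsed = parseInt(raw, 10);
  // Reject NaN and negatives; invalid input leaves the setting unset.
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : undefined;
}
```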
433-448: String-matching heuristic for billing detection in SDK-wrapped errors has a minor false-positive surface.

Any `Error` whose message coincidentally contains `"quota"` or `"billing"` (but is not actually a 429/billing error) would be classified as `isBillingError: true`, causing `withRateLimit` to skip retries and re-throw immediately. Since this only runs on errors thrown inside `fn()` (i.e., during OpenAI API calls), the practical risk is very low, but worth keeping in mind if the wrapper is ever used around non-OpenAI calls.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 433 - 448, The current error-string heuristic marks any Error whose message contains "quota" or "billing" as isBillingError, causing withRateLimit to skip retries on benign errors; change the detection so isBillingError is true only when billing keywords appear together with an explicit rate-limit indicator or OpenAI-specific error metadata: keep the existing rate-limit checks (msg.includes("429") / "rate limit" / "too many requests") and only set isBillingError when isBilling && (msg contains a rate-limit token OR error has OpenAI-style markers such as error.code === "insufficient_quota" or error.response?.status === 429 or error.type === "insufficient_quota"); update the block around the existing error instanceof Error handling that sets isRateLimit and isBillingError used by withRateLimit accordingly.
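A tightened classifier along the lines the prompt sketches could look like this. The `code`/`type`/`response.status` fields mirror the OpenAI-style error shape named in the review; treat the exact field names as assumptions:

```typescript
interface SdkLikeError extends Error {
  code?: string;
  type?: string;
  response?: { status?: number };
}

// Billing keywords alone are not enough: require either an explicit
// rate-limit token in the message or an OpenAI-style error marker.
function classifyBillingError(error: unknown): boolean {
  if (!(error instanceof Error)) return false;
  const e = error as SdkLikeError;
  const msg = e.message.toLowerCase();
  const hasBillingKeyword = msg.includes("quota") || msg.includes("billing");
  const hasRateLimitToken =
    msg.includes("429") ||
    msg.includes("rate limit") ||
    msg.includes("too many requests");
  const hasOpenAIMarker =
    e.code === "insufficient_quota" ||
    e.type === "insufficient_quota" ||
    e.response?.status === 429;
  return hasBillingKeyword && (hasRateLimitToken || hasOpenAIMarker);
}
```

An error like "billing address mismatch" no longer short-circuits retries, while a genuine `insufficient_quota` error still fails fast.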
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/utils/rate-limiter.ts`:
- Around line 496-498: The logTierOnce function currently only considers
responses with status === 200 and thus skips other successful 2xx responses;
update the guard in logTierOnce to use response.ok (e.g., if (!response.ok)
return) so any 2xx success will allow tier detection to run, leaving the
tierLoggedOnce short-circuit unchanged to preserve single-time logging.
- Around line 394-399: The Retry-After header handling only parses numeric
seconds (retryAfter -> retrySeconds -> retryMs) and ignores valid HTTP-date
values; update the logic that reads response.headers.get("retry-after") so it
first attempts parseFloat and if that is NaN, tries to parse an HTTP-date (e.g.,
via Date.parse(retryAfter)) and computes retryMs as the difference between the
parsed date and Date.now() (clamped to >=0); keep existing behavior of using
numeric seconds when present and ensure invalid parses result in undefined.
- Around line 514-526: The global RPM override logic currently reads
OPENAI_RATE_LIMIT_RPM into envRPM and only applies it to the prepopulated keys
from DEFAULT_RPM via limiter.setRPM, so any later-created categories (used by
withRateLimit or looked up via getRPM) won't inherit the override; change the
implementation so the override is applied centrally — either have limiter.getRPM
consult the parsed envRPM if no explicit RPM is set for a category, or ensure
limiter.setRPM is invoked for newly-created categories by storing the parsed rpm
globally and applying it when withRateLimit/createCategory is called; reference
envRPM, DEFAULT_RPM, limiter.setRPM, getRPM and withRateLimit to locate where to
add this fallback behavior.
- Around line 294-301: The envMaxRetries parsing currently uses
Number(envMaxRetriesRaw) which allows fractional values (e.g., "3.5") to slip
through; update the logic that sets envMaxRetries from
OPENAI_RATE_LIMIT_MAX_RETRIES to use parseInt(envMaxRetriesRaw, 10) instead,
validate the result is a finite integer >= 0, and assign that integer to
envMaxRetries (mirroring how RPM is parsed elsewhere); keep the same variable
names envMaxRetriesRaw and envMaxRetries and ensure non-numeric or negative
values leave envMaxRetries undefined.
- Around line 433-448: The current error-string heuristic marks any Error whose
message contains "quota" or "billing" as isBillingError, causing withRateLimit
to skip retries on benign errors; change the detection so isBillingError is true
only when billing keywords appear together with an explicit rate-limit indicator
or OpenAI-specific error metadata: keep the existing rate-limit checks
(msg.includes("429") / "rate limit" / "too many requests") and only set
isBillingError when isBilling && (msg contains a rate-limit token OR error has
OpenAI-style markers such as error.code === "insufficient_quota" or
error.response?.status === 429 or error.type === "insufficient_quota"); update
the block around the existing error instanceof Error handling that sets
isRateLimit and isBillingError used by withRateLimit accordingly.
Actionable comments posted: 2
🧹 Nitpick comments (8)
src/banner.ts (3)
94-98: Extra leading space on line 95.

`billingPrefix` is indented with 3 spaces while the surrounding `billingURL` declaration uses 2. Inconsistent with the rest of the function body.

🧹 Proposed fix

```diff
  const billingURL =
    "https://platform.openai.com/settings/organization/billing";
-   const billingPrefix = "| Billing: ";
+  const billingPrefix = "| Billing: ";
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 94 - 98, The indentation of billingPrefix is inconsistent (it has one extra leading space) compared to billingURL and the surrounding code; fix by aligning billingPrefix's declaration indentation with billingURL and billingRow so all three consts (billingURL, billingPrefix, billingRow) share the same left margin, leaving the expressions unchanged (billingPrefix = "| Billing: ", billingRow using WIDTH) to restore consistent formatting.
57-59: Redundant `?? "text-embedding-3-small"` fallback.

`getSetting(runtime, "OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")` already returns the default string when neither the setting nor the env var is set, so the trailing `?? "text-embedding-3-small"` is unreachable dead code.

🧹 Proposed simplification

```diff
-  const embeddingModel =
-    getSetting(runtime, "OPENAI_EMBEDDING_MODEL", "text-embedding-3-small") ??
-    "text-embedding-3-small";
+  const embeddingModel =
+    getSetting(runtime, "OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")!;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 57 - 59, Remove the unreachable fallback after getSetting: the expression assigning embeddingModel uses getSetting(runtime, "OPENAI_EMBEDDING_MODEL", "text-embedding-3-small") which already returns the default, so drop the trailing ?? "text-embedding-3-small"; update the embeddingModel assignment (around the embeddingModel symbol) to use only the getSetting call and ensure no other code relies on the redundant nullish coalescing.
100-111: `console.log` bypasses the shared `logger` used elsewhere in the plugin.

The rest of the codebase (e.g., `src/utils/config.ts`) routes output through the `logger` from `@elizaos/core`. Using `console.log` here means the banner output ignores any log-level filtering or transport configured on that logger. If guaranteed visibility at startup is the intent, `logger.info` achieves the same for typical configurations while staying consistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 100 - 111, Replace the direct console.log calls in the banner rendering with the shared logger so output respects configured log levels/transports: import the existing logger from '@elizaos/core' if not already present, then change each console.log(...) call (the ones printing line, title, divider, row(...), billingRow, and the trailing blank line) to logger.info(...) so the banner (variables/functions: line, title, divider, row, billingRow) is emitted through the centralized logger while preserving the same string formatting and order.

src/models/embedding.ts (2)
165-165: `logger.log` on every successful embedding will be noisy — consider `logger.debug`.

Embeddings are called frequently during message processing. `logger.log` (typically `info`-level) will clutter output. Other success paths in this file use `logger.debug`.

```diff
- logger.log(`Got valid embedding with length ${embedding.length}`);
+ logger.debug(`Got valid embedding with length ${embedding.length}`);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/models/embedding.ts` at line 165, Replace the current success-level logging call logger.log(`Got valid embedding with length ${embedding.length}`) with a debug-level call (logger.debug) so embedding success messages don't clutter info logs; locate the usage in the function that validates/returns embeddings (the line using embedding.length) and make it consistent with other success paths that use logger.debug.
91-95: Placeholder issue link in comment.

Line 95 references `issues/XXX` — either link the real issue or drop the URL to avoid confusion for future readers.

```diff
- // See: https://github.com/elizaos-plugins/plugin-openai/issues/XXX
+ // See discussion in PR `#24` for rationale.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/models/embedding.ts` around lines 91 - 95, The inline comment in embedding.ts references a placeholder issue URL "issues/XXX"; update the comment near the ensureEmbeddingDimension() probe and the block discussing short-circuiting (the comment that starts "NOTE: We do NOT short-circuit...") by either replacing "issues/XXX" with the correct issue/PR number or removing the URL entirely so it doesn't confuse future readers—ensure the rest of the explanatory text remains intact and that the comment still mentions that the probe sends null and not a TextEmbeddingParams object to keep context for ensureEmbeddingDimension().

src/utils/rate-limiter.ts (3)
513-524: Env RPM override only applies to predefined categories — intentional but worth noting.
The `OPENAI_RATE_LIMIT_RPM` handling iterates `Object.keys(DEFAULT_RPM)`, so any future category added via `withRateLimit("new_cat", ...)` without updating `DEFAULT_RPM` would fall back to 60 RPM and miss the override. Current usage is consistent, just something to keep in mind when adding new categories.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 513 - 524, The env override for OPENAI_RATE_LIMIT_RPM only applies to keys from DEFAULT_RPM, so categories added later via withRateLimit(...) will miss the override; change the logic to apply the RPM override to the limiter's actual category registry instead of Object.keys(DEFAULT_RPM) (or also apply the override inside withRateLimit when new categories are registered). Locate the envRPM handling around limiter.setRPM and DEFAULT_RPM and ensure limiter.setRPM(cat, rpm) is invoked for every registered category (or that withRateLimit calls limiter.setRPM when OPENAI_RATE_LIMIT_RPM is set) so new categories receive the global override.
139-145: Remove unused `currentPromises` map — dead state that adds confusion without value.

The map is set at line 191 and deleted at line 247, but never read. The cleanup logic at line 245 compares against `acquireQueues`, not `currentPromises`. The comment claiming it tracks "current promise for cleanup comparison" is misleading. Removing this map eliminates dead state and clarifies the serialization logic.

Cleanup

Remove the declaration:

```diff
-  /**
-   * Track the current promise for cleanup comparison.
-   * Maps category to the "current" promise (before chaining).
-   */
-  private currentPromises = new Map<string, Promise<void>>();
```

Remove the set operation at line 191:

```diff
   this.acquireQueues.set(category, queued);
-  this.currentPromises.set(category, queued);
```

Remove the delete operation at line 247:

```diff
   if (this.acquireQueues.get(category) === queued) {
     this.acquireQueues.delete(category);
-    this.currentPromises.delete(category);
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 139 - 145, Remove the dead Map currentPromises and its related operations: delete the private declaration currentPromises = new Map<string, Promise<void>>(), remove any assignments to currentPromises.set(...) (the set at where the queue promise is stored), and remove the corresponding currentPromises.delete(...) cleanup; also update or remove the misleading comment that describes tracking "current promise for cleanup comparison" so the code only uses acquireQueues for serialization/cleanup and has no stale references to currentPromises. Ensure no other code reads currentPromises before committing.
306-338: Add a defensive throw after the `for` loop for type safety.

The loop always exits via `return` or `throw`, making the post-loop code unreachable. However, with `noImplicitReturns` enabled in tsconfig.json (note it is a separate flag, not implied by `"strict": true`), adding an explicit throw improves type clarity and guards against future refactors that could break the control-flow invariant.

Suggested addition

```diff
       throw error;
     }
   }
+
+  // Unreachable: every iteration returns or throws. This satisfies the
+  // compiler and guards against future refactors that break the invariant.
+  throw new Error(`[OpenAI:RateLimit] ${category}: max retries exhausted`);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 306 - 338, The for-loop that calls limiter.acquire(category) and returns await fn() can theoretically fall through and TypeScript's noImplicitReturns wants an explicit exit; after the loop that handles retries, add a defensive throw (e.g., throw new Error("Unreachable: retry loop exited without returning")) to make the control flow explicit and satisfy strict type checking; place this throw immediately after the closing brace of the for loop that contains limiter.acquire(category), extractRateLimitInfo(error), calculateBackoff(attempt), limiter.recordBackoff(category, backoffMs), and sleep(backoffMs).
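The invariant described above (every iteration returns or throws, plus a defensive throw after the loop) can be sketched in isolation. `calculateBackoff` and `sleep` here are simplified stand-ins with illustrative constants, not the plugin's implementations:

```typescript
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Exponential backoff capped at 10s; base and cap are illustrative.
const calculateBackoff = (attempt: number): number =>
  Math.min(2 ** attempt * 250, 10_000);

async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries: number,
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn(); // success exits the loop
    } catch (error) {
      if (attempt === maxRetries) throw error; // retries exhausted
      await sleep(calculateBackoff(attempt));
    }
  }
  // Unreachable: every iteration returns or throws. Satisfies
  // noImplicitReturns and guards against future refactors.
  throw new Error("max retries exhausted");
}
```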
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/banner.ts`:
- Around line 16-17: maskApiKey and keyStatus use different thresholds causing
contradictory output; update keyStatus to derive its value from maskApiKey so
they stay consistent (e.g., call maskApiKey(key) and treat "(not set)" as
falsey) or apply the same minimum-length check (10 chars) used in maskApiKey.
Specifically, modify the keyStatus computation to use the result of
maskApiKey(key) (or reuse the same length check) instead of checking only
truthiness so both maskApiKey and keyStatus produce matching state for keys
shorter than 10 characters.
- Around line 88-92: The row function uses padEnd which doesn't truncate long
strings, causing overflow when value + status exceed COL2_WIDTH; in function
row, truncate the combined rightContent (or truncate value to reserve space for
status) to at most COL2_WIDTH - 1 characters before calling padEnd, e.g.,
compute a maxLen = COL2_WIDTH - 1 and if `${value} ${status}` length > maxLen
slice it to maxLen (or slice value to maxLen - status.length - 1) and then
padEnd so the column never overflows; apply the same truncation logic to
leftContent using COL1_WIDTH - 1 if needed.
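The two banner fixes in this list can be sketched together. The column width, mask format, and status strings below are illustrative assumptions, not the plugin's actual values:

```typescript
const COL2_WIDTH = 46; // assumed column width

// Truncate before padEnd so long values can never overflow the column.
function clampAndPad(value: string, status: string): string {
  const maxLen = COL2_WIDTH - 1;
  const content = `${value} ${status}`;
  return (content.length > maxLen ? content.slice(0, maxLen) : content).padEnd(maxLen);
}

// Single 10-char threshold shared by masking and status reporting.
function maskApiKey(key: string | undefined): string {
  if (!key || key.length < 10) return "(not set)";
  return `${key.slice(0, 3)}...${key.slice(-4)}`;
}

function keyStatus(key: string | undefined): string {
  // Derive status from maskApiKey so the two can never disagree.
  return maskApiKey(key) === "(not set)" ? "MISSING" : "OK";
}
```

Deriving `keyStatus` from `maskApiKey` removes the inconsistent-threshold bug by construction, and the clamp guarantees `padEnd` always produces a fixed-width cell.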
---
Nitpick comments:
In `@src/banner.ts`:
- Around line 94-98: The indentation of billingPrefix is inconsistent (it has
one extra leading space) compared to billingURL and the surrounding code; fix by
aligning billingPrefix's declaration indentation with billingURL and billingRow
so all three consts (billingURL, billingPrefix, billingRow) share the same left
margin, leaving the expressions unchanged (billingPrefix = "| Billing: ",
billingRow using WIDTH) to restore consistent formatting.
- Around line 57-59: Remove the unreachable fallback after getSetting: the
expression assigning embeddingModel uses getSetting(runtime,
"OPENAI_EMBEDDING_MODEL", "text-embedding-3-small") which already returns the
default, so drop the trailing ?? "text-embedding-3-small"; update the
embeddingModel assignment (around the embeddingModel symbol) to use only the
getSetting call and ensure no other code relies on the redundant nullish
coalescing.
- Around line 100-111: Replace the direct console.log calls in the banner
rendering with the shared logger so output respects configured log
levels/transports: import the existing logger from '@elizaos/core' if not
already present, then change each console.log(...) call (the ones printing line,
title, divider, row(...), billingRow, and the trailing blank line) to
logger.info(...) so the banner (variables/functions: line, title, divider, row,
billingRow) is emitted through the centralized logger while preserving the same
string formatting and order.
In `@src/models/embedding.ts`:
- Line 165: Replace the current success-level logging call logger.log(`Got valid
embedding with length ${embedding.length}`) with a debug-level call
(logger.debug) so embedding success messages don't clutter info logs; locate the
usage in the function that validates/returns embeddings (the line using
embedding.length) and make it consistent with other success paths that use
logger.debug.
- Around line 91-95: The inline comment in embedding.ts references a placeholder
issue URL "issues/XXX"; update the comment near the ensureEmbeddingDimension()
probe and the block discussing short-circuiting (the comment that starts "NOTE:
We do NOT short-circuit...") by either replacing "issues/XXX" with the correct
issue/PR number or removing the URL entirely so it doesn't confuse future
readers—ensure the rest of the explanatory text remains intact and that the
comment still mentions that the probe sends null and not a TextEmbeddingParams
object to keep context for ensureEmbeddingDimension().
In `@src/utils/rate-limiter.ts`:
- Around line 513-524: The env override for OPENAI_RATE_LIMIT_RPM only applies
to keys from DEFAULT_RPM, so categories added later via withRateLimit(...) will
miss the override; change the logic to apply the RPM override to the limiter's
actual category registry instead of Object.keys(DEFAULT_RPM) (or also apply the
override inside withRateLimit when new categories are registered). Locate the
envRPM handling around limiter.setRPM and DEFAULT_RPM and ensure
limiter.setRPM(cat, rpm) is invoked for every registered category (or that
withRateLimit calls limiter.setRPM when OPENAI_RATE_LIMIT_RPM is set) so new
categories receive the global override.
- Around line 139-145: Remove the dead Map currentPromises and its related
operations: delete the private declaration currentPromises = new Map<string,
Promise<void>>(), remove any assignments to currentPromises.set(...) (the set at
where the queue promise is stored), and remove the corresponding
currentPromises.delete(...) cleanup; also update or remove the misleading
comment that describes tracking "current promise for cleanup comparison" so the
code only uses acquireQueues for serialization/cleanup and has no stale
references to currentPromises. Ensure no other code reads currentPromises before
committing.
- Around line 306-338: The for-loop that calls limiter.acquire(category) and
returns await fn() can theoretically fall through and TypeScript's
noImplicitReturns wants an explicit exit; after the loop that handles retries,
add a defensive throw (e.g., throw new Error("Unreachable: retry loop exited
without returning")) to make the control flow explicit and satisfy strict type
checking; place this throw immediately after the closing brace of the for loop
that contains limiter.acquire(category), extractRateLimitInfo(error),
calculateBackoff(attempt), limiter.recordBackoff(category, backoffMs), and
sleep(backoffMs).
Iteration 1 prr-fix:prrc_kwdon0mv_86ov2j0 prr-fix:prrc_kwdon0mv_86ov2jy
Changes: - banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** <!-- DE... - rate-limiter.ts: ### Promise queue cleanup comparison is always false **Low Severity** <!-- ... - rate-limiter.ts: ### Redundant sleep before acquire causes double-wait on retry **Low Severit... - rate-limiter.ts: ### Unreachable throw after for-loop in withRateLimit **Low Severity** <!--... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`maskApiKey` and `keyStatus` use inconsi... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`row` overflows for long values — `padEn...
Changes: - banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** <!-- DE... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`maskApiKey` and `keyStatus` use inconsi... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`row` overflows for long values — `padEn...
Changes: - banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** <!-- DE... - rate-limiter.ts: ### Promise queue cleanup comparison is always false **Low Severity** <!-- ... - rate-limiter.ts: ### Redundant sleep before acquire causes double-wait on retry **Low Severit... - rate-limiter.ts: ### Unreachable throw after for-loop in withRateLimit **Low Severity** <!--... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`maskApiKey` and `keyStatus` use inconsi... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`row` overflows for long values — `padEn...
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
- Fix for src/utils/rate-limiter.ts: the code at lines 296-306 already has a proper fix that checks for NaN and handles zero values correctly.
- Fix for src/utils/rate-limiter.ts:248: the tool modified the wrong files (src/banner.ts, src/models/embedding.ts); it needed to modify src/utils/rate-limiter.ts.
- Fix for src/utils/rate-limiter.ts:248: the cleanup condition must account for map updates during concurrent calls; identity checks fail when other acquire() calls mutate the map before cleanup runs.
- Fix for src/utils/rate-limiter.ts:248: the cleanup condition compares a stale promise reference against a freshly created derived promise, so the two are guaranteed to be different objects on every subsequent call.
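The last two rows describe the same root cause. A minimal sketch (all names hypothetical, not the plugin's actual code) of why comparing against a freshly derived promise can never match, and the store-and-compare fix:

```typescript
// Per-key promise queues, as a serialization primitive.
const queues = new Map<string, Promise<void>>();

function acquireBuggy(key: string): Promise<void> {
  const prev = queues.get(key) ?? Promise.resolve();
  const next = prev.then(() => {});
  queues.set(key, next);
  next.finally(() => {
    // BUG: `prev.then(...)` builds a brand-new promise object here, so this
    // identity check is always false and the map entry is never cleaned up.
    if (queues.get(key) === prev.then(() => {})) queues.delete(key);
  });
  return next;
}

function acquireFixed(key: string): Promise<void> {
  const prev = queues.get(key) ?? Promise.resolve();
  const next = prev.then(() => {});
  queues.set(key, next);
  next.finally(() => {
    // FIX: compare against the same `next` reference that was stored; if no
    // newer acquire() has replaced it, this entry is safe to drop.
    if (queues.get(key) === next) queues.delete(key);
  });
  return next;
}
```

Note the fixed variant still honors the concern in the row above: if a concurrent `acquire()` replaced the stored promise, the identity check fails and the newer entry is left alone.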
Auto-generated PR review tool state committed to repo
Low Severity
The .prr/lessons.md file contains auto-generated debugging output from the prr review tool — fix attempt logs, RESULT: ALREADY_FIXED markers, and stale line-number references. While the tool suggests committing it, the content is process-specific noise (e.g., "tool modified wrong files", references to now-outdated line numbers) that provides no value to repository readers and clutters the codebase.


… types
Add rate-limiter utility with per-endpoint backoff and retry logic. Improve embedding handler with better error handling and fallback support. Refactor init, audio, image, object, and text model handlers for @elizaos/core API compatibility. Expand types and add banner. Update README with comprehensive documentation.
Note
Medium Risk
Touches all OpenAI request paths and introduces coordinated retry/backoff behavior, so any bug could impact latency or error handling under load; changes are well-scoped but affect core runtime interactions (streaming, init, request execution).
Overview
Adds a module-level OpenAI rate limiter (`src/utils/rate-limiter.ts`) with per-category sliding-window RPM throttling, coordinated backoff/retry on 429s, and fast-fail quota/billing detection (distinguishing `QuotaExceededError` from transient `RateLimitError`), plus one-shot tier logging from rate-limit headers.

Applies rate limiting across model handlers: raw `fetch` paths now `await throwIfRateLimited()` and are wrapped in `withRateLimit()` (embeddings/images/audio), Vercel AI SDK calls are wrapped for non-streaming (`text.ts`, `object.ts`), and streaming text is throttled via `acquireRateLimit()` only.

Simplifies plugin setup by removing the eager `GET /models` validation call in `init.ts` (replaced with sync config checks plus a new startup config banner in `src/banner.ts`), DRYs the `models` registration in `src/index.ts` with a type assertion for older core typings, and adds local streaming type shims (`StreamingTextParams`, `TextStreamResult`) for `@elizaos/core` compatibility. Documentation is expanded (new `CHANGELOG.md`, major README updates), defaults are updated (`OPENAI_IMAGE_DESCRIPTION_MODEL`), and `@elizaos/core` is moved to a peer dependency.

Written by Cursor Bugbot for commit 5e06d42. This will update automatically on new commits. Configure here.
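The per-category sliding-window RPM throttling described above can be sketched as follows. Category names come from the PR; the limits, map layout, and `tryAcquire` helper are illustrative assumptions, not the plugin's actual code:

```typescript
type Category = "chat" | "embeddings" | "images" | "audio";

const WINDOW_MS = 60_000; // one-minute sliding window

// Illustrative RPM caps; the real plugin derives these from config/tier.
const rpmLimits: Record<Category, number> = {
  chat: 500,
  embeddings: 3000,
  images: 50,
  audio: 50,
};

// Per-category timestamps of requests admitted within the last window.
const windows = new Map<Category, number[]>();

function tryAcquire(category: Category, now = Date.now()): boolean {
  const stamps = windows.get(category) ?? [];
  // Drop timestamps that have slid out of the 60-second window.
  const live = stamps.filter((t) => now - t < WINDOW_MS);
  if (live.length >= rpmLimits[category]) {
    windows.set(category, live);
    return false; // caller waits until the oldest stamp expires
  }
  live.push(now);
  windows.set(category, live);
  return true;
}
```

A `withRateLimit()`-style wrapper would loop on `tryAcquire`, sleeping until the oldest timestamp leaves the window before retrying.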
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Documentation
Greptile Overview
Greptile Summary
This PR adds a comprehensive rate limiting system to prevent 429 errors from OpenAI's API and improves model handlers with better error handling and caching.
Major Changes:
- `@elizaos/core` versions

Technical Highlights:
Confidence Score: 4/5
- `retry-after` header parsing that could pass NaN to backoff logic. All other code is well-structured with proper error handling, caching strategies, and type safety.
- `src/utils/rate-limiter.ts:338`: the `retry-after` header parsing needs the parentheses fix to handle invalid header values correctly.

Important Files Changed
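A hedged sketch of the flagged fix, assuming the intent is to reject NaN and negative values before trusting the server-provided delay; the function name, cap, and fallback schedule are illustrative, not the code at line 338:

```typescript
function retryDelayMs(retryAfterHeader: string | null, attempt: number): number {
  if (retryAfterHeader !== null) {
    const seconds = Number.parseFloat(retryAfterHeader);
    // Guard: parseFloat returns NaN for junk like "later"; also reject
    // negative values so a bad header never reaches the sleep calculation.
    if (Number.isFinite(seconds) && seconds >= 0) {
      return Math.round(seconds * 1000);
    }
  }
  // Fallback: exponential backoff, capped at 30 seconds.
  return Math.min(1000 * 2 ** attempt, 30_000);
}
```

With this shape, a malformed `retry-after` value silently falls through to the exponential schedule instead of producing a NaN sleep duration.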
Flowchart
flowchart TD
    Start[API Call Request] --> CheckCategory{Determine Category<br/>chat/embeddings/images/audio}
    CheckCategory --> AcquireSlot[Acquire Rate Limit Slot]
    AcquireSlot --> CheckBackoff{Active Backoff?}
    CheckBackoff -->|Yes| WaitBackoff[Wait for Backoff]
    CheckBackoff -->|No| CheckWindow{Within RPM Limit?}
    WaitBackoff --> CheckWindow
    CheckWindow -->|No| WaitWindow[Wait for Window Slot]
    CheckWindow -->|Yes| RecordRequest[Record Request Timestamp]
    WaitWindow --> RecordRequest
    RecordRequest --> MakeCall[Execute API Call]
    MakeCall --> CheckResponse{Response Status}
    CheckResponse -->|200 OK| Success[Return Result]
    CheckResponse -->|429 Quota| QuotaError[Throw QuotaExceededError<br/>No Retry]
    CheckResponse -->|429 Rate Limit| RateError{Retries Left?}
    CheckResponse -->|Other Error| OtherError[Throw Error]
    RateError -->|Yes| RecordBackoff[Record Backoff Deadline<br/>Exponential + Jitter]
    RateError -->|No| MaxRetries[Throw Max Retries Error]
    RecordBackoff --> Sleep[Sleep with Backoff]
    Sleep --> AcquireSlot
    Success --> End[Complete]
    QuotaError --> End
    OtherError --> End
    MaxRetries --> End

Last reviewed commit: 7542c67
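The 429 branch in the chart splits billing-quota errors from transient rate limits. A minimal classification sketch, assuming OpenAI's documented error envelope (`code: "insufficient_quota"` for billing failures); the interface and function names are illustrative:

```typescript
interface ApiError {
  status: number;
  code?: string;
  message?: string;
}

function classify429(err: ApiError): "quota" | "rate" | "other" {
  if (err.status !== 429) return "other";
  if (
    err.code === "insufficient_quota" ||
    /quota|billing/i.test(err.message ?? "")
  ) {
    // Fail fast: retrying cannot succeed until the account's billing changes.
    return "quota";
  }
  // Transient rate limit: eligible for retry with exponential backoff.
  return "rate";
}
```

Only the `"rate"` branch should feed back into the backoff/retry loop above; `"quota"` maps to a thrown `QuotaExceededError` with no retry.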