Conversation
… types

- Add rate-limiter utility with per-endpoint backoff and retry logic.
- Improve embedding handler with better error handling and fallback support.
- Refactor init, audio, image, object, and text model handlers for @elizaos/core API compatibility.
- Expand types and add banner.
- Update README with comprehensive documentation.

Co-authored-by: Cursor <cursoragent@cursor.com>
> **Note: Reviews paused.** It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the repository settings, or use the following commands to manage reviews.
No actionable comments were generated in the recent review.

**🎉 Walkthrough**

Adds a process-wide rate limiter with per-category sliding-window RPMs and exponential backoff, billing-429 detection (fail-fast quota errors), startup banner and tier logging, synthetic embedding fast-paths, non-blocking init, streaming compatibility shims, and refactors model registration to use direct handler references.

**Changes**
```mermaid
sequenceDiagram
    participant Client as Client Request
    participant Handler as Model Handler
    participant Limiter as Rate Limiter
    participant API as OpenAI API
    participant Tier as Tier Logger
    Client->>Handler: invoke model handler
    Handler->>Limiter: withRateLimit / acquireRateLimit(category, fn)
    Limiter->>Limiter: check sliding window, backoff, RPM limits
    alt allowed
        Limiter->>Handler: execute fn
        Handler->>API: send HTTP request
        API-->>Handler: 200 OK + headers
        Handler->>Tier: logTierOnce(response)
        Tier-->>Tier: parse x-ratelimit-limit-* and log once
        Handler-->>Client: return result (or stream)
    else 429 / billing
        API-->>Handler: 429 response
        Handler->>Limiter: throwIfRateLimited -> classify (quota vs rate)
        alt quota (billing)
            Handler-->>Client: throw QuotaExceededError (fail-fast)
        else transient rate-limit
            Limiter->>Limiter: schedule retry (Retry-After / backoff)
            Limiter->>Handler: retry fn up to maxRetries
        end
    end
```
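The "check sliding window" step in the diagram can be sketched as a minimal per-category limiter. This is an illustrative stand-in, not the plugin's actual `rate-limiter.ts`; the class name, method names, and RPM defaults are assumptions:

```typescript
// Minimal sliding-window limiter sketch matching the flow above.
// Class/method names and RPM numbers are illustrative assumptions.
const WINDOW_MS = 60_000;

class SlidingWindowLimiter {
  private windows = new Map<string, number[]>();

  constructor(private rpm: Record<string, number>) {}

  /** Milliseconds the caller should wait before sending; 0 if under the limit. */
  waitTime(category: string, now = Date.now()): number {
    const limit = this.rpm[category] ?? 60;
    // Keep only timestamps inside the trailing one-minute window.
    const recent = (this.windows.get(category) ?? []).filter(
      (t) => t > now - WINDOW_MS,
    );
    this.windows.set(category, recent);
    if (recent.length < limit) return 0;
    // The oldest request leaves the window first; wait until it expires.
    return recent[0] + WINDOW_MS - now;
  }

  /** Record a request that was actually sent. */
  record(category: string, now = Date.now()): void {
    const recent = this.windows.get(category) ?? [];
    recent.push(now);
    this.windows.set(category, recent);
  }
}
```

A caller would check `waitTime`, sleep if positive, then `record` before issuing the request; the real implementation also layers exponential backoff and Retry-After handling on top.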
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/models/image.ts (1)
**269-271: ⚠️ Potential issue | 🟡 Minor — Consider future-proofing the model version check with a regex pattern.**

The current hardcoded checks for `gpt-5`, `o1`, and `o3` will miss future o-series models (e.g., `o4`, `o5`). Per OpenAI's API documentation, all o-series reasoning models and GPT-5 models require `max_completion_tokens` instead of the deprecated `max_tokens`. Update the check to use a pattern that matches all future versions:

```diff
-  const useMaxCompletionTokens = modelName.startsWith("gpt-5") ||
-    modelName.startsWith("o1") ||
-    modelName.startsWith("o3");
+  // Models requiring max_completion_tokens instead of max_tokens (o-series and gpt-5)
+  const useMaxCompletionTokens = /^(gpt-5|o[0-9]+)/.test(modelName);
```

src/index.ts (1)
**332-365: ⚠️ Potential issue | 🔴 Critical — Streaming test is broken: `stream: true` is missing, and the `onStreamChunk` callback is never invoked.**

`streamParams` at line 343 sets `prompt` and `onStreamChunk` but does not set `stream: true`. In `generateTextByModelType` (src/models/text.ts line 66), the check `if (params.stream)` gates the streaming path. Since `stream` is `undefined`, the non-streaming `generateText` path is taken, which never invokes `onStreamChunk`. Additionally, `onStreamChunk` is not forwarded to `generateParams` (lines 44–54), so it would not be passed to the underlying Vercel AI SDK functions even if the streaming path were entered.

The assertion at lines 356–358 (`chunks.length === 0` → throw) will always fail because the chunks array remains empty.

Proposed fix:

```diff
 const streamParams: StreamingTextParams = {
   prompt: "Count from 1 to 5.",
+  stream: true,
   onStreamChunk: (chunk: string) => {
     chunks.push(chunk);
   },
 };
```

With `stream: true` added, `result` will be a `TextStreamResult` (not a string), requiring the downstream code to consume `result.textStream` as an `AsyncIterable<string>` rather than using the `onStreamChunk` callback pattern. The current assertions casting to `string` and calling `.substring()` will fail.
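A minimal sketch of consuming a stream-shaped result as the comment describes. The `TextStreamResult` interface here is a local stand-in matching the described shape, not the actual `@elizaos/core` type:

```typescript
// Local stand-in for the stream result shape the review describes.
interface TextStreamResult {
  textStream: AsyncIterable<string>;
}

// Drains the stream and joins the chunks, failing if nothing was streamed,
// mirroring the test's intended "chunks must not be empty" assertion.
async function collectStream(result: TextStreamResult): Promise<string> {
  const chunks: string[] = [];
  for await (const chunk of result.textStream) {
    chunks.push(chunk);
  }
  if (chunks.length === 0) {
    throw new Error("Streaming produced no chunks");
  }
  return chunks.join("");
}
```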
🤖 Fix all issues with AI agents
In `@package.json`:
- Line 47: The package.json currently lists the dependency "@elizaos/core":
"workspace:*", which is a pnpm workspace-only spec and will break installs for
consumers—replace the "workspace:*" value with a concrete semver range (e.g.
"^1.7.2" or "^0.1.9") for the "@elizaos/core" dependency entry in package.json,
update the lockfile by reinstalling (npm install or bun install) and commit the
changed package.json and lockfile; ensure any CI/publish workflows reference the
updated version rather than a workspace protocol.
In `@src/models/embedding.ts`:
- Around line 153-164: The current check in the embedding validation
(variables/functions: embedding, embeddingDimension, logger.error,
syntheticEmbedding) logs a mismatch and returns a synthetic vector which
silently corrupts the embedding store; change this to fail fast by throwing a
clear, descriptive error (including actual length and expected
embeddingDimension) instead of returning synthetic data, or make this behavior
configurable (e.g., add an option/flag like throwOnDimensionMismatch to the
embedding creation/validation path) so callers can opt into strict validation;
update callers/tests to handle the thrown error or pass the new option as
appropriate.
- Around line 101-106: The current short-circuit "if (text === 'test')" in the
embedding creation path returns a synthetic embedding for any literal "test"
input; change this to only trigger for the probe sentinel by verifying the
original param is the special probe object shape (not just the string).
Concretely, update the condition around where text is extracted (the code that
handles string vs object params) to check that the incoming arg is an object and
contains the probe marker (e.g., a dedicated boolean/string sentinel property
used by ensureEmbeddingDimension), and only then call
syntheticEmbedding(embeddingDimension, 0.1); otherwise continue to call the real
embedding path. Reference symbols: syntheticEmbedding, embeddingDimension, and
ensureEmbeddingDimension probe.
🧹 Nitpick comments (7)
src/banner.ts (2)
**84-88: Long values will overflow the table row and break alignment.**

`padEnd` doesn't truncate — if `${value} ${status}` exceeds `COL2_WIDTH - 1` (41 chars), the row will be wider than the table borders. A custom base URL like `https://my-company-proxy.internal.example.com/v1` would break the layout. Consider truncating or ellipsizing the value to fit:

Proposed fix:

```diff
 function row(label: string, value: string, status: string): string {
-  const rightContent = `${value} ${status}`.padEnd(COL2_WIDTH - 1);
+  const maxValueLen = COL2_WIDTH - 1 - status.length - 1; // 1 for space before status
+  const truncatedValue = value.length > maxValueLen
+    ? value.slice(0, maxValueLen - 1) + "…"
+    : value;
+  const rightContent = `${truncatedValue} ${status}`.padEnd(COL2_WIDTH - 1);
   const leftContent = label.padEnd(COL1_WIDTH - 1);
   return `| ${leftContent}| ${rightContent}|`;
 }
```
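The key behavior behind this nitpick is that `String.prototype.padEnd` only ever lengthens a string. A quick standalone demonstration of a truncating fit helper (the `fit` name and width are illustrative, not from the plugin):

```typescript
// padEnd never shortens: "abc".padEnd(2) is still "abc".
// A fixed-width table cell therefore needs an explicit truncation step.
function fit(value: string, width: number): string {
  // Ellipsize when too long, pad when too short, so rows stay aligned.
  const out = value.length > width ? value.slice(0, width - 1) + "…" : value;
  return out.padEnd(width);
}
```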
**93-104: Consider using `logger` instead of `console.log` for consistency with the rest of the codebase.**

Direct `console.log` bypasses any structured logging, log-level filtering, or log routing the host application may have configured. If the logger adds unwanted prefixes that break table formatting, this is a reasonable trade-off — but worth a brief inline comment explaining why `console.log` is used intentionally.

src/models/embedding.ts (1)
**141-149: Type assertion on response JSON lacks full runtime validation.**

The `as { data: [...] }` cast at line 141 trusts the response shape. The check at line 146 only validates that `data.data[0].embedding` exists but doesn't verify it's actually a `number[]`. If the API returns an unexpected shape (e.g., embedding is a string), this would propagate garbage downstream silently.

This is acceptable risk given OpenAI's stable API, but worth noting.
src/models/image.ts (1)
**326-333: Logging full response headers in error context could leak sensitive data in some proxy configurations.**

`Object.fromEntries(response.headers.entries())` logs all response headers. While OpenAI response headers are typically safe, custom proxies or intermediaries might inject sensitive headers. Consider filtering to known-useful headers.

README.md (1)
**75-87: Add language specifiers to fenced code blocks flagged by markdownlint.**

Lines 75, 93, 118, and 269 have fenced code blocks without a language specifier (markdownlint MD040). Use `text` for plaintext output blocks:

Fix:

````diff
-```
+```text
 +----------------------------------------------------------------------+
-```
+```text
 [OpenAI] Account tier detected: 10000 RPM, 2000000 TPM
-```
+```text
 [OpenAI] Quota exceeded -- your API key has no remaining credits.
-```
+```text
 src/
````

Also applies to: 93-95, 118-121, 269-289
src/utils/rate-limiter.ts (2)
**154-188: Minor race condition in `acquire()` under concurrent calls for the same category.**

When multiple async calls to `acquire(category)` interleave (e.g., embeddings burst), each call creates its own `recent` array from `filter()` at line 170, checks capacity, pushes a timestamp, and writes back via `this.windows.set()`. The last writer wins, so earlier timestamps can be lost — allowing slightly more requests than the RPM limit.

For a client-side rate limiter this is acceptable (the real enforcement is server-side), but if you want tighter accuracy, consider mutating the stored array in-place rather than replacing it:

Optional tightening:

```diff
   const maxRPM = this.getRPM(category);
   const cutoff = Date.now() - WINDOW_MS;
-  const timestamps = this.windows.get(category) ?? [];
-  const recent = timestamps.filter((t) => t > cutoff);
+  let recent = this.windows.get(category);
+  if (!recent) {
+    recent = [];
+    this.windows.set(category, recent);
+  }
+  // Prune in-place
+  let writeIdx = 0;
+  for (let i = 0; i < recent.length; i++) {
+    if (recent[i] > cutoff) recent[writeIdx++] = recent[i];
+  }
+  recent.length = writeIdx;
   if (recent.length >= maxRPM) {
     const oldestInWindow = recent[0];
     const waitMs = oldestInWindow + WINDOW_MS - Date.now() + 50;
     if (waitMs > 0) {
       logger.warn(
         `[OpenAI:RateLimit] ${category}: ${recent.length}/${maxRPM} RPM, throttling ${waitMs}ms`,
       );
       await sleep(waitMs);
     }
   }
   recent.push(Date.now());
-  this.windows.set(category, recent);
```
**357-384: String-matching fallback in `extractRateLimitInfo` could produce false positives on unrelated errors containing "429".**

An error message like `"Server returned 4290 items"` would match `msg.includes("429")`. This is unlikely with OpenAI errors but could trip on custom proxy error messages. A word-boundary check would be more precise:

Optional tightening:

```diff
 if (
-  msg.includes("429") ||
+  /\b429\b/.test(msg) ||
   msg.includes("rate limit") ||
   msg.includes("too many requests") ||
   isBilling
 ) {
```
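A quick standalone check of the word-boundary suggestion, showing how `/\b429\b/` catches real 429 messages while rejecting the "4290" false positive that `includes("429")` would match:

```typescript
// Word-boundary matching avoids substring false positives like "4290"
// while still catching "HTTP 429" and "status 429:".
function mentions429(message: string): boolean {
  return /\b429\b/.test(message);
}
```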
Pull request overview
Adds a process-level OpenAI rate limiting utility and refactors model handlers/types/docs to improve resiliency (429 handling, tier logging, startup behavior) and compatibility across @elizaos/core versions.
Changes:
- Introduces a shared rate limiter with per-endpoint RPM windows, 429 retry/backoff, quota detection, and one-shot tier logging.
- Refactors text/object/image/audio/embedding handlers to use rate limiting and improved error handling (plus synthetic embedding fast-paths).
- Adds a startup configuration banner, expands local compatibility types, and updates README/CHANGELOG.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/utils/rate-limiter.ts | New rate limiter wrapper, 429/quota detection, per-category windows/backoff, tier logging |
| src/utils/index.ts | Re-exports the rate limiter utilities |
| src/types/index.ts | Adds shared types + local shims for streaming compatibility with older core versions |
| src/models/text.ts | Uses limiter for chat calls; adds streaming initiation throttling |
| src/models/object.ts | Wraps object generation in rate-limited retry wrapper |
| src/models/image.ts | Wraps image generation/description with rate limiting + 429 parsing + tier logging |
| src/models/embedding.ts | Adds synthetic embedding fast-paths and wraps API calls with rate limiting |
| src/models/audio.ts | Wraps TTS/transcription with rate limiting + 429 parsing + tier logging |
| src/init.ts | Removes eager API validation; prints banner and logs baseURL |
| src/index.ts | Simplifies model registration; adds streaming test and type shims usage |
| src/banner.ts | Adds startup configuration banner output |
| package.json | Switches @elizaos/core dependency to workspace:* |
| README.md | Documents banner, limiter behavior/config, compatibility notes, model table, architecture |
| CHANGELOG.md | Adds unreleased changelog describing new functionality |
```diff
   let text: string;
   if (typeof params === "string") {
     text = params;
   } else if (typeof params === "object" && params.text) {
     text = params.text;
   } else {
-    const errorMsg = "Invalid input format for embedding";
-    logger.warn(errorMsg);
-    const fallbackVector = Array(embeddingDimension).fill(0);
-    fallbackVector[0] = 0.2;
-    return fallbackVector;
+    logger.warn("Invalid input format for embedding");
+    return syntheticEmbedding(embeddingDimension, 0.2);
   }
```
`params.text` is checked for truthiness, so `{ text: "" }` is treated as "invalid input format" (marker 0.2) instead of the intended "empty text" path (marker 0.3). Check for the presence of the `text` property (e.g., `'text' in params`) and then handle empty-string via the existing `!text.trim()` branch.
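A sketch of the extraction this comment asks for, using a presence check so `{ text: "" }` reaches the empty-text branch. The function and type names are illustrative; the 0.2/0.3 marker semantics follow the review's description of the plugin's synthetic embeddings:

```typescript
// Illustrative input classification for the embedding handler.
type EmbedParams = string | { text?: string };

function classifyEmbeddingInput(
  params: EmbedParams,
): { kind: "ok"; text: string } | { kind: "empty" } | { kind: "invalid" } {
  let text: string;
  if (typeof params === "string") {
    text = params;
  } else if (typeof params === "object" && params !== null && "text" in params) {
    // Presence check, not truthiness: { text: "" } lands here, not in "invalid".
    text = params.text ?? "";
  } else {
    return { kind: "invalid" }; // would map to synthetic marker 0.2
  }
  if (!text.trim()) {
    return { kind: "empty" }; // would map to synthetic marker 0.3
  }
  return { kind: "ok", text };
}
```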
```diff
   const streamParams: StreamingTextParams = {
     prompt: "Count from 1 to 5.",
     onStreamChunk: (chunk: string) => {
       chunks.push(chunk);
     },
-  });
-  if (!result || result.length === 0) {
+  };
```
The streaming test sets onStreamChunk but never sets stream: true. In this PR’s text handler, streaming only activates when params.stream is truthy, so this test will run the non-streaming path and chunks will remain empty. Either set stream: true here (and adjust assertions/types to match the streaming return shape) or update the handler to treat onStreamChunk as a streaming request.
```diff
   "@ai-sdk/openai": "^2.0.32",
-  "@elizaos/core": "^1.7.0",
+  "@elizaos/core": "workspace:*",
   "ai": "^5.0.47",
```
Changing @elizaos/core dependency to "workspace:*" will break consumers installing this plugin from npm (the workspace protocol generally won’t resolve outside a monorepo). If the intent is to support both local workspace dev and published installs, consider restoring a semver range here and/or moving @elizaos/core to peerDependencies with an appropriate supported range.
Changes:
- rate-limiter.ts: parseFloat on `retry-after` can return NaN if header value is invalid, then m...
- package.json: _⚠️ Potential issue_ | _🔴 Critical_ <details> <summary>🧩 Analysis chain</s...
- embedding.ts: _⚠️ Potential issue_ | _🟠 Major_ <details> <summary>🧩 Analysis chain</summ...
- embedding.ts: _⚠️ Potential issue_ | _🟠 Major_ **Dimension mismatch returns a synthetic e...
- embedding.ts: `params.text` is checked for truthiness, so `{ text: "" }` is treated as "inv...
- index.ts: The streaming test sets `onStreamChunk` but never sets `stream: true`. In thi...
- README.md: README says the default `IMAGE_DESCRIPTION` model is `gpt-4o-mini`, but the c...
- package.json: Changing `@elizaos/core` dependency to `"workspace:*"` will break consumers i...
- rate-limiter.ts: `RateLimiter.acquire()` is not concurrency-safe: concurrent callers can read ...
- rate-limiter.ts: `OPENAI_RATE_LIMIT_MAX_RETRIES` set to `0` won't be respected because `parseI...
- banner.ts: Banner title and billing rows off-by-one width (Low Severity)
- rate-limiter.ts: Env var max retries zero treated as default (Low Severity)
Actionable comments posted: 1
🧹 Nitpick comments (5)
src/utils/rate-limiter.ts (2)
296-303: Environment variable re-read on every call.
`process.env.OPENAI_RATE_LIMIT_MAX_RETRIES` is parsed on every invocation of `withRateLimit`. This is a hot path for all API calls. Consider caching it at module scope (like `envRPM` on line 519) or lazily on first read, since it's unlikely to change at runtime.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 296 - 303, The code repeatedly reads and parses process.env.OPENAI_RATE_LIMIT_MAX_RETRIES inside withRateLimit (envMaxRetriesRaw/envMaxRetries); move that parsing out of the hot path by caching it at module scope (e.g., create a module-level constant like cachedMaxRetries or reuse envRPM pattern) or initialize it lazily on first access, then have withRateLimit reference the cached value instead of re-reading envMaxRetriesRaw each call; ensure the cached value uses the same validation (Number.isFinite and >= 0) and remains typed as number | undefined.
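A sketch of the lazy, cached parsing the prompt describes. The `getMaxRetries` helper and `cachedMaxRetries` names are illustrative (the prompt itself suggests similar names); the `>= 0` check is deliberate so an explicit `0` disables retries rather than falling back to the default:

```typescript
// Lazy, cached env parsing so the hot path doesn't re-read process.env.
// null means "not yet read"; undefined means "read, but unset or invalid".
let cachedMaxRetries: number | undefined | null = null;

function getMaxRetries(defaultRetries = 3): number {
  if (cachedMaxRetries === null) {
    const raw = process.env.OPENAI_RATE_LIMIT_MAX_RETRIES;
    const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
    // Validate with Number.isFinite and >= 0 (not truthiness),
    // so an explicit "0" is respected and disables retries.
    cachedMaxRetries = Number.isFinite(parsed) && parsed >= 0 ? parsed : undefined;
  }
  return cachedMaxRetries ?? defaultRetries;
}
```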
**372-406: `Retry-After` header: only seconds format is parsed, HTTP-date format is silently ignored.**

The `Retry-After` header can be either a number of seconds or an HTTP-date (RFC 7231 §7.1.3). `parseFloat` on an HTTP-date returns `NaN`, which is safely handled (falls back to `undefined`), so this isn't a bug. However, if OpenAI ever switches to date format, the optimal backoff timing would be lost. Documenting this assumption in a comment would be sufficient.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 372 - 406: The rate-limit handling in throwIfRateLimited currently only parses a numeric Retry-After seconds value (via parseFloat) and will silently ignore RFC-7231 HTTP-date formatted Retry-After headers; update the throwIfRateLimited function to document this assumption by adding a clear comment above the retry-after parsing that explains both allowed formats (seconds or HTTP-date) and that the current implementation only parses seconds, and optionally (preferred) add fallback parsing for HTTP-date by using Date.parse(response.headers.get("retry-after")) to compute retrySeconds when parseFloat yields NaN before converting to retryMs and passing it into RateLimitError.

src/banner.ts (2)
**75-78: Inconsistent indentation on multi-line assignments.**

Lines 75–78 and 95–98 have an extra leading space compared to adjacent lines (e.g., `const title` vs `const billingPrefix`). This appears to be an accidental whitespace inconsistency.

🧹 Proposed fix:

```diff
- const title =
-  titlePrefix +
-  " ".repeat(WIDTH - titlePrefix.length - 1) +
-  "|";
+ const title =
+   titlePrefix +
+   " ".repeat(WIDTH - titlePrefix.length - 1) +
+   "|";

- const billingPrefix = "| Billing: ";
- const billingRow = `${billingPrefix}${billingURL}${" ".repeat(
-  WIDTH - billingPrefix.length - billingURL.length - 1,
- )}|`;
+ const billingPrefix = "| Billing: ";
+ const billingRow = `${billingPrefix}${billingURL}${" ".repeat(
+   WIDTH - billingPrefix.length - billingURL.length - 1,
+ )}|`;
```

Also applies to: 95-98
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 75 - 78, The multi-line assignments for the constants title and billingPrefix have an extra leading space causing inconsistent indentation; fix both by aligning their continued lines with the declaration (remove the accidental leading space before the second and subsequent lines) so the concatenation lines for const title and const billingPrefix line up with their variable names and match other multi-line assignments that use WIDTH and titlePrefix/billingPrefix.
**100-111: Consider using `logger.info` instead of `console.log`.**

The rest of the codebase uses `logger` from `@elizaos/core` for all output. Using `console.log` directly bypasses any log-level filtering, structured logging, or output redirection that the framework may provide. If this is intentional (to ensure the banner is always visible), a brief comment explaining the choice would help.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 100 - 111: The banner currently uses console.log in src/banner.ts (calls to console.log around line, title, divider, row(...) and billingRow), which bypasses the project's logger; update the file to import and use logger.info from `@elizaos/core` for these outputs (replace each console.log(...) with logger.info(...)) so log-levels and sinks are respected, or if the intent is to always show the banner regardless of log-levels, add a short comment above the block explaining that intentional choice and why console.log is required; ensure references to the row(...) helper and variables (line, title, divider, billingRow) remain unchanged.

README.md (1)
75-87: Add language identifiers to fenced code blocks.

Several fenced code blocks in this file (lines 75, 93, 118, 269) lack a language specifier. For terminal/log output, use ` ```text ` or ` ```console ` to satisfy markdownlint MD040 and improve rendering.

Same applies to the blocks at lines 93, 118, and 269.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 75 - 87, The README contains several fenced code blocks that lack a language identifier (notably the ASCII table block starting with "+----------------------------------------------------------------------+" and other terminal/log output blocks); update each opening fence to include an appropriate language specifier such as ```text or ```console (for example change ``` to ```text) for the blocks referenced in the review so they satisfy markdownlint MD040 and render correctly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/models/embedding.ts`:
- Around line 101-107: Remove the accidental line-number decorations ("101 |",
"102 |", etc.) that were inserted into the if-block around the synthetic probe
check; locate the if statement that checks `if (text === "test" && typeof params
=== "object" && params !== null)` and the subsequent `logger.debug(...)` and
`return syntheticEmbedding(embeddingDimension, 0.1);` lines and delete the
numeric prefixes so the block is valid TypeScript again, ensuring
`logger.debug`, `syntheticEmbedding`, `embeddingDimension`, `text`, and `params`
remain unchanged otherwise.
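For reference, the probe fast-path that the prompt above reconstructs looks roughly like this once the stray numeric prefixes are removed. `probeFastPath` is a hypothetical wrapper and the body of `syntheticEmbedding` is an assumption; only the condition and the `0.1` fill value come from the review:

```typescript
// Constant-valued vector used as a stand-in for a real embedding.
function syntheticEmbedding(dimension: number, value: number): number[] {
  return new Array<number>(dimension).fill(value);
}

// Fast-path for the runtime's synthetic "test" probe: skip the API call
// and return a constant vector of the configured dimension.
function probeFastPath(
  text: unknown,
  params: unknown,
  embeddingDimension: number,
): number[] | null {
  if (text === "test" && typeof params === "object" && params !== null) {
    return syntheticEmbedding(embeddingDimension, 0.1);
  }
  return null; // not a probe: proceed with the real request
}
```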
---
Nitpick comments:
In `@README.md`:
- Around line 75-87: The README contains several fenced code blocks that lack a
language identifier (notably the ASCII table block starting with
"+----------------------------------------------------------------------+" and
other terminal/log output blocks); update each opening fence to include an
appropriate language specifier such as ```text or ```console (for example change
``` to ```text) for the blocks referenced in the review so they satisfy
markdownlint MD040 and render correctly.
In `@src/banner.ts`:
- Around line 75-78: The multi-line assignments for the constants title and
billingPrefix have an extra leading space causing inconsistent indentation; fix
both by aligning their continued lines with the declaration (remove the
accidental leading space before the second and subsequent lines) so the
concatenation lines for const title and const billingPrefix line up with their
variable names and match other multi-line assignments that use WIDTH and
titlePrefix/billingPrefix.
- Around line 100-111: The banner currently uses console.log in src/banner.ts
(calls to console.log around line, title, divider, row(...) and billingRow),
which bypasses the project's logger; update the file to import and use
logger.info from `@elizaos/core` for these outputs (replace each console.log(...)
with logger.info(...)) so log-levels and sinks are respected, or if the intent
is to always show the banner regardless of log-levels, add a short comment above
the block explaining that intentional choice and why console.log is required;
ensure references to the row(...) helper and variables (line, title, divider,
billingRow) remain unchanged.
In `@src/utils/rate-limiter.ts`:
- Around line 296-303: The code repeatedly reads and parses
process.env.OPENAI_RATE_LIMIT_MAX_RETRIES inside withRateLimit
(envMaxRetriesRaw/envMaxRetries); move that parsing out of the hot path by
caching it at module scope (e.g., create a module-level constant like
cachedMaxRetries or reuse envRPM pattern) or initialize it lazily on first
access, then have withRateLimit reference the cached value instead of re-reading
envMaxRetriesRaw each call; ensure the cached value uses the same validation
(Number.isFinite and >= 0) and remains typed as number | undefined.
- Around line 372-406: The rate-limit handling in throwIfRateLimited currently
only parses a numeric Retry-After seconds value (via parseFloat) and will
silently ignore RFC-7231 HTTP-date formatted Retry-After headers; update the
throwIfRateLimited function to document this assumption by adding a clear
comment above the retry-after parsing that explains both allowed formats
(seconds or HTTP-date) and that the current implementation only parses seconds,
and optionally (preferred) add fallback parsing for HTTP-date by using
Date.parse(response.headers.get("retry-after")) to compute retrySeconds when
parseFloat yields NaN before converting to retryMs and passing it into
RateLimitError.
Changes:
- banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** `<!-- DE...`
- rate-limiter.ts: ### Promise queue cleanup comparison is always false **Low Severity** `<!-- ...`

Changes:
- embedding.ts: `params.text` is checked for truthiness, so `{ text: "" }` is treated as “inv...
- banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** `<!-- DE...`
- embedding.ts: _⚠️ Potential issue_ | _🔴 Critical_ `<details> <summary>🧩 Analysis chain</s...`
🧹 Nitpick comments (5)
src/utils/rate-limiter.ts (5)
496-498: `logTierOnce` only triggers on status `200`, not the full 2xx range.

OpenAI's main endpoints (chat, embeddings) return 200, so this works in practice. If any future endpoint returns `201` or `204` with rate-limit headers, tier detection would silently skip it. A `response.ok` check would be more forward-compatible.

Proposed fix

```diff
- if (response.status !== 200) return; // Only log from successful responses
+ if (!response.ok) return; // Only log from successful responses
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 496 - 498, The logTierOnce function currently only considers responses with status === 200 and thus skips other successful 2xx responses; update the guard in logTierOnce to use response.ok (e.g., if (!response.ok) return) so any 2xx success will allow tier detection to run, leaving the tierLoggedOnce short-circuit unchanged to preserve single-time logging.
394-399: `Retry-After` header may contain an HTTP-date, not just seconds.

Per RFC 9110, `Retry-After` can be either a delay-seconds (`120`) or an HTTP-date (`Thu, 01 Dec 2025 16:00:00 GMT`). The current `parseFloat` correctly falls through to `undefined` for dates (NaN check on line 397), so this degrades gracefully — just noting that an HTTP-date value would discard a valid server hint. OpenAI currently uses numeric seconds, so this is low risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 394 - 399, The Retry-After header handling only parses numeric seconds (retryAfter -> retrySeconds -> retryMs) and ignores valid HTTP-date values; update the logic that reads response.headers.get("retry-after") so it first attempts parseFloat and if that is NaN, tries to parse an HTTP-date (e.g., via Date.parse(retryAfter)) and computes retryMs as the difference between the parsed date and Date.now() (clamped to >=0); keep existing behavior of using numeric seconds when present and ensure invalid parses result in undefined.
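The two-format parsing the prompt above asks for can be sketched as follows. `parseRetryAfterMs` is a hypothetical helper name, not a function in the plugin; only the fallback order (delay-seconds first, then HTTP-date) comes from the review:

```typescript
// Hypothetical helper: parse an RFC 9110 Retry-After header value into
// milliseconds. Accepts delay-seconds ("120") and HTTP-date
// ("Thu, 01 Dec 2025 16:00:00 GMT"); returns undefined when unparseable.
function parseRetryAfterMs(
  header: string | null,
  now: number = Date.now(),
): number | undefined {
  if (header === null) return undefined;
  const seconds = parseFloat(header);
  if (Number.isFinite(seconds) && seconds >= 0) {
    return seconds * 1000; // delay-seconds form
  }
  const dateMs = Date.parse(header); // HTTP-date form
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - now); // clamp past dates to zero
  }
  return undefined; // caller falls back to exponential backoff
}
```

A caller would feed `response.headers.get("retry-after")` in and fall back to its computed backoff whenever the result is `undefined`.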
514-526: Global RPM override only applies to categories known at module load time.
`Object.keys(DEFAULT_RPM)` iterates over the four predefined categories. If a caller later invokes `withRateLimit("some_new_category", ...)`, the override won't apply — `getRPM` falls back to `60`. This is fine for the current four OpenAI categories but is worth documenting if the set of categories may grow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 514 - 526, The global RPM override logic currently reads OPENAI_RATE_LIMIT_RPM into envRPM and only applies it to the prepopulated keys from DEFAULT_RPM via limiter.setRPM, so any later-created categories (used by withRateLimit or looked up via getRPM) won't inherit the override; change the implementation so the override is applied centrally — either have limiter.getRPM consult the parsed envRPM if no explicit RPM is set for a category, or ensure limiter.setRPM is invoked for newly-created categories by storing the parsed rpm globally and applying it when withRateLimit/createCategory is called; reference envRPM, DEFAULT_RPM, limiter.setRPM, getRPM and withRateLimit to locate where to add this fallback behavior.
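One way to apply the override centrally, as the prompt suggests, is to consult the parsed env value inside the RPM lookup itself. This is a sketch: the `DEFAULT_RPM` numbers are illustrative, and the `getRPM`/`parseEnvRPM` signatures are assumptions rather than the plugin's actual API.

```typescript
// Illustrative defaults; the plugin's real values may differ.
const DEFAULT_RPM: Record<string, number> = {
  text: 500,
  embedding: 3000,
  image: 50,
  audio: 100,
};

// Parse OPENAI_RATE_LIMIT_RPM once (positive integer, else undefined).
function parseEnvRPM(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const parsed = parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : undefined;
}

// Lookup order: explicit per-category setting, then the global env
// override, then the predefined default, then a conservative 60 RPM.
function getRPM(
  category: string,
  overrides: Map<string, number>,
  envRPM: number | undefined = parseEnvRPM(process.env.OPENAI_RATE_LIMIT_RPM),
): number {
  return overrides.get(category) ?? envRPM ?? DEFAULT_RPM[category] ?? 60;
}
```

With the override consulted at lookup time, categories created after startup inherit it automatically.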
294-301: Use `parseInt` for `OPENAI_RATE_LIMIT_MAX_RETRIES` to avoid fractional retry counts.

`Number(envMaxRetriesRaw)` parses `"3.5"` as `3.5`, which flows into the loop comparison `attempt <= 3.5` and yields 4 iterations — a surprising result for a "max retries" setting. Using `parseInt` (consistent with the RPM parsing on line 519) makes the intent explicit.

Proposed fix

```diff
-    const parsed = Number(envMaxRetriesRaw);
-    if (Number.isFinite(parsed) && parsed >= 0) {
+    const parsed = parseInt(envMaxRetriesRaw, 10);
+    if (!isNaN(parsed) && parsed >= 0) {
       envMaxRetries = parsed;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 294 - 301, The envMaxRetries parsing currently uses Number(envMaxRetriesRaw) which allows fractional values (e.g., "3.5") to slip through; update the logic that sets envMaxRetries from OPENAI_RATE_LIMIT_MAX_RETRIES to use parseInt(envMaxRetriesRaw, 10) instead, validate the result is a finite integer >= 0, and assign that integer to envMaxRetries (mirroring how RPM is parsed elsewhere); keep the same variable names envMaxRetriesRaw and envMaxRetries and ensure non-numeric or negative values leave envMaxRetries undefined.
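The difference is easy to demonstrate; `parseMaxRetries` here is a hypothetical wrapper around the validation the prompt describes, not the plugin's own function:

```typescript
// Number("3.5") yields 3.5, which would drive `attempt <= 3.5` through
// four iterations; parseInt truncates to an integer up front.
function parseMaxRetries(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined;
  const parsed = parseInt(raw, 10);
  // Reject NaN and negatives; invalid input leaves the setting unset.
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : undefined;
}
```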
433-448: String-matching heuristic for billing detection in SDK-wrapped errors has a minor false-positive surface.

Any `Error` whose message coincidentally contains `"quota"` or `"billing"` (but is not actually a 429/billing error) would be classified as `isBillingError: true`, causing `withRateLimit` to skip retries and re-throw immediately. Since this only runs on errors thrown inside `fn()` (i.e., during OpenAI API calls), the practical risk is very low, but worth keeping in mind if the wrapper is ever used around non-OpenAI calls.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 433 - 448, The current error-string heuristic marks any Error whose message contains "quota" or "billing" as isBillingError, causing withRateLimit to skip retries on benign errors; change the detection so isBillingError is true only when billing keywords appear together with an explicit rate-limit indicator or OpenAI-specific error metadata: keep the existing rate-limit checks (msg.includes("429") / "rate limit" / "too many requests") and only set isBillingError when isBilling && (msg contains a rate-limit token OR error has OpenAI-style markers such as error.code === "insufficient_quota" or error.response?.status === 429 or error.type === "insufficient_quota"); update the block around the existing error instanceof Error handling that sets isRateLimit and isBillingError used by withRateLimit accordingly.
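A tightened classifier along the lines the prompt sketches could look like this. The `code`/`type`/`response.status` fields mirror the OpenAI-style error shape named in the review; treat the exact field names as assumptions:

```typescript
interface SdkLikeError extends Error {
  code?: string;
  type?: string;
  response?: { status?: number };
}

// Billing keywords alone are not enough: require either an explicit
// rate-limit token in the message or an OpenAI-style error marker.
function classifyBillingError(error: unknown): boolean {
  if (!(error instanceof Error)) return false;
  const e = error as SdkLikeError;
  const msg = e.message.toLowerCase();
  const hasBillingKeyword = msg.includes("quota") || msg.includes("billing");
  const hasRateLimitToken =
    msg.includes("429") ||
    msg.includes("rate limit") ||
    msg.includes("too many requests");
  const hasOpenAIMarker =
    e.code === "insufficient_quota" ||
    e.type === "insufficient_quota" ||
    e.response?.status === 429;
  return hasBillingKeyword && (hasRateLimitToken || hasOpenAIMarker);
}
```

An error like "billing address mismatch" no longer short-circuits retries, while a genuine `insufficient_quota` error still fails fast.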
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/utils/rate-limiter.ts`:
- Around line 496-498: The logTierOnce function currently only considers
responses with status === 200 and thus skips other successful 2xx responses;
update the guard in logTierOnce to use response.ok (e.g., if (!response.ok)
return) so any 2xx success will allow tier detection to run, leaving the
tierLoggedOnce short-circuit unchanged to preserve single-time logging.
- Around line 394-399: The Retry-After header handling only parses numeric
seconds (retryAfter -> retrySeconds -> retryMs) and ignores valid HTTP-date
values; update the logic that reads response.headers.get("retry-after") so it
first attempts parseFloat and if that is NaN, tries to parse an HTTP-date (e.g.,
via Date.parse(retryAfter)) and computes retryMs as the difference between the
parsed date and Date.now() (clamped to >=0); keep existing behavior of using
numeric seconds when present and ensure invalid parses result in undefined.
- Around line 514-526: The global RPM override logic currently reads
OPENAI_RATE_LIMIT_RPM into envRPM and only applies it to the prepopulated keys
from DEFAULT_RPM via limiter.setRPM, so any later-created categories (used by
withRateLimit or looked up via getRPM) won't inherit the override; change the
implementation so the override is applied centrally — either have limiter.getRPM
consult the parsed envRPM if no explicit RPM is set for a category, or ensure
limiter.setRPM is invoked for newly-created categories by storing the parsed rpm
globally and applying it when withRateLimit/createCategory is called; reference
envRPM, DEFAULT_RPM, limiter.setRPM, getRPM and withRateLimit to locate where to
add this fallback behavior.
- Around line 294-301: The envMaxRetries parsing currently uses
Number(envMaxRetriesRaw) which allows fractional values (e.g., "3.5") to slip
through; update the logic that sets envMaxRetries from
OPENAI_RATE_LIMIT_MAX_RETRIES to use parseInt(envMaxRetriesRaw, 10) instead,
validate the result is a finite integer >= 0, and assign that integer to
envMaxRetries (mirroring how RPM is parsed elsewhere); keep the same variable
names envMaxRetriesRaw and envMaxRetries and ensure non-numeric or negative
values leave envMaxRetries undefined.
- Around line 433-448: The current error-string heuristic marks any Error whose
message contains "quota" or "billing" as isBillingError, causing withRateLimit
to skip retries on benign errors; change the detection so isBillingError is true
only when billing keywords appear together with an explicit rate-limit indicator
or OpenAI-specific error metadata: keep the existing rate-limit checks
(msg.includes("429") / "rate limit" / "too many requests") and only set
isBillingError when isBilling && (msg contains a rate-limit token OR error has
OpenAI-style markers such as error.code === "insufficient_quota" or
error.response?.status === 429 or error.type === "insufficient_quota"); update
the block around the existing error instanceof Error handling that sets
isRateLimit and isBillingError used by withRateLimit accordingly.
Actionable comments posted: 2
🧹 Nitpick comments (8)
src/banner.ts (3)
94-98: Extra leading space on line 95.

`billingPrefix` is indented with 3 spaces while the surrounding `billingURL` declaration uses 2. Inconsistent with the rest of the function body.

🧹 Proposed fix

```diff
  const billingURL =
    "https://platform.openai.com/settings/organization/billing";
-   const billingPrefix = "| Billing: ";
+  const billingPrefix = "| Billing: ";
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 94 - 98, The indentation of billingPrefix is inconsistent (it has one extra leading space) compared to billingURL and the surrounding code; fix by aligning billingPrefix's declaration indentation with billingURL and billingRow so all three consts (billingURL, billingPrefix, billingRow) share the same left margin, leaving the expressions unchanged (billingPrefix = "| Billing: ", billingRow using WIDTH) to restore consistent formatting.
57-59: Redundant `?? "text-embedding-3-small"` fallback.

`getSetting(runtime, "OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")` already returns the default string when neither the setting nor the env var is set, so the trailing `?? "text-embedding-3-small"` is unreachable dead code.

🧹 Proposed simplification

```diff
-  const embeddingModel =
-    getSetting(runtime, "OPENAI_EMBEDDING_MODEL", "text-embedding-3-small") ??
-    "text-embedding-3-small";
+  const embeddingModel =
+    getSetting(runtime, "OPENAI_EMBEDDING_MODEL", "text-embedding-3-small")!;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 57 - 59, Remove the unreachable fallback after getSetting: the expression assigning embeddingModel uses getSetting(runtime, "OPENAI_EMBEDDING_MODEL", "text-embedding-3-small") which already returns the default, so drop the trailing ?? "text-embedding-3-small"; update the embeddingModel assignment (around the embeddingModel symbol) to use only the getSetting call and ensure no other code relies on the redundant nullish coalescing.
100-111: `console.log` bypasses the shared `logger` used elsewhere in the plugin.

The rest of the codebase (e.g., `src/utils/config.ts`) routes output through the `logger` from `@elizaos/core`. Using `console.log` here means the banner output ignores any log-level filtering or transport configured on that logger. If guaranteed visibility at startup is the intent, `logger.info` achieves the same for typical configurations while staying consistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/banner.ts` around lines 100 - 111, Replace the direct console.log calls in the banner rendering with the shared logger so output respects configured log levels/transports: import the existing logger from '@elizaos/core' if not already present, then change each console.log(...) call (the ones printing line, title, divider, row(...), billingRow, and the trailing blank line) to logger.info(...) so the banner (variables/functions: line, title, divider, row, billingRow) is emitted through the centralized logger while preserving the same string formatting and order.

src/models/embedding.ts (2)
165-165: `logger.log` on every successful embedding will be noisy — consider `logger.debug`.

Embeddings are called frequently during message processing. `logger.log` (typically `info`-level) will clutter output. Other success paths in this file use `logger.debug`.

```diff
- logger.log(`Got valid embedding with length ${embedding.length}`);
+ logger.debug(`Got valid embedding with length ${embedding.length}`);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/models/embedding.ts` at line 165, Replace the current success-level logging call logger.log(`Got valid embedding with length ${embedding.length}`) with a debug-level call (logger.debug) so embedding success messages don't clutter info logs; locate the usage in the function that validates/returns embeddings (the line using embedding.length) and make it consistent with other success paths that use logger.debug.
91-95: Placeholder issue link in comment.

Line 95 references `issues/XXX` — either link the real issue or drop the URL to avoid confusion for future readers.

```diff
- // See: https://github.com/elizaos-plugins/plugin-openai/issues/XXX
+ // See discussion in PR `#24` for rationale.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/models/embedding.ts` around lines 91 - 95, The inline comment in embedding.ts references a placeholder issue URL "issues/XXX"; update the comment near the ensureEmbeddingDimension() probe and the block discussing short-circuiting (the comment that starts "NOTE: We do NOT short-circuit...") by either replacing "issues/XXX" with the correct issue/PR number or removing the URL entirely so it doesn't confuse future readers—ensure the rest of the explanatory text remains intact and that the comment still mentions that the probe sends null and not a TextEmbeddingParams object to keep context for ensureEmbeddingDimension().

src/utils/rate-limiter.ts (3)
513-524: Env RPM override only applies to predefined categories — intentional but worth noting.
The `OPENAI_RATE_LIMIT_RPM` handling iterates `Object.keys(DEFAULT_RPM)`, so any future category added via `withRateLimit("new_cat", ...)` without updating `DEFAULT_RPM` would fall back to 60 RPM and miss the override. Current usage is consistent, just something to keep in mind when adding new categories.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 513 - 524, The env override for OPENAI_RATE_LIMIT_RPM only applies to keys from DEFAULT_RPM, so categories added later via withRateLimit(...) will miss the override; change the logic to apply the RPM override to the limiter's actual category registry instead of Object.keys(DEFAULT_RPM) (or also apply the override inside withRateLimit when new categories are registered). Locate the envRPM handling around limiter.setRPM and DEFAULT_RPM and ensure limiter.setRPM(cat, rpm) is invoked for every registered category (or that withRateLimit calls limiter.setRPM when OPENAI_RATE_LIMIT_RPM is set) so new categories receive the global override.
139-145: Remove unused `currentPromises` map — dead state that adds confusion without value.

The map is set at line 191 and deleted at line 247, but never read. The cleanup logic at line 245 compares against `acquireQueues`, not `currentPromises`. The comment claiming it tracks "current promise for cleanup comparison" is misleading. Removing this map eliminates dead state and clarifies the serialization logic.

Cleanup

Remove the declaration:

```diff
-  /**
-   * Track the current promise for cleanup comparison.
-   * Maps category to the "current" promise (before chaining).
-   */
-  private currentPromises = new Map<string, Promise<void>>();
```

Remove the set operation at line 191:

```diff
   this.acquireQueues.set(category, queued);
-  this.currentPromises.set(category, queued);
```

Remove the delete operation at line 247:

```diff
   if (this.acquireQueues.get(category) === queued) {
     this.acquireQueues.delete(category);
-    this.currentPromises.delete(category);
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 139 - 145, Remove the dead Map currentPromises and its related operations: delete the private declaration currentPromises = new Map<string, Promise<void>>(), remove any assignments to currentPromises.set(...) (the set at where the queue promise is stored), and remove the corresponding currentPromises.delete(...) cleanup; also update or remove the misleading comment that describes tracking "current promise for cleanup comparison" so the code only uses acquireQueues for serialization/cleanup and has no stale references to currentPromises. Ensure no other code reads currentPromises before committing.
306-338: Add a defensive throw after the `for` loop for type safety.

The loop always exits via `return` or `throw`, making the post-loop code unreachable. However, with `noImplicitReturns` enabled in tsconfig.json (note it is a separate flag, not implied by `"strict": true`), adding an explicit throw improves type clarity and guards against future refactors that could break the control-flow invariant.

Suggested addition

```diff
       throw error;
     }
   }
+
+  // Unreachable: every iteration returns or throws. This satisfies the
+  // compiler and guards against future refactors that break the invariant.
+  throw new Error(`[OpenAI:RateLimit] ${category}: max retries exhausted`);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/utils/rate-limiter.ts` around lines 306 - 338, The for-loop that calls limiter.acquire(category) and returns await fn() can theoretically fall through and TypeScript's noImplicitReturns wants an explicit exit; after the loop that handles retries, add a defensive throw (e.g., throw new Error("Unreachable: retry loop exited without returning")) to make the control flow explicit and satisfy strict type checking; place this throw immediately after the closing brace of the for loop that contains limiter.acquire(category), extractRateLimitInfo(error), calculateBackoff(attempt), limiter.recordBackoff(category, backoffMs), and sleep(backoffMs).
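The invariant described above (every iteration returns or throws, plus a defensive throw after the loop) can be sketched in isolation. `calculateBackoff` and `sleep` here are simplified stand-ins with illustrative constants, not the plugin's implementations:

```typescript
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Exponential backoff capped at 10s; base and cap are illustrative.
const calculateBackoff = (attempt: number): number =>
  Math.min(2 ** attempt * 250, 10_000);

async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries: number,
): Promise<T> {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn(); // success exits the loop
    } catch (error) {
      if (attempt === maxRetries) throw error; // retries exhausted
      await sleep(calculateBackoff(attempt));
    }
  }
  // Unreachable: every iteration returns or throws. Satisfies
  // noImplicitReturns and guards against future refactors.
  throw new Error("max retries exhausted");
}
```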
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/banner.ts`:
- Around line 16-17: maskApiKey and keyStatus use different thresholds causing
contradictory output; update keyStatus to derive its value from maskApiKey so
they stay consistent (e.g., call maskApiKey(key) and treat "(not set)" as
falsey) or apply the same minimum-length check (10 chars) used in maskApiKey.
Specifically, modify the keyStatus computation to use the result of
maskApiKey(key) (or reuse the same length check) instead of checking only
truthiness so both maskApiKey and keyStatus produce matching state for keys
shorter than 10 characters.
- Around line 88-92: The row function uses padEnd which doesn't truncate long
strings, causing overflow when value + status exceed COL2_WIDTH; in function
row, truncate the combined rightContent (or truncate value to reserve space for
status) to at most COL2_WIDTH - 1 characters before calling padEnd, e.g.,
compute a maxLen = COL2_WIDTH - 1 and if `${value} ${status}` length > maxLen
slice it to maxLen (or slice value to maxLen - status.length - 1) and then
padEnd so the column never overflows; apply the same truncation logic to
leftContent using COL1_WIDTH - 1 if needed.
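The two banner fixes in this list can be sketched together. The column width, mask format, and status strings below are illustrative assumptions, not the plugin's actual values:

```typescript
const COL2_WIDTH = 46; // assumed column width

// Truncate before padEnd so long values can never overflow the column.
function clampAndPad(value: string, status: string): string {
  const maxLen = COL2_WIDTH - 1;
  const content = `${value} ${status}`;
  return (content.length > maxLen ? content.slice(0, maxLen) : content).padEnd(maxLen);
}

// Single 10-char threshold shared by masking and status reporting.
function maskApiKey(key: string | undefined): string {
  if (!key || key.length < 10) return "(not set)";
  return `${key.slice(0, 3)}...${key.slice(-4)}`;
}

function keyStatus(key: string | undefined): string {
  // Derive status from maskApiKey so the two can never disagree.
  return maskApiKey(key) === "(not set)" ? "MISSING" : "OK";
}
```

Deriving `keyStatus` from `maskApiKey` removes the inconsistent-threshold bug by construction, and the clamp guarantees `padEnd` always produces a fixed-width cell.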
---
Nitpick comments:
In `@src/banner.ts`:
- Around line 94-98: The indentation of billingPrefix is inconsistent (it has
one extra leading space) compared to billingURL and the surrounding code; fix by
aligning billingPrefix's declaration indentation with billingURL and billingRow
so all three consts (billingURL, billingPrefix, billingRow) share the same left
margin, leaving the expressions unchanged (billingPrefix = "| Billing: ",
billingRow using WIDTH) to restore consistent formatting.
- Around line 57-59: Remove the unreachable fallback after getSetting: the
expression assigning embeddingModel uses getSetting(runtime,
"OPENAI_EMBEDDING_MODEL", "text-embedding-3-small") which already returns the
default, so drop the trailing ?? "text-embedding-3-small"; update the
embeddingModel assignment (around the embeddingModel symbol) to use only the
getSetting call and ensure no other code relies on the redundant nullish
coalescing.
- Around line 100-111: Replace the direct console.log calls in the banner
rendering with the shared logger so output respects configured log
levels/transports: import the existing logger from '@elizaos/core' if not
already present, then change each console.log(...) call (the ones printing line,
title, divider, row(...), billingRow, and the trailing blank line) to
logger.info(...) so the banner (variables/functions: line, title, divider, row,
billingRow) is emitted through the centralized logger while preserving the same
string formatting and order.
In `@src/models/embedding.ts`:
- Line 165: Replace the current success-level logging call logger.log(`Got valid
embedding with length ${embedding.length}`) with a debug-level call
(logger.debug) so embedding success messages don't clutter info logs; locate the
usage in the function that validates/returns embeddings (the line using
embedding.length) and make it consistent with other success paths that use
logger.debug.
- Around line 91-95: The inline comment in embedding.ts references a placeholder
issue URL "issues/XXX"; update the comment near the ensureEmbeddingDimension()
probe and the block discussing short-circuiting (the comment that starts "NOTE:
We do NOT short-circuit...") by either replacing "issues/XXX" with the correct
issue/PR number or removing the URL entirely so it doesn't confuse future
readers—ensure the rest of the explanatory text remains intact and that the
comment still mentions that the probe sends null and not a TextEmbeddingParams
object to keep context for ensureEmbeddingDimension().
In `@src/utils/rate-limiter.ts`:
- Around line 513-524: The env override for OPENAI_RATE_LIMIT_RPM only applies
to keys from DEFAULT_RPM, so categories added later via withRateLimit(...) will
miss the override; change the logic to apply the RPM override to the limiter's
actual category registry instead of Object.keys(DEFAULT_RPM) (or also apply the
override inside withRateLimit when new categories are registered). Locate the
envRPM handling around limiter.setRPM and DEFAULT_RPM and ensure
limiter.setRPM(cat, rpm) is invoked for every registered category (or that
withRateLimit calls limiter.setRPM when OPENAI_RATE_LIMIT_RPM is set) so new
categories receive the global override.
- Around line 139-145: Remove the dead Map currentPromises and its related
operations: delete the private declaration currentPromises = new Map<string,
Promise<void>>(), remove any assignments to currentPromises.set(...) (the set at
where the queue promise is stored), and remove the corresponding
currentPromises.delete(...) cleanup; also update or remove the misleading
comment that describes tracking "current promise for cleanup comparison" so the
code only uses acquireQueues for serialization/cleanup and has no stale
references to currentPromises. Ensure no other code reads currentPromises before
committing.
- Around line 306-338: The for-loop that calls limiter.acquire(category) and
returns await fn() can theoretically fall through and TypeScript's
noImplicitReturns wants an explicit exit; after the loop that handles retries,
add a defensive throw (e.g., throw new Error("Unreachable: retry loop exited
without returning")) to make the control flow explicit and satisfy strict type
checking; place this throw immediately after the closing brace of the for loop
that contains limiter.acquire(category), extractRateLimitInfo(error),
calculateBackoff(attempt), limiter.recordBackoff(category, backoffMs), and
sleep(backoffMs).
Iteration 1 prr-fix:prrc_kwdon0mv_86ov2j0 prr-fix:prrc_kwdon0mv_86ov2jy
Changes: - banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** <!-- DE... - rate-limiter.ts: ### Promise queue cleanup comparison is always false **Low Severity** <!-- ... - rate-limiter.ts: ### Redundant sleep before acquire causes double-wait on retry **Low Severit... - rate-limiter.ts: ### Unreachable throw after for-loop in withRateLimit **Low Severity** <!--... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`maskApiKey` and `keyStatus` use inconsi... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`row` overflows for long values — `padEn...
Changes: - banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** <!-- DE... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`maskApiKey` and `keyStatus` use inconsi... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`row` overflows for long values — `padEn...
Changes: - banner.ts: ### Banner title and billing rows off-by-one width **Low Severity** <!-- DE... - rate-limiter.ts: ### Promise queue cleanup comparison is always false **Low Severity** <!-- ... - rate-limiter.ts: ### Redundant sleep before acquire causes double-wait on retry **Low Severit... - rate-limiter.ts: ### Unreachable throw after for-loop in withRateLimit **Low Severity** <!--... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`maskApiKey` and `keyStatus` use inconsi... - banner.ts: _⚠️ Potential issue_ | _🟡 Minor_ **`row` overflows for long values — `padEn...
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
- Fix for src/utils/rate-limiter.ts: the code at lines 296-306 already has a proper fix that checks for NaN and handles zero values correctly.
- Fix for src/utils/rate-limiter.ts:248: the tool modified the wrong files (src/banner.ts, src/models/embedding.ts); it needed to modify src/utils/rate-limiter.ts.
- Fix for src/utils/rate-limiter.ts:248: the cleanup condition must account for map updates during concurrent calls; identity checks fail when other acquire() calls mutate the map before cleanup runs.
- Fix for src/utils/rate-limiter.ts:248: the cleanup condition compares a stale promise reference against a freshly created derived promise, so the two are guaranteed to be different objects on every subsequent call.
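The last two rows describe the same root cause. A minimal sketch (all names hypothetical, not the plugin's actual code) of why comparing against a freshly derived promise can never match, and the store-and-compare fix:

```typescript
// Per-key promise queues, as a serialization primitive.
const queues = new Map<string, Promise<void>>();

function acquireBuggy(key: string): Promise<void> {
  const prev = queues.get(key) ?? Promise.resolve();
  const next = prev.then(() => {});
  queues.set(key, next);
  next.finally(() => {
    // BUG: `prev.then(...)` builds a brand-new promise object here, so this
    // identity check is always false and the map entry is never cleaned up.
    if (queues.get(key) === prev.then(() => {})) queues.delete(key);
  });
  return next;
}

function acquireFixed(key: string): Promise<void> {
  const prev = queues.get(key) ?? Promise.resolve();
  const next = prev.then(() => {});
  queues.set(key, next);
  next.finally(() => {
    // FIX: compare against the same `next` reference that was stored; if no
    // newer acquire() has replaced it, this entry is safe to drop.
    if (queues.get(key) === next) queues.delete(key);
  });
  return next;
}
```

Note the fixed variant still honors the concern in the row above: if a concurrent `acquire()` replaced the stored promise, the identity check fails and the newer entry is left alone.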
Auto-generated PR review tool state committed to repo
Low Severity
The .prr/lessons.md file contains auto-generated debugging output from the prr review tool — fix attempt logs, RESULT: ALREADY_FIXED markers, and stale line-number references. While the tool suggests committing it, the content is process-specific noise (e.g., "tool modified wrong files", references to now-outdated line numbers) that provides no value to repository readers and clutters the codebase.


… types
Add rate-limiter utility with per-endpoint backoff and retry logic. Improve embedding handler with better error handling and fallback support. Refactor init, audio, image, object, and text model handlers for @elizaos/core API compatibility. Expand types and add banner. Update README with comprehensive documentation.
Note
Medium Risk
Touches all OpenAI request paths and introduces coordinated retry/backoff behavior, so any bug could impact latency or error handling under load; changes are well-scoped but affect core runtime interactions (streaming, init, request execution).
Overview
Adds a module-level OpenAI rate limiter (`src/utils/rate-limiter.ts`) with per-category sliding-window RPM throttling, coordinated backoff/retry on 429s, and fast-fail quota/billing detection (distinguishing `QuotaExceededError` from transient `RateLimitError`), plus one-shot tier logging from rate-limit headers.

Applies rate limiting across model handlers: raw `fetch` paths now `await throwIfRateLimited()` and are wrapped in `withRateLimit()` (embeddings/images/audio), Vercel AI SDK calls are wrapped for non-streaming (`text.ts`, `object.ts`), and streaming text is throttled via `acquireRateLimit()` only.

Simplifies plugin setup by removing the eager `GET /models` validation call in `init.ts` (replaced with sync config checks plus a new startup config banner in `src/banner.ts`), DRYs the `models` registration in `src/index.ts` with a type assertion for older core typings, and adds local streaming type shims (`StreamingTextParams`, `TextStreamResult`) for `@elizaos/core` compatibility. Documentation is expanded (new `CHANGELOG.md`, major README updates), defaults are updated (`OPENAI_IMAGE_DESCRIPTION_MODEL`), and `@elizaos/core` is moved to a peer dependency.

Written by Cursor Bugbot for commit 5e06d42. This will update automatically on new commits. Configure here.
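The per-category sliding-window RPM throttling described above can be sketched as follows. Category names come from the PR; the limits, map layout, and `tryAcquire` helper are illustrative assumptions, not the plugin's actual code:

```typescript
type Category = "chat" | "embeddings" | "images" | "audio";

const WINDOW_MS = 60_000; // one-minute sliding window

// Illustrative RPM caps; the real plugin derives these from config/tier.
const rpmLimits: Record<Category, number> = {
  chat: 500,
  embeddings: 3000,
  images: 50,
  audio: 50,
};

// Per-category timestamps of requests admitted within the last window.
const windows = new Map<Category, number[]>();

function tryAcquire(category: Category, now = Date.now()): boolean {
  const stamps = windows.get(category) ?? [];
  // Drop timestamps that have slid out of the 60-second window.
  const live = stamps.filter((t) => now - t < WINDOW_MS);
  if (live.length >= rpmLimits[category]) {
    windows.set(category, live);
    return false; // caller waits until the oldest stamp expires
  }
  live.push(now);
  windows.set(category, live);
  return true;
}
```

A `withRateLimit()`-style wrapper would loop on `tryAcquire`, sleeping until the oldest timestamp leaves the window before retrying.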
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Documentation
Greptile Overview
Greptile Summary
This PR adds a comprehensive rate limiting system to prevent 429 errors from OpenAI's API and improves model handlers with better error handling and caching.
Major Changes:
- `@elizaos/core` versions

Technical Highlights:
Confidence Score: 4/5
- `retry-after` header parsing that could pass NaN to backoff logic. All other code is well-structured with proper error handling, caching strategies, and type safety.
- `src/utils/rate-limiter.ts:338`: the `retry-after` header parsing needs the parentheses fix to handle invalid header values correctly.

Important Files Changed
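A hedged sketch of the flagged fix, assuming the intent is to reject NaN and negative values before trusting the server-provided delay; the function name, cap, and fallback schedule are illustrative, not the code at line 338:

```typescript
function retryDelayMs(retryAfterHeader: string | null, attempt: number): number {
  if (retryAfterHeader !== null) {
    const seconds = Number.parseFloat(retryAfterHeader);
    // Guard: parseFloat returns NaN for junk like "later"; also reject
    // negative values so a bad header never reaches the sleep calculation.
    if (Number.isFinite(seconds) && seconds >= 0) {
      return Math.round(seconds * 1000);
    }
  }
  // Fallback: exponential backoff, capped at 30 seconds.
  return Math.min(1000 * 2 ** attempt, 30_000);
}
```

With this shape, a malformed `retry-after` value silently falls through to the exponential schedule instead of producing a NaN sleep duration.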
Flowchart
flowchart TD
    Start[API Call Request] --> CheckCategory{Determine Category<br/>chat/embeddings/images/audio}
    CheckCategory --> AcquireSlot[Acquire Rate Limit Slot]
    AcquireSlot --> CheckBackoff{Active Backoff?}
    CheckBackoff -->|Yes| WaitBackoff[Wait for Backoff]
    CheckBackoff -->|No| CheckWindow{Within RPM Limit?}
    WaitBackoff --> CheckWindow
    CheckWindow -->|No| WaitWindow[Wait for Window Slot]
    CheckWindow -->|Yes| RecordRequest[Record Request Timestamp]
    WaitWindow --> RecordRequest
    RecordRequest --> MakeCall[Execute API Call]
    MakeCall --> CheckResponse{Response Status}
    CheckResponse -->|200 OK| Success[Return Result]
    CheckResponse -->|429 Quota| QuotaError[Throw QuotaExceededError<br/>No Retry]
    CheckResponse -->|429 Rate Limit| RateError{Retries Left?}
    CheckResponse -->|Other Error| OtherError[Throw Error]
    RateError -->|Yes| RecordBackoff[Record Backoff Deadline<br/>Exponential + Jitter]
    RateError -->|No| MaxRetries[Throw Max Retries Error]
    RecordBackoff --> Sleep[Sleep with Backoff]
    Sleep --> AcquireSlot
    Success --> End[Complete]
    QuotaError --> End
    OtherError --> End
    MaxRetries --> End

Last reviewed commit: 7542c67
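The 429 branch in the chart splits billing-quota errors from transient rate limits. A minimal classification sketch, assuming OpenAI's documented error envelope (`code: "insufficient_quota"` for billing failures); the interface and function names are illustrative:

```typescript
interface ApiError {
  status: number;
  code?: string;
  message?: string;
}

function classify429(err: ApiError): "quota" | "rate" | "other" {
  if (err.status !== 429) return "other";
  if (
    err.code === "insufficient_quota" ||
    /quota|billing/i.test(err.message ?? "")
  ) {
    // Fail fast: retrying cannot succeed until the account's billing changes.
    return "quota";
  }
  // Transient rate limit: eligible for retry with exponential backoff.
  return "rate";
}
```

Only the `"rate"` branch should feed back into the backoff/retry loop above; `"quota"` maps to a thrown `QuotaExceededError` with no retry.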