
feat: Add OpenAI Batch API support for cost-optimized translations #3458

Draft
gabrielshanahan wants to merge 7 commits into main from gabrielshanahan/openai-batch-api

Conversation


gabrielshanahan (Contributor) commented on Feb 6, 2026

Summary

  • Adds OpenAI Batch API support for cost-optimized bulk translations, offering ~50% cost reduction compared to synchronous API calls
  • Implements a two-phase execution model: submission builds JSONL and uploads to OpenAI, then a poller monitors completion and re-queues chunks for result application -- all without blocking worker coroutines
  • Introduces a new WAITING_FOR_EXTERNAL chunk execution status and OpenAiBatchJobTracker entity to track the full lifecycle of external batch jobs with optimistic locking
  • Provides frontend batch mode selection with discount display, progress indicators for batch processing, and in-app/email notifications on completion
  • Structured as 7 independently deployable vertical slices across 80 files (7,045 lines added)

Commits

1. feat: add batch mode submission (df56a15b2)

Overview: Introduces the full OpenAI Batch API submission pipeline -- entity model, database migration, JSONL builder, validation service, and framework integration via the new WAITING_FOR_EXTERNAL execution status.

Implementation Flow: When MachineTranslationChunkProcessor detects useBatchApi=true, it delegates to BatchApiSubmissionService, which builds a JSONL file (encoding {jobId}:{keyId}:{languageId} in each custom_id), uploads it via the OpenAI Files API, creates a batch via the Batches API, persists an OpenAiBatchJobTracker, and throws WaitingForExternalException. The ChunkProcessingUtil catches this and sets the chunk to WAITING_FOR_EXTERNAL status, freeing the worker coroutine. Validation enforces provider type (OpenAI only), batch enablement, item count limits, and per-org/global concurrency caps.
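
To make the submission step concrete, here is a minimal Kotlin sketch of how a single JSONL request line could be assembled with the {jobId}:{keyId}:{languageId} custom_id encoding described above. The data class and helper are illustrative assumptions rather than the PR's actual types; only the request fields (custom_id, method, url, body) follow the public OpenAI Batch API format.

```kotlin
// Hypothetical helper showing how one JSONL request line could be assembled.
// Only the request fields (custom_id, method, url, body) follow the public
// OpenAI Batch API format; the surrounding types are illustrative.
import com.fasterxml.jackson.databind.ObjectMapper

data class BatchRequestLine(
  val jobId: Long,
  val keyId: Long,
  val languageId: Long,
  val model: String,
  val prompt: String,
)

fun toJsonlLine(line: BatchRequestLine, mapper: ObjectMapper = ObjectMapper()): String {
  val request = mapOf(
    // custom_id carries enough context to route the result back to the right
    // translation on the second pass: {jobId}:{keyId}:{languageId}
    "custom_id" to "${line.jobId}:${line.keyId}:${line.languageId}",
    "method" to "POST",
    "url" to "/v1/chat/completions",
    "body" to mapOf(
      "model" to line.model,
      "messages" to listOf(mapOf("role" to "user", "content" to line.prompt)),
    ),
  )
  // One JSON object per line -- the uploaded file is newline-delimited JSON.
  return mapper.writeValueAsString(request)
}
```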

Key Files:

  • ee/.../service/batch/BatchApiSubmissionService.kt -- Builds JSONL from Tolgee's prompt pipeline and submits to OpenAI
  • backend/.../batch/WaitingForExternalException.kt -- New exception type that signals the framework to pause (not fail) the chunk
  • backend/.../model/batch/OpenAiBatchJobTracker.kt -- JPA entity tracking external batch lifecycle with optimistic locking
  • ee/.../service/batch/BatchApiValidationService.kt -- Validates provider type, concurrency limits, and item counts
  • ee/.../service/OpenAiBatchApiServiceImpl.kt -- REST client for OpenAI Files API and Batches API with fake delegate support for testing

2. feat: add batch polling and result application (7bcf201b3)

Overview: Adds the asynchronous polling loop that monitors pending OpenAI batches and the result applier that writes completed translations back to the database.

Implementation Flow: OpenAiBatchPoller registers via SchedulingManager on application startup, polling at a configurable interval (default 60s). It uses LockingProvider for multi-instance safety. Each tracker is polled in its own transaction. When a batch completes, the poller downloads the JSONL output, parses results into OpenAiBatchResult objects, stores them on the tracker, and re-queues the chunk execution as PENDING. On the second pass, MachineTranslationChunkProcessor detects the existing tracker and delegates to BatchApiResultApplier, which reads parsed results and calls TranslationService.setTranslationText() for each. Progress is reported via WebSocket with a new batchApiPhase field.
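
The polling pass might look roughly like the sketch below. All collaborator interfaces here (TrackerRepository, OpenAiBatchClient, ChunkRequeuer, DistributedLock) are hypothetical stand-ins for the real services named in this commit; only the control flow mirrors the description above.

```kotlin
// A minimal sketch of the polling pass, assuming hypothetical collaborator
// interfaces in place of the real services introduced in this commit.
class OpenAiBatchPollerSketch(
  private val trackers: TrackerRepository,
  private val openAi: OpenAiBatchClient,
  private val requeuer: ChunkRequeuer,
  private val lock: DistributedLock,
) {
  // Runs on a fixed interval (60s by default in this PR), one instance at a time.
  fun pollOnce() = lock.withLock("openai-batch-poller") {
    for (tracker in trackers.findPending()) {
      // The real code polls each tracker in its own transaction.
      val batch = openAi.getBatch(tracker.openAiBatchId)
      when (batch.status) {
        "completed" -> batch.outputFileId?.let { fileId ->
          // Download output, store it on the tracker, and re-queue the chunk
          // as PENDING so the second pass can apply the results.
          val outputLines = openAi.downloadOutputJsonl(fileId)
          trackers.storeResults(tracker, outputLines)
          requeuer.requeueAsPending(tracker.chunkExecutionId)
        }
        "failed", "expired", "cancelled" -> trackers.markFailed(tracker, batch.status)
        else -> Unit // still in progress; check again on the next interval
      }
    }
  }
}

// Collaborators are sketched as interfaces so the flow above is self-contained.
interface TrackerRepository {
  fun findPending(): List<Tracker>
  fun storeResults(tracker: Tracker, outputLines: List<String>)
  fun markFailed(tracker: Tracker, status: String)
}
interface OpenAiBatchClient {
  fun getBatch(id: String): BatchStatus
  fun downloadOutputJsonl(fileId: String): List<String>
}
interface ChunkRequeuer { fun requeueAsPending(chunkExecutionId: Long) }
interface DistributedLock { fun <T> withLock(name: String, block: () -> T): T }
data class Tracker(val openAiBatchId: String, val chunkExecutionId: Long)
data class BatchStatus(val status: String, val outputFileId: String?)
```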

Key Files:

  • ee/.../service/batch/OpenAiBatchPoller.kt -- Scheduled poller with distributed locking, timeout handling, and file cleanup
  • ee/.../service/batch/BatchApiResultApplier.kt -- Applies parsed results as translations with graceful handling of deleted keys/languages
  • ee/.../unit/batch/BatchApiTranslationIntegrationTest.kt -- End-to-end integration test for the full submit-poll-apply flow
  • backend/.../batch/ProgressManager.kt -- Extended with reportExternalProgress() for batch API phase updates via WebSocket

3. feat: add batch mode UI selection with discount display (67145eeda)

Overview: Adds the frontend batch mode selector component, LLM provider batch API configuration UI, and a new backend endpoint that returns batch availability and discount information.

Implementation Flow: The BatchTranslateInfoController exposes a GET /v2/projects/{projectId}/batch-translate-info endpoint that queries the organization's LLM providers for batch API availability and calculates the discount percentage. The frontend BatchModeSelector component renders an instant/batch radio toggle in the machine translation dialog, showing the discount (e.g., "Batch (50% cheaper)"). The BatchIndicator component shows a clock icon with "Batch processing..." during the WAITING_FOR_OPENAI phase. The LLM provider form gains a new "Batch API" section with a toggle switch.
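
For reference, a simplified sketch of what the endpoint could return. The route matches the description above; the response fields, provider lookup, and wiring are assumptions, not the exact implementation.

```kotlin
// Illustrative shape of the endpoint and its response model. The route matches
// the description above; the fields, lookup, and wiring are assumptions.
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.PathVariable
import org.springframework.web.bind.annotation.RestController

data class BatchTranslateInfoModel(
  val batchApiAvailable: Boolean, // true when an org LLM provider has batch enabled
  val discountPercent: Int,       // e.g. 50, rendered as "Batch (50% cheaper)"
)

@RestController
class BatchTranslateInfoControllerSketch(
  private val hasBatchEnabledProvider: (projectId: Long) -> Boolean, // hypothetical lookup
) {
  @GetMapping("/v2/projects/{projectId}/batch-translate-info")
  fun getInfo(@PathVariable projectId: Long): BatchTranslateInfoModel {
    val available = hasBatchEnabledProvider(projectId)
    // OpenAI's Batch API is priced at roughly half the synchronous rate,
    // which is where the 50% figure in the selector comes from.
    return BatchTranslateInfoModel(
      batchApiAvailable = available,
      discountPercent = if (available) 50 else 0,
    )
  }
}
```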

Key Files:

  • webapp/.../components/BatchModeSelector.tsx -- Radio toggle for instant vs. batch mode with discount display
  • webapp/.../LlmProviderEdit/BatchApiSettings.tsx -- Batch API enable/disable toggle in LLM provider configuration form
  • ee/.../controllers/BatchTranslateInfoController.kt -- REST endpoint returning batch availability and discount percentage
  • webapp/.../OperationsSummary/BatchIndicator.tsx -- Shows batch processing status with clock icon during async phase
  • webapp/.../OperationMachineTranslate.tsx -- Integrates batch mode selector into the machine translation dialog

4. feat: add batch cancellation and error recovery (95a6ecbea)

Overview: Adds the ability to cancel in-flight OpenAI batch jobs and extends the batch job cancellation manager to handle WAITING_FOR_EXTERNAL chunk executions.

Implementation Flow: OpenAiBatchCancellationHandler implements the BatchApiCancellationHandler interface. When a user cancels a batch job, the BatchJobCancellationManager detects chunks in WAITING_FOR_EXTERNAL status and delegates to the handler, which sends a POST /v1/batches/{id}/cancel request to OpenAI and marks the tracker as CANCELLED. The handler gracefully handles cases where the provider or API key is unavailable.
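
A rough sketch of that cancellation path follows. The handler contract and collaborator types are assumptions standing in for the real interfaces, and remote cancellation is shown as best-effort, matching the graceful-degradation behavior described above.

```kotlin
// Rough sketch of the cancellation path; the handler contract and collaborator
// types below are assumptions, not the PR's exact interfaces.
interface BatchApiCancellationHandlerSketch {
  fun cancel(chunkExecutionId: Long)
}

class OpenAiBatchCancellationHandlerSketch(
  private val trackers: TrackerLookup,
  private val openAi: OpenAiCancelClient,
) : BatchApiCancellationHandlerSketch {
  override fun cancel(chunkExecutionId: Long) {
    val tracker = trackers.findByChunkExecution(chunkExecutionId) ?: return
    try {
      // POST /v1/batches/{id}/cancel on the OpenAI API
      openAi.cancelBatch(tracker.openAiBatchId)
    } catch (e: Exception) {
      // A missing provider/API key or an already-finished batch should not
      // block local cleanup; remote cancellation is best-effort.
    }
    trackers.markCancelled(tracker)
  }
}

interface TrackerLookup {
  fun findByChunkExecution(id: Long): TrackerRef?
  fun markCancelled(tracker: TrackerRef)
}
interface OpenAiCancelClient { fun cancelBatch(batchId: String) }
data class TrackerRef(val openAiBatchId: String)
```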

Key Files:

  • ee/.../service/batch/OpenAiBatchCancellationHandler.kt -- Sends cancel requests to OpenAI and updates tracker status
  • backend/.../batch/BatchJobCancellationManager.kt -- Extended to handle WAITING_FOR_EXTERNAL executions via the handler interface
  • ee/.../controllers/batch/BatchApiErrorHandlingTest.kt -- Integration tests for cancellation and error recovery

5. feat: add batch notifications (2de1f0655)

Overview: Adds in-app and email notifications when batch translation jobs complete, fail, or are cancelled.

Implementation Flow: BatchJobNotificationListener listens for OnBatchJobSucceeded, OnBatchJobFailed, and OnBatchJobCancelled Spring events. It creates a BATCH_JOB_FINISHED notification linked to the batch job and project. The BatchJobEmailComposer generates email content and the frontend BatchJobFinishedItem renders the notification with job type, status, and progress.
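
As a rough illustration of the wiring, a simplified listener sketch follows. The real PR listens to OnBatchJobSucceeded, OnBatchJobFailed, and OnBatchJobCancelled; the flattened event and notification sink below are placeholders to keep the sketch self-contained.

```kotlin
// Simplified listener wiring. The real PR listens to OnBatchJobSucceeded,
// OnBatchJobFailed, and OnBatchJobCancelled; the event and sink types below
// are placeholders to keep the sketch self-contained.
import org.springframework.context.event.EventListener
import org.springframework.stereotype.Component

// Hypothetical flattened event carrying just what the notification needs.
data class BatchJobFinishedEvent(val jobId: Long, val projectId: Long, val status: String)

@Component
class BatchJobNotificationListenerSketch(
  private val notifications: NotificationSink, // assumed facade over in-app + email
) {
  @EventListener
  fun onBatchJobFinished(event: BatchJobFinishedEvent) {
    // One BATCH_JOB_FINISHED notification per job, linked to job and project;
    // the email composer and frontend item render from the same record.
    notifications.send(
      type = "BATCH_JOB_FINISHED",
      jobId = event.jobId,
      projectId = event.projectId,
      status = event.status,
    )
  }
}

interface NotificationSink {
  fun send(type: String, jobId: Long, projectId: Long, status: String)
}
```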

Key Files:

  • ee/.../component/BatchJobNotificationListener.kt -- Transactional event listener for batch job completion events
  • backend/.../notification/BatchJobEmailComposer.kt -- Composes email content for batch job notifications
  • webapp/.../Notifications/BatchJobFinishedItem.tsx -- Renders batch job notification with type, status, and progress

6. feat: add batch health monitoring (b2fe44200)

Overview: Adds a Spring Boot health indicator for the batch API poller, enabling monitoring via /actuator/health.

Implementation Flow: BatchPollerHealthIndicator reports the number of active batch trackers (in SUBMITTED or IN_PROGRESS status) alongside the configured global concurrency limit, allowing ops teams to monitor batch API utilization and detect stuck jobs.
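
A minimal sketch of such an indicator using Spring Boot's HealthIndicator contract; the tracker-count supplier and limit property are assumed stand-ins for the real repository query and configuration.

```kotlin
// Sketch of the health indicator: reports active tracker count against the
// configured global concurrency cap. The count supplier and limit property
// are assumptions standing in for the real repository and config.
import org.springframework.boot.actuate.health.Health
import org.springframework.boot.actuate.health.HealthIndicator

class BatchPollerHealthIndicatorSketch(
  private val activeTrackerCount: () -> Long, // trackers in SUBMITTED or IN_PROGRESS
  private val globalConcurrencyLimit: Int,
) : HealthIndicator {
  override fun health(): Health =
    Health.up()
      .withDetail("activeBatchTrackers", activeTrackerCount())
      .withDetail("globalConcurrencyLimit", globalConcurrencyLimit)
      .build()
}
```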

Key Files:

  • backend/.../health/BatchPollerHealthIndicator.kt -- Health indicator reporting active tracker count and concurrency limits

7. docs: add batch API documentation (8087d9093)

Overview: Adds comprehensive documentation covering architecture, configuration, state machine, monitoring, and operational runbook.

Key Files:

  • docs/batch-api/README.md -- Full architecture, configuration, monitoring, and ops runbook documentation

Screenshots

Screenshots captured from local testing with fake batch API (mock server):

LLM Provider Configuration

Server-configured "openai-batch" provider with batch API enabled, and the LLM provider creation form showing the new "Batch API" toggle section.

Batch Mode Selection

When selecting "Machine translation" batch operation, users see a mode selector with "Instant" (default) and "Batch" radio buttons. Batch mode shows the discount percentage.

Batch Processing Flow

After submission, a progress dialog shows "Translating..." with a minimizable view. The batch indicator shows "Batch processing - safe to close" in the translation toolbar. On completion, a "Done" indicator appears with a "Translations might be outdated" refresh prompt.

Batch Cancellation

Clicking "ABORT" during batch processing sends a cancellation request to OpenAI. The dialog shows "Canceled" status on completion.

Test plan

  • Unit tests pass (60 tests across 8 test classes)
  • Backend compiles with each vertical slice commit independently
  • Batch mode selection UI shows the discount percentage when a batch-enabled provider is configured
  • Batch submission creates OpenAI batch job and shows progress indicator
  • Batch completion applies translations and shows "Done" indicator
  • Batch cancellation sends cancel request and marks job as cancelled
  • Notifications are sent when batch jobs complete
  • Health indicator reports batch poller status at /actuator/health
  • Existing non-batch translation flow is unaffected

🤖 Generated with Claude Code


coderabbitai bot commented Feb 6, 2026

Important: Review skipped (draft detected).

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

Gabriel Shanahan and others added 7 commits on February 6, 2026 at 10:39
Add OpenAI Batch API submission pipeline: entity model, migration,
JSONL builder, validation, and submission service with framework
integration for WAITING_FOR_EXTERNAL status.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAI Batch API poller, result applier, and progress tracking.
Include WebSocket support for batch API phase updates and integration
tests for the full batch translation flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add frontend batch mode selector, LLM provider batch API settings,
batch translate info endpoint, and discount calculation for batch
vs sync pricing comparison.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAI batch cancellation handler, extend cancellation manager
to handle WAITING_FOR_EXTERNAL executions, and add error handling
integration tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add in-app and email notifications for batch job completion,
including notification model extensions, email composer, and
frontend notification components.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Spring Boot health indicator for batch API poller status,
enabling operational monitoring via /actuator/health endpoint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive documentation covering architecture, configuration,
monitoring, and operational runbook for the OpenAI Batch API feature.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
gabrielshanahan force-pushed the gabrielshanahan/openai-batch-api branch from 82d7f77 to 8087d90 on February 6, 2026 at 09:54
