feat: Add OpenAI Batch API support for cost-optimized translations #3458
Draft
gabrielshanahan wants to merge 7 commits into main
Add OpenAI Batch API submission pipeline: entity model, migration, JSONL builder, validation, and submission service with framework integration for WAITING_FOR_EXTERNAL status. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAI Batch API poller, result applier, and progress tracking. Include WebSocket support for batch API phase updates and integration tests for the full batch translation flow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add frontend batch mode selector, LLM provider batch API settings, batch translate info endpoint, and discount calculation for batch vs sync pricing comparison. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAI batch cancellation handler, extend cancellation manager to handle WAITING_FOR_EXTERNAL executions, and add error handling integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add in-app and email notifications for batch job completion, including notification model extensions, email composer, and frontend notification components. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Spring Boot health indicator for batch API poller status, enabling operational monitoring via /actuator/health endpoint. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive documentation covering architecture, configuration, monitoring, and operational runbook for the OpenAI Batch API feature. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
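The JSONL builder from the first commit is described in this PR as encoding `{jobId}:{keyId}:{languageId}` in each `custom_id`. The following is a minimal TypeScript sketch of that round trip, not the PR's Kotlin implementation. The type and function names are hypothetical, and targeting `/v1/chat/completions` with a single user message is an assumption about how the prompt is sent.

```typescript
// Hypothetical item shape; field names are illustrative, not taken from the PR.
interface TranslationItem {
  jobId: number;
  keyId: number;
  languageId: number;
  prompt: string;
}

interface CustomIdParts {
  jobId: number;
  keyId: number;
  languageId: number;
}

// Encode the identifiers as `{jobId}:{keyId}:{languageId}`.
function encodeCustomId(item: CustomIdParts): string {
  return `${item.jobId}:${item.keyId}:${item.languageId}`;
}

// Decode a custom_id coming back in the batch output.
// Assumes all three IDs are non-negative integers.
function decodeCustomId(customId: string): CustomIdParts {
  const match = /^(\d+):(\d+):(\d+)$/.exec(customId);
  if (match === null) {
    throw new Error(`Malformed custom_id: ${customId}`);
  }
  return {
    jobId: Number(match[1]),
    keyId: Number(match[2]),
    languageId: Number(match[3]),
  };
}

// Build one line of the batch input file in the OpenAI Batch API request format.
// The endpoint and message layout are assumptions for this sketch.
function buildJsonlLine(item: TranslationItem, model: string): string {
  return JSON.stringify({
    custom_id: encodeCustomId(item),
    method: "POST",
    url: "/v1/chat/completions",
    body: { model, messages: [{ role: "user", content: item.prompt }] },
  });
}

// One JSON object per line, newline separated.
function buildJsonl(items: TranslationItem[], model: string): string {
  return items.map((item) => buildJsonlLine(item, model)).join("\n");
}
```

Keeping all three identifiers in `custom_id` means each output line can be matched back to a key and language without a separate lookup table.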
Summary
- `WAITING_FOR_EXTERNAL` chunk execution status and `OpenAiBatchJobTracker` entity to track the full lifecycle of external batch jobs with optimistic locking

Commits
1. feat: add batch mode submission (`df56a15b2`)

Overview: Introduces the full OpenAI Batch API submission pipeline: entity model, database migration, JSONL builder, validation service, and framework integration via the new `WAITING_FOR_EXTERNAL` execution status.

Implementation Flow: When `MachineTranslationChunkProcessor` detects `useBatchApi=true`, it delegates to `BatchApiSubmissionService`, which builds a JSONL file (encoding `{jobId}:{keyId}:{languageId}` in each `custom_id`), uploads it via the OpenAI Files API, creates a batch via the Batches API, persists an `OpenAiBatchJobTracker`, and throws `WaitingForExternalException`. `ChunkProcessingUtil` catches this and sets the chunk to `WAITING_FOR_EXTERNAL` status, freeing the worker coroutine. Validation enforces provider type (OpenAI only), batch enablement, item count limits, and per-org/global concurrency caps.

Key Files:
- `ee/.../service/batch/BatchApiSubmissionService.kt` -- Builds JSONL from Tolgee's prompt pipeline and submits to OpenAI
- `backend/.../batch/WaitingForExternalException.kt` -- New exception type that signals the framework to pause (not fail) the chunk
- `backend/.../model/batch/OpenAiBatchJobTracker.kt` -- JPA entity tracking external batch lifecycle with optimistic locking
- `ee/.../service/batch/BatchApiValidationService.kt` -- Validates provider type, concurrency limits, and item counts
- `ee/.../service/OpenAiBatchApiServiceImpl.kt` -- REST client for OpenAI Files API and Batches API with fake delegate support for testing

2. feat: add batch polling and result application (
`7bcf201b3`)

Overview: Adds the asynchronous polling loop that monitors pending OpenAI batches and the result applier that writes completed translations back to the database.

Implementation Flow: `OpenAiBatchPoller` registers via `SchedulingManager` on application startup, polling at a configurable interval (default 60s). It uses `LockingProvider` for multi-instance safety. Each tracker is polled in its own transaction. When a batch completes, the poller downloads the JSONL output, parses results into `OpenAiBatchResult` objects, stores them on the tracker, and re-queues the chunk execution as `PENDING`. On the second pass, `MachineTranslationChunkProcessor` detects the existing tracker and delegates to `BatchApiResultApplier`, which reads parsed results and calls `TranslationService.setTranslationText()` for each. Progress is reported via WebSocket with a new `batchApiPhase` field.

Key Files:
- `ee/.../service/batch/OpenAiBatchPoller.kt` -- Scheduled poller with distributed locking, timeout handling, and file cleanup
- `ee/.../service/batch/BatchApiResultApplier.kt` -- Applies parsed results as translations with graceful handling of deleted keys/languages
- `ee/.../unit/batch/BatchApiTranslationIntegrationTest.kt` -- End-to-end integration test for the full submit-poll-apply flow
- `backend/.../batch/ProgressManager.kt` -- Extended with `reportExternalProgress()` for batch API phase updates via WebSocket

3. feat: add batch mode UI selection with discount display (
`67145eeda`)

Overview: Adds the frontend batch mode selector component, LLM provider batch API configuration UI, and a new backend endpoint that returns batch availability and discount information.

Implementation Flow: The `BatchTranslateInfoController` exposes a `GET /v2/projects/{projectId}/batch-translate-info` endpoint that queries the organization's LLM providers for batch API availability and calculates the discount percentage. The frontend `BatchModeSelector` component renders an instant/batch radio toggle in the machine translation dialog, showing the discount (e.g., "Batch (50% cheaper)"). The `BatchIndicator` component shows a clock icon with "Batch processing..." during the `WAITING_FOR_OPENAI` phase. The LLM provider form gains a new "Batch API" section with a toggle switch.

Key Files:
- `webapp/.../components/BatchModeSelector.tsx` -- Radio toggle for instant vs. batch mode with discount display
- `webapp/.../LlmProviderEdit/BatchApiSettings.tsx` -- Batch API enable/disable toggle in LLM provider configuration form
- `ee/.../controllers/BatchTranslateInfoController.kt` -- REST endpoint returning batch availability and discount percentage
- `webapp/.../OperationsSummary/BatchIndicator.tsx` -- Shows batch processing status with clock icon during async phase
- `webapp/.../OperationMachineTranslate.tsx` -- Integrates batch mode selector into the machine translation dialog

4. feat: add batch cancellation and error recovery (
`95a6ecbea`)

Overview: Adds the ability to cancel in-flight OpenAI batch jobs and extends the batch job cancellation manager to handle `WAITING_FOR_EXTERNAL` chunk executions.

Implementation Flow: `OpenAiBatchCancellationHandler` implements the `BatchApiCancellationHandler` interface. When a user cancels a batch job, the `BatchJobCancellationManager` detects chunks in `WAITING_FOR_EXTERNAL` status and delegates to the handler, which sends a `POST /v1/batches/{id}/cancel` request to OpenAI and marks the tracker as `CANCELLED`. The handler gracefully handles cases where the provider or API key is unavailable.

Key Files:
- `ee/.../service/batch/OpenAiBatchCancellationHandler.kt` -- Sends cancel requests to OpenAI and updates tracker status
- `backend/.../batch/BatchJobCancellationManager.kt` -- Extended to handle `WAITING_FOR_EXTERNAL` executions via the handler interface
- `ee/.../controllers/batch/BatchApiErrorHandlingTest.kt` -- Integration tests for cancellation and error recovery

5. feat: add batch notifications (
`2de1f0655`)

Overview: Adds in-app and email notifications when batch translation jobs complete, fail, or are cancelled.

Implementation Flow: `BatchJobNotificationListener` listens for `OnBatchJobSucceeded`, `OnBatchJobFailed`, and `OnBatchJobCancelled` Spring events. It creates a `BATCH_JOB_FINISHED` notification linked to the batch job and project. The `BatchJobEmailComposer` generates email content and the frontend `BatchJobFinishedItem` renders the notification with job type, status, and progress.

Key Files:
- `ee/.../component/BatchJobNotificationListener.kt` -- Transactional event listener for batch job completion events
- `backend/.../notification/BatchJobEmailComposer.kt` -- Composes email content for batch job notifications
- `webapp/.../Notifications/BatchJobFinishedItem.tsx` -- Renders batch job notification with type, status, and progress

6. feat: add batch health monitoring (
`b2fe44200`)

Overview: Adds a Spring Boot health indicator for the batch API poller, enabling monitoring via `/actuator/health`.

Implementation Flow: `BatchPollerHealthIndicator` reports the number of active batch trackers (in `SUBMITTED` or `IN_PROGRESS` status) alongside the configured global concurrency limit, allowing ops teams to monitor batch API utilization and detect stuck jobs.

Key Files:
- `backend/.../health/BatchPollerHealthIndicator.kt` -- Health indicator reporting active tracker count and concurrency limits

7. docs: add batch API documentation (
`8087d9093`)

Overview: Adds comprehensive documentation covering architecture, configuration, state machine, monitoring, and operational runbook.

Key Files:
- `docs/batch-api/README.md` -- Full architecture, configuration, monitoring, and ops runbook documentation

Screenshots
Screenshots captured from local testing with fake batch API (mock server):
LLM Provider Configuration
Server-configured "openai-batch" provider with batch API enabled, and the LLM provider creation form showing the new "Batch API" toggle section.
Batch Mode Selection
When selecting "Machine translation" batch operation, users see a mode selector with "Instant" (default) and "Batch" radio buttons. Batch mode shows the discount percentage.
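The discount shown next to the "Batch" option could be derived along these lines. This is a sketch only: the `ProviderPricing` shape and both function names are hypothetical, and the PR's actual discount calculation may use different inputs or rounding.

```typescript
// Hypothetical pricing shape: cost per million tokens for sync and batch calls.
interface ProviderPricing {
  syncPricePerMillion: number;
  batchPricePerMillion: number;
}

// Returns the whole-number discount percentage of batch vs. sync pricing,
// or null when the prices cannot be compared.
function batchDiscountPercent(pricing: ProviderPricing): number | null {
  if (pricing.syncPricePerMillion <= 0 || pricing.batchPricePerMillion < 0) {
    return null;
  }
  if (pricing.batchPricePerMillion >= pricing.syncPricePerMillion) {
    return 0; // batch is not cheaper, so there is no discount to show
  }
  const ratio = pricing.batchPricePerMillion / pricing.syncPricePerMillion;
  return Math.round((1 - ratio) * 100);
}

// Label for the batch radio option, e.g. "Batch (50% cheaper)".
function batchOptionLabel(discount: number | null): string {
  return discount !== null && discount > 0
    ? `Batch (${discount}% cheaper)`
    : "Batch";
}
```

Falling back to a plain "Batch" label avoids displaying a "0% cheaper" hint when pricing data is missing or shows no saving.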
Batch Processing Flow
After submission, a progress dialog shows "Translating..." with a minimizable view. The batch indicator shows "Batch processing - safe to close" in the translation toolbar. On completion, a "Done" indicator appears with a "Translations might be outdated" refresh prompt.
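While the dialog shows the batch as processing, each poll has to decide whether the job is still waiting or has reached a terminal state. The sketch below classifies the status values the OpenAI Batch API reports. The `PollOutcome` names and the choice to treat `expired` as a failure are assumptions for illustration, not the poller's confirmed behavior.

```typescript
// Status values reported by the OpenAI Batch API for a batch object.
type OpenAiBatchStatus =
  | "validating"
  | "failed"
  | "in_progress"
  | "finalizing"
  | "completed"
  | "expired"
  | "cancelling"
  | "cancelled";

// Hypothetical outcomes a poller might act on.
type PollOutcome = "keep_waiting" | "apply_results" | "mark_failed" | "mark_cancelled";

function classifyBatchStatus(status: OpenAiBatchStatus): PollOutcome {
  switch (status) {
    case "validating":
    case "in_progress":
    case "finalizing":
    case "cancelling":
      return "keep_waiting"; // non-terminal: check again on the next poll
    case "completed":
      return "apply_results"; // output file can be downloaded and applied
    case "failed":
    case "expired":
      return "mark_failed"; // assumption: an expired batch is handled like a failure
    case "cancelled":
      return "mark_cancelled";
  }
}
```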
Batch Cancellation
Clicking "ABORT" during batch processing sends a cancellation request to OpenAI. The dialog shows "Canceled" status on completion.
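The cancellation request itself targets OpenAI's `POST /v1/batches/{id}/cancel` endpoint. A minimal sketch of assembling that request follows. `HttpRequestSpec` and `buildCancelRequest` are hypothetical names, and returning `null` when no API key is available is an assumption modelled on the handler's described tolerance of a missing provider or key.

```typescript
// Hypothetical description of an outgoing HTTP request.
interface HttpRequestSpec {
  method: "POST";
  url: string;
  headers: Record<string, string>;
}

// Builds the cancel request for a batch, or returns null when there is
// no API key to authenticate with (the caller can then skip the remote call).
function buildCancelRequest(
  batchId: string,
  apiKey: string | undefined,
  baseUrl: string = "https://api.openai.com"
): HttpRequestSpec | null {
  if (batchId.trim() === "") {
    throw new Error("batchId must not be empty");
  }
  if (apiKey === undefined || apiKey === "") {
    return null;
  }
  return {
    method: "POST",
    url: `${baseUrl}/v1/batches/${encodeURIComponent(batchId)}/cancel`,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}
```

The resulting spec can be passed to any HTTP client, for example `fetch(spec.url, { method: spec.method, headers: spec.headers })`. Overriding `baseUrl` makes it easy to point the same code at a mock server during local testing.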
Test plan
`/actuator/health`

🤖 Generated with Claude Code