Rejig dictation #6844
base: main
Pull request overview
This PR refactors the voice dictation feature to simplify the codebase before implementing local whisper support. The changes consolidate multiple scattered files and components into a unified architecture, replacing localStorage-based settings with backend configuration storage.
Changes:
- Consolidated audio transcription endpoints from `/audio/*` to `/dictation/*` with unified provider handling
- Replaced localStorage settings persistence with backend config API (`voice_dictation_provider`)
- Simplified UI components by merging 5 separate dictation components into a single `DictationSettings` component
- Removed the `VOICE_DICTATION_ELEVENLABS_ENABLED` feature flag
- Replaced complex waveform visualizer with simple "Recording..." indicator
- Removed 378 lines from `useWhisper.ts` and replaced with 249-line simplified `useAudioRecorder.ts`
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| ui/desktop/src/updates.ts | Removed ElevenLabs feature flag |
| ui/desktop/src/hooks/useWhisper.ts | Deleted, replaced by useAudioRecorder.ts |
| ui/desktop/src/hooks/useDictationSettings.ts | Deleted, settings now in backend config |
| ui/desktop/src/hooks/dictationConstants.ts | Deleted, constants moved to backend |
| ui/desktop/src/hooks/useAudioRecorder.ts | New simplified audio recording hook using backend API |
| ui/desktop/src/components/settings/dictation/* | Five component files deleted, replaced with single DictationSettings.tsx |
| ui/desktop/src/components/settings/dictation/DictationSettings.tsx | New unified settings component with provider selection and API key management |
| ui/desktop/src/components/settings/chat/ChatSettingsSection.tsx | Updated imports and reorganized settings layout |
| ui/desktop/src/components/ChatInput.tsx | Updated to use new hook, removed WaveformVisualizer, simplified recording UI |
| ui/desktop/src/api/* | Generated types and SDK methods for new dictation endpoints |
| ui/desktop/openapi.json | Added new dictation endpoint schemas |
| crates/goose-server/src/routes/mod.rs | Replaced audio module with dictation module |
| crates/goose-server/src/routes/dictation.rs | New unified backend handling both OpenAI and ElevenLabs with provider-agnostic API |
| crates/goose-server/src/routes/audio.rs | Deleted old implementation |
| crates/goose-server/src/openapi.rs | Updated OpenAPI schema definitions |
    audio: {
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: true,
Copilot AI, Jan 30, 2026
The sampleRate: 44100 constraint was removed from the audio configuration. While the browser will use its default sample rate, this could result in inconsistent audio quality across different browsers and devices. Consider keeping an explicit sample rate to ensure consistent transcription quality.
Suggested change:

    - autoGainControl: true,
    + autoGainControl: true,
    + sampleRate: 44100,
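As a rough illustration of this suggestion, the sample rate can be expressed as an `ideal` constraint, which lets the browser fall back to its default rate when the device cannot capture at 44.1 kHz. This is only a sketch: `buildAudioConstraints` and the `AudioConstraints` interface are hypothetical names, not code from this PR. The constraint keys themselves are standard `MediaTrackConstraints` properties.

```typescript
// Hypothetical helper, not part of this PR. It builds the `audio` object
// that would be passed to navigator.mediaDevices.getUserMedia({ audio }).
interface AudioConstraints {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
  sampleRate?: { ideal: number };
}

function buildAudioConstraints(sampleRate?: number): AudioConstraints {
  const constraints: AudioConstraints = {
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true,
  };
  // Only request a sample rate when the caller asks for one; otherwise
  // the browser default applies, as in the current version of the PR.
  if (sampleRate !== undefined) {
    constraints.sampleRate = { ideal: sampleRate };
  }
  return constraints;
}
```

Calling `buildAudioConstraints(44100)` restores the explicit rate, while `buildAudioConstraints()` reproduces the constraints shown in the diff above.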
    pub fn routes(state: Arc<AppState>) -> Router {
        Router::new()
            .route("/dictation/transcribe", post(transcribe_dictation))
            .route("/dictation/config", get(get_dictation_config))
            .with_state(state)
    }
Copilot AI, Jan 30, 2026
The old audio.rs tests were removed but no replacement tests were added for the new dictation.rs module. Consider adding tests to verify the transcription endpoints and provider configuration logic work correctly.
          <b>Chat</b> {'>'} <b>Voice Dictation.</b>
        </p>
    - ) : dictationSettings.provider === null ? (
    + ) : dictationProvider === null ? (
Copilot AI, Jan 30, 2026
Variable 'dictationProvider' cannot be of type null, but it is compared to an expression of type null.
Suggested change:

    - ) : dictationProvider === null ? (
    + ) : dictationProvider === undefined ? (
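The distinction matters because a strict `=== null` check never matches `undefined`, so the "not configured" branch would be unreachable if the value can only be a provider name or `undefined`. A minimal sketch of the two ways to write the check, assuming the provider is an optional value; the `DictationProvider` union, its string literals, and both function names are hypothetical and not taken from this PR:

```typescript
// Hypothetical provider type; the literal names are assumptions.
type DictationProvider = 'openai' | 'elevenlabs';

// Strict check: matches only `undefined`, mirroring the suggested change.
function providerLabel(provider: DictationProvider | undefined): string {
  if (provider === undefined) {
    return 'not configured';
  }
  return provider;
}

// Loose check: `== null` is true for both `null` and `undefined`, which is
// useful if the value might arrive as either (for example from JSON).
function isUnset(provider: DictationProvider | null | undefined): boolean {
  return provider == null;
}
```

Which form fits depends on how the backend config response represents an unset provider, which this excerpt does not show.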
Pull request overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated no new comments.
we talked about cleaning this up @Abhijay007 & @lifeizhou-ap - here's my take. my next plan is to build support for local whisper.
Summary
I was going to look at making whisper run locally, but noticed that the current flows are, eh, a mess, so cleaning that up first.