@DOsinga (Collaborator) commented Jan 30, 2026

Summary

I was going to look at making whisper run locally, but noticed that the current flows are, eh, a mess, so cleaning that up first.

Copilot AI review requested due to automatic review settings January 30, 2026 17:26
Copilot AI (Contributor) left a comment

Pull request overview

This PR refactors the voice dictation feature to simplify the codebase before implementing local whisper support. The changes consolidate multiple scattered files and components into a unified architecture, replacing localStorage-based settings with backend configuration storage.

Changes:

  • Consolidated audio transcription endpoints from /audio/* to /dictation/* with unified provider handling
  • Replaced localStorage settings persistence with backend config API (voice_dictation_provider)
  • Simplified UI components by merging 5 separate dictation components into a single DictationSettings component
  • Removed the VOICE_DICTATION_ELEVENLABS_ENABLED feature flag
  • Replaced complex waveform visualizer with simple "Recording..." indicator
  • Deleted useWhisper.ts (378 lines) and replaced it with a simplified 249-line useAudioRecorder.ts
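
As a rough illustration of the move from localStorage to backend config, the sketch below narrows an untyped config value to a known provider. Only the key name `voice_dictation_provider` and the two provider names come from this PR; the string values `"openai"` and `"elevenlabs"`, the `DictationProvider` type, and the `parseDictationProvider` helper are assumptions made for the example, not the PR's actual code.

```typescript
// Providers named in this PR; the exact string values are an assumption.
type DictationProvider = "openai" | "elevenlabs";

const KNOWN_PROVIDERS: readonly DictationProvider[] = ["openai", "elevenlabs"];

// Hypothetical helper: reads the `voice_dictation_provider` key from an
// untyped config record and returns a known provider, or null when the key
// is missing, not a string, or holds an unrecognized value.
function parseDictationProvider(
  config: Record<string, unknown>
): DictationProvider | null {
  const raw = config["voice_dictation_provider"];
  if (typeof raw !== "string") {
    return null;
  }
  const normalized = raw.trim().toLowerCase();
  for (const provider of KNOWN_PROVIDERS) {
    if (provider === normalized) {
      return provider;
    }
  }
  return null;
}
```

Returning `null` for anything unrecognized keeps the "no provider configured" state explicit, so the UI can show a setup prompt instead of attempting a transcription call.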

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| ui/desktop/src/updates.ts | Removed ElevenLabs feature flag |
| ui/desktop/src/hooks/useWhisper.ts | Deleted, replaced by useAudioRecorder.ts |
| ui/desktop/src/hooks/useDictationSettings.ts | Deleted, settings now in backend config |
| ui/desktop/src/hooks/dictationConstants.ts | Deleted, constants moved to backend |
| ui/desktop/src/hooks/useAudioRecorder.ts | New simplified audio recording hook using backend API |
| ui/desktop/src/components/settings/dictation/* | Five component files deleted, replaced with single DictationSettings.tsx |
| ui/desktop/src/components/settings/dictation/DictationSettings.tsx | New unified settings component with provider selection and API key management |
| ui/desktop/src/components/settings/chat/ChatSettingsSection.tsx | Updated imports and reorganized settings layout |
| ui/desktop/src/components/ChatInput.tsx | Updated to use new hook, removed WaveformVisualizer, simplified recording UI |
| ui/desktop/src/api/* | Generated types and SDK methods for new dictation endpoints |
| ui/desktop/openapi.json | Added new dictation endpoint schemas |
| crates/goose-server/src/routes/mod.rs | Replaced audio module with dictation module |
| crates/goose-server/src/routes/dictation.rs | New unified backend handling both OpenAI and ElevenLabs with provider-agnostic API |
| crates/goose-server/src/routes/audio.rs | Deleted old implementation |
| crates/goose-server/src/openapi.rs | Updated OpenAPI schema definitions |

audio: {
  echoCancellation: true,
  noiseSuppression: true,
  autoGainControl: true,

Copilot AI Jan 30, 2026

The sampleRate: 44100 constraint was removed from the audio configuration. While the browser will use its default sample rate, this could result in inconsistent audio quality across different browsers and devices. Consider keeping an explicit sample rate to ensure consistent transcription quality.

Suggested change
- autoGainControl: true,
+ autoGainControl: true,
+ sampleRate: 44100,
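
One possible middle ground, sketched here under assumptions, is to request the sample rate as an `ideal` value so the browser can still fall back to a rate the device supports. The `buildAudioConstraints` helper and the `AudioTrackOptions` type are hypothetical names for this example. The three boolean flags mirror the hunk above, and `sampleRate` with an `ideal` field is part of the standard media track constraints.

```typescript
// Local shape for the audio constraints used in this sketch (hypothetical
// type name); it is structurally compatible with the DOM's audio constraints.
interface AudioTrackOptions {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
  sampleRate?: { ideal: number };
}

// Hypothetical helper: builds the constraints object, adding a preferred
// (not mandatory) sample rate only when one is supplied.
function buildAudioConstraints(
  idealSampleRate?: number
): { audio: AudioTrackOptions } {
  const audio: AudioTrackOptions = {
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true,
  };
  if (idealSampleRate !== undefined) {
    // `ideal` expresses a preference; the browser may pick another rate.
    audio.sampleRate = { ideal: idealSampleRate };
  }
  return { audio };
}
```

In a browser, the result could be passed to `navigator.mediaDevices.getUserMedia(buildAudioConstraints(44100))`. Whether a fixed rate actually improves transcription depends on the provider and on any resampling done downstream, which this PR excerpt does not show.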

Copilot uses AI. Check for mistakes.
Comment on lines +407 to +412
pub fn routes(state: Arc<AppState>) -> Router {
    Router::new()
        .route("/dictation/transcribe", post(transcribe_dictation))
        .route("/dictation/config", get(get_dictation_config))
        .with_state(state)
}

Copilot AI Jan 30, 2026

The old audio.rs tests were removed but no replacement tests were added for the new dictation.rs module. Consider adding tests to verify the transcription endpoints and provider configuration logic work correctly.

  <b>Chat</b> {'>'} <b>Voice Dictation.</b>
  </p>
- ) : dictationSettings.provider === null ? (
+ ) : dictationProvider === null ? (

Copilot AI Jan 30, 2026

Variable 'dictationProvider' cannot be of type null, but it is compared to an expression of type null.

Suggested change
- ) : dictationProvider === null ? (
+ ) : dictationProvider === undefined ? (
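
To illustrate why the strict comparison matters, here is a minimal sketch with hypothetical helper names. Whether the unset state in the real component is `undefined`, `null`, or both depends on how the value is initialized, which this excerpt does not show. The parameter type below is deliberately widened to cover both cases.

```typescript
// Hypothetical helper: treats both "no value" states as unset, so it stays
// correct even if the source of the value changes between null and undefined.
function isProviderUnset(provider: string | null | undefined): boolean {
  return provider === null || provider === undefined;
}

// A strict check against null alone misses undefined, which is the
// mismatch the review comment points at.
function isStrictlyNull(provider: string | null | undefined): boolean {
  return provider === null;
}
```

With these definitions, `isStrictlyNull(undefined)` is `false` while `isProviderUnset(undefined)` is `true`, so a branch guarded only by `=== null` would never run for a value typed as `string | undefined`.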

Copilot AI review requested due to automatic review settings January 30, 2026 18:54
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated no new comments.

@DOsinga (Collaborator, Author) commented Jan 30, 2026

We talked about cleaning this up, @Abhijay007 & @lifeizhou-ap. Here's my take. My next plan is to build support for local Whisper.
