fix(voice): create ElevenLabs tools before agent setup #459
Shujakuinkuraudo wants to merge 1 commit into tiann:main from
Findings
- [Major] Existing Hapi tool records are reused by name without any schema reconciliation, so this tool_ids migration will not actually apply later VOICE_TOOLS changes to accounts that already have migrated tools. This PR already changes the shared processPermissionRequest schema by adding an enum, but ensureHapiToolIds() just reattaches the old IDs and skips any update path, leaving those tool definitions stale indefinitely. Evidence hub/src/web/routes/voice.ts:168, web/src/api/voice.ts:129, shared/src/voice.ts:188.
Suggested fix:
async function upsertTool(
    apiKey: string,
    existing: ElevenLabsTool | undefined,
    toolConfig: VoiceToolConfig
): Promise<string> {
    if (!existing) {
        return await createTool(apiKey, toolConfig)
    }
    if (!isSameToolConfig(existing.tool_config, toolConfig)) {
        await updateTool(apiKey, existing.id, toolConfig)
    }
    return existing.id
}
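The suggested fix relies on an isSameToolConfig helper that is not shown in the review. A minimal sketch of one way to write it follows, assuming tool configs are plain JSON values. The Json type and stableStringify are hypothetical names introduced only for this illustration, and the enum values in the usage lines are made up.

```typescript
// Hypothetical: treats a tool config as an arbitrary JSON value.
// The real VoiceToolConfig type in shared/src/voice.ts may be narrower.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

// Serialize with object keys sorted, so key order never affects the comparison.
function stableStringify(value: Json): string {
    if (value === null || typeof value !== 'object') {
        return JSON.stringify(value)
    }
    if (Array.isArray(value)) {
        return `[${value.map(stableStringify).join(',')}]`
    }
    const entries = Object.keys(value)
        .sort()
        .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key] as Json)}`)
    return `{${entries.join(',')}}`
}

// Two configs are "the same" when their key-sorted serializations match.
function isSameToolConfig(a: Json, b: Json): boolean {
    return stableStringify(a) === stableStringify(b)
}

// Usage: adding an enum to a parameter makes the stored config count as stale.
const stored: Json = { name: 'processPermissionRequest', decision: { type: 'string' } }
const desired: Json = { name: 'processPermissionRequest', decision: { type: 'string', enum: ['allow', 'deny'] } }
const needsUpdate = !isSameToolConfig(stored, desired)
```

One caveat with exact comparison: if the remote API stores extra default fields alongside the submitted config, every run would look like a change and trigger an update. Comparing only the fields the client sends would avoid that, at the cost of a slightly longer helper.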
Summary
Review mode: initial
1 issue found in the new ElevenLabs tool-id migration path: existing tool IDs are reused without updating their definitions, so users with previously created Hapi tools will keep stale schemas/config after this change. No repo test coverage was added for create-vs-reuse-vs-update behavior in the voice tool provisioning flow.
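The missing create-vs-reuse-vs-update coverage could be exercised without network access by putting the provisioning loop behind a small client interface. The sketch below is an assumption about how that might look, not the repo's code: ToolClient, reconcileTools, and createFakeClient are hypothetical names, and the comparator is passed in so any equality check can be used.

```typescript
// All names below are hypothetical stand-ins for the helpers touched by the PR.
interface ToolConfig { name: string; description: string }
interface StoredTool { id: string; tool_config: ToolConfig }

interface ToolClient {
    list(): Promise<StoredTool[]>
    create(config: ToolConfig): Promise<string>
    update(id: string, config: ToolConfig): Promise<void>
}

type Action = 'created' | 'reused' | 'updated'

// Create missing tools, update stale ones, and reuse the rest; report what happened.
async function reconcileTools(
    client: ToolClient,
    desired: ToolConfig[],
    isSame: (a: ToolConfig, b: ToolConfig) => boolean
): Promise<{ ids: string[]; actions: Action[] }> {
    const existingByName = new Map<string, StoredTool>()
    for (const tool of await client.list()) {
        existingByName.set(tool.tool_config.name, tool)
    }
    const ids: string[] = []
    const actions: Action[] = []
    for (const config of desired) {
        const existing = existingByName.get(config.name)
        if (!existing) {
            ids.push(await client.create(config))
            actions.push('created')
        } else if (!isSame(existing.tool_config, config)) {
            await client.update(existing.id, config)
            ids.push(existing.id)
            actions.push('updated')
        } else {
            ids.push(existing.id)
            actions.push('reused')
        }
    }
    return { ids, actions }
}

// In-memory fake so all three branches can run in a unit test.
function createFakeClient(initial: StoredTool[]): ToolClient & { tools: StoredTool[] } {
    const tools: StoredTool[] = initial.map((tool) => ({ ...tool }))
    return {
        tools,
        async list(): Promise<StoredTool[]> {
            return tools
        },
        async create(config: ToolConfig): Promise<string> {
            const id = `tool_${tools.length + 1}`
            tools.push({ id, tool_config: config })
            return id
        },
        async update(id: string, config: ToolConfig): Promise<void> {
            const target = tools.find((tool) => tool.id === id)
            if (!target) throw new Error(`unknown tool ${id}`)
            target.tool_config = config
        }
    }
}
```

A test would seed the fake with one unchanged tool and one stale tool, request a third that does not exist yet, and assert on both the returned actions and the fake's stored configs.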
HAPI Bot
Testing
Not run (automation): bun is not installed in this runner, so repo typecheck/tests could not be executed.
const toolIds: string[] = []
for (const request of buildVoiceToolRequests()) {
    const existingId = toolIdByName.get(request.tool_config.name)
[MAJOR] Reusing any existing tool ID here means previously migrated Hapi tools never receive later schema/config changes. This PR already mutates processPermissionRequest in shared/src/voice.ts, but accounts that already have that tool will keep the stale definition forever because there is no PATCH/update path.
Suggested fix:
const existingTool = existingToolsByName.get(request.tool_config.name)
const toolId = existingTool
    ? await updateToolIfNeeded(apiKey, existingTool, request.tool_config)
    : await createTool(apiKey, request.tool_config)
toolIds.push(toolId)
Findings
- [Major] Switching voiceMode only swaps which status/toggle the composer reads; it never stops the previously active assistant/dictation pipeline. If a user changes modes while voice is already connected, the old session keeps running in the background while the UI hides its stop/mute controls. Evidence web/src/components/AssistantChat/HappyComposer.tsx:210, web/src/routes/settings/index.tsx:425.
Suggested fix:
useEffect(() => {
    if (voiceMode !== 'assistant' && (voiceStatus === 'connected' || voiceStatus === 'connecting')) {
        onVoiceToggle?.()
    }
    if (voiceMode !== 'dictation-local' && (dictation.status === 'connected' || dictation.status === 'connecting')) {
        dictation.stop()
    }
    if (
        voiceMode !== 'dictation-elevenlabs'
        && (elevenLabsDictation.status === 'connected' || elevenLabsDictation.status === 'connecting')
    ) {
        void elevenLabsDictation.stop()
    }
}, [voiceMode, voiceStatus, onVoiceToggle, dictation.status, dictation.stop, elevenLabsDictation.status, elevenLabsDictation.stop])
- [Major] Both dictation hooks snapshot the composer text once at start and rebuild the draft from that stale base on every transcript event, so any manual edits made while recording are overwritten on the next partial/final transcript. That is a draft-loss regression. Evidence web/src/hooks/useSpeechToText.ts:89, web/src/hooks/useElevenLabsTranscription.ts:106.
Suggested fix:
const currentText = config.getCurrentText()
if (currentText !== lastTranscriptTextRef.current) {
    baseTextRef.current = currentText
}
const nextText = normalizeText(baseTextRef.current, nextTranscript)
lastTranscriptTextRef.current = nextText
config.onTextChange(nextText)
Summary
Review mode: follow-up after new commits
2 issues found in the new dictation-mode flow: mode changes can orphan an active voice capture session, and transcript updates can clobber manual draft edits while recording. No tests cover mode-switch cleanup or transcript/manual-edit interleaving.
Testing
Not run (automation): bun is not installed in this runner, so repo typecheck/tests could not be executed.
HAPI Bot
    onTextChange: (text) => api.composer().setText(text)
})

const effectiveVoiceStatus = voiceMode === 'dictation-local'
[MAJOR] Switching voiceMode here only changes which status/toggle the composer points at; it never shuts down the previously active pipeline. If assistant voice is already connected and the user flips to a dictation mode, the WebRTC session stays live but the UI no longer exposes stop/mute controls. The same orphaning happens between dictation modes because both hooks stay mounted.
Suggested fix:
useEffect(() => {
    if (voiceMode !== 'assistant' && (voiceStatus === 'connected' || voiceStatus === 'connecting')) {
        onVoiceToggle?.()
    }
    if (voiceMode !== 'dictation-local' && (dictation.status === 'connected' || dictation.status === 'connecting')) {
        dictation.stop()
    }
    if (
        voiceMode !== 'dictation-elevenlabs'
        && (elevenLabsDictation.status === 'connected' || elevenLabsDictation.status === 'connecting')
    ) {
        void elevenLabsDictation.stop()
    }
}, [voiceMode, voiceStatus, onVoiceToggle, dictation.status, dictation.stop, elevenLabsDictation.status, elevenLabsDictation.stop])
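The cleanup rule in this effect can also be written as a pure function, which makes the mode-switch behavior testable without rendering the composer. This is only a sketch: pipelinesToStop and isActive are hypothetical names, the three mode strings and the 'connected'/'connecting' statuses come from the snippet above, and the 'disconnected' and 'error' statuses are assumptions.

```typescript
type VoiceMode = 'assistant' | 'dictation-local' | 'dictation-elevenlabs'

// 'connected' and 'connecting' appear in the review; the other two values are assumed.
type PipelineStatus = 'disconnected' | 'connecting' | 'connected' | 'error'

type PipelineStatuses = Record<VoiceMode, PipelineStatus>

// A pipeline needs stopping if it is live or about to become live.
const isActive = (status: PipelineStatus): boolean =>
    status === 'connected' || status === 'connecting'

// Returns every pipeline that is still active but no longer matches the selected mode.
function pipelinesToStop(selected: VoiceMode, statuses: PipelineStatuses): VoiceMode[] {
    const modes: VoiceMode[] = ['assistant', 'dictation-local', 'dictation-elevenlabs']
    return modes.filter((mode) => mode !== selected && isActive(statuses[mode]))
}

// Usage: the assistant session is live when the user switches to local dictation.
const orphaned = pipelinesToStop('dictation-local', {
    'assistant': 'connected',
    'dictation-local': 'disconnected',
    'dictation-elevenlabs': 'disconnected'
})
// orphaned is ['assistant'], so the effect would stop only the assistant session.
```

Keeping the decision separate from the side effects means the effect body reduces to mapping each returned mode to its stop call.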
web/src/hooks/useSpeechToText.ts
Outdated
recognition.interimResults = true
recognition.lang = localStorage.getItem('hapi-voice-lang') || navigator.language || 'en-US'

baseTextRef.current = config.getCurrentText()
[MAJOR] baseTextRef is frozen once at start, and every transcript update rewrites the composer from that stale snapshot. Any manual edits the user makes while recording are lost on the next onresult, which is draft data loss. useElevenLabsTranscription has the same pattern.
Suggested fix:
const currentText = config.getCurrentText()
if (currentText !== lastTranscriptTextRef.current) {
    baseTextRef.current = currentText
}
const nextText = normalizeText(baseTextRef.current, nextTranscript)
lastTranscriptTextRef.current = nextText
config.onTextChange(nextText)
10ec0c0 to e08f518
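The base-text rebasing suggested for the dictation hooks can be illustrated outside React with a small closure. This is a sketch under assumptions: createTranscriptMerger is a hypothetical name, and joinText is a simplified stand-in for the hooks' normalizeText, whose real behavior is not shown in the review.

```typescript
// Hypothetical stand-in for normalizeText: joins base and transcript with a single space.
function joinText(base: string, transcript: string): string {
    if (!base) return transcript
    if (!transcript) return base
    return `${base.replace(/\s+$/, '')} ${transcript}`
}

function createTranscriptMerger(initialText: string) {
    let baseText = initialText
    let lastWritten = initialText

    // currentText is what the composer holds right now;
    // transcript is the latest partial or final recognition result.
    return function applyTranscript(currentText: string, transcript: string): string {
        if (currentText !== lastWritten) {
            // The draft changed since our last write, so the user edited it: rebase onto their text.
            baseText = currentText
        }
        const next = joinText(baseText, transcript)
        lastWritten = next
        return next
    }
}

// Usage: partial results replace each other until the user edits the draft.
const applyTranscript = createTranscriptMerger('Hello')
const first = applyTranscript('Hello', 'wor')           // 'Hello wor'
const second = applyTranscript('Hello wor', 'world')    // 'Hello world'
const afterEdit = applyTranscript('Hi there', 'again')  // 'Hi there again'
```

One limitation of this approach: if the recognizer emits cumulative partials, rebasing in the middle of an utterance keeps the words already written into the draft and then appends the next cumulative transcript, which can duplicate them. Handling that would need the hook to track where the current utterance began.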
Summary
- prompt.tools usage with prompt.tool_ids
Testing
- createOrUpdateHapiAgent() succeeds with an ElevenLabs key that has ConvAI write access
- messageCodingAgent and processPermissionRequest
- bun run build:single-exe