fix(voice): create ElevenLabs tools before agent setup #459

Open
Shujakuinkuraudo wants to merge 1 commit into tiann:main from Shujakuinkuraudo:fix/elevenlabs-voice-tools

Conversation

@Shujakuinkuraudo
Contributor

Summary

  • replace deprecated ElevenLabs prompt.tools usage with prompt.tool_ids
  • auto-create or reuse the required client tools before creating/updating the Hapi voice agent
  • keep the shared web helper aligned with the new ElevenLabs tool-based agent setup
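The migration described above can be sketched in TypeScript. Only the `prompt.tool_ids` field comes from this PR; the helper and type names below are illustrative assumptions, not the full ElevenLabs agent schema.

```typescript
// Sketch of the tool-first ordering this PR introduces (assumed helper and
// type names; only the prompt.tool_ids field is taken from the PR itself).
interface AgentPromptConfig {
    prompt: string
    // Replaces the deprecated inline prompt.tools definitions.
    tool_ids: string[]
}

function buildAgentPrompt(systemPrompt: string, toolIds: string[]): AgentPromptConfig {
    // The tools must be created (or reused) first so their IDs exist
    // before the agent itself is created or updated.
    return { prompt: systemPrompt, tool_ids: toolIds }
}

const promptConfig = buildAgentPrompt('You are the Hapi voice agent.', ['tool_abc', 'tool_def'])
console.log(promptConfig.tool_ids.length) // 2
```

In the real flow, the IDs would come from the tool-provisioning step that creates or reuses messageCodingAgent and processPermissionRequest before createOrUpdateHapiAgent() runs.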

Testing

  • verified createOrUpdateHapiAgent() succeeds with an ElevenLabs key that has ConvAI write access
  • verified the resulting ElevenLabs agent includes messageCodingAgent and processPermissionRequest
  • built the Linux x64 single-executable binary successfully via bun run build:single-exe


@github-actions bot left a comment


Findings

  • [Major] Existing Hapi tool records are reused by name without any schema reconciliation, so this tool_ids migration will not actually apply later VOICE_TOOLS changes to accounts that already have migrated tools. This PR already changes the shared processPermissionRequest schema by adding an enum, but ensureHapiToolIds() just reattaches the old IDs and skips any update path, leaving those tool definitions stale indefinitely. Evidence: hub/src/web/routes/voice.ts:168, web/src/api/voice.ts:129, shared/src/voice.ts:188.

    Suggested fix:

    ```typescript
    async function upsertTool(apiKey: string, existing: ElevenLabsTool | undefined, toolConfig: VoiceToolConfig): Promise<string> {
        if (!existing) {
            return await createTool(apiKey, toolConfig)
        }
        if (!isSameToolConfig(existing.tool_config, toolConfig)) {
            await updateTool(apiKey, existing.id, toolConfig)
        }
        return existing.id
    }
    ```
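The isSameToolConfig helper referenced in the suggestion is not defined in this diff. A minimal sketch could compare the two configs structurally, ignoring object key order (the API response may serialize keys differently than the local definition):

```typescript
// Sketch of an assumed isSameToolConfig helper: structural equality that
// ignores object key order by serializing with sorted keys.
function stableStringify(value: unknown): string {
    if (Array.isArray(value)) {
        return '[' + value.map(stableStringify).join(',') + ']'
    }
    if (value !== null && typeof value === 'object') {
        const entries = Object.entries(value as Record<string, unknown>)
            .sort(([a], [b]) => a.localeCompare(b))
            .map(([k, v]) => JSON.stringify(k) + ':' + stableStringify(v))
        return '{' + entries.join(',') + '}'
    }
    return JSON.stringify(value)
}

function isSameToolConfig(a: unknown, b: unknown): boolean {
    return stableStringify(a) === stableStringify(b)
}
```

Note this treats any server-added default fields as a difference; a production version might restrict the comparison to the fields the client actually manages.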

Summary
Review mode: initial

1 issue found in the new ElevenLabs tool-id migration path: existing tool IDs are reused without updating their definitions, so users with previously created Hapi tools will keep stale schemas/config after this change. No repo test coverage was added for create-vs-reuse-vs-update behavior in the voice tool provisioning flow.

HAPI Bot

Testing
Not run (automation): bun is not installed in this runner, so repo typecheck/tests could not be executed.


```typescript
const toolIds: string[] = []
for (const request of buildVoiceToolRequests()) {
    const existingId = toolIdByName.get(request.tool_config.name)
```


[MAJOR] Reusing any existing tool ID here means previously migrated Hapi tools never receive later schema/config changes. This PR already mutates processPermissionRequest in shared/src/voice.ts, but accounts that already have that tool will keep the stale definition forever because there is no PATCH/update path.

Suggested fix:

```typescript
const existingTool = existingToolsByName.get(request.tool_config.name)
const toolId = existingTool
    ? await updateToolIfNeeded(apiKey, existingTool, request.tool_config)
    : await createTool(apiKey, request.tool_config)
toolIds.push(toolId)
```


@github-actions bot left a comment


Findings

  • [Major] Switching voiceMode only swaps which status/toggle the composer reads; it never stops the previously active assistant/dictation pipeline. If a user changes modes while voice is already connected, the old session keeps running in the background while the UI hides its stop/mute controls. Evidence: web/src/components/AssistantChat/HappyComposer.tsx:210, web/src/routes/settings/index.tsx:425.

    Suggested fix:

    ```typescript
    useEffect(() => {
        if (voiceMode !== 'assistant' && (voiceStatus === 'connected' || voiceStatus === 'connecting')) {
            onVoiceToggle?.()
        }
        if (voiceMode !== 'dictation-local' && (dictation.status === 'connected' || dictation.status === 'connecting')) {
            dictation.stop()
        }
        if (
            voiceMode !== 'dictation-elevenlabs'
            && (elevenLabsDictation.status === 'connected' || elevenLabsDictation.status === 'connecting')
        ) {
            void elevenLabsDictation.stop()
        }
    }, [voiceMode, voiceStatus, onVoiceToggle, dictation.status, dictation.stop, elevenLabsDictation.status, elevenLabsDictation.stop])
    ```
  • [Major] Both dictation hooks snapshot the composer text once at start and rebuild the draft from that stale base on every transcript event, so any manual edits made while recording are overwritten on the next partial/final transcript. That is a draft-loss regression. Evidence: web/src/hooks/useSpeechToText.ts:89, web/src/hooks/useElevenLabsTranscription.ts:106.

    Suggested fix:

    ```typescript
    const currentText = config.getCurrentText()
    if (currentText !== lastTranscriptTextRef.current) {
        baseTextRef.current = currentText
    }
    const nextText = normalizeText(baseTextRef.current, nextTranscript)
    lastTranscriptTextRef.current = nextText
    config.onTextChange(nextText)
    ```
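The normalizeText call in the suggested fix is not defined in this diff. A minimal sketch, assuming its job is to append the latest transcript to the current draft with a single separating space, could look like:

```typescript
// Sketch of an assumed normalizeText helper: joins the user's current draft
// with the latest transcript, avoiding doubled whitespace at the seam.
function normalizeText(baseText: string, transcript: string): string {
    const base = baseText.replace(/\s+$/, '') // drop trailing whitespace from the draft
    const addition = transcript.trim()        // drop padding the recognizer may emit
    if (!addition) return base                // empty transcript: keep the draft unchanged
    return base ? base + ' ' + addition : addition
}
```

Because the fix re-reads getCurrentText() before every update, manual edits made mid-recording become the new base instead of being clobbered.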

Summary
Review mode: follow-up after new commits

2 issues found in the new dictation-mode flow: mode changes can orphan an active voice capture session, and transcript updates can clobber manual draft edits while recording. No tests cover mode-switch cleanup or transcript/manual-edit interleaving.

Testing
Not run (automation): bun is not installed in this runner, so repo typecheck/tests could not be executed.

HAPI Bot

```typescript
    onTextChange: (text) => api.composer().setText(text)
})

const effectiveVoiceStatus = voiceMode === 'dictation-local'
```


[MAJOR] Switching voiceMode here only changes which status/toggle the composer points at; it never shuts down the previously active pipeline. If assistant voice is already connected and the user flips to a dictation mode, the WebRTC session stays live but the UI no longer exposes stop/mute controls. The same orphaning happens between dictation modes because both hooks stay mounted.

Suggested fix:

```typescript
useEffect(() => {
    if (voiceMode !== 'assistant' && (voiceStatus === 'connected' || voiceStatus === 'connecting')) {
        onVoiceToggle?.()
    }
    if (voiceMode !== 'dictation-local' && (dictation.status === 'connected' || dictation.status === 'connecting')) {
        dictation.stop()
    }
    if (
        voiceMode !== 'dictation-elevenlabs'
        && (elevenLabsDictation.status === 'connected' || elevenLabsDictation.status === 'connecting')
    ) {
        void elevenLabsDictation.stop()
    }
}, [voiceMode, voiceStatus, onVoiceToggle, dictation.status, dictation.stop, elevenLabsDictation.status, elevenLabsDictation.stop])
```

```typescript
recognition.interimResults = true
recognition.lang = localStorage.getItem('hapi-voice-lang') || navigator.language || 'en-US'

baseTextRef.current = config.getCurrentText()
```


[MAJOR] baseTextRef is frozen once at start, and every transcript update rewrites the composer from that stale snapshot. Any manual edits the user makes while recording are lost on the next onresult, which is draft data loss. useElevenLabsTranscription has the same pattern.

Suggested fix:

```typescript
const currentText = config.getCurrentText()
if (currentText !== lastTranscriptTextRef.current) {
    baseTextRef.current = currentText
}
const nextText = normalizeText(baseTextRef.current, nextTranscript)
lastTranscriptTextRef.current = nextText
config.onTextChange(nextText)
```

@Shujakuinkuraudo force-pushed the fix/elevenlabs-voice-tools branch from 10ec0c0 to e08f518 on April 14, 2026 at 16:35