Speech synthesis (text-to-speech) can also be provided by deploying a TTS server locally (for example, https://github.com/Ksuriuri/index-tts-vllm) and connecting to it through its OpenAI-compatible interface.
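As a rough illustration of what calling such a server might look like, the sketch below posts text to an OpenAI-style `/audio/speech` endpoint using only the Python standard library. The base URL, port, model name, voice name, and output format are all assumptions made for this example and are not taken from the project above; check the documentation of the server you deploy for the actual values and route.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust to your local deployment.
BASE_URL = "http://localhost:8000/v1"  # hypothetical host, port, and prefix
MODEL = "tts-1"                        # hypothetical model name
VOICE = "alloy"                        # hypothetical voice name


def build_speech_request(text, base_url=BASE_URL, model=MODEL, voice=VOICE):
    """Build a POST request for an OpenAI-style /audio/speech endpoint."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.wav", **kwargs):
    """Send the request and write the returned audio bytes to out_path.

    The audio format actually returned depends on the server, so the
    ".wav" extension here is an assumption.
    """
    req = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(req, timeout=60) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


if __name__ == "__main__":
    # Requires a TTS server to be running at BASE_URL.
    print(synthesize("Hello from a locally deployed TTS server."))
```

Building the request separately from sending it makes it possible to inspect the URL and JSON payload without a running server, which can help when matching the fields a particular deployment expects.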