Description
Problem description
Espeak-ng is an open-source text-to-speech synthesizer. However, its most important feature is rule-based text phonemization, implemented for a vast range of languages.
Currently, the Phonemis package uses a syllabification algorithm with predefined syllable phonemizations as a fallback mechanism for words not included in the main .json lexicon. This significantly increases the size of the lexicon. For example, for 🇺🇸 American English, the lexicon grows from around 200 000 entries (6 MB) to almost 500 000 entries (13 MB).
Replacing the fallback phonemization mechanism with espeak-ng would significantly reduce the memory footprint of the lexicon file, as well as make it easier to add phonemization pipelines for different languages.
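To make the idea concrete, here is a minimal sketch of a lexicon-first lookup with a pluggable fallback. It is not the Phonemis implementation: the `Phonemizer` class, the `FallbackPhonemizer` type, and `toy_rule_fallback` are hypothetical names used only for illustration, and the toy fallback does not reproduce espeak-ng rules or output.

```python
from typing import Callable, Dict

# Hypothetical type: any function that maps a single word to a phoneme string.
FallbackPhonemizer = Callable[[str], str]


class Phonemizer:
    """Lexicon-first phonemizer with a swappable fallback (illustrative only)."""

    def __init__(self, lexicon: Dict[str, str], fallback: FallbackPhonemizer) -> None:
        self.lexicon = lexicon
        self.fallback = fallback

    def phonemize_word(self, word: str) -> str:
        key = word.lower()
        # Words present in the lexicon are resolved directly.
        if key in self.lexicon:
            return self.lexicon[key]
        # Everything else goes to the fallback, so the lexicon does not
        # need extra entries to cover out-of-vocabulary words.
        return self.fallback(key)

    def phonemize(self, text: str) -> str:
        return " ".join(self.phonemize_word(w) for w in text.split())


def toy_rule_fallback(word: str) -> str:
    # Placeholder for a rule-based engine; NOT real espeak-ng behavior.
    return "".join(f"<{ch}>" for ch in word)


if __name__ == "__main__":
    phonemizer = Phonemizer({"hello": "həlˈoʊ"}, toy_rule_fallback)
    print(phonemizer.phonemize("Hello xy"))
```

With this shape, swapping the syllable-based fallback for a rule-based one would only change the function passed to the constructor, while the main lexicon stays at its smaller size.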
What should be done
- Explore the espeak-ng repository
- Select the minimal subset of the code base needed to provide a proper phonemization mechanism
- If possible, rewrite the selected espeak-ng code for use within the Phonemis package
Benefits to React Native ExecuTorch
Potential reduction in memory usage of the Text to Speech module (based on Kokoro) and easier support for new languages.