-
Notifications
You must be signed in to change notification settings - Fork 7
Helping French TTS to pronounce numbers #739
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…TTS to pronounce numbers of all kinds
services/espnet-tts-fr/src/app.py
Outdated
| def isfloat(num): | ||
| try: | ||
| float(num) | ||
| return True | ||
| except ValueError: | ||
| return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because there's (supposedly) no built-in function in Python to check if a number is a float.
JRegimbal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great!
services/espnet-tts-fr/src/app.py
Outdated
| logger.debug(segment.split()) | ||
| for s in segment.split(): | ||
| try: | ||
| logger.debug(f'Performing on: "{s}"') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to get very chatty very quickly...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not merging right now, but do you anticipate this being useful going forward? I'm a little worried about the slim possibility of this leaking some personal info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this helpful to debug what component the TTS couldn't convert to numbers sometimes, but it can definitely be case specific (log in the except statement) rather than logging out everything.
You might be right... In that case, I would have to change the logs of multilang-support as well because it has been logging the whole segment out everytime, both in English and French 🫤
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah please do that too then!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done that!
|
@JRegimbal I just reformat the |
services/espnet-tts-fr/src/app.py
Outdated
| s = num2words(float(tempns), lang='fr') | ||
| elif "-" in s: | ||
| logger.debug("case 3: -") | ||
| logger.debug("Case: has a - as separator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this would likely fail for cases where two separators are used. E.g., "39,5-40,0". We could probably do something with regexps with the - separator as well. Get the fields we need excluding text with ([\d,.]+)-([\d,.]+), replace with de $1 à $2 and then apply another find and replace on $1 and $2 to replace occurrences of , with .. Does this make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it does. I will try them out.
services/espnet-tts-fr/src/app.py
Outdated
| s = num2words(s, lang='fr') | ||
| elif "," in s: | ||
| logger.debug(f"Case: '{s}' has , as separator") | ||
| logger.debug("Case: has a , as separator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, does this still get run for text uses of ,?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also catch with a regexp find replace ((\d+),(\d+) to $1.$2).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It catches text with ,, but it won't perform any changes since on the next line, it couldn't cast the text to float.
I will look into regex and string processing more, thanks for the suggestion! I genuinely forgot about regex and went straight for the simpliest implementation.
|
JRegimbal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
|
@notkaramel you can make a new issue, but I'd put it in backlog for right now and bring it up tomorrow if you're still looking for something to do. There may be a higher priority issue. |
Changes Made
espnet-tts-fr) struggles with numerical numbers #723num2wordslibrary to help convert numbers in numeric form to text (e.g. "42" to "quarante deux")multilang-supporttranlated to Metropolitan French where they have a different notation system.Testings
Please ensure you've followed the checklist and provide all the required information before requesting a review.
If you do not have everything applicable to your PR, it will not be reviewed!
If you don't know what something is or if it applies to you, ask!
Don't delete below this line.
Required Information
Coding/Commit Requirements
New Component Checklist (mandatory for new microservices)
docker-compose.ymlandbuild.yml..github/workflows.