
Conversation

@lxy2304 (Collaborator) commented Feb 28, 2025

Code changes:

  • Added two functions, save_track_from_csv and save_track_from_csvs, and updated enrich_tokens_with_csv with a timestamp option.
  • Added unit tests for the new functions
  • Minor fixes

Docs changes:

  • Added FastTrack example for save_track_from_csvs
  • Addressed comments from the last PR, adding example Praat scripts showing analyze_script's two input formats
  • Added Tutorial 6, the voice quality Praat script, and expected outputs.

@lxy2304 requested a review from msonderegger on February 28, 2025, 11:47
----------
The :code: `analyze_vot` function has a variety of parameters that are important for running the function properly.
The :code:`analyze_vot` function has a variety of parameters that are important for running the function properly.
`classifier` is a string which has a paht to an AutoVOT classifier directory.

Review comment (Member):
typo "paht"
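
For context, a minimal usage sketch of the call this documentation describes. Only `classifier` is confirmed by the quoted text; the corpus name and any other analyze_vot parameters here are illustrative assumptions.

from polyglotdb import CorpusContext

# Sketch only: per the quoted docs, `classifier` is a path to an AutoVOT
# classifier directory. Any other analyze_vot parameters (e.g. which stop
# subset to analyze) are not shown above and are omitted here.
with CorpusContext('my_corpus') as c:
    c.analyze_vot(classifier='/path/to/autovot_classifier_dir')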

# Clean up
select all
Remove

Review comment (Member):
This looks good. Can you also:

  • Say in prose somewhere (or a comment in the Praat script) what the script does ("Computes average F0 over a sound file")
  • Make this an actual Praat script (.praat), in addition to putting it in the tutorial. Put the script somewhere in the polyglotdb repo. Put a link here to the script.
    (this is because people are very used to just running Praat scripts, as opposed to copy-pasting into a new Praat script)

Review from @msonderegger (Member):

Amazing!

select all
Remove


Review comment (Member):

This is great. Same comments as above:

  • add a high-level description
  • add the actual Praat script file somewhere


- Always use :code:`Open long sound file` to ensure compatibility with the system.
- The `padding` field allows flexibility by extending the actual start and end times of the segment (default is 0.1s).
- The `padding` field allows flexibility by extending the actual start and end times of the segment (default is 0).

Review comment (Member):

Let's add an issue (for future devs + Michael/me) to make this more detailed. Can we give any guidance on when padding is and isn't needed (e.g., yes for pitch, no for power spectrum analysis)?

c.encode_type_subset('phone', ['S', 'Z', 'SH', 'ZH'], 'sibilant')

# Uses a praat script that takes as input a filename and begin/end time, and outputs measures we'd like to take for sibilants
# The analyze_script call then applies this script to every phone of type "sibilant" in the corpus.

Review comment (Member):

good comment
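
To make the quoted code comment concrete, a hedged sketch of how the analyze_script step it describes might be called. The parameter names (subset, annotation_type, script_path) are assumptions based on the comment, not a confirmed signature; check the analyze_script documentation for the exact form in your polyglotdb version.

from polyglotdb import CorpusContext

with CorpusContext('my_corpus') as c:
    # Mark S/Z/SH/ZH phones as the "sibilant" subset (as in the quoted diff)
    c.encode_type_subset('phone', ['S', 'Z', 'SH', 'ZH'], 'sibilant')

    # Hypothetical call: run a Praat script that takes a filename plus
    # begin/end times and outputs sibilant measures, over every phone in
    # the "sibilant" subset. Argument names are illustrative.
    c.analyze_script(subset='sibilant',
                     annotation_type='phone',
                     script_path='/path/to/sibilant.praat')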

Encoding acoustic tracks from CSV
=================================

Sometimes, you may want to use external software to extract specific measurement tracks. For example, `FastTrack`_ is a Praat plugin that can generate formant tracks.

Review comment (Member):

I'd add a bit more detail here on what "specific measurement tracks" means.

Like: sometimes you may want to use external software to generate measurement tracks. Examples include:

  • F0 (pitch) tracks computed by an external library, across entire files
  • Voice quality tracks for each vowel, computed using VoiceSauce (give link)
  • Vowel formant tracks, e.g. using FastTrack...

c.save_track_from_csv('formants', '/path/to/csv', ['f1','b1','f2','b2','f3','b3','f1p','f2p','f3p','f0','intensity','harmonicity'])
# loading multiple csv files
c.save_track_from_csvs('formants', '/path/to/directory', ['f1','b1','f2','b2','f3','b3','f1p','f2p','f3p','f0','intensity','harmonicity'])

Review comment (Member):

could you make an issue to write (full) example scripts that:

  • import dynamic tracks from FastTrack
  • aggregate (what you show below)
  • then query and do some output

(In the future I think we'll want an "examples" part of the Polyglot documentation, but not now)
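
As a starting point for that issue, a rough end-to-end sketch under stated assumptions: the 'vowel' subset is already encoded, the measure saved as 'formants' is reachable in queries as c.phone.formants with a .track attribute (mirroring built-in tracks such as pitch, but not confirmed here), and all paths and column names are placeholders.

from polyglotdb import CorpusContext

with CorpusContext('my_corpus') as c:
    # 1. Import dynamic formant tracks produced by FastTrack (one CSV per file)
    c.save_track_from_csvs('formants', '/path/to/fasttrack/output',
                           ['f1', 'b1', 'f2', 'b2', 'f3', 'b3'])

    # 2. Query vowel tokens and pull the imported track alongside context.
    #    The .formants.track attribute is an assumption; verify against the
    #    acoustic-track query docs.
    q = c.query_graph(c.phone).filter(c.phone.subset == 'vowel')
    q = q.columns(c.phone.speaker.name.column_name('speaker'),
                  c.phone.label.column_name('vowel'),
                  c.phone.begin, c.phone.end,
                  c.phone.formants.track)

    # 3. Export for aggregation and analysis elsewhere
    q.to_csv('/path/to/formant_tracks.csv')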

*************************

The main objective of this tutorial is to perform `voice quality`_ analysis on the corpus using a Praat script and extract
spectral measures like H1-H2, H1-A1, H1-A2, and H1-A3.

Review comment (Member):

This tutorial is good! Let's add an issue to have some users (in the lab + James T) work through it and give comments. Some comments:

  • Make clearer in initial setup that this is an example of a custom script where the result is tracks -- as opposed to the sibilants tutorial example, which uses a custom script, but the result is static measures.
  • I think it's fine to put a link here to where to read more about voice quality (the slides you found)

@@ -0,0 +1,67 @@
"('61', 'AO2')","[('mean_H1_A1', 5.21375241700153), ('mean_H1_H2', 1.4283178345991416), ('mean_H1_A2', 18.57861883658481), ('mean_H1_A3', 28.8672225446156)]"

Review comment (Member):

I'm a bit surprised at the output here.

Could we get this to look more like other output CSVs, which are rows and columns without extra formatting (no nesting)?

An example would be the output of Tutorial 5: the f0 column is actually the mean F0 over the vowel.

So in the current case, I think we'd want

speaker, vowel, mean_H1_A1, mean_H1_H2, ...
61, AO2, 5.21375241700153, 1.4283178345991416
...

?
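
If reshaping the output on the Praat side is awkward, a post-processing sketch in plain Python could flatten the current nested format into the tidy layout suggested above. This assumes every row of the current CSV has exactly two fields (a key tuple string and a list-of-pairs string, as in the quoted example); file names are placeholders.

import ast
import csv

# Flatten rows like:
#   "('61', 'AO2')","[('mean_H1_A1', 5.21...), ('mean_H1_H2', 1.42...), ...]"
# into: speaker, vowel, mean_H1_A1, mean_H1_H2, ...
rows = []
with open('voice_quality_output.csv', newline='') as f:
    for key_str, measures_str in csv.reader(f):
        speaker, vowel = ast.literal_eval(key_str)
        measures = dict(ast.literal_eval(measures_str))
        rows.append({'speaker': speaker, 'vowel': vowel, **measures})

fieldnames = ['speaker', 'vowel'] + sorted(rows[0].keys() - {'speaker', 'vowel'})
with open('voice_quality_tidy.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)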


return type_map

def import_token_csv_with_timestamp(corpus_context, path, annotated_type, timestamp_column, discourse_column, properties=None):

Review comment (Member):

could you add documentation of the import CSV with vs. without timestamp cases to readthedocs? I think currently the docs for CSV import assume IDs and don't support timestamps.
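
For those docs, a hedged example call based on the signature visible in the quoted diff. The CSV layout, column names, and property list below are hypothetical, the importing module is a guess, and whether users should instead go through enrich_tokens_with_csv's new timestamp option should be confirmed against the PR.

from polyglotdb import CorpusContext
# NOTE: the defining module is not shown in the quoted diff; adjust this
# import to wherever the PR actually places the function.
from polyglotdb.io.enrichment import import_token_csv_with_timestamp

with CorpusContext('my_corpus') as c:
    # Hypothetical CSV (column names are placeholders):
    #   time,discourse,tone
    #   1.234,speaker1_recording,H
    # Tokens are matched by a timestamp that falls within the token in the
    # named discourse, rather than by token ID.
    import_token_csv_with_timestamp(
        c,
        path='/path/to/annotations.csv',
        annotated_type='phone',
        timestamp_column='time',
        discourse_column='discourse',
        properties=['tone'],  # assumption: extra columns to attach as token properties
    )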

@msonderegger merged commit f886c82 into main on Mar 18, 2025 (5 checks passed).