Commit 6114e69

Merge branch 'main' into case-studies
2 parents a589006 + 9a2da8c commit 6114e69

26 files changed: +9202 -1444 lines changed

.github/workflows/stress-test.yml

Lines changed: 0 additions & 1 deletion
@@ -1,7 +1,6 @@
 name: stress testing

 on:
-  push:
   pull_request:
   workflow_dispatch:

docs/images/sibilant_plot.png

39.7 KB

docs/source/case_studies.rst

Lines changed: 1 addition & 0 deletions
@@ -13,3 +13,4 @@ Contents:
    case_study_duration.rst
    case_study_vowel_dynamics.rst
    case_study_sibilants.rst
+   case_study_praat.rst

docs/source/case_study_praat.rst

Lines changed: 205 additions & 0 deletions
@@ -0,0 +1,205 @@
.. _case_study_praat:

*********************************************************
Case study 4: Sibilant analysis using custom Praat script
*********************************************************

Motivation
----------

Sibilants, and /s/ in particular, have been observed to show sociolinguistic variation according to a range of intersecting factors, including gendered, class, and ethnic identities (Stuart-Smith, 2007; Levon, Maegaard and Pharao, 2017). The sibilants /s ʃ z ʒ/ also vary systematically with place of articulation (Johnson, 2003). The alveolar fricatives /s z/, as in ‘send’ and ‘zen’, are formed when a jet of air is forced through a narrow constriction between the tongue tip or blade and the alveolar ridge; the air strikes the upper teeth as it escapes, producing high-pitched friction. The post-alveolar fricatives /ʃ ʒ/, as in ‘sheet’ and ‘Asia’, have a more retracted constriction, so the cavity in front of the constriction is somewhat longer and larger, and the pitch is correspondingly lower. In many varieties of English, the post-alveolar fricatives also involve some lip-rounding, which lowers the pitch further.

Acoustically, sibilants show a spectral ‘mountain’ profile, with peaks and troughs reflecting the resonances of the cavities formed by the articulators (Jesus and Shadle, 2002). The frequency of the main spectral peak, and of the main concentration of acoustic energy (the Centre of Gravity), corresponds quite well to shifts in place of articulation, including the fine-grained differences that are of interest for sociolinguistic analysis: alveolars show higher frequencies, and more retracted post-alveolars show lower frequencies.

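As a rough illustration of the two measures, the sketch below computes a spectral peak and a power-weighted centre of gravity for a short signal using NumPy. It is not the Praat script used in this case study: the helper name ``spectral_peak_and_cog``, the Hann window, and the power weighting are assumptions made for the example, and Praat's own computation may differ in detail.

```python
import numpy as np

def spectral_peak_and_cog(samples, sample_rate):
    """Return (peak_hz, cog_hz) for a mono signal (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    # Taper the edges of the segment to limit spectral leakage
    windowed = samples * np.hanning(len(samples))
    # Power spectrum over the non-negative frequencies
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    # Peak: frequency of the bin holding the most energy
    peak_hz = freqs[np.argmax(power)]
    # Centre of gravity: power-weighted mean frequency
    cog_hz = np.sum(freqs * power) / np.sum(power)
    return float(peak_hz), float(cog_hz)

# Synthetic check: a strong 6 kHz tone plus a weaker 2 kHz tone
sr = 44100
t = np.arange(int(0.05 * sr)) / sr
signal = np.sin(2 * np.pi * 6000 * t) + 0.3 * np.sin(2 * np.pi * 2000 * t)
peak, cog = spectral_peak_and_cog(signal, sr)
print(round(peak), round(cog))
```

In this synthetic signal the peak sits at the stronger tone, while the centre of gravity is pulled slightly downwards by the weaker, lower tone. This is why the two measures can diverge for real sibilants.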
* How do English /ʃ/ and /ʒ/ differ in their spectral peaks and centre of gravity?

Step 1: Import
--------------

As with previous case studies, the Python libraries are loaded, and the aligned Librispeech corpus is imported.

.. code-block:: python

    import os  # for parsing the paths to the corpus enrichment files

    # PolyglotDB imports
    from polyglotdb import CorpusContext
    import polyglotdb.io as pgio

    ## name and path to the corpus
    corpus_root = '.data/LibriSpeech-aligned'
    corpus_name = 'Librispeech-aligned'

    ## names of the enrichment files
    speaker_filename = "SPEAKERS.csv"
    stress_data_filename = "iscan_lexicon.csv"

    ## get the paths to the corpus enrichment files
    speaker_enrichment_path = os.path.join(corpus_root, 'enrichment_data', speaker_filename)
    lexicon_enrichment_path = os.path.join(corpus_root, 'enrichment_data', stress_data_filename)

    ## use the MFA parser
    parser = pgio.inspect_mfa(corpus_root)
    parser.call_back = print

    with CorpusContext(corpus_name) as c:
        print("Loading data...")
        c.load(parser, corpus_root)

Step 2: Basic enrichment
------------------------

Also as with previous case studies, syllable, utterance, speaker, and lexical stress information is encoded.

.. code-block:: python

    ## set of syllabic segments
    syllabics = ["ER0", "IH2", "EH1", "AE0", "UH1", "AY2", "AW2", "UW1", "OY2",
                 "OY1", "AO0", "AH2", "ER1", "AW1", "OW0", "IY1", "IY2", "UW0", "AA1", "EY0",
                 "AE1", "AA0", "OW1", "AW0", "AO1", "AO2", "IH0", "ER2", "UW2", "IY0", "AE2",
                 "AH0", "AH1", "UH2", "EH2", "UH0", "EY1", "AY0", "AY1", "EH0", "EY2", "AA2",
                 "OW2", "IH1"]

    ## use syllabic labels to encode syllables
    with CorpusContext(corpus_name) as c:
        print("Encoding syllables...")
        c.encode_type_subset('phone', syllabics, 'syllabic')
        c.encode_syllables(syllabic_label='syllabic')

    ## pause label
    pause_labels = ['<SIL>']

    ## encode utterances, treating pauses of
    ## 150 ms or longer as utterance boundaries
    with CorpusContext(corpus_name) as c:
        print("Encoding utterances...")
        c.encode_pauses(pause_labels)
        c.encode_utterances(min_pause_length=0.15)

    ## speaker metadata from the speaker CSV
    with CorpusContext(corpus_name) as c:
        print("Encoding speakers...")
        c.enrich_speakers_from_csv(speaker_enrichment_path)

    ## lexical stress patterns from the lexicon CSV
    with CorpusContext(corpus_name) as c:
        print("Encoding lexicon...")
        c.enrich_lexicon_from_csv(lexicon_enrichment_path)
        c.encode_stress_from_word_property('stress_pattern')

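To make the role of ``min_pause_length`` concrete, the sketch below groups a toy word sequence into utterances, splitting only at pauses of at least 150 ms. It illustrates the idea only and is not PolyglotDB's implementation: the helper ``group_into_utterances`` and the ``(label, begin, end)`` tuples are hypothetical.

```python
def group_into_utterances(words, pause_label="<SIL>", min_pause_length=0.15):
    """Group (label, begin, end) tuples into utterances (illustrative only)."""
    utterances, current = [], []
    for label, begin, end in words:
        is_pause = label == pause_label
        if is_pause and (end - begin) >= min_pause_length:
            # A long enough pause closes the current utterance
            if current:
                utterances.append(current)
            current = []
        elif not is_pause:
            # Short pauses are skipped; words are kept
            current.append(label)
    if current:
        utterances.append(current)
    return utterances

words = [
    ("the", 0.00, 0.20),
    ("<SIL>", 0.20, 0.25),  # 50 ms pause: too short to split
    ("cat", 0.25, 0.60),
    ("<SIL>", 0.60, 1.00),  # 400 ms pause: utterance boundary
    ("sat", 1.00, 1.40),
]
print(group_into_utterances(words))  # [['the', 'cat'], ['sat']]
```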
Step 3: Sibilant acoustic enrichment
------------------------------------

PolyglotDB supports enriching a corpus with custom measurements from Praat scripts. Here, a custom `Praat script <https://github.com/MontrealCorpusTools/PolyglotDB/blob/main/examples/case_studies/praat_sibilants/polyglotdb_sibilant.praat>`_ has been written to extract two spectral measures for a given segment: spectral Centre of Gravity (COG) and spectral peak. PolyglotDB applies this script to a subset of segments and enriches the database with the resulting measures.

First, the subset of segments to be analysed for spectral information (the sibilants) is defined.

.. code-block:: python

    sibilant_segments = ["S", "Z", "SH", "ZH"]

PolyglotDB is then given both the path to the Praat executable and the path to the sibilant enrichment script.

.. code-block:: python

    praat_path = "/usr/bin/praat"  # default path on a Unix machine
    sibilant_script_path = "./polyglotdb_sibilant.praat"

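Because a wrong path typically only surfaces once the enrichment is already running, it can help to check both paths first. The sketch below is one way to do that with the standard library; ``check_praat_setup`` is a hypothetical helper and not part of PolyglotDB.

```python
import os
import shutil

def check_praat_setup(praat_path, script_path):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    # shutil.which accepts a bare command name or a full path
    # and returns None if nothing executable is found
    if shutil.which(praat_path) is None:
        problems.append(f"Praat executable not found or not executable: {praat_path}")
    if not os.path.isfile(script_path):
        problems.append(f"Praat script not found: {script_path}")
    return problems

for problem in check_praat_setup("/usr/bin/praat", "./polyglotdb_sibilant.praat"):
    print(problem)
```

An empty list means both files were found; any messages printed here should be resolved before running the enrichment.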
The script is then run via the ``analyze_script`` function.

.. code-block:: python

    with CorpusContext(corpus_name) as c:
        ## tell PolyglotDB where to find the Praat executable
        c.config.praat_path = praat_path
        c.encode_class(sibilant_segments, 'sibilant')
        c.analyze_script(annotation_type='phone', subset='sibilant',
                         script_path=sibilant_script_path, duration_threshold=0.01)

Step 4: Query
-------------

Now that sibilant spectral information has been enriched in the database, a query can be generated to extract the sibilant tokens of interest. Here, the focus is on onset sibilants: the filter below keeps sibilants whose start coincides with the start of their word. Columns for the segmental, syllabic, and word-level information are extracted, as well as the spectral measurements made by the Praat script (``cog``, ``peak``).

.. code-block:: python

    output_path = "./sibilant_spectral_output.csv"

    with CorpusContext(corpus_name) as c:
        print("Generating query...")
        ## use the sibilant subset to filter segments
        q = c.query_graph(c.phone).filter(c.phone.subset == "sibilant")
        ## word-initial (onset) sibilants only
        q = q.filter(c.phone.begin == c.phone.syllable.word.begin)

        q = q.columns(
            ## segmental information
            c.phone.id.column_name("phone_id"),
            c.phone.label.column_name('phone_label'),
            c.phone.duration.column_name('phone_duration'),
            c.phone.begin.column_name("phone_begin"),
            c.phone.end.column_name("phone_end"),

            ## surrounding segmental labels
            c.phone.following.label.column_name("following_phone_label"),
            c.phone.previous.label.column_name("previous_phone_label"),

            ## syllabic information
            c.phone.syllable.label.column_name("syllable_label"),
            c.phone.syllable.stress.column_name("syllable_stress"),
            c.phone.syllable.duration.column_name("syllable_duration"),

            ## labels for each part of the syllable
            c.phone.syllable.phone.filter_by_subset('onset').label.column_name('onset'),
            c.phone.syllable.phone.filter_by_subset('nucleus').label.column_name('nucleus'),
            c.phone.syllable.phone.filter_by_subset('coda').label.column_name('coda'),

            ## word, speaker, and utterance-level information
            c.phone.syllable.word.label.column_name('word_label'),
            c.phone.syllable.word.begin.column_name('word_begin'),
            c.phone.syllable.word.end.column_name('word_end'),
            c.phone.syllable.word.utterance.speech_rate.column_name('utterance_speech_rate'),
            c.phone.syllable.speaker.name.column_name('speaker'),
            c.phone.syllable.discourse.name.column_name('file'),

            ## spectral measures enriched from Praat script
            c.phone.cog.column_name('cog'),
            c.phone.peak.column_name('peak')
        )

        print("Writing query to file...")
        ## write to the path defined above
        q.to_csv(output_path)

Step 5: Analysis
----------------

As before, the exported CSV file can then be loaded into R.

.. code-block:: r

    library(tidyverse)

    df <- read.csv("sibilant_spectral_output.csv")

    ## check the number of tokens for each segment
    df %>%
      group_by(phone_label) %>%
      tally()
    # A tibble: 3 × 2
    #   phone_label     n
    #   <chr>       <int>
    # 1 S            3298
    # 2 SH            641
    # 3 Z              12

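For readers who prefer to stay in Python, the same token count can be sketched with the standard library. The helper names ``count_tokens`` and ``count_tokens_in_csv`` are hypothetical; the only assumption about the export is that it contains the ``phone_label`` column defined in the query above.

```python
import csv
from collections import Counter

def count_tokens(rows, label_column="phone_label"):
    """Count rows per phone label from an iterable of dict-like rows."""
    return Counter(row[label_column] for row in rows)

def count_tokens_in_csv(csv_path, label_column="phone_label"):
    """Open an exported CSV and count its rows per phone label."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return count_tokens(csv.DictReader(f), label_column)
```

Calling ``count_tokens_in_csv("sibilant_spectral_output.csv").most_common()`` would list the labels from most to least frequent, which makes sparsely attested segments easy to spot.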
Both spectral centre of gravity and spectral peak are plotted below, showing that /ʃ/ generally exhibits a lower spectral peak and a lower centre of gravity than both /s/ and /z/.

.. code-block:: r

    df %>%
      ## make a single column for spectral measures
      ## so both measures can be plotted side-by-side
      pivot_longer(c(peak, cog), names_to = "measure", values_to = "value") %>%
      ggplot(aes(x = phone_label, y = value)) + geom_boxplot() +
      facet_wrap(~measure) +
      scale_y_sqrt() +
      ylab("Frequency (Hz)") +
      xlab("Sibilant")

.. image:: ../images/sibilant_plot.png
   :width: 400

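The comparison shown in the boxplots can also be summarised numerically. The sketch below computes the median of one measure per phone label using only the standard library; ``median_by_label`` is a hypothetical helper, and the ``cog``, ``peak``, and ``phone_label`` column names are taken from the query above.

```python
import csv
from collections import defaultdict
from statistics import median

def median_by_label(rows, measure, label_column="phone_label"):
    """Median of `measure` per phone label, skipping empty or non-numeric cells."""
    values = defaultdict(list)
    for row in rows:
        try:
            value = float(row[measure])
            label = row[label_column]
        except (KeyError, TypeError, ValueError):
            # Missing column, None, or a cell such as "" or "NA"
            continue
        values[label].append(value)
    return {label: median(vals) for label, vals in values.items()}

def median_by_label_in_csv(csv_path, measure):
    """Open an exported CSV and summarise one measure per phone label."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return median_by_label(csv.DictReader(f), measure)
```

For example, ``median_by_label_in_csv("sibilant_spectral_output.csv", "cog")`` would return one median per sibilant. With very few tokens for a label, as for Z here, such a median should be read with caution.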

docs/source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
     'sqlalchemy.orm', 'sqlalchemy.sql', 'sqlalchemy.sql.expression',
     'sqlalchemy.ext', 'sqlalchemy.ext.hybrid','sqlalchemy.ext.declarative', 'sqlalchemy.ext.orderinglist',
     'sqlalchemy.ext.associationproxy', 'sqlalchemy.ext.hybrid',
-    'numpy', 'resampy', 'audioread',
+    'numpy', 'resampy',
     'scipy', 'scipy.signal', 'scipy.io',
     'librosa', 'librosa.core.spectrum', 'neo4j.v1',
     'pyraat',
