LING 521

LAB 4: F0 declination

1. F0 tends to decline over the course of phrases and utterances, in both tonal and non-tonal languages. This global downward trend of F0 is called declination. In this lab, we’ll explore two questions related to declination: Do we start with a higher F0 when producing longer sentences? Do longer sentences have a flatter F0 declination slope? Studying such questions can lead to a better understanding of the pre-planning of pitch contours, and of pre-planning in speech production in general.
2. We'll use the TIMIT corpus, which contains 6300 sentences in total: 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. There is a copy of the corpus on harris: /corpora/timit.
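Speaker metadata can be read straight off the file paths, since TIMIT organizes its files as usage/dialect-region/speaker/sentence, with the first letter of the speaker ID giving the speaker's sex. The sketch below assumes the copy on harris keeps that layout; the example path under /corpora/timit and the function name parse_timit_path are illustrative assumptions, so check them against the actual directory before relying on them.

```python
from pathlib import PurePosixPath


def parse_timit_path(path):
    """Split a TIMIT-style path into dialect region, sex, speaker, and sentence.

    Assumes the standard layout .../<usage>/dr<N>/<speaker>/<sentence>.<ext>.
    """
    parts = PurePosixPath(path).parts
    # The last three components are the dialect directory, the speaker
    # directory, and the file name.
    dialect_dir, speaker_dir, filename = parts[-3], parts[-2], parts[-1]
    return {
        "dialect": int(dialect_dir[2:]),      # "dr1" -> 1
        "sex": speaker_dir[0].upper(),        # first letter of the ID: F or M
        "speaker": speaker_dir.upper(),
        "sentence": PurePosixPath(filename).stem.upper(),
    }


# Hypothetical path; the real layout on harris may differ in case or depth.
info = parse_timit_path("/corpora/timit/train/dr1/fcjf0/sa1.wav")
```

Grouping sentences by the "speaker" field gives the 10 sentences per speaker needed for normalization, and the "sex" and "dialect" fields support the optional group comparisons.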
3. Extract the F0 contours of the 6300 sentences, and write the data (time-F0 pairs) for each sentence into a text file (using Praat). Then extract the following variables for each sentence: the initial F0, the end F0, the slope of the regression line fitted to the F0 curve, the intercept of that regression line, the length (duration) of the sentence, and the total number of words in the sentence. Write the extracted variables from all the sentences into one file for further analysis (using Python).
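As a rough illustration of the Python side of this step, the sketch below reads one sentence's time-F0 pairs and computes the per-sentence variables with an ordinary least-squares line. Several things here are assumptions rather than requirements of the lab: the text file is taken to have two whitespace-separated columns (time in seconds, F0 in Hz), unvoiced frames are taken to appear as non-numeric entries such as "--undefined--" or as 0, "initial" and "end" F0 are taken to be the first and last voiced frames, and the duration and word count are passed in from elsewhere (for example, from the corpus's word-alignment files).

```python
def read_f0_track(lines):
    """Return (times, f0s) for voiced frames only."""
    times, f0s = [], []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            t, f0 = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # header row or an undefined (unvoiced) frame
        if f0 > 0:    # treat 0 Hz as unvoiced as well
            times.append(t)
            f0s.append(f0)
    return times, f0s


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def sentence_variables(times, f0s, duration, n_words):
    """Collect the per-sentence variables listed in the lab."""
    slope, intercept = fit_line(times, f0s)
    return {
        "initial_f0": f0s[0],   # assumption: first voiced frame
        "end_f0": f0s[-1],      # assumption: last voiced frame
        "slope": slope,
        "intercept": intercept,
        "duration": duration,
        "n_words": n_words,
    }


# Toy input in the assumed file format, not real TIMIT data.
toy_lines = [
    "time f0",
    "0.00 --undefined--",
    "0.10 200",
    "0.20 190",
    "0.30 180",
    "0.40 --undefined--",
]
times, f0s = read_f0_track(toy_lines)
row = sentence_variables(times, f0s, duration=0.4, n_words=3)
```

Taking the first and last voiced frames is the simplest choice; averaging the first and last few voiced frames would be less sensitive to pitch-tracking errors at the sentence edges.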
4. Since speakers have very different pitch ranges, we need to normalize the F0 data for each speaker before extracting the variables. Below is the (suggested) procedure for normalization and extracting variables:
For each of the 630 speakers:
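One common way to carry out a per-speaker normalization is a z-score transform: pool the voiced F0 values from all of a speaker's sentences, then express each value as its distance from the speaker's mean in units of the speaker's standard deviation. The sketch below shows that approach only as an assumption; it is not necessarily the procedure intended for this lab, and the function name and input structure are hypothetical.

```python
from statistics import mean, stdev


def normalize_speaker(tracks):
    """Z-score the F0 values of one speaker.

    tracks maps a sentence ID to that sentence's voiced F0 values in Hz.
    The mean and standard deviation are computed over all of the
    speaker's sentences pooled together.
    """
    pooled = [f0 for f0s in tracks.values() for f0 in f0s]
    mu = mean(pooled)
    sigma = stdev(pooled)  # sample standard deviation
    return {
        sentence_id: [(f0 - mu) / sigma for f0 in f0s]
        for sentence_id, f0s in tracks.items()
    }


# Toy values for one hypothetical speaker, not real TIMIT data.
toy_tracks = {"SA1": [100.0, 120.0], "SA2": [140.0, 160.0]}
normalized = normalize_speaker(toy_tracks)
```

After this transform, slopes and intercepts are expressed in standard-deviation units per second rather than Hz per second, which makes them comparable across speakers with different pitch ranges.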
5. Run regression and correlation tests to find out whether initial F0, end F0, the difference between initial and end F0, slope, or intercept is correlated with sentence length or with the total number of words in the sentence. If interested, you can also check whether there are any differences between male and female speakers, or among dialect regions (using the speaker information file included in the corpus).
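To make the correlation step concrete, here is a minimal sketch of Pearson's r and the associated t statistic, written with the standard library only. The data are toy values invented for illustration, not results. Turning t into a p-value requires the t distribution with n - 2 degrees of freedom; if SciPy is available, scipy.stats.pearsonr and scipy.stats.linregress report the p-value directly.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Toy data: word counts and normalized initial F0 for five sentences.
n_words = [5, 7, 8, 10, 12]
initial_f0 = [0.1, 0.3, 0.2, 0.5, 0.6]

r = pearson_r(n_words, initial_f0)
t = t_statistic(r, len(n_words))
```

The same two functions can be applied to each pairing of a dependent variable (initial F0, end F0, their difference, slope, intercept) with duration or word count.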
6. Your lab report should (at least) include exploratory data analysis of each of the variables (histograms, box plots, five-number summaries, etc.); regression and correlation results (coefficients, p-values, scatterplots with regression lines, etc.); and a brief discussion of your results.