Problem Set #2 -- COGS 501

(due 9/28/2011)

Download the file pb.Table1 to a place accessible to Matlab. (This is the data from Peterson & Barney, "Control Methods Used in a Study of the Vowels", JASA 24 (2), 1952.)

The table columns are:

1: m=man, w=woman, c=child
2: m=male, f=female
3: N=speaker number
4: V=vowel name (heed/iy, hid/ih, head/eh, had/ae, hod/aa, hawed/ao, hood/uh, who'd/uw, hud/ah, heard/er)
5: IPA
6: F0 in Hz
7: F1
8: F2
9: F3

You can read it into Matlab via a sequence like

fid = fopen('pb.Table1');
PB = textscan(fid, '%s %s %d %s %s %f %f %f %f', 10000);
fclose(fid);

The result of this (in PB) is a "cell array". Here's some code for turning pieces of this into a more useful form:

% Convert column 1 (man, woman, child) from cell array to string array:
MWC=char(PB{1});
% Make some useful binary vectors:
isman=(MWC=='m');
iswoman=(MWC=='w');
ischild=(MWC=='c');

(Some code to get you started is here; suggestions for users of Octave are here.)

Note, by the way, that we can find out how many rows fall into a given category by e.g.

sum(iswoman)

And continuing with the conversions:

% Convert column 2 (male, female) to string array:
MF=char(PB{2});
% More useful binary vectors:
ismale=(MF=='m');
isfemale=(MF=='f');
% Convert column 4 (vowel ID) to binary vectors, one per vowel
% Note that options are 'iy','ih','eh','ae','aa','ao','uh','uw','ah','er'
isiy = strcmp('iy',PB{4});
isih = strcmp('ih',PB{4});
iseh = strcmp('eh',PB{4});
isae = strcmp('ae',PB{4});
isaa = strcmp('aa',PB{4});
isao = strcmp('ao',PB{4});
isuh = strcmp('uh',PB{4});
isuw = strcmp('uw',PB{4});
isah = strcmp('ah',PB{4});
iser = strcmp('er',PB{4});

Now we can (for example) extract all the F0, F1, F2, F3 data:

F0123 = [PB{6} PB{7} PB{8} PB{9}];

Or all the /ae/ data from the adult women (note that iswoman excludes the female children):

F0123wae = F0123(iswoman & isae,:);

1. "Cheating" classification (within training set)

For each of the 10 vowels, calculate the mean vector and covariance matrix for all 76 speakers taken together. (Thus you'll get 10 mean vectors and 10 covariance matrices.) Also get the mean vector and covariance matrix for each of the categories men, women and children (without distinguishing vowels -- thus you'll have three mean vectors and three covariance matrices).
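As a rough sketch of the bookkeeping (not a solution), per-class mean vectors and covariance matrices can be collected in a loop. The matrix X, the vector labels and the count K below are made-up stand-ins for F0123 and a vowel index; they are not the Peterson & Barney data.

```matlab
% Toy stand-in for F0123 (rows = tokens, columns = F0..F3); NOT the real data.
X = [100 300 2300 3000;
     110 320 2200 2900;
     120 310 2250 3100;
     130 330 2350 3050;
     105 305 2280 2950;
     200 700 1200 2500;
     210 720 1100 2600;
     220 690 1150 2450;
     230 710 1250 2550;
     205 730 1180 2520];
labels = [1 1 1 1 1 2 2 2 2 2]';   % hypothetical class index for each row
K = 2;                              % number of classes in the toy data
D = size(X, 2);                     % number of measurements per token

mu = zeros(K, D);       % mu(k,:) will hold the mean vector of class k
S  = zeros(D, D, K);    % S(:,:,k) will hold the covariance matrix of class k
for k = 1:K
    Xk = X(labels == k, :);     % all rows belonging to class k
    mu(k, :)   = mean(Xk, 1);   % column-wise mean
    S(:, :, k) = cov(Xk);       % sample covariance (normalized by n-1)
end
```

With the real data, a logical vector such as isae would play the role of (labels == k).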

Using the Mahalanobis distance as the measure, how well can you classify each of the 1520 vowels as to whether it was spoken by a man, woman or child? How well can you classify them as to vowel category?

(Note that "classify" here means "find the category with minimum distance".)
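Here is a minimal sketch of minimum-distance classification on made-up 2-D data, assuming the usual squared Mahalanobis distance (x-mu)*inv(S)*(x-mu)'. It is computed by hand so that it runs in plain Matlab or Octave without a toolbox. The names X, labels, d2 and predicted are illustrative only.

```matlab
% Toy 2-D data with two well-separated classes; NOT the real measurements.
X = [1 2; 2 1; 3 3; 2 3; 1 1;        % class 1
     8 9; 9 8; 10 10; 9 10; 8 8];    % class 2
labels = [1 1 1 1 1 2 2 2 2 2]';
K = 2;
N = size(X, 1);

d2 = zeros(N, K);   % d2(i,k) = squared Mahalanobis distance from row i to class k
for k = 1:K
    Xk  = X(labels == k, :);
    muk = mean(Xk, 1);
    Sk  = cov(Xk);
    dev = bsxfun(@minus, X, muk);          % subtract the class mean from every row
    d2(:, k) = sum((dev / Sk) .* dev, 2);  % (x-mu)*inv(S)*(x-mu)' for each row
end

[~, predicted] = min(d2, [], 2);           % category with minimum distance
pct_correct = 100 * mean(predicted == labels);
```

Because the same tokens are used to estimate the statistics and to test, this is the "cheating" setup; the toy classes are far enough apart that every token is assigned to its own class.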

Make a "confusion matrix" for the vowel classification experiment -- that is, a matrix with 10 rows and 10 columns, such that cell i,j contains the count of times that a vowel of (true) class i was assigned by the classifier to class j.
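One possible way to tabulate such a matrix, shown on made-up labels with three classes rather than ten. The vectors truth and predicted are hypothetical class indices, not results from the real data.

```matlab
truth     = [1 1 1 2 2 2 3 3 3]';   % hypothetical true class indices
predicted = [1 1 2 2 2 3 3 3 1]';   % hypothetical classifier output
K = 3;                              % number of classes in this toy example

C = zeros(K, K);                    % C(i,j) = count of true class i assigned to class j
for n = 1:numel(truth)
    C(truth(n), predicted(n)) = C(truth(n), predicted(n)) + 1;
end
% An equivalent one-liner: C = accumarray([truth predicted], 1, [K K]);

pct_correct = 100 * trace(C) / sum(C(:));   % correct answers lie on the diagonal
```

Rows sum to the number of tokens in each true class, which is a handy sanity check.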

Extra credit: Can you do better if you first classify man/woman/child, and then use vowel-specific mean vectors and covariance matrices estimated separately for the group chosen by the man/woman/child classification?

2. "Honest" classification (separate training and test sets)

Use "Leave one (speaker) out" cross-validation. That is, calculate the mean vectors and covariance matrices for 10 classes based on 75 of the speakers, and test on the remaining ("left out") speaker. Do this leaving out each of the 76 speakers in turn. Calculate the classification performance (percent correct) across the whole experiment.
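The loop structure might look something like the following sketch, again on made-up data: 4 hypothetical speakers, 2 classes, 2 tokens per class per speaker. With the real table, the speaker vector would presumably come from column 3 (PB{3}), on the assumption that speaker numbers are unique across men, women and children; check that before relying on it.

```matlab
% Toy data; NOT the real measurements.
X = [1 2; 2 1; 8 9; 9 8;         % speaker 1
     3 3; 2 3; 10 10; 9 10;      % speaker 2
     1 1; 2 2; 8 8; 9 9;         % speaker 3
     3 1; 1 3; 10 8; 8 10];      % speaker 4
labels  = repmat([1 1 2 2]', 4, 1);     % class index for each row
speaker = kron((1:4)', ones(4, 1));     % speaker index for each row
K = 2;

C = zeros(K, K);                        % confusion matrix summed over all folds
ids = unique(speaker);
for s = ids'
    isTest  = (speaker == s);           % rows of the left-out speaker
    isTrain = ~isTest;                  % rows of everyone else
    Xtest   = X(isTest, :);
    d2 = zeros(size(Xtest, 1), K);
    for k = 1:K
        Xk  = X(isTrain & (labels == k), :);   % training rows of class k only
        muk = mean(Xk, 1);
        Sk  = cov(Xk);
        dev = bsxfun(@minus, Xtest, muk);
        d2(:, k) = sum((dev / Sk) .* dev, 2);  % squared Mahalanobis distance
    end
    [~, predicted] = min(d2, [], 2);
    truth = labels(isTest);
    for n = 1:numel(truth)
        C(truth(n), predicted(n)) = C(truth(n), predicted(n)) + 1;
    end
end
pct_correct = 100 * trace(C) / sum(C(:));
```

The key point is that the means and covariances are re-estimated inside the loop, so the left-out speaker's tokens never contribute to the statistics they are tested against.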

Make a confusion matrix for this test as well.

Extra credit: Try the two-stage classification in this case.

Extra extra credit (to be done if all the above is obvious and easy to you -- note that we haven't taught you any of the following yet, though we're happy to talk with you about it if you're curious):

A. Recast the classification task in (2) in probabilistic terms.

B. Try some other classification techniques on this same data. We haven't taught you these yet, but maybe you know about them, or can look them up on the web: Gaussian mixture models; support vector machines; whatever.
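For item A, one common reading of "probabilistic terms" (an assumption about what is intended, not a stated answer) is to treat each class k as a multivariate Gaussian with the mean vector and covariance matrix already estimated, and to choose the class with the highest posterior probability. The symbols d (number of measurements per token) and P(k) (prior probability of class k) are introduced here for illustration.

```latex
\begin{align*}
\log p(x \mid k) &= -\tfrac{1}{2}\,(x-\mu_k)^{\top}\Sigma_k^{-1}(x-\mu_k)
                    \;-\; \tfrac{1}{2}\log\lvert\Sigma_k\rvert
                    \;-\; \tfrac{d}{2}\log(2\pi) \\
\hat{k}(x) &= \arg\max_{k}\,\bigl[\,\log p(x \mid k) + \log P(k)\,\bigr]
\end{align*}
```

The first term is minus one half of the squared Mahalanobis distance, so under this reading the minimum-distance rule amounts to ignoring the log-determinant and prior terms.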