Audio and Prompts Discussions

Using alternate lexicons
User: colbec
Date: 3/14/2008 12:21 pm
Views: 6876
Rating: 37

I have successfully used the tutorial and howto with a couple of grammars now, and got thinking about alternate lexicons. The HTKBook mentions a list called BEEP, which might be more suitable for my speech patterns given that the source is UK, so I downloaded the list and absorbed it into my database. I see that there are differences, including in some of the phonemes: the voxforge lexicon knows about 'el' and 'en' but BEEP does not, and BEEP knows about 'ea', 'ia', 'oh', and 'ua', which are foreign to the voxforge list.

My question is whether there are any gotchas to look out for in using "foreign" lexicons with the processes admirably laid out by voxforge? I'm only using my own voice for specialist grammars right now, and building from scratch.

--- (Edited on 3/14/2008 12:21 pm [GMT-0500] by colbec) ---

Re: Using alternate lexicons
User: kmaclean
Date: 3/14/2008 1:34 pm
Views: 259
Rating: 37

Hi Colbec,

>the voxforge lexicon knows about 'el' and 'en' but BEEP does not, and BEEP
>knows about 'ea', 'ia', 'oh', and 'ua' which are foreign to the voxforge list.

The HTK-based acoustic model creation process does not really care what you call your phonemes. You could rename all the phones in either the BEEP dictionary or the VoxForge dictionary, using combinations of new characters to represent phonemes, and as long as you consistently use these representations in your grammars, recognition using HTK or Julius will work fine. There is no intrinsic 'value' to a particular choice of characters to represent a particular sound as a phoneme.
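To illustrate the renaming point, here is a minimal Python sketch. It assumes a lexicon held in memory as (word, phone list) pairs; the entries, the `rename_phones` helper and the replacement symbols are all made up for illustration and are not taken from the VoxForge or BEEP files.

```python
def rename_phones(entries, mapping):
    """Apply one renaming map to every pronunciation in the lexicon.

    entries: list of (word, [phone, ...]) pairs.
    mapping: dict from old phone symbol to new phone symbol.
    Phones that are not in the map are left unchanged.
    """
    return [(word, [mapping.get(p, p) for p in phones])
            for word, phones in entries]


# Hypothetical entries and an arbitrary set of new symbols.
entries = [("HUT", ["hh", "ah", "t"]), ("TEA", ["t", "iy"])]
mapping = {"hh": "x1", "ah": "x2", "t": "x3", "iy": "x4"}

renamed = rename_phones(entries, mapping)
for word, phones in renamed:
    print(word, " ".join(phones))
```

The only thing that matters is that the same map is applied everywhere the phones appear, so that 't' becomes 'x3' in every word and nowhere keeps its old name.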

What matters is which sounds you want to identify in a particular language and which words should get which sounds. A phoneme is an arbitrary assignment of an identifier to a particular sound (or more than one sound) that makes up a word.

So the fact that BEEP knows about 'ea', 'ia', 'oh', and 'ua' and the VoxForge pronunciation dictionary does not may be irrelevant, since the VoxForge dictionary might use different phoneme 'tags' or 'identifiers' to signify the exact same sounds. In VoxForge, the same sounds might be represented by 'ee', 'ya', 'oo' and 'uia' (for illustration only, these are not real phonemes).

Therefore, you need to know which sounds a particular pronunciation dictionary refers to before you can compare it to another.

The VoxForge dictionary originates from the unstressed version of the CMU Pronouncing Dictionary, which uses the following 39 phonemes:

Phoneme Example Translation
------- ------- -----------
AA odd AA D
AE at AE T
AH hut HH AH T
AO ought AO T
AW cow K AW
AY hide HH AY D
B be B IY
CH cheese CH IY Z
D dee D IY
DH thee DH IY
EH Ed EH D
ER hurt HH ER T
EY ate EY T
F fee F IY
G green G R IY N
HH he HH IY
IH it IH T
IY eat IY T
JH gee JH IY
K key K IY
L lee L IY
M me M IY
N knee N IY
NG ping P IH NG
OW oat OW T
OY toy T OY
P pee P IY
R read R IY D
S sea S IY
SH she SH IY
T tea T IY
TH theta TH EY T AH
UH hood HH UH D
UW two T UW
V vee V IY
W we W IY
Y yield Y IY L D
Z zee Z IY
ZH seizure S IY ZH ER
(Note: VoxForge uses the same phones, but in lower-case form) 

You need to compare these with BEEP's phonemes to judge whether BEEP is better for your purposes.
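A first step in that comparison could be to list which phone symbols each dictionary actually uses. The Python sketch below assumes each lexicon line is a word, optionally followed by a bracketed output symbol, followed by whitespace-separated phones; the `phone_inventory` helper and the sample lines are invented for illustration, so check them against the real files. It only compares symbols, so you still have to work out which sounds the unmatched symbols stand for.

```python
def phone_inventory(lines):
    """Collect the set of phone symbols used in a list of lexicon lines."""
    phones = set()
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rest = fields[1:]  # drop the word itself
        if rest and rest[0].startswith("["):
            rest = rest[1:]  # drop an optional output symbol such as [WORD]
        phones.update(p.lower() for p in rest)
    return phones


# Made-up sample lines, not copied from either dictionary.
voxforge_lines = ["BOTTLE [BOTTLE] b aa t el", "BUTTON [BUTTON] b ah t en"]
beep_lines = ["BEER b ia", "HOT h oh t"]

vf = phone_inventory(voxforge_lines)
beep = phone_inventory(beep_lines)
only_vf = vf - beep
only_beep = beep - vf

print("only in first lexicon: ", sorted(only_vf))
print("only in second lexicon:", sorted(only_beep))
```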

>My question is whether there are any gotchas to look out for in using
>"foreign" lexicons with the processes admirably laid out by voxforge
>processes?

I guess I've already talked about the main gotcha above.  Others are:

  • You need to think about whether you really need another dictionary to improve recognition, or whether you just need to add alternate "UK" pronunciations to your current one (e.g. VoxForge).  If you look through the VoxForge dictionary, you will see numerous instances where there are multiple possible pronunciations for a given word (these are tagged with a number in parentheses).
  • Licensing: I think the main reason I avoided BEEP was licensing. As far as I recall, it was only usable for non-commercial purposes, or was not compatible with the GPL in some other way.
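On the first point, adding alternate pronunciations to an existing dictionary can be sketched in a few lines of Python. This assumes the convention that the second and later variants of a word are written as WORD(2), WORD(3) and so on; the helper names and the sample pronunciations are illustrative only, and the exact column layout of the VoxForge lexicon file is not reproduced here.

```python
def add_pronunciation(lexicon, word, phones):
    """Add a pronunciation for word unless it is already listed.

    lexicon: dict mapping a word to a list of pronunciations (tuples of phones).
    """
    variants = lexicon.setdefault(word, [])
    phones = tuple(phones)
    if phones not in variants:
        variants.append(phones)


def format_entries(lexicon):
    """Render the lexicon as lines, numbering variants from (2) upward."""
    lines = []
    for word in sorted(lexicon):
        for i, phones in enumerate(lexicon[word], start=1):
            label = word if i == 1 else f"{word}({i})"
            lines.append(f"{label} {' '.join(phones)}")
    return lines


# Hypothetical example: one existing entry plus a "UK" alternate.
lexicon = {}
add_pronunciation(lexicon, "TOMATO", ["t", "ah", "m", "ey", "t", "ow"])
add_pronunciation(lexicon, "TOMATO", ["t", "ah", "m", "aa", "t", "ow"])
add_pronunciation(lexicon, "TOMATO", ["t", "ah", "m", "ey", "t", "ow"])  # duplicate, ignored

output_lines = format_entries(lexicon)
print("\n".join(output_lines))
```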

Hope that helps,

Ken

--- (Edited on 3/14/2008 2:34 pm [GMT-0400] by kmaclean) ---

Re: Using alternate lexicons
User: colbec
Date: 3/14/2008 3:14 pm
Views: 533
Rating: 32
Yes, thanks, that helps a lot. I did note that the CMU list of phonemes numbers only 39 and that the BEEP license is restrictive, but I do not have commercial goals in mind and wanted to see what a different list would offer. The BEEP lexicon has 257,059 entries so it is a larger list and might offer me more consistency (less chance of OOV issues) until I get more familiar with phonemes and the confidence to create my own lexicon. It's all very interesting, thanks.
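For the OOV concern, a quick check of which grammar or prompt words are missing from a given lexicon might look like the Python sketch below. It assumes lexicon words are compared case-insensitively; the `find_oov` helper and the word lists are made up for illustration.

```python
def find_oov(words, lexicon_words):
    """Return the words not found in the lexicon, upper-cased, in first-seen order."""
    known = {w.upper() for w in lexicon_words}
    seen = set()
    missing = []
    for w in words:
        key = w.upper()
        if key not in known and key not in seen:
            seen.add(key)
            missing.append(key)
    return missing


# Hypothetical prompt and lexicon word list.
prompt = "dial the colour code and the colour name".split()
lexicon_words = ["DIAL", "THE", "COLOR", "CODE", "AND", "NAME"]

oov = find_oov(prompt, lexicon_words)
print(oov)
```

Running the same check against two candidate lexicons gives a rough idea of how much coverage each one offers for a particular grammar.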

--- (Edited on 3/14/2008 3:14 pm [GMT-0500] by colbec) ---

Re: Using alternate lexicons
User: jimregan
Date: 11/28/2008 7:17 pm
Views: 2489
Rating: 3

Adding the pronunciation differences would be quite easy to automate; however, a separate lexicon for non-American speakers of English would still be required because of spelling differences -- two, at least, because Canadian spelling is somewhere between American and that of the other English-speaking countries.

 

--- (Edited on 11/28/2008 7:17 pm [GMT-0600] by ) ---
