VoxForge
I have now successfully used the tutorial and howto with a couple of grammars, and got thinking about alternate lexicons. The HTKBook mentions a list called BEEP, which might suit my speech patterns better given that its source is the UK, so I downloaded the list and absorbed it into my database. I see that there are differences, including some different phonemes: the voxforge lexicon knows about 'el' and 'en' but BEEP does not, and BEEP knows about 'ea', 'ia', 'oh', and 'ua' which are foreign to the voxforge list.
My question is whether there are any gotchas to look out for in using "foreign" lexicons with the processes admirably laid out by voxforge? I'm only using my own voice for specialist grammars right now, and building from scratch.
--- (Edited on 3/14/2008 12:21 pm [GMT-0500] by colbec) ---
Hi Colbec,
>the voxforge lexicon knows about 'el' and 'en' but BEEP does not, and BEEP
>knows about 'ea', 'ia', 'oh', and 'ua' which are foreign to the voxforge list.
The HTK-based acoustic model creation process does not really care what you call your phonemes. You could rename all the phones in either the BEEP dictionary or the VoxForge dictionary, using any combination of new characters to represent each phoneme, and as long as you use these representations consistently in your grammars, recognition using HTK or Julius will work fine. There is no intrinsic 'value' to a particular choice of characters to represent a particular sound as a phoneme.
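To illustrate the renaming point, here is a minimal sketch that relabels the phones in a single dictionary entry. It assumes an HTK-style line of the form `WORD [OUTPUT] phone phone ...`; the function name `rename_phones`, the sample entry and the replacement labels are all made up for illustration and are not part of any VoxForge or HTK tool.

```python
def rename_phones(entry, mapping):
    """Relabel the phones in one dictionary line.

    Assumes the line looks like: WORD [OUTPUT] phone phone ...
    (the [OUTPUT] field is optional). Phones missing from the
    mapping are left unchanged.
    """
    word, *rest = entry.split()
    renamed = [word]
    for token in rest:
        if token.startswith("[") and token.endswith("]"):
            # Output symbol, not a phone: copy it through untouched.
            renamed.append(token)
        else:
            renamed.append(mapping.get(token, token))
    return " ".join(renamed)


# Hypothetical relabelling: the new names are arbitrary on purpose.
mapping = {"hh": "p01", "ah": "p02", "t": "p03"}
print(rename_phones("HUT [HUT] hh ah t", mapping))
```

The same mapping would have to be applied to every file that mentions phones (dictionary, phone lists, phone-level transcriptions), otherwise the labels no longer line up.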
What matters is which sounds you want to identify in a particular language and which words should get which sounds. A phoneme is an arbitrary assignment of an identifier to a particular sound (or more than one sound) that makes up a word.
So the fact that BEEP knows about 'ea', 'ia', 'oh', and 'ua' and the VoxForge pronunciation dictionary does not may be irrelevant, since the VoxForge dictionary might use different phoneme 'tags' or 'identifiers' to signify the exact same sounds. In VoxForge, the same sounds might be represented by 'ee', 'ya', 'oo' and 'uia' (for illustration only, these are not real phonemes).
Therefore, you need to know which sounds a particular pronunciation dictionary refers to before you can compare it to another.
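As a starting point for that comparison, a short script can at least list which phone labels appear in one lexicon but not the other. This is only a sketch: it assumes one entry per line, the word in the first field, an optional `[OUTPUT]` field, and `#` for comment lines. The sample entries below are invented for illustration and are not copied from either dictionary. Note that it compares labels only; it cannot tell you whether two different labels stand for the same sound.

```python
def phone_set(lines):
    """Collect the set of phone labels used in a lexicon."""
    phones = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        fields = line.split()
        for token in fields[1:]:  # fields[0] is the word itself
            if token.startswith("[") and token.endswith("]"):
                continue  # optional output symbol, not a phone
            phones.add(token.lower())
    return phones


# Invented sample entries, for illustration only.
voxforge_sample = ["BOTTLE [BOTTLE] b aa t el", "BUTTON [BUTTON] b ah t en"]
beep_sample = ["BEAR b ea", "BEER b ia", "BOTTLE b oh t l"]

vf = phone_set(voxforge_sample)
bp = phone_set(beep_sample)
only_vf = sorted(vf - bp)
only_bp = sorted(bp - vf)
print("only in first lexicon: ", only_vf)
print("only in second lexicon:", only_bp)
```

With real files you would pass an open file object instead of the sample lists, e.g. `phone_set(open(path))`.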
The VoxForge dictionary is derived from the unstressed version of the CMU Pronouncing Dictionary, which uses the following 39 phonemes:
Phoneme Example Translation
------- ------- -----------
AA      odd     AA D
AE      at      AE T
AH      hut     HH AH T
AO      ought   AO T
AW      cow     K AW
AY      hide    HH AY D
B       be      B IY
CH      cheese  CH IY Z
D       dee     D IY
DH      thee    DH IY
EH      Ed      EH D
ER      hurt    HH ER T
EY      ate     EY T
F       fee     F IY
G       green   G R IY N
HH      he      HH IY
IH      it      IH T
IY      eat     IY T
JH      gee     JH IY
K       key     K IY
L       lee     L IY
M       me      M IY
N       knee    N IY
NG      ping    P IH NG
OW      oat     OW T
OY      toy     T OY
P       pee     P IY
R       read    R IY D
S       sea     S IY
SH      she     SH IY
T       tea     T IY
TH      theta   TH EY T AH
UH      hood    HH UH D
UW      two     T UW
V       vee     V IY
W       we      W IY
Y       yield   Y IY L D
Z       zee     Z IY
ZH      seizure S IY ZH ER
(Note: VoxForge uses the same phones, but in lower-case form)
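For what it is worth, going from a stressed CMU-style pronunciation to the unstressed, lower-case form is a small transformation. The sketch below assumes stress is marked by a trailing digit (0, 1 or 2) on vowel phones, as in the CMU dictionary; the function name `to_unstressed_lower` is hypothetical and this is not the script VoxForge actually used.

```python
def to_unstressed_lower(pronunciation):
    """Strip trailing stress digits and lower-case each phone.

    Example input: "HH AH1 T" (CMU-style, stress digit on the vowel).
    """
    return [phone.rstrip("012").lower() for phone in pronunciation.split()]


print(" ".join(to_unstressed_lower("HH AH1 T")))
print(" ".join(to_unstressed_lower("S IY1 ZH ER0")))
```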
You need to compare these with BEEP's phonemes to judge whether BEEP is better for your purposes.
>My question is whether there are any gotchas to look out for in using
>"foreign" lexicons with the processes admirably laid out by voxforge?
I guess I've already talked about the main gotcha above. Others are:
Hope that helps,
Ken
--- (Edited on 3/14/2008 2:34 pm [GMT-0400] by kmaclean) ---
--- (Edited on 3/14/2008 3:14 pm [GMT-0500] by colbec) ---
Adding the pronunciation differences would be quite easy to automate; however, a separate lexicon for non-American speakers of English would still be required because of spelling differences -- two, at least, because Canadian spelling is somewhere between American and that of the other English-speaking countries.
--- (Edited on 11/28/2008 7:17 pm [GMT-0600] by ) ---