VoxForge
Hello all,
I hope to make it easier to improve
a phonetic dictionary with g2p4j.
I welcome advice and anyone
wishing to help with this project:
http://g2p4j.sourceforge.net/
I am posting here instead of continuing
the thread
audio-discussions
medical-technical-language-voice-files
since my interest for now is
understanding phonetic dictionaries.
I have compared machine-made dictionaries
of technical words. I find that
Sequitur G2P and Festival introduce
different errors. Both require a high
percentage of manual corrections.
Sequitur lacks the schwa ax.
Could this be added?
Technical words follow the letter-to-phoneme
model of general English poorly.
Examples:
The prefix a- meaning "not" (Greek) should be ey, not ah:
abacterial (ey/ah) b ae k t ih r iy ax l
The o vowel ow should not shift to ah:
adrenocortical ah d r eh n (ow/ah) k ao r t ah k ah l
Although it may be overly ambitious,
I use a tag for word alternatives.
The same dictionary can be sorted
to give preference to a region or
patched and sorted to adapt a general
acoustic model to an individual.
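As a rough illustration of the idea, here is a minimal Python sketch.
It assumes that the (ey/ah) notation in the examples above marks a slot
with alternative phonemes, first alternative preferred; this is not
g2p4j code, and the function names expand and to_dict_lines are
hypothetical. Numbering the variants word(2), word(3) is also an
assumption, borrowed from Sphinx-style dictionaries.

```python
import itertools
import re

# Assumed notation: "(ey/ah)" is one slot holding alternative phonemes.
ALT = re.compile(r"\(([^()]+)\)")

def expand(pron, prefer=None):
    """Return every pronunciation encoded in `pron`.

    `prefer` is an optional set of phonemes to move to the front of
    their slot, e.g. to favour a region or an individual speaker.
    """
    slots = []
    for token in pron.split():
        match = ALT.fullmatch(token)
        options = match.group(1).split("/") if match else [token]
        if prefer:
            # Stable sort: preferred phonemes first, original order otherwise.
            options = sorted(options, key=lambda p: p not in prefer)
        slots.append(options)
    return [" ".join(combo) for combo in itertools.product(*slots)]

def to_dict_lines(word, pron, prefer=None):
    """Format the variants as 'word', 'word(2)', ... dictionary lines."""
    lines = []
    for i, variant in enumerate(expand(pron, prefer)):
        head = word if i == 0 else f"{word}({i + 1})"
        lines.append(f"{head} {variant}")
    return lines
```

Calling to_dict_lines("abacterial", "(ey/ah) b ae k t ih r iy ax l")
gives the ey variant first; passing prefer={"ah"} puts the ah variant
first instead, which is what I mean by sorting the same dictionary
for a region or a speaker.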
I would appreciate any references
that detail the role of alternate
phonemes for the same word in
large vocabulary speaker independent
acoustic model creation.
I hope a flexible dictionary format
with conversion tools can be useful.
Best Wishes
paradocs
--- (Edited on 8/2/2009 7:44 am [GMT-0500] by paradocs) ---
> Sequitur lacks the schwa ax. Could this be added?
There is no ax in the CMU-40 phoneset, and I don't think it makes sense to add it. It would not improve recognition accuracy significantly; otherwise it would have been added long ago, along with ix and other unstressed variants. The original cmudict does have stress information, though.
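If a dictionary already contains ax, one option is to fold the reduced vowels into full vowels before using it with a CMU-style phoneset. A minimal sketch, assuming the common folding ax to ah and ix to ih; the mapping table and the name fold_reduced are illustrative, not part of Sequitur or cmudict:

```python
# Assumed folding of reduced vowels into full vowels (illustrative only).
FOLD = {"ax": "ah", "ix": "ih"}

def fold_reduced(pron, mapping=FOLD):
    """Replace reduced-vowel symbols in a space-separated pronunciation."""
    return " ".join(mapping.get(phone, phone) for phone in pron.split())
```

For example, fold_reduced("ey b ae k t ih r iy ax l") replaces the final ax with ah and leaves the other phones unchanged.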
--- (Edited on 8/2/2009 3:28 pm [GMT-0500] by nsh) ---
Hi Paradocs,
>appreciate any references
>that detail the role of alternate
>phonemes for the same word
There might be some useful information in the HTK book, section 12.7 Constructing a Dictionary.
Googling: "speech recognition pronunciation dictionary alternate" brings up many papers, here are a few:
Ken
--- (Edited on 8/6/2009 10:57 am [GMT-0400] by kmaclean) ---
Hi nsh
Thanks for the information on the CMU-40
reduced phoneme set. Overcomplicating
a technical lexicon is not productive,
as is also shown here:
Enhanced Tree Clustering with Single Pronunciation
Dictionary for Conversational Speech Recognition
>Instead of adding a variant in the dictionary, we can
>keep the dictionary unchanged, and either augment the
>mixture model of AX with Gaussians from the mixture
>model of IX (as in [1]), or simply tie these two models
>together.
>In summary, due to the complex interaction between
>lexicon and acoustic modeling, adding pronunciation
>variants should be exercised with great care.
Hi kmaclean
Great! I have already studied the HTK book and others
but seeing things in different ways really helps.
I love all the math but, as is often the case, the
student paper [ J. Burdick ] is a fine starting place.
Now to work.
--- (Edited on 8/7/2009 12:51 am [GMT-0500] by paradocs) ---