VoxForge
Hi kmaclean,
I am using Julius as my SRE. It is recognising my utterances almost correctly.
1) I want to make it "really speaker independent". Please suggest how to do that.
2) I want to suppress the utterance Julius predicts when what I said is not in the acoustic model (vocabulary and grammar). How can I achieve this?
3) How can I improve recognition of two different utterances that sound almost the same?
Thanks a lot for your contributions to all the queries.
--- (Edited on 3/25/2009 12:24 am [GMT-0500] by vishwabalu) ---
Hi vishwabalu,
>1) I want to make it "really speaker independent". Please suggest how to do that.
You likely need to train your acoustic model with more speech, from many different speakers... how much speech are you using?
See Arthur Chan's discussion here: Speech Recognition Engine comparison
>2) I want to suppress the utterance Julius predicts when what I said is not in the acoustic model (vocabulary and grammar). How can I achieve this?
Just ignore it in your application... the predicted utterance comes with a probability score, so only accept those utterances that score above a minimum threshold. See this post: HVite log for more info on log scores.
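To illustrate the thresholding idea, here is a minimal sketch of what the application-side check could look like. It assumes your application has already pulled the recognized words and one confidence value per word (between 0 and 1) out of the recognizer output; the function name accept_hypothesis and the 0.6 default are made up for illustration, and the threshold needs tuning on your own data. If you only have log scores, compare against a log threshold in the same way.

```python
from typing import Optional, Sequence


def accept_hypothesis(
    words: Sequence[str],
    confidences: Sequence[float],
    min_confidence: float = 0.6,  # assumed value; tune on your own recordings
) -> Optional[str]:
    """Return the recognized sentence, or None if it should be rejected.

    The hypothesis is rejected when it is empty, when the number of
    confidence values does not match the number of words, or when any
    single word scores below min_confidence.
    """
    if not words or len(words) != len(confidences):
        return None
    if min(confidences) < min_confidence:
        return None
    return " ".join(words)
```

Rejecting on the weakest word is a deliberately strict choice; averaging the word scores is a more lenient alternative if too many good utterances get thrown away.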
>3) How can I improve recognition of two different utterances that sound almost the same?
Not sure I understand, are you asking how to recognize 2 different words that sound the same?
Ken
--- (Edited on 3/29/2009 2:53 pm [GMT-0400] by kmaclean) ---
Hi vishwabalu,
>3)Yes. To recognize 2 different words that sound the same.
More speech will help.
You can look at context, i.e. only try to recognize the similar-sounding words in different grammar contexts. This is a poor example, but it helps to illustrate what you need to work towards: to distinguish "threw" (past tense of the verb "throw") from the preposition "through", recognize each in the context of other words: "he threw the ball" vs "he walked through the door".
For similar sounding proper names, etc, your best bet is likely to put logic in your application to ask the user to clarify which one he means.
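A rough sketch of that kind of clarification logic, assuming you maintain your own table of names the recognizer tends to confuse. Everything here is hypothetical: resolve_name, the confusable table, and the ask callback, which stands in for whatever prompt your application uses and returns the index of the user's choice.

```python
from typing import Callable, Dict, List


def resolve_name(
    recognized: str,
    confusable: Dict[str, List[str]],
    ask: Callable[[List[str]], int],
) -> str:
    """Map a recognized name to the entry the user actually meant.

    confusable maps a lower-cased recognized name to the list of
    similar-sounding entries it could stand for. The user is only
    prompted when there is more than one candidate.
    """
    candidates = confusable.get(recognized.lower(), [])
    if len(candidates) < 2:
        # Nothing to disambiguate: use the single candidate, or the
        # recognized text itself when the name is not in the table.
        return candidates[0] if candidates else recognized
    choice = ask(candidates)
    if not 0 <= choice < len(candidates):
        raise ValueError("choice out of range")
    return candidates[choice]
```

For example, with a table like {"john": ["John Smith", "Jon Smyth"]}, a recognized "john" would trigger one question to the user, while a name that is not in the table passes straight through.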
HTK also allows you to assign probabilities to words in your dictionary (i.e. floating point confidence score - see 6.2.1 HTK Label Files of the HTK book) - maybe one pronunciation of the word is more likely than the other. Not sure if Julius uses these though...
Ken
--- (Edited on 3/30/2009 1:16 pm [GMT-0400] by kmaclean) ---