VoxForge
Hi,
I am developing a speech recognition system for an English speech-to-text module, using HTK. I followed the steps on this site and built a system, but the recognition is not reliable: it sometimes recognizes a word correctly, and at other times fails to recognize the same word. How can I improve the accuracy?
If the speaker is female, do we need to change any parameters in the config file?
--- (Edited on 7/13/2009 12:20 am [GMT-0500] by dulanie) ---
Hi dulanie,
>it sometimes recognize words correctly but some times the same words
>are not.
It might be because of the CMN (cepstral mean normalization) parameter. From this post (One word grammar, always recognized?):
Your problem[...] is likely the result of running Julian without a CMN parameter.
When you first start up Julian, you should see a notice like this:
------------- System Info end -------------
************************************************************
* NOTICE: The first input may not be correctly recognized *
* since no CMN parameter is available on startup. *
************************************************************
This is telling you that Julian takes the cepstral mean of the last 5 seconds of speech as the initial cepstral mean at the beginning of each input. In other words, Julian needs an average (the cepstral mean) over the previous 5 seconds of speech in order to recognize speech. That is why, in its default configuration, Julian never recognizes what you say for the first few utterances, while it tries to figure out this average.
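To make the idea concrete, here is a minimal Python sketch of running cepstral mean subtraction. It is not Julian's actual implementation: the function name `running_cmn` and the `window` of 500 frames (roughly 5 seconds if frames are 10 ms apart, which is an assumption) are made up for illustration.

```python
from collections import deque


def running_cmn(frames, window=500):
    """Subtract a running cepstral mean from each frame.

    frames: list of cepstral vectors (lists of floats), one per frame.
    window: number of recent frames to average over (hypothetical value;
            about 5 seconds of speech at an assumed 10 ms frame shift).
    """
    history = deque(maxlen=window)  # only the most recent frames are kept
    normalized = []
    for frame in frames:
        if history:
            # Average each cepstral coefficient over the recent history.
            mean = [
                sum(past[i] for past in history) / len(history)
                for i in range(len(frame))
            ]
        else:
            # No earlier speech yet, so there is nothing to subtract.
            mean = [0.0] * len(frame)
        normalized.append([x - m for x, m in zip(frame, mean)])
        history.append(frame)
    return normalized
```

In this sketch the very first frame has no history to average over, so it passes through unnormalized. That is the same kind of situation the startup notice warns about: until some speech has been seen, there is no mean to work with.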
You can get around this by using "-cmnsave filename" to record a representative average for your environment, and then using "-cmnload filename" and "-cmnnoupdate" to use the CMN you saved rather than recalculating it on the fly. Theoretically, your confidence scores should start looking reasonable, and you should be able to determine whether a word is in your grammar or not.
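The idea of saving a mean once and reusing it unchanged can be sketched in Python as follows. This only illustrates the concept behind -cmnsave, -cmnload and -cmnnoupdate: the helper names and the JSON file are hypothetical, and this is not the file format Julian actually writes.

```python
import json


def cepstral_mean(frames):
    """Average each cepstral coefficient over a set of calibration frames."""
    dim = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames) for i in range(dim)]


def save_mean(mean, path):
    """Store the mean on disk (JSON here is an assumption for illustration)."""
    with open(path, "w") as fh:
        json.dump(mean, fh)


def load_mean(path):
    """Read back a previously saved mean."""
    with open(path) as fh:
        return json.load(fh)


def apply_fixed_cmn(frames, mean):
    """Subtract the same saved mean from every frame, without updating it."""
    return [[x - m for x, m in zip(frame, mean)] for frame in frames]
```

With a fixed mean, the first utterance is normalized the same way as every later one, so there is no warm-up period. The trade-off is that the saved mean only fits the microphone and room it was recorded in.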
Ken
--- (Edited on 7/13/2009 11:47 am [GMT-0400] by kmaclean) ---
Dear Ken,
Thank you very much for helping me. Actually, though, I didn't use Julian or any Julius version; I used only the HTK tools to develop my system. I did follow the 10 steps given on the site to train the system, but I don't know how to use Julian or Julius.
Can you please help me overcome the problem?
Regards,
Dulanie
--- (Edited on 7/14/2009 1:37 am [GMT-0500] by dulanie) ---
Hi Dulanie,
>But i followed the 10 steps given in the site to train the system. I dont
>know how to use julian or julius versions.
The eleventh step tells you how to use Julius/Julian... I've found that I get better recognition with Julius/Julian than with HTK, and have not used HTK that much.
Ken
--- (Edited on 7/14/2009 9:45 am [GMT-0400] by kmaclean) ---
Hi dulanie,
I am new to speech recognition. I have followed all 10 steps; do we really need the 11th step to get the text content? Can't we extract text from speech without using Julius? Please guide me. Thanks in advance.
My aim is to get the text content of a .wav file without grammar definition files (.dfa and .dict). Any help is appreciated.
--- (Edited on 7/20/2009 3:50 pm [GMT-0500] by mikeljoe) ---
Hi mikeljoe,
I think I am in the same situation as you.
I tried Julius and could recognize up to 60% of the wave file accurately and get the corresponding text.
To improve on that, I am now planning to use the HTK tools for recognition.
Please let me know if you have found any clues.
--- (Edited on 10/3/2009 2:45 am [GMT-0500] by Visitor) ---
Hi dulanie,
I think I am in the same situation as you.
I tried Julius and could recognize up to 60% of the wave file accurately and get the corresponding text.
To improve on that, I am now planning to use the HTK tools for recognition.
Please let me know if you have found any clues.
By the way, do you know what changes are required when dealing with a female voice?
--- (Edited on 10/3/2009 2:48 am [GMT-0500] by Visitor) ---