VoxForge
I'm building a new voice grammar and I repeatedly get an error in step 10 when making hmm13. I have tried both the script way and the hard old-school tutorial ;-)
I have even tried just running it with the original data and voice, but the same error message occurs every time:
ERROR [+2662] FindProtoModel: no proto for p in hSet
FATAL ERROR - Terminating program HHEd
making hmm14
ERROR [+5010] InitSource: Cannot open source file ./interim_files/tiedlist
ERROR [+7010] InitHMMSet: Can't open list file ./interim_files/tiedlist
ERROR [+2321] Initialise: MakeHMMSet failed
FATAL ERROR - Terminating program HERest
making hmm15
ERROR [+5010] InitSource: Cannot open source file ./interim_files/tiedlist
ERROR [+7010] InitHMMSet: Can't open list file ./interim_files/tiedlist
ERROR [+2321] Initialise: MakeHMMSet failed
FATAL ERROR - Terminating program HERest
cp: cannot stat `./interim_files/hmm15/hmmdefs': No such file or directory
cp: cannot stat `./interim_files/tiedlist': No such file or directory
Additionally, I thought of replacing the old julius-3.5.2 with the new julius-3.5.3. There should not be any problems in just doing this, or are there?
--- (Edited on 3/ 1/2007 10:04 am [GMT-0600] by Visitor) ---
Hi Sonny,
I am thinking that you might have either added a word to your new grammar which does not have any acoustic data, or you do not have enough audio in your training set (since you have created a new grammar), and you need to add audio for words that contain the phone 'p'. Usually, you need about 30-40 lines of speech audio, with 5-10 words each. The set must also be 'phonetically balanced', which in this case means you need at least 3 audio samples for each phone in your grammar - the output from the HDMan command in step 2 will give you an idea of your phone counts.
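To make the "at least 3 samples per phone" check concrete, here is a minimal Python sketch. It is not part of the tutorial: the function names, the toy pronunciations and the toy prompts are all made up for illustration, and it assumes each prompt has already been reduced to a plain line of words (no file-name prefix).

```python
from collections import Counter

def phone_counts(prompts, pronunciations):
    """Count how often each phone occurs across all prompt lines."""
    counts = Counter()
    for line in prompts:
        for word in line.upper().split():
            # Words missing from the dictionary are silently skipped here;
            # in a real run they would need to be reported and added.
            counts.update(pronunciations.get(word, []))
    return counts

def under_trained(counts, phones, minimum=3):
    """Return the phones that occur fewer than `minimum` times, sorted."""
    return sorted(p for p in phones if counts[p] < minimum)

# Hypothetical toy data, not taken from the VoxForge lexicon.
pronunciations = {
    "CALL": ["k", "ao", "l"],
    "PHONE": ["f", "ow", "n"],
    "STEVE": ["s", "t", "iy", "v"],
}
prompts = ["call steve", "phone steve", "call steve"]

counts = phone_counts(prompts, pronunciations)
all_phones = {p for phones in pronunciations.values() for p in phones}
print(under_trained(counts, all_phones))
```

Any phone this prints would need more recorded words before training. The HDMan output from step 2 remains the authoritative count.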
The error message seems to be saying that your 'Tree.hed' file has a 'question' for the phone 'p', and that HHEd cannot find this phone in your monophone Acoustic Model (the one you built in steps 1 - 9). To test this theory out, you might try re-running step 10 with a Tree.hed file that does not have any 'p' questions in it. For example, the Tutorial Tree.hed file contains (note: QS = question):
QS "R_Stop" { *+p,*+pd,*+b,*+t,*+td,*+d,*+dd,*+k,*+kd,*+g }
QS "R_C-Front" { *+p,*+pd,*+b,*+m,*+f,*+v,*+w }
try updating it (temporarily) to:
QS "R_Stop" { *+b,*+t,*+td,*+d,*+dd,*+k,*+kd,*+g }
QS "R_C-Front" {*+b,*+m,*+f,*+v,*+w }
and see if it runs. If you can complete Step 10 without the 'p' question, then you need to find out which words in your new grammar contain the phone 'p', and add new speech audio that contains these same words, and re-train your acoustic model.
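If there are many questions to edit, the temporary change can be scripted. The following is only a sketch in Python: `drop_phones` and `base_phone` are names invented here, and it assumes every question sits on one line in the `QS "name" { items }` form shown above.

```python
import re

# One question per line: QS "name" { item,item,... }
QS_LINE = re.compile(r'^(\s*QS\s+"[^"]+"\s*)\{(.*)\}\s*$')

def base_phone(item):
    """Return the phone named by a context pattern such as '*+p' or 'p-*'."""
    return item.replace("*", "").strip("+-")

def drop_phones(line, phones):
    """Remove every item for the given phones from one QS line.

    Lines that are not QS lines are returned unchanged.
    """
    match = QS_LINE.match(line)
    if not match:
        return line
    head, body = match.groups()
    items = [i.strip() for i in body.split(",") if i.strip()]
    kept = [i for i in items if base_phone(i) not in phones]
    return f"{head}{{ {','.join(kept)} }}"

line = 'QS "R_Stop" { *+p,*+pd,*+b,*+t,*+td,*+d,*+dd,*+k,*+kd,*+g }'
print(drop_phones(line, {"p", "pd"}))
```

Items are compared as whole phones, so asking for 'p' alone leaves '*+pd' in place; the example passes both to match the hand edit above. The sketch does not handle a question that ends up with no items at all, which would have to be removed by hand.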
Other ideas:
1. It might be a good idea to go through your log files - even for commands that look like they ran OK, and look for warnings that might indicate the problem. HTK is notorious for letting early step commands seem to run OK, only to cause problems in later steps, but without giving a clear indication what the problem might be.
2. Another approach would be to search the HTK archives for "no proto for" error messages. You might have to download the whole archive and search it locally; it is well worth the effort to do so.
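For idea 1, a small script can pull every warning and error out of a log so that nothing scrolls past unnoticed. This is a rough Python sketch; the regular expression is based only on the message shapes quoted in this thread, and `scan_log` is a name made up for the example.

```python
import re

# Matches HTK-style messages such as
#   ERROR [+2662] FindProtoModel: no proto for p in hSet
#   WARNING [-2631] QuestionCommand: No items for question L_Affricate
MESSAGE = re.compile(r"(WARNING|ERROR)\s+\[([+-]\d+)\]\s+(\w+):\s*(.*)")

def scan_log(lines):
    """Return (level, code, function, text) for every message found."""
    found = []
    for line in lines:
        # search() rather than match(), so a message that is fused onto the
        # end of a previous "in HHEd" fragment is still picked up.
        match = MESSAGE.search(line)
        if match:
            level, code, function, text = match.groups()
            found.append((level, int(code), function, text.strip()))
    return found

sample = [
    "ERROR [+2662] FindProtoModel: no proto for p in hSet",
    "FATAL ERROR - Terminating program HHEd",
    " in HHEdWARNING [-2631] QuestionCommand: No items for question L_Affricate",
]
for message in scan_log(sample):
    print(message)
```

To run it over a real log, pass an open file instead of the sample list, for example `scan_log(open(path))` with `path` pointing at whichever log you want to check.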
Let me know how you make out,
Ken
--- (Edited on 3/ 1/2007 2:20 pm [GMT-0500] by kmaclean) ---
In the step-by-step guide I tried to do as you suggested, but without success. The error message I got after removing the p's from the R_Stop and R_C-Front questions was exactly the same as before. During the run there were also a lot of warnings:
WARNING [-2631] QuestionCommand: No items for question L_Affricate
 in HHEd
WARNING [-2631] QuestionCommand: No items for question L_Affricate
 in HHEd
I believe these warnings were also there before; I might just not have noticed them.
Furthermore, I have extended my prompts file to 30 different sentences; every word is used at least 7 times and the sentences are generally 8 words long. I tried the scripted version again and got the same error message.. :o(
I have looked in the logs (I assume you mean dlog, flog and HVite_log), but there do not seem to be any abnormalities as far as I can see. They look precisely like the ones in the tutorial.
Now I will look through the archive and hope for the best..
--- (Edited on 3/ 2/2007 6:20 am [GMT-0600] by Visitor) ---
I have solved the problem.. Thank you for your effort in helping me..
Problem and solution:
Problem -> The problem was that the grammar I had designed did not include any words that use the phone 'p'. Even though I did not use any p-words, HTK wanted the phone covered by the grammar anyway. After I fixed the problem for 'p', it began to complain about other phones, for example 'ou', so I realised what the problem was and how to solve it.
Solution -> The solution was to add words to the dictionary that use the missing phones. Simply find them in the lexicon file, add them to the .voca file, and update your .grammar file to include the words. After this, update the prompts file with these words and make audio files that include the new words (you do not even have to actually make the audio files for it to work, so you can test whether it helps with your problem before spending a lot of time on recording). When this is done, the automatic grammar scripts run smoothly.. :o)
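Finding candidate words in the lexicon can be automated. Below is a minimal Python sketch; it assumes each lexicon line has the form `WORD [WORD] phone phone ...`, and both the helper name `words_with_phone` and the sample lines are invented for illustration.

```python
def words_with_phone(lexicon_lines, phone):
    """Return the words whose pronunciation contains `phone`."""
    words = []
    for line in lexicon_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        word, rest = fields[0], fields[1:]
        # Skip the optional bracketed output symbol, e.g. [PAPER].
        if rest[0].startswith("["):
            rest = rest[1:]
        if phone in rest:
            words.append(word)
    return words

# Hypothetical sample lines in the assumed lexicon format.
sample_lexicon = [
    "PAPER [PAPER] p ey p er",
    "BOAT [BOAT] b ow t",
    "STOP [STOP] s t aa p",
]
print(words_with_phone(sample_lexicon, "p"))
```

Phones are compared as whole tokens, so searching for 'p' does not match a phone such as 'pd'. Pick words from the result that fit your grammar and add them to the .voca and prompts files as described above.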
Hoping that this can help others with related problems..
What I do not understand is why your first suggestion did not work in practice. I mean, erasing the -p in the suggested file must also exclude it in the executing file and therefore eliminate the problem, but apparently not.
Well, problem solved and I'm happy now.. :)
--- (Edited on 3/ 2/2007 9:34 am [GMT-0600] by Visitor) ---
>What I do not understand is why your first suggestion did not work in practice. I mean erasing the -p in the suggested file must also exclude this in the executing file and therefore eliminate the problem, but apparently not.
I think I figured out what might be going on here ...
In Step 10, you execute the HDMan command against the entire lexicon file, not just the dictionary based on the words in your grammar, which you used up to Step 9. So even if you erase the 'p' questions from the tree.hed file, HHEd will still fail, because it will find references to 'p' phones that come from the larger lexicon file, even though your grammar file does not contain the 'p' phone in it.
So back to the original error:
"no proto for p in hSet".
Basically, the HHEd command in step 10 (the program named in your error log) is trying to create new 'tied-state' hmms based on the much bigger lexicon file (you want tied-states because you don't have enough training data for all the triphones in the lexicon).
But you only recorded audio for the words in your grammar file. And then you created monophone (steps 1-8) and triphone (step 9) hmms for only the phones included in your grammar file, which does not include the phone 'p'.
HHEd in step 10 works from the Lexicon file, which includes many more words not included in your grammar file, comes across a word that has the phone 'p', tries to find it in your triphone hmms, and then spits out the error message "no proto for p in hSet", because your grammar never had that phone.
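If that theory is right, the phones that will trigger the error are simply those that appear in the lexicon but not in the trained models. A rough Python sketch of that comparison follows; the function names and sample data are hypothetical, and the lexicon format `WORD [WORD] phone phone ...` is an assumption.

```python
def lexicon_phones(lexicon_lines):
    """Collect every phone used by any pronunciation in the lexicon."""
    phones = set()
    for line in lexicon_lines:
        fields = line.split()[1:]  # drop the word itself
        if fields and fields[0].startswith("["):
            fields = fields[1:]    # drop the bracketed output symbol
        phones.update(fields)
    return phones

def missing_phones(lexicon_lines, trained):
    """Return lexicon phones that have no trained model, sorted."""
    return sorted(lexicon_phones(lexicon_lines) - set(trained))

# Hypothetical data: a tiny lexicon and the phones a small grammar covered.
tiny_lexicon = [
    "CALL [CALL] k ao l",
    "PHONE [PHONE] f ow n",
    "PAPER [PAPER] p ey p er",
]
trained_phones = ["k", "ao", "l", "f", "ow", "n", "sil", "sp"]
print(missing_phones(tiny_lexicon, trained_phones))
```

An empty result would suggest the cause lies elsewhere; a non-empty one lists the phones that still need words in the grammar and audio in the training set.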
that is what I think is happening,
Ken
--- (Edited on 3/ 2/2007 4:02 pm [GMT-0500] by kmaclean) ---
This is interesting because I would suspect I have the same problem, but my error is:
Creating HMMset using trees to add unseen triphones
ERROR [+2662] FindProtoModel: no proto for o in hSet
FATAL ERROR - Terminating program ./HTKTools/HHEd
--- (Edited on 5/27/2010 3:39 pm [GMT-0500] by joshuajnoble) ---
Had to check my fulllist file, had an 'o' in it b/c I had c/p'd it from a different project...Sheesh :)
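A stray entry like this can be caught before running HHEd by comparing the fulllist against the monophone list. The Python sketch below is only an illustration: the helper names and sample lists are made up, it assumes one model name per line in the `left-centre+right` form, and it checks only the centre phone of each entry.

```python
def centre_phone(name):
    """Return the centre phone of a model name such as 'k-ao+l'."""
    name = name.split("-", 1)[-1]  # drop the left context, if any
    return name.split("+", 1)[0]   # drop the right context, if any

def stray_entries(fulllist, monophones):
    """Return fulllist entries whose centre phone has no monophone model."""
    known = set(monophones)
    entries = (line.strip() for line in fulllist)
    return [n for n in entries if n and centre_phone(n) not in known]

# Hypothetical sample data.
sample_fulllist = ["k-ao+l", "ao+l", "sil", "o", "f-ow+n"]
sample_monophones = ["k", "ao", "l", "f", "ow", "n", "sil", "sp"]
print(stray_entries(sample_fulllist, sample_monophones))
```

Anything reported here is a candidate for a "no proto for ... in hSet" error and is worth checking by hand.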
--- (Edited on 5/27/2010 4:26 pm [GMT-0500] by joshuajnoble) ---