VoxForge
Hi,
I was wondering how you handle this step.
I am working on a speech recognition system for Polish. I use whole words as the smallest unit rather than phonemes, though I guess that doesn't matter here. When I finish my recordings, I create a .lab file for every wav file to describe the region in which each word occurs.
For example:
24232670 29628710 alfa
32660890 36014850 na
38873760 44752480 bravo
These numbers are in 100 ns units, the time unit HTK uses in label files.
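To make the label format concrete, here is a minimal Python sketch that reads lines like the ones above and converts the times to seconds. It assumes HTK's 100 ns label time unit; the function name parse_lab and the constant name are made up for illustration and are not part of HTK.

```python
# Assumption: HTK label times are in units of 100 ns, i.e. 10,000,000 units per second.
HTK_UNITS_PER_SECOND = 10_000_000


def parse_lab(text):
    """Parse 'start end word' label lines into (start_s, end_s, word) tuples."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end, word = int(parts[0]), int(parts[1]), parts[2]
        segments.append(
            (start / HTK_UNITS_PER_SECOND, end / HTK_UNITS_PER_SECOND, word)
        )
    return segments


example = """24232670 29628710 alfa
32660890 36014850 na
38873760 44752480 bravo"""

for start_s, end_s, word in parse_lab(example):
    print(f"{word}: {start_s:.3f} s to {end_s:.3f} s")
```

Under that assumption, the three words fall within the first few seconds of the recording, which is a quick sanity check that the label times match the audio.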
I read about it in the HTK Book, but in the VoxForge tutorial I don't see any step that describes this. Does this mean there is a way to avoid creating .lab files (this is the most time-consuming step for me)? And how would HTK know at what time in the recording each word occurs, so that it can create robust acoustic models? Are prompts or MLF files alone enough?
Please give some details.
Regards!
Peter
--- (Edited on 9/11/2012 7:51 pm [GMT-0500] by maxio89) ---
You can use a flat start (with HCompV), as the VoxForge tutorial does:
http://www.voxforge.org/home/dev/acousticmodels/linux/create/htkjulius/tutorial/monophones/step-6
Then you don't need lab files, only the prompts.
If you have lab files or MLF files (a master label file collects many lab files into a single file), you can use them instead with HInit/HRest. The HTK Book describes this process. The second way is more accurate, but it does take some work to create the lab files.
You have a choice.
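If you go the label route, the per-file labels can be collected into one master label file. The following Python sketch shows the MLF layout as described in the HTK Book: a #!MLF!# header, then for each recording a quoted pattern line, its label lines, and a line holding a single period. The function name build_mlf and the file name sample1 are made up for illustration.

```python
def build_mlf(labels):
    """Build MLF text from a dict mapping a recording name to its label lines."""
    lines = ["#!MLF!#"]  # header line that marks a master label file
    for name in sorted(labels):
        lines.append(f'"*/{name}.lab"')  # pattern naming the label file
        lines.extend(labels[name])       # the label lines for this recording
        lines.append(".")                # a single period ends each entry
    return "\n".join(lines) + "\n"


# Hypothetical recording name; the label lines are the ones from the question.
labels = {
    "sample1": [
        "24232670 29628710 alfa",
        "32660890 36014850 na",
        "38873760 44752480 bravo",
    ],
}

print(build_mlf(labels), end="")
```

For a flat start the entries hold only the words from the prompts, one per line, without start and end times.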
--- (Edited on 9/21/2012 21:06 [GMT+0400] by nsh) ---