VoxForge
The first step in Hidden Markov Model ("HMM") training is to define a prototype model called "proto". The aim here is to create the model structure; the parameter values are not important. Create a file called proto in your 'voxforge/tutorial' directory containing the following:
~o <VecSize> 25 <MFCC_0_D_N_Z>
~h "proto"
<BeginHMM>
  <NumStates> 5
  <State> 2
    <Mean> 25
      0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    <Variance> 25
      1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
  <State> 3
    <Mean> 25
      0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    <Variance> 25
      1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
  <State> 4
    <Mean> 25
      0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
    <Variance> 25
      1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
  <TransP> 5
    0.0 1.0 0.0 0.0 0.0
    0.0 0.6 0.4 0.0 0.0
    0.0 0.0 0.6 0.4 0.0
    0.0 0.0 0.0 0.7 0.3
    0.0 0.0 0.0 0.0 0.0
<EndHMM>
For details of what all this means, see the HTK book.
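Typing 150 numbers by hand is error-prone, so the prototype can also be generated. The following is a minimal sketch, not part of the original tutorial: the function name make_proto and its parameters are hypothetical, and the transition values are simply copied from the prototype shown above.

```python
def make_proto(vec_size=25, target_kind="MFCC_0_D_N_Z", name="proto"):
    """Return the text of a 5-state prototype HMM with zero means and unit variances."""
    zeros = " ".join(["0.0"] * vec_size)
    ones = " ".join(["1.0"] * vec_size)
    lines = [
        f"~o <VecSize> {vec_size} <{target_kind}>",
        f'~h "{name}"',
        "<BeginHMM>",
        "  <NumStates> 5",
    ]
    # Only states 2, 3 and 4 carry mean and variance vectors in the prototype.
    for state in (2, 3, 4):
        lines += [
            f"  <State> {state}",
            f"    <Mean> {vec_size}",
            f"      {zeros}",
            f"    <Variance> {vec_size}",
            f"      {ones}",
        ]
    # Transition matrix copied from the prototype in the tutorial text.
    trans = [
        [0.0, 1.0, 0.0, 0.0, 0.0],
        [0.0, 0.6, 0.4, 0.0, 0.0],
        [0.0, 0.0, 0.6, 0.4, 0.0],
        [0.0, 0.0, 0.0, 0.7, 0.3],
        [0.0, 0.0, 0.0, 0.0, 0.0],
    ]
    lines.append("  <TransP> 5")
    lines += ["    " + " ".join(f"{p:.1f}" for p in row) for row in trans]
    lines.append("<EndHMM>")
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file called proto in 'voxforge/tutorial' should give the same content as the hand-written version.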
You also need a configuration file. Create a file called config in your 'voxforge/tutorial' directory containing the following:
TARGETKIND = MFCC_0_D_N_Z
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
Note: the target kind in your proto file (the "MFCC_0_D_N_Z" on the first line) must match the TARGETKIND in your config file.
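A mismatch between the two files is easy to overlook, so it can be worth checking before training. The sketch below is an illustration only: the function names are hypothetical, and it assumes the config file uses one "NAME = VALUE" setting per line and that the '~o' line of proto has the form shown in this tutorial.

```python
import re


def config_target_kind(config_text):
    """Return the TARGETKIND value from config text, or None if it is absent."""
    for line in config_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "TARGETKIND":
            return value.strip()
    return None


def proto_target_kind(proto_text):
    """Return the target kind named on the '~o' line of a prototype, or None."""
    for line in proto_text.splitlines():
        if line.lstrip().startswith("~o"):
            # Tags on this line look like <VecSize> 25 <MFCC_0_D_N_Z>;
            # everything except VecSize is treated as the target kind.
            tags = re.findall(r"<([^>]+)>", line)
            kinds = [t for t in tags if t.upper() != "VECSIZE"]
            return kinds[0] if kinds else None
    return None


def kinds_match(config_text, proto_text):
    """True only when both files name the same target kind."""
    kind = config_target_kind(config_text)
    return kind is not None and kind == proto_target_kind(proto_text)
```

Reading the two files with open(...).read() and passing their contents to kinds_match gives a quick yes/no answer.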
You also need to tell HTK where your feature vector files are located (these are the mfcc files you created in the last step). You do this with an HTK script file, so create a file called train.scp.
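If you have many feature files, the script file can be generated rather than typed. This is a rough sketch under stated assumptions: the folder name "mfcc" and the ".mfc" extension are hypothetical (use whatever you chose in the previous step), the function names are my own, and the script file is written with one path per line.

```python
from pathlib import Path


def scp_text(feature_files):
    """Return script-file text: one feature file path per line, sorted."""
    paths = sorted(Path(p).as_posix() for p in feature_files)
    return "".join(f"{p}\n" for p in paths)


def write_train_scp(feature_dir="mfcc", scp_path="train.scp", pattern="*.mfc"):
    """List the feature files in feature_dir and write them to scp_path.

    feature_dir and pattern are assumptions; adjust them to your own layout.
    """
    files = list(Path(feature_dir).glob(pattern))
    Path(scp_path).write_text(scp_text(files))
    return len(files)
```

Calling write_train_scp() from the 'voxforge/tutorial' directory would write train.scp and return the number of files it listed.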
The next step is to create a new folder called hmm0.
Then create a new version of proto in the hmm0 folder using the HTK HCompV tool, as follows:
HCompV -A -D -T 1 -C config -f 0.01 -m -S train.scp -M hmm0 proto
This creates two files in the hmm0 folder: proto and vFloors.
To build the hmmdefs file, do the following for each phone:
- put the phone in double quotes;
- add '~h ' before the phone (note the space after the '~h'); and
- copy from line 5 onwards (i.e. from "<BEGINHMM>" to "<ENDHMM>") of the hmm0/proto file and paste it after the phone.
Leave one blank line at the end of your file.
This creates the hmmdefs file, which contains "flat start" monophones.
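The copy-and-paste steps for hmmdefs are repetitive, so they can be scripted. The following is a minimal sketch rather than the tutorial's own method: the function names are hypothetical, the phone list is assumed to come from a file with one phone per line, and "<BEGINHMM>" and "<ENDHMM>" are assumed to sit on lines of their own in hmm0/proto. It searches for "<BEGINHMM>" instead of relying on it being line 5.

```python
def hmm_body(proto_text):
    """Return the lines from <BEGINHMM> through <ENDHMM> of a prototype."""
    lines = proto_text.splitlines()
    upper = [ln.strip().upper() for ln in lines]
    start = upper.index("<BEGINHMM>")  # raises ValueError if the tag is missing
    end = upper.index("<ENDHMM>")
    return lines[start:end + 1]


def build_hmmdefs(phones, proto_text):
    """Return hmmdefs text: one copy of the prototype body per phone."""
    body = hmm_body(proto_text)
    out = []
    for phone in phones:
        out.append(f'~h "{phone}"')  # phone in double quotes, '~h ' in front
        out.extend(body)
    # End with one blank line, as the instructions require.
    return "\n".join(out) + "\n\n"
```

To use it, read the phone names and the text of hmm0/proto, then write the result of build_hmmdefs to hmm0/hmmdefs.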
The final step in this section is to create a new file called macros in your 'voxforge/tutorial/hmm0' folder.
It should look something like this when you have finished:
~o
Next, create nine new folders in your 'voxforge/tutorial' folder, named consecutively from hmm1 to hmm9.
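The folders can be created in one go. This is a small sketch with hypothetical function names; it assumes you run it from your 'voxforge/tutorial' directory, or pass that directory as base.

```python
from pathlib import Path


def model_dir_names(first=1, last=9):
    """Return the folder names hmm<first> to hmm<last>, in order."""
    return [f"hmm{i}" for i in range(first, last + 1)]


def make_model_dirs(base="."):
    """Create hmm1 to hmm9 under base; folders that already exist are left alone."""
    created = []
    for name in model_dir_names():
        folder = Path(base) / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```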
The flat start monophones are re-estimated using the HERest tool. This loads all the models in the hmm0 folder (contained in the hmmdefs file), re-estimates them using the MFCC files listed in the train.scp script, and creates a new model set in hmm1. Execute the HERest command from your 'voxforge/tutorial' directory:
HERest -A -D -T 1 -C config -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm0/macros -H hmm0/hmmdefs -M hmm1 monophones0
The files created by this command are hmm1/macros and hmm1/hmmdefs.
This process is repeated two more times, creating new model sets in hmm2 and hmm3, respectively:
HERest -A -D -T 1 -C config -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm1/macros -H hmm1/hmmdefs -M hmm2 monophones0
The files created by this command are hmm2/macros and hmm2/hmmdefs.
HERest -A -D -T 1 -C config -I phones0.mlf -t 250.0 150.0 1000.0 -S train.scp -H hmm2/macros -H hmm2/hmmdefs -M hmm3 monophones0
The files created by this command are hmm3/macros and hmm3/hmmdefs.
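The three HERest commands differ only in their source and destination folders, so the sequence can be expressed as a loop. The sketch below is an illustration with hypothetical function names; it builds the same argument lists as the commands above and only runs them when asked, which assumes HERest is on your PATH and that you are in the 'voxforge/tutorial' directory.

```python
import subprocess


def herest_command(src_dir, dst_dir):
    """Return the HERest argument list that re-estimates src_dir into dst_dir."""
    return [
        "HERest", "-A", "-D", "-T", "1",
        "-C", "config",
        "-I", "phones0.mlf",
        "-t", "250.0", "150.0", "1000.0",
        "-S", "train.scp",
        "-H", f"{src_dir}/macros",
        "-H", f"{src_dir}/hmmdefs",
        "-M", dst_dir,
        "monophones0",
    ]


def reestimate(first=0, last=3, run=False):
    """Build the commands for hmm<first> -> ... -> hmm<last>; run them if run=True."""
    commands = [herest_command(f"hmm{i}", f"hmm{i + 1}") for i in range(first, last)]
    if run:
        for cmd in commands:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, check=True)
    return commands
```

With the defaults, reestimate() returns the three commands for hmm0 to hmm1, hmm1 to hmm2, and hmm2 to hmm3 without running anything; passing run=True executes them in order.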