VoxForge
1. First, create your Julius configuration file. Copy this sample configuration file (sample.jconf) to your 'voxforge/manual' folder. For details on the parameters in the Sample.jconf file, see the Juliusbook. The main parameters are shown below:
# VoxForge configurations:
-dfa sample.dfa          # finite state automaton grammar file
-v sample.dict           # pronunciation dictionary
-h hmm15/hmmdefs         # acoustic HMM (ascii or Julius binary)
-hlist tiedlist          # HMMList to map logical phone to physical
-smpFreq 16000           # sampling rate (Hz)
-spmodel "sp"            # name of a short-pause silence model
-multipath               # force enable MULTI-PATH model handling
-gprune safe             # Gaussian pruning method
-iwcd1 max               # Inter-word triphone approximation method
-iwsppenalty -70.0       # transition penalty for the appended sp models
-iwsp                    # append a skippable sp model at all word ends
-penalty1 5.0            # word insertion penalty for grammar (pass1)
-penalty2 20.0           # word insertion penalty for grammar (pass2)
-b2 200                  # beam width on 2nd pass (#words)
-sb 200.0                # score beam envelope threshold
-n 1                     # num of sentences to find

# you may need to adjust your "-lv" value to prevent the recognizer inadvertently
# recognizing non-speech sounds:
-lv 4000                 # level threshold (0-32767)

# comment these out for debugging:
-logfile julius.log
-quiet
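Before starting Julius, it can help to confirm that the files named in the configuration actually exist in your folder. The following is a minimal Python sketch, not part of Julius or VoxForge: the function names `parse_jconf` and `missing_files` are hypothetical, and the set of file-valued options is an assumption based only on the sample configuration above.

```python
import shlex
from pathlib import Path

# Options in the sample configuration whose value is a file path.
# This set is an assumption for illustration, not an exhaustive list.
FILE_OPTIONS = {"-dfa", "-v", "-h", "-hlist"}


def _is_option(token):
    """True for tokens like "-dfa"; False for numbers such as "-70.0"."""
    return len(token) > 1 and token[0] == "-" and token[1].isalpha()


def parse_jconf(text):
    """Return {option: value or None} from jconf-style text."""
    options = {}
    for line in text.splitlines():
        # shlex drops '#' comments and unquotes values such as "sp".
        tokens = shlex.split(line, comments=True)
        i = 0
        while i < len(tokens):
            name = tokens[i]
            i += 1
            if not _is_option(name):
                continue  # stray value with no option in front of it
            value = None
            # The next token is this option's value unless it is itself
            # an option; flags such as -multipath therefore get None.
            if i < len(tokens) and not _is_option(tokens[i]):
                value = tokens[i]
                i += 1
            options[name] = value
    return options


def missing_files(options, base_dir="."):
    """List (option, path) pairs whose file is absent under base_dir."""
    base = Path(base_dir)
    return [
        (name, value)
        for name, value in options.items()
        if name in FILE_OPTIONS and value and not (base / value).exists()
    ]
```

For example, calling `missing_files(parse_jconf(open("Sample.jconf").read()), "voxforge/manual")` should return an empty list if the grammar, dictionary, acoustic model, and tiedlist are all in place.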
2. Make sure your microphone volume is similar to what it was when you created your audio files. Then run Julius with:
C:>julius-4.3.1 -input mic -C Sample.jconf
Example of your Julius startup output.
The first 2-3 seconds of your speech will not be recognized while Julius adjusts its recognition levels (that is what the startup message "no CMN parameter is available on startup" refers to). In addition, Julius will only recognize phrases from the grammar you created in Step 1.
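If Julius reacts to background noise at this point, the "-lv" comment in the configuration suggests raising the level threshold. One rough way to pick a starting value is to record a few seconds of room noise and measure its loudest sample. The Python sketch below is a heuristic, not something taken from the Julius documentation: it assumes a 16-bit PCM WAV file, assumes that "-lv" is on the same scale as 16-bit sample amplitude (as the 0-32767 range suggests), and uses a hypothetical function name `peak_level`.

```python
import array
import sys
import wave


def peak_level(wav_path):
    """Return the largest absolute sample value in a 16-bit PCM WAV file."""
    with wave.open(wav_path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return max((abs(s) for s in samples), default=0)
```

If `peak_level("noise.wav")` (a hypothetical recording of silence in your room) comes out well below your "-lv" setting, the threshold is probably not the cause of false triggers; if it is close to or above it, try a higher "-lv" value.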
Example of what the Julius recognition output looks like when I say "Phone Steve" into my microphone.
You should get fair recognition results. To improve recognition, your Acoustic Model needs more audio training data. You need to create new prompts, and record more speech audio files based on these prompts in order to create better acoustic models. You can speed up the training process by using the Acoustic Model creation script in the How-to (i.e. How-to Create an Acoustic Model - using a script).