VoxForge
Julian finds the best-fitting sentence in the grammar every time I speak. Even when there is only noise, it still returns a result.
How can I tell whether a result came from a purposeful utterance or from noise?
Should I use score1? I found that the score for a purposeful utterance and the score for noise are nearly the same. Where is the boundary?
Is there another way?
Thanks
Result for noise
--------------------------------------
pass1_best: <s> <n><num>1</num>ZYF</n>
pass1_best_wordseq: 0 2
pass1_best_phonemeseq: sil | jh ow r er n f aa
pass1_best_score: -10684.339844
length: 318 frames (1.06 sec.)
### Recognition: 2nd pass (RL heuristic best-first with DFA)
samplenum=318
sentence1: <s> <n><num>3</num>ZMY</n> </s>
wseq1: 0 2 1
phseq1: sil | jh aa ng m ae ng y iy | sil
cmscore1: 1.000 0.639 1.000
score1: -13599.010742
6 generated, 6 pushed, 4 nodes popped in 318
Result for purposeful voice
---------------------------------------------------------
pass1_best: <s> <n><num>1</num>ZYF</n>
pass1_best_wordseq: 0 2
pass1_best_phonemeseq: sil | jh ow r er n f aa
pass1_best_score: -12398.662109
length: 386 frames (1.28 sec.)
### Recognition: 2nd pass (RL heuristic best-first with DFA)
samplenum=386
sentence1: <s> <n><num>2</num>LDH</n> </s>
wseq1: 0 2 1
phseq1: sil | y ow b aa hh aa | sil
cmscore1: 1.000 1.000 1.000
score1: -14228.536133
8 generated, 8 pushed, 4 nodes popped in 386
Hi manio,
>How do I know a result is generated from a purposeful voice or a noise?
See this thread, specifically the post dealing with "creating a grammar with a few Out-of-Vocabulary words". It discusses using an out-of-vocabulary grammar with a one-word grammar, but the same approach should help in your case too.
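As a complement to the grammar approach, it may be worth looking at the cmscore1 line rather than score1. In the two outputs quoted in the question, the noise result has one word confidence of 0.639, while the purposeful utterance has 1.000 for every word. The Python sketch below parses the recognizer's text output and applies a cutoff to the lowest word confidence. The function names and the 0.8 threshold are assumptions made for illustration, not Julian settings, and two samples are far too few to pick a reliable cutoff; it would need tuning on your own recordings. The per-frame score is included only because score1 grows with utterance length, so dividing by samplenum makes utterances of different lengths easier to compare.

```python
import re

def parse_result(text):
    """Pull cmscore1, score1 and samplenum out of recognizer output text.

    Returns None if any of the three fields is missing.
    """
    cm = re.search(r"^cmscore1:\s*(.+)$", text, re.MULTILINE)
    sc = re.search(r"^score1:\s*(-?\d+(?:\.\d+)?)", text, re.MULTILINE)
    fr = re.search(r"^samplenum=(\d+)", text, re.MULTILINE)
    if not (cm and sc and fr):
        return None
    return {
        "cmscores": [float(x) for x in cm.group(1).split()],
        "score": float(sc.group(1)),
        "frames": int(fr.group(1)),
    }

def looks_like_speech(result, min_confidence=0.8):
    """Accept only if every word confidence reaches the cutoff.

    min_confidence=0.8 is a hypothetical value, chosen for illustration.
    """
    return min(result["cmscores"]) >= min_confidence

def score_per_frame(result):
    """Length-normalised score: score1 divided by the number of frames."""
    return result["score"] / result["frames"]
```

With the outputs quoted in the question, the noise result would be rejected (lowest confidence 0.639) and the purposeful one accepted (all 1.000), but that says nothing yet about how well the cutoff generalises.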
Another thing that might help is a better acoustic model. This is especially important when you are trying to recognize short words (like numbers).
If you created your own acoustic model, then you will need more training utterances. You could also take the VoxForge acoustic model and adapt it to your voice (using your training utterances).
You could also submit your text and speech recordings to VoxForge, or record some speech with the speech submission application. Submissions are incorporated into the VoxForge acoustic model nightly build, usually that same night (but not always ...).
Ken