VoxForge
Hi
What perplexity value is good enough for an LVCSR task? From the HTK Book:
"Lower perplexities represent better language models,
although this simply means that they ‘model language better’, rather than necessarily work better
in speech recognition systems"
But which exact values count as good? And what role does perplexity play in recognition performance?
--- (Edited on 4/24/2009 12:02 am [GMT-0500] by Rauf) ---
I am not a language model expert, but from discussions with people who do language modeling I understand that perplexity always depends on the task. By task I mean the dictionary and the set of test prompts. Only perplexities computed on the same task are comparable: if one language model has a lower perplexity than another (and the difference is large enough), it can be considered better and should generally lead to a lower word error rate.
I understand perplexity (hopefully correctly) as the average number of words that can follow a given word, as predicted by the LM. The lower this number, the easier the recognition.
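To make that intuition concrete, here is a minimal Python sketch of the usual definition: perplexity as the exponential of the average negative log-probability per word. The function name `perplexity` and the probability lists are made up for illustration; a real toolkit would get the per-word probabilities from the LM itself.

```python
import math

def perplexity(word_probs):
    """Perplexity of a word sequence, given the LM probability of each word."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    # Average negative log-probability per word (cross-entropy in nats).
    avg_neg_log = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(avg_neg_log)

# Hypothetical LM that always spreads its probability evenly over 10 words:
# the perplexity comes out at about 10, i.e. "10 words can follow".
uniform = [0.1] * 50
print(perplexity(uniform))

# Hypothetical LM that is more confident about the words that actually occur:
# higher probabilities give a lower perplexity.
confident = [0.5, 0.25, 0.5, 0.25]
print(perplexity(confident))
```

So a perplexity of 10 means the LM is, on average, as uncertain as if it had to choose uniformly among 10 words at each position.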
Grammar-based tasks can have perplexities lower than 10; in LVCSR the number depends on vocabulary size, language, domain, etc. A friend who does language modeling told me that for a very large vocabulary (350k words) and the Czech language, the perplexity can be up to 800.
Here are some perplexities I have computed with the VoxForge corpus. The training and testing data were taken from here.
Trigrams, computed on testing data, no cutoff *)
perplexity 8.1436, var 2.6556, utterances 2985, words predicted 31338
num tokens 34323, OOV 0, OOV rate 0.00% (excl. </s>)
Trigrams, computed on training data, no cutoff
perplexity 56.9558, var 12.8088, utterances 2985, words predicted 29829
num tokens 34323, OOV 559, OOV rate 1.78% (excl. </s>)
Trigrams, computed on testing data, cutoff 1
perplexity 96.8261, var 11.4016, utterances 2985, words predicted 31338
num tokens 34323, OOV 0, OOV rate 0.00% (excl. </s>)
Trigrams, computed on training data, cutoff 1
perplexity 51.4681, var 15.0589, utterances 2985, words predicted 29829
num tokens 34323, OOV 559, OOV rate 1.78% (excl. </s>)
*) The cutoff is the n-gram count at or below which n-grams are excluded from the model; e.g. with cutoff 1, n-grams seen only once in the data are dropped.
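A rough Python sketch of what a cutoff does to the trigram counts. This is only an illustration, not what any particular toolkit does internally: the helper names `trigram_counts` and `apply_cutoff`, the toy sentences, and the `<s>`/`</s>` padding scheme are all assumptions. I also assume "no cutoff" corresponds to a cutoff of 0.

```python
from collections import Counter

def trigram_counts(sentences):
    """Count trigrams, padding each sentence with start/end markers."""
    counts = Counter()
    for words in sentences:
        padded = ["<s>", "<s>"] + words + ["</s>"]
        for i in range(len(padded) - 2):
            counts[tuple(padded[i:i + 3])] += 1
    return counts

def apply_cutoff(counts, cutoff):
    """Keep only n-grams seen more than `cutoff` times."""
    return {ngram: c for ngram, c in counts.items() if c > cutoff}

# Toy data, made up for illustration.
sentences = [
    "open the door".split(),
    "open the door".split(),
    "open the window".split(),
]
counts = trigram_counts(sentences)
kept = apply_cutoff(counts, cutoff=1)

# Trigrams seen only once (those involving "window") are dropped.
for ngram, c in sorted(counts.items()):
    status = "kept" if ngram in kept else "dropped"
    print(" ".join(ngram), c, status)
```

With cutoff 1 every singleton trigram disappears, so the model has to back off to bigrams or unigrams for those contexts.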
--- (Edited on 24.04.2009 10:19 [GMT+0200] by tpavelka) ---