VoxForge
I have some English audio lectures from my professor at school, and I want to use Julius as a continuous dictation ASR to create rough transcripts of them. The lectures contain technical terms that a general English language model/dictionary will not cover.
How should I start (from scratch)? What's existing in the public domain that I can use?
How do I create a language model? How do I train an acoustic model?
How much transcribed data do I need? How much training and testing data should I prepare?
--- (Edited on 3/1/2008 5:53 pm [GMT-0600] by nestea247) ---
> How should I start (from scratch)?
Download sphinx3 and try it. Then collect some recordings for adaptation (not more than half an hour). Collect a lot of text from your domain for the language model (at least 50 MB).
Train the language model with cmuclmtk or use the online tool:
http://www.speech.cs.cmu.edu/tools/lmtool.html
Adapt the wsj acoustic model to your speaker with MLLR:
http://www.speech.cs.cmu.edu/cmusphinx/moinmoin/AcousticModelAdaptation
Use both the adapted model and the language model with sphinx3.
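For the text collection step, here is a rough Python sketch of how the domain texts might be cleaned up before feeding them to cmuclmtk or the online tool. It assumes the usual convention of one utterance per line wrapped in <s> and </s> markers. The function names (normalize_line, prepare_corpus, oov_words) are made up for illustration and are not part of any Sphinx tool. It also lists the words missing from your pronunciation dictionary, which is where the technical terms from the lectures will show up.

```python
import re
from collections import Counter

def normalize_line(text):
    """Lowercase the text, keep only letters and apostrophes,
    and wrap it in <s> ... </s> markers (assumed corpus convention).
    Digits and punctuation are dropped, not spelled out."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return None
    return "<s> " + " ".join(words) + " </s>"

def prepare_corpus(lines):
    """Split raw lines into sentences and return
    (list of normalized sentences, Counter of word frequencies)."""
    sentences = []
    counts = Counter()
    for line in lines:
        # Naive sentence split on ., ! and ?
        for piece in re.split(r"[.!?]+", line):
            sentence = normalize_line(piece)
            if sentence is not None:
                sentences.append(sentence)
                # Count the words only, not the <s> and </s> markers
                counts.update(sentence.split()[1:-1])
    return sentences, counts

def oov_words(counts, dictionary):
    """Return the corpus words that are missing from the
    pronunciation dictionary (a set of lowercase words)."""
    return sorted(word for word in counts if word not in dictionary)

if __name__ == "__main__":
    raw = ["The eigenvalue is real. Is the matrix Hermitian?"]
    sentences, counts = prepare_corpus(raw)
    for sentence in sentences:
        print(sentence)
    # Hypothetical tiny dictionary, just for the demo
    print(oov_words(counts, {"the", "is", "real", "matrix"}))
```

Every word that oov_words reports needs a pronunciation added to the dictionary before the recognizer can output it.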
--- (Edited on 3/2/2008 5:22 am [GMT-0600] by nsh) ---
Thanks for your pointers. They are very helpful. I'm still trying to go through the tutorial and compile sphinx3.
Just curious, why did you suggest sphinx3 and not julius? Is it because sphinx3 has an existing wsj model and julius has none yet? I don't know, but I find julius a cleaner and easier-to-understand system than sphinx3.
Btw, does anyone know when will voxforge's 1.0 model be released?
--- (Edited on 3/4/2008 3:22 am [GMT-0600] by Visitor) ---
Hi nestea247,
>does anyone know when will voxforge's 1.0 model be released?
when we get 140 hours of English speech :)
See the metrics page to see how much further we have to go.
Ken
--- (Edited on 3/4/2008 1:16 pm [GMT-0500] by kmaclean) ---