>What would be the best way to expand the vocabulary of my local repository?
It depends on what you want to do: speaker-dependent or speaker-independent speech recognition? This CMU Sphinx page has lots of excellent info that applies to all speech recognition engines:
When you need to train:
- You want to create an acoustic model for a new language/dialect,
- OR you need a specialized model for a small-vocabulary application,
- AND you have plenty of data to train on (a quick way to total your audio is sketched after this list):
    - 1 hour of recordings for command and control for a single speaker
    - 5 hours of recordings of 200 speakers for command and control for many speakers
    - 10 hours of recordings for single-speaker dictation
    - 50 hours of recordings of 200 speakers for many-speaker dictation
- AND you have knowledge of the phonetic structure of the language,
- AND you have time to train the model and optimize its parameters (about 1 month).

When you don't need to train:
- You need to improve accuracy - do acoustic model adaptation instead.
- You don't have enough data - do acoustic model adaptation instead.
- You don't have enough time.
- You don't have enough experience.
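That "plenty of data" test is easy to check mechanically: total the duration of your recordings and compare it against the hour counts above. A minimal Python sketch, assuming your audio is 16kHz 16-bit .wav files under a directory named "recordings" (both assumptions are mine):

    import wave
    from pathlib import Path

    # Sum the duration of every .wav file under a directory tree.
    def total_hours(audio_dir):
        seconds = 0.0
        for path in Path(audio_dir).rglob("*.wav"):
            with wave.open(str(path), "rb") as w:
                seconds += w.getnframes() / w.getframerate()
        return seconds / 3600.0

    # e.g. ~1 hour is enough for single-speaker command and control,
    # ~10 hours for single-speaker dictation (see the list above).
    print("Total audio: %.1f hours" % total_hours("recordings"))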
>Do I just create a new "prompts" file with more sentences and then get a
>recording of those words?
Yes, you can do this, but you need to make sure that you add pronunciations for any words that are not in your pronunciation dictionary.
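For example, a prompts file pairs each recording's name with its transcription, and the pronunciation dictionary maps each word to its phonemes. The file names and entries below are made up for illustration (VoxForge's English lexicon uses CMUdict-style phones):

    prompts:
        sample001 IT IS RAINING TODAY
        sample002 OPEN THE SECOND WINDOW

    pronunciation dictionary:
        RAINING  R EY N IH NG
        WINDOW   W IH N D OW

A quick way to catch words that still need pronunciations is to diff the prompt vocabulary against the dictionary. A minimal Python sketch ("prompts" and "voxforge.dic" are placeholder file names):

    # Collect every word used in the prompts (skipping the recording name).
    prompt_words = set()
    with open("prompts") as f:
        for line in f:
            prompt_words.update(w.upper() for w in line.split()[1:])

    # Collect every word the pronunciation dictionary already covers.
    dict_words = set()
    with open("voxforge.dic") as f:
        for line in f:
            if line.strip():
                dict_words.add(line.split()[0].upper())

    print("Words needing pronunciations:", sorted(prompt_words - dict_words))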
>Is there .wavs or .mfcc files that I can just take from?
There's lots of audio here: http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/16kHz_16bit/
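Note that those downloads are the raw wavs; MFCC features are normally derived from the audio rather than distributed. As a rough sketch of that derivation in Python (this uses the third-party python_speech_features package and a placeholder file name, and yields an in-memory array, not the binary .mfc files that the HTK/Sphinx tools read and write):

    from scipy.io import wavfile
    from python_speech_features import mfcc

    # Read one 16kHz 16-bit wav and compute 13 mel-frequency
    # cepstral coefficients per 25ms frame (the library defaults).
    rate, signal = wavfile.read("sample001.wav")
    features = mfcc(signal, samplerate=rate)
    print(features.shape)  # (number_of_frames, 13)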
>I'm kind of confused in how to utilize the VoxForge Consortium properly.
VoxForge is just an Open Source project, not a consortium...
>Do you need me to do this too?
Depends what you are trying to do... see above