VoxForge
Here is a thread about another user's (very valid) opinion on the state of the VoxForge Submission System:
On 7/18/07, mahasamoot wrote:
Good god. Do you think you could make it any more of a hassle to contribute to your project? No wonder you don't have enough data yet.
On 7/19/07, Ken MacLean wrote:
thanks for the feedback!
--- (Edited on 7/22/2007 10:41 pm [GMT-0400] by kmaclean) ---
We need hundreds of hours of English speech ... this is a *long-term* project. The Sphinx group of recognizers uses about 140 hours of speech ... [but] they don't approach commercial-quality speech recognition.
... We have about 22 hours of English speech so far. But this is misleading, because I'm basically counting all submissions, whether it would make sense to include them or not (i.e. ... or the speech is not the right dialect for the first release of the corpus).
If you can get a few hours of transcribed audio, I can include it on the site.
A good first step might be to look at (or contact) http://thaispeech.longdo.org/.
--- (Edited on 7/22/2007 10:44 pm [GMT-0400] by kmaclean) ---
So is the main difference between Julius/Sphinx & ViaVoice/DNS the size or quality of the corpus? Do you think one of these systems has better underlying technology? Or some combination of the above?
What dialect(s) are you most interested in? I assume General American? Do you want only one dialect to start with? What's the right mix between speakers and sample size for, say, 200 hrs: 2 pl x 100 hrs, 4 pl x 50 hrs, 8 pl x 25 hrs, 16 pl x 12.5 hrs, 32 pl x 6.25 hrs, 64 pl x 3.125 hrs, 128 pl x 1.5 hrs, 256 pl x 40 min, 512 pl x 20 min, 1024 pl x 10 min, 2048 pl x 5 min, 4096 pl x 2.5 min, or 8192 pl x 1.25 min?
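For reference, the even-split arithmetic behind that question can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and it assumes every speaker records the same amount, which nobody in this thread has said is the plan. The exact figures come out a little different from the rounded ones quoted (256 speakers works out to about 46.9 minutes each, for example).

```python
TOTAL_HOURS = 200  # corpus size used in the question above


def per_speaker_minutes(total_hours, speakers):
    """Minutes each speaker would record if the total is split evenly."""
    return total_hours * 60 / speakers


def doubling_splits(total_hours, max_speakers):
    """Return (speakers, minutes_each) pairs, doubling speakers from 2."""
    splits = []
    speakers = 2
    while speakers <= max_speakers:
        splits.append((speakers, per_speaker_minutes(total_hours, speakers)))
        speakers *= 2
    return splits


if __name__ == "__main__":
    for speakers, minutes in doubling_splits(TOTAL_HOURS, 8192):
        if minutes >= 60:
            # Show long recordings in hours, as in the post
            print(f"{speakers:>5} speakers x {minutes / 60:g} hrs")
        else:
            print(f"{speakers:>5} speakers x {minutes:.1f} min")
```

This says nothing about which mix is actually best for training an acoustic model; it just makes the trade-off between speaker count and per-speaker recording time easy to tabulate.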
Thanks for the link. It looks like their project is stalled, though. Can you tell if they have a pronunciation dictionary?
--- (Edited on 7/22/2007 10:45 pm [GMT-0400] by kmaclean) ---