
140 hours for each language or for all together?
User: Sergey
Date: 5/2/2008 8:49 am


Does audio submitted in any language count toward the 140-hour goal, or is the goal 140 hours for each language?




Re: 140 hours for each language or for all together?
User: kmaclean
Date: 5/2/2008 11:23 am

Hi Sergey,

> Does submission of audio in any language count toward the 140 hour goal,
> or should it be 140 hours for each language?

The metrics page applies only to English, so the 140-hour goal shown there counts English submissions only.

I don't know exactly how much speech is required for a good command-and-control acoustic model.  The acoustic models used by Sphinx were trained on 140 hours of 1996 and 1997 Hub4 training data, so I just used that figure as a target.  I assume the same target would apply to each of the other languages.
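
For anyone who wants to track progress toward such a target separately for each language, here is a rough sketch. It is not the code behind the metrics page. It assumes a hypothetical layout of one directory per language under a corpus root, with the audio stored as WAV files; the function names and the layout are made up for illustration.

```python
import wave
from pathlib import Path

# Target borrowed from the size of the Hub4 training set mentioned above.
TARGET_HOURS = 140.0


def wav_hours(path):
    """Return the duration of one WAV file in hours."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate() / 3600.0


def hours_by_language(corpus_root):
    """Sum WAV durations per language.

    Assumes a hypothetical layout: corpus_root/<language>/**/*.wav
    """
    totals = {}
    for lang_dir in sorted(Path(corpus_root).iterdir()):
        if lang_dir.is_dir():
            totals[lang_dir.name] = sum(
                wav_hours(p) for p in lang_dir.rglob("*.wav")
            )
    return totals


def progress(hours, target=TARGET_HOURS):
    """Fraction of the target reached, capped at 1.0."""
    return min(hours / target, 1.0)
```

Calling `hours_by_language("corpus")` would return something like a dictionary mapping each language directory name to its total hours, and `progress(70.0)` gives `0.5` against the 140-hour target. This only measures quantity, of course; it says nothing about transcription accuracy or noise.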

Note: nsh has mentioned that it's not really the amount of speech that is important, but the quality (accurate transcriptions, low noise, ...).


