How-to Rate an Audio Submission to VoxForge

Frequently Asked Questions

User: kmaclean
Date: 1/1/2010 10:35 am


VoxForge is not looking for TV or radio announcer quality voices (just listen to my voice recordings ...) or perfect audio quality.

For Free and Open Source Speech Recognition to work, we need a large variety of speech (from different people, with different dialects and accents, reading different prompt files that cover various phonemes and triphones), recorded in a variety of environments (rooms with echo, such as those with hardwood or tile floors, and rooms without echo, such as carpeted ones) and on a variety of recording equipment (headset mics, desktop mics, built-in mics, USB mics, integrated audio, audio cards ...).

That is not to say that you should not try to minimize non-speech noise in your audio submissions. It is just that the submissions we are looking for should reflect the environments where the acoustic model might be used for speech recognition.

Therefore, most audio submitted to the VoxForge site should receive a thumbs up. Creating a recording takes some effort, especially when you are first starting out, and new submitters should be encouraged, not discouraged. A "thumbs up" rating goes a long way toward encouraging submissions.

A submission should get a thumbs down when a transcription doesn't match its corresponding audio, or when there is excessive background noise (i.e. non-speech noise or talking in the background). What counts as "excessive noise" is subjective, since the Acoustic Model creation process can tolerate some low-level hiss and/or hum (usually heard in the quiet periods of some recordings). But if enough people rate a submission, then on average we should get a good view of the quality of the recording for use in the creation of acoustic models.
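
To illustrate the "on average" idea, here is a rough Python sketch of how a set of thumbs up/down ratings might be summarized. This is not how the VoxForge site actually computes anything: the function name summarize_ratings, the min_votes cutoff, and the accept_threshold value are all hypothetical, chosen only to show that a single rating says little while several ratings together give a usable signal.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RatingSummary:
    votes: int                    # total number of ratings received
    up_fraction: Optional[float]  # share of thumbs up, None if no votes
    verdict: str                  # "usable", "questionable", or "undecided"


def summarize_ratings(
    ratings: Iterable[bool],
    min_votes: int = 5,             # hypothetical: votes needed before judging
    accept_threshold: float = 0.5,  # hypothetical: share of thumbs up required
) -> RatingSummary:
    """Summarize thumbs up (True) / thumbs down (False) ratings."""
    votes_list = list(ratings)
    votes = len(votes_list)
    if votes == 0:
        return RatingSummary(0, None, "undecided")

    # True counts as 1 and False as 0, so the sum is the number of thumbs up.
    up_fraction = sum(votes_list) / votes

    if votes < min_votes:
        verdict = "undecided"      # too few raters to average out subjectivity
    elif up_fraction >= accept_threshold:
        verdict = "usable"
    else:
        verdict = "questionable"
    return RatingSummary(votes, up_fraction, verdict)


if __name__ == "__main__":
    # Four thumbs up and one thumbs down from five different raters.
    print(summarize_ratings([True, True, True, True, False]))
```

With only one or two ratings the sketch refuses to give a verdict, which mirrors the point above: individual opinions about "excessive noise" differ, and it is the combined view that matters.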
