If you have contributed speech for all the Prompts in the Speech Submission Section, you may be interested in contributing additional speech recordings using your own prompts, or in transcribing audio recorded by others.

These types of submissions help ensure that we get speech audio for as many different words as possible (especially words not already included in our Phoneme Coverage Prompts), and thus coverage of as many triphones as possible.  It is not enough to have many different people read the same VoxForge-created Phoneme Prompts files, because the resulting Acoustic Models will only be as good as the triphones covered in those files.  We need a large variety of texts to ensure we cover as many of the triphones in the English language as possible.
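The coverage argument above can be illustrated with a small sketch. It assumes a toy pronunciation lexicon with made-up entries and treats a triphone simply as three consecutive phonemes inside one word; real acoustic-model toolkits also handle cross-word context, so this is only an approximation of the idea, not how VoxForge measures coverage.

```python
# Toy pronunciation lexicon (hypothetical entries, ARPAbet-like symbols).
LEXICON = {
    "cat": ["k", "ae", "t"],
    "cats": ["k", "ae", "t", "s"],
    "stack": ["s", "t", "ae", "k"],
}


def word_triphones(phonemes):
    """Return the set of word-internal triphones (three consecutive phonemes)."""
    return {tuple(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)}


def covered_triphones(words, lexicon=LEXICON):
    """Return the set of distinct triphones covered by a list of spoken words."""
    covered = set()
    for word in words:
        covered |= word_triphones(lexicon[word])
    return covered


shared_prompt = ["cat", "cats"]
ten_speakers = shared_prompt * 10   # ten people reading the same prompt
new_text = ["stack"]                # one contribution from a different text

print(len(covered_triphones(shared_prompt)))            # 2
print(len(covered_triphones(ten_speakers)))             # 2
print(len(covered_triphones(ten_speakers + new_text)))  # 4
```

Ten readings of the same prompt add more audio but no new triphones in this toy case, while a single word from a different text doubles the number covered.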

User Submitted Audio

Suggestions for user-submitted prompts:

  • Gutenberg Project - The Gutenberg project is a library of 17,000 free ebooks whose copyright has expired.  Pick one and record all or part of it.  Then submit a compressed audio version (e.g. mp3) to the Gutenberg Audio Books project and an uncompressed version (in wav format) to VoxForge.
  • Wikipedia - Wikipedia is the free encyclopedia built collaboratively using Wiki software.  Pick your favourite article and record all or part of it. Then submit a compressed version of your audio (in ogg format) to the Spoken Wikipedia project and submit an uncompressed version (in wav format) to VoxForge.
  • Google Books - Google Books gives you access to many out-of-copyright books that you can download (you need to select the "Full view" radio button when you search on

Don't worry if you don't have the time (or the inclination) to create VoxForge-style prompts and/or audio files.  We can convert your "one big prompt file" and corresponding "one big speech audio file" (in uncompressed wav format) into VoxForge-style prompt and audio files.  What is important is that we get as many varied speech audio contributions as possible.

Transcribing Speech Audio to VoxForge Format

There may be some legal issues with reading Wikipedia, since its content is now licensed under the Creative Commons Attribution-Sharealike 2.0 license.  From the GNU website:

Creative Commons Attribution-Sharealike 2.0 license (a.k.a. CC-BY-SA)

This is a copyleft free license that is good for artistic and entertainment works, and educational works. Please don't use it for software or documentation, since it is incompatible with the GNU GPL and with the GNU FDL.