Acoustic Model Discussions

Using Government Speeches?
User: ndl
Date: 12/21/2009 3:46 pm
Views: 5196
Rating: 2

Does VoxForge use the pre-transcribed audio downloadable from the press releases of the US or other governments?  I have found large amounts of transcribed speech released into the public domain on US government sites, such as White House press briefings.


If not, is there a reason why this audio is unsuitable for use?

--- (Edited on 12/21/2009 3:46 pm [GMT-0600] by ndl) ---

Re: Using Government Speeches?
User: Robin
Date: 12/22/2009 4:52 am
Views: 169
Rating: 3

That does seem like a very interesting option. I didn't know that government-produced material in the US is in the public domain. That is not the case where I live.

It could be that the material is only available in a less suitable format, such as MP3. It would be perfect if it were available in an uncompressed (or losslessly compressed) format. If it is indeed compressed speech, then it is no more useful than speech from LibriVox. In that case it is an additional source of speech, but not a proper replacement for uncompressed speech, which is what we prefer right now.

Also, the hardware used for the recording might not be comparable to what one would use for dictation with a computer, but that is not really something negative. In the future we might also want to make speech recognition software to transcribe speeches after all.

In the copyright notice on the website it says that the US government can hold copyright under certain circumstances (e.g. assignment). I don't know whether the speechwriters who are hired by the US government transfer their copyright, or whether the general rule applies. That would be something to keep in mind, but it seems like a good possibility.

--- (Edited on 12/22/2009 4:52 am [GMT-0600] by Robin) ---

--- (Edited on 12/22/2009 5:11 am [GMT-0600] by Robin) ---

Re: Using Government Speeches?
User: ndl
Date: 12/29/2009 4:58 pm
Views: 133
Rating: 1

I've only found MP3 files so far, but I haven't been searching long enough to discount the possibility of finding uncompressed speech.

Is compressed speech still worth including?

If so, what are the next steps?  Is there anything I can help with in terms of aggregating or preprocessing this stuff?

--- (Edited on 12/29/2009 4:58 pm [GMT-0600] by ndl) ---

Re: Using Government Speeches?
User: kmaclean
Date: 1/5/2010 1:27 pm
Views: 2182
Rating: 2

>If so, what are the next steps?  Is there anything I can help with in terms of aggregating or preprocessing this stuff?

Yes!

The acoustic model training process requires segmented speech, i.e. speech cut into segments of 15-25 words, preferably at a natural pause in the speech.

This tutorial explains the process:  Automated Audio Segmentation Using Forced Alignment (Draft); and this script automates some of the process (and contains in-line documentation on the steps to take):  Audiobook.pm.
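To give a rough idea of the 15-25 word target on the transcript side, here is a minimal Python sketch. It is not the forced-alignment procedure from the tutorial and is not taken from Audiobook.pm: it only splits the transcript text, and it assumes that punctuation is a usable stand-in for a natural pause. The function name `segment_transcript` and its parameters are made up for illustration.

```python
import re


def segment_transcript(text, min_words=15, max_words=25):
    """Split transcript text into segments of min_words..max_words words.

    Assumption: a word ending in punctuation marks a likely pause.
    The final segment may be shorter than min_words.
    """
    words = text.split()
    segments = []
    start = 0
    while start < len(words):
        # If what is left fits in one segment, emit it and stop.
        if len(words) - start <= max_words:
            segments.append(" ".join(words[start:]))
            break
        # Fall back to a hard cut at the upper limit.
        cut = start + max_words
        # Search backwards from the upper limit for a punctuation boundary,
        # without letting the segment drop below min_words.
        for i in range(start + max_words, start + min_words - 1, -1):
            if re.search(r"[.,;:!?]$", words[i - 1]):
                cut = i
                break
        segments.append(" ".join(words[start:cut]))
        start = cut
    return segments
```

Each returned string would then still need to be matched to its stretch of audio, which is the part that forced alignment handles.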

Though we prefer uncompressed speech, if we can get high-quality transcribed and segmented MP3 speech audio, then we will take it and simply label it as such in the corpus.

Thanks for your interest in VoxForge!

Ken

 

--- (Edited on 1/5/2010 2:27 pm [GMT-0500] by kmaclean) ---
