VoxForge
1. OK, am I understanding this right? Situation: I submit an audio file containing my voice and a text file containing the words I said. Will that audio and text file automatically be GPL'd and made public (accessible to all)?
I'm not fond of the idea that everyone can download my 'voice examples' and do whatever they want with them. I would rather suggest GPLing only the compiled files for the various speech recognition engines.
2. When will files be uploaded for languages other than English?
3. Using speech-2-text for programming would require some sentences or stories to contain words like slash, comma, tilde, ampersand etc. How will you handle the 'special characters'?
Personal note: It's great that you set up this project and I hope it will succeed in making Open Source Speech Recognition fast and, most of all, accurate.
--- (Edited on 10/10/2006 7:24 pm [GMT-0500] by Visitor) ---
1. All submitted Speech Audio Files will be licensed under the GPL - this means that any derivative works (e.g. Acoustic Models) must also be GPL. Section 2b) of the GPL states:
"You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License."
Program is defined as any work (e.g. VoxForge Speech Audio) with a GPL notice.
2. Other languages - no timeframe yet - but it is one of our objectives; what language did you have in mind?
3. speech-2-text for programming & special characters: you can create your own prompt files and submit your own speech audio, and get others to submit speech audio. The Triphone Coverage Prompts section of the Submit Audio How-to covers this.
Remember though, the English language is made up of about 40 phonemes - i.e. the sounds that make up all words (see this FAQ entry). So once we have enough speech audio data, recognizing particular words is really a matter of how you set up your speech grammar and your application to respond to different commands. However, having audio for specific words in the VoxForge Audio database will help improve recognition accuracy.
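As a rough illustration of the phoneme point, here is a minimal Python sketch that checks which phonemes a small prompt file of 'special character' words would cover. The dictionary, the ARPAbet-style transcriptions and the function name are all assumptions made up for this example; they are not VoxForge data or tooling.

```python
# Hypothetical mini pronunciation dictionary (ARPAbet-style symbols).
# These transcriptions are illustrative assumptions, not VoxForge data.
PRONUNCIATIONS = {
    "slash": ["S", "L", "AE", "SH"],
    "comma": ["K", "AA", "M", "AH"],
    "tilde": ["T", "IH", "L", "D", "AH"],
    "ampersand": ["AE", "M", "P", "ER", "S", "AE", "N", "D"],
}


def phonemes_in_prompts(prompts):
    """Return (phonemes covered, words missing from the dictionary)."""
    covered = set()
    unknown = set()
    for line in prompts:
        for word in line.lower().split():
            if word in PRONUNCIATIONS:
                covered.update(PRONUNCIATIONS[word])
            else:
                unknown.add(word)
    return covered, unknown


# Two hypothetical prompt lines made of special-character words.
prompts = ["slash comma tilde", "ampersand slash"]
covered, unknown = phonemes_in_prompts(prompts)
print(sorted(covered))
print(sorted(unknown))
```

The same check run against a real pronunciation dictionary would show which sounds a custom prompt file still lacks before you record it.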
Hope this answers your question,
Ken
--- (Edited on 10/10/2006 10:50 pm [GMT-0400] by kmaclean) ---
Well, I'm still not satisfied with the fact that 'the world' could download and use the files containing my voice, but that's my cup of tea. It might sound a bit paranoid, but people could misuse these sound recordings. If anyone wanted to make a prank call or threaten someone, I wouldn't want them to use my voice!
On the other hand, I would have no problem at all if VoxForge used my voice files to create Acoustic Models etc.
Other languages that came to mind were: Dutch and German (perhaps even British English).
--- (Edited on 10/19/2006 7:05 am [GMT-0500] by Visitor) ---
Yes, I struggled with that too. But then I thought about all the TV personalities, movie stars and politicians who are in the same boat - they would be much better targets for the types of misuse you refer to in your post. People graciously donate their speech audio all over the Internet - look at the public domain audio book sites (Librivox, Gutenberg Audio Books Project, etc.).
The whole reason for this site is to collect the 'source' audio for Acoustic Models, so others can innovate and create their own, and hopefully better, models, and thus get free Speech Recognition moving forward on GNU/Linux.
I think a compromise approach would be for you to register with a non-identifying user name, submit your speech, but rather than stating your name in the copyright clause, assign your copyright to the Free Software Foundation. That way there is no link to who you are.
Ken
--- (Edited on 10/19/2006 11:07 am [GMT-0400] by kmaclean) ---