VoxForge
Emailed offer to help:
Hello,
My name is Craig Harper. I own and operate a VoIP telephone network for businesses around the world. I would like to know if we can help you acquire the audio that you need to build an acoustic model suitable for using Sphinx with Asterisk, which requires an 8 kHz LM and AM.
What we could provide is one-sided recorded conversations in any format that you like. We will of course get permission from our clients, as this in turn will allow us to develop ASR for them.
If this would be of any use, please let me know. We could provide thousands of minutes of recordings every day, in a number of languages.
Best Regards
Craig
--- (Edited on 7/28/2011 2:36 pm [GMT-0400] by kmaclean) ---
My reply:
Hi Craig,
Thanks for the offer!
This would be a great resource in itself for speech research purposes.
However, if we were to use these one-sided telephone conversations to train acoustic models, they would need to be transcribed. This is a very time-consuming process at present. There are papers that discuss assisted transcription, but with current open source technology, it still requires a fair bit of manual work to get this done.
I would like to post this thread in the VoxForge forum to see what others think. Please let me know if that is OK.
thanks,
Ken
--- (Edited on 7/28/2011 2:37 pm [GMT-0400] by kmaclean) ---
Craig's last reply:
Hi Ken
Thanks for your reply. So if we had the full conversation, it would be more useful?
I want to help in whatever way I can. I'm happy to provide resources, i.e. servers for doing any processing if required. The only condition, for client security, is that the processing be done inside our network.
I would also be happy to put some physical resources in place if we need to transcribe conversations; however, this will of course reduce the volumes we can deal with.
I just feel there is a resource that should be used. I have no problems with posting in the forum.
Craig
Any comments or feedback are encouraged and welcome.
thanks,
Ken
--- (Edited on 7/28/2011 2:39 pm [GMT-0400] by kmaclean) ---
Did we ever get this added? I definitely would transcribe as part of the open source project. If not added yet, is the offer still on the table? Also, yes, the whole conversation would be best for transcription.
--- (Edited on 12/7/2014 1:16 am [GMT-0600] by Visitor) ---