Comments

participating in project
User: [email protected]
Date: 7/5/2010 5:27 pm
Views: 11634
Rating: 16

I need an open source speech recognition model for my business. I can contribute 100,000+ dictations and associated transcriptions to the project (medical reports with all patient and physician information redacted). I also need a very bright programmer/technical person who is into speech recognition to work for money to build a speech model for my company. This person would be responsible for managing the dictations and the contributions to the VoxForge project. I don't know a lot about open source; we are big users of several commercial applications. However, the commercial applications are not as good as they could be, and they are all closed black boxes, so we cannot get at them to do better. We just spend LOTS of money to get what we get. We know we can do better. We have a lot of technical resources, but we need a strong engineer who knows HMMs and how to build a real application with them. Anybody interested? Drop me an email. Totally confidential.

Re: participating in project
User: kmaclean
Date: 7/14/2010 3:44 pm
Views: 344
Rating: 17

>I can contribute 100,000+ dictations and associated transcriptions to the
>project (medical reports with all patient and physician information redacted)

While I appreciate the offer, I am not sure of the Copyright implications of such a donation. 

Even though identifying info has been removed, someone might object to having their voice released publicly, and since Copyright takes hold automatically, I am not sure how we can legally use this corpus.

Did all the participants whose voices were used in your corpus assign their Copyrights to your organization?

>I also need a very bright programmer/technical person who is into speech
>recognition to work for money to build a speech model for my company.

Nickolay at Nexiwave is your best bet.

Ken

Re: participating in project
User: Daniel
Date: 1/30/2011 9:26 am
Views: 359
Rating: 17

If voice files specific to medicine are required to create data sets specific for medical dictation, could we set up a way in which individual physicians could donate their own speech and text on VoxForge (sans patient data obviously)?

I've tried in the past to incorporate Sphinx into Freemed (also GPL) but was unable to come up with enough voice and text files to even start the project. VoxForge seems the perfect venue to make this happen.

Thoughts?

Dan

Re: participating in project
User: kmaclean
Date: 2/1/2011 9:04 am
Views: 347
Rating: 17

Hi Dan,

>If voice files specific to medicine are required to create data sets specific
>for medical dictation

The main non-software components used by speech recognition engines (such as Sphinx or Julius) are a language model and an acoustic model.

For dictation, in addition to the speech recognition engine, another software component would be required: a way to continuously improve the acoustic model by updating the standard acoustic model with speech from the user - a process called 'adaptation' (Nickolay talks about the importance of implementing adaptation for dictation here).

Collecting medical speech would address only the speech audio requirements for creating acoustic models.

Depending on the sentences you provide, these might be used for language model creation too. 
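
As a rough illustration of how submitted sentences could feed language model creation, here is a minimal Python sketch that counts word pairs (bigrams) in a few transcripts and estimates how likely one word is to follow another. The sample sentences and the function names are made up for illustration, and this is not how any particular toolkit does it; real language model tools add smoothing and other steps that are left out here.

```python
from collections import Counter


def bigram_counts(sentences):
    """Count single words and adjacent word pairs across all sentences.

    <s> and </s> are added as sentence start and end markers.
    """
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def bigram_probability(unigrams, bigrams, first, second):
    """Unsmoothed estimate of P(second | first); 0.0 if 'first' was never seen."""
    if unigrams[first] == 0:
        return 0.0
    return bigrams[(first, second)] / unigrams[first]


# Hypothetical transcripts, standing in for submitted medical sentences.
transcripts = [
    "the patient denies chest pain",
    "the patient reports chest tightness",
]
unigrams, bigrams = bigram_counts(transcripts)
print(bigram_probability(unigrams, bigrams, "chest", "pain"))
```

With only two sentences the estimates are very coarse, which is why the amount and variety of submitted text matters for this use.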

I am not sure if there are open source dictation systems that implement user adaptation of acoustic models: the EvalDictator dictation dialog manager might do this...

>could we set up a way in which individual physicians could donate their
>own speech and text on VoxForge

Yes, we could start with a list of common medical words and phrases in a separate 'medical' section (similar to how each language has its own 'read' page).  We would target this to physicians, since they would presumably know how to correctly pronounce this specialized vocabulary.

Once we get a few hours of speech, we could start on a pronunciation dictionary (I would need your help for that) so that people could generate their own acoustic models based on the submitted medical speech.
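
One way to see how much pronunciation dictionary work the medical prompts would need is to list the words in the prompts that the existing dictionary does not cover. The sketch below does that in Python. The prompt text, the sample dictionary words, and the function name are all hypothetical, and it assumes the dictionary has already been reduced to a plain list of words.

```python
def missing_words(prompts, dictionary_words):
    """Return the sorted prompt words that are not in the dictionary word list.

    Comparison is case-insensitive; surrounding punctuation is stripped.
    """
    known = {word.upper() for word in dictionary_words}
    missing = set()
    for line in prompts:
        for token in line.split():
            word = token.strip(".,;:!?\"()").upper()
            if word and word not in known:
                missing.add(word)
    return sorted(missing)


# Hypothetical prompt lines and dictionary entries, for illustration only.
prompts = ["Patient has tachycardia.", "No evidence of pneumothorax."]
dictionary_words = ["PATIENT", "HAS", "NO", "EVIDENCE", "OF"]
print(missing_words(prompts, dictionary_words))
```

The words this reports are the ones that would need pronunciations added by someone who knows the vocabulary.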

Ken
