VoxForge
Unfortunately, Festival voice output is not of the quality we need in order to create good Acoustic Models. We would essentially be training the Acoustic Models to recognize Festival output, rather than human speech. We need human speech to ensure that the Acoustic Models can recognize other humans.
Having said that, I have heard some commercial-quality TTS (Text-to-Speech) engines that come pretty close to a human voice, but I get the sense that, given the current state of the technology, we still need human speech to train the Acoustic Models.
Hope that clarifies things a bit,
Ken