Re: Untitled

Unfortunately, Festival voice output is not of the quality we need to create good Acoustic Models. We would essentially be training the Acoustic Models to recognize Festival output rather than human speech. We need human speech to ensure that the Acoustic Models can recognize other humans.

Having said that, I have heard some commercial-quality TTS (Text-to-Speech) engines that come pretty close to a human voice, but I get the sense that with the current state of technology we still need human speech to train the Acoustic Models.

Hope that clarifies things a bit,
