General Discussion

dictation
User: colore
Date: 4/23/2007 10:32 am
Views: 13458
Rating: 29
Hello,

I am interested in speech dictation for Greek.

Is there a program that can do it?

Thanks

--- (Edited on 4/23/2007 10:32 am [GMT-0500] by colore) ---

Re: dictation
User: kmaclean
Date: 4/23/2007 12:43 pm
Views: 317
Rating: 29

Hi colore,

Sphinx, ISIP, HTK, and Julius can all handle languages other than English, but you will need to create Acoustic Models and Language Models for the target language.

The Julius Speech Recognition Engine is the only one of the four listed above that actually states that it can perform dictation.  However, its developers only supply Japanese Acoustic Models and Language Models.

You might want to start with the VoxForge Create Acoustic Model Tutorial to get a better understanding of what is involved in creating an English Acoustic Model for Julius (using the HTK toolkit).  Then try creating a Greek Acoustic Model using your own voice, and test it with a Greek grammar.  Once you have accomplished this, see the HTK Book for information on creating Language Models for dictation.
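
To give a feel for what a statistical Language Model for dictation contains (as opposed to a fixed grammar), here is a minimal Python sketch that estimates bigram probabilities from a few transcribed sentences. It is not how the HTK tools do it; the function name `bigram_probs` and the two example sentences are assumptions for illustration, and a real model needs far more text plus smoothing for unseen word pairs.

```python
from collections import Counter


def bigram_probs(sentences):
    """Maximum-likelihood bigram estimates P(w2 | w1), with no smoothing."""
    history_counts = Counter()   # how often each word appears as a history (w1)
    pair_counts = Counter()      # how often each (w1, w2) pair appears
    for sentence in sentences:
        # <s> and </s> mark the sentence start and end.
        words = ["<s>"] + sentence.split() + ["</s>"]
        history_counts.update(words[:-1])
        pair_counts.update(zip(words, words[1:]))
    return {pair: n / history_counts[pair[0]] for pair, n in pair_counts.items()}


if __name__ == "__main__":
    # Tiny made-up Greek corpus, only to show the mechanics.
    probs = bigram_probs(["καλημέρα σας", "καλημέρα κόσμε"])
    for (w1, w2), p in sorted(probs.items()):
        print(f"P({w2} | {w1}) = {p:.2f}")
```

With only two sentences the estimates are meaningless in practice, but the structure (counts of word pairs divided by counts of the preceding word) is the core of what a dictation Language Model stores.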

It's not easy, but it can be done,

Ken 

--- (Edited on 4/23/2007 1:43 pm [GMT-0400] by kmaclean) ---

Re: dictation
User: Tony Robinson
Date: 4/23/2007 1:04 pm
Views: 343
Rating: 34

I've built a Greek LVCSR (large-vocabulary continuous speech recognition) system in the past.  One advantage you have is that just a few rules get you from the Greek letters to pronunciations (unlike English).
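
As a rough illustration of how few rules are involved, here is a minimal Python sketch of rule-based letter-to-sound conversion for Modern Greek. The phone symbols (`gh`, `dh`, `th`, `x`) and the function names are made up for this example and are not taken from any toolkit's phone set. The rule table is deliberately simplified (it ignores stress, the diaeresis, palatalization, and the nasal in γγ), so treat it as a starting point rather than a complete pronunciation model.

```python
import unicodedata

# Two-letter spellings, checked before single letters.
DIGRAPHS = {
    "ου": ["u"], "αι": ["e"], "ει": ["i"], "οι": ["i"], "υι": ["i"],
    "μπ": ["b"], "ντ": ["d"], "γκ": ["g"], "γγ": ["g"],
    "τσ": ["ts"], "τζ": ["dz"],
}
LETTERS = {
    "α": ["a"], "β": ["v"], "γ": ["gh"], "δ": ["dh"], "ε": ["e"], "ζ": ["z"],
    "η": ["i"], "θ": ["th"], "ι": ["i"], "κ": ["k"], "λ": ["l"], "μ": ["m"],
    "ν": ["n"], "ξ": ["k", "s"], "ο": ["o"], "π": ["p"], "ρ": ["r"],
    "σ": ["s"], "ς": ["s"], "τ": ["t"], "υ": ["i"], "φ": ["f"], "χ": ["x"],
    "ψ": ["p", "s"], "ω": ["o"],
}
# Letters that start with a voiceless sound; used for the αυ/ευ rule.
VOICELESS = set("θκξπστφχψς")


def strip_accents(word):
    """Lower-case the word and drop accent marks (a simplification)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def to_phones(word):
    """Map a Greek word to a list of phone symbols using the tables above."""
    w = strip_accents(word)
    phones = []
    i = 0
    while i < len(w):
        pair = w[i:i + 2]
        if pair in ("αυ", "ευ"):
            # αυ/ευ end in f before a voiceless sound or at word end, else v.
            following = w[i + 2:i + 3]
            phones.append("a" if pair[0] == "α" else "e")
            phones.append("f" if following == "" or following in VOICELESS else "v")
            i += 2
        elif pair in DIGRAPHS:
            phones.extend(DIGRAPHS[pair])
            i += 2
        elif w[i] in LETTERS:
            phones.extend(LETTERS[w[i]])
            i += 1
        else:
            i += 1  # skip anything that is not a Greek letter
    return phones


if __name__ == "__main__":
    for word in ["καλημέρα", "ευχαριστώ", "μπορώ", "αύριο"]:
        print(word, " ".join(to_phones(word)))
```

For example, `to_phones("ευχαριστώ")` gives `e f x a r i s t o` and `to_phones("μπορώ")` gives `b o r o`. The output of such a function could be used to generate a first-pass pronunciation dictionary, which would then need checking by hand.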

Good luck,

 

Tony 

-- 

Dr Tony Robinson, CEO Cantab Research Ltd
Phone:  +44 845 009 7530, Fax: +44 845 009 7532


--- (Edited on 23-April-2007 7:04 pm [GMT+0100] by Tony Robinson) ---

Re: dictation
User: colore
Date: 4/23/2007 1:05 pm
Views: 369
Rating: 35
Thanks for your replies.

Can you send me your Greek dictation system, please?

Thanks

--- (Edited on 4/23/2007 1:05 pm [GMT-0500] by Visitor) ---

Re: dictation
User: Tony Robinson
Date: 4/23/2007 1:30 pm
Views: 286
Rating: 24

The system I built (for my previous company) is "enterprise-class" commercial software, i.e. priced for corporate, not personal, budgets.

You might get lucky and persuade a Greek university to part with the acoustic and language models needed, but somehow I doubt it as there can be quite a lot of support issues if you are not used to speech recognition (and even if you are).

Which gets you back to Ken's post.  If I were you, I'd follow his advice: build a system in English based on the code and data on this site, then, when you are confident you know what you are doing, record your own voice and build a Greek system.

 

Tony 

-- 

Dr Tony Robinson, CEO Cantab Research Ltd
Phone:  +44 845 009 7530, Fax: +44 845 009 7532


--- (Edited on 23-April-2007 7:30 pm [GMT+0100] by Tony Robinson) ---

Re: dictation
User: kmaclean
Date: 4/24/2007 8:53 am
Views: 298
Rating: 29

I forgot to mention HDecode (included with HTK version 3.4), a large-vocabulary speech recognition decoder that should be able to perform dictation.  However, its license restricts the use of the software and the generated acoustic models to research purposes only.

Another option is the Spice Project (still under construction), which is working on a web site that will provide the ability to create an Acoustic Model, Language Model and Dictionary in the language of your choice, for use with the Janus Speech Recognition Engine.  Unfortunately, Janus is not open source, but you might want to contact them to get a Janus run-time.  Spice was reviewed in this post.

Ken 

--- (Edited on 4/24/2007 9:53 am [GMT-0400] by kmaclean) ---

Re: dictation
User: Visitor
Date: 4/25/2007 4:46 pm
Views: 289
Rating: 19
Unfortunately, after extensive research I found only one Greek dictation system, and it is dead and was never published (made available for download):

http://www.speech.tuc.gr/projects/logotypografy_main.html

There is no other open source dictation system for Greek :(

Unfortunately, there are only two commercial products: Logografos and MLS Talk and Write.

ViaVoice used to support Greek, but no longer does.

Options for Greek dictation are really limited :(

--- (Edited on 4/25/2007 4:46 pm [GMT-0500] by Visitor) ---

Re: dictation
User: colore
Date: 4/29/2007 6:22 pm
Views: 285
Rating: 24
Since I will try to build my own voice library, can you please tell me which program is the best to use, so that my effort won't be wasted?
Is there any comparison?

thanks

--- (Edited on 4/29/2007 6:22 pm [GMT-0500] by Visitor) ---

Re: dictation
User: kmaclean
Date: 4/30/2007 3:37 pm
Views: 334
Rating: 29

Hi colore,

This was discussed in this post.

I can't really tell you which is better.  I have not done any performance comparisons.  

I'm biased towards HTK/Julius, because that is the first package I started working on.  I picked HTK because it seemed to have the best documentation (at the time ...).

Julius is supposed to work in dictation applications (in Japanese at least ...) - once VoxForge has robust enough Acoustic Models, we will be able to test this claim for English.  It can also be used in Command and Control applications and telephony applications using its Julian module.

On the other hand, Sphinx has a larger community.  But it tends to be used more in command and control or telephony applications.  The xVoice project tried to use Sphinx-2 for dictation (trying to replace IBM's closed source ViaVoice), but gave up on it.

I don't have much experience with ISIP yet. 

So it really depends on what you are trying to do, and how much help you will need to do it ... 

Ken 

--- (Edited on 4/30/2007 4:37 pm [GMT-0400] by kmaclean) ---

Re: dictation
User: kmaclean
Date: 5/1/2007 2:35 pm
Views: 2901
Rating: 23

Hi colore,

You might find this article of interest: 

A COMPARISON OF PUBLIC DOMAIN SOFTWARE TOOLS FOR SPEECH RECOGNITION, by Samudravijaya K and Maria Barot. 

They compare HTK and Sphinx for a Hindi speech recognition system.  They conclude that "although recognition accuracies of the two systems are comparable, [they] observe that the acoustic modeling of Sphinx is superior".
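
If you want to run this kind of comparison yourself, recognition accuracy is usually measured at the word level. Below is a minimal Python sketch, assuming the common definition accuracy = (N - S - D - I) / N, where N is the number of words in the reference transcript and S, D and I are the substitutions, deletions and insertions in the recognizer output. The function names and the example sentences are illustrative only, and this is not necessarily the exact metric used in the article.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, deletions and insertions
    needed to turn the word list `ref` into the word list `hyp`."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,                          # deletion
                cur[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word)  # match or substitution
            ))
        prev = cur
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy (N - S - D - I) / N for one pair of sentences."""
    ref = reference.split()
    hyp = hypothesis.split()
    return 1.0 - edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # One substitution (the -> a) and one insertion (please), N = 4.
    print(word_accuracy("open the front door", "open a front door please"))
```

Because insertions are counted as errors, this figure can drop below zero when the recognizer outputs many extra words, so it is worth reporting alongside the raw error counts when comparing two engines on the same test set.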

In my experience, Julius (even though it uses Acoustic Models created using the HTK toolkit) performs much better than HTK in grammar-related tasks.  I don't know how well it compares against the Sphinx group of speech recognizers.

Ken 

--- (Edited on 5/1/2007 3:35 pm [GMT-0400] by kmaclean) ---
