German acoustic model and grammar

German

User: kmaclean
Date: 11/19/2009 8:52 am

Views: 9806
Rating: 8

Email from Sven:

Hi everyone,

I am a native German translator with 12 years' experience translating
technical texts from English to German. I am also a Linux system administrator (though administration
is not paying me yet ;)

As you can imagine, a reliable speech to text engine would make my life hundreds of times easier on many days.

So, while searching for free speech recognition software, I came across Julius and VoxForge, which are available in the Ubuntu repositories. I also studied the
VoxForge website a bit.

As I understood your info on the website, your *English* acoustic
model is currently relying on getting more user audio input to reach usability for an actual dictation application.

However, do you already possess a grammar model (the files sample.voca and sample.grammar in your release) that is capable of more than a few words and the digits? Is it possible to download such a voca file somewhere, so I could at least test the reliability of your acoustic model when it comes to dictation?

Is there any *working* dictation application available for Linux,
even if it is for English speech so I could get a picture of the
current reliability of the efforts?

And then: What effort would be needed to produce a reliable GERMAN acoustic model together with complete grammar files?

Where are you located? Do you have a German contact?

I would love to get into some deeper discussion about this with one of your engineers/maintainers.

Kind Regards,
Sven

Re: German acoustic model and grammar

User: kmaclean
Date: 11/19/2009 8:53 am

Views: 122
Rating: 7

my reply:

Hi Sven,

>As I understood your info on the website, your *English* acoustic
>model is currently relying on getting more user audio input to reach
>usability for an actual dictation application.

No, we are only targeting command-and-control acoustic models for now ('read' speech is more suitable for such applications).

>However, do you already possess a grammar model (the files sample.voca
>and sample.grammar in your release) that is capable of more than a
>few words and the digits? Is it possible to download such a voca file
>somewhere, so I could at least test the reliability of your acoustic
>model when it comes to dictation?

Not sure what you mean by a grammar model... see this page for an explanation of the difference: Step 1 - Task Grammar

If you want to add words to the grammar file, then try it out... if the word is in the pronunciation dictionary, then it should work. Step 1 of the tutorial also explains the internals of a Julius/Julian grammar file.
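As a rough illustration of the "is the word in the pronunciation dictionary" check, here is a minimal Python sketch. It assumes the usual Julius .voca layout (a `% CATEGORY` line followed by `WORD phoneme phoneme ...` lines, with `#` starting a comment). The sample vocabulary, the tiny `LEXICON` dictionary and the function names are made up for this example and are not part of Julius or VoxForge.

```python
def parse_voca(text):
    """Parse Julius-style .voca text into {category: [(word, [phonemes]), ...]}."""
    categories = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding spaces
        if not line:
            continue
        if line.startswith("%"):             # "% NAME_N" opens a new word category
            current = line[1:].strip()
            categories[current] = []
        elif current is not None:
            word, *phones = line.split()
            categories[current].append((word, phones))
    return categories


def missing_words(categories, lexicon):
    """Return the grammar words that have no entry in the pronunciation lexicon."""
    return [word
            for entries in categories.values()
            for word, _phones in entries
            if word not in lexicon]


# Hypothetical sample data, for illustration only.
VOCA = """\
% NS_B
<s>     sil
% NS_E
</s>    sil
% CALL_V
PHONE   f ow n
CALL    k ao l
% NAME_N
STEVE   s t iy v
SVEN    s v eh n
"""

LEXICON = {
    "<s>": "sil", "</s>": "sil",
    "PHONE": "f ow n", "CALL": "k ao l", "STEVE": "s t iy v",
}

print(missing_words(parse_voca(VOCA), LEXICON))
```

Any word this reports would need a pronunciation added to the dictionary before it can be expected to work in the grammar.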

>Is there any *working* dictation application available for Linux,
>even if it is for English speech so I could get a picture of the
>current reliability of the efforts?

The limiting factor is not the software (Julius was designed for Japanese dictation). The limiting factor is the lack of a good transcribed and segmented speech corpus for the target language.

For the reason why, see Arthur Chan's article on "Why there is no Open Source Dictation" in this thread: Speech Recognition Engine comparison.

>And then: What effort would be needed to produce a reliable GERMAN
>acoustic model together with complete grammar files?

1000+ hours of transcribed, segmented speech (continuous, not read); a German phoneme list and pronunciation dictionary; and a German "acoustic tree questions" file. You would also need a million+ word Language Model.

>Where are you located? Do you have a German contact?

VoxForge is not a commercial enterprise, though Nickolay (nsh) has a company that does speech recognition consulting work.

>I would love to get into some deeper discussion about this with one
>of your engineers/maintainers.

You might want to try the VoxForge acoustic model creation tutorial to get an idea of what is involved in creating an acoustic model.

I'd like to post this thread on the VoxForge forum to get feedback from others. Please let me know if that is OK.

thanks,

Ken

Edit: Fixed links

Re: German acoustic model and grammar

User: kmaclean
Date: 11/19/2009 8:54 am

Views: 152
Rating: 7

His reply:

Thanks for the answers, Ken! Of course, you can post this on the forum.

As I said, I am an experienced technical translator (English > German). Among my colleagues, every
now and then somebody brings up the question of speech recognition for dictation (usually after having typed,
yet again, some word like 'Kontrollkästchen' (check box) a hundred times in the last few minutes ;). But no one
ever actually bought a dictation app, as far as I know. Translators are naturally fast typists and efficient
at using keyboard shortcuts to control the application they type in. So a recognition engine would have to be fast,
too, and allow a high degree of customization when it comes to control commands. And then, our texts
are usually highly specialized, which might be a disadvantage with "general" dictation apps, but might
actually be an advantage if one tried to tailor the recognition engine (speech corpus?) to technical
texts: they contain a lot of technical terms, but the overall count of unique words is low and the grammar
actually used is mostly simple.

I really do not know what market there would be to get some return on the efforts for a German dictation app.
I just know that, when I am facing tight deadlines for a project, my hands often get stiff, and my wrists
start to hurt after typing for hours and hours. At that point, I sometimes would welcome, and perhaps even pay
for, a dictation app if it would work fast and reliably.

thanks,
Sven

German dictionary needs prefix and suffix morphological rules

User: ralfherzog
Date: 11/19/2009 2:43 pm

Views: 149
Rating: 8

Hi Ken!

"You would also need a million+ word Language Model."

Yes, that is true. But currently, we have performance problems with large lexicons. One solution could be to learn from how OpenOffice.org spelling dictionaries are compressed: each dictionary is split into two files (.dic and .aff). In the long term, such an approach would help us. I think that pronunciation dictionaries for German, Latin, Dutch, Italian and Spanish need a special compression method. It would be good if someone who knows how OpenOffice.org spelling dictionaries are compressed could provide prefix and suffix morphological rules. So at the moment, we are far away from a solution for the German language.
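To make the .dic/.aff idea concrete, here is a minimal Python sketch of how a small list of stems plus shared suffix rules could be expanded into a larger pronunciation lexicon. It is only loosely modelled on the spelling-dictionary approach: the rule format, the flag "N", the SAMPA-like phoneme symbols and all names are assumptions made up for this example, and it assumes that an inflected form is pronounced as the stem plus a fixed phoneme sequence, which does not always hold in German (final devoicing, for instance, changes the stem's last sound in some forms).

```python
# Hypothetical stand-in for an .aff file: flag -> list of (suffix, suffix phonemes).
SUFFIX_RULES = {
    "N": [("E", ["@"]), ("EN", ["@", "n"]), ("ES", ["@", "s"])],
}

# Hypothetical stand-in for a .dic file: (stem, stem phonemes, flags).
STEMS = [
    ("TEXT", ["t", "E", "k", "s", "t"], "N"),
    ("UND", ["U", "n", "t"], ""),          # no flags: stored as-is only
]


def expand(stems, rules):
    """Expand stems with their flagged suffix rules into a full word -> phonemes map."""
    lexicon = {}
    for word, phones, flags in stems:
        lexicon[word] = phones
        for flag in flags:
            for suffix, suffix_phones in rules[flag]:
                lexicon[word + suffix] = phones + suffix_phones
    return lexicon


lexicon = expand(STEMS, SUFFIX_RULES)
for word, phones in lexicon.items():
    print(word, " ".join(phones))
```

Here two stored stems and three shared rules yield five lexicon entries; the saving grows with the number of stems that share a flag.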

Hello Sven,

"Is there any *working* dictation application available for Linux"

No, there isn't.

"What effort would be needed to produce a reliable GERMAN acoustic model together with complete grammar files?"

Well, I could use help with improving Ralf’s German dictionary (GPLv3; contains more than 300,000 German words).

I want to use sam for the development of a German acoustic model. In my opinion, this approach probably would be the easiest way.

Greetings,

Ralf

Re: German dictionary needs prefix and suffix morphological rules

User: kmaclean
Date: 11/19/2009 3:31 pm

Views: 3946
Rating: 8

Hi Ralf,

>Yes, that is true. But currently, we have performance problems with
>large lexicons.

Language models and pronunciation dictionaries (i.e., pronunciation lexicons) are separate things.

A language model is a very large list of words/phrases with their probability of occurrence and is used in dictation applications. Simon is used for command and control and I believe uses grammar files.
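To illustrate what "words/phrases with their probability of occurrence" means, here is a minimal Python sketch that estimates bigram probabilities from a toy corpus by simple counting. The three sentences and the function name are invented for this example; a real dictation language model would be trained on far more text and would use smoothing, which this sketch leaves out.

```python
from collections import Counter


def bigram_model(sentences):
    """Estimate P(next word | previous word) by maximum likelihood from raw counts."""
    contexts = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])              # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))   # adjacent word pairs
    return {pair: count / contexts[pair[0]] for pair, count in bigrams.items()}


# Toy corpus, for illustration only.
corpus = ["open the file", "close the file", "open the window"]
model = bigram_model(corpus)

for (prev, nxt), prob in sorted(model.items()):
    print(f"P({nxt} | {prev}) = {prob:.2f}")
```

A grammar file, by contrast, only lists which word sequences are allowed at all, which is why it suits command and control better than free dictation.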

Pronunciation dictionaries are used in the creation of acoustic models, and for pronunciation information in grammar files.

Ken

P.S. thanks for letting me know about the link problems... they worked fine in the email, but something happened when I copied them to the WebGUI forum... must be a feature :)
