Txt2Pho - voxforge.org

German

Txt2Pho

User: kmaclean
Date: 4/16/2008 9:33 pm

Views: 27245
Rating: 30

This might be useful for the creation of a German pronunciation dictionary:

TXT2PHO - a TTS front end for the German inventories of the MBROLA project.

However, the software has some restrictive licensing provisions:

Permission is granted to use this software for non-commercial, non-military purposes, with and only with the lexicon and prosody files made available by the author from the HADIFIX for MBROLA project ...

I am not sure whether that would also apply to pronunciations generated with the toolkit.

Ken

Re: Txt2Pho

User: timobaumann
Date: 4/18/2008 10:03 am

Views: 3076
Rating: 30

I don't think we can use it.
Using TXT2PHO to create a dictionary comes close to reading the dictionary it uses (BOMP) directly. And both the dictionary and TXT2PHO itself clearly state that they are for non-military use only, a restriction that the GPL, unfortunately, does not include.

Anyway, if we could use it, then we could just as well use BOMP directly.

I've had a first look at Sequitur G2P (a trainable g2p tool), and it's likely that I will be allowed to use another trainable g2p tool (unnamed, published in [1]). I will then be able to compare the two and see which performs better.

So, we need some data to bootstrap these trainable systems. I just checked in some tools that extract pronunciations from the German Wiktionary.

The resulting data has to be post-processed before we can use it for bootstrapping. To prioritize that work, we could use the word frequency information from the Wortschatz project, for which a Perl module is available (EDIT: there is a newer version with fixed frequency extraction).
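As a rough illustration of that prioritization, here is a minimal Python sketch that orders extracted entries so that the most frequent words get post-processed first. The function name, the (word, transcription) pair layout, and the frequency counts are all assumptions made up for the example; they are not the output format of the Wiktionary tools or of the Wortschatz Perl module.

```python
def prioritize(entries, frequencies):
    """Sort (word, transcription) pairs by descending word frequency.

    Words missing from the frequency table are treated as frequency 0
    and therefore end up at the back of the queue.
    """
    return sorted(entries, key=lambda e: frequencies.get(e[0], 0), reverse=True)


# Hypothetical sample data: the counts are invented for illustration only.
sample_entries = [("Vater", "ˈfaːtɐ"), ("und", "ʊnt"), ("Haus", "haʊ̯s")]
sample_frequencies = {"und": 1000, "Haus": 50, "Vater": 20}

for word, transcription in prioritize(sample_entries, sample_frequencies):
    print(word, transcription)
```

Sorting once up front keeps the web tool simple: it can hand out entries in list order and the most useful words are covered first.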

I hope to be able to set up a web tool that helps to post-process the Wiktionary output. Would anyone volunteer to actually use that web tool and help in creating the dictionary? Ralf, would you be willing (and able) to help?

Cheers!
Timo

[1]: Phonological Constraints and Morphological Preprocessing for Grapheme-to-phoneme Conversion
Vera Demberg, Helmut Schmid and Gregor Möhler, 2007
In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL-07), Prague, Czech Republic, June 2007

Re: Txt2Pho

User: kmaclean
Date: 4/18/2008 11:25 am

Views: 297
Rating: 27

Hi Timo,

Good work!

thanks,

Ken

help to create the German dictionary

User: ralfherzog
Date: 4/20/2008 12:03 am

Views: 453
Rating: 29

Hello Timo, I would use this web tool, and help you with the creation of the dictionary. Greetings, Ralf

Re: help to create the German dictionary

User: Visitor
Date: 5/14/2008 4:12 pm

Views: 257
Rating: 26

Hi Ralf,

Sorry for not getting back to you earlier.

I've set up a dictionary tool at http://www.ling.uni-potsdam.de/~timo/projekte/voxforge.html . The main task is to paste the entries in the first row on the right (Aussprachen) into the corresponding field on the left.

Now, if it was just that, it would be too easy and too boring...

Often, there are far more variants of the word on the left than there are transcriptions. In these cases it would be nice if you could add the missing transcriptions (often it is just a matter of appending -? or -? or whatever).

Sometimes the list on the left contains ridiculous word forms; just leave the corresponding field empty (or press "Wort entfernen", but the result will be the same). It may also happen that you are asked for the same word more than once (there are different entries for "bin", "ist", "sind" in the Wiktionary, and each entry will ask about all the different sein-forms). If you are sure you've entered a transcription already, then just ignore it the second time.

Sometimes there are actually more transcribed word forms than words on the left. (Or they are different.) Then you can add a word form on the left with "Wort hinzufügen". Note: Often there are different transcriptions for the same word form (?v?ltn?, ?v?lt?n). Usually you would want to pick the form that would be used most in colloquial speech (here: v?ltn?).

Also, there may just be erroneous transcriptions (quite often), where people just guessed how IPA works. It's important that we catch most of these errors. So you might actually want to start out with the Wiktionary Transcription Guideline, which shows how the transcription *should* be.

To enter IPA symbols into the text fields directly, just type the keys listed on the right (for ? type N) and they will automagically be transformed to IPA. (This works in Firefox; I don't have Windows, so I can't check Internet Explorer.)

Please enter your e-mail address or another kind of ID into the first text field. This way we can later compare who's the hardest-working transcriber!

Cheers, Timo

Re: help to create the German dictionary

User: timobaumann
Date: 5/14/2008 4:14 pm

Views: 477
Rating: 25

clickable link: http://www.ling.uni-potsdam.de/~timo/projekte/voxforge.html

UPDATE: It's important that you transcribe how something would be spoken in colloquial standard German. By the way, what region of Germany are you from? ;-)

Re: help to create the German dictionary

User: nsh
Date: 5/15/2008 1:50 am

Views: 177
Rating: 31

Another good method for getting a dictionary, and a popular one nowadays, is the following: you select a phone set, build an LTS system that generates pronunciation variants, and then use forced alignment against the recordings to check whether the pronunciations are valid. This way you automatically ensure that your dictionary is correct.

You would probably also be interested in looking at the Unilex dictionary available from CSTR to see what a modern dictionary looks like.

Re: help to create the German dictionary

User: kmaclean
Date: 5/16/2008 10:38 am

Views: 423
Rating: 19

Hi nsh,

>You select a phoneset, build an LTS system that will generate variants and
>then use forced-alignment against the recording to check are pronunciations
>valid or not.

Forgive my ignorance, but by "LTS" do you mean "Letter to Sound"? If so, do you mean that for each letter in the alphabet of a target language, you create a table that contains the different sounds that the letter might have, and then you create a dictionary that has multiple alternate pronunciations of the same word? Then you take a transcribed speech recording and let the speech recognizer figure out the correct phonemes for each word (using forced alignment), based on what it recognizes in the recording?

For example, if someone wants to create a dictionary for a new language, do you first start with a set of speech transcriptions for the target language (i.e. speech audio files with a transcription of the actual words spoken in a text file)?

Then you create the letter-to-sound rules. For example, the word "house" in the VoxForge dictionary is pronounced as follows:

HOUSE [HOUSE] hh aw s

If I were using your approach, first I would create a phone list like this (CMU's phone list in this case):

        Phoneme Example Translation
        ------- ------- -----------
        AA      odd     AA D
        AE      at      AE T
        AH      hut     HH AH T
        AO      ought   AO T
        AW      cow     K AW
        AY      hide    HH AY D
        B       be      B IY
        CH      cheese  CH IY Z
        D       dee     D IY
        DH      thee    DH IY
        EH      Ed      EH D
        ER      hurt    HH ER T
        EY      ate     EY T
        F       fee     F IY
        G       green   G R IY N
        HH      he      HH IY
        IH      it      IH T
        IY      eat     IY T
        JH      gee     JH IY
        K       key     K IY
        L       lee     L IY
        M       me      M IY
        N       knee    N IY
        NG      ping    P IH NG
        OW      oat     OW T
        OY      toy     T OY
        P       pee     P IY
        R       read    R IY D
        S       sea     S IY
        SH      she     SH IY
        T       tea     T IY
        TH      theta   TH EY T AH
        UH      hood    HH UH D
        UW      two     T UW
        V       vee     V IY
        W       we      W IY
        Y       yield   Y IY L D
        Z       zee     Z IY
        ZH      seizure S IY ZH ER

I would then create a set of letter-to-phone rules as follows (phonemes converted to lower case for easier reading):

H hh
O ow, oy, uw
U uh
S z
E iy

Then create rules for letter combinations to sounds (only for such letter combinations that have a unique sound in the target language):

HO hh aa, hh uh, hh ow
OU aw
US ax s
SE s

Then generate all the possible pronunciations for the word "house":

HOUSE hh ow uh z iy
HOUSE hh oy uh z iy
HOUSE hh uw uh z iy
HOUSE hh aa uh z iy
HOUSE hh uh uh z iy
...
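The generation step above could be sketched in Python roughly as follows. This is only one possible reading of the approach: the rule table is copied from the two example lists above, the function names are made up for illustration, and the idea of trying every way of splitting the word into single letters and letter combinations is an assumption about how the variants would be enumerated.

```python
from itertools import product

# Rule table copied from the example lists above (single letters and combinations).
RULES = {
    "H": ["hh"],
    "O": ["ow", "oy", "uw"],
    "U": ["uh"],
    "S": ["z"],
    "E": ["iy"],
    "HO": ["hh aa", "hh uh", "hh ow"],
    "OU": ["aw"],
    "US": ["ax s"],
    "SE": ["s"],
}


def segmentations(word, rules):
    """Yield every way of splitting word into units that have a rule."""
    if not word:
        yield []
        return
    for length in range(1, len(word) + 1):
        unit = word[:length]
        if unit in rules:
            for rest in segmentations(word[length:], rules):
                yield [unit] + rest


def candidates(word, rules=RULES):
    """Return all candidate pronunciations, without duplicates, in generation order."""
    found = {}
    for units in segmentations(word.upper(), rules):
        for choice in product(*(rules[unit] for unit in units)):
            found[" ".join(choice)] = None  # dict keys keep insertion order
    return list(found)


for pronunciation in candidates("house"):
    print("HOUSE", pronunciation)
```

With these rules the list includes the letter-by-letter variants shown above as well as "hh aw s" (from H + OU + SE), which is the variant the forced alignment is expected to pick.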

And then use the forced alignment feature of a speech recognition engine (like Sphinx, HTK, ...) to look at the text of a particular recording (in this case of the single word "house"), and see which phonemes it identifies as the most likely ones used in the recording (HTK format in this example):

    0 9400000 sil -5373.277832 SENT-END
    9400000 10400000 hh -750.756897 HOUSE
    10400000 11300000 aw -659.823364
    11300000 12900000 s -962.888245
    12900000 13300000 sil -238.437622 SENT-END

This can then be fed into a script to create the final, correct pronunciation for the word "house":

HOUSE [HOUSE] hh aw s
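Such a script might look roughly like the Python sketch below. It assumes the alignment is laid out exactly as in the sample above (start time, end time, phone, score, and the word on the first phone of each word) and that "SENT-START"/"SENT-END" mark silence to be skipped; the function name is made up for illustration.

```python
# Sample alignment copied from the example above.
ALIGNMENT = """\
0 9400000 sil -5373.277832 SENT-END
9400000 10400000 hh -750.756897 HOUSE
10400000 11300000 aw -659.823364
11300000 12900000 s -962.888245
12900000 13300000 sil -238.437622 SENT-END
"""


def entries_from_alignment(text, skip_words=("SENT-START", "SENT-END")):
    """Turn phone-level alignment lines into 'WORD [WORD] phones' entries."""
    entries = []    # list of (word, [phones]) in order of appearance
    current = None  # entry currently collecting phones, or None while skipping
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not an alignment line
        phone = fields[2]
        if len(fields) >= 5:
            # A fifth field marks the first phone of a new word.
            word = fields[4]
            current = None if word in skip_words else (word, [])
            if current is not None:
                entries.append(current)
        if current is not None:
            current[1].append(phone)
    return [f"{word} [{word}] {' '.join(phones)}" for word, phones in entries]


for entry in entries_from_alignment(ALIGNMENT):
    print(entry)
```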

Ken

German dictionary acquisition project

User: ralfherzog
Date: 5/28/2008 2:09 am

Views: 307
Rating: 19

Hello Timo,

Congratulations on the great work that you have done. I have just submitted my first entry for the dictionary acquisition project. It was the German word "Vater."

I would say that I speak standard German. I live in Bonn, but I don't speak "Bönnsch" or "Kölsch."

I am often online in the IRC channel #cmusphinx. Feel free to join.

Greetings, Ralf

retroflex nasal; dictionary acquisition project crashes

User: ralfherzog
Date: 5/30/2008 5:08 pm

Views: 367
Rating: 19

Hello Timo,

Here are three additional remarks.

1. In the transcription key of the dictionary acquisition project, the last entry is "= n̩". What does this symbol mean? I assume that it indicates the retroflex nasal (IPA number 117; Unicode: U+006E (n), U+0329). But do we need this symbol for German? I can't find it in your proposal for the German phone set.
In the entry for the German word "festesten" (superlative of "fest"), the pronunciation is given as "'fɛstəstn̩". Are you sure that we need the retroflex nasal? I would prefer "fɛstəstn".

2. Sometimes the dictionary acquisition project crashes. This happened yesterday and today on my computer (Win XP, Firefox; both current versions), so I had to restart the Firefox browser. Apparently, the dictionary acquisition project doesn't like it when I enter characters that are not allowed.

3. Do you plan to release the results of the dictionary acquisition project in the Pronunciation Lexicon Specification format?

Greetings, Ralf
