Phoneme set discussion
User: timobaumann
Date: 12/12/2007 4:27 am
Views: 20785
Rating: 53


we are currently discussing the design of our phoneme set in . Feel free to join, either on the wiki page or here in the forums.


IPA transcription of "Europa"
User: ralfherzog
Date: 6/3/2008 11:49 pm
Views: 463
Rating: 29
Hello, currently I am trying to get involved in the dictionary acquisition project.  At the moment, I have a question concerning the IPA transcription of the German word "Europa".  The IPA transcription is displayed as "ɔɪ̯ˈʁoːpa".  But in the PhoneSet, there is the example "Kreuz [ɔʏ̯]". Which transcription should I use?
Re: IPA transcription of "Europa"
User: timobaumann
Date: 6/4/2008 3:28 am
Views: 291
Rating: 67

Hi Ralf,

actually, both are fine. The Wiktionary-Guideline uses ɔɪ̯, so we should probably stick with that. The phoneset definition used ɔʏ̯, mostly because the cited SAMPA-versions used OY.

Anyways, phonetically speaking there is a difference (ɔʏ̯ ends in a rounded manner, while ɔɪ̯ does not), but phonemically (the level we are using here, because we can't model all the different variants anyway) they are identical for German.

There will be quite an amount of (partly) automatic checking and streamlining, to get the dictionary in a usable shape. Identifying and unifying ɔɪ̯ and ɔʏ̯ won't be a problem, so don't worry about them too much.

Cheers! Timo

In other news: just checked your submissions (thank you!) and have some comments about the first "r" in "festeren, festerer": I've changed those to fEst@r@n and fEst@r6, as the r-sound is actually realized (only use /6/ when the "r" is not audible). I also added this example to the Wiktionary-page because this was not clear on their page.

I also changed ve:nIg to ve:nIC, due to Auslautverhärtung. (Southern Germans may actually say ve:nIg, but that's dialect :-)

Please excuse my using SAMPA instead of IPA in this post. It's just so much easier to type.
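
For readers who prefer IPA, the SAMPA strings in this post can be converted mechanically. The following is only a minimal sketch: the table covers just the symbols that occur in this thread, and the names `SAMPA_TO_IPA` and `sampa_to_ipa` are made up for illustration, not part of any project tool.

```python
# Illustrative subset of the German SAMPA-to-IPA correspondence.
# Only symbols that appear in this thread are listed (an assumption,
# not the project's full phone set).
SAMPA_TO_IPA = {
    "e:": "eː",
    "OY": "ɔʏ",
    "E": "ɛ",
    "I": "ɪ",
    "@": "ə",
    "6": "ɐ",
    "C": "ç",
}

def sampa_to_ipa(sampa: str) -> str:
    """Convert a SAMPA string to IPA, trying the longest symbols first."""
    symbols = sorted(SAMPA_TO_IPA, key=len, reverse=True)
    out = []
    i = 0
    while i < len(sampa):
        for sym in symbols:
            if sampa.startswith(sym, i):
                out.append(SAMPA_TO_IPA[sym])
                i += len(sym)
                break
        else:
            # Anything not in the table (f, s, t, r, n, v, ...) is copied as-is.
            out.append(sampa[i])
            i += 1
    return "".join(out)

print(sampa_to_ipa("fEst@r@n"))
print(sampa_to_ipa("fEst@r6"))
print(sampa_to_ipa("ve:nIC"))
```

Matching the longest symbols first keeps two-character symbols such as "e:" from being split into a vowel and a stray colon.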

ɔɪ̯ and ɔʏ; fEst66 is wrong; ve:nIg is dialect
User: ralfherzog
Date: 6/4/2008 3:48 am
Views: 347
Rating: 30
Hello Timo,

1. OK, I won't worry about the difference between ɔɪ̯ and ɔʏ too much.

2. I will keep in mind that I have to transcribe fEst@r6 (correct) instead of fEst66 (wrong).

3.  Now I have learned something.  I didn't know that ve:nIg is dialect.

Greetings, Ralf
IPA transcription of "Donnerstag"
User: ralfherzog
Date: 6/4/2008 10:01 pm
Views: 476
Rating: 23
At the moment, I am preparing the entry for "Donnerstag".  I would like to say something about the "R".  In my opinion, the correct IPA transcription should be "d?n??sta?k".  But I decided to use the transcription as given in the Wiktionary.  And I think that you should allow the "?" (examples: dort [d??t], wird [v??t]).  I would like to get it right from the beginning.
Re: IPA transcription of "Donnerstag"
User: timobaumann
Date: 6/6/2008 2:03 am
Views: 481
Rating: 22

I believe the wiktionary is right. The case is different from the cases you mention (dort and wird) in two regards. First, the /s/ in Donnerstag is a linking-s (Fugen-s) which is preceded by a morpheme boundary (unlike the [t]s in dort and wird) and thus the attachment between /??/ is stronger than between /?s/. Now, why am I writing /??/? Because with "//" we're on the phonemic level while with "[]" we're on the phonetic level. The phonemes /??/ are reduced to [?] by so-called postphonological processes. This reduction is the other thing which makes Donnerstag different from dort, as it does not occur in the latter.

Now it gets difficult: I argue against different Rs, as the realization of (phonologic) /r/ in German varies widely both between individuals and between dialects. It's a mess. Thus, I would actually just use [r] as our generic R and keep the uppercase symbol in case we ever want to transcribe English Rs (which also vary widely between dialects). As before, don't worry about this too much yet, but I'll likely add some post-processing to change all r-variants to [r].
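
Such a post-processing step could look roughly like the sketch below. Which r-symbols actually turn up in the submissions is an assumption here (the uvular IPA symbols ʁ and ʀ), and `unify_r` is a hypothetical name. The vocalized /6/ (IPA ɐ) and the uppercase SAMPA symbol are deliberately left untouched.

```python
import re

# Assumed set of r-variants to collapse; adjust to whatever the
# submitted transcriptions really contain.
R_VARIANTS = "ʁʀ"
_R_PATTERN = re.compile(f"[{R_VARIANTS}]")

def unify_r(transcription: str) -> str:
    """Replace every listed r-variant with the generic symbol 'r'."""
    return _R_PATTERN.sub("r", transcription)

print(unify_r("dɔʁt"))      # consonantal r is normalized
print(unify_r("fɛstərɐ"))   # already generic r, vocalized ɐ is kept
```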

Re: IPA transcription of "Donnerstag"
User: ralfherzog
Date: 6/7/2008 2:21 pm
Views: 282
Rating: 20
Hello Timo,

What do you think about the following idea?  We could build two German pronunciation lexicons.

Lexicon #1: IPA (with different Rs), PLS, GPL.

Lexicon #2: Your PhoneSet (our generic R), designed for the use with HTK, or Sphinx; GPL.

You have the necessary programming skills.  And it wouldn't be bad if there were a free IPA lexicon.

I just took a look into the Wikipedia to learn more about the difference between phone and phoneme.  Well, I understand the concept.  But what is the impact of this distinction when it comes to speech recognition?  I don't know.

Greetings, Ralf
Re: IPA transcription of "Donnerstag"
User: timo
Date: 6/9/2008 10:40 am
Views: 155
Rating: 27

Hi Ralf,

that sounds good to me. I hope to be able to look into the lexicon a bit this weekend and can then automatically generate both PLS as well as the plain-text stuff I need for Sphinx. Your work till then (and beyond) is greatly appreciated!

The distinction between phonetics and phonology in ASR is always a bit weird: In general we want to and can only recognize phones (that is, the actual realizations of abstract phonemes). Phones are supposed to describe precisely the differences between all human speech sounds. They tell us exactly what has been uttered and how. On the other hand, phonemes have the nice property of being abstract and thus (in theory) working cross-realization, cross-person, cross-dialect, etc. Obviously this is doomed to fail: As current ASR doesn't have proper phonologic modules that describe how the cross-* phonemes should be mapped onto realization-specific phones, having a purely phonologic lexicon doesn't help. A purely phonetic lexicon doesn't help either, because every realization is unique. So we are stuck somewhere in between, where we try to model those phonetic differences between realizations that are relevant to ASR.

One example: [x] as in "Nacht" and [ç] as in "nicht" are definitely two different sounds (and thus different phones), but they are both the same phoneme in German (/x/ or /ç/, whichever you like), because the realization as [x] or [ç] can be contextually determined from the preceding vowel: [x] after /a/, /u/, /o/ and /au/, [ç] otherwise. *But*: For ASR, there is a huge difference between [x] and [ç]. Relying on triphones (senones in Sphinx-lingo) for the different contextual realizations of [x]/[ç] would be possible. But it would be very inefficient, because state-tying and, even more so, context-independent modelling assume that segments always sound more or less alike. Thus, coding [x]/[ç] explicitly greatly improves performance.
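
The contextual rule is simple enough to apply automatically when expanding a phonemic entry into phones. A minimal sketch, assuming a placeholder symbol "CH" for the abstract phoneme and taking the vowel symbols literally from the rule above (a real phone set would list its own vowel symbols, including long and short variants):

```python
# Contexts after which the phoneme is realized as [x], taken from the rule
# stated above; everything else yields [ç].
BACK_CONTEXTS = {"a", "u", "o", "au"}

def realize_ch(phonemes):
    """Replace the placeholder 'CH' with 'x' or 'ç' based on the preceding phoneme."""
    result = []
    for p in phonemes:
        if p == "CH":
            prev = result[-1] if result else ""
            result.append("x" if prev in BACK_CONTEXTS else "ç")
        else:
            result.append(p)
    return result

print(realize_ch(["n", "a", "CH", "t"]))  # Nacht
print(realize_ch(["n", "I", "CH", "t"]))  # nicht
```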

A counter example: /r/ is realized in many different ways, depending mostly on dialect, speaker, etc. This is impossible to model in a dictionary. But, as there are so many possible realizations, it doesn't help to split the /r/ into slightly different context-dependent realizations, as the superimposed inter-personal differences are much larger. It would actually harm in training (and decoding), because the training material (respectively the probabilities for decoding) would be split between the different models, resulting in worse recognition. Nonetheless, for an instructive dictionary, distinguishing both /r/'s would be fine.

Hope this helps a little and take it with a grain of salt,

two lexicons; phones [x]/[ç] and phoneme /r/
User: ralfherzog
Date: 6/10/2008 2:18 am
Views: 919
Rating: 22
Hello Timo,

1. Thanks for your explanation concerning the word [fɛɐ̯ˈlasn̩].  Your explanation sounds reasonable, but I wouldn't be able to produce the same result on my own.  The details are too complicated for me.  I just hope that I don't make too many mistakes when adding transcriptions to the dictionary acquisition project.

2.  It would be great if you could generate the two different lexicons (PLS, and a plaintext version for Sphinx) this weekend.  I will continue to submit entries.  Please tell me if you find any minor or major mistakes.

3. I didn't know that "current ASR doesn't have proper phonologic modules."  Or to be more precise: I haven't thought about this important question. I am planning to learn more about how Sphinx etc. work.  But at the moment, it seems to be a wise decision to focus on the dictionary acquisition project.

3.a.  Thanks for the very good explanation concerning the phones [x]/[ç].  So we should distinguish these phones in the plaintext lexicon for Sphinx even though these two phones belong to the same phoneme in German.

3.b. OK, it seems to be a good decision to use just one generic phoneme /r/ for the plaintext lexicon.  But would this also be the case if we aimed to produce a speaker-dependent lexicon rather than a speaker-independent one? Well, in the long term, I would like to have free speech recognition software that is speaker independent.  But to achieve this goal, maybe we will need speaker-dependent lexicons in the long term?  So it might be a truly wise decision to develop two versions of the lexicon (one version with the different Rs as PLS, and a plaintext version with just one /r/ for Sphinx).

Yes, your explanations helped.  Sometimes it seems to be useful to distinguish between phones ([x]/[ç]), and sometimes this is obviously not the case (/r/).

Greetings from Bonn, Ralf