VoxForge
Hello,
we are currently working on a solution to automatically transcribe recorded (German) lectures from our university. To achieve a good recognition rate, we plan to transcribe some lectures from each lecturer and add the transcriptions to the acoustic model.
We would like to use the German Sphinx acoustic model from the German download section. Our plan is to add our transcripts to the .transcription file. Since many of the lectures are about computer science, we will probably have to add some technical terms to the dictionary file (.dic). There we find rows like this:
AUFNAHME qq au f n aa: m @
How can we insert new entries into this dictionary? What we currently have is a collection of phonetic transcriptions in IPA format from the German Wiktionary. Which format is used in these .dic files, and is there a way to convert IPA transcriptions to this format?
Thanks,
Christoph
--- (Edited on 8/25/2010 6:19 am [GMT-0500] by Christoph) ---
> we currently work on a solution to automatically transcribe recorded
> (German) lectures from our university. To achieve a good recognition rate,
> we are about to transcribe some lectures from each lecturer and add the transcriptions to the acoustic model.
That's a great project, I'm looking forward to seeing it done!
> How can we insert new entries to this dictionary?
To insert a word into the dictionary, just open the dictionary in a text editor and add the word there. Afterwards, sort the dictionary with the Unix sort command.
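The same edit-and-sort step could also be scripted. The following is only a minimal Python sketch: the helper names `add_entries` and `update_dictionary_file`, the UTF-8 encoding, and the pronunciation of the new entry are assumptions made for illustration and are not taken from the actual model files.

```python
def add_entries(lines, new_entries):
    """Return dictionary lines with new_entries merged in, deduplicated and sorted."""
    # Strip line endings and drop blank lines before merging.
    merged = {line.strip() for line in lines if line.strip()}
    merged.update(entry.strip() for entry in new_entries if entry.strip())
    # Plain code-point sort; comparable to a byte-wise Unix sort on UTF-8 text.
    return sorted(merged)


def update_dictionary_file(path, new_entries):
    """Rewrite the dictionary file at path with new_entries added."""
    # Assumption: the file is UTF-8; adjust the encoding if the real file differs.
    with open(path, encoding="utf-8") as handle:
        lines = handle.readlines()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(add_entries(lines, new_entries)) + "\n")


existing = ["AUFNAHME qq au f n aa: m @\n"]
# Hypothetical entry; its phones are not checked against the model's phoneset.
for entry in add_entries(existing, ["COMPILER k o m p ai l @ r"]):
    print(entry)
```

Keeping the merge in a function that works on lists of lines makes it easy to try out before touching the real file.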
The issue is that this dictionary is _VERY UGLY_. It was built with espeak as a quick attempt and is not really well done. I suggest you use Timo's/Ralf's hand-made German dictionary, BOMP, or Wiktionary and redo the dictionary from scratch using your phoneset. Just convert IPA to plain ASCII phones, avoiding numbers and special symbols, and use multi-letter phones if needed.
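One way to check a dictionary against that advice is to flag every phone that is not made of plain ASCII letters. This is a rough Python sketch; the function name `suspicious_phones` is made up here, and the letters-only rule is one possible reading of "plain ASCII phones".

```python
import re

# A phone counts as "plain" here if it consists of ASCII letters only.
PLAIN_PHONE = re.compile(r"[A-Za-z]+")


def suspicious_phones(line):
    """Return the phones of one dictionary line that contain digits or symbols."""
    word, *phones = line.split()
    return [phone for phone in phones if not PLAIN_PHONE.fullmatch(phone)]


print(suspicious_phones("AUFNAHME qq au f n aa: m @"))  # ['aa:', '@']
```

Applied to the example row from the existing dictionary, this reports `aa:` and `@`, which are the phones that would need new multi-letter names in a redone phoneset.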
--- (Edited on 8/25/2010 17:11 [GMT+0400] by nsh) ---
Hi Christoph,
The format of the example line from the dictionary file is that everything up to the first whitespace is the word, followed by a list of whitespace-delimited phonemes.
To work out the mapping between IPA and what you have, it's probably easiest to look up words from the existing dictionary on Wiktionary. For example, http://de.wiktionary.org/wiki/Aufnahme says:
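As a small illustration of that format, a line can be split into the word and its phone list like this. It is a Python sketch; `parse_entry` is a name chosen for this example only.

```python
def parse_entry(line):
    """Split one dictionary line into (word, phones)."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected a word and at least one phone: {line!r}")
    # Everything up to the first whitespace is the word, the rest are phones.
    return fields[0], fields[1:]


word, phones = parse_entry("AUFNAHME qq au f n aa: m @")
print(word)    # AUFNAHME
print(phones)  # ['qq', 'au', 'f', 'n', 'aa:', 'm', '@']
```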
IPA: ['aʊ̯fnaːmə]
From which you get:
au -> aʊ̯
f -> f
n -> n
aa: -> aː
m -> m
@ -> ə
I'm not sure what the qq is doing in there - perhaps I've misinterpreted the dictionary format.
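The mapping above can be turned into a small converter. The sketch below is in Python and uses greedy longest-match lookup, so that multi-character IPA sequences such as the diphthong are matched before single symbols. The table holds only the six pairs from the Aufnahme example and would have to be extended for real use; the names `IPA_TO_PHONE`, `IGNORED` and `ipa_to_phones` are invented for this illustration.

```python
# IPA-to-phone table taken from the Aufnahme example; it covers only
# these six sounds and would need extending for a real dictionary.
IPA_TO_PHONE = {
    "a\u028a\u032f": "au",  # aʊ̯ (diphthong, including the non-syllabic mark)
    "a\u02d0": "aa:",       # aː
    "\u0259": "@",          # ə
    "f": "f",
    "n": "n",
    "m": "m",
}
# Brackets, slashes and stress marks carry no phone of their own.
IGNORED = {"[", "]", "/", "'", "\u02c8"}


def ipa_to_phones(ipa):
    """Convert an IPA string to a list of phones by greedy longest match."""
    keys = sorted(IPA_TO_PHONE, key=len, reverse=True)
    phones = []
    i = 0
    while i < len(ipa):
        if ipa[i] in IGNORED:
            i += 1
            continue
        for key in keys:
            if ipa.startswith(key, i):
                phones.append(IPA_TO_PHONE[key])
                i += len(key)
                break
        else:
            # Fail loudly on symbols the table does not cover yet.
            raise ValueError(f"no mapping for {ipa[i]!r} at position {i}")
    return phones


print(" ".join(ipa_to_phones("['a\u028a\u032ffna\u02d0m\u0259]")))
# au f n aa: m @
```

The leading qq of the existing dictionary entry is not produced, because the Wiktionary transcription contains no corresponding symbol.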
Once you've built your main dictionary you'll have a good feel for how the phonetics work. Then you need to use a tool like g2p to generate pronunciations for the words you don't have. This isn't perfect and you'll probably want to check the results, but it's a good starting point, and chances are you'll only need it for relatively rare words.
I've done a lot of similar work over the last ten years, so let me know if you want to work together on this.
Tony
--- (Edited on 25-August-2010 2:14 pm [GMT+0100] by TonyR) ---
Hello Christoph,
"How can we insert new entries to this dictionary?" - You can read my articles How I create Ralf's Swiss German dictionary or Improving Ralf’s Hungarian dictionary to get an impression of how dictionary development can be done efficiently.
"collection of phonetics in the IPA format from the German wiktionary." - Wiktionary's license is not GPL-compatible. I suggest that you download Ralf's German dictionary (because it is GPLv3). This PLS dictionary contains more than 300,000 German words. To see Ralf's German dictionary in action, please watch the YouTube video German speech model ‘xad’, which makes use of it.
"is there a way to convert IPA phonetics to this format?" - I am using my XSLT stylesheet improve-german.xsl for dictionary improvement (it replaces SAMPA phonemes with IPA phonemes). You can develop your own XSLT stylesheet that transforms the Wiktionary IPA phonemes into SAMPA phonemes. E.g. use the XPath expression replace($sierra, 'ʃ', 'S') to transform from IPA to SAMPA. XSLT/XPath are excellent languages for dictionary development.
Hi TonyR,
"I'm not sure what the qq is doing in there" - qq probably means quiet, and this is the German "Knacklaut" (= glottal stop). It is unclear whether the glottal stop is a phoneme. Two examples from Wiktionary: "Theater /[teˈʔaːtɐ]/, beantworten /[bəˈʔantvɔʁtn̩]/"
Regards, Ralf
--- (Edited on 2010-08-25 10:42 am [GMT-0500] by ralfherzog) ---
Thank you for your answers and the valuable information. Ralf's German dictionary is amazing; we will try to convert it to the syntax Sphinx expects.
Audio transcription is part of our chair's project to build an online video portal for our lectures. Our part is somewhat experimental: we want to see how reliably automated transcription works in our special use case.
Thank you for your help so far,
Christoph
--- (Edited on 8/26/2010 4:43 am [GMT-0500] by Christoph) ---