VoxForge
Hi,
I would like to add the Romanian language to VoxForge, but I don't know where to start.
First, I want to say that Romanian is a very phonetic language, and I wonder if there is an easy way to add it to VoxForge (Julian/Sphinx or anything else).
For more information about the Romanian language, please take a look here:
http://en.wikibooks.org/wiki/Romanian/Pronunciation_and_alphabet
http://www.phon.ucl.ac.uk/home/sampa/romanian.htm
Thanks,
BogDan,
--- (Edited on 2/15/2009 4:58 am [GMT-0600] by bogdan) ---
Hi Bogdan
It's not a problem to add Romanian, but we need several hours of properly transcribed recordings. Record a book or a talk, and try to record as many speakers as you can. Once you have enough data, it will be easy to add everything else.
--- (Edited on 2/18/2009 2:19 am [GMT-0600] by nsh) ---
Hi
I want to thank you for your response. I can ask some friends who work in radio; they have many hours of recordings and the transcriptions too. Would this help me add the Romanian language?
What audio format do I need?
Should the transcription be like subtitles (with timings), or only the plain transcription?
Thanks,
BogDan,
--- (Edited on 2/18/2009 4:17 am [GMT-0600] by bogdan) ---
> I want to thank you for your response. I can ask some friends who work in radio; they have many hours of recordings and the transcriptions too. Would this help me add the Romanian language?
We accept only free, GPL-licensed audio. Recordings from a radio station could have license issues. Also, there might be issues with speech variability. If you have nothing else, it would still be nice to use the radio recordings.
> What audio format do I need?
Check the FAQ
http://www.voxforge.org/home/docs/faq/faq/what-kind-of-audio-formats-is-voxforge-looking-for#E95ZmEBb3zy2k4HutEWUxw
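If you want to check your recordings before submitting, here is a minimal sketch using Python's standard `wave` module. The 48 kHz / 16-bit target is only an assumption taken from the name of the examples directory in the repository; the FAQ above is the authoritative source. The function names `describe_wav` and `matches_target` are made up for this illustration.

```python
import wave


def describe_wav(source):
    """Return (sample_rate_hz, bits_per_sample, channels) for a WAV file.

    `source` may be a path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def matches_target(source, rate=48000, bits=16):
    """True if the WAV has the assumed sample rate and bit depth.

    The defaults are an assumption, not an official VoxForge requirement.
    """
    found_rate, found_bits, _channels = describe_wav(source)
    return found_rate == rate and found_bits == bits
```

For example, `matches_target("recording.wav")` (a hypothetical file name) tells you whether the file needs to be re-recorded or converted.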
> Should the transcription be like subtitles (with timings), or only the plain transcription?
Check the FAQ
http://www.voxforge.org/home/docs/faq/faq/what-is-transcribed-or-annotated-speech-audio-file#o49C1POV3O54wFLNGupEhQ
And the examples
http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Original/48kHz_16bit/
Basically, the audio should be split into chunks of 5-20 words, each with a proper transcription.
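A rough way to check the chunk length is sketched below. It assumes a prompts file with one chunk per line, where the first token is a file id and the rest is the transcription; that layout, the `ro0001`-style ids, and the function name `check_prompts` are assumptions for illustration, so compare with the examples linked above.

```python
def check_prompts(lines, min_words=5, max_words=20):
    """Return (file_id, word_count) for every prompt outside the word range.

    Each line is assumed to look like: "<file_id> WORD WORD WORD ...".
    """
    problems = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        file_id, words = parts[0], parts[1:]
        if not (min_words <= len(words) <= max_words):
            problems.append((file_id, len(words)))
    return problems


# Hypothetical sample prompts, not taken from any real corpus.
sample = [
    "ro0001 ACEASTA ESTE O PROPOZIȚIE SCURTĂ DE TEST",
    "ro0002 PREA SCURT",
]
print(check_prompts(sample))
```

Any chunk it reports should be merged with a neighbor or split further before submitting.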
--- (Edited on 2/18/2009 3:34 pm [GMT-0600] by nsh) ---