VoxForge
I would like to thank Timo very much for his donation of a German corpus to VoxForge. From his e-mail to me:
Hi Ken,
I've just uploaded a corpus of spontaneous German speech to the FTP-site (I put it into openpento/). It contains 270 chunks of speech (utterance-like) from 9 speakers, totalling 90 minutes of speech (not counting silence). Files are 44.1kHz, 16 bit, FLAC-encoded. There is a short README detailing the transcription.
...Cheers,
Timo
README:
_____________________________________________________________
inpro project
_____________________________________________________________
[email protected]
PENTOMINO NAMING CORPUS
this corpus contains spontaneous speech of 9 subjects
in which they try to select, rotate and place pieces in
a simple 2D board game.
each audio file in CutFiles/ contains a single utterance,
and is accompanied by a corresponding label file,
which details start and ending of each utterance.
files {1,2,3}_3 contain lists of files with the contained words.
Transcription is as follows:
- " . " signifies a short pause of 0.0-0.33 seconds
- ":" signifies lengthening of the transcribed word
- other punctuation (",", "+") can be ignored
- incomplete or otherwise mispronounced words are enclosed
with "BAD". Such files should probably be skipped during
ASR training.
this corpus is released under the terms of the GPL to voxforge.org.
please contact [email protected] if you have any questions.
(c) 2006, 2007, 2008 DEAWU and INPRO projects, University of Potsdam
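The transcription conventions in the README can be applied mechanically before ASR training. The following is a minimal Python sketch, not part of the corpus: the function name `normalize_transcript` and the sample strings are hypothetical. Because the README does not show exactly how "BAD" encloses a word, the sketch assumes that any occurrence of the uppercase marker BAD anywhere in a transcript is enough to skip that utterance.

```python
import re


def normalize_transcript(text):
    """Return the list of words in a transcript, or None if it should be skipped.

    Hypothetical helper that follows the README conventions:
    - a standalone "." marks a short pause and is dropped
    - ":" marks lengthening and is removed from the word
    - "," and "+" are ignored
    - transcripts containing the marker BAD are skipped (assumption: the exact
      enclosure syntax is not specified, so any occurrence of BAD counts)
    """
    if "BAD" in text:
        return None
    # Remove lengthening marks and ignorable punctuation.
    cleaned = re.sub(r"[:,+]", "", text)
    # Split on whitespace and discard standalone pause markers.
    return [token for token in cleaned.split() if token != "."]


# Hypothetical transcript strings, for illustration only.
print(normalize_transcript("nimm . das: teil, +"))
print(normalize_transcript("das BAD tei BAD"))
```

Returning `None` for flagged utterances, rather than an empty list, lets the caller tell a skipped file apart from one that contains only pauses.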
Links to the corpus on the VoxForge submission forum:
Many thanks to NSH for re-compiling the German Acoustic Models to include the audio from the Pentomino Naming Corpus. The updated acoustic models can be downloaded from here.