VoxForge
Hi dims,
>What is the relationship between this project and LibriVox?
There is no formal relationship, though some of their readers have very generously given us some uncompressed speech.
>Are LibriVox utterances incorporated into VoxForge or not?
Yes, we do have LibriVox audio incorporated into the VoxForge corpus and acoustic models.
See the audiobooks list and look under the "Processed?" column. If it says 'yes' (you may need to go to the next page of the forum display), then that audiobook chapter has been segmented and incorporated into the VoxForge corpus, which you can confirm by looking up the reader's username on the metrics page.
Segmenting the audio for processing by an acoustic model trainer is very time-consuming (even with a segmenting script that automates most of the process), which is why some audiobooks have not been processed yet.
Ken
--- (Edited on 10/1/2009 2:19 pm [GMT-0400] by kmaclean) ---
Great! This is exactly what I was interested in! Thank you very much for your answer!
Could you briefly explain how it is possible to automate audiobook segmentation?
Am I right in thinking that this means splitting an audiobook into sentences?
I think this requires a rough recognition stage. Is it incorporated in your Perl script? It would be very interesting to know!
--- (Edited on 10/2/2009 7:35 am [GMT-0500] by dims) ---
Hi dims,
>Could you briefly explain how it is possible to automate audiobook
>segmentation?
The Perl script has inline documentation: AudioBook.pm
See also: Automated Audio Segmentation Using Forced Alignment (Draft)
>Am I right in thinking that this means splitting an audiobook into sentences?
Yes.
>I think this requires a rough recognition stage. Is it incorporated in your Perl
>script?
Yes, using HTK and forced alignment.
Ken
--- (Edited on 10/2/2009 9:25 am [GMT-0400] by kmaclean) ---
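To illustrate the idea behind the answer above: once forced alignment has assigned a start and end time to every word in the transcript, the recording can be cut at the longer silences between words. The following is only a minimal Python sketch of that last step, not the logic of AudioBook.pm. It assumes a word-level label file in the HTK "start end label" layout (times in 100 ns units), and the silence labels "sil" and "sp", the 0.3 second pause threshold, and all function names are hypothetical choices made for this example.

```python
from dataclasses import dataclass

# HTK label files express times in units of 100 ns.
HTK_TICKS_PER_SECOND = 10_000_000


@dataclass
class Segment:
    start: float  # seconds
    end: float    # seconds
    words: list


def parse_htk_labels(text):
    """Parse 'start end label' lines into (start_s, end_s, label) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        rows.append((start / HTK_TICKS_PER_SECOND,
                     end / HTK_TICKS_PER_SECOND,
                     label))
    return rows


def split_at_pauses(rows, silence_labels=("sil", "sp"), min_pause=0.3):
    """Group aligned words into segments, cutting at long silences.

    silence_labels and min_pause are assumptions for this sketch.
    """
    segments = []
    words, seg_start, seg_end = [], None, None
    for start, end, label in rows:
        if label in silence_labels:
            # A long enough pause closes the current segment;
            # shorter pauses stay inside it.
            if end - start >= min_pause and words:
                segments.append(Segment(seg_start, seg_end, words))
                words, seg_start = [], None
            continue
        if seg_start is None:
            seg_start = start
        seg_end = end
        words.append(label)
    if words:
        segments.append(Segment(seg_start, seg_end, words))
    return segments


# Made-up alignment for illustration only.
EXAMPLE = """\
0 5000000 sil
5000000 9000000 THE
9000000 14000000 CAT
14000000 15000000 sp
15000000 20000000 SAT
20000000 26000000 sil
26000000 30000000 IT
30000000 36000000 PURRED
36000000 40000000 sil
"""

if __name__ == "__main__":
    for seg in split_at_pauses(parse_htk_labels(EXAMPLE)):
        print(f"{seg.start:.2f}-{seg.end:.2f}s: {' '.join(seg.words)}")
```

Cutting on pause length rather than on punctuation is a simplification. A real pipeline would probably also use the sentence boundaries in the transcript and add some padding around each cut.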