VoxForge
Hello,
I have downloaded the speech files and found out that the audio files from
http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/Main/16kHz_16bit/
do not correspond with the MFC files in
http://www.repository.voxforge1.org/downloads/SpeechCorpus/Trunk/Audio/MFCC/16kHz_16bit/MFCC_0_D/
While there are approx 58 hours of MFCCs, there are only about 2.5 hours of audio files in Main/16kHz_16bit. I am guessing that the 58 hours came from all the files in the Original directory, which are in different sampling rates.
I would like to use the speech data to train HTK acoustic models, which will later be used in my own recognizer. The problem is that the parametrization is not 100% compatible with HTK, so I will need to parametrize the audio files myself.
I am trying to figure out how to get all the audio files at the same sample rate (preferably 16 kHz). So the question is: were the audio files downsampled before the MFCCs in 16kHz_16bit/MFCC_0_D were created? And if so, can you tell me what software you used for that or, even better, can I see the scripts and configs?
--- (Edited on 1/30/2009 7:35 am [GMT-0600] by tpavelka) ---
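For readers who also want to parametrize the audio themselves, here is a minimal sketch of preparing the inputs for HTK's HCopy tool. It is not the VoxForge setup: the function name, the directory layout, and the ".wav"/".mfc" extensions are assumptions, and the config values are the illustrative ones from the HTK Book tutorial, with only TARGETKIND chosen to match the MFCC_0_D directory name mentioned above.

```python
from pathlib import Path

# Illustrative HCopy configuration (HTK Book tutorial values, NOT the
# settings VoxForge used). TARGETKIND mirrors the MFCC_0_D directory name.
HCOPY_CONFIG = """\
SOURCEFORMAT = WAV
TARGETKIND = MFCC_0_D
TARGETRATE = 100000.0
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
"""


def write_hcopy_script(wav_dir, mfc_dir, scp_path):
    """Write an HCopy script file with one 'source target' pair per line.

    Hypothetical helper: assumes the audio lives in *.wav files under wav_dir
    and that paths contain no spaces (the script file is space-separated).
    Returns the number of pairs written.
    """
    wav_dir, mfc_dir = Path(wav_dir), Path(mfc_dir)
    lines = []
    for wav in sorted(wav_dir.rglob("*.wav")):
        # Mirror the source layout under mfc_dir, swapping the extension.
        mfc = mfc_dir / wav.relative_to(wav_dir).with_suffix(".mfc")
        # Create target directories up front so the output files have a home.
        mfc.parent.mkdir(parents=True, exist_ok=True)
        lines.append(f"{wav} {mfc}")
    Path(scp_path).write_text("".join(line + "\n" for line in lines))
    return len(lines)
```

With the config saved to a file named config and the script file to codetr.scp (both names are placeholders), the feature extraction would then be run as: HCopy -C config -S codetr.scp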
Hi tpavelka,
>While there are approx 58 hours of MFCCs there are only about 2.5
>hours of audio files in Main/16kHz_16bit.
No.
Not sure how you made your calculations, but the MFCCs are generated from the Main directory, so there should be the same number of hours of speech audio in both.
>So the question is were the audio files downsampled before the
>MFCCs in 16kHz_16bit/MFCC_0_D were created?
Yes, from the 'Original' directory into the 'Main' directory.
>And if so, can you tell me, what software did you use for that, or, even
>better can I see the scripts and configs?
SoX.
See here: Main.pm calls the 'Downsample' method in AUDIO.pm
Ken
--- (Edited on 1/30/2009 10:16 am [GMT-0500] by kmaclean) ---
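As a rough illustration of the SoX-based downsampling described above, the sketch below builds one SoX command per file and optionally runs it. This is not the code from AUDIO.pm: the function names, the ".wav" extension, and the directory arguments are assumptions, and the -r and -b output options are those of current SoX releases, which may differ from the version used in 2009.

```python
import subprocess
from pathlib import Path


def sox_command(src, dst, rate=16000, bits=16):
    """Build a SoX command line that converts src into dst.

    Options placed before the output file describe the output format,
    so SoX resamples to `rate` Hz and writes `bits`-bit samples.
    """
    return ["sox", str(src), "-r", str(rate), "-b", str(bits), str(dst)]


def downsample_tree(src_dir, dst_dir, dry_run=True):
    """Mirror every *.wav under src_dir into dst_dir at 16 kHz, 16 bit.

    Hypothetical helper. With dry_run=True nothing is executed; the
    commands are only returned so they can be inspected first.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    commands = []
    for src in sorted(src_dir.rglob("*.wav")):
        dst = dst_dir / src.relative_to(src_dir)
        cmd = sox_command(src, dst)
        commands.append(cmd)
        if not dry_run:
            dst.parent.mkdir(parents=True, exist_ok=True)
            # check=True raises if SoX exits with an error for this file.
            subprocess.run(cmd, check=True)
    return commands
```

Calling downsample_tree("Original", "Main/16kHz_16bit") with the default dry_run=True only lists the commands; passing dry_run=False requires the sox binary to be on the PATH.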
Hi, thanks for the fast reply. I used an automatic downloader to fetch all the speech files. I guess the process was terminated for some reason and I did not notice. That is why I only got 2.5 hours when I measured the total length of the speech data on my hard drive.
Thanks for the info on conversion software.
Tomas
--- (Edited on 1/30/2009 10:01 am [GMT-0600] by tpavelka) ---
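For anyone wanting to repeat the check that exposed the incomplete download, here is a minimal sketch that sums the duration of the audio on disk using Python's standard wave module. It assumes the files are uncompressed PCM WAV files with a ".wav" extension; the function name is made up for illustration, and files that cannot be parsed (for example, ones truncated by an interrupted download) are skipped rather than counted.

```python
import wave
from pathlib import Path


def total_hours(root):
    """Return the summed duration, in hours, of readable WAV files under root."""
    seconds = 0.0
    for path in Path(root).rglob("*.wav"):
        try:
            with wave.open(str(path), "rb") as w:
                # Duration of one file = number of frames / frames per second.
                seconds += w.getnframes() / w.getframerate()
        except (wave.Error, EOFError):
            # Empty, truncated, or non-PCM files are skipped.
            continue
    return seconds / 3600.0
```

Comparing the result for the local copy of Main/16kHz_16bit against the expected corpus size is a quick way to spot a download that stopped early.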