VoxForge
Hi, I want to get a Spanish corpus. I'm developing an application with speech recognition, and it works perfectly with your English corpus. Now I want to do recognition in Spanish. I know that I'll need to train the model with my own voice, but how can I get the recorded voices that you have? I tried to get everything from the SVN repository, but I don't have access.
Of course, I want to collaborate and upload my recordings when I finish my work.
I'm currently making my recordings at 48000 Hz, and I know that I'll need to downsample them when I integrate them with your voices. But what is the reason for recording at that rate?
Thanks
> how can I get the recorded voices that you have?
wget -N -nd -c -e robots=off -A tgz,html -r -np \
http://www.repository.voxforge1.org/downloads/es/Trunk/Audio/Main/8kHz_16bit/
> I'm currently making my recordings at 48000 Hz, and I know that I'll need to downsample them when I integrate them with your voices. But what is the reason for recording at that rate?
Are you asking why you yourself record at that rate? Nobody but you can answer that question.
I know why I record at that rate. What I want to know is why you record the voices at 16 kHz or 8 kHz. I read something about the Nyquist theorem.
Also, I suppose that I must downsample my recordings to integrate them with yours.
> What I want to know is why you record the voices at 16 kHz or 8 kHz.
VoxForge audio is recorded at various sample rates. Models are built at 16 kHz because that sample rate lets you decode both 16 kHz and 48 kHz audio without a loss of performance. Telephone models are built at 8 kHz because that is the sampling rate of VoIP codecs.
> I read something about the Nyquist theorem.
I'm not sure how it is related.
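On the downsampling step mentioned above: going from 48 kHz to 16 kHz means keeping every third sample, and this is one place where the Nyquist limit does matter, since anything above half the new rate (8 kHz) has to be filtered out first or it will alias into the result. What follows is only a minimal sketch of that idea in plain Python, not the procedure VoxForge uses. The function names, the 63-tap filter length, and the cutoff at 90% of the new Nyquist frequency are all assumptions chosen for illustration. In practice a dedicated resampling tool would normally do this job.

```python
import math

def lowpass_taps(num_taps, cutoff):
    """Windowed-sinc low-pass FIR filter (hypothetical helper).

    cutoff is a fraction of the input sample rate, 0 < cutoff < 0.5.
    num_taps is assumed to be odd so the filter has a center tap.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        # Ideal low-pass impulse response (a sinc), with the x == 0 limit.
        if x == 0:
            h = 2 * cutoff
        else:
            h = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        # Hamming window to reduce ripple from truncating the sinc.
        w = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(h * w)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC

def downsample(samples, factor=3, num_taps=63):
    """Low-pass filter the signal, then keep every `factor`-th sample.

    With factor=3, 48 kHz input becomes 16 kHz output.
    """
    # Assumed cutoff: 90% of the new Nyquist frequency (0.5 / factor).
    taps = lowpass_taps(num_taps, 0.45 / factor)
    half = num_taps // 2
    out = []
    for i in range(0, len(samples), factor):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half  # filter centered on sample i
            if 0 <= j < len(samples):  # treat samples outside the signal as zero
                acc += t * samples[j]
        out.append(acc)
    return out
```

Here `samples` is a list of floats at the input rate. Reading and writing the audio files is left out of the sketch.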