download french corpus
User: Marion
Date: 5/6/2009 9:45 am
Views: 1065

Hi!

I was looking at the submitted French speech data and saw that only part of it is in the VoxForge repository for download. The rest seems to be in the upload directory, which is access-restricted, so it is not easy to recover the whole corpus except manually from the download page.


Is there a specific reason for that, or is there a way to get the corpus easily? I saw this post where Ken says:

Unfortunately I have not moved any German audio to subversion.

However, here is quick and dirty way to get the audio:

1.  $ wget -r -l2 http://www.voxforge.org/home/downloads/speech/german-speech-files -A "ralfherzog*"

this will create a directory called www.voxforge.org

2. search the directory for *.zip files using Gnome's search tool, and drag the results to the directory you want.

I'm not a wget expert, but I don't think it's going to get files that are not in the specified directory. Any help?

Thanks a lot!

Marion

Re: download french corpus
User: nsh
Date: 5/6/2009 5:09 pm
Views: 9

Hello Marion

> Is there a specific reason for that

Ken is a bit busy nowadays :) Let's not distract him.

I think

wget -r -l2   http://www.voxforge.org/home/downloads/speech/french-speech-files

will just work for you.

Re: download french corpus
User: Marion
Date: 5/7/2009 5:31 am
Views: 171

That's what I did, but as I said before, most of the corpus is in the upload directory, and to access it you need the complete address of each zip file, like http://voxforge.org/uploads/q0/0Q/q00QgKBqYb4KK6_qzhITig/phil_be-20090310-mif.zip, so you can't just do

$ wget -r -l3 http://www.voxforge.org/uploads

I found a solution using WinHTTrack, but wget should have worked too; it's just that you have to download a lot of stuff and then delete everything but the zip files you want.
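
For anyone doing the clean-up by hand, here is a rough sketch of the "keep only the zip files" step. It assumes the recursive download left a mirror directory (such as the www.voxforge.org directory mentioned above) and that the archives end in .zip. The function name collect_zips and both directory arguments are made up for illustration; they are not part of wget or VoxForge.

```shell
# Copy every .zip found under a downloaded mirror directory into one flat folder.
# collect_zips is a hypothetical helper name used only for this example.
collect_zips() {
    src="$1"    # directory created by the recursive download (assumed to exist)
    dest="$2"   # folder that should receive the archives
    mkdir -p "$dest"
    # Search the whole tree for regular files named *.zip and copy each one.
    find "$src" -type f -name '*.zip' -exec cp {} "$dest"/ \;
}
```

It could be called as `collect_zips www.voxforge.org french-zips`. The destination should sit outside the mirror directory so that find does not come across the copies it has just made, and two archives with the same file name would overwrite each other in the flat folder.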

I just wanted to point out that not all the French data is in the repository, but I perfectly understand that you don't have time to process it all!

Thanks anyway for the answer and this project, this is great!

Marion

Re: download french corpus
User: kmaclean
Date: 5/25/2009 9:24 pm
Views: 199

Hi Marion,

>I just wanted to point out that not all French data is in the repository,

All the French submissions are now in the repository:

http://www.repository.voxforge1.org/downloads/fr/Trunk/Audio/

Ken

Re: download french corpus
User: samuel buffet
Date: 7/2/2009 11:32 am
Views: 6

Hi Ken,

Oops, I wanted to download the French corpus today, but I made a mistake and downloaded much more than expected.


I hope I haven't caused any trouble for your server.

 

Sorry about that.


Samuel-

Re: download french corpus
User: kmaclean
Date: 7/2/2009 1:06 pm
Views: 259

Hi Samuel,

No worries, that is why the VoxForge repository is on a separate server from the website front-end.

Ken
