Acoustic model 0.1.2
User: dano
Date: 9/1/2008 2:50 pm
Views: 15072
Rating: 4

When do we release 0.1.2?

Ticket 202 (20% of the 140 goal) is fixed: according to the metrics page we are now at 36%, and that figure is missing a month of speech submissions, so it is really more like 37% or 38%. http://www.dev.voxforge.org/projects/Main/ticket/202

I thought ticket 376 (http://www.dev.voxforge.org/projects/Main/ticket/376) was fixed, according to the forum thread.

I don't know about other tickets but 366

http://www.dev.voxforge.org/projects/Main/ticket/366

doesn't seem to be a showstopper for the English acoustic model.

"Update Acoustic Model creation scripts and Tutorials (and Howtos) to Julius 4.0", which is planned for acoustic model 0.1.3, has a higher priority in my opinion.

Let me know what you think about it. (What was it again: "release soon, release often", isn't it?)

 

--- (Edited on 01-09-2008 9:50 pm [GMT+0200] by dano) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/1/2008 6:39 pm
Views: 91
Rating: 3

>I don't know about other tickets but 366

>http://www.dev.voxforge.org/projects/Main/ticket/366

>doesn't seem to be a showstopper for the English acoustic model.

Actually, it is... try the current daily acoustic model and you'll see what I mean. We need to figure out which submission(s) have degraded the acoustic model.

>"Update Acoustic Model creation scripts and Tutorials (and Howtos) to Julius 4.0", which is planned for acoustic model 0.1.3, has a higher priority in my opinion.

Are you volunteering to complete this?

>(What was it again: "release soon, release often", isn't it?)

That's the old open source... these days (in my experience at least...) if it doesn't work as expected, people don't come back.

Ken

--- (Edited on 9/1/2008 7:39 pm [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: dano
Date: 9/2/2008 12:09 am
Views: 73
Rating: 4

http://www.dev.voxforge.org/projects/Main/ticket/366 is about other languages :)

I will listen to a batch of submissions tonight (most of the VoxForgeIVR speech files are of poor quality, with a lot of ticks and noise).

What does it take to convert it to Julius 4.0? The acoustic model already works in 4.0 (check the julius and julius-voxforge packages in Ubuntu), but does it need a different 'compiling' process to improve accuracy?

As I want to have a better acoustic model for my project, I'll do my best to improve this release (and thereby also future releases.)

Daniël

--- (Edited on 02-09-2008 7:09 am [GMT+0200] by dano) ---

--- (Edited on 02-09-2008 4:53 pm [GMT+0200] by dano) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/2/2008 12:17 pm
Views: 70
Rating: 3

Hi Dano,

>I will listen to a batch of submissions tonight

Thanks! I think the problem is that any recordings that have lots of noise/hiss/distortion in the silence passages need to be removed from the acoustic model.  It is just a matter of flagging these and removing them from the master prompts file.
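The flag-and-remove step could be sketched roughly as below. This is only an illustration, not the actual VoxForge tooling: it assumes each line of the master prompts file starts with a recording identifier followed by the transcription, and the function name and sample identifiers are made up for the example.

```python
def remove_flagged(prompt_lines, flagged_ids):
    """Return the prompt lines whose leading identifier is not flagged.

    Assumption: each line looks like '<recording-id> WORD WORD ...'.
    """
    flagged = set(flagged_ids)
    kept = []
    for line in prompt_lines:
        parts = line.split(maxsplit=1)
        # Drop blank lines and any line whose identifier was flagged as noisy.
        if parts and parts[0] not in flagged:
            kept.append(line)
    return kept


# Hypothetical sample data, for illustration only.
prompts = [
    "speakerA/a0001 HELLO WORLD",
    "speakerB/a0002 NOISY RECORDING",
    "speakerA/a0003 GOOD MORNING",
]
clean = remove_flagged(prompts, ["speakerB/a0002"])
print(clean)
```

Keeping the flagged identifiers in a separate list means the original submissions stay untouched and a recording can be restored later by removing it from that list.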

>(most of the VoxForgeIVR speech files are of poor quality, with a lot of ticks and noise)

We need speech samples from real telephone environments (not just downsampled recordings made on a PC) so that the 8kHz-16bit acoustic models can be used in telephony speech recognition applications.

Remember the VoxForgeIVR submissions are only 8kHz-16bit.  If you are doing PC based speech recognition, I would recommend the 16kHz-16bit acoustic models (these should provide better recognition results).

>What does it take to convert it to Julius 4.0?

I think the quickstart just needs to be packaged with the Julius 4.0 executables, the readme needs to be updated, and everything tested (on Linux Fedora/Ubuntu/... and Windows XP and Vista), using:

>but does it need a different 'compiling' process to improve accuracy?

No, I don't think the acoustic model needs to be changed.

Ken

--- (Edited on 9/2/2008 1:17 pm [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/2/2008 12:40 pm
Views: 78
Rating: 10

Hi Dano,

>http://www.dev.voxforge.org/projects/Main/ticket/366 is about

>other languages :)

oops...maybe I should read the thing... :)

The one I was thinking of is: ticket #376 - Nightly Build Acoustic Model Performance Decrease.

 

Ken

--- (Edited on 9/2/2008 1:40 pm [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: dano
Date: 9/3/2008 6:18 am
Views: 81
Rating: 4

Hi Ken, 

Is this

http://spraakherkenning.googlepages.com/QuickStart.7z

good for Linux, or are the files in the wrong place?

 

Daniël

--- (Edited on 03-09-2008 1:18 pm [GMT+0200] by dano) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/3/2008 7:19 am
Views: 75
Rating: 4

Hi Daniël,

>Is this ... good for Linux, or are the files in the wrong place?

Cool, thanks for working on this!

I can't seem to read the .7z file on Fedora 9, and 7-Zip (or 7-zip) is not in the Fedora repository...

It's probably best to stick to tar.gz for Linux, and .zip for Windows.
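One possible way to produce both archive formats from the same directory is sketched below, using only the Python standard library. The function name and the directory layout are assumptions for illustration, not part of the existing QuickStart.

```python
import tarfile
import zipfile
from pathlib import Path


def package_quickstart(src_dir, out_dir):
    """Write <name>.tar.gz and <name>.zip archives of src_dir into out_dir."""
    src = Path(src_dir)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Gzip-compressed tarball for Linux users.
    tar_path = out / (src.name + ".tar.gz")
    with tarfile.open(tar_path, "w:gz") as tar:
        # arcname keeps paths relative to the top-level directory name.
        tar.add(src, arcname=src.name)

    # Zip archive for Windows users, with the same top-level directory.
    zip_path = out / (src.name + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                arcname = (Path(src.name) / path.relative_to(src)).as_posix()
                zf.write(path, arcname=arcname)

    return tar_path, zip_path
```

Calling `package_quickstart("QuickStart", "dist")` would, under these assumptions, leave `dist/QuickStart.tar.gz` and `dist/QuickStart.zip` side by side, so both platforms get the same files.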

Ken

--- (Edited on 9/3/2008 8:19 am [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: dano
Date: 9/3/2008 8:32 am
Views: 98
Rating: 3

http://spraakherkenning.googlepages.com/QuickStart.tar.gz

And I created a .po file (currently under review) on Launchpad (launchpad.net/voxforge). I will also set up the .po files for translating the site and prompts, if you want.

--- (Edited on 03-09-2008 9:21 pm [GMT+0200] by dano) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/5/2008 8:15 am
Views: 91
Rating: 4

Hi Dano,

Sorry for the delay in getting back to you, been working (still am...) on making the VoxForge site multilingual and processing backlogged speech submissions...

>Is this ... good for Linux, or are the files in the wrong place?

>http://spraakherkenning.googlepages.com/QuickStart.tar.gz

The directory structure is OK, but I think you need to assume that the user does not want to go searching anywhere for files or instructions to try out the command.  In addition, the level of experience of users of the QuickStart varies greatly.  You need to make it as simple as possible to get started - target this to a newbie who knows some Linux commands, but gets lost easily if things don't go exactly as described (i.e. what to do with error messages,...).  The more advanced users can always skim through the easy stuff and find what they are looking for.

So it would be helpful to include a bash script that lets the user type in a single command to get things started and see how it works, along with a README with some quick explanations on how to get started and another file that gives them an idea of how to make grammars.  You can use the files in my QuickStart as a starting point, or if you've seen something better, then I am OK with using that too...
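The single-command idea might look something like the sketch below. It is written in Python rather than bash, and it is only an illustration: the config file name `julian.jconf` and the helper names are assumptions, and the only Julius option used is `-C`, which names the jconf configuration file to load. The point is the pre-flight check, which turns the most likely newbie errors into plain instructions before Julius is started.

```python
import shutil
from pathlib import Path


def build_command(jconf, julius="julius"):
    """Return the argument list that starts Julius with a jconf file."""
    # -C tells Julius which configuration file to load.
    return [julius, "-C", str(jconf)]


def preflight(jconf, julius="julius"):
    """Return a list of plain-language problems; empty means ready to run."""
    problems = []
    if shutil.which(julius) is None:
        problems.append(
            f"'{julius}' was not found on your PATH. Install Julius first."
        )
    if not Path(jconf).is_file():
        problems.append(
            f"Config file '{jconf}' is missing. "
            "Run this from the QuickStart directory."
        )
    return problems


# 'julian.jconf' is a hypothetical file name used for illustration.
problems = preflight("julian.jconf")
if problems:
    for problem in problems:
        print("ERROR:", problem)
else:
    print("Ready to run:", " ".join(build_command("julian.jconf")))
```

A real launcher would pass the list from `build_command` to `subprocess.call` instead of printing it; the README can then describe just that one command.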

Don't get me wrong, I am very happy that you are looking into updating this; it's just that we've got to keep the target user (i.e. a newbie) in mind.

Ken

P.S. If you want to include a description and example of a simple dialog manager in the QuickStart, that would be _very_ helpful to users - something like this article on Linux.com (written by Colin Beckingham) or RainCT's script.  But that could be for a later date...

 

--- (Edited on 9/5/2008 9:15 am [GMT-0400] by kmaclean) ---

Re: Acoustic model 0.1.2
User: kmaclean
Date: 9/5/2008 8:23 am
Views: 214
Rating: 7

Hi Dano,

>And I created a .po file (currently under review) on Launchpad (launchpad.net/voxforge). I will also set up the .po files for translating the site and prompts, if you want.

That would be awesome.  I'm still not 100% clear on how a Java .po file works (or how collecting translations on Launchpad works...), but the ability to add new languages without tinkering with code seems like a good thing to me!

Do you need access to the Subversion repository?

Ken

--- (Edited on 9/5/2008 9:23 am [GMT-0400] by kmaclean) ---
