General Discussion

Help proposal
User: royerfa
Date: 3/31/2008 4:00 am
Views: 7209
Rating: 36
Hello,

I am a student, currently writing a thesis on speech recognition. I am using Julian.

I would like to benchmark all the open source speech recognition engines (SREs), including Sphinx, running on a laptop for live decoding.
I am interested in simple command recognition.

Do you know if this work has already been done in your forum? If not, I can shape my work around your main interests.

If you need any help with C development or documentation writing, please feel free to ask me. It will be a pleasure.

Regards,

royerfa
Re: Help proposal
User: nsh
Date: 3/31/2008 8:12 am
Views: 278
Rating: 40

Performance comparison is an interesting question, of course, and some work is dedicated to it, but I'd like to mention that this question is very delicate. You can't just compare the engines; you should also compare the model training code, language modelling code, dictionary, feature extraction program and so on. You have to keep the beams the same, or at least adjust the properties of the search method to match. You probably know that it's possible to build a slower, more precise decoder, but people usually prefer a faster and less precise one. Speed, dictionary size, performance and adaptation capabilities are also important things. For example, recent Sphinx supports MLTT training; you won't find an analog of this adaptation technique in HTK. Meanwhile, in HTK, MCE lets you get lower error rates with offline training.
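
To make the accuracy and speed axes concrete, here is a minimal sketch of the two numbers such a benchmark usually reports: word error rate (WER) and real-time factor. The function names and the sample command strings are hypothetical, chosen only for illustration; nothing here is taken from Sphinx, Julian or HTK, and a real evaluation would normally use an established scoring tool.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


def real_time_factor(decode_seconds, audio_seconds):
    """Decoding time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return decode_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical command-recognition example: one substituted word out of three.
    print(word_error_rate("open the door", "open a door"))
    # Hypothetical timing: 2 s of decoding for 4 s of audio.
    print(real_time_factor(2.0, 4.0))
```

Both numbers only mean something when every engine is scored on the same audio, with the same dictionary and language model, and with comparable beam settings, which is exactly the difficulty described above.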

For an example of a Sphinx vs. HTK comparison, see this well-known paper:

http://www.inference.phy.cam.ac.uk/kv227/papers/baseline_wsj_recipes.pdf

Actually another interesting article is the following text by David:

http://lima.lti.cs.cmu.edu/moinmoin/SphinxHTK

The whole point is that there is no significant difference between HMM implementations, simply because they are the same algorithm implemented differently. The differences between the recognizers are known; actually, it's easier to extract them from the code than from test results. To summarize the above, performance comparison is a useless task in my opinion.

As for other tasks, I think the GSoC proposal page is a nice review of our problems:

 http://www.voxforge.org/home/forums/message-boards/googlesoc

Choose one project there. For example, the most important VoxForge task is training and evaluating the Sphinx English model. If someone could do that, it would be amazing. In Sphinx itself there is an enormous number of tasks too, from online adaptation to language understanding. Most of the tasks are about engineering, though. If you are interested in scientific research, we can also suggest something.

Join the #cmusphinx IRC channel on freenode, where we can discuss the details.

 

Re: Help proposal
User: kmaclean
Date: 3/31/2008 11:13 am
Views: 3046
Rating: 42

Hi royerfa,

>I would like to do a Benchmark of all the SRE open sources so include

>SPhinx running on a Laptop for live decoding.

You might also ask the Simon project about the work they did to compare Julian, Sphinx, etc:

Analysis of existing software (translated from German using Google translate - original page)

>If you need any help in the development C writing, or documentation writing.

>Please feel free to ask me, it will be a pleasure.

What are your interests?  What is it that you would really like to do?  Are you more interested in the software side of speech recognition (the speech recognition engine, dialog manager) or in the linguistics side (acoustic or language model creation/tuning)?  Or is there a particular application you are thinking about?

What is the time frame for this?  Is this something that must be completed in a semester (3 months), or can it be extended over a longer period of time?  Some of the proposals on the GSoC page are much longer than a semester, but can likely be broken up into more manageable pieces.

What are the criteria for your thesis to be accepted?  We want to make sure that whatever you pursue, it meets *your* required objectives too.

Ken
