VoxForge
Performance comparison is of course the interesting question, and some work has been dedicated to it, but I'd like to mention that this question is very delicate. You can't just compare the engines; you should also compare the model training code, language modelling code, dictionary, feature extraction program and so on. You have to keep the beams the same, or at least adjust the properties of the search method to match. You probably know that it's possible to make a slower, more precise decoder, but people usually prefer a faster, less precise one. Speed, dictionary size, performance and adaptation capabilities are also important things. For example, recent Sphinx supports MLLT training; you won't find an analog of this adaptation technique in HTK. In HTK, on the other hand, MCE allows you to get lower error rates with offline training.
For an example of a Sphinx vs HTK comparison, see this well-known page:
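To make "error rate" concrete: whatever engines you compare, the number usually reported is word error rate, computed the same way for every system over the same reference transcripts. Here is a minimal sketch in Python, assuming plain whitespace-separated transcripts. The function name `word_error_rate` is my own for illustration and is not part of Sphinx or HTK, and real scoring tools also normalize case and punctuation, which this does not.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

The number only means something if both decoders were run on the same test set with comparable beams, dictionary and language model, which is exactly the hard part.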
http://www.inference.phy.cam.ac.uk/kv227/papers/baseline_wsj_recipes.pdf
Another interesting article is the following text by David:
http://lima.lti.cs.cmu.edu/moinmoin/SphinxHTK
The whole point is that there is no significant difference between HMM implementations, simply because they are the same algorithm implemented differently. The differences between the recognizers are known; it's actually easier to extract them from the code than from test results. To summarize the above, performance comparison is a useless task in my opinion.
As for other tasks, I think the GSoC proposal page is a nice review of our problems:
http://www.voxforge.org/home/forums/message-boards/googlesoc
Choose one project there. For example, the most important VoxForge task is training and evaluation of the Sphinx English model. If someone could do that, it would be amazing. In Sphinx itself there is an enormous number of tasks too, from online adaptation to language understanding. Most of the tasks are about engineering, though. If you are interested in scientific research, we can also suggest something.
Join the #cmusphinx IRC channel on freenode and we can discuss details.
Hi royerfa,
>I would like to do a Benchmark of all the SRE open sources so include
>SPhinx running on a Laptop for live decoding.
You might also ask the Simon project about the work they did to compare Julian, Sphinx, etc:
Analysis of existing software (translated from German using Google translate - original page)
>If you need any help in the development C writing, or documentation writing.
>Please feel free to ask me, it will be a pleasure.
What are your interests? What is it that you would really like to do? Are you more interested in the software side of speech recognition (the speech recognition engine, dialog manager) or in the linguistics side (acoustic or language model creation/tuning)? Or is there a particular application you are thinking about?
What is the time frame for this? Is this something that must be completed in a semester (3 months), or can it be extended over a longer period of time? Some of the proposals on the GSoC page are much longer than a semester, but can likely be broken up into more manageable pieces.
What are the criteria for your thesis to be accepted? We want to make sure that whatever you pursue, it meets *your* required objectives too.
Ken