VoxForge
Hi guys,
I'd like to build a speech recognition device.
Of course I'm only looking at open source projects, but I don't know exactly which one to use yet.
I already know that the four main ones are Sphinx, ISIP, HTK, and Julius.
But which one should I choose, given that I need to build a continuous speech recognizer for English?
I would really like to have your opinion.
Best regards
--- (Edited on 4/ 6/2007 5:43 am [GMT-0500] by Visitor) ---
Hi,
You might take a look at Arthur Chan's article, "Why there is no Open Source Dictation", in this post.
Note that HTK has licensing restrictions, so you can't distribute the source or binaries of the toolkit. However, you can distribute Acoustic Models or Language Models generated with the toolkit. Being a former commercial product, HTK has the best documentation.
Julius has no distribution restrictions, but uses Acoustic Models and Language Models generated using the HTK toolkit.
ISIP has no distribution restrictions, but is not as popular as Sphinx. They have excellent tutorials on Speech Recognition.
That leaves Sphinx if you want a truly open source solution, from a speech recognition engine and acoustic model creation perspective. However, based on Arthur Chan's article (and he used to be a Sphinx maintainer), Sphinx was not really designed with Dictation in mind.
Only Julius was designed with Dictation in mind. But Julius is only distributed with Japanese Acoustic and Language Models. Which brings us to one of the reasons for the creation of the VoxForge web site (see the VoxForge About page for more ...).
Please consider donating some of your speech.
thanks,
Ken
--- (Edited on 4/ 6/2007 10:35 am [GMT-0400] by kmaclean) ---
--- (Edited on 4/ 6/2007 10:49 am [GMT-0400] by kmaclean) ---
Hi Ken,
Thanks for your clear response.
I've been using Sphinx4 for the last few weeks, and it's working pretty well.
I still have one question though.
Do you know where I could find a document explaining the differences between the speech recognition systems we were talking about? I mean differences in implementation, algorithms, and performance.
It would be great if you have this kind of information.
Thank you very much for your time.
Best regards.
Jean
--- (Edited on 5/22/2007 11:48 am [GMT-0500] by Visitor) ---
Hi Jean,
For a recent comparison of HTK (note that Julius uses HTK acoustic models) with Sphinx, see Keith Vertanen's site. He created acoustic models for Sphinx and HTK using the Wall Street Journal WSJ0 corpus, and gives the results:
You might try running Julius with the HTK models to get an idea of how Julius might compare with HTK and Sphinx.
For an older comparison of Sphinx and HTK, see this document:
A COMPARISON OF PUBLIC DOMAIN SOFTWARE TOOLS FOR SPEECH RECOGNITION
Ken
--- (Edited on 5/30/2007 11:31 am [GMT-0400] by kmaclean) ---