Speech Recognition Engines

jsapi, PocketSphinx, mobile phone - newbie
User: johnyjj2
Date: 10/13/2009 4:30 am
Views: 9429
Rating: 7

Hello :-)!

I need to create an application for a mobile phone that communicates with the user (so it requires speech recognition) and creates some data based on the conversation. It should then send this data as a file to a server.

I've just read the JSAPI guide and thought about using JSAPI, but there are some problems: my language is not very popular, so it may be unsupported; I don't see any information about creating an acoustic model for a new language; and, going by the guide, I don't see any way to specify the order of sounds in words.

I've got two ideas for how to do it:

1) A MIDlet that uses JSAPI and then sends the data via an HttpConnection POST to a Tomcat server (I have already developed some code)

2) A Symbian mobile phone running PocketSphinx (maybe I should abandon the first idea and learn from scratch how to write programs for Symbian)
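
For idea (1), the upload step might look roughly like the sketch below. It is plain Java SE using java.net.HttpURLConnection, not MIDlet code; on a phone the same request would go through javax.microedition.io.HttpConnection instead. The class name, the field names, and the endpoint passed to post() are all hypothetical and only illustrate a form-encoded POST to a servlet.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper: sends recognized text to a server as a form POST.
class TranscriptUploader {

    /** Encodes key/value pairs as application/x-www-form-urlencoded. */
    static String encodeForm(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            try {
                body.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                    .append('=')
                    .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            } catch (UnsupportedEncodingException ex) {
                // UTF-8 is guaranteed to be available on every JVM.
                throw new IllegalStateException(ex);
            }
        }
        return body.toString();
    }

    /** POSTs the fields to the given URL and returns the HTTP status code. */
    static int post(String endpoint, Map<String, String> fields) throws IOException {
        byte[] payload = encodeForm(fields).getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type",
                    "application/x-www-form-urlencoded; charset=UTF-8");
            conn.setFixedLengthStreamingMode(payload.length);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload);
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

On the server side, a servlet running in Tomcat could read the fields with request.getParameter(...); the field names only have to match on both ends.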

In general, I've spent a lot of time reading about speech recognition (you can find my topics by searching for "johnyjj2 speech" on Google to see the results of my research), and I still don't have anything working, which pisses me off.

Can you tell me which way, (1) or (2), would be better, please? And if (2) is the better choice, could you suggest any materials about it, please (preferably the code of applications that use it; I didn't see any examples in pocketsphinx-0.5.1.zip that would help me with my project)?

I would be very grateful for any help from you :-)

Greetings :-)!

--- (Edited on 10/13/2009 4:31 am [GMT-0500] by johnyjj2) ---

Re: jsapi, PocketSphinx, mobile phone - newbie
User: kmaclean
Date: 10/13/2009 9:25 am
Views: 89
Rating: 10

Hi johnyjj2,

>I thought about using this jsapi but there are some problems

The JSAPI standard describes an API - Application Programming Interface.  It is not a speech recognition engine or system in and of itself. 

For an example of an implementation of the JSAPI standard, see the jvoicexml project.  With jvoicexml, the application layer uses the VoiceXML standard, and this communicates with the speech recognition engine and text-to-speech engine using JSAPI, and with the underlying telephony components using JTAPI.
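
To illustrate the API-versus-engine distinction, here is a minimal sketch. The names SpeechEngine, CannedEngine and DictationApp are made up for this example and are not JSAPI types (the real ones live in the javax.speech packages); the point is only that application code is written against an interface, and some engine has to be plugged in behind it.

```java
// Hypothetical stand-in for an API: it only says WHAT a recognizer can do.
interface SpeechEngine {
    String recognize(byte[] audio);
}

// Hypothetical stand-in for an engine: this is where real decoding would
// happen. Here it just returns a fixed string so the sketch stays runnable.
class CannedEngine implements SpeechEngine {
    private final String fixedResult;

    CannedEngine(String fixedResult) {
        this.fixedResult = fixedResult;
    }

    @Override
    public String recognize(byte[] audio) {
        return fixedResult;
    }
}

// Application code depends only on the interface, so the engine behind it
// can be swapped without changing this class.
class DictationApp {
    private final SpeechEngine engine;

    DictationApp(SpeechEngine engine) {
        this.engine = engine;
    }

    String transcribe(byte[] audio) {
        return engine.recognize(audio).trim();
    }
}
```

Whether a given engine supports a particular language is therefore a question about the engine behind the interface, not about the API itself.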

>Can you tell me which way (1) or (2) would be better, please?

The jvoicexml project would be a good place for you to start, and you might want to look at SpeechForge.

I don't know the Symbian OS, but I do believe that Google's Android SDK includes speech recognition.

Ken

--- (Edited on 10/13/2009 10:25 am [GMT-0400] by kmaclean) ---

Re: jsapi, PocketSphinx, mobile phone - newbie
User: johnyjj2
Date: 10/14/2009 3:37 pm
Views: 78
Rating: 5

Thanks very much for your answer :-)!
I really appreciate your help :).

I guess I'm going to read these (is it a good choice?):
1) voicexml 2.1 -> http://www.w3.org/TR/2005/CR-voicexml21-20050613/
2) jtapi -> http://blog.devrealm.org/2009/03/26/jtapi-overview/

I don't have any experience with using API documentation (-> http://jvoicexml.sourceforge.net/api-0.7.1/). Can you suggest any good material on how to use API documentation, and a non-API-doc guide to jvoicexml?

About JSAPI: I have read the JSAPI 1.0 guide; do you think I should also read about JSAPI 2.0? I guess there shouldn't be big differences between 1.0 and 2.0.
About jvoicexml: why does http://jvoicexml.sourceforge.net/ say "demo implementation"? Is it somehow limited? Do you think I should stop writing in the Wireless Toolkit and use only Eclipse (I found that an Eclipse plugin is available)? How is jvoicexml connected with creating speech recognition for a MIDlet, and how with Sphinx (PocketSphinx, Sphinx4)?
About speechforge.org: do you think I need to use any of those four projects: MRCP4J, NLSML4J, Cairo, MRCP-TCK?
About Android and Symbian: I heard from some people that Symbian is better developed than Android. The general question is: is it enough to write it as a MIDlet, or should it be an application created strictly for Symbian/Android? Somebody told me: "Speech recognition (or rather short sound) in S60 is probably the least known Symbian API. On WinCE, PPC, WN there is an official MS Speech SDK. As for Sphinx, in theory it is written in C++, so the library can be recompiled using OpenC/OpenC++. It is worth creating a wrapper for PyS60 1.4.5 or PyS60 1.9.x".

Greetings :-)!

PS Are you really sure I can run jvoicexml on a mobile phone? Isn't it rather J2EE, so it cannot be run on a mobile phone?

--- (Edited on 10/14/2009 11:45 pm [GMT-0500] by johnyjj2) ---

Re: jsapi, PocketSphinx, mobile phone - newbie
User: kmaclean
Date: 10/15/2009 1:11 pm
Views: 142
Rating: 6

>I guess I'm going to read (is it good choice?):

No... I only mentioned jvoicexml as an example of a JSAPI implementation.  Looking at the code and docs on the jvoicexml site will give you a better idea of what JSAPI is all about (which is what you were asking in your initial question).

There are other, more lightweight approaches... look at the Asterisk or FreeSWITCH integration examples with Sphinx, which do not use JSAPI.

>About jvoicexml - why is it written "demo implementation" on

>http://jvoicexml.sourceforge.net/ ? Is it somehow limited?

This is open source software... read the docs on the site to see what limitations, if any, there might be...

>How is jvoicexml connected with creating speech recognition

>for MIDlet and how with Sphinx (PocketSphinx, Sphinx4)?

I don't know what a MIDlet is...

See here: "What is the difference between a VoiceXML Interpreter, a VoiceXML Browser and a VoiceXML Platform?" and here: "VoiceXML Browsers".

>About speechforge.org - do you think that I need to use any of those

>four projects: MRCP4J, NLSML4J, Cairo, MRCP-TCK?

My mistake... likely not. If you were building an enterprise app, then these might be needed.

>is it enough to write it as MIDlet or should it be application

>created strictly for Symbian/Android?

don't know

>It is worth creating a wrapper for PyS60 1.4.5 or PyS60 1.9.x".

dunno

>Are you really sure I can run jvoicexml on mobile phone?

No... I just gave you pointers on where to look in developing your own app. 

Please note that I can only provide general information about speech recognition.   I don't know the implementation specifics of speech recognition on Android or Symbian.

Ken

 

--- (Edited on 10/15/2009 2:11 pm [GMT-0400] by kmaclean) ---

Re: jsapi, PocketSphinx, mobile phone - newbie
User: kmaclean
Date: 12/18/2009 12:16 pm
Views: 3040
Rating: 6

This paper might be of interest to you: "PocketSphinx: A Free, Real-Time Continuous Speech Recognition System for Hand-Held Devices"

--- (Edited on 12/18/2009 1:16 pm [GMT-0500] by kmaclean) ---
