VoxForge
Hi everybody.
I am developing a small Android app that should be partially controlled by voice commands. Since I would like offline recognition, I've chosen pocketsphinx as my engine. The demo app is up and running and works quite well in English. But as I need German voice commands, I put in the appropriate files for German recognition, which I found here:
This works to some extent, but the system hardly recognizes anything.
I replaced the LM file with a small .gram file that looks like this:
______________________________________________
#JSGF V1.0;
/******************************
Grammar generated automatically
******************************/
grammar commands;
public <test> = ( JA | NEIN | WEITER | ZURÜCK | ABBRECHEN ) ;
_________________________________________________
The corresponding dictionary file looks like this:
_________________________________________________
JA j aa:
NEIN n ai n
WEITER v ai t ei
ZURÜCK qq t s u: qq r y: s e: k aa:
_________________________________________________
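Since every word in the grammar needs a pronunciation entry in the dictionary, a quick cross-check of the two files can rule out one source of trouble. The following is only a minimal sketch in plain Java, not part of pocketsphinx: the class `GrammarDictCheck` and its methods are hypothetical names, and the parser assumes a flat rule of the form `public <rule> = ( A | B | C ) ;` and one dictionary entry per line with the word as the first token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not a pocketsphinx API): lists grammar words
// that have no entry in the pronunciation dictionary.
final class GrammarDictCheck {

    // Collects the word tokens of the first public rule.
    // Assumes a flat rule such as: public <test> = ( A | B | C ) ;
    static Set<String> grammarWords(String jsgf) {
        Set<String> words = new LinkedHashSet<>();
        int pub = jsgf.indexOf("public");
        if (pub < 0) return words;
        int eq = jsgf.indexOf('=', pub);
        if (eq < 0) return words;
        int end = jsgf.indexOf(';', eq);
        if (end < 0) return words;
        // Split on whitespace, '|' and parentheses; what remains are the words.
        for (String token : jsgf.substring(eq + 1, end).split("[\\s|()]+")) {
            if (!token.isEmpty()) words.add(token);
        }
        return words;
    }

    // Takes the first whitespace-separated token of each non-blank line as the word.
    static Set<String> dictionaryWords(List<String> dictLines) {
        Set<String> words = new LinkedHashSet<>();
        for (String line : dictLines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) words.add(trimmed.split("\\s+")[0]);
        }
        return words;
    }

    // Grammar words without a dictionary entry, in grammar order.
    static List<String> missingFromDictionary(String jsgf, List<String> dictLines) {
        List<String> missing = new ArrayList<>(grammarWords(jsgf));
        missing.removeAll(dictionaryWords(dictLines));
        return missing;
    }

    public static void main(String[] args) {
        // \u00DC is the U-umlaut in the word ZURUECK as written in the grammar.
        String jsgf = "#JSGF V1.0;\ngrammar commands;\n"
                + "public <test> = ( JA | NEIN | WEITER | ZUR\u00DCCK | ABBRECHEN ) ;";
        List<String> dict = Arrays.asList(
                "JA j aa:",
                "NEIN n ai n",
                "WEITER v ai t ei",
                "ZUR\u00DCCK qq t s u: qq r y: s e: k aa:");
        System.out.println(missingFromDictionary(jsgf, dict)); // prints [ABBRECHEN]
    }
}
```

Run against the grammar and the four dictionary lines exactly as posted above, this reports ABBRECHEN as having no entry. It only checks that a word is present; it says nothing about whether the listed phonemes are correct for the acoustic model.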
This is a 16 kHz model.
You need to modify the demo sources to record audio at 16 kHz instead of 8 kHz.
You can uncomment rawlogdir to store the raw files you are trying to recognize, which helps when debugging accuracy issues.
For more information see the tutorial
http://cmusphinx.sourceforge.net/wiki/tutorial
and the FAQ:
Hi,
thanks for your reply.
Recording is set to 16kHz. (c.setFloat("-samprate", 16000.0);)
OK, since you are pointing me straight to the cmusphinx project, I assume you think it is a sphinx-specific problem. I'll ask there.
Thank you.
> Recording is set to 16kHz. (c.setFloat("-samprate", 16000.0);)
That is not enough; you also need to set 16 kHz in the microphone recording properties.
> as you are directing me directly to the cmusphinx project I assume that you think it is a sphinx specific problem. I'll ask there.
Thanks for that tip. While analyzing the raw files I had just found that out myself.
After changing the recording to the correct sample rate it started to work much better. ;)
this.rec = new AudioRecord(
        MediaRecorder.AudioSource.DEFAULT,
        16000,                            // sample rate in Hz, matching the 16 kHz model
        AudioFormat.CHANNEL_IN_MONO,
        AudioFormat.ENCODING_PCM_16BIT,
        16384);                           // buffer size in bytes
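To make the numbers in that constructor call concrete, here is a small back-of-the-envelope sketch in plain Java (no Android classes). `PcmBufferMath` is a hypothetical helper, not an Android or pocketsphinx API. It shows how much audio a 16384-byte buffer holds, and why audio captured at 8 kHz but decoded as 16 kHz cannot match the model: the same bytes cover twice the time at 8 kHz, so the decoder would see the speech at double speed.

```java
// Hypothetical helper for illustration only: basic linear-PCM arithmetic.
final class PcmBufferMath {

    // Bytes consumed per second of audio for linear PCM.
    static int bytesPerSecond(int sampleRateHz, int channels, int bitsPerSample) {
        return sampleRateHz * channels * (bitsPerSample / 8);
    }

    // Milliseconds of audio that fit into a buffer of the given size.
    static double bufferMillis(int bufferBytes, int sampleRateHz, int channels, int bitsPerSample) {
        return 1000.0 * bufferBytes / bytesPerSecond(sampleRateHz, channels, bitsPerSample);
    }

    public static void main(String[] args) {
        // 16 kHz, mono, 16-bit: 32000 bytes per second.
        System.out.println(bufferMillis(16384, 16000, 1, 16)); // prints 512.0
        // The same number of bytes recorded at 8 kHz covers twice the time.
        System.out.println(bufferMillis(16384, 8000, 1, 16));  // prints 1024.0
    }
}
```

On a device, the platform's `AudioRecord.getMinBufferSize(...)` can be consulted to confirm that the chosen size is not below the minimum for that configuration.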
Now it recognizes the words "NEIN", "ZURÜCK" and "WEITER" without problems, but the word "JA" is impossible to hit.
Here are my complete configs: http://pastebin.com/rvauhR3L
Logfile output: http://pastebin.com/ywJQ0aDB
Any ideas?
You need to share the raw files, not just the log.
Accuracy can usually be improved by verifying the dictionary and by adapting the acoustic model. Most likely the dictionary entry for JA is not correct.
Overall, the German VoxForge model is not of the best quality; it can be retrained for improved accuracy.
Hi,
Here you can download a bunch of raw files I just recorded. Not a single "JA" was recognized.
They are 16-bit signed, little-endian, with a 16 kHz sample rate.
https://rapidshare.com/files/1252972892/Rawlog.zip
OK, I just put together a complete package of what I am doing at the moment. I ran it under Linux for testing; the result is the same as on Android: "JA" is not recognized at all.
You can find the package here:
https://rapidshare.com/files/721115359/pocketsphinx_testing.zip
There is a .sh shell script which holds the command that was executed. The other file contains the complete output of the console.
The recording quality of the speech is not very good; it's actually what I get from the internal microphone of the tablet. But the recognition rate is still quite good given the low SNR of the files. (There are just a few that don't get recognized, but listening to them separately shows that they have extremely low amplitude, so no problem there, except for that word "JA"...)
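For anyone who wants to check the amplitude of such raw files without listening to each one, the following is a minimal sketch in plain Java. It assumes headerless 16-bit signed little-endian mono samples at 16 kHz, as described above. `RawPcmStats` is a hypothetical helper, not part of pocketsphinx, and the file path is whatever raw file you pass on the command line.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.ShortBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: peak and RMS level of headerless 16-bit little-endian PCM.
final class RawPcmStats {
    final int samples;  // number of 16-bit samples
    final int peak;     // largest absolute sample value, 0..32768
    final double rms;   // root mean square of the sample values

    private RawPcmStats(int samples, int peak, double rms) {
        this.samples = samples;
        this.peak = peak;
        this.rms = rms;
    }

    static RawPcmStats of(byte[] raw) {
        // Interpret byte pairs as signed 16-bit little-endian samples.
        ShortBuffer pcm = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
        int n = pcm.remaining();
        int peak = 0;
        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            int s = pcm.get(i);
            peak = Math.max(peak, Math.abs(s));
            sumSquares += (double) s * s;
        }
        double rms = n == 0 ? 0.0 : Math.sqrt(sumSquares / n);
        return new RawPcmStats(n, peak, rms);
    }

    // Duration in seconds for mono audio at the given sample rate.
    double seconds(int sampleRateHz) {
        return (double) samples / sampleRateHz;
    }

    // Peak level relative to full scale; negative infinity for pure silence.
    double peakDbfs() {
        return 20.0 * Math.log10(peak / 32768.0);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RawPcmStats <file.raw>");
            return;
        }
        RawPcmStats st = of(Files.readAllBytes(Paths.get(args[0])));
        System.out.printf("%d samples, %.2f s at 16 kHz, peak %d (%.1f dBFS), rms %.1f%n",
                st.samples, st.seconds(16000), st.peak, st.peakDbfs(), st.rms);
    }
}
```

A file whose peak sits far below full scale is a likely candidate for the low-amplitude misses mentioned above. A duration that looks half or double what was actually spoken would point to a sample rate mismatch instead.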
Hope this helps