VoxForge
Hi Dano,
>And I created a .po file (currently under review) on Launchpad (launchpad.net/voxforge). I will also set up the .po file for
>translation of the site and prompts if you want to.
That would be awesome. I'm still not 100% clear on how a Java .po file works (or how collecting translations on Launchpad works...), but the ability to add new languages without tinkering with code seems like a good thing to me!
Do you need access to the Subversion repository?
Ken
--- (Edited on 9/5/2008 9:23 am [GMT-0400] by kmaclean) ---
Maybe this link is helpful?
http://www.gnu.org/software/autoconf/manual/gettext/Java.html
--- (Edited on 06-09-2008 1:36 pm [GMT+0200] by dano) ---
Hi Daniël,
>Maybe this link is helpful?
Yes, thanks,
I am glad to see that GNU gettext po files are implemented using Sun's own Java Internationalization mechanism.
>but this seems not very difficult to me
Well... as they say: "the devil is in the details."
Ken
--- (Edited on 9/9/2008 12:35 pm [GMT-0400] by kmaclean) ---
Hi Dano,
>http://www.dev.voxforge.org/projects/Main/ticket/366 is about
>other languages :)
oops...maybe I should read the thing... :)
The one I was thinking of is: ticket #376 - Nightly Build Acoustic Model Performance Decrease.
Ken
--- (Edited on 9/2/2008 1:40 pm [GMT-0400] by kmaclean) ---
Well, it's important to have clean data and quantitative tests; without them it's impossible to move forward.
Prompted by this discussion, I started training a Sphinx model. I suppose it will take a week on my machine, but we'll probably move training to the cluster.
I have already found the following problems in the prompts:
corno1979-10102006
kylegoetz-10122006
corno1979-10102006-NR - bad PROMPTS
mfread* - no PROMPTS, just prompts.txt
douglaid-20080205 - vf-01 instead of vf-1
many PROMPTS files have ../../../Audio/MFCC/XXkHz_YYbit/MFCC_0_D/ inside
douglaid-20080203 - incorrect prompt line
mojomove411-20071102-poe/wav/iaf0007 KILTARTAN\342\200\231S - bad word
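Some of the problems listed above can be detected mechanically. The following is only a minimal Python sketch, assuming each PROMPTS line has the form "<utterance id> <WORDS...>". The function names and the specific checks are hypothetical and are not taken from any VoxForge or Sphinx script. It flags ids that contain a relative path and words that contain non-ASCII characters or octal escapes such as \342\200\231 (the UTF-8 bytes of a typographic apostrophe, U+2019).

```python
import re

# An octal escape such as \342 written out as literal text in a prompt.
OCTAL_ESCAPE = re.compile(r"\\[0-3][0-7]{2}")


def check_prompt_line(line):
    """Return a list of problems found in one PROMPTS line (illustrative checks)."""
    parts = line.split()
    if len(parts) < 2:
        return ["missing utterance id or transcription"]
    utt_id, words = parts[0], parts[1:]
    problems = []
    # Assumption: a clean id never contains a relative path component.
    if "../" in utt_id:
        problems.append("utterance id contains a relative path")
    for word in words:
        if OCTAL_ESCAPE.search(word) or not word.isascii():
            problems.append("bad word: " + word)
    return problems


def decode_octal_escapes(text):
    """Turn literal octal escapes back into the character they encode.

    Assumes the rest of the text is plain ASCII or Latin-1.
    """
    raw = OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(0)[1:], 8)), text)
    return raw.encode("latin-1").decode("utf-8")


print(check_prompt_line(r"iaf0007 KILTARTAN\342\200\231S"))
print(decode_octal_escapes(r"KILTARTAN\342\200\231S"))
```

Decoding the escape shows what the speaker's prompt originally contained, which makes it easier to decide whether to replace it with a plain apostrophe.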
I also compiled a list of problematic utterances for which alignment failed; it would be nice to review them:
douglaid-20080219/wav/vf11-07,
douglaid-20080219/wav/vf11-08,
douglaid-20080219/wav/vf11-11,
knotyouraveragejo-20080428-adv/wav/adv0231,
G-20080425-itf/wav/b0002,
xaviergonz-20080419-uje/wav/a0398,
xaviergonz-20080419-uje/wav/a0404,
ductapeguy-20070308b/wav/bab.0023,
peterwhy-20080503-win/wav/win0151,
chocoholic-20070524/wav/eti0091,
chocoholic-20070524/wav/eti0237,
anonymous-20080204-hnl/wav/ar-24,
anonymous-20080716-sfu/wav/a0340,
knotyouraveragejo-20080502-adv/wav/adv0280,
anonymous-20080630-lhi/wav/a0285,
gilrim-20080120-vgs/wav/b0415,
rjmunro-20080517-win/wav/a0236,
Toyo-20080229-ogz.zip/wav/a0104,
Toyo-20080229-ogz.zip/wav/a0105,
Toyo-20080229-ogz.zip/wav/a0106,
Toyo-20080229-ogz.zip/wav/a0108,
Toyo-20080229-ogz.zip/wav/a0111,
Toyo-20080229-ogz.zip/wav/a0112,
mjmm-20080526-hca/wav/b0074,
mjmm-20080526-hca/wav/b0075,
mjmm-20080526-hca/wav/b0076,
mjmm-20080526-hca/wav/b0077,
mjmm-20080526-hca/wav/b0078,
mjmm-20080526-hca/wav/b0079,
mjmm-20080526-hca/wav/b0080,
mjmm-20080526-hca/wav/b0081,
mjmm-20080526-hca/wav/b0082,
knotyouraveragejo-20070621-sci/wav/sci0150,
nestea247-20080301-sbn/wav/a0310,
corno1979-10102006-NR/wav/cc011,
corno1979-10102006-NR/wav/cc012,
corno1979-10102006-NR/wav/cc016,
corno1979-10102006-NR/wav/cc018,
corno1979-10102006-NR/wav/cc026,
corno1979-10102006-NR/wav/cc033,
corno1979-10102006-NR/wav/cc036,
corno1979-10102006-NR/wav/cc039,
Mark_Reynolds-20070531-cc/wav/cc-27,
cebidae-20080522-nsi/wav/b0385,
gilrim-20080120-ohc/wav/a0495,
gilrim-20080120-ohc/wav/a0500,
xenobyte72-20080530-pgo/wav/b0131,
kayray-20070611-ele/wav/ele0116,
chocoholic-20070612-eti33/wav/eti0278,
bloomtom-20080612-pfg/wav/a0401,
KnitGirl-20071113-dil/wav/b0274,
gilrim-20080120-uxi/wav/a0093,
gilrim-20080120-uxi/wav/a0094,
gilrim-20080120-uxi/wav/a0095,
gilrim-20080120-uxi/wav/a0096,
gilrim-20080120-uxi/wav/a0097,
robertburrelldonkin-20070918-vf16/wav/vf16-22,
cebidae-20080522-npq/wav/a0264,
cebidae-20080522-npq/wav/a0265,
cebidae-20080522-npq/wav/a0267,
Thomas-20080507-iya/wav/a0187,
vince-20071118-tez/wav/b0297,
gilrim-20080120-rzu/wav/rp-10,
vikramjb-20080416-cls/wav/a0398,
vikramjb-20080416-cls/wav/a0403,
vikramjb-20080416-cls/wav/a0404,
vikramjb-20080416-cls/wav/a0405,
vikramjb-20080416-cls/wav/a0406,
guilherme-20080123-pfh/wav/b0150,
knotyouraveragejo-20070620-sci/wav/sci0135,
anonymous-20080425-ojw/wav/b0363,
russellfeeed-20080211-upk/wav/b0025,
russellfeeed-20080211-upk/wav/b0026,
russellfeeed-20080211-upk/wav/b0027,
russellfeeed-20080211-upk/wav/b0028,
russellfeeed-20080211-upk/wav/b0031,
russellfeeed-20080211-upk/wav/b0033,
russellfeeed-20080211-upk/wav/b0034,
kayray-20070527-per07/wav/per0007,
kayray-20070527-per07/wav/per0014,
kayray-20070527-per07/wav/per0057,
kayray-20070527-per07/wav/per0071,
kayray-20070527-per07/wav/per0120,
kayray-20070527-per07/wav/per0141,
kayray-20070527-per07/wav/per0179,
kayray-20070527-per07/wav/per0231,
kayray-20070527-per07/wav/per0319,
kayray-20070527-per07/wav/per0335,
CptOatmeal-20080721-vnh/wav/a0426,
Joel-20080716-qoz/wav/b0074,
Joel-20080716-qoz/wav/b0075,
Joel-20080716-qoz/wav/b0076,
Joel-20080716-qoz/wav/b0077,
Joel-20080716-qoz/wav/b0078,
Joel-20080716-qoz/wav/b0080,
Joel-20080716-qoz/wav/b0081,
Joel-20080716-qoz/wav/b0082,
Joel-20080716-qoz/wav/b0083,
kayray-20070425-per04/wav/per0041,
kayray-20070425-per04/wav/per0073,
kayray-20070425-per04/wav/per0100,
kayray-20070425-per04/wav/per0105,
bloomtom-20080612-vya/wav/rb-31,
GrahamPhillips-20071111-oxp/wav/a0115,
GrahamPhillips-20071111-oxp/wav/a0117,
anonymous-20071127-rln/wav/a0575,
anonymous-20080318-eaq/wav/b0073,
jaiger-20061231-vf7/wav/vf7-25,
starlite-20070614-fur2/wav/fur0136
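To decide which submissions to review first, the list above can be grouped by submission directory. This is a rough Python sketch, assuming every entry has the form "<submission>/wav/<utterance id>" with an optional trailing comma. The function name is made up for illustration, and only a few entries from the list are shown.

```python
from collections import Counter


def count_by_submission(utterances):
    """Count failed utterances per submission directory."""
    counts = Counter()
    for entry in utterances:
        # Drop whitespace and the trailing comma, then keep the part before /wav/.
        submission = entry.strip().rstrip(",").split("/wav/")[0]
        if submission:
            counts[submission] += 1
    return counts


failed = [
    "douglaid-20080219/wav/vf11-07,",
    "douglaid-20080219/wav/vf11-08,",
    "douglaid-20080219/wav/vf11-11,",
    "G-20080425-itf/wav/b0002,",
]
for submission, n in count_by_submission(failed).most_common():
    print(submission, n)
```

Submissions with many failures are more likely to have a systematic problem (shifted prompts, heavy noise) than a single bad recording.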
--- (Edited on 9/6/2008 6:07 am [GMT-0500] by nsh) ---
I don't have a very fast Internet connection, so downloading takes a long time :( so I only checked some of the recordings.
douglaid-20080219:
incorrect prompt lines (prompt 5 is skipped):
5 = 6
6 = 7
and so on until douglaid-20080219/mfc/vf11-16 THE ADDED WEIGHT HAD A VELOCITY OF FIFTEEN MILES PER HOUR (15 and 16 are equal)
G-20080425-itf/wav/b0002 a little tap at the beginning
xaviergonz-20080419-uje: a0398 seems good; the recording of a0404 begins too late (the P of PERRAULT is not recorded).
ductapeguy-20070308b/wav/bab.0023 seems good.
peterwhy-20080503-win/mfc/win0151 seems good, but I think it is two phrases, so he pauses for a while after LUNCH.
(peterwhy-20080503-win/mfc/win0150 NOR YOU EITHER IF YOU'VE GOT ANY SENSE AT ALL DON'T EVER REFER TO IT AGAIN PLEASE
peterwhy-20080503-win/mfc/win0151 NOW THEN HERE'S OUR BACKWATER AT LAST WHERE WE'RE GOING TO LUNCH LEAVING THE MAIN STREAM
peterwhy-20080503-win/mfc/win0152 THEY NOW PASSED INTO WHAT SEEMED AT FIRST SIGHT LIKE A LITTLE LAND LOCKED LAKE)
anonymous-20080204-hnl (sounds like inhaling in the first part)
anonymous-20080716 (little tap in the sound)
anonymous-20080630-lhi (blows into the microphone)
--- (Edited on 9/6/2008 7:28 am [GMT-0500] by Visitor) ---
It was me :)
douglaid-20080219 is very serious, as prompts 5 through 15 are wrong.
--- (Edited on 06-09-2008 4:08 pm [GMT+0200] by dano) ---
Some additional files:
anonymous-20080630-lhi/wav/a0285 blows into the microphone
gilrim-20080120-vgs (all) very noisy, but comprehensible
rjmunro-20080517-win/wav/a0236 big tap
Toyo-20080229-ogz.zip very bad: noisy, and the speaker cannot speak English
mjmm-20080526-hca VERY noisy
nestea247-20080301-sbn begins with a tap
corno1979-10102006-NR seems good, but aren't the prompts required to be in capitals rather than normal sentence case? (I don't know, but the other prompts were.)
Mark_Reynolds-20070531-cc/mfc/cc-27 AND LAID HER ON HER RIGHT SIDE THEN SARAH CONFIRMED THE VET'S DIAGNOSIS instead of
cc-27 AND LAID HER ON HER RIGHT SIDE THEN SARAH CONFIRMED THE VET'S DIAGNOSIS? (This applies to all prompts in this file.)
cebidae-20080522-ns: same issue as the previous one, but the speaker also says 'that' instead of 'last', and the last words are not spoken clearly.
--- (Edited on 06-09-2008 10:43 pm [GMT+0200] by dano) ---
--- (Edited on 06-09-2008 11:10 pm [GMT+0200] by dano) ---
Thanks Dano. Indeed, there is a high probability that the listed files are broken. The question is what should we do with them - remove, add as fillers, something else.
Training went faster than I expected, I've got a model already, you can download sphinx voxforge model with setup scripts and logs here:
http://www.mediafire.com/?jxy1bkznozb
At least now we have an estimate of the model's accuracy. On the 1/10 test set, with a custom trigram LM trained on the test prompts, it has the following quality:
TOTAL Words: 28112 Correct: 25767 Errors: 3158
TOTAL Percent correct = 91.66% Error = 11.23% Accuracy = 88.77%
TOTAL Insertions: 813 Deletions: 415 Substitutions: 1930
Not bad, but I suppose we can raise the accuracy to 97% if we try to optimize training.
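For readers unfamiliar with these figures, the percentages in the TOTAL lines can be reproduced from the raw counts. The Python sketch below assumes the usual word-level scoring definitions (errors = insertions + deletions + substitutions, accuracy = 100% minus the error rate). The thread does not say exactly how the scoring tool computes them, but these definitions are consistent with the reported numbers.

```python
def score(total_words, insertions, deletions, substitutions):
    """Compute word-level scoring figures from error counts.

    Assumed definitions (not confirmed by the scoring tool's documentation):
      correct  = total - deletions - substitutions
      errors   = insertions + deletions + substitutions
      accuracy = 100% - error rate
    """
    errors = insertions + deletions + substitutions
    correct = total_words - deletions - substitutions
    percent_correct = 100.0 * correct / total_words
    error_rate = 100.0 * errors / total_words
    accuracy = 100.0 - error_rate
    return correct, errors, percent_correct, error_rate, accuracy


correct, errors, pc, err, acc = score(28112, 813, 415, 1930)
print(f"Correct: {correct} Errors: {errors}")
print(f"Percent correct = {pc:.2f}% Error = {err:.2f}% Accuracy = {acc:.2f}%")
# Correct: 25767 Errors: 3158
# Percent correct = 91.66% Error = 11.23% Accuracy = 88.77%
```

Note that "percent correct" ignores insertions, which is why it is higher than the accuracy figure.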
Here is another list of suspicious prompts:
douglaid-20080219/wav/vf11-07,
douglaid-20080219/wav/vf11-08,
knotyouraveragejo-20080426-adv/wav/adv0190,
knotyouraveragejo-20080426-adv/wav/adv0308,
kayray-20070611-leo/wav/leo0210,
knotyouraveragejo-20080502-adv/wav/adv0280,
Toyo-20080229-ogz.zip/wav/a0111,
mjmm-20080526-hca/wav/b0074,
mjmm-20080526-hca/wav/b0075,
mjmm-20080526-hca/wav/b0076,
mjmm-20080526-hca/wav/b0078,
mjmm-20080526-hca/wav/b0079,
mjmm-20080526-hca/wav/b0080,
mjmm-20080526-hca/wav/b0081,
mjmm-20080526-hca/wav/b0082,
leonMire-20080526-lev/wav/lev0063,
corno1979-10102006-NR/wav/cc020,
corno1979-10102006-NR/wav/cc029,
Mark_Reynolds-20070531-cc/wav/cc-27,
kayray-20070608-rhi/wav/rhi0094,
safi-20071118-swr/wav/b0216,
starlite-20070605-che/wav/che0142,
kayray-20070611-ele/wav/ele0262,
robertburrelldonkin-200709011-vf11/wav/vf11-26,
KnitGirl-20071113-dil/wav/b0274,
gilrim-20080120-uxi/wav/a0093,
gilrim-20080120-uxi/wav/a0096,
gilrim-20080120-uxi/wav/a0101,
ttm-20071024-poe/wav/js0002,
topherfangio-20080604-jvb/wav/a0105,
ductapeguy-20080423-ang/wav/sto0020,
tis-20080416-tou/wav/voy0155,
knotyouraveragejo-20080525-mt2/wav/mtn0261,
vikramjb-20080416-cls/wav/a0398,
vikramjb-20080416-cls/wav/a0399,
vikramjb-20080416-cls/wav/a0400,
vikramjb-20080416-cls/wav/a0402,
vikramjb-20080416-cls/wav/a0403,
vikramjb-20080416-cls/wav/a0404,
vikramjb-20080416-cls/wav/a0405,
vikramjb-20080416-cls/wav/a0406,
CptOatmeal-20080721-vnh/wav/a0426,
Joel-20080716-qoz/wav/b0074,
Joel-20080716-qoz/wav/b0075,
Joel-20080716-qoz/wav/b0076,
Joel-20080716-qoz/wav/b0077,
Joel-20080716-qoz/wav/b0078,
Joel-20080716-qoz/wav/b0080,
Joel-20080716-qoz/wav/b0081,
Joel-20080716-qoz/wav/b0082,
Joel-20080716-qoz/wav/b0083,
anonymous-20071127-rln/wav/a0575,
anonymous-20080318-eaq/wav/b0073,
anonymous-20080318-eaq/wav/b0078,
anonymous-20080318-eaq/wav/b0079,
jaiger-20061231-vf7/wav/vf7-25,
--- (Edited on 9/6/2008 2:53 pm [GMT-0500] by nsh) ---
Hi nsh & Dano,
Good work guys!
>The question is what should we do with them - remove, add as fillers,
>something else.
I will look at these (and any others you may have...) and either correct them (if it is just a section of audio that is causing problems) or just move them to a "problem" directory in Subversion (and update the master prompts files) so we always have a list of the ones we removed.
thanks,
Ken
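As an illustration of the "update the master prompts files" step, here is a small Python sketch that splits prompt lines into kept and removed sets, so the removed ones can be saved alongside the "problem" directory. It assumes the utterance id is the first whitespace-separated field of each line. The function name and the choice of which id to remove are purely illustrative and do not describe the actual VoxForge scripts.

```python
def split_prompts(prompt_lines, removed_ids):
    """Split master prompt lines into (kept, removed) by utterance id."""
    kept, removed = [], []
    for line in prompt_lines:
        fields = line.split(maxsplit=1)
        if fields and fields[0] in removed_ids:
            removed.append(line)
        else:
            kept.append(line)
    return kept, removed


prompts = [
    "douglaid-20080219/mfc/vf11-16 THE ADDED WEIGHT HAD A VELOCITY OF FIFTEEN MILES PER HOUR",
    "peterwhy-20080503-win/mfc/win0152 THEY NOW PASSED INTO WHAT SEEMED AT FIRST SIGHT LIKE A LITTLE LAND LOCKED LAKE",
]
kept, removed = split_prompts(prompts, {"douglaid-20080219/mfc/vf11-16"})
print(len(kept), "kept;", len(removed), "removed")
```

Writing the removed lines to their own file keeps a record of what was taken out, so an utterance can be restored later if it turns out to be usable.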
--- (Edited on 9/9/2008 12:00 pm [GMT-0400] by kmaclean) ---
Hi nsh,
>Training went faster than I expected, I've got a model already, you can
>download sphinx voxforge model with setup scripts and logs here:
>http://www.mediafire.com/?jxy1bkznozb
Awesome!
I will add this to the downloads page.
thanks,
Ken
--- (Edited on 9/9/2008 12:07 pm [GMT-0400] by kmaclean) ---