VoxForge
Hi all,
I have used HTK for many years, and now I'm trying to test the pocketsphinx decoder. I built a 16-Gaussian mixture, MFCC_E_D_A_Z, cross-word triphone acoustic model in HTK using 15 hours of data, and I had great results with HDecode.
I used the Python HTK-to-Sphinx acoustic model converter:
http://cmusphinx.sourceforge.net/2010/08/python-htk-converter/
It seems to work! But when I test the converted model with pocketsphinx, I get the following error message:
--------------------------------------------------------------------
pocketsphinx_batch \
-adcin yes \
-hmm LapsAM-1.5-16k-Sphinx \
-lw 10 \
-feat 1s_c_d_dd \
-dict Sphinx/dictionary.dic \
-lm /sphinx/test_benchmark/etc/laps1.5.lm \
-ctl wav-list.txt \
-cepdir sphinx/test_benchmark/feat \
-cepext .wav \
-hyp /sphinx/test_benchmark/result/laps1.5-1-1.match \
-agc none \
-varnorm no \
-cmn current \
-ceplen 13 \
-wlen 0.025
Current configuration:
[NAME] [DEFLT] [VALUE]
-adchdr 0 0
-adcin no yes
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-argfile
-ascale 20.0 2.000000e+01
-backtrace no no
-beam 1e-48 1.000000e-48
-bestpath yes yes
-bestpathlw 9.5 9.500000e+00
-bghist no no
-build_outdirs yes yes
-cepdir /home/04080002801/sphinx/test_benchmark/feat
-cepext .mfc .wav
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-ctl wav-list.txt
-ctlcount -1 -1
-ctlincr 1 1
-ctloffset 0 0
-ctm
-debug 0
-dict /home/04080002801/htkEmbedded/Sphinx/dictionary.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-08
-frate 100 100
-fsg
-fsgctl
-fsgdir
-fsgext
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-29
-fwdtree yes yes
-hmm /home/04080002801/LapsAM-1.5-16k-Sphinx
-hyp /home/04080002801/sphinx/test_benchmark/result/laps1.5-1-1.match
-hypseg
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm /home/04080002801/sphinx/test_benchmark/etc/laps1.5.lm
-lmctl
-lmname default default
-lmnamectl
-logbase 1.0001 1.000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+02
-lpbeam 1e-40 1.000000e-40
-lponlybeam 7e-29 7.000000e-29
-lw 6.5 1.000000e+01
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-mixw
-mixwfloor 0.0000001 1.000000e-07
-mllr
-mllrctl
-mllrdir
-mllrext
-mmap yes yes
-nbest 0 0
-nbestdir
-nbestext .hyp .hyp
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+00
-outlatdir
-pbeam 1e-48 1.000000e-48
-pip 1.0 1.000000e+00
-pl_beam 1e-10 1.000000e-10
-pl_pbeam 1e-5 1.000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-sendump
-senmgau
-silprob 0.005 5.000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-usewdphones no no
-uw 1.0 1.000000e+00
-var
-varfloor 0.0001 1.000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-29
-wip 0.65 6.500000e-01
-wlen 0.025625 2.500000e-02
INFO: feat.c(848): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12.00, mean[1..12]= 0.0
INFO: mdef.c(520): Reading model definition: /home/04080002801/LapsAM-1.5-16k-Sphinx/mdef
FATAL_ERROR: "mdef.c", line 299: CI-senone-id(337) > #CI-senones(117): aa - - - n/a 55 337 338 112 N
Has anyone tried this converter? Does it work?
Thanks for any help!
[ ]'s
Patrick
--- (Edited on 9/27/2010 2:46 pm [GMT-0500] by krlospatrick) ---
Hello Patrick
Please take into account that this code is highly experimental. It does work in some cases, but there are several known issues:
1. Accuracy after conversion is slightly worse than with HDecode (about 2% relative). We are looking into this issue now.
2. Feature extraction is not compatible. You need different feature extraction options to match HTK, and the ones in your command line are incorrect. You need to add -transform htk -nfilt 22 (the option names as they appear in your configuration dump) and set -lowerf and -upperf to something like 1 and 8000. Alternatively, you can process HTK MFCC files directly.
3. Your problem reveals a bug in the converter, which fails to convert the model properly. I need to be able to reproduce your issue in order to fix it in htk2s3conv sources. Specifically I need the model files you have. This issue is tracked in the CMUSphinx bug tracking system:
https://sourceforge.net/tracker/?func=detail&aid=3077225&group_id=1904&atid=101904
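For the feature extraction options in item 2, here is a minimal Python sketch that assembles the adjusted command line. The base options are copied (abridged) from the command at the top of the thread. The option names -transform, -nfilt, -lowerf and -upperf come from the configuration dump; the values are the "something like" suggestions above, and whether a given build accepts `htk` for -transform is an assumption worth checking against the decoder's help output. The helper name `build_command` is purely illustrative.

```python
import shlex

# Options copied from the command at the top of the thread (abridged).
BASE_OPTIONS = {
    "-adcin": "yes",
    "-hmm": "LapsAM-1.5-16k-Sphinx",
    "-feat": "1s_c_d_dd",
    "-dict": "Sphinx/dictionary.dic",
    "-lm": "/sphinx/test_benchmark/etc/laps1.5.lm",
    "-ctl": "wav-list.txt",
    "-cepdir": "sphinx/test_benchmark/feat",
    "-cepext": ".wav",
}

# Front-end overrides suggested in the reply. The values are assumptions
# and should match the HTK configuration the model was trained with.
HTK_FRONTEND = {
    "-transform": "htk",
    "-nfilt": "22",
    "-lowerf": "1",
    "-upperf": "8000",
}


def build_command(base, overrides, program="pocketsphinx_batch"):
    """Merge the two option sets and return an argument list."""
    merged = dict(base)
    merged.update(overrides)  # overrides win on any clash
    args = [program]
    for name, value in merged.items():
        args.extend([name, value])
    return args


command = build_command(BASE_OPTIONS, HTK_FRONTEND)
# Print a shell-safe command line for inspection; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in command))
```

The printed line can be pasted into a shell, or the list can be handed to `subprocess.run` once the values have been checked against the HTK training configuration.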
It might be easier for you to retrain your model using SphinxTrain; I think you'll enjoy the results. I also respectfully recommend that you ask further questions on the CMUSphinx forums, the default support place for the CMUSphinx project.
--- (Edited on 9/28/2010 17:10 [GMT+0400] by nsh) ---
Hi Nickolay,
Thanks for your answer!
I am following the issue in the CMUSphinx bug tracking system:
https://sourceforge.net/tracker/?func=detail&aid=3077225&group_id=1904&atid=101904
>> I need to be able to reproduce your issue in order to fix it in htk2s3conv sources. Specifically I need the model files you have.
This is the link to the original HTK acoustic model:
http://www.laps.ufpa.br/patrick/downloads/LapsAM-1.5-16k-Pre-Sphinx.rar
You can use it to test the conversion scripts. Please let me know of any improvements. I'll try to train the model with SphinxTrain.
Thanks for the help!
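One quick way to test a converted model before running the decoder is to scan its mdef file for the condition behind the error above: a context-independent phone whose senone ids fall outside the CI senone range. The sketch below is only an illustration. It assumes a text mdef with an `n_tied_ci_state` header line and phone lines laid out as in the error message (base, left, right, position, attribute, tmat, state ids, then `N`), and it assumes senone ids are zero-based. The function name and the sample header values are hypothetical; only the `aa` line and the count of 117 come from the log.

```python
def find_bad_ci_senones(mdef_text):
    """Return (phone, [bad senone ids]) for each CI phone out of range."""
    n_ci = None
    problems = []
    for line in mdef_text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment
        if len(fields) == 2 and fields[1] == "n_tied_ci_state":
            n_ci = int(fields[0])
            continue
        # Phone line: base lft rt p attrib tmat state-ids... N
        if len(fields) >= 8 and fields[-1] == "N":
            base, left, right = fields[0], fields[1], fields[2]
            if left == "-" and right == "-":  # context-independent phone
                if n_ci is None:
                    raise ValueError("n_tied_ci_state header not found")
                ids = [int(f) for f in fields[6:-1]]
                bad = [s for s in ids if s >= n_ci]
                if bad:
                    problems.append((base, bad))
    return problems


# Tiny sample: header values are made up for illustration, except the
# 117 CI senones and the "aa" line, which come from the error message.
SAMPLE_MDEF = """\
0.3
2 n_base
0 n_tri
8 n_state_map
117 n_tied_state
117 n_tied_ci_state
2 n_tied_tmat
#
#base lft  rt p attrib tmat      ... state id's ...
  SIL   -   - - filler    0    0    1    2    N
   aa   -   - -    n/a   55  337  338  112    N
"""

for phone, bad_ids in find_bad_ci_senones(SAMPLE_MDEF):
    print(phone, bad_ids)
```

To check a real conversion, the same function can be called on the contents of the mdef file in the converted model directory.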
Patrick
--- (Edited on 9/28/2010 1:21 pm [GMT-0500] by Visitor) ---