VoxForge
Hi
Besides contributing to the German VoxForge model, I also want to use it to control my PC.
At first I used blather for this, but it is not as advanced as simon, so I switched to simon. It is also very nice that simon can upload the files here to VoxForge :-D
Anyhow, I wanted to start with the new German base model from guenter:
http://goofy.zamia.org/voxforge/de/voxforge-de-r20140311.tgz
I want to use it as a base and adapt it to my needs.
Unfortunately, simon can't use this archive. I compared it with a much older simon model from 2010:
it is a completely different format.
How can I use the new German model with simon? How does it need to be converted?
Or can I use the "Generate it from files" approach? If so, please specify the commands that I need to fill into all of these columns :-D
Thanks a lot :-D
> How can I use the new German model with simon? How does it need to be converted?
You do not need to convert it. You just need to import it as a base model. See
http://docs.kde.org/development/en/extragear-accessibility/simon/configuration.html#base_model_use
Import it from an archive and it should work. You need 'Open Model'->'Import'.
> Or can I use the "Generate it from files" approach? If so, please specify the commands that I need to fill into all of these columns
Use the command you are planning to use in your everyday control.
Thanks, I figured this out now. The trick is to extract the package and use the text files inside :-D
But I get this error:
The recognition reported the following error:
The required speech recognition backend for this model ("CMU SPHINX") is not available.
Please install it to continue.
(More information: http://userbase.kde.org/Simon/Back_ends).
I find this odd because sphinx3 is already installed!
Packages I have installed:
app-accessibility/pocketsphinx
app-accessibility/sphinx3
app-accessibility/sphinxbase
app-accessibility/SphinxTrain
There is also sphinx2 available, but I did not install it as it seems to be the older version.
I have these sphinx programs installed:
manuel@kobold /home/manuel $ sphinx
sphinx3_align sphinx3_continuous sphinx3_ep sphinx3_lm_convert sphinx_cont_fileseg sphinx_lm_eval
sphinx3_astar sphinx3_dag sphinx3_gausubvq sphinx3-simple sphinx_fe sphinx_lm_sort
sphinx3_cfg2fsg sphinx3_decode sphinx3_livedecode sphinx_cepview sphinx_jsgf2fsg sphinx_pitch
sphinx3_conf sphinx3_dp sphinx3_livepretend sphinx_cont_adseg sphinx_lm_convert sphinxtrain
I don't know what is missing. The console also does not give a useful error message about what could be missing.
Thanks
OK, I solved the previous error by re-emerging the simon package. :-D
But now I get this error:
INFO: cmd_ln.c(691): Parsing command line:
\
-hmm /tmp/kde-manuel//simond/default/sphinx/ \
-jsgf /tmp/kde-manuel//simond/default/sphinx/default{8da2f162-7a64-4b9a-b097-3bf12be148b2}.jsgf \
-dict /tmp/kde-manuel//simond/default/sphinx/default{8da2f162-7a64-4b9a-b097-3bf12be148b2}.dic \
-samprate 16000
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2,000000e+00
-alpha 0.97 9,700000e-01
-ascale 20.0 2,000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1,000000e-48
-bestpath yes yes
-bestpathlw 9.5 9,500000e+00
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /tmp/kde-manuel//simond/default/sphinx/default{8da2f162-7a64-4b9a-b097-3bf12be148b2}.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1,000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1,000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8,500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7,000000e-29
-fwdtree yes yes
-hmm /tmp/kde-manuel//simond/default/sphinx/
-input_endian little little
-jsgf /tmp/kde-manuel//simond/default/sphinx/default{8da2f162-7a64-4b9a-b097-3bf12be148b2}.jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1,000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1,333333e+02
-lpbeam 1e-40 1,000000e-40
-lponlybeam 7e-29 7,000000e-29
-lw 6.5 6,500000e+00
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1,000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1,000000e+00
-pbeam 1e-48 1,000000e-48
-pip 1.0 1,000000e+00
-pl_beam 1e-10 1,000000e-10
-pl_pbeam 1e-5 1,000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1,600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5,000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1,000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6,855498e+03
-usewdphones no no
-uw 1.0 1,000000e+00
-var
-varfloor 0.0001 1,000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7,000000e-29
-wip 0.65 6,500000e-01
-wlen 0.025625 2,562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\
-lowerf 130 \
-upperf 6800 \
-nfilt 25 \
-transform dct \
-lifter 22 \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2,000000e+00
-alpha 0.97 9,700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 22
-logspec no no
-lowerf 133.33334 1,300000e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-remove_dc no no
-round_filters yes yes
-samprate 16000 1,600000e+04
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6,800000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2,562500e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /tmp/kde-manuel//simond/default/sphinx//feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12,00, mean[1..12]= 0.0
INFO: mdef.c(517): Reading model definition: /tmp/kde-manuel//simond/default/sphinx//mdef
INFO: bin_mdef.c(179): Allocating 60978 * 8 bytes (476 KiB) for CD tree
INFO: tmat.c(205): Reading HMM transition probability matrices: /tmp/kde-manuel//simond/default/sphinx//transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//means
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//variances
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(354): 6598 variance values floored
INFO: acmod.c(123): Attempting to use PTHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//means
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//variances
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(354): 6598 variance values floored
INFO: ptm_mgau.c(792): Number of codebooks exceeds 256: 4177
INFO: acmod.c(125): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//means
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//variances
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(354): 6598 variance values floored
ERROR: "ms_mgau.c", line 112: Dimension of stream 0 does not match: 32 != 39
This looks more like an error in the language model or in sphinx to me, but I don't really have a clue :-(
Thanks for the help :-D
Somehow the model was not imported properly. It was either a Simon bug or an import issue.
The problem is that the imported model lacks the "feature_transform" file present in the original model. Try to reimport the model, and if it is still broken, try copying the feature_transform file from the original model to the folder /tmp/kde-manuel/simond/default/sphinx yourself.
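The numbers in the error fit this explanation: the log reports the model's Gaussians as 16x32, while the 1s_c_d_dd feature with 13 cepstra gives 3 * 13 = 39 dimensions, so without the transform file the two sides do not match. A quick way to see which files made it into the imported folder is sketched below in Python. The file list is only collected from the log lines in this thread, and the function name is made up for illustration; whether Simon needs exactly this set is an assumption.

```python
from pathlib import Path

# File names as they appear in the pocketsphinx log in this thread.
# Assumption: this list is complete enough for a sanity check; it is
# not taken from Simon or pocketsphinx documentation.
EXPECTED_FILES = [
    "feat.params",
    "mdef",
    "means",
    "variances",
    "mixture_weights",
    "transition_matrices",
    "noisedict",
    "feature_transform",
]


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling missing_model_files("/tmp/kde-manuel/simond/default/sphinx") returns the names that are not there; an empty list means every file from the list above was found.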
Ah ok this makes sense now.
As I said before, simon seems unable to import .tgz archives (it gives the error: model is corrupt), so I extracted the model and clicked on "create from model files". In that file dialog there is no option to select "feature_transform", so I guess it does not get included in the model.
I will try to copy it by hand and see what happens :-D
OK, the problem is that the metadata.xml is missing, and copying did indeed solve the problem.
I wrote a small script that converts the file structure of the German model (I guess this is the sphinx3 file structure) into a fully working .sbm archive which can easily be imported into simon :-D
Have fun. And please improve it if you want; it is a dirty hack in some places...
Well, when I try to use it with the scenario "Fensterverwaltung" (window management), it gives me this error:
INFO: cmd_ln.c(691): Parsing command line:
\
-hmm /tmp/kde-manuel//simond/default/sphinx/ \
-jsgf /tmp/kde-manuel//simond/default/sphinx/default{c5611aea-446c-4893-aee3-384076ce6ff4}.jsgf \
-dict /tmp/kde-manuel//simond/default/sphinx/default{c5611aea-446c-4893-aee3-384076ce6ff4}.dic \
-samprate 16000
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2,000000e+00
-alpha 0.97 9,700000e-01
-ascale 20.0 2,000000e+01
-aw 1 1
-backtrace no no
-beam 1e-48 1,000000e-48
-bestpath yes yes
-bestpathlw 9.5 9,500000e+00
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict /tmp/kde-manuel//simond/default/sphinx/default{c5611aea-446c-4893-aee3-384076ce6ff4}.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1,000000e-08
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1,000000e-64
-fwdflatefwid 4 4
-fwdflatlw 8.5 8,500000e+00
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7,000000e-29
-fwdtree yes yes
-hmm /tmp/kde-manuel//simond/default/sphinx/
-input_endian little little
-jsgf /tmp/kde-manuel//simond/default/sphinx/default{c5611aea-446c-4893-aee3-384076ce6ff4}.jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm
-lmctl
-lmname default default
-logbase 1.0001 1,000100e+00
-logfn
-logspec no no
-lowerf 133.33334 1,333333e+02
-lpbeam 1e-40 1,000000e-40
-lponlybeam 7e-29 7,000000e-29
-lw 6.5 6,500000e+00
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-min_endfr 0 0
-mixw
-mixwfloor 0.0000001 1,000000e-07
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1,000000e+00
-pbeam 1e-48 1,000000e-48
-pip 1.0 1,000000e+00
-pl_beam 1e-10 1,000000e-10
-pl_pbeam 1e-5 1,000000e-05
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1,600000e+04
-seed -1 -1
-sendump
-senlogdir
-senmgau
-silprob 0.005 5,000000e-03
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1,000000e-04
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6,855498e+03
-usewdphones no no
-uw 1.0 1,000000e+00
-var
-varfloor 0.0001 1,000000e-04
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7,000000e-29
-wip 0.65 6,500000e-01
-wlen 0.025625 2,562500e-02
INFO: cmd_ln.c(691): Parsing command line:
\
-lowerf 130 \
-upperf 6800 \
-nfilt 25 \
-transform dct \
-lifter 22 \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-varnorm no
Current configuration:
[NAME] [DEFLT] [VALUE]
-agc none none
-agcthresh 2.0 2,000000e+00
-alpha 0.97 9,700000e-01
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda /tmp/kde-manuel//simond/default/sphinx//feature_transform
-ldadim 0 0
-lifter 0 22
-logspec no no
-lowerf 133.33334 1,300000e+02
-ncep 13 13
-nfft 512 512
-nfilt 40 25
-remove_dc no no
-round_filters yes yes
-samprate 16000 1,600000e+04
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy dct
-unit_area yes yes
-upperf 6855.4976 6,800000e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2,562500e-02
INFO: acmod.c(246): Parsed model-specific feature parameters from /tmp/kde-manuel//simond/default/sphinx//feat.params
INFO: feat.c(713): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean[0]= 12,00, mean[1..12]= 0.0
INFO: acmod.c(157): Reading linear feature transformation from /tmp/kde-manuel//simond/default/sphinx//feature_transform
INFO: mdef.c(517): Reading model definition: /tmp/kde-manuel//simond/default/sphinx//mdef
INFO: bin_mdef.c(179): Allocating 60978 * 8 bytes (476 KiB) for CD tree
INFO: tmat.c(205): Reading HMM transition probability matrices: /tmp/kde-manuel//simond/default/sphinx//transition_matrices
INFO: acmod.c(121): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//means
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//variances
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(354): 6598 variance values floored
INFO: acmod.c(123): Attempting to use PTHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//means
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//variances
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(354): 6598 variance values floored
INFO: ptm_mgau.c(792): Number of codebooks exceeds 256: 4177
INFO: acmod.c(125): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//means
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /tmp/kde-manuel//simond/default/sphinx//variances
INFO: ms_gauden.c(292): 4177 codebook, 1 feature, size:
INFO: ms_gauden.c(294): 16x32
INFO: ms_gauden.c(354): 6598 variance values floored
INFO: ms_senone.c(149): Reading senone mixture weights: /tmp/kde-manuel//simond/default/sphinx//mixture_weights
INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(207): Not transposing mixture weights in memory
INFO: ms_senone.c(266): Read mixture weights for 4177 senones: 1 features x 16 codewords
INFO: ms_senone.c(320): Mapping senones to individual codebooks
INFO: ms_mgau.c(141): The value of topn: 4
INFO: dict.c(317): Allocating 4111 * 32 bytes (128 KiB) for word entries
INFO: dict.c(332): Reading main dictionary: /tmp/kde-manuel//simond/default/sphinx/default{c5611aea-446c-4893-aee3-384076ce6ff4}.dic
ERROR: "dict.c", line 193: Line 1: Phone 'gls' is mising in the acoustic model; word 'Abbrechen' ignored
ERROR: "dict.c", line 193: Line 2: Phone 'a' is mising in the acoustic model; word 'Alle' ignored
ERROR: "dict.c", line 193: Line 3: Phone 'gls' is mising in the acoustic model; word 'anzeigen' ignored
ERROR: "dict.c", line 193: Line 4: Phone 'aU' is mising in the acoustic model; word 'Auswählen' ignored
ERROR: "dict.c", line 193: Line 5: Phone 'f' is mising in the acoustic model; word 'Fenster' ignored
ERROR: "dict.c", line 193: Line 6: Phone 'l' is mising in the acoustic model; word 'Links' ignored
ERROR: "dict.c", line 193: Line 7: Phone 'n' is mising in the acoustic model; word 'Nächstes' ignored
ERROR: "dict.c", line 193: Line 8: Phone 'gls' is mising in the acoustic model; word 'Rauf' ignored
ERROR: "dict.c", line 193: Line 9: Phone 'E' is mising in the acoustic model; word 'Rechts' ignored
ERROR: "dict.c", line 193: Line 10: Phone 'gls' is mising in the acoustic model; word 'Runter' ignored
ERROR: "dict.c", line 193: Line 11: Phone 'l' is mising in the acoustic model; word 'schließen' ignored
ERROR: "dict.c", line 193: Line 12: Phone 'f' is mising in the acoustic model; word 'Vorheriges' ignored
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(335): 0 words read
INFO: dict.c(341): Reading filler dictionary: /tmp/kde-manuel//simond/default/sphinx//noisedict
INFO: dict.c(211): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(344): 3 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(404): Allocating 59^3 * 2 bytes (401 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 84016 bytes (82 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 84016 bytes (82 KiB) for single-phone word triphones
INFO: fsg_search.c(145): FSG(beam: -1080, pbeam: -1080, wbeam: -634; wip: -26, pip: 0)
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule:
INFO: jsgf.c(581): Defined rule: PUBLIC
INFO: fsg_model.c(215): Computing transitive closure for null transitions
INFO: fsg_model.c(270): 37 null transitions added
ERROR: "fsg_search.c", line 334: The word 'Fenster' is missing in the dictionary
Interestingly, the word 'Fenster' is available in the shadow dict and also in the active dict, so I don't get what it is complaining about...
The missing phonemes seem really bad, but here too I can't believe that the phoneme 'a' is not in the model; it is very hard to speak German without it.
So what's next?
Can I add the phonemes?
I would say the quality of the model is not very good if the phoneme 'a' is not included. As for the 'Fenster' business, I just don't understand why it is in the dicts but not recognised...
So yeah, this is the status on that front. Meanwhile I'm recording the text of the book :-D
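Going by the log, the two errors look connected: dict.c ignored 'Fenster' because of a missing phone and reports "0 words read", so by the time fsg_search.c runs, the word is no longer in the loaded dictionary. To see which words are affected, and by which phones, a comparison like the following Python sketch may help. It assumes the usual one-entry-per-line dictionary layout (word followed by its phones, separated by whitespace) and that the model's phone list has been obtained separately. The function names and the sample data are made up for illustration.

```python
def dictionary_phones(dict_lines):
    """Map each word to its phone list (line format: WORD PH1 PH2 ...)."""
    entries = {}
    for line in dict_lines:
        parts = line.split()
        if len(parts) >= 2:  # skip blank lines and words without phones
            entries[parts[0]] = parts[1:]
    return entries


def words_with_unknown_phones(dict_lines, model_phones):
    """Return {word: [phones not in model_phones]} for affected words.

    The comparison is exact and case-sensitive.
    """
    known = set(model_phones)
    problems = {}
    for word, phones in dictionary_phones(dict_lines).items():
        unknown = [p for p in phones if p not in known]
        if unknown:
            problems[word] = unknown
    return problems


# Hypothetical sample data, not taken from the real model or dictionary.
sample_dict = ["Fenster f E n s t 6", "Alle a l @"]
sample_model_phones = ["f", "E", "n", "s", "t"]
print(words_with_unknown_phones(sample_dict, sample_model_phones))
```

Since even basic phones such as 'f', 'l' and 'n' are reported as missing, it may be worth checking whether the dictionary and the model simply use different names for the same phones (case or notation) before concluding that the model lacks them.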