
Garbled characters (mojibake) in pocketsphinx
User: anz
Date: 4/24/2018 5:10 pm

I've been racking my brain over this. The pocketsphinx_batch command writes ÑÑ‚о в высокий офицер кгб у потянул both to the hyp file and to the terminal, whereas with pocketsphinx_continuous everything is fine. batch also recognized the goforward sample without any trouble. The source file was converted from wav to raw with sox. Here is the output:
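For reference, garbling of this shape is what UTF-8 Cyrillic looks like when some of its bytes are shown in a single-byte Western code page. The sketch below is only an illustration of that effect, not a diagnosis of what pocketsphinx_batch does internally; the choice of Windows-1252 is an assumption, and the sample string is simply the first two letters of "это".

```python
# UTF-8 bytes of the first two letters of the word "это"
raw = "эт".encode("utf-8")  # two bytes per Cyrillic letter

# Interpret the same bytes as Windows-1252 (an assumed code page).
# Byte 0x8D has no mapping in Python's cp1252 codec, so it is
# silently dropped here via errors="ignore".
garbled = raw.decode("cp1252", errors="ignore")
print(garbled)

# The round trip can be reversed only where no byte was lost:
# the second garbled pair still maps back to the letter "т".
restored = garbled[1:].encode("cp1252").decode("utf-8")
print(restored)
```

If the first character of a hypothesis comes out like this while the rest is readable, the text itself is probably valid UTF-8 apart from a few damaged bytes, which can be checked by looking at the hyp file as raw bytes rather than in the terminal.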

INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/zero_ru.cd_ptm_4000/feat.params

Current configuration:


-agc none none

-agcthresh 2.0 2.000000e+00


-allphone_ci no no

-alpha 0.97 9.700000e-01

-ascale 20.0 2.000000e+01

-aw 1 1

-backtrace no no

-beam 1e-48 1.000000e-48

-bestpath yes yes

-bestpathlw 9.5 9.500000e+00

-ceplen 13 13

-cmn live current

-cmninit 40,3,-1 11.64,0.15,-0.04,0.17,-0.40,-0.03,-0.50,-0.13,-0.33,-0.11,-0.19,-0.10,-0.24

-compallsen no no

-debug 0

-dict /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/ru.dic

-dictcase no no

-dither no yes

-doublebw no no

-ds 1 1


-feat 1s_c_d_dd s2_4x


-fillprob 1e-8 1.000000e-08

-frate 100 100


-fsgusealtpron yes yes

-fsgusefiller yes yes

-fwdflat yes yes

-fwdflatbeam 1e-64 1.000000e-64

-fwdflatefwid 4 4

-fwdflatlw 8.5 8.500000e+00

-fwdflatsfwin 25 25

-fwdflatwbeam 7e-29 7.000000e-29

-fwdtree yes yes

-hmm /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/zero_ru.cd_ptm_4000

-input_endian little little




-kws_delay 10 10

-kws_plp 1e-1 1.000000e-01

-kws_threshold 1 1.000000e+00

-latsize 5000 5000


-ldadim 0 0

-lifter 0 0

-lm /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/ru.lm



-logbase 1.0001 1.000100e+00


-logspec no no

-lowerf 133.33334 1.300000e+02

-lpbeam 1e-40 1.000000e-40

-lponlybeam 7e-29 7.000000e-29

-lw 6.5 6.500000e+00

-maxhmmpf 30000 30000

-maxwpf -1 -1




-min_endfr 0 0


-mixwfloor 0.0000001 1.000000e-07


-mmap yes yes

-ncep 13 13

-nfft 512 512

-nfilt 40 31

-nwpen 1.0 1.000000e+00

-pbeam 1e-48 1.000000e-48

-pip 1.0 1.000000e+00

-pl_beam 1e-10 1.000000e-10

-pl_pbeam 1e-10 1.000000e-10

-pl_pip 1.0 1.000000e+00

-pl_weight 3.0 3.000000e+00

-pl_window 5 5


-remove_dc no no

-remove_noise yes yes

-remove_silence yes yes

-round_filters yes yes

-samprate 16000 1.600000e+04

-seed -1 -1




-silprob 0.005 5.000000e-03

-smoothspec no no



-tmatfloor 0.0001 1.000000e-04

-topn 4 4

-topn_beam 0 0


-transform legacy legacy

-unit_area yes yes

-upperf 6855.4976 3.700000e+03

-uw 1.0 1.000000e+00

-vad_postspeech 50 50

-vad_prespeech 20 20

-vad_startspeech 10 10

-vad_threshold 2.0 2.000000e+00


-varfloor 0.0001 1.000000e-04

-varnorm no no

-verbose no no


-warp_type inverse_linear inverse_linear

-wbeam 7e-29 7.000000e-29

-wip 0.65 6.500000e-01

-wlen 0.025625 2.562500e-02


INFO: fe_interface.c(324): Using -1 as the seed.

INFO: feat.c(715): Initializing feature stream to type: 's2_4x', ceplen=13, CMN='batch', VARNORM='no', AGC='none'

INFO: mdef.c(518): Reading model definition: /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/zero_ru.cd_ptm_4000/mdef

INFO: bin_mdef.c(181): Allocating 145321 * 8 bytes (1135 KiB) for CD tree

INFO: tmat.c(149): Reading HMM transition probability matrices: /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/zero_ru.cd_ptm_4000/transition_matrices

INFO: acmod.c(113): Attempting to use PTM computation module

INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/zero_ru.cd_ptm_4000/means

INFO: ms_gauden.c(242): 53 codebook, 4 feature, size: 

INFO: ms_gauden.c(244):  64x12

INFO: ms_gauden.c(244):  64x24

INFO: ms_gauden.c(244):  64x3

INFO: ms_gauden.c(244):  64x12

INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/zero_ru.cd_ptm_4000/variances

INFO: ms_gauden.c(242): 53 codebook, 4 feature, size: 

INFO: ms_gauden.c(244):  64x12

INFO: ms_gauden.c(244):  64x24

INFO: ms_gauden.c(244):  64x3

INFO: ms_gauden.c(244):  64x12

INFO: ms_gauden.c(304): 273 variance values floored

INFO: ptm_mgau.c(476): Loading senones from dump file /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/zero_ru.cd_ptm_4000/sendump


INFO: ptm_mgau.c(563): Rows: 64, Columns: 4159

INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones

INFO: ptm_mgau.c(838): Maximum top-N: 4

INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0

INFO: dict.c(320): Allocating 520937 * 20 bytes (10174 KiB) for word entries

INFO: dict.c(333): Reading main dictionary: /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/ru.dic

INFO: dict.c(213): Dictionary size 516838, allocated 8787 KiB for strings, 8504 KiB for phones

INFO: dict.c(336): 516838 words read

INFO: dict.c(358): Reading filler dictionary: /home/mitya/Documents/sphinx/zero_ru_cont_8k_v3/zero_ru.cd_ptm_4000/noisedict

INFO: dict.c(213): Dictionary size 516841, allocated 0 KiB for strings, 0 KiB for phones

INFO: dict.c(361): 3 words read

INFO: dict2pid.c(396): Building PID tables for dictionary

INFO: dict2pid.c(406): Allocating 53^3 * 2 bytes (290 KiB) for word-initial triphones

INFO: dict2pid.c(132): Allocated 33920 bytes (33 KiB) for word-final triphones

INFO: dict2pid.c(196): Allocated 33920 bytes (33 KiB) for single-phone word triphones

INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format

INFO: ngram_model_trie.c(365): Header doesn't match

INFO: ngram_model_trie.c(177): Trying to read LM in arpa format

INFO: ngram_model_trie.c(193): LM of order 3

INFO: ngram_model_trie.c(195): #1-grams: 506961

INFO: ngram_model_trie.c(195): #2-grams: 7227984

INFO: ngram_model_trie.c(195): #3-grams: 4977339

INFO: lm_trie.c(474): Training quantizer

INFO: lm_trie.c(482): Building LM trie

INFO: ngram_search_fwdtree.c(74): Initializing search tree

INFO: ngram_search_fwdtree.c(101): 1245 unique initial diphones

INFO: ngram_search_fwdtree.c(186): Creating search channels

INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 914170

INFO: ngram_search_fwdtree.c(333): Created 1245 root, 914042 non-root channels, 89 single-phone words

INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25

INFO: batch.c(729): Decoding 'u'

INFO: cmn.c(133): CMN: 11.39 -0.24 -0.27 -0.31 -0.42 -0.25 -0.23 -0.23 -0.24 -0.15 -0.03 -0.13 -0.14 

INFO: ngram_search.c(459): Resized backpointer table to 10000 entries

INFO: ngram_search.c(467): Resized score stack to 200000 entries

INFO: ngram_search.c(467): Resized score stack to 400000 entries

INFO: ngram_search.c(459): Resized backpointer table to 20000 entries

INFO: ngram_search.c(467): Resized score stack to 800000 entries

INFO: ngram_search.c(459): Resized backpointer table to 40000 entries

INFO: ngram_search.c(467): Resized score stack to 1600000 entries

INFO: ngram_search.c(459): Resized backpointer table to 80000 entries

INFO: ngram_search.c(467): Resized score stack to 3200000 entries

INFO: ngram_search.c(459): Resized backpointer table to 160000 entries

INFO: ngram_search_fwdtree.c(949): cand_sf[] increased to 64 entries

INFO: ngram_search.c(467): Resized score stack to 6400000 entries

INFO: ngram_search.c(459): Resized backpointer table to 320000 entries

INFO: ngram_search.c(467): Resized score stack to 12800000 entries

INFO: ngram_search.c(459): Resized backpointer table to 640000 entries

INFO: ngram_search.c(467): Resized score stack to 25600000 entries

INFO: ngram_search_fwdtree.c(1550):   622622 words recognized (75/fr)

INFO: ngram_search_fwdtree.c(1552): 25231844 senones evaluated (3048/fr)

INFO: ngram_search_fwdtree.c(1556): 244194826 channels searched (29495/fr), 6952175 1st, 23513168 last

INFO: ngram_search_fwdtree.c(1559):  1123098 words for which last channels evaluated (135/fr)

INFO: ngram_search_fwdtree.c(1561): 15825342 candidate words for entering last phone (1911/fr)

INFO: ngram_search_fwdtree.c(1564): fwdtree 168.28 CPU 2.033 xRT

INFO: ngram_search_fwdtree.c(1567): fwdtree 169.52 wall 2.048 xRT

INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 25948 words

INFO: ngram_search_fwdflat.c(948):   327042 words recognized (40/fr)

INFO: ngram_search_fwdflat.c(950): 10701821 senones evaluated (1293/fr)

INFO: ngram_search_fwdflat.c(952): 49848898 channels searched (6021/fr)

INFO: ngram_search_fwdflat.c(954):  3160104 words searched (381/fr)

INFO: ngram_search_fwdflat.c(957):  2161224 word transitions (261/fr)

INFO: ngram_search_fwdflat.c(960): fwdflat 33.14 CPU 0.400 xRT

INFO: ngram_search_fwdflat.c(963): fwdflat 33.51 wall 0.405 xRT

INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.8230

INFO: ngram_search.c(1276): Eliminated 0 nodes before end node

INFO: ngram_search.c(1381): Lattice has 41989 nodes, 625937 links

INFO: ps_lattice.c(1380): Bestpath score: -358064

INFO: ps_lattice.c(1384): Normalizer P(O) = alpha(</s>:8230:8277) = -20544294

INFO: ps_lattice.c(1441): Joint P(O,S) = -22990236 P(S|O) = -2445942

INFO: ngram_search.c(872): bestpath 25.47 CPU 0.308 xRT

INFO: ngram_search.c(875): bestpath 25.64 wall 0.310 xRT

INFO: batch.c(761): u: 82.78 seconds speech, 226.89 seconds CPU, 228.67 seconds wall

INFO: batch.c(763): u: 2.74 xRT (CPU), 2.76 xRT (elapsed)

это в высокий офицер кгб у потянул вверх ф настолько в нам хорошо знакомо весною в к для наших л в политических подошёл к прикладных экивоков так что мало не покажется в ставку в которую сделал в человечество уже доведенное до в своей крайность в клане нигилизма в тихой мною с самодостаточного но в том и тит ливий ту половину конституировать существование посредством потеряю то быть вера субъекта для паф статье субъекта потечёт субъектов к торой был ли злые духи заключённых как над кгб в таком в муку в дпов в автобусах распахнулись хорошо втолкуйте ей вÑu done --------------------------------------

INFO: batch.c(778): TOTAL 82.78 seconds speech, 226.89 seconds CPU, 228.67 seconds wall

INFO: batch.c(780): AVERAGE 2.74 xRT (CPU), 2.76 xRT (elapsed)

INFO: ngram_search_fwdtree.c(429): TOTAL fwdtree 168.28 CPU 2.033 xRT

INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 169.52 wall 2.048 xRT

INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 33.14 CPU 0.400 xRT

INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 33.51 wall 0.405 xRT

INFO: ngram_search.c(303): TOTAL bestpath 25.47 CPU 0.308 xRT

INFO: ngram_search.c(306): TOTAL bestpath 25.64 wall 0.310 xRT

Any help would be appreciated.
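One way to narrow this down is to find exactly which bytes of the output are not valid UTF-8. The following is a minimal sketch, assuming the hypothesis text is meant to be UTF-8; the function name and the sample bytes are hypothetical, with the sample modeled on the tail of the hypothesis line in the log above, where a lone lead byte appears right before "u done".

```python
def invalid_utf8_offsets(data: bytes) -> list:
    """Return the byte offsets at which UTF-8 decoding fails."""
    offsets = []
    pos = 0
    while pos < len(data):
        try:
            data[pos:].decode("utf-8")
            break  # the remainder is valid UTF-8
        except UnicodeDecodeError as err:
            # err.start is relative to the slice, so shift it by pos
            offsets.append(pos + err.start)
            pos += err.start + 1  # skip the bad byte and keep scanning
    return offsets


# Hypothetical sample: the letter "в", then a lead byte 0xD1 with no
# continuation byte, then plain ASCII text.
sample = "в".encode("utf-8") + b"\xd1" + b"u done"
print(invalid_utf8_offsets(sample))

# For a real file (the path is an assumption):
# with open("out.hyp", "rb") as f:
#     print(invalid_utf8_offsets(f.read()))
```

An isolated lead byte such as 0xD1 would suggest that a two-byte character was cut in half somewhere, rather than that the whole output is in a different encoding.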
