VoxForge
I just uploaded the latest release (20161117) of the German VoxForge models to
http://goofy.zamia.org/voxforge/de/
Beyond what previous releases covered, this one includes most of german-speechdata-package-v2.tar.gz, for a total of over 200 hours of material.
Also, there are new model types in this release:
stats:
30684 lexicon entries.
total duration of all good submissions: 214:55:02
CMU Sphinx Models:
cmusphinx cont model: SENTENCE ERROR: 25.4% (1523/6006) WORD ERROR RATE: 5.7% (4147/73022)
cmusphinx ptm model: SENTENCE ERROR: 24.7% (1481/6006) WORD ERROR RATE: 7.1% (5199/73022)
kaldi gmm models:
%WER 8.34 [ 6090 / 73007, 994 ins, 1306 del, 3790 sub ] exp/tri3b/decode.si/wer_16
%WER 7.39 [ 5397 / 73007, 634 ins, 1611 del, 3152 sub ] exp/tri2b/decode/wer_18
%WER 5.70 [ 4165 / 73007, 397 ins, 1204 del, 2564 sub ] exp/tri3b_mpe/decode/wer_17
%WER 5.46 [ 3989 / 73007, 384 ins, 1110 del, 2495 sub ] exp/tri3b_mmi/decode/wer_13
%WER 5.29 [ 3862 / 73007, 475 ins, 1044 del, 2343 sub ] exp/tri2b_mpe/decode/wer_16
%WER 5.29 [ 3861 / 73007, 324 ins, 1207 del, 2330 sub ] exp/tri3b_mmi_b0.05/decode/wer_15
%WER 4.95 [ 3611 / 73007, 646 ins, 924 del, 2041 sub ] exp/tri3b/decode/wer_17
%WER 4.23 [ 3090 / 73007, 352 ins, 767 del, 1971 sub ] exp/tri2b_mmi/decode/wer_12
%WER 4.16 [ 3035 / 73007, 333 ins, 804 del, 1898 sub ] exp/tri2b_mmi_b0.05/decode/wer_13
kaldi nnet3 models:
%WER 1.52 [ 1112 / 73007, 192 ins, 314 del, 606 sub ] exp/nnet3/nnet_tdnn_a/decode/wer_11_0.0
%WER 1.33 [ 972 / 73007, 175 ins, 252 del, 545 sub ] exp/nnet3/lstm_ld5/decode/wer_11_0.0
sequitur g2p model:
total: 3069 strings, 36060 symbols
successfully translated: 3068 (99.97%) strings, 36060 (100.00%) symbols
string errors: 1261 (41.10%)
symbol errors: 2831 (7.85%)
insertions: 969 (2.69%)
deletions: 962 (2.67%)
substitutions: 900 (2.50%)
translation failed: 1 (0.03%) strings, 0 (0.00%) symbols
total string errors: 1262 (41.12%)
total symbol errors: 2831 (7.85%)
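For anyone comparing the Kaldi numbers above: each %WER line gives the error count as insertions + deletions + substitutions, divided by the number of reference words. Here is a minimal sketch that parses one such line and rechecks the arithmetic, assuming the line layout shown in this post. The helper name `parse_wer_line` is made up for illustration and is not part of Kaldi.

```python
import re

# Hypothetical helper (not part of Kaldi): matches one "%WER ..." summary line
# in the layout quoted in this post.
WER_LINE = re.compile(
    r"%WER\s+(?P<wer>[\d.]+)\s+\[\s*(?P<errors>\d+)\s*/\s*(?P<words>\d+),\s*"
    r"(?P<ins>\d+)\s+ins,\s*(?P<dele>\d+)\s+del,\s*(?P<sub>\d+)\s+sub\s*\]\s+"
    r"(?P<path>\S+)"
)


def parse_wer_line(line):
    """Parse a %WER line and recompute the rate from its raw counts."""
    m = WER_LINE.search(line)
    if m is None:
        raise ValueError("not a %WER line: " + line)
    ins, dele, sub = int(m["ins"]), int(m["dele"]), int(m["sub"])
    errors, words = int(m["errors"]), int(m["words"])
    return {
        "path": m["path"],
        "reported_wer": float(m["wer"]),
        "errors": errors,
        "words": words,
        # WER in percent = (ins + del + sub) / reference words * 100
        "computed_wer": round(100.0 * (ins + dele + sub) / words, 2),
        # True when the three error types add up to the reported total
        "consistent": ins + dele + sub == errors,
    }


line = ("%WER 1.33 [ 972 / 73007, 175 ins, 252 del, 545 sub ] "
        "exp/nnet3/lstm_ld5/decode/wer_11_0.0")
result = parse_wer_line(line)
print(result["path"], result["computed_wer"], result["consistent"])
```

Sorting the parsed dictionaries by `computed_wer` would reproduce the worst-to-best ordering used in the listings.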
r20170420 is now available:
http://goofy.zamia.org/voxforge/de/
Besides covering the latest VoxForge submissions, this is mainly a bugfix release. The CMU Sphinx models are now based on the same SRILM language model as the Kaldi models.
stats:
kaldi nnet3:
%WER 1.67 [ 1199 / 71994, 211 ins, 319 del, 669 sub ] exp/nnet3/nnet_tdnn_a/decode/wer_12_0.0
kaldi gmm:
%WER 8.35 [ 6011 / 71994, 975 ins, 1224 del, 3812 sub ] exp/tri3b/decode.si/wer_16_0.0
%WER 7.54 [ 5426 / 71994, 663 ins, 1613 del, 3150 sub ] exp/tri2b/decode/wer_16_0.0
%WER 5.63 [ 4053 / 71994, 422 ins, 1038 del, 2593 sub ] exp/tri3b_mpe/decode/wer_17_0.0
%WER 5.16 [ 3718 / 71994, 702 ins, 870 del, 2146 sub ] exp/tri3b/decode/wer_17_0.0
%WER 5.16 [ 3716 / 71994, 364 ins, 979 del, 2373 sub ] exp/tri3b_mmi/decode/wer_13_0.0
%WER 5.14 [ 3704 / 71994, 439 ins, 1021 del, 2244 sub ] exp/tri2b_mpe/decode/wer_17_0.0
%WER 5.05 [ 3639 / 71994, 366 ins, 977 del, 2296 sub ] exp/tri3b_mmi_b0.05/decode/wer_13_0.0
%WER 4.18 [ 3006 / 71994, 277 ins, 830 del, 1899 sub ] exp/tri2b_mmi/decode/wer_13_0.0
%WER 4.07 [ 2933 / 71994, 275 ins, 797 del, 1861 sub ] exp/tri2b_mmi_b0.05/decode/wer_13_0.0
CMU Sphinx Cont:
SENTENCE ERROR: 28.0% (1689/6025) WORD ERROR RATE: 6.5% (4673/72017)
CMU Sphinx PTM:
SENTENCE ERROR: 29.3% (1768/6025) WORD ERROR RATE: 8.6% (6222/72017)
sequitur:
total: 3097 strings, 36758 symbols
successfully translated: 3097 (100.00%) strings, 36758 (100.00%) symbols
string errors: 1302 (42.04%)
symbol errors: 2915 (7.93%)
insertions: 1040 (2.83%)
deletions: 954 (2.60%)
substitutions: 921 (2.51%)
translation failed: 0 (0.00%) strings, 0 (0.00%) symbols
total string errors: 1302 (42.04%)
total symbol errors: 2915 (7.93%)
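The percentages in the sequitur summary follow directly from the raw counts: symbol errors are insertions plus deletions plus substitutions, and each rate is the count divided by the corresponding total. Here is a small sketch using the r20170420 figures quoted above. The variable names and the `pct` helper are illustrative only and are not part of sequitur.

```python
def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Counts copied from the r20170420 sequitur summary in this post.
total_strings, total_symbols = 3097, 36758
string_errors = 1302
insertions, deletions, substitutions = 1040, 954, 921

# Symbol errors are the sum of the three edit types.
symbol_errors = insertions + deletions + substitutions

string_error_rate = pct(string_errors, total_strings)
symbol_error_rate = pct(symbol_errors, total_symbols)

print("symbol errors:", symbol_errors)
print("string error rate: %.2f%%" % string_error_rate)
print("symbol error rate: %.2f%%" % symbol_error_rate)
```

Note that the string error rate is far higher than the symbol error rate because a single wrong phoneme marks the whole lexicon entry as an error.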