New 375k word 260 hours german models available

User: guenter
Date: 6/23/2018 5:24 pm

Views: 8510
Rating: 3

The latest 20180611 builds of the German models were trained on 260 hours of training material and, thanks to IPA extraction from the German Wiktionary, now cover a dictionary of more than 375,000 entries.

You can find download links to all our models and dicts here:

https://github.com/gooofy/zamia-speech#download

WER results for these models are not comparable to those of previous releases. From now on we measure WER only on speakers who are not in the training set, and we have also tried to make the language model more neutral (i.e. not over-represent prompts from the training material). The WER results should therefore give a more realistic assessment of the performance you can expect from our models without adaptation.

WER for the Kaldi models is 6.23% for the large model and 7.49% for the embedded model.

WER for the continuous CMU Sphinx model is 29%.
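For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate can be computed for a single utterance: word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. This is not the scoring code used by zamia-speech or Kaldi; the function name `word_error_rate` and the example sentences are made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper only, not part of zamia-speech or py-kaldi-asr.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One reference word is dropped, so 1 error over 4 reference words.
    print(word_error_rate("das ist ein test", "das ist test"))
```

A real evaluation sums the errors and the reference word counts over the whole test set before dividing, rather than averaging per-utterance rates.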

We have also been quite busy cleaning up our scripts and documentation, so it should be easier to understand what we are doing here. The models come complete with example scripts and pre-compiled binary packages for various platforms. More information on that can be found in our getting started guide here:

https://github.com/gooofy/zamia-speech#get-started-with-our-pre-trained-models

Please note that we have changed the tarball format of our models significantly, so you will have to use the latest py-kaldi-asr wrappers (0.3.1) with these models. The new tarball format allows for model adaptation:

https://github.com/gooofy/zamia-speech#model-adaptation

It also allows for automatic segmentation and transcript alignment of long audio recordings (e.g. LibriVox audiobooks):

https://github.com/gooofy/zamia-speech#audiobook-segmentation-and-transcription-kaldi

Comments, suggestions and contributions are very welcome. For more information about the zamia-speech project, please visit http://zamia-speech.org/

Re: New 375k word 260 hours german models available

User: guenter
Date: 9/5/2018 3:26 pm

Views: 484
Rating: 1

The latest German Kaldi ASR factorized TDNN model from http://zamia-speech.org looks quite good:

4.25% WER (previous models: 6.23% WER tdnn_sp, 7.49% WER tdnn_250).
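To put these numbers in perspective, the relative WER reduction can be worked out from the figures quoted above. This is just arithmetic on the three percentages in this post; the helper name `relative_wer_reduction` is made up for illustration.

```python
def relative_wer_reduction(old_wer, new_wer):
    """Fraction by which the WER dropped, relative to the old value."""
    if old_wer <= 0:
        raise ValueError("old_wer must be positive")
    return (old_wer - new_wer) / old_wer


if __name__ == "__main__":
    new_wer = 4.25  # factorized TDNN model, in percent
    for name, old_wer in (("tdnn_sp", 6.23), ("tdnn_250", 7.49)):
        reduction = 100.0 * relative_wer_reduction(old_wer, new_wer)
        print("%s: %.1f%% relative WER reduction" % (name, reduction))
```

That works out to roughly 31.8% fewer word errors than tdnn_sp and roughly 43.3% fewer than tdnn_250.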

Download here:

http://goofy.zamia.org/zamia-speech/asr-models/

Re: New 375k word 260 hours german models available

User: mrsmith
Date: 10/24/2018 10:38 am

Views: 61
Rating: 0

Hello guenter!

Thanks a lot for your work! I have just one question: when I try to use the latest German model with zamia-speech (using the demo script kaldi_decode_wav.py), I always get an error like this:

=================
Traceback (most recent call last):
  File "kaldi_decode_wav.py", line 60, in <module>
    kaldi_model = KaldiNNet3OnlineModel (options.modeldir, acoustic_scale=1.0, beam=7.0, frame_subsampling_factor=3)
  File "kaldiasr/nnet3.pyx", line 134, in kaldiasr.nnet3.KaldiNNet3OnlineModel.__cinit__ (kaldiasr/nnet3.cpp:3549)
RuntimeError
==================

Do you know whether any additional steps are needed to make this model work with zamia-speech?

Thank you in advance!

Re: New 375k word 260 hours german models available

User: guenter
Date: 10/27/2018 4:44 pm

Views: 6
Rating: 1

Hello Mr. Smith,

I suspect you're using the Debian packages from the zamia-speech.org repositories. Up to a few hours ago these contained Kaldi 5.3, which was too old to support tdnn_f models. I have now uploaded Kaldi 5.4 Debian packages, which should run the new models fine. The new tdnn_f models are also included in the zamia-speech Debian model packages, so you should find them installed in /opt/kaldi/models if you're using those.

Cheers,

Guenter

Re: New 375k word 260 hours german models available

User: mrsmith
Date: 10/28/2018 7:51 am

Views: 3105
Rating: 0

Hello Guenter!

Yes, it is working perfectly now! The accuracy of this model is really amazing!

Thank you so much!
