Speech Recognition Engines

Speaker Adaptation with HTK ...
User: frenchfrog
Date: 8/13/2009 2:38 am
Views: 14658
Rating: 6

Hi everybody,

I would like to evaluate speaker adaptation on a German digits corpus. Using HTK, I have trained a monophone system on another German corpus, and it works quite well in terms of accuracy on the digits corpus. But I cannot get the adaptation process (MLLR) to run on this corpus ...

So I have multiple questions:

- In the docs, most of the speaker adaptation examples use triphone models ... I suppose that adapting context-independent monophone acoustic models is also possible?

- If yes, here are the steps that have been followed (following htk doc):

  1- training of a 64-mixture monophone system (43 German phones - 3 states per phone - 64 mixtures per state)
         HERest.exe -B -D -T 4 -L new-seg -S mfcc-list-all.scp -s statphones -H .\hmm01\hmmdefs -M hmm02\ listphones.txt

  2- Tree clustering using following command:
        HHEd.exe -B -T 4 -H .\hmm02a\hmmdefs -M classes regtree.hed listphones.txt
        where regtree.hed is:
        LS "statsphone"
        RC 32 "rtree" {sil.state[2-4].mix}
        in the class directory I have a file named global
        ~b ``global''
        <MMFIDMASK> *
        <PARAMETERS> MIXBASE
        <NUMCLASSES> 1
        <CLASS> 1 {*.state[2-4].mix[1-64]}

        This creates 3 different files, which seem quite OK: rtree.base, rtree.rtree and hmmdefs in the classes directory

HHEd
 43/43 Models Loaded [5 states max, 64 mixes max]

LS statsphone
 Loading state occupation stats
  Stats loaded for 43 models
  Mean Occupation Count = 59334.495429

RC 32 rtree
 Building regression tree with 32 terminals
Creating regression class tree with ident rtree.tree and baseclass rtree.base
 Models
  sil
  1 models in itemlist
          sil.state[2].stream[1]
          sil.state[3].stream[1]
          sil.state[4].stream[1]
Splitting Node 3, score 1.007575e+008
Splitting Node 5, score 6.008558e+007
Splitting Node 6, score 3.361824e+007
Splitting Node 2, score 2.519530e+007
Splitting Node 4, score 2.258807e+007
Splitting Node 7, score 1.867613e+007
Splitting Node 8, score 1.807741e+007
Splitting Node 9, score 1.189350e+007
Splitting Node 13, score 1.053742e+007
Splitting Node 14, score 9.306629e+006
Splitting Node 12, score 8.674780e+006
Splitting Node 16, score 7.772738e+006
Splitting Node 17, score 7.701269e+006
Splitting Node 20, score 7.331825e+006
Splitting Node 15, score 7.193971e+006
Splitting Node 19, score 5.830185e+006
Splitting Node 30, score 4.630366e+006
Splitting Node 18, score 4.581620e+006
Splitting Node 22, score 4.436380e+006
Splitting Node 25, score 4.378486e+006
Splitting Node 29, score 3.968522e+006
Splitting Node 27, score 3.646834e+006
Splitting Node 23, score 3.525451e+006
Splitting Node 32, score 3.492046e+006
Splitting Node 26, score 3.205845e+006
Splitting Node 37, score 3.171115e+006
Splitting Node 35, score 2.855839e+006
Splitting Node 24, score 2.766869e+006
Splitting Node 33, score 2.633669e+006
Splitting Node 38, score 2.627900e+006

Saving new HMM files ...
Edit Complete

    3- Adaptation
    HERest.exe -L adapt-lab -C config.global -S 001.scp -u a -H .\hmm02a\hmmdefs -J .\classes -K xform mllr list-phones.txt
  ERROR [+999]  Components missing from Base Class list (1548 8256)
  ERROR [+999]  BaseClass check failed
 FATAL ERROR - Terminating program HERest.exe

I looked up this error, and it seems to be caused by mixture components missing from the model. However, after checking, the model seems to contain the right number of phones with 64 mixtures each
    (running a command like cat .\hmm02a\hmmdefs | grep "MIXTURE" | wc gives a result of 8256, i.e. 43 phones * 3 states * 64 mixtures, as expected).
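
The same count can be broken down per model, which makes a short model easier to spot than a single grand total does. The following is only a minimal sketch: `count_mixtures` is a hypothetical helper, and the sample text is a toy fragment that keeps just the `~h`, `<STATE>` and `<MIXTURE>` lines of a model definition.

```python
import re
from collections import Counter

def count_mixtures(hmmdefs_text):
    """Count <MIXTURE> lines per ~h model in HTK-style model text."""
    counts = Counter()
    current = None
    for line in hmmdefs_text.splitlines():
        header = re.match(r'\s*~h\s+"([^"]+)"', line)
        if header:
            current = header.group(1)
        elif current is not None and re.match(r"\s*<MIXTURE>", line, re.IGNORECASE):
            counts[current] += 1
    return counts

# Toy fragment (assumption): real hmmdefs also carry means, variances, etc.
sample = '''~h "sil"
<BEGINHMM>
<NUMSTATES> 5
<STATE> 2
<NUMMIXES> 2
<MIXTURE> 1 0.6
<MIXTURE> 2 0.4
<STATE> 3
<NUMMIXES> 2
<MIXTURE> 1 0.5
<MIXTURE> 2 0.5
<ENDHMM>
'''

per_model = count_mixtures(sample)
expected_total = 43 * 3 * 64  # phones * emitting states * mixtures per state
print(dict(per_model), expected_total)
```

On a real system, any model whose count differs from 3 * 64 = 192 would be the first place to look.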

I have to say that I don't understand what I'm doing wrong here ... Any help would be useful here.

Thanks for your help.

Best Regards.

--- (Edited on 8/13/2009 2:38 am [GMT-0500] by frenchfrog) ---

Re: Speaker Adaptation with HTK ...
User: nsh
Date: 8/13/2009 5:56 am
Views: 109
Rating: 6

Since 1548 = 43 * 3 * 12, I really suspect that you first followed the HTK Book and made the global class with 12 mixtures only. Then you probably updated the global file but didn't rebuild something. Can you check your class tree once again, please?
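
The arithmetic behind this guess can be written down as a small check. This is an illustrative sketch only; `mixtures_per_state` is a hypothetical helper, and it assumes every model has the same number of emitting states and mixtures.

```python
def mixtures_per_state(total_components, num_models, emitting_states):
    """Return the mixtures per state implied by a component count, or None
    if the count does not divide evenly across models and states."""
    states = num_models * emitting_states
    if states <= 0 or total_components % states != 0:
        return None
    return total_components // states

# The two numbers from the error message "(1548 8256)":
print(mixtures_per_state(1548, 43, 3))  # 12
print(mixtures_per_state(8256, 43, 3))  # 64
```

So the base class covered 12 components per state, while the model set holds 64.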

--- (Edited on 8/13/2009 5:56 am [GMT-0500] by nsh) ---

Re: Speaker Adaptation with HTK ...
User: Visitor
Date: 8/13/2009 7:38 am
Views: 169
Rating: 7

Hi and thanks for your help.

That was my first guess (only having 12 instead of 64 mixtures), because I ran plenty of tests. Following your advice, I created a new directory and relaunched the HHEd command, so you were right in your assumption.

But launching the HERest command now gives a different error message:
HERest.exe -L adapt-lab -C config.global -S 001.scp -u a -H .\hmm02a\hmmdefs -K xform mllr -J classes2 -T 4 list-phones
HMM Def Error: <MMFIDMASK> symbol expected in GetBaseClass at line 5/col 35/char 102 in classes2/global
  ERROR [+7050]  HMError:

The global file in classes2 contains:

~b ``global''
<MMFIDMASK> *
<PARAMETERS> MIXBASE
<NUMCLASSES> 1
<CLASS> 1 {*.state[2-4].mix[1-64]} 

Any idea for this last step ?

Thanks.

 

--- (Edited on 8/13/2009 7:38 am [GMT-0500] by Visitor) ---

Re: Speaker Adaptation with HTK ...
User: frenchfrog
Date: 8/13/2009 8:42 am
Views: 90
Rating: 6

OK, regarding the last point ... there was a problem with '' and " in my global file (the macro name was written as ``global'' instead of "global") ... stupid error
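
A quick way to catch this kind of quoting slip is to normalize the quotes before handing the file to HTK. The sketch below is an assumption-laden illustration, not part of HTK: `fix_macro_quotes` is a hypothetical helper that rewrites TeX-style ``name'' and curly quotes to plain double quotes.

```python
import re

def fix_macro_quotes(text):
    """Replace ``name'' and curly double quotes with plain "name"."""
    text = re.sub(r"``([^`']*)''", r'"\1"', text)
    return text.replace("\u201c", '"').replace("\u201d", '"')

broken = (
    "~b ``global''\n"
    "<MMFIDMASK> *\n"
    "<PARAMETERS> MIXBASE\n"
    "<NUMCLASSES> 1\n"
    "<CLASS> 1 {*.state[2-4].mix[1-64]}\n"
)

print(fix_macro_quotes(broken).splitlines()[0])  # ~b "global"
```

Only the quoting changes; the rest of the file is passed through untouched.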

Now it's just a matter of finding the right mask for my mfcc files ...


HERest.exe -L .\adapt-lab -C config.global -S 001.scp -u a -H .\hmm02a\hmmdefs -K xform mllr -J classes2 -h '*\%%%_????????.mfcc' list-phones1

 Using baseclass macro "global" from file classes2/global
Attached 8256 XFormInfo structures
Attached 8256 RegAcc structures
Pruning-Off

 ERROR [+999]  Output xform mask '*\%%%_????????.mfcc' does not match filename z:\digits-htk\adapt-mfcc\001_stdc0001.mfcc
 FATAL ERROR - Terminating program HERest.exe

 

--- (Edited on 8/13/2009 8:42 am [GMT-0500] by frenchfrog) ---

Re: Speaker Adaptation with HTK ...
User: nsh
Date: 8/13/2009 4:21 pm
Views: 88
Rating: 5

Yes, mask matching has always been a pain for me as well. I think you can work it out with a few experiments/adjustments. Here, in HTK 3.4.1, it works as expected: this mask matches this filename.

--- (Edited on 8/13/2009 4:21 pm [GMT-0500] by nsh) ---

Re: Speaker Adaptation with HTK ...
User: frenchfrog
Date: 8/14/2009 4:30 am
Views: 579
Rating: 11

Hello

Got it working ... thank you very much for your help.

Best Regards.

--- (Edited on 8/14/2009 4:30 am [GMT-0500] by frenchfrog) ---

Re: Speaker Adaptation with HTK ...
User: Sobh
Date: 12/30/2012 11:53 pm
Views: 269
Rating: 5

The mask works fine on Windows if we use " instead of '
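
To see why the quote character matters, it helps to emulate the mask match. The sketch below assumes the usual reading of HTK masks (`*` matches any run of characters, `?` matches one character, `%` matches one character and keeps it as part of the speaker name); `mask_to_regex` and `speaker_from_mask` are hypothetical helpers, not HTK functions. If the shell passes the single quotes through as part of the argument, they become literal characters in the mask and the match fails.

```python
import re

def mask_to_regex(mask):
    """Translate an HTK-style mask into a compiled regular expression."""
    parts = []
    for ch in mask:
        if ch == "*":
            parts.append(".*")      # any run of characters
        elif ch == "?":
            parts.append(".")       # exactly one character, discarded
        elif ch == "%":
            parts.append("(.)")     # exactly one character, kept
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z", re.DOTALL)

def speaker_from_mask(mask, filename):
    """Return the characters captured by '%', or None if the mask fails."""
    match = mask_to_regex(mask).match(filename)
    if match is None:
        return None
    return "".join(match.groups())

filename = r"z:\digits-htk\adapt-mfcc\001_stdc0001.mfcc"
print(speaker_from_mask(r"*\%%%_????????.mfcc", filename))    # 001
print(speaker_from_mask(r"'*\%%%_????????.mfcc'", filename))  # None
```

The first call corresponds to the mask arriving without quotes; the second to the single quotes arriving as part of the mask.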

--- (Edited on 12/30/2012 11:53 pm [GMT-0600] by Visitor) ---

Re: Speaker Adaptation with HTK ...
User: sat
Date: 2/22/2013 6:04 am
Views: 294
Rating: 2

I am trying to implement MAP adaptation in htk, but I can not find any example.


Is it enough to set the option "-u p" in HERest? Does MAP modify directly the models or does it generate transforms like MLLR? Because I can not find any transform.


Moreover.. to test the adaptation, can I use HVite normally on the model modified by MAP or should I use the transforms?
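
For background on the question: in the textbook formulation, MAP adaptation of a Gaussian mean interpolates between the prior (speaker-independent) mean and the mean of the adaptation data, weighted by the occupation count and a prior weight tau. The result is an updated mean rather than a separate transform matrix. The sketch below only illustrates that formula; `map_mean` and its parameter names are hypothetical, and whether a given HTK setup behaves this way should be checked against the HTK Book.

```python
def map_mean(prior_mean, adapt_mean, occupancy, tau):
    """Interpolate between the prior mean and the adaptation-data mean.

    occupancy: soft count of adaptation frames assigned to this Gaussian.
    tau: prior weight (> 0); larger values keep the result nearer the prior.
    """
    weight = occupancy / (occupancy + tau)
    return [weight * a + (1.0 - weight) * p
            for p, a in zip(prior_mean, adapt_mean)]

# With as many adaptation frames as tau, the result sits halfway.
print(map_mean([0.0, 2.0], [10.0, 4.0], occupancy=10.0, tau=10.0))
# With no adaptation data, the prior mean is returned unchanged.
print(map_mean([0.0, 2.0], [10.0, 4.0], occupancy=0.0, tau=10.0))
```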

Thank you very much in advance!!

--- (Edited on 2/22/2013 6:04 am [GMT-0600] by Visitor) ---

Re: Speaker Adaptation with HTK ...
User: Amber
Date: 7/3/2013 11:29 pm
Views: 768
Rating: 1

I have a similar problem while trying HLDA.

The following is the syntax of the command I am running to get HLDA transform
HERest -C config -C config.hlda -S train.scp -I wintri.mlf -H hmm15/macros -u stw -H hmm15/hmmdefs -K hmm16 -M hmm16 tiedlist

my config.hlda is as follows

 HADAPT:TRANSKIND              = SEMIT
 HADAPT:USEBIAS                = FALSE
 HADAPT:BASECLASS              = global
 HADAPT:SPLITTHRESH            = 0.0
 HADAPT:SEMITIEDMACRO          = HLDA
 HADAPT:NUMNUISANCEDIM         = 5
 HADAPT:SEMITIED2INPUTXFORM    = TRUE
 HADAPT:MAXXFORMITER           = 100
 HADAPT:MAXSEMITIEDITER        = 20
 HADAPT:TRACE                  = 61
 HMODEL:TRACE                  = 512

and global file

  ~b "global"
  <MMFIDMASK> *
  <PARAMETERS> MIXBASE
  <NUMCLASSES> 1
    <CLASS> 1  {*.state[2-4].mix[1-32]}

the error I am getting is
Pruning-Off

HMM Def Error: <MMFIDMASK> symbol expected in GetBaseClass at line 8/col 1/char 108 in global
  ERROR [+7050]  HMError:

Segmentation fault
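
The error points at line 8, while the listing above shows only five lines, so one thing worth checking is whether the file on disk holds more than what was pasted. The sketch below is a hypothetical inspection helper, not an HTK tool; it flags non-ASCII characters, TeX-style quotes, and non-empty lines that follow the `<CLASS>` entries. The sample text, including the stray line, is invented for illustration.

```python
def inspect_baseclass(text):
    """Return (line_number, issue, repr_of_line) tuples worth a second look."""
    findings = []
    seen_class = False
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if any(ord(ch) > 127 for ch in line):
            findings.append((number, "non-ASCII character", repr(line)))
        if "``" in line or "''" in line:
            findings.append((number, "TeX-style quotes", repr(line)))
        if seen_class and stripped and not stripped.startswith("<CLASS>"):
            findings.append((number, "content after <CLASS> entries", repr(line)))
        if stripped.startswith("<CLASS>"):
            seen_class = True
    return findings

# Invented sample: the five expected lines, two blanks, then stray text.
sample = (
    '~b "global"\n'
    "<MMFIDMASK> *\n"
    "<PARAMETERS> MIXBASE\n"
    "<NUMCLASSES> 1\n"
    "<CLASS> 1 {*.state[2-4].mix[1-32]}\n"
    "\n"
    "\n"
    "stray text\n"
)

for finding in inspect_baseclass(sample):
    print(finding)
```

Printing the `repr` of each flagged line also exposes tabs or carriage returns that an editor would hide.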

 

 

Could anyone please help me out with this?

--- (Edited on 7/3/2013 11:29 pm [GMT-0500] by Visitor) ---

Re: Speaker Adaptation with HTK ...
User: Pardhu
Date: 3/18/2015 1:32 am
Views: 509
Rating: 2

What is a stats file? What exactly does it contain?

--- (Edited on 3/18/2015 1:32 am [GMT-0500] by Visitor) ---
