Step 9 - Making Triphones from Monophones

Background 

In the dict file you created in Step 2, the pronunciation of a word was given by a series of phonemes (also called monophones - i.e. single phones). A triphone (i.e. a group of 3 phones) is declared by placing each phone "X" in its context: the "L" phone (i.e. the left-hand phone) precedes the "X" phone, and the "R" phone (i.e. the right-hand phone) follows it. The triphone is declared in the form "L-X+R". At the beginning or end of a word, where there is no left or right neighbour, that part of the declaration is simply left out.

Below is an example of the conversion to a triphone declaration of the word "TRANSLATE" (the first line shows the "monophone" declaration, and the second line shows the "triphone" declaration):

TRANSLATE [TRANSLATE] t r @ n s l e t
TRANSLATE [TRANSLATE] t+r t-r+@ r-@+n @-n+s n-s+l s-l+e l-e+t e-t
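
The conversion above can be sketched in a few lines of Python. This is only an illustration of the word-internal "L-X+R" naming rule, not how HTK implements it; the function name to_triphones is made up for this example.

```python
def to_triphones(phones):
    """Expand a word's monophones into word-internal triphone names (L-X+R)."""
    names = []
    last = len(phones) - 1
    for i, x in enumerate(phones):
        # No left context at the start of the word, no right context at the end.
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i < last else ""
        names.append(left + x + right)
    return names


print(" ".join(to_triphones("t r @ n s l e t".split())))
# t+r t-r+@ r-@+n @-n+s n-s+l s-l+e l-e+t e-t
```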

We are therefore moving to an improved level of recognition accuracy.  So far, we have created a monophone Acoustic Model, which can be used with Julius.  But with such a model, we are not looking at the 'context' of the monophone.  The SRE is trying to match the sound that it has heard to a single phone - a  single sound.

With a triphone acoustic model, we are essentially looking for a monophone in the "context" of other monophones - i.e. the one immediately before and the one immediately after (if they exist - the phone may be at the beginning or end of the word).  This can greatly improve recognition accuracy, because the SRE is looking to match a specific sequence of 3 sounds together (a triphone), rather than only one sound.  This is like using a 3 word Google search rather than a single word Google search - you get more accurate results.  Triphones reduce the possibility of error caused by confusing one sound with another, because we are now looking for a distinct sequence of 3 sounds.

Note that some commercial systems use quintphones (5 phone groupings) in their recognition systems - but this requires very large amounts of speech audio data.

Tutorial 

To convert the monophone transcriptions in the aligned.mlf file you created in Step 8 to an equivalent set of triphone transcriptions, you need to execute the HLEd command.

First you need to create the mktri.led edit script:

WB sp
WB sil
TC
 

Then you execute the HLEd command as follows:

$ HLEd -A -D -T 1 -n triphones1 -l '*' -i wintri.mlf mktri.led aligned.mlf
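
As a rough illustration of what the edit script asks for: the two WB lines mark sp and sil as word boundaries, and TC then expands every other label into its triphone context without letting that context cross a boundary. The Python sketch below mimics that behaviour on a made-up label sequence. It is an assumption-laden illustration, not HLEd's actual implementation, and the names expand_labels and aligned are hypothetical.

```python
WORD_BOUNDARIES = {"sp", "sil"}  # the two labels named by the WB lines in mktri.led


def expand_labels(labels):
    """Word-internal triphone expansion: context never crosses sp or sil."""
    out = []
    for i, x in enumerate(labels):
        if x in WORD_BOUNDARIES:
            out.append(x)  # boundary labels are kept unchanged
            continue
        prev = labels[i - 1] if i > 0 else None
        nxt = labels[i + 1] if i + 1 < len(labels) else None
        left = prev + "-" if prev and prev not in WORD_BOUNDARIES else ""
        right = "+" + nxt if nxt and nxt not in WORD_BOUNDARIES else ""
        out.append(left + x + right)
    return out


# Hypothetical aligned transcription for one utterance (not taken from aligned.mlf).
aligned = "sil d ay ax l sp f ay v sil".split()
expanded = expand_labels(aligned)
print(" ".join(expanded))

# Unique labels, comparable in spirit to the list requested with -n triphones1.
print(sorted(set(expanded)))
```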

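The HLEd command creates 2 files: wintri.mlf (the triphone transcriptions, named by the -i option) and triphones1 (the list of triphones, named by the -n option).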

Next, create the mktri.hed file by executing the following script:

$ perl ../HTK_scripts/maketrihed monophones1 triphones1

This creates the mktri.hed file.
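
To give an idea of what ends up in mktri.hed, here is a small Python sketch that builds the same kind of edit script. It assumes the layout described in the HTK Book tutorial (one CL command that clones the monophones into the triphone list, followed by one TI command per phone that ties the transition matrices of all triphones sharing that centre phone). The real maketrihed script may differ in detail, and build_mktri_hed is a hypothetical name used only here.

```python
def build_mktri_hed(monophones, triphone_list="triphones1"):
    """Return the lines of an HHEd script in the style assumed for mktri.hed."""
    # CL: clone the monophone models to make one model per entry in the triphone list.
    lines = ["CL " + triphone_list]
    for p in monophones:
        # TI: tie the transition matrices of every triphone built around phone p.
        lines.append("TI T_%s {(*-%s+*,%s+*,*-%s).transP}" % (p, p, p, p))
    return lines


# Three sample phones; a real run would read every entry of monophones1.
for line in build_mktri_hed(["ay", "f", "v"]):
    print(line)
```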

Then create 3 more folders: hmm10, hmm11 and hmm12.

Then execute the HHEd command:

$ HHEd -A -D -T 1 -H hmm9/macros -H hmm9/hmmdefs -M hmm10 mktri.hed monophones1

The HHEd command writes the new macros and hmmdefs files to the hmm10 folder.

 

 Next run HERest 2 more times: 

$ HERest -A -D -T 1 -C config -I wintri.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm10/macros -H hmm10/hmmdefs -M hmm11 triphones1

The first HERest run writes the re-estimated macros and hmmdefs files to the hmm11 folder.

 

$ HERest -A -D -T 1 -C config -I wintri.mlf -t 250.0 150.0 3000.0 -s stats -S train.scp -H hmm11/macros -H hmm11/hmmdefs -M hmm12 triphones1

 

The second HERest run writes the re-estimated macros and hmmdefs files to the hmm12 folder, plus the stats file named by the -s option.
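
The two re-estimation passes differ only in their source folder, their target folder and the extra -s stats option on the second pass. If you prefer to script them, a minimal Python sketch could look like this. The helper name herest_args is hypothetical; the flags and file names are copied from the two HERest commands in this step, and actually running the passes assumes HTK is installed and on your PATH.

```python
def herest_args(src, dst, stats_file=None):
    """Build the HERest argument list for one re-estimation pass."""
    args = ["HERest", "-A", "-D", "-T", "1", "-C", "config", "-I", "wintri.mlf",
            "-t", "250.0", "150.0", "3000.0"]
    if stats_file:
        args += ["-s", stats_file]  # only the last pass writes the stats file
    args += ["-S", "train.scp", "-H", src + "/macros", "-H", src + "/hmmdefs",
             "-M", dst, "triphones1"]
    return args


passes = [
    herest_args("hmm10", "hmm11"),
    herest_args("hmm11", "hmm12", stats_file="stats"),
]
for argv in passes:
    print(" ".join(argv))
    # To really run a pass: import subprocess; subprocess.run(argv, check=True)
```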

Comments

By Crunkrock - 3/22/2013 - 1 Replies I get

By adoh - 11/10/2012 Hi,

By tt - 2/25/2012 on this section i try to run perl maketrihed monophones1 triphones1.

By Babak - 7/29/2011 - 1 Replies

By swbluto - 9/9/2010 - 2 Replies I think I already posted a thread with the same issue in the auto section, but it looks like it's gone. Anyways, when I run

By Aswin Juari - 4/10/2009 - 2 Replies Hello,

By Moe - 3/16/2009 - 3 Replies Hi,

By Annie - 12/26/2007 - 1 Replies

Hi! I have a problem during the re-estimation for 11 times, after the triphones compilation has succeeded. The command:

HERest -A -D -T 1 -C config -I wintri.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm10/macros -H hmm10/hmmdefs -M hmm11 triphones1

gives me this error:

ERROR [+7321] CreateInsts : Unknown label B

I've already followed and checked everything, but can't find the solution... please help me!

Thank you; Regards; Annie

By Manuel - 9/11/2007 - 5 Replies

At the point where hmm11 is created:

HERest -A -D -T 1 -C config -I wintri.mlf -t 250.0 150.0 3000.0 -S train.scp -H hmm10/macros -H hmm10/hmmdefs -M hmm11 triphones1

it gives me many warnings like this:

WARNING [-2331] UpdateModels: t+ae[1] copied: only 1 egs in HERest
WARNING [-2331] UpdateModels: t+ay[3] copied: only 2 egs in HERest
....

Is this a problem, or is it all OK? I ask because I reached the end of the tutorial to create my personal acoustic model, but when I try to use it with Julian it gives me some errors:

Reading in dictionary...
line 3: triphone "*-f+ow" or biphone "f+ow" not found
line 3: triphone "f-ow+n" not found
> 2 [PHONE] f ow n
line 4: triphone "*-k+ao" or biphone "k+ao" not found
line 4: triphone "k-ao+l" not found
> 2 [CALL] k ao l
line 5: triphone "d-ay+ax" not found
> 3 [DIAL] d ay ax l
line 6: triphone "t-iy+v" not found
> 4 [STEVE] s t iy v
line 8: triphone "b-aa+b" not found
> 4 [BOB] b aa b
line 9: triphone "*-jh+aa" or biphone "jh+aa" not found
line 9: triphone "jh-aa+n" not found
line 9: triphone "aa-n+s" not found
> 4 [JOHNSTON] jh aa n s t ax n
line 10: triphone "*-jh+aa" or biphone "jh+aa" not found
line 10: triphone "jh-aa+n" not found
> 4 [JOHN] jh aa n
line 11: triphone "*-jh+ao" or biphone "jh+ao" not found
line 11: triphone "jh-ao+r" not found
line 11: triphone "r-d+ax" not found
> 4 [JORDAN] jh ao r d ax n
line 13: triphone "f-ay+v" not found
> 5 [FIVE] f ay v
line 15: triphone "n-ay+n" not found
> 5 [NINE] n ay n
line 21: triphone "th-r+iy" not found
> 5 [THREE] th r iy
line 23: triphone "z-ih+r" not found
line 23: triphone "ih-r+ow" not found
> 5 [ZERO] z ih r ow
////// Missing phones:
*-f+ow or biphone f+ow
*-jh+aa or biphone jh+aa
*-jh+ao or biphone jh+ao
*-k+ao or biphone k+ao
aa-n+s
b-aa+b
d-ay+ax
f-ay+v
f-ow+n
ih-r+ow
jh-aa+n
jh-ao+r
k-ao+l
n-ay+n
r-d+ax
t-iy+v
th-r+iy
z-ih+r
//////////////////////
error in reading sample.dict: 12 words failed out of 23 words
ERROR: failed to read dictionary, terminated

If I try to use the monophone model, it starts, but all the recognition results are wrong.

Thanks,
Manuel