Step 2 - Pronunciation Dictionary

Background - Phonetically Balanced Dictionary

Usually, the first step in building the Pronunciation Dictionary is to create a sorted list of the words contained in your Grammar, one per line, with pronunciations (the phonemes that make up each word).  With our current example, it is easy to create an initial one by hand (see Initial Pronunciation Dictionary).

However, for HTK to be able to compile your speech audio and transcriptions into an Acoustic Model, it requires a phonetically balanced Pronunciation Dictionary with, at the very least, 30-40 'sentences' of 8-10 words each.  If your Grammar has fewer sentences/words than this (as ours does in this tutorial), or if your Grammar is not phonetically balanced (that is, some phonemes occur only once or twice), then you need to add words until each phoneme occurs at least 3-5 times in your Pronunciation Dictionary.

Therefore, for this tutorial, we will add extra words to our Pronunciation Dictionary so that HTK can compile an Acoustic Model.  Remember, we are only aiming for the minimum number of pronunciation dictionary entries that will permit HTK to compile; creating an Acoustic Model that produces consistent recognition results requires many more entries, and the corresponding speech audio.

Tutorial 

To create a pronunciation dictionary in HTK we will follow these steps:

  • create a prompts.txt file - the list of words we will record in the next Step;
  • derive a wlist file from the prompts.txt file - the wlist file is a sorted list of the unique words that appear in the prompts.txt file; and
  • create the pronunciation dictionary - done by adding pronunciation information to the words in wlist.

prompts.txt file 

First we need to create a prompts.txt file that includes our Grammar words and the additional words required to create a phonetically balanced dictionary.  This file lists the words that need to be recorded, together with the names of the audio files in which the recordings will be stored, one recording per line.  You will make these recordings in Step 3.

Go to the 'voxforge/tutorial' folder you created in your home folder and create a file called 'prompts.txt' containing the following:

*/sample1 DIAL ONE TWO THREE FOUR FIVE SIX SEVEN EIGHT NINE OH ZERO
*/sample2 DIAL ONE THREE FIVE SEVEN NINE ZERO TWO FOUR SIX EIGHT OH
*/sample3 DIAL ZERO NINE SEVEN FIVE THREE ONE OH EIGHT SIX FOUR TWO
*/sample4 DIAL ONE ONE TWO TWO THREE THREE FOUR FOUR FIVE FIVE
*/sample5 DIAL SIX SIX SEVEN SEVEN EIGHT EIGHT NINE NINE OH OH ZERO ZERO
*/sample6 PHONE STEVE YOUNG CALL STEVE YOUNG
*/sample7 PHONE STEVE CALL STEVE PHONE YOUNG CALL YOUNG
*/sample8 PHONE PHONE STEVE STEVE  CALL CALL YOUNG YOUNG
*/sample9 MEASURE LEISURE AND LEISURE MEASURE
*/sample10 COMPLAIN CHAMPLAIN AIRPLANE ELAINE EXPLAIN
*/sample11 BOOKENDS KENNEL KENNETH KENYA WEEKEND
*/sample12 BELT BELOW BEND AEROBIC DASHBOARD DATABASE
*/sample13 GATEWAY GATORADE GAZEBO AFGHAN AGAINST AGATHA
*/sample14 ABALON ABDOMINALS BODY ABOLISH
*/sample15 ABOUNDING ABOUT ACCOUNT ALLENTOWN
*/sample16 ACHIEVE ACTUAL ACUPUNCTURE ADVENTURE
*/sample17 ALGORITHM ALTHOUGH ALTOGETHER ANOTHER
*/sample18 BATTLE BEATLE LITTLE METAL
*/sample19 BITTEN BLATANT BRIGHTEN BRITAIN
*/sample20 BROOKHAVEN HOOD BROUHAHA BULLHEADS
*/sample21 BUSBOYS CHOICE COILS COIN
*/sample22 COLLECTION COLORATION COMBINATION COMMERCIAL
*/sample23 MIDDLE NEEDLE POODLE SADDLE
*/sample24 ALRIGHT ARTHRITIS BRIGHT COPYRIGHT CRITERIA RIGHT
*/sample25 COUPLE CRADLE CRUMBLE
*/sample26 CUBA CUBE CUMULATIVE
*/sample27 CURING CURLING CYCLING
*/sample28 CYNTHIA DANFORTH DEPTH
*/sample29 DIGEST DIGITAL DILIGENT
*/sample30 AMNESIA ASIA AVERSION BEIGE BEIJING
*/sample31 HELP HELLO HELMET HELPLESS AHEAD HELP
*/sample32 VOXFORGE HOME READ LISTEN FORUMS DEVELOPER ABOUT HOWTO TUTORIAL
*/sample33 RHYTHMBOX PLAY START NEXT SKIP FORWARD PREVIOUS BACK
*/sample34 MUSIC SHOW WHO ABOUT INFORMATION UP LOUDER DOWN LOWER
*/sample35 PLAYER SOFTER SILENCE STOP QUIET
*/sample36 COMPUTER WEATHER EMAIL VOLUME LOUDER SOFTER
*/sample37 COMPUTERIZE AMPUTATE MINICOMPUTER PUMA'S PEWTER  
*/sample38 ACUTE AMPUTATION BOOTERS CONTRIBUTOR'S ALOUETTE GIFTWARE GLADWELL
*/sample39 MAYWEATHER WHETHER WOODSTREAM ARTILLERYMAN CREMATION DAIRYMAID FEMALE
*/sample40 ISHMAEL'S LANCEDALE LAVAL VOLATILE SCALIA SOLUBLE SUPERVALUE VALUATION

The first column of the prompts.txt file contains the name of the audio file to be created, and the following columns contain the text transcription of what is to be recorded in that audio file.
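The two-part layout of a prompts.txt line can be illustrated with a minimal Python sketch. The function name parse_prompt_line is hypothetical and is not part of the tutorial's scripts; it only assumes that fields are separated by whitespace, as in the listing above.

```python
def parse_prompt_line(line):
    """Split one prompts.txt line into (audio file name, list of words).

    Assumes whitespace-separated fields: the first field is the audio
    file name and every remaining field is one word of the transcription.
    Returns None for a blank line.
    """
    fields = line.split()  # splits on any run of spaces or tabs
    if not fields:
        return None
    return fields[0], fields[1:]


# A line taken from the prompts.txt listing; the double space is harmless.
sample = "*/sample8 PHONE PHONE STEVE STEVE  CALL CALL YOUNG YOUNG"
name, words = parse_prompt_line(sample)
print(name)   # audio file name
print(words)  # transcription words, in order
```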

wlist file 

The Julia script prompts2wlist.jl takes the prompts.txt file you just created, removes the file name in the first column, and writes each unique word on its own line, in sorted order, to a word list file (wlist).

Download prompts2wlist.jl to your voxforge/bin folder. 

Next, go back to your 'voxforge/tutorial' directory (where your prompts.txt file is located), and run prompts2wlist.jl as follows:

 julia ../bin/prompts2wlist.jl prompts.txt wlist

This will create the wlist file. 

Note: the following entries were automatically added to your wlist file (in sorted order):

SENT-END
SENT-START  

These are HTK internal entries required for creation of the Acoustic Model, and for processing of the Acoustic Model by Julius.  
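For readers who want to see the idea behind this step, here is a rough Python sketch of what such a conversion might look like. It is not the prompts2wlist.jl script itself; the function name prompts_to_wlist is hypothetical, and the sort order used (plain character-code order) is an assumption.

```python
def prompts_to_wlist(prompt_lines):
    """Return the sorted list of unique words found in prompts.txt lines.

    The first field of each line (the audio file name) is dropped.
    SENT-END and SENT-START are always included, mirroring the note above.
    """
    words = {"SENT-END", "SENT-START"}
    for line in prompt_lines:
        fields = line.split()
        words.update(fields[1:])  # skip the file name in column one
    return sorted(words)          # assumption: character-code ordering


prompts = [
    "*/sample6 PHONE STEVE YOUNG CALL STEVE YOUNG",
    "*/sample7 PHONE STEVE CALL STEVE PHONE YOUNG CALL YOUNG",
]
for word in prompts_to_wlist(prompts):
    print(word)
```

To use this on real files, read the lines with open("prompts.txt") and write the returned words, one per line, to a file named wlist.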

pronunciation dictionary

The next step is to add pronunciation information (i.e. the phonemes that make up the word) to each of the words in the wlist file, thus creating a Pronunciation Dictionary.  HTK's HDMan command goes through the wlist file, looks up the pronunciation of each word in a separate lexicon file, and writes the result to a Pronunciation Dictionary.

First you need to create the global.ded script in your 'voxforge/tutorial' folder (default script used by HDMan), which contains:

AS sp
RS cmu
MP sil sil sp

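This script appends the short-pause phone sp to the end of every pronunciation (AS sp), removes CMU-style stress markings from the phones (RS cmu), and merges each 'sil sp' sequence into a single sil (MP sil sil sp).  See the HTK book for details of what these commands mean.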
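As an illustration only, the following Python sketch mimics the effect of the three edit commands on a single pronunciation, following their description in the HTK Book (AS appends a phone, RS cmu strips stress digits, MP merges a phone sequence). It does not call HDMan; the function name apply_global_ded and the example pronunciations are assumptions made for the demonstration.

```python
import re


def apply_global_ded(phones):
    """Imitate AS sp / RS cmu / MP sil sil sp on one list of phones."""
    # AS sp: append the short-pause phone to the pronunciation.
    phones = list(phones) + ["sp"]
    # RS cmu: remove CMU-style stress digits, e.g. "ah0" becomes "ah".
    phones = [re.sub(r"\d+$", "", p) for p in phones]
    # MP sil sil sp: replace every "sil sp" pair with a single "sil".
    merged = []
    for p in phones:
        if p == "sp" and merged and merged[-1] == "sil":
            continue  # the preceding sil absorbs this sp
        merged.append(p)
    return merged


# Illustrative inputs, not taken from the VoxForge lexicon.
print(apply_global_ded(["ah0", "b", "aw1", "t"]))
print(apply_global_ded(["sil"]))
```

This is consistent with the phone counts HDMan reports later in this Step, where sil appears only for the two sentence-boundary entries and every other word ends in sp.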

Create a new directory called 'lexicon' in your 'voxforge' folder.  Create a new file called VoxForgeDict.txt in your 'voxforge/lexicon' folder, and copy the contents of the VoxForge lexicon, VoxForgeDict.txt (the origin of the VoxForge phoneset), into it.  The file name must match the path given at the end of the HDMan command.  Execute the HDMan command from your 'voxforge/tutorial' directory as follows:

HDMan -A -D -T 1 -m -w wlist -n monophones1 -i -l dlog dict ../lexicon/VoxForgeDict.txt

The output of the above noted HDMan command is two files:

  • dict - the pronunciation dictionary for your Grammar and the additional words required to create a phonetically balanced Acoustic Model; and
  • monophones1 - which is simply a list of the phones used in dict.
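The relationship between the two files can be sketched in Python. This sketch assumes each dict line has the layout WORD [OUTSYM] p1 p2 ..., with the bracketed output symbol optional and no pronunciation probability field. The function names and the sample entries are hypothetical, and listing phones in order of first appearance is a choice made for the example.

```python
def phones_in_entry(line):
    """Return the phones of one dictionary line, skipping WORD and [OUTSYM]."""
    fields = line.split()
    phones = fields[1:]                       # drop the word itself
    if phones and phones[0].startswith("["):  # drop the optional [OUTSYM]
        phones = phones[1:]
    return phones


def list_phones(dict_lines):
    """Collect each distinct phone once, in order of first appearance."""
    seen = []
    for line in dict_lines:
        for phone in phones_in_entry(line):
            if phone not in seen:
                seen.append(phone)
    return seen


# Illustrative entries only; real pronunciations come from the lexicon.
entries = [
    "CALL            [CALL]          k ao l sp",
    "SENT-END        []              sil",
]
print(list_phones(entries))
```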

Confirming Phonetically Balanced Dictionary

To help you determine whether your dictionary is phonetically balanced, review the output from your HDMan command in the 'dlog' log file:

WARNING: no script file ../lexicon/VoxForgeDict.ded

Dictionary Usage Statistics
---------------------------
  Dictionary    TotalWords WordsUsed  TotalProns PronsUsed
VoxForgeDict    268089        114     268089        114
        dict       114        114        114        114

114 words required, 0 missing

New Phone Usage Counts
---------------------
  1. ae    :    17
  2. b     :    32
  3. ah    :    74
  4. l     :    48
  5. ow    :     9
  6. n     :    43
  7. sp    :   112
  8. d     :    26
  9. aa    :     8
 10. m     :    13
 11. z     :     7
 12. ih    :    23
 13. sh    :     7
 14. aw    :     4
 15. ng    :     7
 16. t     :    32
 17. k     :    33
 18. ch    :     5
 19. iy    :    12
 20. v     :     8
 21. w     :     4
 22. y     :     8
 23. uw    :     7
 24. p     :    11
 25. er    :    13
 26. eh    :    24
 27. r     :    21
 28. f     :     5
 29. g     :     8
 30. s     :    15
 31. th    :     7
 32. hh    :    10
 33. ey    :    20
 34. dh    :     4
 35. ao    :     6
 36. ay    :    12
 37. zh    :     7
 38. uh    :     5
 39. oy    :     4
 40. jh    :     3
 41. sil   :     2

Dictionary dict created


Although reviewing this log cannot conclusively determine whether your pronunciation dictionary is phonetically balanced (with a very small grammar, some phones may be missing altogether and so will not appear in the counts), it is a good place to start.

For HTK to compile your Acoustic Model, you need to make sure that you have (at the very least) 3 to 5 usage counts for each phone.  If any phone has only one or two occurrences, you must add words that use that phone to your prompts.txt file.  You can search the VoxForgeDict file for words containing the phones you need, add them to prompts.txt, and then regenerate the wlist and dict files.
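Both checks can be roughed out in Python. The sketch below assumes the 'New Phone Usage Counts' lines in dlog look like the listing above (index, phone, colon, count) and that lexicon lines follow a WORD [OUTSYM] p1 p2 ... layout. The function names are hypothetical, leaving sp and sil out of the check is an assumption, and the lexicon entries shown are illustrative.

```python
import re

# Matches lines such as " 40. jh    :     3" from the dlog file.
COUNT_LINE = re.compile(r"^\s*\d+\.\s+(\S+)\s*:\s*(\d+)\s*$")


def phone_counts(dlog_lines):
    """Map each phone to its usage count, ignoring all other log lines."""
    counts = {}
    for line in dlog_lines:
        match = COUNT_LINE.match(line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


def rare_phones(counts, minimum=3, ignore=("sp", "sil")):
    """List phones used fewer than `minimum` times (sp/sil skipped by assumption)."""
    return sorted(p for p, n in counts.items() if n < minimum and p not in ignore)


def words_with_phone(lexicon_lines, phone, limit=10):
    """Return up to `limit` lexicon words whose pronunciation uses `phone`."""
    found = []
    for line in lexicon_lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        phones = fields[1:]
        if phones[0].startswith("["):  # skip the optional [OUTSYM] field
            phones = phones[1:]
        if phone in phones:
            found.append(fields[0])
            if len(found) == limit:
                break
    return found


log = ["  1. ae    :    17", " 39. oy    :     4", " 40. jh    :     3", " 41. sil   :     2"]
print(rare_phones(phone_counts(log), minimum=5))

lexicon = ["BEIGE [BEIGE] b ey zh", "DIGEST [DIGEST] d ay jh eh s t"]
print(words_with_phone(lexicon, "jh"))
```

With real files, pass the lines of dlog and of the VoxForgeDict file instead of the small lists used here.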

Creating Monophones0 File

You also need another monophones file for a later Step.  Simply copy the "monophones1" file to a new "monophones0" file in your 'voxforge/tutorial' directory and then remove the short-pause "sp" entry in monophones0.
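If you would rather not edit the file by hand, the copy-and-remove step can be sketched in Python as follows. The function name make_monophones0 is hypothetical, and the sketch assumes monophones1 holds one phone per line.

```python
def make_monophones0(monophones1_lines):
    """Return the phones of monophones1 without the short-pause entry 'sp'."""
    phones = (line.strip() for line in monophones1_lines)
    return [p for p in phones if p and p != "sp"]


# Small illustrative input in place of the real monophones1 file.
monophones1 = ["ae\n", "b\n", "sp\n", "sil\n"]
print(make_monophones0(monophones1))
```

To apply it to the real files, read monophones1 with open("monophones1"), then write the returned phones, one per line, to monophones0.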
