Acoustic Model Discussions

HTK treatment of single quotes
User: colbec
Date: 5/14/2012 7:03 am

I hope I have the subject line correct. Here is the situation.

I'm trying to segment audio from Librivox for the Voxforge collection. My first batch of audio files went through with no problem that I can see, but my second project has a particular issue. I chose the author O. Henry. Unfortunately there is a lot of vernacular in the text, such as 'TWAS, 'TWILL, 'EM and so on. These are separate sounds in the audio, so I cannot edit the text to say IT WAS, IT WILL and THEM. I might be wrong, but that does not seem reasonable for the purposes of audio model generation.

So in my lexicon I have those words with a single quote at position 0 of the string. It appears that single quotes sort low in ASCII order for HTK, so 'TWAS appears right at the top of the lexicon along with the other '* entries. I have built audio models with HTK before with single quotes in the string, but only in positions other than zero.
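To illustrate the sort order described above, here is a minimal Python sketch. It only shows plain ASCII ordering (the single quote is code 39, below the digits and uppercase letters); it does not reproduce whatever sorting HTK or the Voxforge scripts actually perform, and the word list is just a small sample taken from this post.

```python
# Plain ASCII (code point) ordering: "'" is 39, "0" is 48, "A" is 65.
words = ["TWENTY", "'TWAS", "007", "A", "'EM"]

# sorted() compares strings by code point, so quote-initial words come first.
ordered = sorted(words)

for word in ordered:
    print(word)
```

With this ordering, the quote-initial entries land ahead of the numeric and alphabetic ones, which matches what I see at the top of my lexicon.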

OK so mkdfa.pl runs fine. Now using the automated procedure of the Voxforge tools I get the following output:

...
TURNING
TURQUOISE
TWAS

TWENTY

    Step 3 - Recording the Data
==============================================================
already completed manually

    Step 4 - Creating Transcription Files
==============================================================
writing to mlf file ./interim_files/words.mlf
writing to ./interim_files/words.mlf file done
  ERROR [+1232]  NumParts: Cannot find word VOICE in dictionary
 FATAL ERROR - Terminating program HLEd
  ERROR [+1232]  NumParts: Cannot find word VOICE in dictionary
 FATAL ERROR - Terminating program HLEd


OK, so the word TWAS (no initial quote) listed in the warning output does not exist in the prompts file, the lexicon or the text. Clearly step 2 has failed: HTK, or the Voxforge script, has not been able to read my lexicon correctly.

The step 2 log output indicates that many words were not found, starting with the A's. So I think the issue is the existence of the single quoted items at the very top of the lexicon. The logs are long, so I have not included them here. I can provide extracts as required.

There are workarounds for this, but I wanted to ensure I have not misunderstood something. Workarounds are meant for defined issues, otherwise they just help to muddy the water.

 

--- (Edited on 5/14/2012 7:03 am [GMT-0500] by colbec) ---


FWIW: the workaround of replacing the leading quote with ZZ (so 'TIS becomes ZZTIS, and similarly for the other items) resolves the issue. But ZZTIS, while it sorts conveniently at the bottom end of the lexicon, is not very readable.

--- (Edited on 5/14/2012 7:43 am [GMT-0500] by colbec) ---

OK, it looks like this is a known issue (thanks to http://speechtechie.wordpress.com/ and http://www.ling.ohio-state.edu/~bromberg/htk_problems.html).

The solution is to escape the single quotes with a backslash in both the lexicon and the prompts. Then HTK works.

Example lexicon:

\'EM    ['EM]    uh m
\'TIS    ['TIS]    t ih z
\'TWAS    ['TWAS]    t w aa z
\'TWILL    ['TWILL]    t w ih l
007    [007]    d ah b ax l ow s eh v ih n
1    [1]    w ah n
100    [100]    hh ah n d r ih d

 

Example prompts:

...
*/oht69 SHE LIFTS TOBIN\'S HAND

...
*/oht78 \'TIS NOT ME FOOT AT ALL 

...
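The escaping shown in the examples above can be sketched in Python as follows. This is only an illustration and not part of HTK or the Voxforge tools. The function names are hypothetical, and it assumes that the lexicon headword is the first whitespace-delimited field on each line and that each prompt line starts with an ID followed by the words. It escapes every not-yet-escaped single quote in the headword and in the prompt words, and leaves the bracketed output symbol and the phones untouched, as in the examples.

```python
import re


def escape_quotes(text: str) -> str:
    """Put a backslash before every single quote that is not already escaped."""
    return re.sub(r"(?<!\\)'", r"\\'", text)


def split_first_field(line: str):
    """Split a line into its first non-blank field and the remainder.

    Returns None if the line does not start with a non-blank character.
    """
    match = re.match(r"(\S+)(.*)", line, flags=re.DOTALL)
    return match.groups() if match else None


def escape_lexicon_line(line: str) -> str:
    """Escape quotes in the headword only (assumed to be the first field)."""
    parts = split_first_field(line)
    if parts is None:
        return line
    headword, rest = parts
    return escape_quotes(headword) + rest


def escape_prompt_line(line: str) -> str:
    """Escape quotes in the words, leaving the leading prompt ID unchanged."""
    parts = split_first_field(line)
    if parts is None:
        return line
    prompt_id, words = parts
    return prompt_id + escape_quotes(words)


if __name__ == "__main__":
    print(escape_lexicon_line("'TWAS    ['TWAS]    t w aa z"))
    print(escape_prompt_line("*/oht69 SHE LIFTS TOBIN'S HAND"))
```

Running the functions a second time on already escaped lines leaves them unchanged, because the pattern skips quotes that already have a backslash in front of them.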

 

--- (Edited on 5/14/2012 9:40 am [GMT-0500] by colbec) ---
