Acoustic Model Discussions

Missing triphones in Julius (again)
User: danijel
Date: 2/15/2010 4:22 pm
Views: 7378
Rating: 2

I've searched the forums but still haven't found a complete answer, so I'm posting this question again.

I created a small dictionary like this:

0 [<s>] sil

1 [</s>] sil

2 [OPEN] ow p ax n

3 [NOTEPAD] n ow t p ae dx

4 [WINAMP] w ih n ae m p

5 [PAINT] p ey n t

6 [CLOSE] k l ow z

7 [WINDOW] w ih n d ow

8 [DEBUG] d iy b ah g

When I used it with the "quickstart" downloaded from the homepage, I already got a few errors. After searching for answers for a while I decided to download the newest AM build. The number of errors was reduced, but a few still remained:

Error: voca_load_htkdict: line 9: triphone "d-iy+b" not found

Error: voca_load_htkdict: line 9: triphone "iy-b+ah" not found

Error: voca_load_htkdict: the line content was: 8       [DEBUG] d iy b ah g

Error: voca_load_htkdict: begin missing phones

Error: voca_load_htkdict: d-iy+b

Error: voca_load_htkdict: iy-b+ah

Error: voca_load_htkdict: end missing phones

Are these errors unavoidable because of the limited training data, or is there a mechanism that can resolve them automatically?

Would it be a good idea to test the model with a big list of transcriptions, like the CMU lexicon?
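One way to catch these errors before starting Julius is to expand each dictionary entry into its context-dependent names and compare them with the names the acoustic model defines. The sketch below is only an illustration, not part of Julius or VoxForge: `parse_dict_line` and `missing_units` are hypothetical helpers, `known` is assumed to be the set of names read from the model's triphone list, and the word-edge rule (accept the biphone, or any triphone that extends it) is a simplification of what Julius appears to do at word boundaries.

```python
def parse_dict_line(line):
    """Split a grammar dict line like '8 [DEBUG] d iy b ah g' into (output, phones)."""
    fields = line.split()
    return fields[1], fields[2:]


def missing_units(phones, known):
    """Return the context-dependent units of one word that are absent from `known`.

    Assumption: word-internal triphones must match exactly, while the first and
    last phone are satisfied by a biphone or by any triphone that extends it.
    """
    missing = []
    last = len(phones) - 1
    for i, p in enumerate(phones):
        if last == 0:
            # Single-phone word: only the bare phone name is checked here.
            name = p
            ok = name in known
        elif i == 0:
            name = f"{p}+{phones[1]}"
            ok = name in known or any(k.endswith("-" + name) for k in known)
        elif i == last:
            name = f"{phones[i - 1]}-{p}"
            ok = name in known or any(k.startswith(name + "+") for k in known)
        else:
            name = f"{phones[i - 1]}-{p}+{phones[i + 1]}"
            ok = name in known
        if not ok:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Toy model inventory (made up for the example).
    known = {"d+iy", "b-ah+g", "ah-g"}
    word, phones = parse_dict_line("8       [DEBUG] d iy b ah g")
    print(word, missing_units(phones, known))
```

Running the same check over a large lexicon such as the CMU dictionary would list every uncovered unit up front, instead of one Julius start-up failure at a time.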

--- (Edited on 2/15/2010 4:22 pm [GMT-0600] by danijel) ---

Re: Missing triphones in Julius (again)
User: kmaclean
Date: 2/17/2010 9:36 pm
Views: 43
Rating: 1

>Error: voca_load_htkdict: the line content was: 8       [DEBUG] d iy b ah g

Not sure where you got your pronunciation for this, but the VoxForge Lexicon (used in the creation of the VoxForge acoustic models in the Nightly Builds) uses this:

DEBUG           [DEBUG]         d ix b ah g

--- (Edited on 2/17/2010 10:36 pm [GMT-0500] by kmaclean) ---

Re: Missing triphones in Julius (again)
User: Visitor
Date: 2/18/2010 3:20 am
Views: 113
Rating: 2

CMU dictionary has this:

DEBUG  D IY0 B AH1 G

But that still doesn't change the fact that, as I understand it, I can only model words whose triphones occurred in the VoxForge prompts?

One solution was to tie missing triphones to others that sound "similar", but that's obviously not very elegant.
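The "tie it to something similar" idea can be sketched as a nearest-context lookup. This is a naive heuristic under stated assumptions, not what HTK or Julius do internally: `split_triphone` and `pick_fallback` are hypothetical helpers, and `known` is assumed to hold the names already defined in the acoustic model. Each resulting pair could then be written as a `logical physical` line in the triphone list given to Julius, assuming that file accepts such mappings in your setup.

```python
def split_triphone(name):
    """Split 'l-c+r' into (left, center, right); absent contexts become None."""
    left, _, rest = name.rpartition("-")
    center, _, right = rest.partition("+")
    return (left or None, center, right or None)


def pick_fallback(missing, known):
    """Pick a known unit with the same center phone and the most shared context.

    Returns None when no known unit shares the center phone. Ties are broken
    alphabetically so the result is deterministic.
    """
    left, center, right = split_triphone(missing)
    best, best_score = None, -1
    for name in sorted(known):
        l, c, r = split_triphone(name)
        if c != center:
            continue
        score = int(l == left) + int(r == right)
        if score > best_score:
            best, best_score = name, score
    return best


if __name__ == "__main__":
    # Toy inventory (made up for the example).
    known = {"d-iy+t", "s-iy+b", "iy", "b-ah+g"}
    for unit in ["d-iy+b", "iy-b+ah"]:
        print(unit, pick_fallback(unit, known))
```

When the model can be rebuilt, HTK's decision-tree clustering can synthesize unseen triphones in a more principled way (the `AU` command in HHEd), which is likely the better route than a hand-made mapping like this one.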

The problem is that I want to make an app that allows ordinary users to add pronunciations for words that they want recognized. This makes the whole process much more difficult...

--- (Edited on 2/18/2010 3:20 am [GMT-0600] by Visitor) ---

Re: Missing triphones in Julius (again)
User: nsh
Date: 2/18/2010 6:45 pm
Views: 65
Rating: 1

Try pocketsphinx instead :)

--- (Edited on 2/19/2010 03:45 [GMT+0300] by nsh) ---

Re: Missing triphones in Julius (again)
User: kmaclean
Date: 2/21/2010 8:50 pm
Views: 71
Rating: 1

>The problem is that I want to make an app that allows ordinary users to

>add pronunciations for words that they want recognized.

We currently use the CMU unstressed dictionary from the xvoice site, which I assumed was the same as CMU v0.6. As you are finding out, that is not the case...

If you describe what you are doing, we can get a better idea of whether it might be worthwhile for us to move the current VoxForge phone list to the CMU v0.7 pronunciation dictionary, which is where we should be regardless (Ticket #468).

Ken

--- (Edited on 2/21/2010 9:50 pm [GMT-0500] by kmaclean) ---

Re: Missing triphones in Julius (again)
User: danijel
Date: 2/22/2010 1:48 pm
Views: 98
Rating: 2

I'm currently developing this:

http://code.google.com/p/voice-remote-android/

I want to allow users to build simple grammars using my GUI and attach them to certain "actions" in the operating system. Nothing too ambitious, but it has to be simple to use.

I intend to bundle the dictionary that you mentioned with the program, so whenever the user chooses a word that's already been transcribed, it will automatically use the correct pronunciation.

But sooner or later they will want to recognize words not in the dictionary (e.g. application names) and that makes the whole situation kinda tricky.

As I understand, it's not possible to have all the triphones in the model, so I need to make some easy way of dealing with the missing ones.
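The lookup flow described here (use the bundled dictionary when the word is known, fall back to something else when it is not) might look roughly like the sketch below. It assumes lexicon lines shaped like the VoxForge entry quoted earlier in the thread (`DEBUG [DEBUG] d ix b ah g`); `load_lexicon`, `pronounce`, and the `g2p` callable are hypothetical names introduced for illustration, and the fallback could be a grapheme-to-phoneme model or simply a prompt asking the user to type the phones.

```python
def load_lexicon(lines):
    """Parse lines like 'DEBUG [DEBUG] d ix b ah g' into {WORD: [phones]}.

    Only the first pronunciation seen for each word is kept.
    """
    lexicon = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        lexicon.setdefault(fields[0].upper(), fields[2:])
    return lexicon


def pronounce(word, lexicon, g2p=None):
    """Return (phones, source) for a word, preferring the lexicon.

    `g2p` is an optional callable word -> list of phones (hypothetical hook).
    """
    phones = lexicon.get(word.upper())
    if phones is not None:
        return phones, "lexicon"
    if g2p is not None:
        return g2p(word), "g2p"
    raise KeyError(f"no pronunciation available for {word!r}")


if __name__ == "__main__":
    lex = load_lexicon(["DEBUG           [DEBUG]         d ix b ah g"])
    print(pronounce("debug", lex))
    # Stand-in fallback that returns a fixed guess; a real one would be a trained model.
    print(pronounce("winamp", lex, g2p=lambda w: ["w", "ih", "n", "ae", "m", "p"]))
```

Returning the source alongside the phones lets the GUI flag guessed pronunciations so the user can review them before they go into the grammar.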

I usually work with hybrid ANN/HMM systems. It's so much easier there...

--- (Edited on 2/22/2010 1:48 pm [GMT-0600] by danijel) ---

Re: Missing triphones in Julius (again)
User: kmaclean
Date: 2/25/2010 6:08 pm
Views: 357
Rating: 1

>But sooner or later they will want to recognize words not in the

>dictionary (e.g. application names) and that makes the whole situation

>kinda tricky.

I think you need to look at grapheme-to-phoneme conversion (g2p). 

Sequitur G2P (GPL) can be trained with any flavour of pronunciation dictionary.

Ken

--- (Edited on 2/25/2010 7:08 pm [GMT-0500] by kmaclean) ---

Re: Missing triphones in Julius (again)
User: Visitor
Date: 5/27/2010 6:51 pm
Views: 226
Rating: 2

I'm not quite clear on what the fix is for these errors:

 

STAT: reading [ofsample.dfa] and [ofsample.dict]...

Error: voca_load_htkdict: line 6: triphone "ah-l+er" not found

Error: voca_load_htkdict: line 6: triphone "l-er+*" or biphone "l-er" not found

Error: voca_load_htkdict: the line content was: 3 [COLOR] k ah l er

Error: voca_load_htkdict: begin missing phones

Error: voca_load_htkdict: ah-l+er

Error: voca_load_htkdict: l-er+* or biphone l-er

Error: voca_load_htkdict: end missing phones

Error: init_voca: error in reading ofsample.dict: 1 

 

Does this mean that, somewhere along the line, that triphone was never generated and included in the hmmdefs?

--- (Edited on 5/27/2010 6:51 pm [GMT-0500] by Visitor) ---

Re: Missing triphones in Julius (again)
User: kmaclean
Date: 6/9/2010 8:45 pm
Views: 3277
Rating: 1

>Error: voca_load_htkdict: the line content was: 3 [COLOR] k ah l er

It means that your training corpus does not contain the word "[COLOR] k ah l er".  

--- (Edited on 6/9/2010 9:45 pm [GMT-0400] by kmaclean) ---
