Speech Recognition Engines

monophone recognition in continuous speech
User: Terminator
Date: 2/1/2015 9:17 pm
Views: 5494
Rating: 1

Hi, I am using a new set of features for monophone recognition in continuous speech. I created monophone models and ran forced alignment on the training data, and the alignments look good when I check them. The problem is that phone recognition does not work (for now I am testing on the same training data). When I use HVite for phone recognition, I get only one phone per speech file, even though each file clearly contains a whole sentence. I don't understand why I get good alignments but nothing useful in recognition. The HVite command I used for recognition is as follows:

 

 HVite -A -C config2 -w wdnet -H models/hmm10/hmmdefs -S train_test.scp -i recog.out -p 1.0 -t 250.0 150.0 1000.0 phone_dictionary_v2_pau.txt phone_list_v2.txt

I have tried various values for the -p and -t options. Each time I get only one phone per wav file.
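
One quick way to confirm the symptom is to count the labels per utterance in the recognition output. The sketch below assumes recog.out is a standard HTK master label file (an `#!MLF!#` header, a quoted label-file name per utterance, one label per line, and a lone `.` closing each entry). The function names and the sample contents are made up for illustration and are not taken from the actual run.

```python
def count_labels_per_file(mlf_text):
    """Return {label_file_name: number_of_labels} for HTK MLF text."""
    counts = {}
    current = None
    for raw in mlf_text.splitlines():
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue  # skip blank lines and the MLF header
        if line.startswith('"') and line.endswith('"'):
            current = line.strip('"')  # start of a new utterance entry
            counts[current] = 0
        elif line == ".":
            current = None  # end of the current utterance entry
        elif current is not None:
            counts[current] += 1  # one recognised label per line
    return counts


def single_label_files(mlf_text):
    """List utterances for which exactly one label was recognised."""
    return [name for name, n in count_labels_per_file(mlf_text).items() if n == 1]


# Hypothetical MLF contents, for illustration only.
SAMPLE_MLF = '''#!MLF!#
"*/utt001.rec"
0 4200000 s -2935.12
.
"*/utt002.rec"
0 1500000 pau -1010.50
1500000 3100000 hh -1200.25
3100000 5200000 ay -1650.75
.
'''

if __name__ == "__main__":
    print(count_labels_per_file(SAMPLE_MLF))
    print(single_label_files(SAMPLE_MLF))
```

If every utterance shows up in the single-label list regardless of the -p and -t settings, the word network rather than the pruning or insertion penalty is the likely culprit.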

 

My dictionary is as follows:

aa aa
ae ae
ah ah
ao ao
aw aw
ay ay
b b
ch ch
d d
dh dh
eh eh
er er
ey ey
f f
g g
hh hh
ih ih
iy iy
jh jh
k k
l l
m m
n n
ng ng
ow ow
oy oy
p p
pau pau
r r
s s
sh sh
t t
th th
uh uh
uw uw
v v
w w
y y
z z
zh zh
SENT-START
SENT-END

My grammar file is:
$digit = aa | ae | ah | ao | aw | ay | b | ch | d | dh | eh | er  | ey | f | g | hh | ih | iy | jh | k | l | m | n | ng | ow |
 oy | p | r | s | sh | pau | t | th | uh | uw | v | w | y |
 z | zh;
( $digit )
Please let me know how I can get recognition working.

--- (Edited on 2/1/2015 9:18 pm [GMT-0600] by Terminator) ---

Re: monophone recognition in continuous speech
User: TonyR
Date: 2/2/2015 7:32 am
Views: 445
Rating: 2

Easy!

You use:

( $digit )

 

That matches exactly one digit (here, one phone) per file. TFM (e.g. http://www.ee.columbia.edu/ln/LabROSA/doc/HTKBook21/node131.html) says that "{} denotes zero or more repetitions", so you want:

 

( { $digit } )

 

Let us know if that doesn't work for you.

 

Tony

-- 

Dr Tony Robinson
Founder Cantab Research Ltd

--- (Edited on 2-February-2015 1:32 pm [GMT+0000] by TonyR) ---

Re: monophone recognition in continuous speech
User: Terminator
Date: 2/4/2015 6:36 am
Views: 2050
Rating: 1

Hi,

  Thanks, Tony, for the pointer. This worked: ( < $digit > )

In the task grammar, angle brackets denote one or more repetitions, which is what allows multiple phones in a single file.
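
To make the difference between the grammar forms in this thread concrete, here is a minimal sketch that writes the grammar text from a phone list. It covers the plain form (exactly one phone), the { } form (zero or more repetitions), and the < > form (one or more repetitions). The function name, the variable name `$phone`, and the `repetition` argument are hypothetical; the result would still need to be compiled into a word network with HParse (for example `HParse gram wdnet`, where `gram` is a placeholder file name).

```python
def build_phone_loop_grammar(phones, repetition="one_or_more"):
    """Return HTK task-grammar text that recognises the given phones.

    repetition:
      "single"       -> ( $phone )      exactly one phone per utterance
      "zero_or_more" -> ( { $phone } )  any number of phones, including none
      "one_or_more"  -> ( < $phone > )  at least one phone
    """
    brackets = {
        "single": ("", ""),
        "zero_or_more": ("{ ", " }"),
        "one_or_more": ("< ", " >"),
    }
    if repetition not in brackets:
        raise ValueError("unknown repetition: %r" % (repetition,))
    open_b, close_b = brackets[repetition]
    alternatives = " | ".join(phones)
    # First line defines the variable, second line is the top-level network.
    return "$phone = {};\n( {}$phone{} )\n".format(alternatives, open_b, close_b)


if __name__ == "__main__":
    # A shortened phone list for illustration; the real list has 40 entries.
    print(build_phone_loop_grammar(["aa", "b", "pau"]))
```

Generating the grammar from the same phone list that is passed to HVite keeps the two files from drifting apart when phones are added or renamed.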

--- (Edited on 2/4/2015 6:36 am [GMT-0600] by Terminator) ---
