VoxForge
Hi!
I'm trying to develop a pitch-based recognition system, so I need to use external coefficient vectors that come from a pitch detector I've programmed. I use a MATLAB function to write a .htk file in which the pitch coefficients are appended to the header. If I run "HList -h -S train.scp", for example, I get the following for one .htk file:
------------------------------------- Source: ns5 ------------------------------
Sample Bytes: 200 Sample Kind: USER
Num Comps: 50 Sample Period: 10000.0 us
Num Samples: 1 File Format: HTK
------------------------------------ Samples: 0->-1 ----------------------------
0: -1.067 2.653 2.653 2.653 2.653 2.653 6.580 6.580 6.580 15.101
15.101 15.101 15.101 15.101 15.101 6.580 6.580 6.580 2.653 2.653
-1.067 -2.828 -4.588 -4.588 -11.079 -11.079 -11.079 -11.079 -11.079 -7.923
-6.256 -4.588 2.653 2.653 2.653 2.653 2.653 2.653 2.653 2.653
-4.588 -1.067 -4.588 -4.588 -4.588 -4.588 -7.923 -16.207 -16.207 -24.492
----------------------------------------- END ----------------------------------
I set the sample period to 10 ms because my pitch detector estimates pitch from 10 ms segments (I don't know if this is correct). The vector has 50 coefficients, and I set the sample kind to USER.
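For reference, the figures HList prints come from the 12-byte HTK header: number of samples, sample period in 100 ns units, bytes per sample, and a parameter-kind code. Below is a minimal Python sketch of that layout, assuming the default big-endian byte order and uncompressed 4-byte floats. The function names are made up for illustration and are not part of HTK or of the MATLAB function used here.

```python
import struct

HTK_USER = 9  # parameter-kind code for USER in the HTK header


def pack_htk_header(n_samples, n_comps, period_us, parm_kind=HTK_USER):
    """Build the 12-byte header: nSamples, sampPeriod, sampSize, parmKind."""
    samp_period = int(round(period_us * 10))  # HTK stores time in 100 ns units
    samp_size = 4 * n_comps                   # assumes uncompressed 4-byte floats
    return struct.pack(">iihh", n_samples, samp_period, samp_size, parm_kind)


def describe_htk_header(header):
    """Decode a 12-byte header into the quantities HList reports."""
    n_samples, samp_period, samp_size, parm_kind = struct.unpack(">iihh", header)
    return {
        "num_samples": n_samples,
        "period_us": samp_period / 10.0,
        "sample_bytes": samp_size,
        "num_comps": samp_size // 4,
        "parm_kind": parm_kind,
    }


# The file shown above: 1 sample of 50 components every 10 ms.
print(describe_htk_header(pack_htk_header(1, 50, 10000.0)))
# The transposed layout: 50 samples of 1 component each.
print(describe_htk_header(pack_htk_header(50, 1, 10000.0)))
```

With 50 components per sample the header gives 200 sample bytes, which matches the HList output; the transposed layout gives 4 sample bytes and 50 samples.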
When I run HERest I get the following error for every .htk file, each of which corresponds to one utterance:
ERROR [-7324] Bad data or over pruning.
It isn't a matching problem. I don't think it is a threshold problem either, because I relaxed the pruning thresholds over the range 0 to 10000. The problem is probably that I have only one sample with 50 components. But if I did it the other way round, I would have 50 samples with just 1 component each. So how can I increase both the number of samples and the number of components if I only have one value (the pitch on the Mel scale) per 10 ms segment?
How should I set up the config file when my files already contain coefficient vectors? For example, I don't need SOURCEKIND, but TARGETKIND is still there, because my .htk files already exist (I don't need HCopy; my files are the equivalent of the .mfc files in the HTKBook example).
Thank you very much.
--- (Edited on 10/12/2010 1:40 pm [GMT-0500] by iloes) ---
I managed to solve the problem. As I suspected, it was related to the number of coefficients. In the end I transposed the coefficient vector (using only one coefficient per sample) and then increased the number of features by adding energy, delta and acceleration coefficients. This way everything runs to the end without any problem.
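To make the fix concrete, here is a rough Python sketch of turning a one-value-per-10-ms pitch track into frames of [pitch, delta, acceleration]. It assumes the regression formula for deltas given in the HTKBook, a window of 2 frames, and replication of the first and last values at the edges. The post does not say how the deltas were actually computed, and energy is left out because it needs the waveform; `deltas` and `pitch_frames` are hypothetical names.

```python
def deltas(track, window=2):
    """Regression deltas: sum_k k*(c[t+k] - c[t-k]) / (2 * sum_k k^2).

    Frames beyond either end are replaced by the first/last value.
    """
    n = len(track)
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        acc = 0.0
        for k in range(1, window + 1):
            ahead = track[min(t + k, n - 1)]
            behind = track[max(t - k, 0)]
            acc += k * (ahead - behind)
        out.append(acc / denom)
    return out


def pitch_frames(pitch_track):
    """One frame per pitch value: [pitch, delta, acceleration]."""
    d = deltas(pitch_track)
    a = deltas(d)  # acceleration taken as the delta of the deltas
    return [[p, dd, aa] for p, dd, aa in zip(pitch_track, d, a)]


# First few values from the HList dump above, now one per frame.
track = [-1.067, 2.653, 2.653, 2.653, 2.653, 2.653, 6.580]
for frame in pitch_frames(track):
    print(["%.3f" % v for v in frame])
```

Each utterance then has as many samples as there are 10 ms segments, with 3 components per sample instead of 1.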
Thanks.
--- (Edited on 10/15/2010 8:30 am [GMT-0500] by iloes) ---
Hi, I also want to use different features for speech recognition, such as PLP or RASTA, instead of MFCC. Can you tell me how you converted your own features (matrices in MATLAB) into .htk format?
I followed the steps from this site up to step 5.
After that I used the "writhtk function" in MATLAB, with which I wrote the PLP coefficients "in htk format". Instead of MFCC files, I saved them as
C:\cygwin\home\Administrator\voxforge\train\mfcc\sample1.plp and so on
I changed codetrain and train.scp accordingly.
But when I run HCompV I get the following error:
Administrator@vishal /home/Administrator/voxforge/auto
$ HCompV -A -D -T 1 -C config_p -f 0.01 -m -S train_p.scp -M hmm0 proto_p
C:\cygwin\HTK\htk-3.3-windows-binary_2\htk\HCompV.exe -A -D -T 1 -C config_p -f 0.01 -m -S train_p.scp -M hmm0 proto_p
HTK Configuration Parameters[10]
Module/Tool Parameter Value
# NUMCEPS 12
# CEPLIFTER 22
# NUMCHANS 26
# PREEMCOEF 0.970000
# USEHAMMING TRUE
# WINDOWSIZE 250000.000000
# SAVEWITHCRC TRUE
# SAVECOMPRESSED TRUE
# TARGETRATE 100000.000000
# TARGETKIND PLP_D_A_Z_0
HMM Def Error: LoadHMMSet: Inconsistent HMM macro name at line 1/col 0/char -1 in proto_p
ERROR [+7050] HMError:
ERROR [+2028] Initialise: LoadHMMSet failed
FATAL ERROR - Terminating program C:\cygwin\HTK\htk-3.3-windows-binary_2\htk\HCompV.exe
--- (Edited on 10/16/2010 4:43 pm [GMT-0500] by aspirant) ---
Hi,
I am also thinking of using feature vectors other than MFCC, but I am not sure how to implement PLP.
I am very new to this, and my project supervisors have basically left me to get a feel for it. Correct me if I am wrong: I can switch to PLP by changing TARGETKIND, right?
What changes do I have to make to my wave files and codetrain?
--- (Edited on 3/1/2011 6:07 am [GMT-0600] by Visitor) ---
I meant other than the obvious renaming of all MFCC-related folders and extensions to PLP. I read that some users have reconfigured the deltas etc.; are these required?
Thanks.
--- (Edited on 3/1/2011 6:10 am [GMT-0600] by Visitor) ---
Cross posted here: unable to run HcompV
--- (Edited on 10/16/2010 7:43 pm [GMT-0400] by kmaclean) ---
Al Salam Alikoum,
I am also working with a new feature extraction method for phone recognition, but the percent correct is very low (I get 25%), and the big problem is that the accuracy is negative.
Can anyone help me improve the results and get a reasonable accuracy value?
--- (Edited on 7/20/2012 7:57 am [GMT-0500] by Visitor) ---
Hi all,
I'm basically working on the same idea. Here's what I am doing: I use HTK to convert my .wav files into .mfc files, then I use HList to convert the .mfc files to text files. I get the pitch values from Praat, and I'm trying to append them to the corresponding MFCC vectors. Here's my problem. Say a wav file is 79 seconds long. I have set TARGETRATE=100000 and WINDOWSIZE=200000, so (please correct me if I'm wrong) I should be getting about 7900 vectors in my .mfc file. Instead I get 3950 vectors (half of what I expected). I am sure I am missing something. Please help!
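As a sanity check on the arithmetic, here is a small Python sketch of the expected frame count. It assumes TARGETRATE and WINDOWSIZE are in HTK's 100 ns units and that the last partial window is dropped; the function names are hypothetical. It does not diagnose the cause. It only shows how much audio 3950 frames would correspond to, which may help when comparing against what HTK believes the source sample rate and duration to be.

```python
def expected_frames(duration_s, target_rate=100000, window_size=200000):
    """Frames for a signal of duration_s seconds (rates in 100 ns units)."""
    units = int(round(duration_s * 1e7))  # seconds -> 100 ns units
    if units < window_size:
        return 0
    # One frame for the first full window, then one per frame shift.
    return (units - window_size) // target_rate + 1


def implied_duration_s(n_frames, target_rate=100000, window_size=200000):
    """Duration (seconds) that would yield exactly n_frames frames."""
    return ((n_frames - 1) * target_rate + window_size) / 1e7


print(expected_frames(79.0))     # frames expected for 79 s at a 10 ms shift
print(implied_duration_s(3950))  # seconds of audio that 3950 frames cover
```

So 79 seconds should indeed give close to 7900 vectors, and 3950 vectors correspond to about 39.5 seconds of audio as HTK sees it.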
--- (Edited on 9/5/2012 6:06 am [GMT-0500] by Visitor) ---