VoxForge
Hi,
I'm quite new to HTK, but I was able to get simple word recognition working.
My problem is that I can't record audio files with HSLab: whenever I try, I get a truncated audio file.
For example, if I record the word HELLO and then press the play button, I hear only a piece of it (for example HE... or ...LL...).
I worked around this by using Audacity to record the WAV files.
The problem remains when I try to use HVite for real-time recognition, so the recognition is bad.
In both cases my terminal shows: WARNING [-6006] InitAudi: error dividing buffer in HVite.
Is my problem due to that warning?
I use Ubuntu 12.04 and HTK 3.4.1, and to launch HSLab and HVite I have to type in a terminal: $ padsp HVite ...etc...
Can anyone help me?
Thanks a lot.
--- (Edited on 3/24/2013 11:59 am [GMT-0500] by ) ---
I thought no one read these posts!
Anyway, I solved this problem a long time ago, and I was able to build a good speech recognition system with HTK.
The problem is that I can't remember exactly what I did!
I have to think about it, because I stumbled over so many problems with the HTK tools.
Let me remember what I did, then I'll answer you.
Give me a couple of hours or, in the worst case, until tomorrow.
--- (Edited on 8/20/2014 12:10 pm [GMT-0500] by Visitor) ---
Got it.
I was recording with the cheap built-in mic of a webcam.
I switched to the built-in mic of my laptop.
After that, HSLab worked fine.
If you want to use Audacity and write the .lab files by hand, you have to use the following parameters in order to get HVite working properly:
16 kHz, mono, 32-bit float (project format), exported as WAV 16-bit PCM.
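A quick way to confirm that an exported file really has these parameters is to read its header. The sketch below uses only Python's standard wave module; the function name check_wav_for_htk and its defaults are made up for illustration and are not part of HTK or Audacity.

```python
import wave


def check_wav_for_htk(path, rate=16000, channels=1, sample_width_bytes=2):
    """Compare a WAV header with the expected format and return a list of problems.

    The defaults (16 kHz, mono, 16-bit PCM) follow the settings quoted in this
    thread. The function itself is a hypothetical helper, not an HTK tool.
    """
    try:
        with wave.open(path, "rb") as wav:
            found_rate = wav.getframerate()
            found_channels = wav.getnchannels()
            found_width = wav.getsampwidth()
    except wave.Error as err:
        # The wave module only reads PCM data, so a 32-bit float export ends up here.
        return [f"cannot read as PCM WAV: {err}"]

    problems = []
    if found_rate != rate:
        problems.append(f"sample rate is {found_rate} Hz, expected {rate} Hz")
    if found_channels != channels:
        problems.append(f"{found_channels} channel(s), expected {channels}")
    if found_width != sample_width_bytes:
        problems.append(f"{8 * found_width}-bit samples, expected {8 * sample_width_bytes}-bit")
    return problems
```

Calling check_wav_for_htk on a recording (for example a file named hello.wav, a placeholder name) returns an empty list when the header matches, and one message per mismatch otherwise, so a stereo 44.1 kHz export would be reported twice.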
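Moreover, when you copy the time stamps (start time and end time) of a piece of the recording from Audacity, you must multiply them by 625 (if you use 16 kHz as above). HTK label times are in units of 100 ns and one sample at 16 kHz lasts 62.5 µs, so the factor 625 applies to positions given in samples.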
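The arithmetic behind the 625 factor can be sketched as follows. It assumes HTK's label time unit of 100 ns; the helper names (samples_to_htk, seconds_to_htk, lab_line) are hypothetical and only illustrate the conversion, whether you read positions from Audacity in samples or in seconds.

```python
HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are counted in 100 ns units


def samples_to_htk(sample_index, sample_rate=16000):
    """Convert a position in samples to HTK time units (x625 at 16 kHz)."""
    return round(sample_index * HTK_UNITS_PER_SECOND / sample_rate)


def seconds_to_htk(seconds):
    """Convert a position in seconds to HTK time units."""
    return round(seconds * HTK_UNITS_PER_SECOND)


def lab_line(start_seconds, end_seconds, label):
    """Build one line of a .lab file: start time, end time, label."""
    return f"{seconds_to_htk(start_seconds)} {seconds_to_htk(end_seconds)} {label}"


print(samples_to_htk(10))           # 6250, i.e. 10 samples * 625
print(lab_line(0.5, 1.2, "hello"))  # 5000000 12000000 hello
```

So a selection read in samples is multiplied by 625 at 16 kHz, while the same selection read in seconds is multiplied by 10,000,000.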
I'm quite sure that the warning will go away.
Let me know if it works for you.
Bye.
--- (Edited on 8/20/2014 12:31 pm [GMT-0500] by Visitor) ---
Thanks for your fast reply!
I'm sorry I didn't say before exactly what I'm trying to do. This is my first time using HTK, so I followed the HTK Book and the VoxForge tutorials to build a simple phone call recognizer. I recorded the WAV files using Audacity (I just realized I recorded in stereo at 44.1 kHz), converted them to MFCCs with TARGETKIND = MFCC_0_D_N_Z, and then trained the Markov models. When I ran HResults, it reported a word recognition rate of almost 98%. I was obviously happy that the thing seemed to be working.
Then I wanted to run the system live. The VoxForge tutorial says to use Julian for this. I used Julian and the results weren't what I expected: the recognition was very poor. So I decided to use HVite, as the HTK Book says. But when I tried to run it, I got that warning and a subsequent error that didn't let me finish the task.
I will set the recording parameters as you stated, though I don't understand what I must multiply by 625 :s.
I also have a question about the TARGETKIND in the MFCC config file, as the HTK Book uses MFCC_0 and I'm working with MFCC_0_D_N_Z. I read on another forum that HVite does not work with the one I'm using...
Anyway, thanks so much. I hope my English wasn't too painful to read, hehe.
Have a nice day.
--- (Edited on 8/20/2014 8:01 pm [GMT-0500] by albertigno) ---
Hi All,
Sorry for the late reply, but I'm very busy with my master's thesis...
First of all, I want to share a tutorial that helped me a lot during my first steps with HTK.
Here is the link:
http://www.labunix.uqam.ca/~boukadoum_m/DIC9315/Notes/Markov/HTK_basic_tutorial.pdf
Although it teaches you how to build a WORD recognition system, it can also be useful to anyone who wants to build a system based on phonemes.
This warning is caused by a bad audio recording at the very first step of the entire process.
Possible solutions that worked for me:
6250 62500 hello
Moreover, during the steps to build the entire system you are asked to write some configuration files (.conf).
If you use HSLab, you can simply use the config files as in the PDF I posted or as the HTK Book shows. If instead you use Audacity, you'll have to change the parameter SOURCEFORMAT from HTK to WAV (or WAVE, I don't remember...)
and make sure the others are as shown below:
# Parameters of the input signal
SOURCERATE = 625.0 # = 16 kHz
SOURCEKIND = HAUDIO
SOURCEFORMAT = WAV
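As a small sanity check on the SOURCERATE line: the value is the sample period expressed in 100 ns units, which is why 16 kHz corresponds to 625.0. The sketch below just does that division for a few sample rates; the function name source_rate is made up for illustration.

```python
def source_rate(sample_rate_hz):
    """Sample period in 100 ns units, the unit used by SOURCERATE."""
    return 10_000_000 / sample_rate_hz


# Print the value for a few common sample rates.
for hz in (8000, 16000, 44100):
    print(f"{hz} Hz -> SOURCERATE = {source_rate(hz):.1f}")
```

For a recording made at 44.1 kHz the period comes out at roughly 226.8 rather than 625.0.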
As soon as I find my config files, I'll post them.
I hope this can help someone.
Let me know.
--- (Edited on 9/7/2014 4:22 am [GMT-0500] by Visitor) ---
Thanks for the quick reply. Good to know there are some active users on here. I read what you wrote. I used Audacity and then wrote the .lab files for my recordings. I tried to use HSLab instead; however, I get the same WARNING.
The 1st is "WARNING [-6870] InstallFonts: Cannot load font *-Medium-R-Normal-*-19-*-*-*-*-*-*-* in HSLab"
The 2nd WARNING is "WARNING [-6006] InitAudi: error dividing buffer in HSLab", as soon as I click the rec button.
Could it simply be the hardware?
Any help is appreciated :)
--- (Edited on 9/7/2014 9:27 am [GMT-0500] by ) ---
Ignore the first warning.
--- (Edited on 9/7/2014 11:00 am [GMT-0500] by Visitor) ---
Cheers. However, when I run HVite and try to use it live, I get this:
READY[1]>
WARNING [-6006] InitAudi: error dividing buffer in HVite
Please speak sentence - measuring level
Level measurement completed
WARNING [-6006] InitAudi: error dividing buffer in HVite
which I ignored. Then it goes on and repeatedly outputs this message:
WARNING [-6006] ReadAudio: Failed to read all 0 samples from OSS audio in HVite
This is definitely doing my head in! I am not sure what the problem is. Not sure if this has occurred for you too.
--- (Edited on 9/9/2014 11:03 am [GMT-0500] by lusting4life) ---