Step 5 - Coding the (Audio) Data

Create codetrain.scp

HTK calls this last step in data preparation "parameterizing the raw speech waveforms into sequences of feature vectors". All this means is that HTK does not process wav files as efficiently as it does its own internal format. You therefore need to convert your audio wav files to another format called MFCC (Mel Frequency Cepstral Coefficients, which are more generally referred to as 'feature vectors').

You use the HCopy tool to convert your wav files to MFCC format. You have two options: execute the HCopy command by hand for each audio file you created in Step 3, or create a file that lists each source audio file together with the name of the MFCC file it will be converted to, and pass that file as a parameter to the HCopy command. We will use the second approach in this example.

Create the codetrain.scp HTK script file in your 'voxforge/tutorial' folder.
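
As described above, the script file pairs each source audio file with the MFCC file it will be converted to, one pair per line. A minimal Python sketch for generating such a file follows. The '../train/wav' and '../train/mfcc' folders, the sample file names, and the helper name build_scp_lines are assumptions for illustration, so adjust them to match where your Step 3 recordings actually live.

```python
from pathlib import PurePosixPath


def build_scp_lines(wav_names, wav_dir="../train/wav", mfcc_dir="../train/mfcc"):
    """Return one 'source target' line per wav file name (folders are assumed)."""
    lines = []
    for name in sorted(wav_names):
        stem = PurePosixPath(name).stem               # 'sample1.wav' -> 'sample1'
        source = PurePosixPath(wav_dir) / name        # wav file HCopy will read
        target = PurePosixPath(mfcc_dir) / (stem + ".mfc")  # mfc file HCopy will write
        lines.append(f"{source} {target}")
    return lines


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own recordings.
    names = ["sample1.wav", "sample2.wav"]
    with open("codetrain.scp", "w") as scp:
        scp.write("\n".join(build_scp_lines(names)) + "\n")
```

Run the script from your 'voxforge/tutorial' folder so that codetrain.scp is written where HCopy will later look for it.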

Config file

The HCopy command performs the conversion from wav format to MFCC. To do this, a configuration file (config) which specifies all the needed conversion parameters is required. Create a file called wav_config in your 'voxforge/tutorial' folder and add the following:

SOURCEFORMAT = WAV
TARGETKIND = MFCC_0_D
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12

If you would like more details on the contents of the config file, please see the HTK documentation.
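
As a quick orientation, HTK expresses time values such as TARGETRATE and WINDOWSIZE in units of 100 nanoseconds, and the qualifiers in TARGETKIND determine how many values each feature vector holds. The Python sketch below works through that arithmetic for the config above. The helper names are hypothetical and not part of HTK, and the size calculation only handles the _0, _E, _D and _A qualifiers.

```python
# 1 ms = 10,000 HTK time units, because one unit is 100 ns.
HTK_UNITS_PER_MS = 10_000


def htk_time_to_ms(value):
    """Convert an HTK time value (units of 100 ns) to milliseconds."""
    return value / HTK_UNITS_PER_MS


def vector_size(num_ceps, target_kind):
    """Estimate the feature vector length for a kind such as 'MFCC_0_D'.

    Only the _0, _E, _D and _A qualifiers are handled in this sketch.
    """
    qualifiers = target_kind.split("_")[1:]
    static = num_ceps
    if "0" in qualifiers:
        static += 1          # zeroth cepstral coefficient C0
    if "E" in qualifiers:
        static += 1          # log energy
    blocks = 1               # the static coefficients themselves
    if "D" in qualifiers:
        blocks += 1          # delta coefficients
    if "A" in qualifiers:
        blocks += 1          # acceleration coefficients
    return static * blocks


print(htk_time_to_ms(100000.0))     # frame shift (TARGETRATE) in ms
print(htk_time_to_ms(250000.0))     # analysis window (WINDOWSIZE) in ms
print(vector_size(12, "MFCC_0_D"))  # values per feature vector
```

With the values above, this works out to a 10 ms frame shift, a 25 ms analysis window, and 26 values per vector (12 cepstra plus C0, each with a delta).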

Create a new directory called 'mfcc' in your 'voxforge/train' folder. Then execute HCopy from your 'voxforge/tutorial' folder as follows:

HCopy -A -D -T 1 -C wav_config -S codetrain.scp

This creates, in your 'voxforge/train/mfcc' folder, a series of mfc files corresponding to the files listed in your codetrain.scp script. Be sure to monitor the output of the HCopy command to ensure that all wav files get processed properly. Most problems are related to file paths or audio files in a non-wav format.
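
Because most failures come from bad paths or non-wav audio, it can help to check the script file before running HCopy. The Python sketch below is one possible pre-flight check and is not part of HTK. It reads the source path from each line of a script file and reports entries that are missing or that Python's standard wave module cannot parse. The function name check_scp is hypothetical, and a file that passes or fails this check is not guaranteed to behave the same way in HCopy.

```python
import os
import wave


def check_scp(scp_path):
    """Return (source_path, problem) tuples for script entries that look wrong."""
    problems = []
    with open(scp_path) as scp:
        for line in scp:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            source = parts[0]  # first field is the source audio file
            if not os.path.isfile(source):
                problems.append((source, "file not found"))
                continue
            try:
                # wave.open raises if the file lacks a valid RIFF/WAVE header.
                with wave.open(source, "rb"):
                    pass
            except (wave.Error, EOFError) as err:
                problems.append((source, f"not a readable wav file: {err}"))
    return problems


if __name__ == "__main__":
    for path, problem in check_scp("codetrain.scp"):
        print(f"{path}: {problem}")
```

Run it from your 'voxforge/tutorial' folder so that relative paths in the script file resolve the same way they will for HCopy.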

Comments

ERROR [+6210] OpenWaveInput: Cannot open waveform

By mawahballah - 2/18/2018 Hello everyone,

illegible values in step 5

By Asim12 - 1/19/2013

Error 6210

By Asim12 - 1/2/2013 - 6 Replies I am getting an error in step 5: cannot open waveform file S0001.wav. I have heard that I need to use sox to convert my audio files into wav files, but how do I use it? Please guide me. The Sox cheat file doesn't help me with this. I am following this tutorial, step 5 point number 2.

Hcopy command error please help

By prashanth123 - 10/20/2012 - 3 Replies

HLEd word dose not exist problem

By ripul_88 - 7/5/2012 - 1 Replies

untitled

By Stewie - 11/16/2011 - 1 Replies I was also in trouble about that issue

unable to set MFCC_E_D_N_Z as target for coding

By ashwin - 6/22/2011 - 1 Replies I am not able to set MFCC_E_D_N_Z as the target for coding the wav files.

Configuration file

By gbernardi - 3/3/2011 - 1 Replies Hi, I don't understand why it is not a problem that the configuration files we use for coding the data and the one used by HcompV do not have the same TARGETKIND...

problem with the last part!

By Hossein Khaki - 2/8/2011 - 3 Replies

Doubt regarding the output of hcopy

By Tom George - 11/28/2010 - 2 Replies I used the HCopy command to extract the features of a wave file. But instead of getting numerical output I got an mfc file of illegible characters.

problems with wav format

By j17 - 9/27/2010 - 2 Replies

Problem with copy

By bejimed - 8/13/2010 - 1 Replies can anyone help me

Hcopy output

By novision - 5/20/2010 - 2 Replies Hii all,

Running HCopy

By Gothrog - 3/22/2010 - 4 Replies This is my first time running through this example on Step5.

TIMIT wav file have problems!!

By spring - 12/23/2009 - 2 Replies Hi,ken,I ask you for help!

Re: need your help

By Amit Surana - 5/30/2009 Please check the path of wav files in codetrain.scp

need some help

By joshua - 3/30/2009 - 4 Replies Hi i would need some help on this, can anyone help me?

