VoxForge
I'm new to ASR, so I started reading some resources, and every resource I found says I have to label my audio files using HSLab or Audacity. What I don't get is why I need to label my audio at all. My second question: since I have one audio file per word (sil word sil), is there an automated way to do that? My third question: some people label at the phone level and some label the whole word. What is the difference, and what is the advantage of labeling with phones?
--- (Edited on 1/9/2013 3:07 am [GMT-0600] by serak) ---
> but the reason I don't get is why do I label my audio?
Because the acoustic model training program needs labels to know which unit (a word or a phone) each stretch of sound corresponds to. See the post "What is an Acoustic Model?" for an overview of the recognition process; it explains why you need labels for training acoustic models.
> I have one audio file for one word (sil word sil); is there an automated way to do that?
I don't understand your question
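> some guys label at the phone level and some label using the whole word?
You can combine phone-level HMMs to recognize words, so the same phone models are reused across many words and less training audio is needed. With whole-word labels, each word needs its own model and its own training examples.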
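To make the phone-reuse point concrete, here is a minimal sketch of how a word-level transcription expands into phone labels through a pronunciation dictionary. The lexicon entries and phone symbols below are made up for illustration and are not taken from any real dictionary; a real setup would load them from its own lexicon file.

```python
# Hypothetical mini-lexicon: the words and phone symbols are illustrative only.
LEXICON = {
    "hello": ["hh", "ax", "l", "ow"],
    "low": ["l", "ow"],
    "hollow": ["hh", "aa", "l", "ow"],
}


def word_to_phone_labels(words, lexicon=LEXICON):
    """Expand a word sequence into phone labels, with 'sil' at both ends."""
    phones = ["sil"]
    for word in words:
        phones.extend(lexicon[word.lower()])
    phones.append("sil")
    return phones


def phone_inventory(lexicon=LEXICON):
    """Return the sorted set of distinct phones used by the whole lexicon."""
    return sorted({p for phones in lexicon.values() for p in phones})


if __name__ == "__main__":
    print(word_to_phone_labels(["hello"]))
    print(phone_inventory())
```

In this toy lexicon, three words are covered by five phone models, and every recording of "low" or "hollow" also contributes training data for the "l" and "ow" models used in "hello".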
--- (Edited on 1/12/2013 3:29 pm [GMT-0500] by kmaclean) ---
I mean, for example, if I am using a single word like "hello" per audio file, is there software that can isolate my speech, trim out the silent part, and label the audio with my filename? E.g., if my audio file is hello.wav, the label would be something like this:
0393883 8847837 sil
7743599 9883777 hello
10009388 28999818 sil
--- (Edited on 1/14/2013 6:43 am [GMT-0600] by serak) ---
>software that can isolate my speech and trim out the silent
>part and label the audio with my filename
I think you want to do forced alignment, which can generate a file like this: aligned.out
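For the one-word-per-file case, a much cruder alternative can be sketched with a simple energy threshold. This is not forced alignment (which needs an acoustic model and a transcription); it only marks the loudest stretch of the file as the word and the rest as sil. The sketch assumes 16-bit mono WAV input and HTK-style label times in units of 100 ns. The function names, the 10 ms frame size, and the 10% threshold are all assumptions chosen for illustration.

```python
import array
import math
import os
import sys
import wave

HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are in units of 100 ns


def read_mono_16bit_wav(path):
    """Read a 16-bit mono WAV file; return (list of samples, sample rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono WAV file")
        rate = wf.getframerate()
        samples = array.array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples), rate


def find_speech_span(samples, rate, frame_ms=10, threshold_ratio=0.1):
    """Return (start, end) sample indices of the loud region, or None."""
    frame_len = max(1, rate * frame_ms // 1000)
    rms = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        rms.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    if not rms or max(rms) == 0:
        return None
    threshold = threshold_ratio * max(rms)
    loud = [i for i, r in enumerate(rms) if r > threshold]
    return loud[0] * frame_len, (loud[-1] + 1) * frame_len


def make_labels(samples, rate, word):
    """Build 'start end label' lines: sil, word, sil."""
    def to_htk(n):
        return n * HTK_UNITS_PER_SECOND // rate

    total = len(samples)
    span = find_speech_span(samples, rate)
    if span is None:
        return [f"0 {to_htk(total)} sil"]
    start, end = span
    lines = []
    if start > 0:
        lines.append(f"0 {to_htk(start)} sil")
    lines.append(f"{to_htk(start)} {to_htk(end)} {word}")
    if end < total:
        lines.append(f"{to_htk(end)} {to_htk(total)} sil")
    return lines


def label_wav(path):
    """Label a WAV file, using its base filename (hello.wav -> hello) as the word."""
    word = os.path.splitext(os.path.basename(path))[0]
    samples, rate = read_mono_16bit_wav(path)
    return make_labels(samples, rate, word)
```

Calling `label_wav("hello.wav")` would return three lines in the same shape as the example above. A fixed energy threshold breaks down with background noise or quiet word endings, which is where forced alignment does better.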
--- (Edited on 1/14/2013 2:21 pm [GMT-0500] by kmaclean) ---