Acoustic Model Discussions

Acoustic Models at different frequencies
User: John
Date: 9/17/2008 10:15 am

I am looking for acoustic models (for use with Sphinx) that were trained on audio sampled at 22 kHz or 44 kHz (or even 11 kHz). My problem is that my audio is recorded through the Flash Microphone class, which only supports 5.5(8) kHz, 11 kHz, 22 kHz and 44 kHz.

I would believe a proper solution would be to use a model trained to the desired frequency rather than downsampling for each audio file.

What frequency is the voxForge-en model for Sphinx? Is this also 16 kHz?


--- (Edited on 9/17/2008 10:15 am [GMT-0500] by Visitor) ---

Re: Acoustic Models at different frequencies
User: John
Date: 9/17/2008 12:24 pm

I should clarify that I am speaking about the sampling rate of the audio.

--- (Edited on 9/17/2008 12:24 pm [GMT-0500] by Visitor) ---

Re: Acoustic Models at different frequencies
User: kmaclean
Date: 9/17/2008 1:13 pm

Hi John,

>I would believe a proper solution would be to use a model trained to the desired frequency rather than downsampling for each audio file.

You are correct. The ideal solution is to collect audio only from your target environment, and only at the sampling rate that your microphone and audio software best support. However, such an approach is very costly. Check out the LDC (Linguistic Data Consortium): most speech there was recorded at either an 8 kHz or a 16 kHz sampling rate.

Downsampling is a reasonable compromise. That way, the same audio can be reused to train acoustic models for different applications.
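
As a rough illustration of what downsampling involves, here is a minimal pure-Python sketch that converts samples from one rate to another (for example 44.1 kHz down to 16 kHz) using a Hann-windowed sinc kernel. The function name `resample` and its `half_width` parameter are made up for this example and are not part of Sphinx or any VoxForge tooling. The key point is the low-pass step: anything above half the target rate has to be filtered out before the rate is reduced, otherwise it aliases into the speech band. In practice a dedicated resampling tool would normally be used instead.

```python
import math


def resample(samples, src_rate, dst_rate, half_width=16):
    """Resample a list of floats from src_rate to dst_rate (both in Hz).

    Illustrative sketch only: band-limited interpolation with a
    Hann-windowed sinc kernel. half_width is the number of sinc zero
    crossings kept on each side of the kernel centre.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sampling rates must be positive")

    # When downsampling, lower the filter cutoff to the new Nyquist
    # frequency so that higher frequencies do not alias.
    cutoff = min(1.0, dst_rate / src_rate)
    support = half_width / cutoff  # kernel half-width in input samples
    n_out = len(samples) * dst_rate // src_rate

    out = []
    for m in range(n_out):
        t = m * src_rate / dst_rate  # output position in input-sample units
        lo = max(0, math.ceil(t - support))
        hi = min(len(samples) - 1, math.floor(t + support))
        acc = 0.0
        for k in range(lo, hi + 1):
            x = (t - k) * cutoff
            sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
            window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
            acc += samples[k] * cutoff * sinc * window
        out.append(acc)
    return out


# Example: a 440 Hz test tone recorded at 44.1 kHz, converted to 16 kHz.
tone_44k = [math.sin(2 * math.pi * 440 * n / 44100) for n in range(4410)]
tone_16k = resample(tone_44k, 44100, 16000)
```

A larger `half_width` gives a sharper filter at the cost of more computation per output sample.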

>What frequency is the voxForge-en model for Sphinx? Is this also 16 kHz?

16 kHz (according to the voxforge-en/logdir/decode/voxforge_en_sphinx-1-1.log file).



--- (Edited on 9/17/2008 2:13 pm [GMT-0400] by kmaclean) ---
