Speech Recognition in the News

Zero Crossings as an Effective Feature In Speech Recognition for Embedded Applications
User: kmaclean
Date: 3/21/2008 12:02 pm
Views: 3160
Rating: 18

This is an interesting article on using zero crossings instead of the feature vectors (such as the MFCCs we use with HTK/Julius) that are traditionally used in speech recognition. Shubhendu Trivedi wanted to build a speaker-dependent, isolated-word speech recognizer for an 8051 micro-controller, but traditional HMM approaches using MFCC-based feature vectors were too computationally intensive to run on this controller.

He found a paper that provided the solution. In it, the authors describe a way to build the feature vector using only the zero crossings of the speech signal. Shubhendu says in his article:

This feature vector is basically the histogram of the time interval between successive zero-crossings of the utterance in a short time window. These feature vectors for each window are then combined together to form a feature matrix. Since we are dealing with only small time series (isolated words), we can employ Dynamic Time Warping to compare the input matrix with the reference matrix stored.
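
To make the quoted description concrete, here is a rough Python sketch of the three steps it names: a per-window histogram of zero-crossing intervals, a feature matrix built from those histograms, and a Dynamic Time Warping comparison. It is not the code from the paper or from Shubhendu's 8051 implementation. The function names, the bin count, the interval cap, the window and hop sizes, and the L1 frame distance are all assumptions chosen for illustration.

```python
import math

def zero_crossing_indices(samples):
    """Indices i where the sign changes between samples[i-1] and samples[i]."""
    return [i for i in range(1, len(samples))
            if (samples[i - 1] < 0) != (samples[i] < 0)]

def interval_histogram(frame, num_bins=8, max_interval=32):
    """Histogram of the sample counts between successive zero crossings.

    num_bins and max_interval are illustrative values, not from the paper.
    """
    crossings = zero_crossing_indices(frame)
    hist = [0] * num_bins
    for a, b in zip(crossings, crossings[1:]):
        interval = min(b - a, max_interval)  # long intervals fall in the last bin
        bin_index = min((interval - 1) * num_bins // max_interval, num_bins - 1)
        hist[bin_index] += 1
    return hist

def feature_matrix(samples, frame_len=200, hop=100):
    """One histogram per short window, stacked row by row (assumed sizes)."""
    return [interval_histogram(samples[s:s + frame_len])
            for s in range(0, len(samples) - frame_len + 1, hop)]

def dtw_distance(a, b):
    """Textbook DTW cost between two feature matrices, L1 distance per frame."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = sum(abs(x - y) for x, y in zip(a[i - 1], b[j - 1]))
            cost[i][j] = d + min(cost[i - 1][j],      # skip a frame of b
                                 cost[i][j - 1],      # skip a frame of a
                                 cost[i - 1][j - 1])  # match both frames
    return cost[len(a)][len(b)]

def tone(freq_hz, n=1600, rate=8000):
    """Synthetic stand-in for a recorded word: a sine tone with a small phase offset."""
    return [math.sin(2 * math.pi * freq_hz * t / rate + 0.1) for t in range(n)]

low = feature_matrix(tone(300))
high = feature_matrix(tone(1200))
print(dtw_distance(low, low))       # identical matrices give 0.0
print(dtw_distance(low, high) > 0)  # different tones give a positive cost
```

In a recognizer along these lines, the input word would presumably be matched to whichever stored reference matrix gives the lowest DTW cost. Everything here uses only integer counts and additions per frame, which hints at why the approach suits a small controller, although a real 8051 port would look quite different from this sketch.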

 
