- Why do we use DCT in MFCC?
- What does MFCC extract?
- What is the output of MFCC?
- How is MFCC used in speech recognition?
Why do we use DCT in MFCC?
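The DCT is the last step of the main MFCC feature-extraction process. Its purpose is to decorrelate the log mel spectrum values, which are strongly correlated across neighbouring filterbank channels, and to compact most of the spectral-envelope information into the first few coefficients. The result is a good representation of the local spectral properties. Conceptually, the DCT plays the same role here as the inverse Fourier transform does in classical cepstral analysis: because the log mel spectrum is real-valued, a cosine transform is sufficient.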
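The energy-compaction effect can be illustrated with a minimal sketch. It uses a plain, unnormalized DCT-II written out by hand, and the `log_mel` values are hypothetical numbers chosen to resemble a smooth log mel spectrum for one frame; they are not taken from real speech.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_points = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_points * (n + 0.5) * k)
            for n in range(n_points))
        for k in range(n_points)
    ]

# Hypothetical log mel energies for one frame (assumed values, smooth across channels).
log_mel = [5.0, 5.2, 5.1, 4.8, 4.4, 4.1, 3.9, 3.8]

coeffs = dct2(log_mel)

# Share of the squared coefficient magnitude held by the first three coefficients.
low = sum(c * c for c in coeffs[:3])
total = sum(c * c for c in coeffs)
print("coefficients:", [round(c, 3) for c in coeffs])
print("share in first 3 coefficients:", round(low / total, 4))
```

For a smooth input like this one, almost all of the magnitude ends up in the lowest-order coefficients, which is why MFCC implementations usually keep only the first dozen or so.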
What does MFCC extract?
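MFCC extracts a compact description of the short-term spectral envelope of each frame of the signal. The technique consists of windowing the signal, applying the DFT, taking the magnitude (or power) of the spectrum, warping the frequencies onto the mel scale with a filterbank, taking the log of the filterbank energies, and finally applying the DCT.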
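The steps above can be sketched in NumPy as follows. This is an illustrative implementation under stated assumptions, not a reference one: the frame length, hop size, FFT size, number of filters, and number of coefficients are common but arbitrary choices, the mel formula is the widely used `2595 * log10(1 + f / 700)` variant, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale, shape (n_filters, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    fbank = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def mfcc(signal, sample_rate, frame_len=400, hop=160, n_fft=512, n_filters=26, n_ceps=13):
    # 1. Split into overlapping frames and apply a Hamming window to each.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. DFT of each frame, then the power spectrum.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    # 3. Warp to the mel scale with the triangular filterbank.
    mel_energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    # 4. Log of the filterbank energies (small offset avoids log(0)).
    log_mel = np.log(mel_energies + 1e-10)
    # 5. Unnormalized DCT-II across the filter axis; keep the first n_ceps coefficients.
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    basis = np.cos(np.pi / n_filters * (n + 0.5) * k)  # shape (n_ceps, n_filters)
    return log_mel @ basis.T

# Usage on a synthetic one-second 440 Hz tone (a stand-in for real speech).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(signal, sample_rate)
print(features.shape)
```

Production systems typically add pre-emphasis, liftering, and delta features, all of which this sketch leaves out for brevity.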
What is the output of MFCC?
The output of MFCC extraction is a matrix of feature vectors, one per frame. Each row corresponds to a frame and each column to one coefficient of that frame's feature vector [1-4]. This matrix is then passed to the classification stage.
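As a small worked example of the matrix dimensions, the sketch below computes how many rows and columns to expect. The helper name and the parameter values (16 kHz audio, 25 ms frames, 10 ms hop, 13 coefficients) are assumptions for illustration, and it assumes frames are not padded at the signal edges.

```python
def mfcc_matrix_shape(n_samples, frame_len, hop, n_ceps):
    """Return (rows, columns) of the MFCC matrix: one row per frame, one column per coefficient."""
    if n_samples < frame_len:
        return (0, n_ceps)  # signal too short for even one full frame
    n_frames = 1 + (n_samples - frame_len) // hop
    return (n_frames, n_ceps)

# One second of 16 kHz audio, 400-sample frames, 160-sample hop, 13 coefficients.
shape = mfcc_matrix_shape(16000, 400, 160, 13)
print(shape)  # (98, 13)
```

With this layout, element `[i][j]` of the matrix is the `j`-th coefficient of frame `i`, so a classifier can consume the rows as a time-ordered sequence of feature vectors.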
How is MFCC used in speech recognition?
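MFCCs are popular features extracted from speech signals for use in recognition tasks. In the source-filter model of speech, MFCCs are understood to represent the filter (the vocal tract). The frequency response of the vocal tract is relatively smooth, whereas the source of voiced speech can be modeled as an impulse train. The low-order cepstral coefficients capture the smooth envelope, which carries most of the phonetic information, while largely discarding the pitch-related fine structure contributed by the source.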
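The source-filter argument can be written out as a short derivation. The symbols are introduced here for illustration: $s[n]$ is the speech signal, $e[n]$ the excitation (source), $h[n]$ the vocal tract impulse response (filter), and $c[q]$ the cepstrum at quefrency $q$. The derivation is for the plain cepstrum; the mel filterbank in MFCC adds further smoothing that is not shown.

```latex
\begin{align}
  s[n] &= e[n] * h[n]
    && \text{(convolution in time)} \\
  |S(\omega)| &= |E(\omega)|\,|H(\omega)|
    && \text{(product in frequency)} \\
  \log|S(\omega)| &= \log|E(\omega)| + \log|H(\omega)|
    && \text{(sum after the log)} \\
  c_s[q] &= c_e[q] + c_h[q]
    && \text{(linear transform of the log spectrum)}
\end{align}
```

Because $\log|H(\omega)|$ varies slowly with frequency, $c_h[q]$ is concentrated at low $q$, while the harmonic ripple of $\log|E(\omega)|$ for voiced speech appears at higher $q$, near the pitch period. Keeping only the first few coefficients therefore retains mostly the filter term.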