dc.contributor.author |
Tsiakoulis, P |
en |
dc.contributor.author |
Potamianos, A |
en |
dc.contributor.author |
Dimitriadis, D |
en |
dc.date.accessioned |
2014-03-01T01:34:38Z |
|
dc.date.available |
2014-03-01T01:34:38Z |
|
dc.date.issued |
2010 |
en |
dc.identifier.issn |
1070-9908 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/20781 |
|
dc.subject |
First spectral moment |
en |
dc.subject |
Low order cepstral coefficients |
en |
dc.subject |
Robust speech recognition |
en |
dc.subject |
SMAC |
en |
dc.subject.classification |
Engineering, Electrical & Electronic |
en |
dc.subject.other |
Automatic speech recognition |
en |
dc.subject.other |
Central frequency |
en |
dc.subject.other |
Cepstral coefficients |
en |
dc.subject.other |
Frequency domains |
en |
dc.subject.other |
Low order |
en |
dc.subject.other |
Robust ASR |
en |
dc.subject.other |
Robust speech recognition |
en |
dc.subject.other |
Spectral moments |
en |
dc.subject.other |
Spectral tilt |
en |
dc.subject.other |
Speech spectra |
en |
dc.subject.other |
Time-frequency distributions |
en |
dc.subject.other |
Speech recognition |
en |
dc.title |
Spectral moment features augmented by low order cepstral coefficients for robust ASR |
en |
heal.type |
journalArticle |
en |
heal.identifier.primary |
10.1109/LSP.2010.2046349 |
en |
heal.identifier.secondary |
5437270 |
en |
heal.identifier.secondary |
http://dx.doi.org/10.1109/LSP.2010.2046349 |
en |
heal.language |
English |
en |
heal.publicationDate |
2010 |
en |
heal.abstract |
We propose a novel Automatic Speech Recognition (ASR) front-end that consists of the first central Spectral Moment time-frequency distribution Augmented by low order Cepstral coefficients (SMAC). We prove that the first central spectral moment is proportional to the spectral derivative with respect to the filter's central frequency. Consequently, the spectral moment is an estimate of the frequency domain derivative of the speech spectrum. However, information related to the entire speech spectrum, such as the energy and the spectral tilt, is not adequately modeled. We propose adding this information with a few cepstral coefficients. Furthermore, we use a mel-spaced Gabor filterbank with 70% frequency overlap in order to overcome the sensitivity to pitch harmonics. The novel SMAC front-end was evaluated on a speech recognition task under a variety of recording conditions. The experimental results show that SMAC performs at least as well as the standard MFCC front-end in clean conditions, and significantly outperforms MFCCs in noisy conditions. © 2006 IEEE. |
en |
heal.publisher |
IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
en |
heal.journalName |
IEEE Signal Processing Letters |
en |
dc.identifier.doi |
10.1109/LSP.2010.2046349 |
en |
dc.identifier.isi |
ISI:000277048600004 |
en |
dc.identifier.volume |
17 |
en |
dc.identifier.issue |
6 |
en |
dc.identifier.spage |
551 |
en |
dc.identifier.epage |
554 |
en |
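
The abstract states that the first central spectral moment is proportional to the derivative of the spectrum with respect to the filter's central frequency. The sketch below illustrates that relationship numerically for a single band. It is not the paper's implementation: the Gaussian (Gabor-like) frequency weighting, the toy two-peak spectrum, and all names (gaussian_weight, band_energy, first_central_moment, fc, sigma) are assumptions introduced here for illustration. Under a Gaussian weight of bandwidth sigma, the unnormalized first central moment divided by sigma squared should match the derivative of the band energy with respect to fc.

```python
import numpy as np

def gaussian_weight(freqs, fc, sigma):
    # Assumed Gaussian frequency weighting centered at fc (Gabor-like band).
    return np.exp(-((freqs - fc) ** 2) / (2.0 * sigma ** 2))

def band_energy(spectrum, freqs, fc, sigma):
    # Weighted energy of the band (Riemann sum over a uniform frequency grid).
    df = freqs[1] - freqs[0]
    return np.sum(spectrum * gaussian_weight(freqs, fc, sigma)) * df

def first_central_moment(spectrum, freqs, fc, sigma):
    # Unnormalized first spectral moment taken about the center frequency fc.
    df = freqs[1] - freqs[0]
    w = gaussian_weight(freqs, fc, sigma)
    return np.sum((freqs - fc) * spectrum * w) * df

# Toy power spectrum with two broad peaks (hypothetical, not speech data).
freqs = np.linspace(0.0, 4000.0, 4001)
spectrum = (1.0 / (1.0 + ((freqs - 1200.0) / 300.0) ** 2)
            + 0.5 / (1.0 + ((freqs - 2500.0) / 400.0) ** 2))

fc, sigma = 1000.0, 150.0  # assumed band center and bandwidth in Hz

m1 = first_central_moment(spectrum, freqs, fc, sigma)

# Central finite difference of the band energy with respect to fc.
h = 1.0
dE = (band_energy(spectrum, freqs, fc + h, sigma)
      - band_energy(spectrum, freqs, fc - h, sigma)) / (2.0 * h)

print("moment / sigma^2:", m1 / sigma ** 2)
print("dE / dfc        :", dE)
```

In this toy setup the band sits on the rising side of the lower peak, so both quantities are positive. Dividing the moment by the band energy would give the normalized variant, which under the same assumption corresponds to sigma squared times the derivative of the log energy.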