Spectral moment features augmented by low order cepstral coefficients for robust ASR

Tsiakoulis, P; Potamianos, A; Dimitriadis, D

dc.contributor.author	Tsiakoulis, P	en
dc.contributor.author	Potamianos, A	en
dc.contributor.author	Dimitriadis, D	en
dc.date.accessioned	2014-03-01T01:34:38Z
dc.date.available	2014-03-01T01:34:38Z
dc.date.issued	2010	en
dc.identifier.issn	1070-9908	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/20781
dc.subject	First spectral moment	en
dc.subject	Low order cepstral coefficients	en
dc.subject	Robust speech recognition	en
dc.subject	SMAC	en
dc.subject.classification	Engineering, Electrical & Electronic	en
dc.subject.other	Automatic speech recognition	en
dc.subject.other	Central frequency	en
dc.subject.other	Cepstral coefficients	en
dc.subject.other	Frequency domains	en
dc.subject.other	Low order	en
dc.subject.other	Robust ASR	en
dc.subject.other	Robust speech recognition	en
dc.subject.other	Spectral moments	en
dc.subject.other	Spectral tilt	en
dc.subject.other	Speech spectra	en
dc.subject.other	Time-frequency distributions	en
dc.subject.other	Speech recognition	en
dc.title	Spectral moment features augmented by low order cepstral coefficients for robust ASR	en
heal.type	journalArticle	en
heal.identifier.primary	10.1109/LSP.2010.2046349	en
heal.identifier.secondary	5437270	en
heal.identifier.secondary	http://dx.doi.org/10.1109/LSP.2010.2046349	en
heal.language	English	en
heal.publicationDate	2010	en
heal.abstract	We propose a novel Automatic Speech Recognition (ASR) front-end that consists of the first central Spectral Moment time-frequency distribution Augmented by low order Cepstral coefficients (SMAC). We prove that the first central spectral moment is proportional to the spectral derivative with respect to the filter's central frequency. Consequently, the spectral moment is an estimate of the frequency domain derivative of the speech spectrum. However, information related to the entire speech spectrum, such as the energy and the spectral tilt, is not adequately modeled. We propose adding this information with a few cepstral coefficients. Furthermore, we use a mel-spaced Gabor filterbank with 70% frequency overlap in order to overcome the sensitivity to pitch harmonics. The novel SMAC front-end was evaluated on the speech recognition task under a variety of recording conditions. The experimental results show that SMAC performs at least as well as the standard MFCC front-end in clean conditions, and significantly outperforms MFCCs in noisy conditions. © 2006 IEEE.	en
heal.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	en
heal.journalName	IEEE Signal Processing Letters	en
dc.identifier.doi	10.1109/LSP.2010.2046349	en
dc.identifier.isi	ISI:000277048600004	en
dc.identifier.volume	17	en
dc.identifier.issue	6	en
dc.identifier.spage	551	en
dc.identifier.epage	554	en