dc.contributor.author |
Dimitriadis, D |
en |
dc.contributor.author |
Maragos, P |
en |
dc.contributor.author |
Potamianos, A |
en |
dc.date.accessioned |
2014-03-01T02:43:09Z |
|
dc.date.available |
2014-03-01T02:43:09Z |
|
dc.date.issued |
2005 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/31254 |
|
dc.relation.uri |
http://www.scopus.com/inward/record.url?eid=2-s2.0-33745225159&partnerID=40&md5=fdcff3918c3ab9bb541dad7384cbe9cc |
en |
dc.relation.uri |
http://cvsp.cs.ntua.gr/publications/confr/DimitriadisMaragosPotamianos_AuditTeagEnergCepstrumRobustSpeechRecogn_Interspeech2005.pdf |
en |
dc.relation.uri |
http://www.telecom.tuc.gr/%7Epotam/preprints/conf/05_EURO_features.pdf |
en |
dc.relation.uri |
http://www.isca-speech.org/archive/interspeech_2005/i05_3013.html |
en |
dc.relation.uri |
http://www.informatik.uni-trier.de/~ley/db/conf/interspeech/interspeech2005.html#DimitriadisMP05 |
en |
dc.subject |
Additive Noise |
en |
dc.subject |
Feature Extraction |
en |
dc.subject |
Human Auditory Processing |
en |
dc.subject |
Speech Recognition |
en |
dc.subject |
Error Rate |
en |
dc.subject |
Mel Frequency Cepstrum Coefficient |
en |
dc.subject |
Word Error Rate |
en |
dc.subject.other |
Acoustic noise |
en |
dc.subject.other |
Computational linguistics |
en |
dc.subject.other |
Error analysis |
en |
dc.subject.other |
Feature extraction |
en |
dc.subject.other |
Magnetic resonance |
en |
dc.subject.other |
Natural frequencies |
en |
dc.subject.other |
Speech recognition |
en |
dc.subject.other |
Energy cepstrum coefficients |
en |
dc.subject.other |
Phone recognition tasks |
en |
dc.subject.other |
Recording conditions |
en |
dc.subject.other |
Teager Energy Cepstrum Coefficients (TECC) |
en |
dc.subject.other |
Learning algorithms |
en |
dc.title |
Auditory Teager energy cepstrum coefficients for robust speech recognition |
en |
heal.type |
conferenceItem |
en |
heal.publicationDate |
2005 |
en |
heal.abstract |
In this paper, a feature extraction algorithm for robust speech recognition is introduced. The algorithm is motivated by human auditory processing and by the nonlinear Teager-Kaiser energy operator, which estimates the true energy of the source of a resonance. The proposed features are labeled Teager Energy Cepstrum Coefficients (TECCs). TECCs are computed by first filtering the speech signal through a dense non-constant-Q Gammatone filterbank and then estimating the "true" energy of the signal's source, i.e., the short-time average of the output of the Teager-Kaiser energy operator. Error analysis and speech recognition experiments show that TECCs and mel frequency cepstrum coefficients (MFCCs) perform similarly for clean recording conditions, while TECCs perform significantly better than MFCCs for noisy recognition tasks. Specifically, a relative word error rate improvement of 60% over the MFCC baseline is shown for the Aurora-3 database in the high-mismatch condition. Absolute error rate improvements ranging from 5% to 20% are shown for a phone recognition task in various types of additive noise. |
en |
heal.journalName |
9th European Conference on Speech Communication and Technology |
en |
dc.identifier.spage |
3013 |
en |
dc.identifier.epage |
3016 |
en |
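
The energy-estimation step named in the abstract (the short-time average of the Teager-Kaiser energy operator output) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses the standard discrete form of the operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and it omits the Gammatone filterbank and the cepstrum computation. The function names, `frame_len`, `hop`, and the test tone are hypothetical choices made only for this illustration.

```python
import numpy as np


def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    # Defined for interior samples only, so the output is two samples shorter.
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def short_time_average(values, frame_len, hop):
    """Mean of `values` over sliding frames (frame_len and hop are illustrative)."""
    values = np.asarray(values, dtype=float)
    starts = range(0, len(values) - frame_len + 1, hop)
    return np.array([values[s:s + frame_len].mean() for s in starts])


# Hypothetical test signal: a pure tone standing in for one filterbank output.
amplitude, omega = 0.5, 0.3  # omega in radians per sample
n = np.arange(400)
tone = amplitude * np.cos(omega * n)

energy = short_time_average(teager_kaiser_energy(tone), frame_len=100, hop=50)
```

For a pure tone A*cos(omega*n), the operator output is the constant A^2 * sin^2(omega), so every frame average in `energy` takes that value. This is the sense in which the operator tracks both the amplitude and the frequency of a resonance.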