Speech event detection using multiband modulation energy

Evangelopoulos, G; Maragos, P

dc.contributor.author	Evangelopoulos, G	en
dc.contributor.author	Maragos, P	en
dc.date.accessioned	2014-03-01T02:43:33Z
dc.date.available	2014-03-01T02:43:33Z
dc.date.issued	2005	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/31470
dc.relation.uri	http://www.scopus.com/inward/record.url?eid=2-s2.0-33745225459&partnerID=40&md5=250075046f04ca69ca6955aa7e05f709	en
dc.relation.uri	http://cvsp.cs.ntua.gr/projects/pub/HIWIRE/HiwirePublications/EvagelMaragos_SpeechEventDetectionMBEnergy_is05.pdf	en
dc.relation.uri	http://cvsp.cs.ntua.gr/publications/confr/EvangelopoulosMaragos_interspeech05.pdf	en
dc.relation.uri	http://www.isca-speech.org/archive/interspeech_2005/i05_0685.html	en
dc.relation.uri	http://www.informatik.uni-trier.de/~ley/db/conf/interspeech/interspeech2005.html#EvangelopoulosM05	en
dc.subject	Event Detection	en
dc.subject	Modeling and Analysis	en
dc.subject.other	Acoustic signal processing	en
dc.subject.other	Amplitude modulation	en
dc.subject.other	Bandwidth	en
dc.subject.other	Classification (of information)	en
dc.subject.other	Demodulation	en
dc.subject.other	Error correction	en
dc.subject.other	Sounding apparatus	en
dc.subject.other	Spectrum analysis	en
dc.subject.other	Detection-theoretic motivation	en
dc.subject.other	Modulation energy	en
dc.subject.other	Non-linear speech modeling	en
dc.subject.other	Word boundary	en
dc.subject.other	Speech recognition	en
dc.title	Speech event detection using multiband modulation energy	en
heal.type	conferenceItem	en
heal.publicationDate	2005	en
heal.abstract	The need for efficient, sophisticated features for speech event detection is inherent in state-of-the-art processing, enhancement and recognition systems. We explore ideas and techniques from non-linear speech modeling and analysis, such as modulations and multiband filtering, and propose new energy and spectral content features. These are derived by filtering in multiple frequency bands and tracking the dominant modulation energy, measured as the Teager-Kaiser Energy of separate AM-FM components. We present a detection-theoretic motivation for the features and incorporate them in two detection schemes, namely word boundary detection and voice activity detection. The modulation approach improved endpoint detection accuracy on noisy speech, reaching ∼40% error reduction on NTIMIT. In a voice activity scheme, the improvement in overall misclassification error of a high hit-rate detector reached 7.5% on the Aurora 2 and 9.5% on the Aurora 3 databases.	en
heal.journalName	9th European Conference on Speech Communication and Technology	en
dc.identifier.spage	685	en
dc.identifier.epage	688	en
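
The abstract describes tracking dominant modulation energy via the Teager-Kaiser Energy of band-filtered components. A minimal sketch of that idea follows, using the standard discrete operator Ψ[x](n) = x(n)² − x(n−1)·x(n+1). This is not the authors' implementation: the function names are hypothetical, the band signals are assumed to come from some bandpass filterbank that is not shown, and choosing the band with the largest mean energy is an illustrative assumption.

```python
import math

def teager_kaiser_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values (the two edge samples have no neighbours).
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def dominant_band_energy(band_signals):
    """Hypothetical helper: pick the band with the largest mean TK energy.

    band_signals is assumed to be a list of equally long sample lists,
    one per frequency band, produced by a filterbank that is not shown.
    Returns (band index, mean energy of that band).
    """
    means = [sum(teager_kaiser_energy(b)) / (len(b) - 2) for b in band_signals]
    best = max(range(len(means)), key=means.__getitem__)
    return best, means[best]

# Toy check on synthetic "band outputs": two pure tones stand in for
# filtered speech. For A*cos(omega*n) the operator gives A^2 * sin^2(omega).
amplitude, omega = 2.0, 0.3
strong = [amplitude * math.cos(omega * n) for n in range(200)]
weak = [0.5 * math.cos(0.1 * n) for n in range(200)]

psi = teager_kaiser_energy(strong)
expected = amplitude ** 2 * math.sin(omega) ** 2
band, energy = dominant_band_energy([strong, weak])
```

For a pure tone the operator output is constant, so every element of `psi` matches `expected` up to floating-point error, and the stronger, higher-frequency tone is selected as the dominant band.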