Multiband modulation energy tracking for noisy speech detection

Evangelopoulos, G; Maragos, P

dc.contributor.author	Evangelopoulos, G	en
dc.contributor.author	Maragos, P	en
dc.date.accessioned	2014-03-01T01:24:40Z
dc.date.available	2014-03-01T01:24:40Z
dc.date.issued	2006	en
dc.identifier.issn	1558-7916	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/17386
dc.subject	Detector evaluation	en
dc.subject	Energy separation algorithm (ESA)	en
dc.subject	Modulations	en
dc.subject	Multiband demodulation	en
dc.subject	Speech analysis	en
dc.subject	Speech endpoint detection	en
dc.subject	Teager energy	en
dc.subject	Voice activity detection (VAD)	en
dc.subject.classification	Acoustics	en
dc.subject.classification	Engineering, Electrical & Electronic	en
dc.subject.other	Energy separation algorithm (ESA)	en
dc.subject.other	Multiband demodulation	en
dc.subject.other	Speech endpoint detection	en
dc.subject.other	Teager energy	en
dc.subject.other	Voice activity detection (VAD)	en
dc.subject.other	Amplitude modulation	en
dc.subject.other	Demodulation	en
dc.subject.other	Detectors	en
dc.subject.other	Error detection	en
dc.subject.other	Frequency bands	en
dc.subject.other	Optical variables measurement	en
dc.subject.other	Separation	en
dc.subject.other	Signal analysis	en
dc.subject.other	Speech	en
dc.subject.other	Speech analysis	en
dc.subject.other	Speech communication	en
dc.subject.other	Speech transmission	en
dc.subject.other	Speech recognition	en
dc.title	Multiband modulation energy tracking for noisy speech detection	en
heal.type	journalArticle	en
heal.identifier.primary	10.1109/TASL.2006.872625	en
heal.identifier.secondary	http://dx.doi.org/10.1109/TASL.2006.872625	en
heal.identifier.secondary	1709892	en
heal.language	English	en
heal.publicationDate	2006	en
heal.abstract	The ability to accurately locate the boundaries of speech activity is an important attribute of any modern speech recognition, processing, or transmission system. This paper develops efficient, sophisticated features for speech detection in noisy environments, using ideas and techniques from recent advances in speech modeling and analysis, such as the presence of modulations in speech formants, energy separation, and multiband filtering. First, we present a method, conceptually based on a classic speech-silence discrimination procedure, that uses some newly developed short-time signal analysis tools, and we provide a detection-theoretic motivation for it. The new energy and spectral content representations are derived by filtering the signal in various frequency bands, estimating the Teager-Kaiser energy for each band, and demodulating the most active one to obtain the signal's dominant AM-FM components. This modulation approach demonstrated improved robustness in noise over the classic algorithm, reaching an average error reduction of 33.5% under 5-30 dB noise. Second, by incorporating alternative modulation energy features in voice activity detection, the improvement in overall misclassification error of a high hit rate detector reached 7.5% and 9.5% on different benchmarks. © 2006 IEEE.	en
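The band-selection and demodulation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the band-passed signals are already available (the filterbank itself is omitted), it selects the band by mean Teager-Kaiser energy, and it uses DESA-2, one published variant of the energy separation algorithm, which may differ from the variant used in the paper. All function names are hypothetical.

```python
import math

def teager_kaiser(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def dominant_band(band_signals):
    """Return (index, energies): the band with the largest mean TK energy.

    Assumption: band_signals holds the outputs of a band-pass filterbank.
    """
    energies = []
    for band in band_signals:
        psi = teager_kaiser(band)
        energies.append(sum(psi) / len(psi))
    best = max(range(len(energies)), key=energies.__getitem__)
    return best, energies

def desa2(x):
    """DESA-2 estimates of instantaneous frequency (rad/sample) and amplitude."""
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]  # symmetric difference
    psi_x = teager_kaiser(x)  # list index i corresponds to sample n = i + 1
    psi_y = teager_kaiser(y)  # list index j corresponds to sample n = j + 2
    freqs, amps = [], []
    for j, py in enumerate(psi_y):
        px = psi_x[j + 1]  # align both energies on the same sample n = j + 2
        if px <= 0.0 or py <= 0.0:
            continue  # skip samples where the estimate is undefined
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))  # clamp for acos
        freqs.append(0.5 * math.acos(c))
        amps.append(2.0 * px / math.sqrt(py))
    return freqs, amps

# Synthetic stand-ins for three band-passed signals (amplitude, frequency).
N = 200
params = [(0.5, 0.2), (1.0, 0.6), (0.3, 1.0)]
bands = [[a * math.cos(w * n) for n in range(N)] for a, w in params]

best, energies = dominant_band(bands)
freqs, amps = desa2(bands[best])
mean_freq = sum(freqs) / len(freqs)
mean_amp = sum(amps) / len(amps)
```

For a pure sinusoid A cos(Ωn) the Teager-Kaiser energy equals A² sin²(Ω) at every sample, so the second synthetic band is selected here and the DESA-2 estimates recover its amplitude and frequency.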
heal.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	en
heal.journalName	IEEE Transactions on Audio, Speech, and Language Processing	en
dc.identifier.doi	10.1109/TASL.2006.872625	en
dc.identifier.isi	ISI:000241567200015	en
dc.identifier.volume	14	en
dc.identifier.issue	6	en
dc.identifier.spage	2024	en
dc.identifier.epage	2038	en