Multimodal fusion by adaptive compensation for feature uncertainty with application to audiovisual speech recognition

Katsamanis, A; Papandreou, G; Pitsikalis, V; Maragos, P

dc.contributor.author	Katsamanis, A	en
dc.contributor.author	Papandreou, G	en
dc.contributor.author	Pitsikalis, V	en
dc.contributor.author	Maragos, P	en
dc.date.accessioned	2014-03-01T02:44:06Z
dc.date.available	2014-03-01T02:44:06Z
dc.date.issued	2006	en
dc.identifier.issn	2219-5491	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/31682
dc.relation.uri	http://www.scopus.com/inward/record.url?eid=2-s2.0-84862631884&partnerID=40&md5=ccaeee023c42f0923a6dcdec81ac7fdc	en
dc.relation.uri	http://cvsp.cs.ntua.gr/publications/confr/KatsamanisPapandreouPitsikalisMaragos_MultimodalFusion-AdaptCompens-FeaturUncertain-AV-ASR_EUSIPCO2006.pdf	en
dc.relation.uri	http://cvsp.cs.ntua.gr/publications/confr/KatsamanisPapandreouPitsikalisMaragos_MultimodalFusionAvAsr_eusipco06.pdf	en
dc.relation.uri	http://www.eurasip.org/Proceedings/Eusipco/Eusipco2006/papers/1568987783.pdf	en
dc.relation.uri	http://cvsp.cs.ntua.gr/projects/pub/HIWIRE/HiwirePublications/KatsamanisPapandreouPitsikalisMaragos_MultimodalFusionAvAsr_eusipco06.pdf	en
dc.subject	Audio Visual Speech Recognition	en
dc.subject	Environmental Conditions	en
dc.subject	Measurement Noise	en
dc.subject	multimodal fusion	en
dc.subject	Pattern Recognition	en
dc.subject	Speech Recognition	en
dc.subject.other	Adaptive compensation	en
dc.subject.other	Audio visual speech recognition	en
dc.subject.other	Complementary features	en
dc.subject.other	Environmental conditions	en
dc.subject.other	Feature measurement	en
dc.subject.other	Feature uncertainty	en
dc.subject.other	Measurement Noise	en
dc.subject.other	Multi-modal fusion	en
dc.subject.other	Multiple streams	en
dc.subject.other	Probabilistic framework	en
dc.subject.other	Signal processing	en
dc.subject.other	Speech recognition	en
dc.subject.other	Uncertainty analysis	en
dc.title	Multimodal fusion by adaptive compensation for feature uncertainty with application to audiovisual speech recognition	en
heal.type	conferenceItem	en
heal.publicationDate	2006	en
heal.abstract	In pattern recognition, one usually relies on measuring a set of informative features to perform tasks such as classification. Although the accuracy of feature measurements depends heavily on changing environmental conditions, the consequences of this fact have received relatively little attention to date. In this work, we explicitly take into account uncertainty in feature measurements and show, within a rigorous probabilistic framework, how the models used for classification should be adjusted to compensate for this effect. Our approach proves particularly fruitful in multimodal fusion scenarios, such as audio-visual speech recognition, where multiple streams of complementary features are integrated. For such applications, provided that an estimate of the measurement noise uncertainty for each feature stream is available, we show that the proposed framework leads to highly adaptive multimodal fusion rules that are widely applicable and easy to implement. We further show that previous multimodal fusion methods relying on stream weights fall under our scheme if certain assumptions hold; this provides novel insights into their applicability to various tasks and suggests new practical ways of estimating the stream weights adaptively. Preliminary experimental results in audio-visual speech recognition demonstrate the potential of our approach.	en
heal.journalName	European Signal Processing Conference	en