dc.contributor.author |
Katsamanis, A |
en |
dc.contributor.author |
Papandreou, G |
en |
dc.contributor.author |
Pitsikalis, V |
en |
dc.contributor.author |
Maragos, P |
en |
dc.date.accessioned |
2014-03-01T02:44:06Z |
|
dc.date.available |
2014-03-01T02:44:06Z |
|
dc.date.issued |
2006 |
en |
dc.identifier.issn |
2219-5491 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/31682 |
|
dc.relation.uri |
http://www.scopus.com/inward/record.url?eid=2-s2.0-84862631884&partnerID=40&md5=ccaeee023c42f0923a6dcdec81ac7fdc |
en |
dc.relation.uri |
http://cvsp.cs.ntua.gr/publications/confr/KatsamanisPapandreouPitsikalisMaragos_MultimodalFusion-AdaptCompens-FeaturUncertain-AV-ASR_EUSIPCO2006.pdf |
en |
dc.relation.uri |
http://cvsp.cs.ntua.gr/publications/confr/KatsamanisPapandreouPitsikalisMaragos_MultimodalFusionAvAsr_eusipco06.pdf |
en |
dc.relation.uri |
http://www.eurasip.org/Proceedings/Eusipco/Eusipco2006/papers/1568987783.pdf |
en |
dc.relation.uri |
http://cvsp.cs.ntua.gr/projects/pub/HIWIRE/HiwirePublications/KatsamanisPapandreouPitsikalisMaragos_MultimodalFusionAvAsr_eusipco06.pdf |
en |
dc.subject |
Audio Visual Speech Recognition |
en |
dc.subject |
Environmental Conditions |
en |
dc.subject |
Measurement Noise |
en |
dc.subject |
Multimodal Fusion |
en |
dc.subject |
Pattern Recognition |
en |
dc.subject |
Speech Recognition |
en |
dc.subject.other |
Adaptive compensation |
en |
dc.subject.other |
Audio visual speech recognition |
en |
dc.subject.other |
Complementary features |
en |
dc.subject.other |
Environmental conditions |
en |
dc.subject.other |
Feature measurement |
en |
dc.subject.other |
Feature uncertainty |
en |
dc.subject.other |
Measurement Noise |
en |
dc.subject.other |
Multi-modal fusion |
en |
dc.subject.other |
Multiple streams |
en |
dc.subject.other |
Probabilistic framework |
en |
dc.subject.other |
Signal processing |
en |
dc.subject.other |
Speech recognition |
en |
dc.subject.other |
Uncertainty analysis |
en |
dc.title |
Multimodal fusion by adaptive compensation for feature uncertainty with application to audiovisual speech recognition |
en |
heal.type |
conferenceItem |
en |
heal.publicationDate |
2006 |
en |
heal.abstract |
In pattern recognition one usually relies on measuring a set of informative features to perform tasks such as classification. While the accuracy of feature measurements heavily depends on changing environmental conditions, studying the consequences of this fact has received relatively little attention to date. In this work we explicitly take into account uncertainty in feature measurements and we show in a rigorous probabilistic framework how the models used for classification should be adjusted to compensate for this effect. Our approach proves to be particularly fruitful in multimodal fusion scenarios, such as audio-visual speech recognition, where multiple streams of complementary features are integrated. For such applications, provided that an estimate of the measurement noise uncertainty for each feature stream is available, we show that the proposed framework leads to highly adaptive multimodal fusion rules which are widely applicable and easy to implement. We further show that previous multimodal fusion methods relying on stream weights fall under our scheme if certain assumptions hold; this provides novel insights into their applicability for various tasks and suggests new practical ways for estimating the stream weights adaptively. Preliminary experimental results in audio-visual speech recognition demonstrate the potential of our approach. |
en |
heal.journalName |
European Signal Processing Conference |
en |