dc.contributor.author | Pitsikalis, V | en
dc.contributor.author | Katsamanis, A | en
dc.contributor.author | Papandreou, G | en
dc.contributor.author | Maragos, P | en
dc.date.accessioned | 2014-03-01T02:43:53Z |
dc.date.available | 2014-03-01T02:43:53Z |
dc.date.issued | 2006 | en
dc.identifier.uri | https://dspace.lib.ntua.gr/xmlui/handle/123456789/31539 |
dc.relation.uri | http://www.scopus.com/inward/record.url?eid=2-s2.0-44949227080&partnerID=40&md5=6edf7efa047e4239c0ea003cf525bf63 | en
dc.relation.uri | http://cvsp.cs.ntua.gr/publications/confr/PitsikalisKatsamanisPapandreouMaragos_AdaptiveMultimodalFusionUncertaintyCompensation_ICSLP06.pdf | en
dc.relation.uri | http://cvsp.cs.ntua.gr/projects/pub/HIWIRE/HiwirePublications/PitsikalisKatsamanisPapandreouMaragos_AdaptiveFusionUncertaintyCompensation_ICSLP06.pdf | en
dc.relation.uri | http://aspi.loria.fr/Save/Pitsikalis.pdf | en
dc.relation.uri | http://www.isca-speech.org/archive/interspeech_2006/i06_1950.html | en
dc.relation.uri | http://www.informatik.uni-trier.de/~ley/db/conf/interspeech/interspeech2006.html#PitsikalisKPM06 | en
dc.subject | Active appearance models | en
dc.subject | Audiovisual speech recognition | en
dc.subject | Multimodal fusion | en
dc.subject | Product HMMs | en
dc.subject | Stream weights | en
dc.subject | Uncertainty compensation | en
dc.subject.other | Asynchronous models | en
dc.subject.other | Audio visual speech recognition (AVSR) | en
dc.subject.other | Classification rules | en
dc.subject.other | Environmental conditioning | en
dc.subject.other | Feature measurements | en
dc.subject.other | International conferences | en
dc.subject.other | Measurement noises | en
dc.subject.other | Multi modal fusion | en
dc.subject.other | Multiple streams | en
dc.subject.other | On-stream | en
dc.subject.other | Recognition tasks | en
dc.subject.other | Spoken language processing | en
dc.subject.other | Feature extraction | en
dc.subject.other | Fusion reactions | en
dc.subject.other | Linguistics | en
dc.subject.other | Measurements | en
dc.subject.other | Nuclear physics | en
dc.subject.other | Pattern recognition | en
dc.subject.other | Speech | en
dc.subject.other | Speech analysis | en
dc.subject.other | Uncertainty analysis | en
dc.subject.other | Speech recognition | en
dc.title | Adaptive multimodal fusion by uncertainty compensation | en
heal.type | conferenceItem | en
heal.publicationDate | 2006 | en
heal.abstract | While the accuracy of feature measurements heavily depends on changing environmental conditions, studying the consequences of this fact in pattern recognition tasks has received relatively little attention to date. In this work we explicitly take into account feature measurement uncertainty and we show how classification rules should be adjusted to compensate for its effects. Our approach is particularly fruitful in multimodal fusion scenarios, such as audio-visual speech recognition, where multiple streams of complementary time-evolving features are integrated. For such applications, provided that the measurement noise uncertainty for each feature stream can be estimated, the proposed framework leads to highly adaptive multimodal fusion rules which are widely applicable and easy to implement. We further show that previous multimodal fusion methods relying on stream weights fall under our scheme under certain assumptions; this provides novel insights into their applicability for various tasks and suggests new practical ways for estimating the stream weights adaptively. The potential of our approach is demonstrated in audio-visual speech recognition using either synchronous or asynchronous models. | en
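The compensation idea summarized in the abstract — adjusting classification scores when feature measurements are noisy, and fusing audio and visual streams so that the less reliable stream contributes less — can be illustrated with a minimal sketch. For a Gaussian class model, a standard form of uncertainty compensation scores the observation against the model variance inflated by the estimated measurement-noise variance. The diagonal-covariance simplification, the function names, and the two-class toy setup below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def gaussian_loglik(x, mean, var):
    # Log-likelihood of x under a diagonal-covariance Gaussian.
    x, mean, var = np.atleast_1d(x), np.atleast_1d(mean), np.atleast_1d(var)
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def compensated_loglik(x, mean, var, noise_var):
    # Uncertainty compensation (sketch): add the estimated measurement-noise
    # variance to the model variance before scoring, which flattens the
    # likelihood of an unreliable observation.
    return gaussian_loglik(x, mean, var + noise_var)

def fused_score(x_audio, x_video, cls, noise_audio, noise_video):
    # Adaptive audio-visual fusion (sketch): each stream is scored with its
    # own noise estimate, so a noisy stream is automatically down-weighted
    # without hand-tuned stream weights. `cls` holds per-class
    # (audio_mean, audio_var, video_mean, video_var).
    mu_a, var_a, mu_v, var_v = cls
    return (compensated_loglik(x_audio, mu_a, var_a, noise_audio)
            + compensated_loglik(x_video, mu_v, var_v, noise_video))
```

As a toy check of the behavior the abstract claims: if the audio observation favors one class and the video observation another, a large audio-noise estimate makes the fused decision follow the video stream, and vice versa.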
heal.journalName | INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing, INTERSPEECH 2006 - ICSLP | en
dc.identifier.volume | 5 | en
dc.identifier.spage | 2458 | en
dc.identifier.epage | 2461 | en