HEAL DSpace

Adaptive multimodal fusion by uncertainty compensation with application to audiovisual speech recognition

DSpace/Manakin Repository


dc.contributor.author Papandreou, G en
dc.contributor.author Katsamanis, A en
dc.contributor.author Pitsikalis, V en
dc.contributor.author Maragos, P en
dc.date.accessioned 2014-03-01T01:29:47Z
dc.date.available 2014-03-01T01:29:47Z
dc.date.issued 2009 en
dc.identifier.issn 1558-7916 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/19344
dc.subject Active appearance models (AAMs) en
dc.subject Audiovisual automatic speech recognition (AV-ASR) en
dc.subject Multimodal fusion en
dc.subject Uncertainty compensation en
dc.subject.classification Acoustics en
dc.subject.classification Engineering, Electrical & Electronic en
dc.subject.other Active appearance models en
dc.subject.other Active appearance models (AAMs) en
dc.subject.other Adaptivity en
dc.subject.other Audio features en
dc.subject.other Audio visual speech recognition en
dc.subject.other Audiovisual automatic speech recognition (AV-ASR) en
dc.subject.other Environmental conditions en
dc.subject.other Feature measurement en
dc.subject.other Learning rules en
dc.subject.other Measurement Noise en
dc.subject.other Multi-modal en
dc.subject.other Multimodal fusion en
dc.subject.other Multimodal integration en
dc.subject.other Multiple streams en
dc.subject.other On-stream en
dc.subject.other Person-independent en
dc.subject.other Uncertainty compensation en
dc.subject.other Uncertainty estimates en
dc.subject.other Uncertainty estimation en
dc.subject.other Visual feature extraction en
dc.subject.other Feature extraction en
dc.subject.other Remelting en
dc.subject.other Uncertainty analysis en
dc.subject.other Speech recognition en
dc.title Adaptive multimodal fusion by uncertainty compensation with application to audiovisual speech recognition en
heal.type journalArticle en
heal.identifier.primary 10.1109/TASL.2008.2011515 en
heal.identifier.secondary http://dx.doi.org/10.1109/TASL.2008.2011515 en
heal.language English en
heal.publicationDate 2009 en
heal.abstract While the accuracy of feature measurements heavily depends on changing environmental conditions, studying the consequences of this fact in pattern recognition tasks has received relatively little attention to date. In this paper, we explicitly take feature measurement uncertainty into account and show how multimodal classification and learning rules should be adjusted to compensate for its effects. Our approach is particularly fruitful in multimodal fusion scenarios, such as audiovisual speech recognition, where multiple streams of complementary time-evolving features are integrated. For such applications, provided that the measurement noise uncertainty for each feature stream can be estimated, the proposed framework leads to highly adaptive multimodal fusion rules which are easy and efficient to implement. Our technique is widely applicable and can be transparently integrated with either synchronous or asynchronous multimodal sequence integration architectures. We further show that multimodal fusion methods relying on stream weights can naturally emerge from our scheme under certain assumptions; this connection provides valuable insights into the adaptivity properties of our multimodal uncertainty compensation approach. We show how these ideas can be practically applied for audiovisual speech recognition. In this context, we propose improved techniques for person-independent visual feature extraction and uncertainty estimation with active appearance models, and also discuss how enhanced audio features along with their uncertainty estimates can be effectively computed. We demonstrate the efficacy of our approach in audiovisual speech recognition experiments on the CUAVE database using either synchronous or asynchronous multimodal integration models. © 2009 IEEE. en
heal.publisher IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC en
heal.journalName IEEE Transactions on Audio, Speech and Language Processing en
dc.identifier.doi 10.1109/TASL.2008.2011515 en
dc.identifier.isi ISI:000263639400003 en
dc.identifier.volume 17 en
dc.identifier.issue 3 en
dc.identifier.spage 423 en
dc.identifier.epage 435 en


Files in this item

Files Size Format View

There are no files associated with this item.
