HEAL DSpace

Multimodal fusion and learning with uncertain features applied to audiovisual speech recognition


Show simple item record

dc.contributor.author Papandreou, G en
dc.contributor.author Katsamanis, A en
dc.contributor.author Pitsikalis, V en
dc.contributor.author Maragos, P en
dc.date.accessioned 2014-03-01T02:44:51Z
dc.date.available 2014-03-01T02:44:51Z
dc.date.issued 2007 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/31975
dc.subject Audio Visual Speech Recognition en
dc.subject Measurement Noise en
dc.subject multimodal fusion en
dc.subject Speech Recognition en
dc.subject Time Varying en
dc.subject.other Audio visuals en
dc.subject.other Audiovisual speech recognitions en
dc.subject.other Complementary features en
dc.subject.other Learning rules en
dc.subject.other Measurement noises en
dc.subject.other Multimodal fusions en
dc.subject.other Multiple streams en
dc.subject.other On streams en
dc.subject.other Uncertain features en
dc.subject.other Signal processing en
dc.subject.other Speech analysis en
dc.subject.other Technical presentations en
dc.subject.other Uncertainty analysis en
dc.subject.other Speech recognition en
dc.title Multimodal fusion and learning with uncertain features applied to audiovisual speech recognition en
heal.type conferenceItem en
heal.identifier.primary 10.1109/MMSP.2007.4412868 en
heal.identifier.secondary http://dx.doi.org/10.1109/MMSP.2007.4412868 en
heal.identifier.secondary 4412868 en
heal.publicationDate 2007 en
heal.abstract We study the effect of uncertain feature measurements and show how classification and learning rules should be adjusted to compensate for it. Our approach is particularly fruitful in multimodal fusion scenarios, such as audio-visual speech recognition, where multiple streams of complementary features whose reliability is time-varying are integrated. For such applications, by taking the measurement noise uncertainty of each feature stream into account, the proposed framework leads to highly adaptive multimodal fusion rules for classification and learning which are widely applicable and easy to implement. We further show that previous multimodal fusion methods relying on stream weights fall under our scheme under certain assumptions; this provides novel insights into their applicability for various tasks and suggests new practical ways for estimating the stream weights adaptively. The potential of our approach is demonstrated in audio-visual speech recognition experiments. ©2007 IEEE. en
heal.journalName 2007 IEEE 9th International Workshop on Multimedia Signal Processing, MMSP 2007 - Proceedings en
dc.identifier.doi 10.1109/MMSP.2007.4412868 en
dc.identifier.spage 264 en
dc.identifier.epage 267 en
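The abstract's central idea can be illustrated with a minimal sketch. Assume each stream is modeled by a Gaussian class-conditional density; uncertainty compensation then amounts to inflating the model variance by the measurement-noise variance before evaluating the likelihood, and fusion sums the compensated log-likelihoods of the audio and video streams. This is only an illustrative toy (scalar features, diagonal Gaussians); the function names `gaussian_loglik` and `classify` and all parameter values are hypothetical, not taken from the paper.

```python
import math

def gaussian_loglik(x, mean, var, noise_var=0.0):
    """Log-likelihood of a scalar feature x under N(mean, var), with the
    model variance inflated by the feature's measurement-noise variance
    (the uncertainty-compensation step)."""
    s = var + noise_var
    return -0.5 * (math.log(2.0 * math.pi * s) + (x - mean) ** 2 / s)

def classify(audio_x, video_x, classes, audio_noise=0.0, video_noise=0.0):
    """Fuse the two streams by summing their compensated log-likelihoods
    and return the best class label. A stream with large noise variance
    yields flat likelihoods across classes, so it is automatically
    down-weighted -- an adaptive fusion rule with no explicit weights."""
    def score(params):
        (a_mu, a_var), (v_mu, v_var) = params
        return (gaussian_loglik(audio_x, a_mu, a_var, audio_noise)
                + gaussian_loglik(video_x, v_mu, v_var, video_noise))
    return max(classes, key=lambda name: score(classes[name]))

# Toy example: the audio observation points to class "B", the video
# observation (weakly) to class "A".
classes = {"A": ((0.0, 1.0), (0.0, 1.0)),
           "B": ((4.0, 1.0), (4.0, 1.0))}
print(classify(4.0, 1.0, classes))                     # clean audio dominates
print(classify(4.0, 1.0, classes, audio_noise=100.0))  # noisy audio defers to video
```

With clean audio the fused decision follows the audio stream; declaring the audio measurement highly uncertain flattens its contribution and the video stream takes over, which is the adaptive behavior the abstract describes.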


Files in this item

No files are associated with this item.

This item appears in the following collection(s)
