HEAL DSpace

Audiovisual speech inversion by switching dynamical modeling governed by a Hidden Markov process

dc.contributor.author Katsamanis, A en
dc.contributor.author Ananthakrishnan, G en
dc.contributor.author Papandreou, G en
dc.contributor.author Maragos, P en
dc.contributor.author Engwall, O en
dc.date.accessioned 2014-03-01T02:45:09Z
dc.date.available 2014-03-01T02:45:09Z
dc.date.issued 2008 en
dc.identifier.issn 2219-5491 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/32168
dc.relation.uri http://www.scopus.com/inward/record.url?eid=2-s2.0-84863731362&partnerID=40&md5=3dc13d3b075c5904501658f11ce8135c en
dc.relation.uri http://cvsp.cs.ntua.gr/publications/confr/KAPME_KalmanHMMInversion_eusipco08.pdf en
dc.relation.uri http://www.speech.kth.se/prod/publications/files/3258.pdf en
dc.relation.uri http://www.eurasip.org/Proceedings/Eusipco/Eusipco2008/papers/1569105532.pdf en
dc.relation.uri http://cvsp.cs.ntua.gr/publications/confr/KatsamanisAnanthPapandreouMaragosEngwall_AV-Speechinvers-SwitchDynModel-HidMarkov_EUSIPCO2008.pdf en
dc.subject Active Appearance Model en
dc.subject Dynamic Model en
dc.subject Hidden Markov Process en
dc.subject Inverse Problem en
dc.subject Linear Dynamical System en
dc.subject Prediction Error en
dc.subject Radial Basis Function en
dc.subject Root Mean Square Error en
dc.subject Support Vector Machine en
dc.subject Visual Analysis en
dc.subject Mel Frequency Cepstral Coefficient en
dc.subject Markov Model en
dc.subject.other Active appearance models en
dc.subject.other Audio-visual speech en
dc.subject.other Classification analysis en
dc.subject.other Correlation coefficient en
dc.subject.other Dynamical modeling en
dc.subject.other Evaluation scheme en
dc.subject.other Hidden Markov process en
dc.subject.other Inversion problems en
dc.subject.other Mel-frequency cepstral coefficients en
dc.subject.other Prediction errors en
dc.subject.other Radial basis functions en
dc.subject.other Root mean squared errors en
dc.subject.other State sequences en
dc.subject.other Switching linear dynamical systems en
dc.subject.other Unified framework en
dc.subject.other Visual analysis en
dc.subject.other Hidden Markov models en
dc.subject.other Linear control systems en
dc.subject.other Radial basis function networks en
dc.subject.other Signal processing en
dc.subject.other Image segmentation en
dc.title Audiovisual speech inversion by switching dynamical modeling governed by a Hidden Markov process en
heal.type conferenceItem en
heal.publicationDate 2008 en
heal.abstract We propose a unified framework to recover articulation from audiovisual speech. The nonlinear audiovisual-to-articulatory mapping is modeled by means of a switching linear dynamical system. Switching is governed by a state sequence determined via a Hidden Markov Model alignment process. Mel Frequency Cepstral Coefficients are extracted from the audio, while visual analysis is performed using Active Appearance Models. The articulatory state is represented by the coordinates of points on important articulators, e.g., the tongue and lips. To evaluate our inversion approach, rather than relying only on the conventional correlation coefficients and root mean squared errors, we introduce a novel evaluation scheme that is more specific to the inversion problem. Prediction errors in the positions of the articulators are weighted differently depending on their relative importance in the production of the corresponding sound. The applied weights are determined by an articulatory classification analysis using Support Vector Machines with a radial basis function kernel. Experiments are conducted on the audiovisual-articulatory MOCHA database. Copyright by EURASIP. en
heal.journalName European Signal Processing Conference en
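
The weighted evaluation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_rmse`, the data layout (a list of frames, each a list of articulator coordinates) and the normalization by the sum of weights are all assumptions made for illustration. In the paper the weights come from an SVM-based articulatory classification analysis, which is not reproduced here; the weights below are hypothetical placeholders.

```python
import math


def weighted_rmse(predicted, measured, weights):
    """Root mean squared error with one non-negative weight per coordinate.

    predicted, measured: lists of frames; each frame is a list of
        articulator coordinates (hypothetical layout, assumed here).
    weights: one weight per coordinate; larger means the coordinate
        matters more for the sound being produced (assumed given).
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need the same non-zero number of frames")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    squared_error = 0.0
    for pred_frame, meas_frame in zip(predicted, measured):
        if len(pred_frame) != len(weights) or len(meas_frame) != len(weights):
            raise ValueError("each frame needs one value per weight")
        for p, m, w in zip(pred_frame, meas_frame, weights):
            squared_error += w * (p - m) ** 2

    # Normalize by frame count and total weight so that equal weights
    # reduce to the ordinary (unweighted) RMSE.
    return math.sqrt(squared_error / (len(predicted) * total_weight))


# Hypothetical two-frame example with two coordinates per frame.
predicted = [[1.0, 0.0], [1.0, 0.0]]
measured = [[0.0, 0.0], [0.0, 0.0]]
uniform = weighted_rmse(predicted, measured, [1.0, 1.0])
first_only = weighted_rmse(predicted, measured, [1.0, 0.0])
```

With equal weights the score matches the conventional RMSE over all coordinates; putting all the weight on the first coordinate makes the score depend only on the error in that coordinate.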

