HEAL DSpace

Modeling naturalistic affective states via facial and vocal expressions recognition

DSpace/Manakin Repository

Show simple item record

dc.contributor.author Caridakis, G en
dc.contributor.author Malatesta, L en
dc.contributor.author Kessous, L en
dc.contributor.author Amir, N en
dc.contributor.author Raouzaiou, A en
dc.contributor.author Karpouzis, K en
dc.date.accessioned 2014-03-01T02:44:06Z
dc.date.available 2014-03-01T02:44:06Z
dc.date.issued 2006 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/31678
dc.subject Affective interaction en
dc.subject Facial expression recognition en
dc.subject Image processing en
dc.subject Multimodal analysis en
dc.subject Naturalistic data en
dc.subject Prosodic feature extraction en
dc.subject User modeling en
dc.subject.other Artificial intelligence en
dc.subject.other Computer simulation languages en
dc.subject.other Feedback en
dc.subject.other Image recognition en
dc.subject.other Speech recognition en
dc.subject.other User interfaces en
dc.subject.other Affective states en
dc.subject.other Audiovisual material en
dc.subject.other Facial expressions recognition en
dc.subject.other Vocal expressions recognition en
dc.subject.other Human computer interaction en
dc.title Modeling naturalistic affective states via facial and vocal expressions recognition en
heal.type conferenceItem en
heal.identifier.primary 10.1145/1180995.1181029 en
heal.identifier.secondary http://dx.doi.org/10.1145/1180995.1181029 en
heal.publicationDate 2006 en
heal.abstract Affective and human-centered computing are two areas related to HCI that have attracted attention in recent years. One reason for this is the plethora of devices able to record and process multimodal input from users and adapt their functionality to their preferences or individual habits, thus enhancing usability and becoming attractive to users less accustomed to conventional interfaces. In the quest to receive feedback from users in an unobtrusive manner, the visual and auditory modalities allow us to infer a user's emotional state by combining information from facial expression recognition and speech prosody feature extraction. In this paper, we describe a multi-cue, dynamic approach to affect recognition in naturalistic video sequences. In contrast to the strictly controlled recording conditions of most audiovisual material, the current research focuses on sequences taken from nearly real-world situations. Recognition is performed via a 'Simple Recurrent Network', which lends itself well to modeling dynamic events in both the user's facial expressions and speech. Moreover, this approach differs from existing work in that it models user expressivity using a dimensional representation of activation and valence, instead of detecting the usual 'universal emotions', which are scarce in everyday human-machine interaction. The algorithm is deployed on an audiovisual database which was recorded simulating human-human discourse and therefore contains less extreme expressivity and subtle variations of a number of emotion labels. Copyright 2006 ACM. en
heal.journalName ICMI'06: 8th International Conference on Multimodal Interfaces, Conference Proceedings en
dc.identifier.doi 10.1145/1180995.1181029 en
dc.identifier.spage 146 en
dc.identifier.epage 154 en


Files in this item

Files Size Format View

There are no files associated with this item.

This item appears in the following Collection(s)
