Πολυτροπική βαθιά μάθηση για την αναγνώριση συναισθημάτων και τη σύνθεση εκφράσεων προσώπου με εφαρμογές στην αλληλεπίδραση ανθρώπου-ρομπότ

Φιλντίσης, Παναγιώτης-Παρασκευάς; Filntisis, Panagiotis Paraskevas

dc.contributor.author	Φιλντίσης, Παναγιώτης-Παρασκευάς	el
dc.contributor.author	Filntisis, Panagiotis Paraskevas	en
dc.date.accessioned	2023-02-15T09:52:37Z
dc.date.available	2023-02-15T09:52:37Z
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/57139
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.24837
dc.rights	Default License
dc.subject	Αναγνώριση Συναισθήματος	el
dc.subject	Βαθιά Μάθηση	el
dc.subject	Σύνθεση Εκφράσεων	el
dc.subject	Τρισδιάστατη Ανακατασκευή	el
dc.subject	Όραση Υπολογιστών	el
dc.subject	Affective Computing	en
dc.subject	Computer Vision	en
dc.subject	Expression Synthesis	en
dc.subject	Emotion Recognition	en
dc.subject	Deep Learning	en
dc.title	Πολυτροπική βαθιά μάθηση για την αναγνώριση συναισθημάτων και τη σύνθεση εκφράσεων προσώπου με εφαρμογές στην αλληλεπίδραση ανθρώπου-ρομπότ	el
dc.title	Multimodal Deep Learning for Emotion Recognition and Expression Synthesis with Applications in Human-Robot Interaction	en
dc.contributor.department	Σημάτων, Ελέγχου και Ρομποτικής	el
heal.type	doctoralThesis
heal.secondaryTitle	Multimodal Deep Learning for Emotion Recognition and Expression Synthesis with Applications in Human-Robot Interaction	el
heal.classification	Affective Computing	en
heal.classification	Computer Vision	en
heal.classification	Machine Learning	en
heal.language	el
heal.language	en
heal.access	free
heal.recordProvider	ntua	el
heal.publicationDate	2022-11-07
heal.abstract	Ο χώρος του affective computing είναι ένας συναρπαστικός καινούργιος τομέας έρευνας που έχει ως στόχο να παρέχει στους υπολογιστές και τα ρομπότ τη δυνατότητα αναγνώρισης, έκφρασης, μοντελοποίησης αλλά και «αίσθησης» συναισθημάτων. Το διεπιστημονικό αυτό πεδίο του affective computing αντλεί πόρους από την επιστήμη των υπολογιστών, τα μαθηματικά, τις γνωσιακές επιστήμες και την ψυχολογία. Σε αυτή τη διατριβή, η οποία χωρίζεται σε δύο κύρια μέρη, διερευνούμε δύο πτυχές του συγκεκριμένου πεδίου: η πρώτη πτυχή είναι η «αναγνώριση συναισθήματος» και η δεύτερη είναι η «σύνθεση της έκφρασης του συναισθήματος». Οι δύο αυτές κατευθύνσεις αποτελούν τις πιο σημαντικές πτυχές που πρέπει να εξετάσει κανείς όταν χτίζει συστήματα αλληλεπίδρασης ανθρώπου-ρομπότ. Για το σκοπό αυτό, στο πρώτο μέρος, διερευνούμε και μελετάμε διάφορα κανάλια που περιέχουν πολύτιμες πληροφορίες για την αναγνώριση των συναισθημάτων ενός ανθρώπου και σχεδιάζουμε αρχιτεκτονικές βασισμένες σε βαθιά μάθηση, που μπορούν να συνδυάσουν αποτελεσματικά πληροφορίες από τα κανάλια αυτά, με απώτερο στόχο την ανάπτυξη ενός συστήματος για σενάρια αλληλεπίδρασης ανθρώπου/παιδιού με ρομπότ. Ενώ η αναγνώριση συναισθημάτων στο παρελθόν έχει ως επί το πλείστον επικεντρωθεί στις εκφράσεις του προσώπου και στην ομιλία, κατά τη διατριβή αυτή λαμβάνουμε υπόψη τη γλώσσα του σώματος του ανθρώπου, τη σκηνή στην οποία βρίσκεται, καθώς και τη σημασιολογική έννοια των συναισθημάτων. Στο δεύτερο μέρος, αρχικά ενισχύουμε τις υπάρχουσες μεθόδους για τη σύνθεση της οπτικοακουστικής ομιλίας, δίνοντάς τους τη δυνατότητα να συνδυάζουν και να εκφράζουν συναισθήματα σε διαφορετικά επίπεδα έντασης. Στη συνέχεια, σχεδιάζουμε ένα μοντέλο βαθιάς μάθησης με σκοπό τη σύνθεση οπτικοακουστικής ομιλίας που επιτυγχάνει υψηλό επίπεδο ρεαλισμού και εκφραστικότητας, ξεπερνώντας τις προηγούμενες μεθόδους.
Τέλος, παρουσιάζουμε την πρώτη μέθοδο για τρισδιάστατη ανακατασκευή προσώπων από βίντεο μίας όψης, με έμφαση στα χαρακτηριστικά και τη γεωμετρία του στόματος κατά την ομιλία. Η μέθοδος αυτή παρακάμπτει τη συνηθισμένη απαίτηση για κοπιώδη συλλογή μεγάλου πλήθους τρισδιάστατων δεδομένων υψηλής πιστότητας, προσφέροντας έναν εύκολο και καινοτόμο τρόπο για την απόκτηση τρισδιάστατων δεδομένων εκφραστικών προσώπων από βίντεο.	el
heal.abstract	Affective computing is an exciting new research area that aims to equip computers and robots with the capability of recognizing, expressing, modeling, and even "feeling" emotions. An interdisciplinary field, affective computing draws on computer science, mathematics, the cognitive sciences, and psychology. In this thesis, which is split into two major parts, we explore two aspects of affective computing, namely "emotion recognition" and "expression synthesis", since they constitute the most important aspects one needs to consider when building human-robot interaction systems. To this end, in the first part, we explore and study various information streams that contain valuable information for recognizing the emotions of a human, and design deep learning-based architectures that can efficiently combine information from these streams, with the ultimate goal of deploying the system for human-robot interaction, with an emphasis on child-robot interaction scenarios. While traditional approaches to emotion recognition have mostly focused on facial expressions and speech, we take into account body language and context, and also employ embeddings that accurately capture the semantic distances between discrete emotions. In the second part, we first enhance existing methods for audiovisual speech synthesis by giving them the capability both to combine emotions and to express them at different intensity levels. Then, we design a deep learning-based architecture for expressive audiovisual speech synthesis that achieves a high level of realism and expressiveness, outperforming previous methods. Lastly, we present the first method for visual-speech-aware perceptual monocular 3D reconstruction in the wild. This work tackles the traditional bottleneck of collecting high-fidelity 3D ground-truth data and offers the field of affective computing an easier way to acquire expressive 3D facial data from monocular videos.	en
heal.advisorName	Μαραγκός, Πέτρος	el
heal.advisorName	Maragos, Petros	en
heal.committeeMemberName	Potamianos, Alexandros	en
heal.committeeMemberName	Potamianos, Gerasimos	en
heal.committeeMemberName	Tzafestas, Costas	en
heal.committeeMemberName	Katsamanis, Athanasios	en
heal.committeeMemberName	Kollias, Stefanos	en
heal.committeeMemberName	Maragos, Petros	en
heal.committeeMemberName	Roussos, Anastasios	en
heal.academicPublisher	Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών. Τομέας Σημάτων, Ελέγχου και Ρομποτικής	el
heal.academicPublisherID	ntua
heal.fullTextAvailability	false