Οπτικοακουστική σύνθεση φωνής με χρήση κρυφών μαρκοβιανών μοντέλων (Audiovisual speech synthesis using hidden Markov models)

Φιλντίσης, Παναγιώτης Παρασκευάς; Filntisis, Panagiotis Paraskevas

dc.contributor.author	Φιλντίσης, Παναγιώτης Παρασκευάς	el
dc.contributor.author	Filntisis, Panagiotis Paraskevas	en
dc.date.accessioned	2016-03-09T13:25:36Z
dc.date.available	2016-03-09T13:25:36Z
dc.date.issued	2016-03-09
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/42129
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.10504
dc.rights	Default License
dc.subject	aam	en
dc.subject	hmm	en
dc.subject	Audiovisual speech synthesis	en
dc.subject	Οπτικοακουστική σύνθεση φωνής	el
dc.subject	Αναγνώριση προτύπων	el
dc.title	Οπτικοακουστική σύνθεση φωνής με χρήση κρυφών μαρκοβιανών μοντέλων	el
heal.type	bachelorThesis
heal.classification	Pattern recognition	el
heal.classificationURI	http://skos.um.es/unescothes/C02924
heal.language	el
heal.access	free
heal.recordProvider	ntua	el
heal.publicationDate	2015-10-30
heal.abstract	Στην παρούσα διπλωματική εργασία παρουσιάζεται ένα πλήρες οπτικοακουστικό σύστημα σύνθεσης φωνής για την Ελληνική γλώσσα. Κατά την υλοποίηση ενός τέτοιου συστήματος αντλούνται τεχνικές από διάφορους επιστημονικούς τομείς όπως η Μηχανική Μάθηση, η Επεξεργασία Σημάτων, και η Όραση Υπολογιστών. Εκκινώντας με την εισαγωγή, παρουσιάζουμε την ιστορική αναδρομή και τις σημαντικότερες μεθόδους για την υλοποίηση ενός οπτικοακουστικού συνθέτη φωνής. Εν συνεχεία, στα επόμενα κεφάλαια παρουσιάζεται η απαραίτητη θεωρητική ανάλυση για την υλοποίηση του οπτικοακουστικού συστήματος σύνθεσης φωνής, παράλληλα με τα πειραματικά αποτελέσματα που λήφθηκαν κατά την υλοποίηση και αξιολόγηση του συστήματος. Η αξιολόγηση του συστήματος είναι ιδιαίτερα ενθαρρυντική τόσο για την παραγόμενη ομιλία, όσο και για την παραγόμενη εικονοσειρά, ανοίγοντας διάπλατα τον δρόμο για την μετέπειτα εξέλιξη του συστήματος σε εφαρμογές όπως η συναισθηματική οπτικοακουστική σύνθεση φωνής, μια πρώτη προσέγγιση και αξιολόγηση της οποίας κάνουμε στο τελευταίο Κεφάλαιο.	el
heal.abstract	In this diploma thesis, we present a complete audiovisual text-to-speech synthesis system for the Greek language. Implementing such a system draws on techniques from several scientific fields, such as Machine Learning, Signal Processing, and Computer Vision. We begin with an introduction that reviews the history of audiovisual speech synthesis and the most important methods for implementing such a system. The following chapters present the theoretical analysis needed to implement the system, together with the experimental results obtained during its implementation and evaluation. The evaluation is especially encouraging for both the synthetic speech and the synthetic video, opening the way for extending the system to applications such as emotional and expressive speech synthesis, for which we present a first approach and evaluation in the last chapter.	en
heal.advisorName	Μαραγκός, Πέτρος	el
heal.committeeMemberName	Μαραγκός, Πέτρος	el
heal.committeeMemberName	Ποταμιάνος, Αλέξανδρος	el
heal.committeeMemberName	Πρωτόπαπας, Αθανάσιος	el
heal.academicPublisher	Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών. Τομέας Σημάτων, Ελέγχου και Ρομποτικής	el
heal.academicPublisherID	ntua
heal.numberOfPages	95 σ.
heal.fullTextAvailability	true