HEAL DSpace

Self-Attention Based Generative Adversarial Networks for Unsupervised Video Summarization

dc.contributor.author Μηναΐδη, Μαρία Νεκταρία el
dc.contributor.author Minaidi, Maria Nektaria el
dc.date.accessioned 2023-05-17T06:43:21Z
dc.date.available 2023-05-17T06:43:21Z
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/57713
dc.identifier.uri http://dx.doi.org/10.26240/heal.ntua.25410
dc.description National Technical University of Athens--Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Programme (D.P.M.S.) en
dc.rights Default License
dc.subject Deep Learning en
dc.subject Automatic Video Summarization en
dc.subject Long Short-Term Memory Networks en
dc.subject Attention Mechanism en
dc.subject Generative Adversarial Networks en
dc.title Self-Attention Based Generative Adversarial Networks for Unsupervised Video Summarization en
dc.title Παραγωγικά Ανταγωνιστικά Δίκτυα με βάση την Αυτό-Προσοχή για την Μη-Επιβλεπόμενη Περίληψη Βίντεο el
dc.contributor.department Speech and Language Processing Group en
heal.type masterThesis
heal.classification Deep Machine Learning en
heal.classification Deep Learning en
heal.language el
heal.language en
heal.access free
heal.recordProvider ntua el
heal.publicationDate 2022-11-08
heal.abstract This thesis addresses video summarization using unsupervised learning and attention networks. The amount of data generated daily is growing at an exponential rate, and with it the need for users to select, browse, and consume extensive video collections and to store them efficiently. Automatic video summarization, which aims to provide a short visual summary of an original, full-length video, addresses these needs and is an active research topic. Following recent advances in neural networks, many video summarization architectures based on deep neural networks have been proposed in recent years. In this work, we treat video summarization as the problem of selecting the most characteristic key-shots (sequences of consecutive frames), and we use deep learning techniques and generative adversarial networks to build a model that efficiently summarizes input videos. Each video's visual content is represented by a feature vector for each frame. First, motivated by the desire to overcome the disadvantages of Long Short-Term Memory networks and to exploit the advantages of attention mechanisms, we build our model by extending a simple generative adversarial network with attention mechanisms in different parts of the architecture. Then, through a set of experiments on the resulting models that serves as an ablation study, we determine how much incorporating attention and improving the temporal modeling of frames contribute to key-shot selection and to the efficiency of the architecture. Finally, we evaluate these models on two popular datasets, SumMe and TVSum, which consist of short videos and have been used extensively to train and evaluate video summarization models. Additionally, drawing on one more database, we create an additional dataset of longer videos on which we evaluate our models. The results indicate that the model generalizes and that attention mechanisms are effective in each case: using self-attention as the frame selection mechanism outperforms the state-of-the-art approaches on SumMe and TVSum. en
heal.advisorName Ποταμιάνος, Αλέξανδρος el
heal.committeeMemberName Τζαφέστας, Κωνσταντίνος el
heal.committeeMemberName Σιόλας, Γεώργιος el
heal.academicPublisher National Technical University of Athens. School of Electrical and Computer Engineering en
heal.academicPublisherID ntua
heal.numberOfPages 98 p. en
heal.fullTextAvailability false

