dc.contributor.author |
Μηναΐδη, Μαρία Νεκταρία
|
el |
dc.contributor.author |
Minaidi, Maria Nektaria
|
el |
dc.date.accessioned |
2023-05-17T06:43:21Z |
|
dc.date.available |
2023-05-17T06:43:21Z |
|
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/57713 |
|
dc.identifier.uri |
http://dx.doi.org/10.26240/heal.ntua.25410 |
|
dc.description |
National Technical University of Athens--Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Programme (Δ.Π.Μ.Σ.) |
en |
dc.rights |
Default License |
|
dc.subject |
Βαθιά Μάθηση |
el |
dc.subject |
Αυτόματη Περίληψη Βίντεο |
el |
dc.subject |
Δίκτυα Μακράς και Βραχείας Μνήμης |
el |
dc.subject |
Παραγωγικά Ανταγωνιστικά Δίκτυα |
el |
dc.subject |
Μηχανισμός Προσοχής |
el |
dc.subject |
Deep Learning |
en |
dc.subject |
Automatic Video Summarization |
en |
dc.subject |
Long Short-Term Memory Networks |
en |
dc.subject |
Attention Mechanism |
en |
dc.subject |
Generative Adversarial Networks |
en |
dc.title |
Self-Attention Based Generative Adversarial Networks for Unsupervised Video Summarization |
en |
dc.title |
Παραγωγικά Ανταγωνιστικά Δίκτυα με βάση την Αυτό-Προσοχή για την Μη-Επιβλεπόμενη Περίληψη Βίντεο |
el |
dc.contributor.department |
Speech and Language Processing Group |
en |
heal.type |
masterThesis |
|
heal.classification |
Βαθιά Μηχανική Μάθηση |
el |
heal.classification |
Deep Learning |
en |
heal.language |
el |
|
heal.language |
en |
|
heal.access |
free |
|
heal.recordProvider |
ntua |
el |
heal.publicationDate |
2022-11-08 |
|
heal.abstract |
In this thesis we address video summarization using unsupervised learning and attention networks. The amount of data generated daily is growing at an exponential rate. With this growth comes an increasing need for users to select, browse, and consume extensive video collections, and to store large amounts of data efficiently. Automatic video summarization, which aims to provide a short visual summary of an original, full-length video, is considered necessary to meet these needs and is an active research topic. Following the recent development of neural networks, many video summarization architectures based on deep neural networks have been proposed in recent years. In this work, we treat video summarization as the problem of selecting the most characteristic key-shots (sequences of consecutive frames) and use deep learning techniques and generative adversarial networks to build a model that efficiently summarizes the input videos. The visual content of each video is represented by feature vectors that encode the visual information of each frame. First, motivated by the desire to overcome the disadvantages of Long Short-Term Memory networks and to exploit the advantages of attention mechanisms, we build our model by extending a simple generative adversarial network with attention mechanisms in different parts of the architecture. Then, through a set of experiments on the resulting models that serves as an ablation study, we determine how much incorporating attention and improving the temporal modeling of the frames contribute to key-shot selection and to the efficiency of our architecture. Finally, we evaluate these models on two popular datasets, which consist of short videos and have been used extensively to train and evaluate video summarization models. In addition, drawing on one more database, we create a further dataset of longer videos on which we also evaluate our models. The model generalizes well and the attention mechanisms prove effective in each case: the results show that using self-attention as the frame selection mechanism outperforms state-of-the-art approaches on SumMe and TVSum. |
en |
heal.advisorName |
Ποταμιάνος, Αλέξανδρος |
el |
heal.committeeMemberName |
Τζαφέστας, Κωνσταντίνος |
el |
heal.committeeMemberName |
Σιόλας, Γεώργιος |
el |
heal.academicPublisher |
Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών |
el |
heal.academicPublisherID |
ntua |
|
heal.numberOfPages |
98 σ. |
el |
heal.fullTextAvailability |
false |
|