dc.contributor.author |
Μαυρεπής, Φίλιππος
|
el |
dc.contributor.author |
Mavrepis, Philippos
|
en |
dc.date.accessioned |
2023-09-06T09:25:23Z |
|
dc.date.available |
2023-09-06T09:25:23Z |
|
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/58027 |
|
dc.identifier.uri |
http://dx.doi.org/10.26240/heal.ntua.25724 |
|
dc.rights |
Default License |
|
dc.subject |
Knowledge Distillation |
en |
dc.subject |
Deep Learning |
en |
dc.subject |
Teacher-Student Architectures |
en |
dc.subject |
Απόσταξη Γνώσης |
el |
dc.subject |
Αρχιτεκτονικές Δασκάλου-Μαθητή |
el |
dc.subject |
Βαθιά Μάθηση |
el |
dc.title |
Exploiting feature-based and logit-based knowledge distillation for
improved teacher-student deep neural networks |
en |
dc.contributor.department |
Remote Sensing Lab (RSLab) |
el |
heal.type |
bachelorThesis |
|
heal.classification |
Computer Science |
en |
heal.language |
el |
|
heal.language |
en |
|
heal.access |
campus |
|
heal.recordProvider |
ntua |
el |
heal.publicationDate |
2023-02-26 |
|
heal.abstract |
Knowledge distillation is one of the techniques used to transfer knowledge between two or more networks. Usually, those networks are referred to as student(s) and teacher(s), with the main goal being to increase the metric of performance for the student(s) through exploitation of the knowledge from the teacher network. Previous studies have explored transferring knowledge through feature-based and/or logit-based approaches. Besides that, the utilisation of same and cross-level information between teacher and student networks has also been explored. This diploma thesis examines the combination of state of the art methods in feature- based and logit-based knowledge distillation. The techniques used are, 'Distillation via knowledge Review' also known as 'ReviewKD' and 'Decoupled Knowledge Distillation' or 'DKD'. Those method were merged to create a novel technique named 'ReviewDKD' and test its performance. In addition, we explore the effect of data augmentation techniques such as MixUp and propose a novel way to apply MixUp to teacher and student networks. We apply our method to a variety of teacher-student architectures for the problem of image classification on CIFAR-100. To this end our result show relatively promising results for specific architecture pairs with the student being able to surpass the teacher network at some cases. |
en |
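For context on the logit-based side of the abstract above: the following is a minimal sketch of the classic logit-based distillation loss (temperature-softened KL divergence, as in Hinton et al.'s original formulation) in plain Python. It is illustrative background only, not the thesis's ReviewDKD method; the function and variable names are our own.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T yields a softer distribution."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Logit-based distillation loss: KL(teacher || student) on
    temperature-softened distributions, scaled by T^2 so its gradient
    magnitude is comparable to that of the hard-label loss."""
    p = softmax(teacher_logits, T)  # teacher's soft targets
    q = softmax(student_logits, T)  # student's soft predictions
    return T * T * sum(pi * (math.log(pi) - math.log(qi))
                       for pi, qi in zip(p, q))

# Identical logits give zero loss; mismatched logits give a positive penalty.
print(abs(kd_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])) < 1e-9)  # True
print(kd_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0]) > 0.0)        # True
```

In practice this term is combined with the standard cross-entropy on ground-truth labels; feature-based methods such as ReviewKD instead match intermediate representations between teacher and student.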
heal.advisorName |
Karantzalos, Konstantinos
|
|
heal.advisorName |
Kakogeorgiou, Ioannis
|
|
heal.committeeMemberName |
Karantzalos, Konstantinos
|
|
heal.committeeMemberName |
Stamou, Giorgos
|
|
heal.committeeMemberName |
Voulodimos, Athanasios
|
|
heal.academicPublisher |
Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών |
el |
heal.academicPublisherID |
ntua |
|
heal.numberOfPages |
62 |
|
heal.fullTextAvailability |
false |
|