Exploiting feature-based and logit-based knowledge distillation for improved teacher-student deep neural networks


dc.contributor.author Μαυρεπής, Φίλιππος el
dc.contributor.author Mavrepis, Philippos en
dc.date.accessioned 2023-09-06T09:25:23Z
dc.date.available 2023-09-06T09:25:23Z
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/58027
dc.identifier.uri http://dx.doi.org/10.26240/heal.ntua.25724
dc.rights Default License
dc.subject Knowledge Distillation en
dc.subject Deep Learning en
dc.subject Teacher-Student Architectures en
dc.subject Απόσταξη Γνώσης el
dc.subject Αρχιτεκτονικές Δασκάλου-Μαθητή el
dc.subject Βαθιά Μάθηση el
dc.title Exploiting feature-based and logit-based knowledge distillation for improved teacher-student deep neural networks en
dc.contributor.department Remote Sensing Lab (RSLab) el
heal.type bachelorThesis
heal.classification Computer Science en
heal.language el
heal.language en
heal.access campus
heal.recordProvider ntua el
heal.publicationDate 2023-02-26
heal.abstract Knowledge distillation is a family of techniques for transferring knowledge between two or more networks. These networks are usually referred to as student(s) and teacher(s), the main goal being to improve the student's performance by exploiting the knowledge of the teacher network. Previous studies have explored transferring knowledge through feature-based and/or logit-based approaches, as well as through same-level and cross-level information between teacher and student networks. This diploma thesis examines the combination of state-of-the-art methods in feature-based and logit-based knowledge distillation: 'Distillation via Knowledge Review', also known as 'ReviewKD', and 'Decoupled Knowledge Distillation', or 'DKD'. These methods are merged into a novel technique named 'ReviewDKD', whose performance we evaluate. In addition, we explore the effect of data augmentation techniques such as MixUp and propose a novel way to apply MixUp to teacher and student networks. We apply our method to a variety of teacher-student architectures for image classification on CIFAR-100. Our results are relatively promising for specific architecture pairs, with the student able to surpass the teacher network in some cases. en
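
The abstract refers to logit-based distillation and MixUp only at a high level. As a reading aid, here is a minimal PyTorch sketch of the two classical building blocks the thesis starts from: the softened-logit distillation loss of Hinton et al. (2015) and standard MixUp augmentation (Zhang et al., 2018). This is not the thesis's ReviewDKD method or its MixUp variant; the function names, temperature T, and weighting coefficients are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def mixup(inputs, targets, mix_alpha=0.2):
        # Standard MixUp: train on convex combinations of input pairs;
        # the loss is later mixed with the same coefficient lam.
        lam = torch.distributions.Beta(mix_alpha, mix_alpha).sample().item()
        perm = torch.randperm(inputs.size(0))
        mixed = lam * inputs + (1 - lam) * inputs[perm]
        return mixed, targets, targets[perm], lam

    def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
        # Soft term: KL divergence between temperature-softened teacher and
        # student distributions; the T**2 factor keeps gradient magnitudes
        # comparable across temperatures.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=1),
            F.softmax(teacher_logits / T, dim=1),
            reduction="batchmean",
        ) * (T ** 2)
        # Hard term: ordinary cross-entropy against the ground-truth labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard

    # Hypothetical usage: a batch of 8 CIFAR-100 logit pairs.
    student_logits = torch.randn(8, 100)
    teacher_logits = torch.randn(8, 100)
    labels = torch.randint(0, 100, (8,))
    loss = kd_loss(student_logits, teacher_logits, labels)

DKD further decomposes the soft term into target-class and non-target-class components, and ReviewKD adds feature-level supervision across network stages; the sketch covers only the logit-based core that both methods share.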
heal.advisorName Karantzalos, Konstantinos
heal.advisorName Kakogeorgiou, Ioannis
heal.committeeMemberName Karantzalos, Konstantinos
heal.committeeMemberName Stamou, Giorgos
heal.committeeMemberName Voulodimos, Athanasios
heal.academicPublisher Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών el
heal.academicPublisherID ntua
heal.numberOfPages 62
heal.fullTextAvailability false

