Detection of adversarial examples in convolutional neural networks

Περτιγκιόζογλου, Στέφανος; Pertigkiozoglou, Stefanos

dc.contributor.author	Περτιγκιόζογλου, Στέφανος	el
dc.contributor.author	Pertigkiozoglou, Stefanos	en
dc.date.accessioned	2018-09-27T10:07:32Z
dc.date.available	2018-09-27T10:07:32Z
dc.date.issued	2018-09-27
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/47692
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.15952
dc.rights	Default License
dc.subject	Μηχανική μάθηση	el
dc.subject	Νευρωνικά δίκτυα	el
dc.subject	Όραση υπολογιστών	el
dc.subject	Επεξεργασία εικόνας	el
dc.subject	Επιθετικά παραδείγματα	el
dc.subject	Machine learning	en
dc.subject	Computer vision	en
dc.subject	Image processing	en
dc.subject	Neural networks	en
dc.subject	Adversarial examples	en
dc.title	Εντοπισμός επιθετικών παραδειγμάτων σε συνελικτικά νευρωνικά δίκτυα	el
heal.type	bachelorThesis
heal.classification	Αντίληψη και όραση υπολογιστών	el
heal.classification	Προχωρημένη μηχανική μάθηση	el
heal.classificationURI	http://data.seab.gr/concepts/12c1c913dbe758d67c4c509a6768bdbc7905830c
heal.classificationURI	http://data.seab.gr/concepts/d5cf140063d31fceb414be6c8dcb4654ffd3efcf
heal.language	el
heal.access	campus
heal.recordProvider	ntua	el
heal.publicationDate	2018-07-24
heal.abstract	Τα εντυπωσιακά αποτελέσματα που επιτυγχάνονται με την χρήση βαθιών νευρωνικών δικτύων, έχουν ως αποτέλεσμα την μεγάλη εξάπλωση της χρήσης τους σε πολλές εφαρμογές μηχανικής μάθησης. Τα συστήματα αυτά όμως είναι ευάλωτα σε ειδικά κατασκευασμένες εισόδους, τα επιθετικά παραδείγματα, τα οποία ενώ δεν γίνονται εύκολα αντιληπτά από ανθρώπινους παρατηρητές, οδηγούν τα νευρωνικά δίκτυα σε λανθασμένα συμπεράσματα. Η εργασία αυτή εκτελεί ανάλυση των βασικών χαρακτηριστικών των επιθετικών παραδειγμάτων για συνελικτικά νευρωνικά δίκτυα τα οποία χρησιμοποιούνται σε εφαρμογές αναγνώρισης εικόνων, δίνοντας ιδιαίτερη βαρύτητα στα επιθετικά παραδείγματα που παράγονται από την Fast Gradient Sign Method, ενώ παράλληλα προτείνει τρεις νέες μεθόδους οι οποίες στοχεύουν στον εντοπισμό πιθανών επιθετικών παραδειγμάτων. Η πρώτη από τις προτεινόμενες μεθόδους βασίζεται στην ομαλοποίηση του διανύσματος χαρακτηριστικών που παράγει σαν έξοδο το νευρωνικό δίκτυο, προκειμένου να εντοπίσει πιθανά επιθετικά παραδείγματα. Η δεύτερη μέθοδος κάνει χρήση των ιστογραμμάτων των τιμών των εξόδων των ενδιάμεσων επιπέδων του νευρωνικού δικτύου, για να συμπεράνει αν η είσοδος του δικτύου αποτελεί επιθετικό παράδειγμα. Τα ιστογράμματα αυτά συγκροτούν ένα διάνυσμα χαρακτηριστικών το οποίο εισάγεται σαν είσοδος σε ένα SVM που ταξινομεί την αρχική είσοδο είτε ως επιθετικό παράδειγμα είτε ως πραγματική είσοδο. Τέλος για την τρίτη μέθοδο παρουσιάζουμε την έννοια της υπολειπόμενης εικόνας, η οποία περιέχει πληροφορία σχετικά με τα μέρη του προτύπου εισόδου τα οποία αγνοούνται από το νευρωνικό δίκτυο. Η μέθοδος αυτή στοχεύει στον εντοπισμό πιθανών επιθετικών εικόνων, χρησιμοποιώντας την πληροφορία που παρέχει η υπολειπόμενη εικόνα και ενισχύοντας τα μέρη του προτύπου εισόδου που αγνοούνται από το νευρωνικό δίκτυο. Για τις τρεις μεθόδους που προτείνονται, παρουσιάζονται τα αποτελέσματα εντοπισμού επιθετικών παραδειγμάτων σε ένα νευρωνικό δίκτυο εκπαιδευμένο στο MNIST σύνολο δεδομένων. 
Επίσης για την τρίτη μέθοδο παρουσιάζονται αποτελέσματα εντοπισμού επιθετικών παραδειγμάτων και σε ένα νευρωνικό δίκτυο εκπαιδευμένο στο CIFAR-10 σύνολο δεδομένων. Τέλος παρουσιάζεται η δυνατότητα συνδυασμού των μεθόδων για περαιτέρω ενίσχυση των αποτελεσμάτων εντοπισμού.	el
heal.abstract	The great success of deep neural networks has led to their widespread use in a large variety of machine learning applications. However, these systems are vulnerable to specially crafted inputs, the adversarial examples, which are not easily perceived by human observers yet can lead a neural network to produce wrong results. This thesis analyzes the basic characteristics of adversarial examples for convolutional neural networks used in image recognition applications, with particular emphasis on adversarial examples produced by the Fast Gradient Sign Method, and proposes three new methods for detecting possible adversarial examples. The first proposed method is based on regularization of the feature vector that the neural network produces as its output. The second method uses histograms of the output values of the hidden layers of the neural network to infer whether the input is an adversarial example. These histograms form a feature vector that is fed to an SVM, which classifies the original input either as an adversarial example or as a real input. Finally, for the third method we introduce the concept of the residual image, which contains information about the parts of the input pattern that are ignored by the neural network. This method aims to detect possible adversarial images by using the information in the residual image and reinforcing the parts of the input pattern that the neural network ignores. For the three proposed methods, we present adversarial example detection results on a convolutional neural network trained on the MNIST dataset. For the third method, we also present detection results on a convolutional neural network trained on the CIFAR-10 dataset. Finally, the possibility of combining these methods is presented as a way to further improve detection results.	en
heal.advisorName	Μαραγκός, Πέτρος	el
heal.committeeMemberName	Ποταμιάνος, Γεράσιμος	el
heal.committeeMemberName	Ψυλλάκης, Χαράλαμπος	el
heal.committeeMemberName	Μαραγκός, Πέτρος	el
heal.academicPublisher	Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών. Τομέας Σημάτων, Ελέγχου και Ρομποτικής	el
heal.academicPublisherID	ntua
heal.numberOfPages	84 p.
heal.fullTextAvailability	true
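
The abstract singles out adversarial examples produced by the Fast Gradient Sign Method (FGSM). As a rough illustration of that attack only, the sketch below perturbs an input by `eps * sign(gradient of the loss with respect to the input)`. It is not taken from the thesis: a small linear softmax classifier stands in for the convolutional network so that the gradient can be written analytically, and every name here (`fgsm`, `cross_entropy`, `W`, `b`, `eps`, the random data) is a hypothetical choice made for the example.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(x, y, W, b):
    # Loss of the stand-in linear classifier for input x and true class y.
    return -np.log(softmax(x @ W + b)[y])

def fgsm(x, y, W, b, eps):
    # Gradient of the cross-entropy loss w.r.t. the input of a linear
    # softmax model: W @ (p - onehot(y)).
    p = softmax(x @ W + b)
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    grad_x = W @ (p - onehot)
    # FGSM step: move every pixel by eps in the direction that raises the
    # loss, then clip back to the valid pixel range [0, 1].
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

# Toy setup (assumed): 64 "pixels", 10 classes, random weights and input.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 10))
b = rng.normal(size=10)
x = rng.uniform(0.0, 1.0, size=64)
y = int(np.argmax(x @ W + b))   # treat the clean prediction as the label
eps = 0.1

x_adv = fgsm(x, y, W, b, eps)
print("clean loss:      ", cross_entropy(x, y, W, b))
print("adversarial loss:", cross_entropy(x_adv, y, W, b))
print("max pixel change:", np.max(np.abs(x_adv - x)))
```

For a real convolutional network the analytic gradient would be replaced by one obtained through automatic differentiation; the perturbation step itself stays the same. The detection methods described in the abstract operate on inputs like `x_adv` and are not reproduced here.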