HEAL DSpace

Transfer learning and attention-based conditioning methods for natural language processing

DSpace/Manakin Repository

Show simple item record

dc.contributor.author Μαργατίνα, Αικατερίνη el
dc.contributor.author Margatina, Aikaterini en
dc.date.accessioned 2019-07-24T10:55:18Z
dc.date.available 2019-07-24T10:55:18Z
dc.date.issued 2019-07-24
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/49145
dc.identifier.uri http://dx.doi.org/10.26240/heal.ntua.16780
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Greece
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/gr/
dc.subject Επεξεργασία φυσικής γλώσσας el
dc.subject Μηχανική μάθηση el
dc.subject Natural language processing en
dc.subject Machine learning en
dc.subject Neural networks en
dc.subject Sentiment analysis en
dc.title Transfer learning and attention-based conditioning methods for natural language processing en
heal.type bachelorThesis
heal.classification Επεξεργασία φυσικής γλώσσας el
heal.classification Natural language processing en
heal.language el
heal.language en
heal.access campus
heal.recordProvider ntua el
heal.publicationDate 2019-07-05
heal.abstract In this work, we investigate methods to augment the inductive bias of deep neural models for natural language processing tasks. Our goal is to improve the performance of recurrent neural networks on a family of sentiment analysis tasks. Specifically, our research includes: (1) transferring knowledge from pretrained models in order to leverage different domains and tasks, and (2) integrating prior information from human experts into deep neural architectures.

First, we propose a method for utilizing a pretrained sentiment analysis classification model to reduce the test error rate on an emotion recognition classification task. Transfer learning from pretrained classifiers exploits the representation learned in one supervised setting with plenty of data to obtain competitive results on a related task where only a smaller dataset is available. We aim to leverage the learned representation of the pretrained sentiment model to tackle the emotion classification task.

Next, we utilize pretrained representations from language models to address the same emotion classification task. In this case, the learning algorithm uses information obtained in an unsupervised phase to perform better in the supervised learning stage. Pretrained word representations captured by language models are useful because they encode contextual information and model syntax and semantics. We propose a three-step transfer learning method: pretraining a language model, fine-tuning its weights on the target task, and transferring the model to a classifier that leverages these representations. We show a 10% improvement over the baseline on the WASSA 2018 emotion recognition dataset, achieving an F1-score of 70.3% and ranking in the top-3 positions of the shared task.

Finally, we experiment with feature-wise conditioning methods to integrate prior knowledge into deep neural networks. We propose integrating lexicon features into the self-attention mechanism of RNN-based architectures. This form of conditioning on the attention distribution emphasizes the contribution of the words most salient to the task at hand. We introduce three methods: attentional concatenation, feature-based gating, and affine transformation. Experiments on six benchmark datasets show the effectiveness of our methods; attentional feature-based gating yields consistent performance improvements across tasks. Our approach is implemented as a simple add-on module for RNN-based models, adds minimal computational overhead, and can be adapted to any deep neural architecture.

Overall, our work is divided into two main research areas: the first is transfer learning methods of pretrained representations for implicit emotion recognition, and the second is attention-based conditioning methods for integrating external knowledge into recurrent neural networks. Both lines of work culminated in research papers, [25] and [83] respectively. en
heal.advisorName Ποταμιάνος, Αλέξανδρος el
heal.committeeMemberName Σταφυλοπάτης, Ανδρέας el
heal.committeeMemberName Τζαφέστας, Κωνσταντίνος el
heal.academicPublisher Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών. Τομέας Σημάτων, Ελέγχου και Ρομποτικής (National Technical University of Athens. School of Electrical and Computer Engineering. Division of Signals, Control and Robotics) el
heal.academicPublisherID ntua
heal.numberOfPages 132 p.
heal.fullTextAvailability true
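
To make the three-step transfer method described in the abstract above concrete, here is a minimal PyTorch sketch of the pipeline: pretrain an LSTM language model, fine-tune it on the target-task corpus, then reuse its encoder inside a classifier. All module names, dimensions, and the mean-pooling classifier head are illustrative assumptions, not the thesis implementation.

```python
import torch.nn as nn

# A minimal sketch, under assumed names and sizes, of the three-step transfer:
#   (1) pretrain a language model on a large generic corpus,
#   (2) fine-tune its weights on the target-task corpus,
#   (3) transfer the encoder into a classifier for emotion recognition.

HIDDEN_DIM = 512  # illustrative choice

class LMEncoder(nn.Module):
    """Embedding + LSTM shared between the language model and the classifier."""
    def __init__(self, vocab_size, emb_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, HIDDEN_DIM, batch_first=True)

    def forward(self, tokens):               # tokens: (batch, seq) token ids
        out, _ = self.rnn(self.embed(tokens))
        return out                           # (batch, seq, HIDDEN_DIM)

class LanguageModel(nn.Module):
    """Steps 1-2: trained with a next-word prediction objective."""
    def __init__(self, vocab_size):
        super().__init__()
        self.encoder = LMEncoder(vocab_size)
        self.head = nn.Linear(HIDDEN_DIM, vocab_size)

    def forward(self, tokens):
        return self.head(self.encoder(tokens))   # per-position vocab logits

class EmotionClassifier(nn.Module):
    """Step 3: reuse the fine-tuned encoder, add a classification head."""
    def __init__(self, pretrained_encoder, num_classes):
        super().__init__()
        self.encoder = pretrained_encoder
        self.head = nn.Linear(HIDDEN_DIM, num_classes)

    def forward(self, tokens):
        states = self.encoder(tokens)
        return self.head(states.mean(dim=1))      # mean-pool, then classify

# Usage sketch: lm = LanguageModel(vocab_size=30000); train it on generic and
# then target-domain text; clf = EmotionClassifier(lm.encoder, num_classes=6).
```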
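
The abstract also names three ways of conditioning self-attention on lexicon features: attentional concatenation, feature-based gating, and affine transformation. The sketch below gives one plausible reading of each variant on top of RNN hidden states; the class name, signatures, and dimensions are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

class ConditionedSelfAttention(nn.Module):
    """Lexicon-conditioned attention over RNN states; a sketch, not thesis code.

    method: 'concat' -> attentional concatenation, score [h_i; c_i]
            'gate'   -> feature-based gating, sigmoid(W c_i) * h_i
            'affine' -> affine transformation, gamma(c_i) * h_i + beta(c_i)
    """
    def __init__(self, hidden_dim, lexicon_dim, method="gate"):
        super().__init__()
        self.method = method
        score_in = hidden_dim + lexicon_dim if method == "concat" else hidden_dim
        self.score = nn.Linear(score_in, 1)
        if method == "gate":
            self.gate = nn.Linear(lexicon_dim, hidden_dim)
        elif method == "affine":
            self.gamma = nn.Linear(lexicon_dim, hidden_dim)
            self.beta = nn.Linear(lexicon_dim, hidden_dim)

    def forward(self, h, c):
        # h: (batch, seq, hidden_dim) RNN states
        # c: (batch, seq, lexicon_dim) word-level lexicon features
        if self.method == "concat":
            e = self.score(torch.cat([h, c], dim=-1))
        elif self.method == "gate":
            e = self.score(torch.sigmoid(self.gate(c)) * h)
        else:  # affine
            e = self.score(self.gamma(c) * h + self.beta(c))
        a = torch.softmax(e.squeeze(-1), dim=-1)    # attention weights
        return (a.unsqueeze(-1) * h).sum(dim=1), a  # summary vector, weights
```

Under these assumptions, swapping a plain self-attention layer for ConditionedSelfAttention(hidden_dim=512, lexicon_dim=10, method="gate") adds only one or two small linear layers, consistent with the abstract's description of a simple add-on module with minimal computational overhead.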

