HEAL DSpace

Transfer learning and attention-based conditioning methods for natural language processing

DSpace/Manakin Repository

Show simple item record

dc.contributor.author Μαργατίνα, Αικατερίνη el
dc.contributor.author Margatina, Aikaterini en
dc.date.accessioned 2019-07-24T10:55:18Z
dc.date.available 2019-07-24T10:55:18Z
dc.date.issued 2019-07-24
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/49145
dc.identifier.uri http://dx.doi.org/10.26240/heal.ntua.16780
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Greece
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/gr/
dc.subject Επεξεργασία φυσικής γλώσσας el
dc.subject Μηχανική μάθηση el
dc.subject Natural language processing en
dc.subject Machine learning en
dc.subject Neural networks en
dc.subject Sentiment analysis en
dc.title Transfer learning and attention-based conditioning methods for natural language processing en
heal.type bachelorThesis
heal.classification Επεξεργασία φυσικής γλώσσας el
heal.classification Natural language processing en
heal.language el
heal.language en
heal.access campus
heal.recordProvider ntua el
heal.publicationDate 2019-07-05
heal.abstract In this work, we investigate methods to augment the inductive bias of deep neural models for natural language processing tasks. Our goal is to improve the performance of recurrent neural networks on a family of sentiment analysis tasks. Specifically, our research includes: (1) transferring knowledge from pretrained models in order to leverage different domains and tasks, and (2) integrating prior information from human experts into deep neural architectures.

First, we propose a method for utilizing a pretrained sentiment analysis classification model to reduce the test error rate on an emotion recognition classification task. Transfer learning from pretrained classifiers exploits the representation learned in one supervised setting with plenty of data to obtain competitive results on a related task where only a smaller dataset is available. We aim to leverage the learned representation of the pretrained sentiment model to tackle the emotion classification task.

Next, we utilize pretrained representations from language models to address the same emotion classification task. In this case, the learning algorithm uses information obtained in an unsupervised phase to perform better in the supervised learning stage. Pretrained word representations captured by language models are useful because they encode contextual information and model syntax and semantics. We propose a three-step transfer learning method: pretraining a language model, fine-tuning its weights on the target task, and transferring the model to a classifier that leverages these representations. We show a 10% improvement over the baseline on the WASSA 2018 emotion recognition dataset, achieving an F1-score of 70.3% and ranking in the top-3 positions of the shared task.

Finally, we experiment with feature-wise conditioning methods to integrate prior knowledge into deep neural networks. We propose integrating lexicon features into the self-attention mechanism of RNN-based architectures. This form of conditioning on the attention distribution emphasizes the contribution of the words most salient to the task at hand. We introduce three methods: attentional concatenation, feature-based gating, and affine transformation. Experiments on six benchmark datasets show the effectiveness of our methods; attentional feature-based gating yields consistent performance improvements across tasks. Our approach is implemented as a simple add-on module for RNN-based models, adds minimal computational overhead, and can be adapted to any deep neural architecture.

Overall, our work is divided into two main research areas: the first is transfer learning methods of pretrained representations for implicit emotion recognition, and the second is attention-based conditioning methods for integrating external knowledge into recurrent neural networks. Both lines of work culminated in research papers, [25] and [83] respectively. en
heal.advisorName Ποταμιάνος, Αλέξανδρος el
heal.committeeMemberName Σταφυλοπάτης, Ανδρέας el
heal.committeeMemberName Τζαφέστας, Κωνσταντίνος el
heal.academicPublisher Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών. Τομέας Σημάτων, Ελέγχου και Ρομποτικής (National Technical University of Athens. School of Electrical and Computer Engineering. Division of Signals, Control and Robotics) el
heal.academicPublisherID ntua
heal.numberOfPages 132 p.
heal.fullTextAvailability true
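
To make the three-step transfer method described in the abstract above concrete, here is a minimal PyTorch sketch of the pipeline: pretrain an LSTM language model, fine-tune it on the target-task corpus, then reuse its encoder inside a classifier. All module names, dimensions, and the mean-pooling classifier head are illustrative assumptions, not the thesis implementation.

```python
import torch.nn as nn

# A minimal sketch, under assumed names and sizes, of the three-step transfer:
#   (1) pretrain a language model on a large generic corpus,
#   (2) fine-tune its weights on the target-task corpus,
#   (3) transfer the encoder into a classifier for emotion recognition.

HIDDEN_DIM = 512  # illustrative choice

class LMEncoder(nn.Module):
    """Embedding + LSTM shared between the language model and the classifier."""
    def __init__(self, vocab_size, emb_dim=300):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, HIDDEN_DIM, batch_first=True)

    def forward(self, tokens):               # tokens: (batch, seq) token ids
        out, _ = self.rnn(self.embed(tokens))
        return out                           # (batch, seq, HIDDEN_DIM)

class LanguageModel(nn.Module):
    """Steps 1-2: trained with a next-word prediction objective."""
    def __init__(self, vocab_size):
        super().__init__()
        self.encoder = LMEncoder(vocab_size)
        self.head = nn.Linear(HIDDEN_DIM, vocab_size)

    def forward(self, tokens):
        return self.head(self.encoder(tokens))   # per-position vocab logits

class EmotionClassifier(nn.Module):
    """Step 3: reuse the fine-tuned encoder, add a classification head."""
    def __init__(self, pretrained_encoder, num_classes):
        super().__init__()
        self.encoder = pretrained_encoder
        self.head = nn.Linear(HIDDEN_DIM, num_classes)

    def forward(self, tokens):
        states = self.encoder(tokens)
        return self.head(states.mean(dim=1))      # mean-pool, then classify

# Usage sketch: lm = LanguageModel(vocab_size=30000); train it on generic and
# then target-domain text; clf = EmotionClassifier(lm.encoder, num_classes=6).
```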
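
The abstract also names three ways of conditioning self-attention on lexicon features: attentional concatenation, feature-based gating, and affine transformation. The sketch below gives one plausible reading of each variant on top of RNN hidden states; the class name, signatures, and dimensions are assumptions for illustration, not the authors' code.

```python
import torch
import torch.nn as nn

class ConditionedSelfAttention(nn.Module):
    """Lexicon-conditioned attention over RNN states; a sketch, not thesis code.

    method: 'concat' -> attentional concatenation, score [h_i; c_i]
            'gate'   -> feature-based gating, sigmoid(W c_i) * h_i
            'affine' -> affine transformation, gamma(c_i) * h_i + beta(c_i)
    """
    def __init__(self, hidden_dim, lexicon_dim, method="gate"):
        super().__init__()
        self.method = method
        score_in = hidden_dim + lexicon_dim if method == "concat" else hidden_dim
        self.score = nn.Linear(score_in, 1)
        if method == "gate":
            self.gate = nn.Linear(lexicon_dim, hidden_dim)
        elif method == "affine":
            self.gamma = nn.Linear(lexicon_dim, hidden_dim)
            self.beta = nn.Linear(lexicon_dim, hidden_dim)

    def forward(self, h, c):
        # h: (batch, seq, hidden_dim) RNN states
        # c: (batch, seq, lexicon_dim) word-level lexicon features
        if self.method == "concat":
            e = self.score(torch.cat([h, c], dim=-1))
        elif self.method == "gate":
            e = self.score(torch.sigmoid(self.gate(c)) * h)
        else:  # affine
            e = self.score(self.gamma(c) * h + self.beta(c))
        a = torch.softmax(e.squeeze(-1), dim=-1)    # attention weights
        return (a.unsqueeze(-1) * h).sum(dim=1), a  # summary vector, weights
```

Under these assumptions, swapping a plain self-attention layer for ConditionedSelfAttention(hidden_dim=512, lexicon_dim=10, method="gate") adds only one or two small linear layers, consistent with the abstract's description of a simple add-on module with minimal computational overhead.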

