HEAL DSpace

Cognitive and Cross-Topic Methods for Natural Language Representations

DSpace/Manakin Repository

Show simple item record

dc.contributor.author Αθανασίου, Νίκος el
dc.contributor.author Athanasiou, Nikos en
dc.date.accessioned 2020-03-30T14:22:57Z
dc.date.available 2020-03-30T14:22:57Z
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/49968
dc.identifier.uri http://dx.doi.org/10.26240/heal.ntua.17666
dc.rights Default License
dc.subject Computational neuroscience el
dc.subject Machine learning en
dc.subject Multiple word embeddings en
dc.subject Natural language representations en
dc.subject Topic modelling en
dc.title Διαθεματικές και γνωστικές μέθοδοι για αναπαραστάσεις φυσικής γλώσσας el
heal.type bachelorThesis
heal.secondaryTitle Cognitive and Cross-Topic Methods for Natural Language Representations el
heal.classification Machine Learning, Natural Language en
heal.language el
heal.language en
heal.access free
heal.recordProvider ntua el
heal.publicationDate 2019-07-05
heal.abstract In this work we investigate natural language representations from two different points of view: cognitive neuroscience and topic modelling. To evaluate each approach, we use multiple datasets and experimental setups that follow the guidelines established in the literature. We evaluate our work both quantitatively and qualitatively, providing insights and visualizations that make our results interpretable. First, from the angle of cognitive neuroscience, we explore how brain representations can help improve current corpus-based language representations. Neural activation models proposed in the literature use a set of example words for which fMRI measurements are available to find a mapping between word semantics and localized neural activations. Successful mappings let us expand to the full lexicon of concrete nouns, under the assumption that similarity of meaning implies similar neural activation patterns. We propose a computational model that estimates semantic similarity in the neural activation space and investigate its relative performance on various natural language processing tasks. Despite the simplicity of the proposed model and the very small number of example words used to bootstrap it, the neural activation semantic model performs surprisingly well compared to state-of-the-art word embeddings. Specifically, it outperforms the state of the art on semantic similarity estimation between very similar or very dissimilar words, while performing well on other tasks such as entailment and word categorization. These are strong indications that neural activation semantic models can not only shed light on human cognition but also contribute to computational models for certain tasks.
In the second part, we investigate how topic modelling can help us produce multi-prototype word embeddings and compare their performance with single-prototype models. In traditional Distributional Semantic Models (DSMs), the multiple senses of a polysemous word are conflated into a single vector space representation. In this work, we propose a DSM that learns multiple distributional representations of a word based on different topics. First, a separate DSM is trained for each topic; then each topic-based DSM is aligned to a common vector space. Our unsupervised mapping approach is motivated by the hypothesis that words that preserve their relative distances across different topic semantic sub-spaces constitute robust semantic anchors that define the mappings between those sub-spaces. Aligned cross-topic representations achieve state-of-the-art results on the task of contextual word similarity. Furthermore, evaluation on downstream NLP tasks shows that multiple topic-based embeddings outperform single-prototype models. en
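The anchor-based alignment described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that two topic-specific embedding matrices share the same vocabulary order, that anchors are the words whose cosine-similarity profiles change least between the two spaces, and that the mapping is an orthogonal transform found by orthogonal Procrustes. The function names (`cosine_matrix`, `select_anchors`, `procrustes_map`, `align`) and the parameter `k` are hypothetical.

```python
import numpy as np


def cosine_matrix(X):
    """Pairwise cosine similarities between the rows of X."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    return Xn @ Xn.T


def select_anchors(X_src, X_tgt, k):
    """Pick the k words whose similarity to all other words changes least
    between the two spaces (assumed anchor criterion, not from the source)."""
    drift = np.linalg.norm(cosine_matrix(X_src) - cosine_matrix(X_tgt), axis=1)
    return np.argsort(drift)[:k]


def procrustes_map(A, B):
    """Orthogonal W minimizing ||A @ W - B||_F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt


def align(X_src, X_tgt, k):
    """Map X_src into the space of X_tgt using only the anchor words."""
    anchors = select_anchors(X_src, X_tgt, k)
    W = procrustes_map(X_src[anchors], X_tgt[anchors])
    return X_src @ W, anchors
```

On synthetic data where one space is an exact rotation of the other, `align` should recover the target vectors up to numerical error. Real topic-based spaces differ by more than a rotation, so the anchor criterion and the choice of `k` would need tuning.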
heal.advisorName Ποταμιάνος, Αλέξανδρος el
heal.committeeMemberName Τζαφέστας, Κωσταντίνος el
heal.committeeMemberName Σταφυλοπάτης, Ανδρέας-Γεώργιος el
heal.academicPublisher National Technical University of Athens. School of Electrical and Computer Engineering. Division of Signals, Control and Robotics el
heal.academicPublisherID ntua
heal.numberOfPages 117
heal.fullTextAvailability true


Files in this item

This item appears in the following collection(s)
