HEAL DSpace

Detection of causality relations in plain text with the use of word embeddings

DSpace/Manakin Repository


dc.contributor.author Μπάστας, Γρηγόριος en
dc.contributor.author Bastas, Grigorios en
dc.date.accessioned 2018-03-05T10:00:16Z
dc.date.available 2018-03-05T10:00:16Z
dc.date.issued 2018-03-05
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/46627
dc.identifier.uri http://dx.doi.org/10.26240/heal.ntua.15245
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Greece *
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/gr/ *
dc.subject NLP en
dc.subject Language el
dc.subject Neural networks el
dc.subject Word embeddings el
dc.subject Causality el
dc.title Detection of causality relations in plain text with the use of word embeddings en
heal.type bachelorThesis
heal.classification Natural language processing el
heal.language en
heal.access free
heal.recordProvider ntua el
heal.publicationDate 2017-09-27
heal.abstract Causality detection is one of the most challenging topics in NLP. In this project we address this open problem by training vector representations of French words. Although we worked only on causality detection in French, our methodology is applicable in many other cases thanks to its generality. The project consists of three major tasks. The first task is the creation of our training data through the automatic extraction of cause-effect tuples from a syntactically annotated French corpus. For this purpose, we collected non-ambiguous lexical units that denote causality relations from the ASFALDA French FrameNet. We then extracted tuples of meaningful sets of words that represent either the cause or the effect of the captured frame, using the dependency tree of each sentence and the part-of-speech tag of each word. The second task is the computational processing of the training data extracted in the first task, in order to create causal word embeddings based on cause-effect context similarity. At this stage, the cause-effect tuples are used, in an innovative manner, as the training set for the Word2vec, SVD and NMF models, so as to produce causal embeddings. The third task is the evaluation of our models. We measured the causal proximity of cause-effect word pairs using the dot product and the cosine similarity between the embeddings stored in the input matrix and those stored in the output matrix of our models. For the evaluation, we used the SemEval Task8 test data (partially translated into French). en
heal.sponsor Through a scholarship from the Erasmus+ student exchange programme, awarded to carry out the thesis at the research centre IRIT - Université Toulouse III - Paul Sabatier el
heal.advisorName Σταφυλοπάτης, Ανδρέας-Γεώργιος el
heal.committeeMemberName Σταφυλοπάτης, Ανδρέας-Γεώργιος el
heal.committeeMemberName Στάμου, Γεώργιος el
heal.committeeMemberName Τσανάκας, Παναγιώτης el
heal.academicPublisher National Technical University of Athens. School of Electrical and Computer Engineering. Division of Computer Science el
heal.academicPublisherID ntua
heal.numberOfPages 97 p.
heal.fullTextAvailability true
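
The evaluation step described in the abstract (scoring a cause-effect word pair with the dot product and the cosine similarity between an input-matrix embedding and an output-matrix embedding) might look roughly like the sketch below. It is only an illustration under stated assumptions, not the thesis code: the function names, the choice to read the cause from the input matrix and the effect from the output matrix, and the two-dimensional toy vectors are all hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def causal_scores(cause, effect, input_emb, output_emb):
    """Score a (cause, effect) pair.

    Assumption: the cause vector comes from the input matrix and the
    effect vector from the output matrix. The abstract does not say
    which role each matrix plays.
    """
    u = input_emb[cause]
    v = output_emb[effect]
    return dot(u, v), cosine(u, v)


# Made-up toy embeddings, for illustration only.
input_emb = {"pluie": [1.0, 0.0], "inondation": [0.0, 1.0]}
output_emb = {"pluie": [1.0, 0.0], "inondation": [2.0, 0.0]}

forward = causal_scores("pluie", "inondation", input_emb, output_emb)
backward = causal_scores("inondation", "pluie", input_emb, output_emb)
```

Because each word has two different vectors, the score is not symmetric: with these toy values the pair (pluie, inondation) scores higher than the reversed pair, which is the property that makes such embeddings usable for a directed relation like causality.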



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Greece