dc.contributor.author | Μπάστας, Γρηγόριος | en |
dc.contributor.author | Bastas, Grigorios | en |
dc.date.accessioned | 2018-03-05T10:00:16Z | |
dc.date.available | 2018-03-05T10:00:16Z | |
dc.date.issued | 2018-03-05 | |
dc.identifier.uri | https://dspace.lib.ntua.gr/xmlui/handle/123456789/46627 | |
dc.identifier.uri | http://dx.doi.org/10.26240/heal.ntua.15245 | |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Greece | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/gr/ | * |
dc.subject | NLP | en |
dc.subject | Language | el |
dc.subject | Neural networks | el |
dc.subject | Word embeddings | el |
dc.subject | Causality | el |
dc.title | Detection of causality relations in plain text with the use of word embeddings | en |
heal.type | bachelorThesis | |
heal.classification | Natural language processing | el |
heal.language | en | |
heal.access | free | |
heal.recordProvider | ntua | el |
heal.publicationDate | 2017-09-27 | |
heal.abstract | Causality detection is one of the most challenging topics in NLP. In this project we addressed this open problem by training vector representations of French words. Although we worked only on causality detection in French, our methodology is general enough to apply in many other cases. The project consists of three major tasks. The first task is the creation of our training data through the automatic extraction of cause-effect tuples from a syntactically annotated French corpus. For this purpose, we collected unambiguous lexical units that denote causality relations from the ASFALDA French FrameNet. We then extracted tuples of meaningful sets of words that represent either the cause or the effect of the captured frame, using the dependency tree of each sentence and the part-of-speech tag of each word. The second task is the computational processing of the training data extracted in the first task, in order to create causal word embeddings based on cause-effect context similarity. At this stage, the cause-effect tuples are used in a novel way as the training set for the Word2vec, SVD and NMF models, so as to create causal embeddings. The third task is the evaluation of our models. We measured the causal proximity of cause-effect word pairs by comparing the dot product and cosine similarity of the embeddings stored in the input matrix and the embeddings stored in the output matrix of our models. For the evaluation, we used the SemEval Task 8 test data (partially translated into French). | en |
heal.sponsor | Through a scholarship from the Erasmus+ student exchange programme, awarded to carry out the thesis work at the research centre IRIT - Université Toulouse III - Paul Sabatier | el |
heal.advisorName | Σταφυλοπάτης, Ανδρέας-Γεώργιος | el |
heal.committeeMemberName | Σταφυλοπάτης, Ανδρέας-Γεώργιος | el |
heal.committeeMemberName | Στάμου, Γεώργιος | el |
heal.committeeMemberName | Τσανάκας, Παναγιώτης | el |
heal.academicPublisher | National Technical University of Athens. School of Electrical and Computer Engineering. Division of Computer Science | el |
heal.academicPublisherID | ntua | |
heal.numberOfPages | 97 p. | |
heal.fullTextAvailability | true | |
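
The evaluation step described in heal.abstract can be illustrated with a minimal sketch. This is not the thesis code: the toy vocabulary, the matrices `W_in` and `W_out`, their random values and the helper `causal_scores` are all hypothetical. Taking the cause word from the input matrix and the effect word from the output matrix is also an assumption about how the pairs were scored. Only the idea of comparing dot product and cosine similarity between input and output embeddings comes from the abstract.

```python
import numpy as np

# Hypothetical toy vocabulary and embedding matrices (not from the thesis).
vocab = {"pluie": 0, "inondation": 1, "livre": 2}
rng = np.random.default_rng(0)
W_in = rng.normal(size=(len(vocab), 4))   # input matrix: assumed cause-side embeddings
W_out = rng.normal(size=(len(vocab), 4))  # output matrix: assumed effect-side embeddings


def causal_scores(cause, effect):
    """Return (dot product, cosine similarity) for a cause-effect word pair."""
    u = W_in[vocab[cause]]    # embedding of the cause word from the input matrix
    v = W_out[vocab[effect]]  # embedding of the effect word from the output matrix
    dot = float(u @ v)
    cos = dot / (float(np.linalg.norm(u)) * float(np.linalg.norm(v)))
    return dot, cos


dot, cos = causal_scores("pluie", "inondation")
print(f"dot={dot:.3f} cosine={cos:.3f}")
```

Because the two words are looked up in different matrices, the score is asymmetric: `causal_scores("pluie", "inondation")` and `causal_scores("inondation", "pluie")` generally differ, which is what makes such a measure usable for a directed relation like causality.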