Αναγνώριση χειρόγραφων μαθηματικών εκφράσεων εντός γραμμής (online)

Σιμιστήρα, Φωτεινή; Simistira, Foteini

dc.contributor.author	Σιμιστήρα, Φωτεινή	el
dc.contributor.author	Simistira, Foteini	en
dc.date.accessioned	2015-06-24T10:52:32Z
dc.date.available	2015-06-24T10:52:32Z
dc.date.issued	2015-06-24
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/40883
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.1746
dc.rights	Αναφορά Δημιουργού - Μη Εμπορική Χρήση - Παρόμοια Διανομή 3.0 Ελλάδα	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-sa/3.0/gr/	*
dc.subject	αναγνώριση μαθηματικών εκφράσεων	el
dc.subject	αναγνώριση μαθηματικών συμβόλων	el
dc.subject	τοπολογία σχέσεων	el
dc.subject	SCFG	el
dc.subject	SVM	el
dc.subject	mathematical symbol recognition	en
dc.subject	recognition of mathematical formulas	en
dc.subject	classification of spatial relation	en
dc.subject	SCFG	en
dc.subject	SVM	en
dc.title	Αναγνώριση χειρόγραφων μαθηματικών εκφράσεων εντός γραμμής (online)	el
dc.title	Recognition of online handwritten mathematical expressions	en
dc.contributor.department	ΣΗΜΑΤΩΝ ΕΛΕΓΧΟΥ ΚΑΙ ΡΟΜΠΟΤΙΚΗΣ	el
heal.type	doctoralThesis
heal.classification	Αναγνώριση προτύπων	el
heal.classification	Pattern recognition systems--Congresses	en
heal.classificationURI	http://id.loc.gov/authorities/subjects/sh2008108988
heal.language	el
heal.language	en
heal.access	free
heal.recordProvider	ntua	el
heal.publicationDate	2015-06-08
heal.abstract	Αντικείμενο της διατριβής είναι η αναγνώριση χειρόγραφων μαθηματικών εκφράσεων (ΜΕ). Η κύρια συνεισφορά της παρούσας εργασίας αφορά στο σχεδιασμό και στην υλοποίηση ενός συστήματος αναγνώρισης ΜΕ που λειτουργεί σε τρία στάδια: α) την ομαδοποίηση των κονδυλιών σε σύμβολα και ταυτόχρονη αναγνώριση των συμβόλων, β) την αναγνώριση των σχέσεων μεταξύ δύο μαθηματικών συμβόλων ή υπο-εκφράσεων και γ) την ανασύσταση της μαθηματικής έκφρασης και την αναπαράστασή της με βάση το μορφότυπο MathML (Παράρτημα Β), με δεδομένη την αναγνώριση και την σχέση μεταξύ των συμβόλων που την απαρτίζουν. Στο πρώτο στάδιο, προτείνεται η κωδικοποίηση της ακολουθίας των συντεταγμένων των συμβόλων σε ακολουθίες κωδικών Freeman-8 επιπέδων συνοδευόμενων από το ποσοστό του επιμέρους μήκους μεταξύ δύο διαδοχικών σημείων. Η σύγκριση των υπό εξέταση συμβόλων με τα πρότυπα αναφοράς γίνεται με την τεχνική της απόστασης του κοντινότερου γείτονα με βάση το ταίριασμα στα δείγματα (template matching) των προτύπων αναφοράς. Στο δεύτερο στάδιο χρησιμοποιούνται τα γεωμετρικά χαρακτηριστικά των συμβόλων της μαθηματικής έκφρασης (π.χ. κεντροειδές, πλαίσιο οριοθέτησης) για την εύρεση των σχέσεων μεταξύ των συμβόλων (π.χ. εκθέτης, δείκτης) και την κατανομή τους σε επιμέρους υποεκφράσεις (sub-expressions) της ΜΕ. Στην παρούσα εργασία αναπτύχθηκαν δύο τεχνικές. Στην πρώτη τεχνική χρησιμοποιήθηκε ο γεωμετρικός ταξινομητής για την εύρεση της τοπολογικής σχέσης μεταξύ των συμβόλων που απαρτίζουν την ΜΕ και την τοποθέτηση αυτών στα αντίστοιχα επίπεδα της ΜΕ. Τα σύμβολα που ανήκουν σε διαφορετικά επίπεδα ενώνονται με σύμβολα άλλων επιπέδων βάσει της τοπολογικής σχέσης που υπάρχει μεταξύ τους. Τέλος, αφού όλα τα σύμβολα έχουν τοποθετηθεί στα αντίστοιχα επίπεδα και οι σχέσεις μεταξύ των συμβόλων έχουν οριστεί, παράγεται ένα MathML αρχείο για την αναπαράσταση της ΜΕ. 
Οι μαθηματικές εκφράσεις που μπορούν να αναγνωριστούν με αυτή την τεχνική είναι μέχρι ενός επιπέδου πάνω (εκθέτης) και ενός επιπέδου κάτω (δείκτης). Στην δεύτερη τεχνική χρησιμοποιήθηκε ένας ταξινομητής βασισμένος σε SVM για την εύρεση της τοπολογικής σχέσης μεταξύ των συμβόλων που απαρτίζουν την ΜΕ. Στο τρίτο στάδιο, αναπτύχθηκε μία στοχαστική γραμματική (SCFG) για την περιγραφή της σύνταξης των ΜΕ. Αφού ορίσουμε την γραμματική, μπορούμε πλέον να ψάξουμε για την πιο πιθανή υποψήφια μαθηματική έκφραση λαμβάνοντας υπόψη τις τοπολογικές σχέσεις μεταξύ των συμβόλων και/ή των υπο-εκφράσεων. Για να αναλύσουμε αποτελεσματικότερα μια ΜΕ, πρέπει πρώτα να ορίσουμε έναν αλγόριθμο ανάλυσης (parsing algorithm) της ΜΕ. Σε αυτή την εργασία χρησιμοποιήσαμε τον αλγόριθμο Cocke–Younger–Kasami (CYK) κατάλληλα τροποποιημένο ώστε να μπορεί να αντιμετωπίσει το πρόβλημα της ανάλυσης των μαθηματικών εκφράσεων. Τέλος, αναπτύχθηκε ένα πρόγραμμα συλλογής χειρόγραφων μαθηματικών εκφράσεων για την δημιουργία βάσης χειρόγραφων μαθηματικών συμβόλων. Η βάση που δημιουργήθηκε αποτελείται από 186 σύμβολα που συλλέχτηκαν από 50 γραφείς. Κάθε γραφέας έγραψε κάθε σύμβολο 5 φορές. Επιπλέον, κάθε γραφέας έγραψε 54 εξισώσεις οι οποίες περιέχουν τουλάχιστον από μία φορά όλα τα σύμβολα που προαναφέρθηκαν. Για την αξιολόγηση του αλγορίθμου αναγνώρισης των χειρόγραφων μαθηματικών συμβόλων χρησιμοποιήθηκαν δύο βάσεις χειρόγραφων μαθηματικών συμβόλων, η βάση LaViola και η βάση CROHME2014. Το ποσοστό ορθής αναγνώρισης είναι 92% στην βάση LaViola και 77.25% στη βάση CROHME2014. Για την αξιολόγηση του αλγορίθμου εύρεσης των σχέσεων μεταξύ συμβόλων χρησιμοποιήθηκε η βάση χειρόγραφων μαθηματικών εκφράσεων MathBrush, με μέσο ποσοστό λάθους 2.87%. Τέλος, για την αξιολόγηση του αλγορίθμου ανασύστασης της μαθηματικής έκφρασης χρησιμοποιήθηκαν οι βάσεις CROHME2012 και 2013. Οι βάσεις CROHME2012 και 2013 αποτελούνται από 4 μέρη: Μέρος-I, Μέρος-II, Μέρος-III και Μέρος-IV. 
Το ποσοστό ορθής αναγνώρισης του προτεινόμενου αλγορίθμου είναι 78.70%, 65.78%, 56.37% και 50.22% για τα τέσσερα μέρη των βάσεων CROHME2012 και 2013.	el
heal.abstract	This thesis addresses the recognition of online handwritten mathematical expressions (MEs). Its main contribution is the design and implementation of a recognition system that operates in three stages: a) grouping strokes into symbol hypotheses and simultaneously recognizing these candidate symbols, b) recognizing the spatial relation between two mathematical symbols or subexpressions, and c) reconstructing the mathematical expression and representing it in the MathML format (Appendix B), given the recognized symbols and the relations between them. In the first stage, the coordinate sequence of each symbol is encoded as a sequence of Freeman-8 chain codes, each accompanied by the percentage of the partial length between two successive points. The Freeman-8 chain code represents a boundary as a connected sequence of straight-line segments of specified length and direction, with the direction of each segment coded as an integer in the range 0 to 7. Test symbols are compared with the reference symbols by template matching, using a nearest-neighbour distance. In the second stage, the spatial relations between the symbols that make up the ME are recognized. Geometrical features of the symbols (e.g. centroid, bounding box) are used to classify the relation between symbols (e.g. superscript, subscript) and to assign the symbols to the appropriate subexpressions of the ME. Two techniques were developed in this thesis. The first uses a geometrical classifier to recognize the spatial relation between the symbols of the ME. Once the spatial relation is recognized, the symbols are positioned in the corresponding levels of the ME. Symbols that belong to different levels are connected with symbols of other levels based on the spatial relation between them. 
Finally, once all the symbols are positioned in the corresponding levels and the spatial relations between them are defined, a MathML file is produced to represent the ME. The expressions that can be recognized with this technique extend at most one level up (superscript) and one level down (subscript). In the second technique, an SVM-based classifier is used to recognize the spatial relation between the symbols of the ME. In the third stage, a stochastic context-free grammar (SCFG) was developed to describe the syntax of MEs. Once the grammar is defined, we can search for the most likely candidate ME, taking into account the spatial relations between the symbols and/or subexpressions. To analyse an ME effectively, we first need to define a parsing algorithm. In this thesis we use the Cocke–Younger–Kasami (CYK) algorithm, suitably modified to handle the parsing of MEs. In addition, a program for acquiring handwritten mathematical expressions was developed in order to create a dataset of handwritten mathematical symbols. The resulting database consists of 186 symbols collected from 50 writers. Each writer wrote each symbol 5 times. Moreover, each writer wrote 54 equations that together contain every symbol of the dataset at least once. The symbol recognition algorithm was evaluated on two datasets of handwritten mathematical symbols, the LaViola dataset and the CROHME2014 dataset. Its recognition accuracy is 92% on the LaViola dataset and 77.25% on the CROHME2014 dataset. The algorithm for recognizing spatial relations between symbols was evaluated on the MathBrush dataset of handwritten mathematical expressions, with an average error rate of 2.87%. Finally, the algorithm for reconstructing the ME was evaluated on the CROHME2012 and 2013 datasets, which consist of four parts: Part-I, Part-II, Part-III and Part-IV. The recognition accuracy of the proposed algorithm is 78.70%, 65.78%, 56.37% and 50.22% for the four parts, respectively.	en
heal.advisorName	Καραγιάννης, Γεώργιος	el
heal.advisorName	Carayannis, George	en
heal.committeeMemberName	Σταφυλοπάτης, Ανδρέας-Γεώργιος	el
heal.committeeMemberName	Κόλλιας, Στέφανος	el
heal.committeeMemberName	Κατσούρος, Βασίλης	el
heal.committeeMemberName	Σελής, Τίμος	el
heal.committeeMemberName	Γάτος, Βασίλης	el
heal.committeeMemberName	Eichenberger-Liwicki, Marcus	el
heal.academicPublisher	Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών	el
heal.academicPublisherID	ntua
heal.numberOfPages	114
heal.fullTextAvailability	true
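
The first-stage encoding described in the abstract (Freeman-8 direction codes paired with a length percentage for each segment between successive points) can be illustrated with a minimal sketch. This is not the thesis implementation. The function name freeman8_with_lengths, the direction convention (code 0 along +x, increasing counter-clockwise with y pointing up), and the choice to normalise each segment length by the total stroke length are all assumptions made for illustration.

```python
import math


def freeman8_with_lengths(points):
    """Encode a stroke as (direction_code, length_fraction) pairs.

    Assumptions (not taken from the thesis):
      - direction codes are 0..7, with 0 along +x and codes increasing
        counter-clockwise in 45-degree steps, y axis pointing up;
      - length_fraction is the segment length divided by the total
        length of the stroke.
    """
    segments = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # repeated point: no direction to encode
        angle = math.atan2(dy, dx)  # radians in [-pi, pi]
        # Quantise the angle to the nearest multiple of 45 degrees.
        code = int(round(angle / (math.pi / 4))) % 8
        segments.append((code, length))

    total = sum(length for _, length in segments)
    return [(code, length / total) for code, length in segments]


# Hypothetical stroke: right, up, then left, in unit steps.
stroke = [(0, 0), (1, 0), (1, 1), (0, 1)]
encoded = freeman8_with_lengths(stroke)
```

For the hypothetical stroke above, the three segments have equal length, so each carries one third of the total. Two encoded symbols could then be compared by any sequence distance inside a nearest-neighbour template matcher; the abstract does not specify which distance the thesis uses.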