Εντοπισμός, διαχωρισμός, κατάτμηση: Διεργασίες επεξεργασίας χειρόγραφων και πολυμεσικών δεδομένων εν όψει εφαρμογών Αναγνώρισης, Αρχειοθέτησης και Δεικτοδότησης

Παπαβασιλείου, Βασίλειος Α.; Papavasiliou, Vasilios A.

dc.contributor.advisor	Καραγιάννης, Γιώργος	el
dc.contributor.author	Παπαβασιλείου, Βασίλειος Α.	el
dc.contributor.author	Papavasiliou, Vasilios A.	en
dc.date.accessioned	2011-07-08T08:33:44Z
dc.date.available	2011-07-08T08:33:44Z
dc.date.copyright	2011-06-27	-
dc.date.issued	2011-07-08
dc.date.submitted	2011-06-27	-
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/4673
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.239
dc.description	122 p.	el
dc.description.abstract	Document image analysis aims to convert printed and handwritten documents into their corresponding electronic documents. It is a complex process carried out in separate processing stages, such as digitization of the original, localization of the text regions, segmentation of those regions into basic units of written language (e.g. text lines, words and paragraphs), understanding of the role of each unit, character recognition, and generation of the corresponding electronic document. Although efficient commercial products have been developed for processing printed documents, comparable progress has not been made for handwritten ones. This thesis concentrates on processing binary digital images of handwritten documents that contain only textual elements, and focuses on the stages of segmenting them into text lines and words. The first part describes two techniques for delimiting text lines. The first technique aims to improve the existing piecewise projection methodology by modeling the vertical analysis zones as sequences of observations generated by a hidden Markov model. The proposed technique was submitted for evaluation to two international contests on the segmentation of handwritten text into lines, and it produced better results than both the corresponding projection-based methods and implementations of other methods. The second technique is based on applying binary morphology operators. Its distinguishing feature is the introduction of a checking stage after each iteration, which detects patterns indicating that parts of neighboring lines tend to merge or have already merged. Its comparative evaluation against similar techniques showed that incorporating the checking stage contributes to improved performance. The second part examines the problem of segmenting handwritten text into words. 
If the pixels of two successive graphemes are considered to belong to two classes, the linear support vector classifier that separates them can be computed. To estimate the distance between the graphemes, a value proportional to the classification margin is proposed. The distances are categorized into gaps between words and gaps between letters of the same word by means of a threshold computed from the probability density function of the distances. The evaluation of the proposed method through its participation in two international contests showed it to be the most effective. As an extension of the analysis of images that contain only textual elements, the third part describes a technique for locating added (artificial) text in video frames. The technique incorporates a verification stage in which the detected regions are categorized as textual or non-textual with the help of Gaussian mixture models.	el
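The gap classification step described in the abstract (a threshold taken from the probability density function of the gap distances) can be sketched as follows. This is a minimal illustration and not the thesis implementation: it assumes the gap metric values have already been computed, and the Gaussian kernel, the `bandwidth` parameter, the choice of the two highest density peaks as the "main lobes", and the function names `gaussian_kde`, `valley_threshold` and `classify_gaps` are all hypothetical choices made for the example.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` with Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

def valley_threshold(gaps, bandwidth, grid_size=512):
    """Return the location of the density minimum between the two highest peaks."""
    density = gaussian_kde(gaps, bandwidth)
    lo, hi = min(gaps), max(gaps)
    step = (hi - lo) / (grid_size - 1)
    xs = [lo + i * step for i in range(grid_size)]
    ys = [density(x) for x in xs]
    # Local maxima of the sampled density (grid endpoints count as candidates).
    peaks = [i for i in range(grid_size)
             if (i == 0 or ys[i] > ys[i - 1])
             and (i == grid_size - 1 or ys[i] >= ys[i + 1])]
    if len(peaks) < 2:
        raise ValueError("density has a single lobe at this bandwidth")
    # Assumption: the two "main lobes" are the two highest peaks.
    left, right = sorted(sorted(peaks, key=lambda i: ys[i], reverse=True)[:2])
    valley = min(range(left, right + 1), key=lambda i: ys[i])
    return xs[valley]

def classify_gaps(gaps, threshold):
    """Label each gap as between words ("inter") or within a word ("intra")."""
    return ["inter" if g > threshold else "intra" for g in gaps]

# Made-up gap values: a cluster of small gaps and a cluster of large gaps.
example_gaps = [2, 3, 3, 4, 4, 5, 18, 19, 20, 21, 22]
threshold = valley_threshold(example_gaps, bandwidth=1.5)
print(threshold, classify_gaps(example_gaps, threshold))
```

The bandwidth controls how many lobes the estimated density shows. If it is too large the density becomes unimodal and no valley exists, which is why the sketch raises an error in that case instead of guessing a threshold.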
dc.description.abstract	The thesis focuses on handwritten document image analysis, studying and proposing methods for two critical preprocessing stages in the workflow of an optical character recognition application: text-line segmentation and word segmentation. The shortcomings of existing methods are discussed, and two novel techniques for text-line segmentation and one for locating words are introduced. The first text-line segmentation algorithm locates the optimal succession of text and gap areas within vertical zones by applying the Viterbi algorithm to a Hidden Markov Model whose parameters are drawn from statistics of each type of area over the whole document image. Then a text-line separator drawing technique is applied, and finally the connected components are assigned to text lines according to simple geometrical constraints that determine whether a connected component can be assigned directly or should be split because it lies across successive text lines. The algorithm participated in the ICDAR07 and ICDAR09 handwriting segmentation contests and took first and second place, respectively. The second method is based on binary morphology. The basic steps of the approach are: a) texture reduction (by combining dilation and sub-sampling) to produce a low-resolution image in which the underlying texture of text lines is apparent while aliasing is prevented, and b) successive application of dilations and (p,q)-th generalized foreground rank openings to join close and horizontally overlapping regions while preventing a merge in the vertical direction. These operations evolve the candidate text lines and distinguish special patterns which imply that text lines have come very close or have been merged. Finally, each connected component of the initial document image is assigned to the text line that it intersects; if it intersects more than one text line, it is cut along the local ridges produced by applying the watershed algorithm. 
Word segmentation can be seen as a problem that requires formulating a metric of the gap between successive components and clustering the gaps into "inter" or "intra" word classes. As the gap metric, we use the negative logarithm of the objective function of a soft-margin linear Support Vector Machine. We employ a nonparametric approach to estimate the probability density function of the gap metrics and have observed that the "inter" word gaps accumulate in the rightmost lobe of the probability density function, while the "intra" word gaps gather in the left lobe. The classification threshold is set at the minimum between the two main lobes. The algorithm was tested on the benchmarking datasets of the ICDAR07 and ICDAR09 handwriting segmentation contests and outperformed the participating algorithms. Furthermore, the thesis studies the problem of locating artificial text in video frames. A new method for verifying text areas detected in video streams is proposed. The algorithm explores the spectral properties of the horizontal projection of candidate text regions in order to reduce the large number of false alarms that most text detection algorithms suffer from. The algorithm has been tested on newscast video sequences, and we conclude that adding the verification module significantly increased the precision rate while keeping the recall rate almost unaffected.	en
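The first text-line technique in the abstract applies the Viterbi algorithm to a Hidden Markov Model to find the most likely succession of text and gap areas within a vertical zone. The sketch below shows only that decoding step under illustrative assumptions: the observation for each pixel row is taken to be its ink density (fraction of foreground pixels), emissions are modeled as Gaussians, and every number in `emit_params`, `log_trans` and `log_start` is made up for the example. The thesis derives its parameters from statistics of the whole document image, and those are not reproduced here.

```python
import math

STATES = ("gap", "text")

def log_gauss(x, mean, std):
    """Log-density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))

def viterbi(obs, log_start, log_trans, emit_params):
    """Return the most likely state sequence for `obs` under a Gaussian-emission HMM."""
    # score[s]: best log-probability of any path that ends in state s at the current row.
    score = {s: log_start[s] + log_gauss(obs[0], *emit_params[s]) for s in STATES}
    back = []  # back[t][s]: best predecessor of state s at row t + 1
    for x in obs[1:]:
        prev, score, pointers = score, {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + log_trans[p][s])
            pointers[s] = best
            score[s] = prev[best] + log_trans[best][s] + log_gauss(x, *emit_params[s])
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Hypothetical ink density per pixel row of one vertical zone.
profile = [0.02, 0.01, 0.30, 0.35, 0.28, 0.03, 0.00, 0.02, 0.33, 0.31, 0.04]
log_start = {"gap": math.log(0.5), "text": math.log(0.5)}
log_trans = {
    "gap": {"gap": math.log(0.8), "text": math.log(0.2)},
    "text": {"gap": math.log(0.2), "text": math.log(0.8)},
}
emit_params = {"gap": (0.02, 0.05), "text": (0.30, 0.08)}  # (mean, std), made up
print(viterbi(profile, log_start, log_trans, emit_params))
```

Working in the log domain avoids numerical underflow on long zones. The self-transition probabilities favor staying in the same state, so a single noisy row is less likely to be decoded as a separate text or gap area.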
dc.description.statementofresponsibility	Βασίλειος Α. Παπαβασιλείου	el
dc.language.iso	el	en
dc.rights	ETDFree-policy.xml	en
dc.subject	Ανάλυση εικόνων κειμένου	el
dc.subject	Κατάτμηση εικόνας	el
dc.subject	Μηχανική μάθηση	el
dc.subject	Αναγνώριση προτύπων	el
dc.subject	Μηχανές διανυσμάτων υποστήριξης	el
dc.subject	Document image analysis	en
dc.subject	Image segmentation	en
dc.subject	Machine learning	en
dc.subject	Pattern recognition	en
dc.subject	Support vector machines	en
dc.title	Εντοπισμός, διαχωρισμός, κατάτμηση: Διεργασίες επεξεργασίας χειρόγραφων και πολυμεσικών δεδομένων εν όψει εφαρμογών Αναγνώρισης, Αρχειοθέτησης και Δεικτοδότησης	el
dc.title.alternative	Localization, discrimination and segmentation: Pre-processing procedures on handwriting and multimedia data for recognition, archiving and indexing applications	en
dc.type	doctoralThesis	el (en)
dc.date.accepted	2010-05-14	-
dc.date.modified	2011-06-27	-
dc.contributor.advisorcommitteemember	Μαραγκός, Πέτρος	el
dc.contributor.advisorcommitteemember	Κόλλιας, Στέφανος	el
dc.contributor.committeemember	Καραγιάννης, Γιώργος	el
dc.contributor.committeemember	Μαραγκός, Πέτρος	el
dc.contributor.committeemember	Κόλλιας, Στέφανος	el
dc.contributor.committeemember	Σταφυλοπάτης, Ανδρέας	el
dc.contributor.committeemember	Καμπουράκης, Γιώργος	el
dc.contributor.committeemember	Μέρτζιος, Βασίλειος	el
dc.contributor.committeemember	Κατσούρος, Βασίλειος	el
dc.contributor.department	Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών & Μηχανικών Υπολογιστών. Τομέας Σημάτων, Ελέγχου και Ρομποτικής	el
dc.date.recordmanipulation.recordcreated	2011-07-08	-
dc.date.recordmanipulation.recordmodified	2011-07-08	-