Ανάπτυξη Ευρετηρίων για Σύνθετα Δεδομένα

Κασκούρα, Χριστίνα; Kaskoura, Christina

dc.contributor.author	Κασκούρα, Χριστίνα
dc.contributor.author	Kaskoura, Christina
dc.date.accessioned	2025-09-02T09:48:59Z
dc.date.available	2025-09-02T09:48:59Z
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/62303
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.29999
dc.rights	Αναφορά Δημιουργού-Μη Εμπορική Χρήση-Όχι Παράγωγα Έργα 3.0 Ελλάδα	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/gr/	*
dc.subject	ανεστραμμένο αρχείο	el
dc.subject	ερωτήματα υποσυνόλου	el
dc.subject	ερωτήματα ισότητας	el
dc.subject	ερωτήματα υπερσυνόλου	el
dc.subject	μέθοδοι βελτίωσης	el
dc.subject	indexing structures	en
dc.subject	inverted file	en
dc.subject	ordered inverted file	en
dc.subject	subset queries	en
dc.subject	superset queries	en
dc.title	Ανάπτυξη Ευρετηρίων για Σύνθετα Δεδομένα	el
dc.contributor.department	Τομέας τεχνολογίας πληροφορικής και υπολογιστών	el
heal.type	bachelorThesis
heal.classification	δομές δεικτοδότησης	el
heal.language	el
heal.access	free
heal.recordProvider	ntua	el
heal.publicationDate	2006-09-01
heal.abstract	Ο στόχος της διπλωματικής αυτής εργασίας είναι η ανάπτυξη ενός ευρετηρίου το οποίο θα είναι αποδοτικό για χρήση σε σύνθετα δεδομένα και συγκεκριμένα για τιμές-σύνολα, δηλαδή για δοσοληψίες, η κάθε μία από τις οποίες αποτελείται από ένα σύνολο (set) από ίδιου τύπου δεδομένα. Το ευρετήριο που αναπτύσσουμε μας ενδιαφέρει να μπορεί να απαντάει σε συγκεκριμένα ερωτήματα, τα οποία είναι subset queries, equality queries και superset queries. Έτσι, για την ανάπτυξη του ευρετηρίου μας χρησιμοποιούμε το πιο αποδοτικό από τα ήδη υπάρχοντα ευρετήρια, το inverted file, το οποίο συνδυάζουμε με το γνωστό Β-Δέντρο, δημιουργώντας έτσι το ordered inverted file. Ο στόχος που επιθυμούμε να πετύχουμε με την ανάπτυξη του ordered inverted file είναι να κάνουμε πιο αποδοτική την αποτίμηση των ερωτημάτων, αποκτώντας μέσω του Β-Δέντρου πρόσβαση και σε άλλα σημεία των λιστών του inverted file εκτός από την αρχή τους. Αναπτύσσεται επίσης κώδικας σε C++ ο οποίος υλοποιεί την κατασκευή του ordered inverted file καθώς και την αποτίμηση ερωτημάτων με χρήση αυτού, και ο οποίος χρησιμοποιείται για τη διενέργεια πειραμάτων που συγκρίνουν την απόδοση του ordered inverted file με αυτή του απλού inverted file. Η υλοποίησή μας αποθηκεύει τα Β-Δέντρα στο σκληρό δίσκο ενώ για το inverted file τμήμα του ευρετηρίου προσφέρει την επιλογή να αποθηκευτεί είτε στο δίσκο είτε στην κύρια μνήμη. Τα πειράματα που έγιναν με χρήση του κώδικα αυτού δείχνουν ότι σε γενικές γραμμές το ordered inverted file είναι πιο αποδοτικό από το inverted file, ειδικά για την αποτίμηση ερωτημάτων equality και superset. Για την αποτίμηση subset queries κατά την οποία η απόδοση του ordered inverted file δε φάνηκε να υπερτερεί σημαντικά αυτής του απλού inverted file προτείνονται επιπλέον μέθοδοι βελτίωσης, οι οποίες όμως δε συμπεριλαμβάνονται στην υλοποίηση.	el
heal.abstract	The goal of this thesis is to develop an index that is efficient for set-valued attributes, that is, for transactions each of which consists of a set of data items of the same type. The index must be able to answer three types of queries efficiently: subset, equality and superset queries. To build it, we combine the existing inverted file, which is the most efficient index for set-valued attributes, with the B-Tree, creating the ordered inverted file. This lets us answer these queries more efficiently than with the simple inverted file, because the B-Tree gives us access to entry points in the inverted file’s lists other than the beginning of each list. We also implement the construction and query evaluation of the ordered inverted file in C++, and use this code to run experiments comparing the efficiency of the ordered inverted file with that of the simple inverted file. Our implementation stores the B-Trees on the hard disk and lets the user choose whether to store the inverted file portion of the index on the hard disk or in main memory. The experiments run with this code show that the ordered inverted file is generally more efficient than the simple inverted file, especially for equality and superset queries. For subset queries, where the ordered inverted file did not prove significantly more efficient than the simple inverted file, we propose additional improvement methods, which are not included in the implementation.	en
heal.sponsor	ΕΜΠ	el
heal.advisorName	Σελλής, Τιμολέων
heal.committeeMemberName	Σελλής, Τιμολέων
heal.committeeMemberName	Βασιλείου, Ιωάννης
heal.committeeMemberName	Κοζύρης, Νεκτάριος
heal.academicPublisher	Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών	el
heal.academicPublisherID	ntua
heal.numberOfPages	108 σ.
heal.fullTextAvailability	false
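
The abstract names three query types over set-valued records without defining them. The C++ sketch below illustrates them on a plain in-memory inverted file only. It is not the thesis code and does not model the B-Tree layer of the ordered inverted file. It assumes the usual convention: a subset query returns records that contain every query item, an equality query returns records identical to the query set, and a superset query returns records whose items all appear in the query set. The names `InvertedFile`, `Item`, `RecordId`, `add`, `subset`, `equality` and `superset` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <set>
#include <vector>

using Item = int;               // assumed item type
using RecordId = std::size_t;   // records are numbered in insertion order

// Plain inverted file: one ascending list of record ids per item.
class InvertedFile {
public:
    RecordId add(const std::set<Item>& record) {
        RecordId id = sizes_.size();
        sizes_.push_back(record.size());
        for (Item item : record) lists_[item].push_back(id);
        return id;
    }

    // Records r with query ⊆ r: intersect the lists of all query items.
    std::vector<RecordId> subset(const std::set<Item>& query) const {
        std::vector<RecordId> result;
        if (query.empty()) {  // every record contains the empty set
            for (RecordId id = 0; id < sizes_.size(); ++id) result.push_back(id);
            return result;
        }
        bool first = true;
        for (Item item : query) {
            auto it = lists_.find(item);
            if (it == lists_.end()) return {};
            if (first) {
                result = it->second;
                first = false;
            } else {
                std::vector<RecordId> tmp;
                std::set_intersection(result.begin(), result.end(),
                                      it->second.begin(), it->second.end(),
                                      std::back_inserter(tmp));
                result.swap(tmp);
            }
        }
        return result;
    }

    // Records r with r == query: subset matches of the same cardinality.
    std::vector<RecordId> equality(const std::set<Item>& query) const {
        std::vector<RecordId> out;
        for (RecordId id : subset(query))
            if (sizes_[id] == query.size()) out.push_back(id);
        return out;
    }

    // Records r with r ⊆ query: count how many of r's items occur in the
    // query; r qualifies when that count equals its cardinality.
    std::vector<RecordId> superset(const std::set<Item>& query) const {
        std::vector<std::size_t> hits(sizes_.size(), 0);
        for (Item item : query) {
            auto it = lists_.find(item);
            if (it == lists_.end()) continue;
            for (RecordId id : it->second) {
                assert(id < hits.size());
                ++hits[id];
            }
        }
        std::vector<RecordId> out;
        for (RecordId id = 0; id < sizes_.size(); ++id)
            if (hits[id] == sizes_[id]) out.push_back(id);
        return out;
    }

private:
    std::map<Item, std::vector<RecordId>> lists_;  // item -> record ids
    std::vector<std::size_t> sizes_;               // cardinality per record
};
```

With the records {1,2,3}, {2,3}, {3} and {2,3,4} added in that order (ids 0 to 3), the query {2,3} yields ids 0, 1, 3 for `subset`, id 1 for `equality`, and ids 1, 2 for `superset`. Every query here scans each relevant list from its beginning, which is the limitation the abstract says the B-Tree entry points are meant to address.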