dc.contributor.author | Κάβουρας, Λουκάς | el |
dc.contributor.author | Kavouras, Loukas | en |
dc.date.accessioned | 2014-10-23T07:36:19Z | |
dc.date.available | 2014-10-23T07:36:19Z | |
dc.date.issued | 2014-10-23 | |
dc.identifier.uri | https://dspace.lib.ntua.gr/xmlui/handle/123456789/39349 | |
dc.identifier.uri | http://dx.doi.org/10.26240/heal.ntua.5192 | |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Greece | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/gr/ | * |
dc.subject | Προσεγγιστικοί αλγόριθμοι | el |
dc.subject | Αλγόριθμοι ροής | el |
dc.subject | Εξόρυξη δεδομένων | el |
dc.subject | Υπολογιστική γεωμετρία | el |
dc.subject | Συσταδοποίηση | el |
dc.subject | Clustering | en |
dc.subject | k-means | en |
dc.subject | k-median | en |
dc.subject | Approximation algorithms | en |
dc.subject | Streaming algorithms | en |
dc.title | Αλγόριθμοι για τα προβλήματα k-means και k-median | el |
dc.title | Algorithms for the k-median problem and the k-means problem | en |
heal.type | bachelorThesis | |
heal.classification | Μαθηματικά | el |
heal.access | free | |
heal.recordProvider | ntua | el |
heal.publicationDate | 2014-10-06 | |
heal.abstract | Συσταδοποίηση ονομάζουμε την διαδικασία ομαδοποίησης ενός συνόλου αντικειμένων με τρόπο ώστε αντικείμενα στην ίδια συστάδα να μοιάζουν περισσότερο μεταξύ τους από αντικείμενα σε άλλες συστάδες. Σε αυτή την διπλωματική, εξετάζουμε τα διάσημα προβλήματα συσταδοποίησης k-means και k-median. Παρουσιάζουμε προσεγγιστικούς αλγόριθμους για τα προβλήματα στο offline και στο streaming μοντέλο. | el |
heal.abstract | Clustering is the task of grouping a set of objects so that objects in the same cluster are more similar to each other than to those in other clusters. It is a central task in data mining, machine learning and computational geometry. In this thesis, we discuss well-known clustering problems and focus on the k-means clustering problem, where one seeks to partition n observations into k clusters so as to minimize the within-cluster sum of squares. We present Lloyd's algorithm for the k-means problem, which was identified as one of the top 10 algorithms in data mining. Although Lloyd's algorithm has exponential running time in the worst case, it runs fast in many practical applications. However, the algorithm gives no guarantees, and there are natural examples where it may produce arbitrarily bad clusterings. The k-means++ algorithm addresses this problem by augmenting Lloyd's algorithm with a simple and intuitive seeding technique. A formal proof shows that the k-means++ algorithm is O(log k)-competitive. We also examine the k-means|| algorithm, which is inspired by k-means++ and can be effectively parallelized. In the last chapter, we consider cases where the entire input is not available from the beginning. That is, we study algorithms for k-means in the streaming model, where the data is too large to be stored in main memory and must be accessed sequentially. Finally, we study the facility location problem and discuss the online facility location algorithm of Meyerson. | en |
heal.advisorName | Φωτάκης, Δημήτριος | el |
heal.committeeMemberName | Ζάχος, Ευστάθιος | el |
heal.committeeMemberName | Συμβώνης, Αντώνιος | el |
heal.academicPublisher | Σχολή Εφαρμοσμένων Μαθηματικών και Φυσικών Επιστημών | el |
heal.academicPublisherID | ntua | |
heal.fullTextAvailability | true | |