dc.contributor.author | Παπανικολάου, Ηλίας | el
dc.contributor.author | Papanikolaou, Ilias | en
dc.date.accessioned | 2023-05-24T07:00:14Z |
dc.date.available | 2023-05-24T07:00:14Z |
dc.identifier.uri | https://dspace.lib.ntua.gr/xmlui/handle/123456789/57751 |
dc.identifier.uri | http://dx.doi.org/10.26240/heal.ntua.25448 |
dc.rights | Default License |
dc.subject | Ερμηνεύσιμη Μηχανική Μάθηση | el
dc.subject | Interpretable Machine Learning | en
dc.subject | Εξηγήσιμη Ομαδοποίηση | el
dc.subject | Ανάλυση πέρα από τη χειρότερη περίπτωση | el
dc.subject | Ευστάθεια σε διαταραχές | el
dc.subject | Μετρικοί Χώροι | el
dc.subject | Explainable Clustering | en
dc.subject | Beyond the worst-case analysis | en
dc.subject | Perturbation stability | en
dc.subject | Metric Spaces | en
dc.title | Εξηγήσιμη ομαδοποίηση σε ευσταθή στιγμιότυπα | el
heal.type | bachelorThesis |
heal.classification | Algorithms | en
heal.language | el |
heal.language | en |
heal.access | free |
heal.recordProvider | ntua | el
heal.publicationDate | 2023-04-04 |
heal.abstract | This thesis is concerned with the explainable clustering problem under stability assumptions. Explainable Clustering is an interpretation method introduced by Dasgupta et al. that aims to provide concise explanations for the inclusion of each data point in a cluster. The question we try to answer is whether the Price of Explainability, i.e. the inherent cost due to the restricted solution format that guarantees explainability, can be reduced if we assume that the input clustering instances satisfy either the proximity or the perturbation stability property. After introducing several stability notions in the context of clustering and analyzing the most important algorithms for explainable clustering, we show that under $a$-center stability with $a = \Omega(k d^{\frac{1}{p}})$, there are explainable algorithms with a constant approximation ratio. Next, we study the stability of several hard explainable clustering instances and prove that this dependence on the number of dimensions $d$ and the number of clusters $k$ is necessary. More specifically, we show that there are hard clustering instances with the $\ell_p$ objective that satisfy $a$-proximity with $a = \Omega\left(k d^{\frac{1}{p}}\right)$, and that there exist hard $\Omega(\sqrt{d})$-(metric) perturbation stable instances in the $k$-median case ($p = 1$), where $d$ is the number of dimensions of the dataset. To prove the second result, we show that if a clustering instance satisfies the $a$-proximity property along with a property ensuring that all clusters in the optimal clustering have roughly the same cost, then the instance is $\Omega(\sqrt{a})$-metric perturbation stable. We conclude that, under these stability assumptions, it is not reasonable to expect practical instances to be stable enough for the Price of Explainability to be reduced. | en
heal.advisorName | Φωτάκης, Δημήτριος | el
heal.committeeMemberName | Παγουρτζής, Αριστείδης | el
heal.committeeMemberName | Χατζηαφράτης, Ευάγγελος | el
heal.committeeMemberName | Φωτάκης, Δημήτριος | el
heal.academicPublisher | Εθνικό Μετσόβιο Πολυτεχνείο. Σχολή Ηλεκτρολόγων Μηχανικών και Μηχανικών Υπολογιστών. Τομέας Τεχνολογίας Πληροφορικής και Υπολογιστών | el
heal.academicPublisherID | ntua |
heal.numberOfPages | 84 σ. | el
heal.fullTextAvailability | false |
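
For readers of the abstract above, a minimal sketch of the stability notions it references is given below, assuming the standard definitions from the beyond-worst-case-analysis literature; the thesis itself may state them with minor variations, and the symbols $a$, $\gamma$, $k$, $d$ follow the abstract's notation.

% Hedged sketch: standard stability definitions assumed from the
% beyond-worst-case-analysis literature, not quoted from the record.
\begin{itemize}
  \item \textbf{$\gamma$-perturbation stability.} A clustering instance $(X, d)$ is
        $\gamma$-perturbation stable if for every perturbation $d'$ with
        $d(x, y) \le d'(x, y) \le \gamma \, d(x, y)$ for all $x, y \in X$,
        the optimal $k$-clustering under $d'$ is the same partition of $X$
        as the optimal clustering under $d$; \emph{metric} perturbation
        stability additionally requires $d'$ to be a metric.
  \item \textbf{$a$-center proximity.} With optimal clusters $C_1, \dots, C_k$
        and centers $c_1, \dots, c_k$, every point $x \in C_i$ satisfies
        $a \cdot d(x, c_i) < d(x, c_j)$ for every $j \neq i$.
\end{itemize}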