HEAL DSpace

Approximating a collection of frequent sets

DSpace/Manakin Repository

Show simple item record

dc.contributor.author Afrati, F en
dc.contributor.author Gionis, A en
dc.contributor.author Mannila, H en
dc.date.accessioned 2014-03-01T02:42:30Z
dc.date.available 2014-03-01T02:42:30Z
dc.date.issued 2004 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/31022
dc.subject Foundations of data mining en
dc.subject Mining frequent itemsets en
dc.subject.other Algorithms en
dc.subject.other Computational complexity en
dc.subject.other Data mining en
dc.subject.other Database systems en
dc.subject.other Polynomial approximation en
dc.subject.other Probability en
dc.subject.other Foundations of data mining en
dc.subject.other Frequent patterns en
dc.subject.other Mining frequent itemsets en
dc.subject.other Polynomial-time approximation algorithms en
dc.subject.other Set theory en
dc.title Approximating a collection of frequent sets en
heal.type conferenceItem en
heal.identifier.primary 10.1145/1014052.1014057 en
heal.identifier.secondary http://dx.doi.org/10.1145/1014052.1014057 en
heal.publicationDate 2004 en
heal.abstract One of the most well-studied problems in data mining is computing the collection of frequent item sets in large transactional databases. One obstacle for the applicability of frequent-set mining is that the size of the output collection can be far too large to be carefully examined and understood by the users. Even restricting the output to the border of the frequent item-set collection does not help much in alleviating the problem. In this paper we address the issue of overwhelmingly large output size by introducing and studying the following problem: What are the k sets that best approximate a collection of frequent item sets? Our measure of approximating a collection of sets by k sets is defined to be the size of the collection covered by the k sets, i.e., the part of the collection that is included in one of the k sets. We also specify a bound on the number of extra sets that are allowed to be covered. We examine different problem variants for which we demonstrate the hardness of the corresponding problems and we provide simple polynomial-time approximation algorithms. We give empirical evidence showing that the approximation methods work well in practice. en
heal.journalName KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining en
dc.identifier.doi 10.1145/1014052.1014057 en
dc.identifier.spage 12 en
dc.identifier.epage 19 en
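
The coverage measure described in the abstract can be illustrated with a small sketch. This is not taken from the paper: it is a generic greedy maximum-coverage heuristic, under the simplifying assumption that the k representative sets are chosen from the collection itself, so that for a downward-closed collection no extra sets are covered. The names `greedy_k_cover`, `coverage`, and `nonempty_subsets`, and the toy data, are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List, Set

ItemSet = FrozenSet[str]


def greedy_k_cover(collection: Iterable[ItemSet], k: int) -> List[ItemSet]:
    """Greedily pick up to k sets from the collection.

    A member of the collection counts as covered when it is a subset of
    some picked set. Assumption (not from the paper): candidates are
    restricted to members of the collection itself.
    """
    members = set(collection)
    # Fixed candidate order so that ties are broken deterministically.
    candidates = sorted(members, key=lambda s: (-len(s), sorted(s)))
    covered: Set[ItemSet] = set()
    chosen: List[ItemSet] = []
    for _ in range(k):
        best, best_gain = None, 0
        for cand in candidates:
            # Gain = number of not-yet-covered members contained in cand.
            gain = sum(1 for s in members if s not in covered and s <= cand)
            if gain > best_gain:
                best, best_gain = cand, gain
        if best is None:  # everything is already covered
            break
        chosen.append(best)
        covered |= {s for s in members if s <= best}
    return chosen


def coverage(collection: Iterable[ItemSet], chosen: List[ItemSet]) -> int:
    """Number of collection members included in at least one chosen set."""
    return sum(1 for s in set(collection) if any(s <= c for c in chosen))


def nonempty_subsets(items: str) -> Set[ItemSet]:
    """All non-empty subsets of the given items (toy-data helper)."""
    return {
        frozenset(c)
        for r in range(1, len(items) + 1)
        for c in combinations(items, r)
    }


# Toy collection: subsets of {a,b,c}, subsets of {c,d}, and {e}.
collection = nonempty_subsets("abc") | nonempty_subsets("cd") | {frozenset("e")}
chosen = greedy_k_cover(collection, k=2)
print([sorted(s) for s in chosen], coverage(collection, chosen), len(collection))
```

On this toy collection of 10 sets, the two picked sets are {a, b, c} and {c, d}, which together cover 9 members; only {e} is left out. The variants studied in the paper, including the bound on extra covered sets, are not modeled here.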


Files in this item

Files Size Format View

There are no files associated with this item.

This item appears in the following collection(s)
