HEAL DSpace

A methodology for clustering XML documents by structure

DSpace/Manakin Repository

dc.contributor.author Dalamagas, T en
dc.contributor.author Cheng, T en
dc.contributor.author Winkel, K-J en
dc.contributor.author Sellis, T en
dc.date.accessioned 2014-03-01T01:23:25Z
dc.date.available 2014-03-01T01:23:25Z
dc.date.issued 2006 en
dc.identifier.issn 0306-4379 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/16960
dc.subject Clustering en
dc.subject Structural similarity en
dc.subject Structural summary en
dc.subject Tree edit distance en
dc.subject XML en
dc.subject.classification Computer Science, Information Systems en
dc.subject.other Algorithms en
dc.subject.other Estimation en
dc.subject.other XML en
dc.subject.other Clustering en
dc.subject.other Structural summary en
dc.subject.other Tree edit distance en
dc.subject.other Information theory en
dc.title A methodology for clustering XML documents by structure en
heal.type journalArticle en
heal.identifier.primary 10.1016/j.is.2004.11.009 en
heal.identifier.secondary http://dx.doi.org/10.1016/j.is.2004.11.009 en
heal.language English en
heal.publicationDate 2006 en
heal.abstract The processing and management of XML data are popular research issues. However, operations based on the structure of XML data have not received strong attention. These operations involve, among others, the grouping of structurally similar XML documents. Such grouping results from the application of clustering methods with distances that estimate the similarity between tree structures. This paper presents a framework for clustering XML documents by structure. Modeling the XML documents as rooted ordered labeled trees, we study the usage of structural distance metrics in hierarchical clustering algorithms to detect groups of structurally similar XML documents. We suggest the usage of structural summaries for trees to improve the performance of the distance calculation and at the same time to maintain or even improve its quality. Our approach is tested using a prototype testbed. (c) 2004 Elsevier B.V. All rights reserved. en
heal.publisher PERGAMON-ELSEVIER SCIENCE LTD en
heal.journalName Information Systems en
dc.identifier.doi 10.1016/j.is.2004.11.009 en
dc.identifier.isi ISI:000234906300003 en
dc.identifier.volume 31 en
dc.identifier.issue 3 en
dc.identifier.spage 187 en
dc.identifier.epage 228 en


Files in this item

Files Size Format View

There are no files associated with this item.

This item appears in the following collection(s)