Efficient keyword search on large tree structured datasets

Dimitriou, A; Theodoratos, D

dc.contributor.author	Dimitriou, A	en
dc.contributor.author	Theodoratos, D	en
dc.date.accessioned	2014-03-01T02:53:37Z
dc.date.available	2014-03-01T02:53:37Z
dc.date.issued	2012	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/36452
dc.subject	Keyword search	en
dc.subject	LCA	en
dc.subject	Ranking	en
dc.subject	Search algorithm	en
dc.subject	Tree-structured data	en
dc.subject	XML	en
dc.subject.other	Keyword search	en
dc.subject.other	LCA	en
dc.subject.other	Ranking	en
dc.subject.other	Search Algorithms	en
dc.subject.other	Tree-structured data	en
dc.subject.other	Algorithms	en
dc.subject.other	Forestry	en
dc.subject.other	Input output programs	en
dc.subject.other	Search engines	en
dc.subject.other	XML	en
dc.subject.other	Trees (mathematics)	en
dc.subject.other	Data	en
dc.subject.other	Mathematics	en
dc.subject.other	Trees	en
dc.title	Efficient keyword search on large tree structured datasets	en
heal.type	conferenceItem	en
heal.identifier.primary	10.1145/2254736.2254749	en
heal.identifier.secondary	http://dx.doi.org/10.1145/2254736.2254749	en
heal.publicationDate	2012	en
heal.abstract	Keyword search is the most popular paradigm for querying XML data on the web. In this context, three challenging problems are (a) to avoid missing useful results in the answer set, (b) to rank the results with respect to some relevance criterion, and (c) to design algorithms that can efficiently compute the results on large datasets. In this paper, we present a novel multi-stack based algorithm that returns as an answer to a keyword query all the results ranked by their size. Our algorithm exploits a lattice of stacks, each corresponding to a partition of the keyword set of the query. This feature enables time performance linear in the size of the input data for a given number of query keywords. As a result, our algorithm can run efficiently on large input data for several keywords. We also present a variation of our algorithm which accounts for infrequent keywords in the query and show that it can significantly improve the execution time. An extensive experimental evaluation of our approach confirms the theoretical analysis and shows that it scales smoothly when the size of the input data and the number of input keywords increase. Copyright 2012 ACM.	en
heal.journalName	KEYS 2012 - Proceedings of the 3rd International Workshop on Keyword Search on Structured Data	en
dc.identifier.doi	10.1145/2254736.2254749	en
dc.identifier.spage	63	en
dc.identifier.epage	74	en