Cluster computing, recursion and datalog

Afrati, FN; Borkar, V; Carey, M; Polyzotis, N; Ullman, JD

dc.contributor.author	Afrati, FN	en
dc.contributor.author	Borkar, V	en
dc.contributor.author	Carey, M	en
dc.contributor.author	Polyzotis, N	en
dc.contributor.author	Ullman, JD	en
dc.date.accessioned	2014-03-01T02:52:56Z
dc.date.available	2014-03-01T02:52:56Z
dc.date.issued	2011	en
dc.identifier.issn	03029743	en
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/36155
dc.subject.other	Computing clusters	en
dc.subject.other	Datalog	en
dc.subject.other	Datalog programs	en
dc.subject.other	Execute query	en
dc.subject.other	Key elements	en
dc.subject.other	Map-reduce	en
dc.subject.other	Node failure	en
dc.subject.other	Open source implementation	en
dc.subject.other	Output only	en
dc.subject.other	Recursions	en
dc.subject.other	Recursive process	en
dc.subject.other	Recursive programs	en
dc.subject.other	Semi-naive evaluation	en
dc.subject.other	Transitive closure	en
dc.subject.other	Artificial intelligence	en
dc.subject.other	Cluster computing	en
dc.title	Cluster computing, recursion and datalog	en
heal.type	conferenceItem	en
heal.identifier.primary	10.1007/978-3-642-24206-9_8	en
heal.identifier.secondary	http://dx.doi.org/10.1007/978-3-642-24206-9_8	en
heal.publicationDate	2011	en
heal.abstract	The cluster-computing environment typified by Hadoop, the open-source implementation of map-reduce, is receiving serious attention as the way to execute queries and other operations on very large-scale data. Datalog execution presents several unusual issues for this environment. We discuss the best way to execute a round of seminaive evaluation on a computing cluster using map-reduce. Using transitive closure as an example, we examine the cost of executing recursions in several different ways. Recursive processes such as evaluation of a recursive Datalog program do not fit the key map-reduce assumption that tasks deliver output only when they are completed. As a result, the resilience under compute-node failure that is a key element of the map-reduce framework is not supported for recursive programs. We discuss extensions to this framework that are suitable for executing recursive Datalog programs on very large-scale data in a way that allows progress to continue after node failures, without restarting the entire job. © 2011 Springer-Verlag.	en
heal.journalName	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	en
dc.identifier.doi	10.1007/978-3-642-24206-9_8	en
dc.identifier.volume	6702 LNCS	en
dc.identifier.spage	120	en
dc.identifier.epage	144	en
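
The abstract uses transitive closure as its running example of seminaive evaluation. As a rough illustration only, the following single-machine Python sketch shows what one such evaluation looks like: each round joins the edge relation with just the facts that were new in the previous round. It is not the cluster or map-reduce algorithm from the paper; the function name `transitive_closure_seminaive`, the right-linear form of the rules, and the in-memory index are all assumptions made for this example.

```python
def transitive_closure_seminaive(edges):
    """Seminaive evaluation of transitive closure (illustrative sketch).

    Assumed Datalog program (right-linear form):
        path(X, Y) :- edge(X, Y).
        path(X, Y) :- edge(X, Z), path(Z, Y).

    Returns the set of path facts and the number of rounds executed.
    """
    edges = set(edges)

    # Index edges by their target node, so the join
    # edge(X, Z), delta(Z, Y) becomes a dictionary lookup on Z.
    preds = {}
    for x, z in edges:
        preds.setdefault(z, set()).add(x)

    path = set(edges)   # all facts derived so far
    delta = set(edges)  # facts that were new in the previous round
    rounds = 0

    while delta:
        rounds += 1
        # Join edge only with the previous round's new facts,
        # instead of with the whole path relation (the naive approach).
        derived = {(x, y) for (z, y) in delta for x in preds.get(z, ())}
        # Keep only the facts not seen before; they drive the next round.
        delta = derived - path
        path |= delta

    return path, rounds


if __name__ == "__main__":
    closure, rounds = transitive_closure_seminaive([(1, 2), (2, 3), (3, 4)])
    print(sorted(closure), rounds)
```

In a cluster setting, each iteration of the loop would correspond to one round of distributed work, which is where the cost and node-failure questions raised in the abstract arise.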