heal.abstract |
Data exchange is the problem of transforming data structured under a schema (called source) into data structured under a different schema (called target). The emphasis of data exchange is to materialize a target instance (called solution) that satisfies the relationship between the schemas. Universal solutions were shown to be the most suitable solutions, mainly because they can be used to answer conjunctive queries posed over the target schema. Trying to extend this result to more expressive query languages fails, even if we only add inequalities (≠) to conjunctive queries. In this work we study data exchange in the presence of general arithmetic comparisons (<, ≤, >, ≥,=, ≠): (a) We consider queries posed over the target schema that belong to the class of unions of conjunctive queries with arithmetic comparisons (in short CQACs). (b) We exploit arithmetic comparisons to define more expressive data exchange settings, called DEAC settings. In particular, DEAC settings consist of constraints that involve arithmetic comparisons. For that, two new classes of dependencies (tgd-ACs and acgds) are introduced, to capture the need of arithmetic comparisons in source-to-target and target constraints. We show that in DEAC settings the existence of solution problem is in NP. We define a novel chase procedure called AC-chase which is a tree and we prove that it produces a universal solution (appropriately defined to deal with arithmetic comparisons). We show that the new concept of universal solution is the right tool for query answering in the case of unions of CQACs. The complexity of computing certain answers for unions of CQACs is shown to be coNP-complete. Moreover, we identify polynomial cases for a) computing a universal solution and b) computing certain answers. For that, we introduce the succinct AC-chase which is a sequence instead of a tree, but its result is not necessarily a solution. We identify cases where succinct AC-chase returns indeed a universal solution and we investigate the syntactic conditions of the query under which query answering takes polynomial time. We show that the latter is feasible even in cases where the result of chase is not a universal solution. Copyright 2008 ACM. |
en |