dc.contributor.author |
Kokolakis, G |
en |
dc.contributor.author |
Fouskakis, D |
en |
dc.date.accessioned |
2014-03-01T01:28:55Z |
|
dc.date.available |
2014-03-01T01:28:55Z |
|
dc.date.issued |
2008 |
en |
dc.identifier.issn |
0176-4268 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/19034 |
|
dc.subject |
Asymptotic theory |
en |
dc.subject |
Convex partition |
en |
dc.subject |
Discrepancy measures |
en |
dc.subject |
Identity protection |
en |
dc.subject |
Multivariate micro-aggregation |
en |
dc.subject |
Probabilistic distances |
en |
dc.subject.classification |
Mathematics, Interdisciplinary Applications |
en |
dc.subject.classification |
Psychology, Mathematical |
en |
dc.subject.other |
DISCLOSURE |
en |
dc.title |
On the discrepancy measures for the optimal equal probability partitioning in Bayesian multivariate micro-aggregation |
en |
heal.type |
journalArticle |
en |
heal.identifier.primary |
10.1007/s00357-008-9014-8 |
en |
heal.identifier.secondary |
http://dx.doi.org/10.1007/s00357-008-9014-8 |
en |
heal.language |
English |
en |
heal.publicationDate |
2008 |
en |
heal.abstract |
Data holders, such as statistical institutions and financial organizations, have a very serious and demanding task when producing data for official and public use: controlling the risk of identity disclosure and protecting sensitive information when they communicate data-sets among themselves, to governmental agencies and to the public. One of the techniques applied is that of micro-aggregation. In a Bayesian setting, micro-aggregation can be viewed as the optimal partitioning of the original data-set based on the minimization of an appropriate measure of discrepancy, or distance, between two posterior distributions, one of which is conditional on the original data-set and the other conditional on the aggregated data-set. Assuming d-variate normal data-sets and using several measures of discrepancy, it is shown that the asymptotically optimal equal probability m-partition of ℝ^d, with m^(1/d) ∈ ℕ, is the convex one which is provided by hypercubes whose sides are formed by hyperplanes perpendicular to the canonical axes, no matter which discrepancy measure has been used. On the basis of the above result, a method that produces a sub-optimal partition with a very small computational cost is presented. © 2008 Springer Science+Business Media, LLC. |
en |
heal.publisher |
SPRINGER |
en |
heal.journalName |
Journal of Classification |
en |
dc.identifier.doi |
10.1007/s00357-008-9014-8 |
en |
dc.identifier.isi |
ISI:000262413100007 |
en |
dc.identifier.volume |
25 |
en |
dc.identifier.issue |
2 |
en |
dc.identifier.spage |
209 |
en |
dc.identifier.epage |
224 |
en |