HEAL DSpace

Wavelet-based Algorithms for Approximate Processing in the Big Data Era

DSpace/Manakin Repository


dc.contributor.author Μυτιλήνης, Ιωάννης el
dc.contributor.author Mytilinis, Ioannis en
dc.date.accessioned 2021-12-15T11:14:20Z
dc.date.available 2021-12-15T11:14:20Z
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/54180
dc.identifier.uri http://dx.doi.org/10.26240/heal.ntua.21878
dc.rights Default License
dc.subject wavelets, approximate query processing en
dc.title Wavelet-based Algorithms for Approximate Processing in the Big Data Era en
dc.contributor.department Computer Science - Computing Systems Laboratory el
heal.type doctoralThesis
heal.generalDescription I have already delivered the CD to the central library el
heal.classification Computer Science, Databases en
heal.language en
heal.access free
heal.recordProvider ntua el
heal.publicationDate 2019-11-20
heal.abstract Modern analytics involve computations over enormous numbers of data records, which often arrive in the form of high-throughput streams. The need for real-time processing of huge amounts of data places increasing emphasis on the efficiency of approximate query processing (AQP). A common practice for enabling AQP is to construct lossy, compressed representations of a dataset, called synopses, and execute user queries against them instead of the original data. A major challenge over the past years has been the construction of synopses that provide deterministic quality guarantees, often expressed in terms of maximum error metrics. Deterministic guarantees are strong and easier for the user to understand and interpret. As samples and sketches usually provide statistical guarantees, deterministic schemes are mainly supported by space-partitioning techniques such as histograms and wavelets. Because it can approximate sharp discontinuities, wavelet decomposition has proven to be a very effective tool for data reduction. However, existing wavelet thresholding schemes that minimize maximum error metrics have impractical complexity for large datasets. Furthermore, they cannot efficiently handle the multidimensional version of the problem. To provide a practical solution, the first part of this dissertation proposes parallel algorithms that take advantage of key properties of the wavelet decomposition and efficiently construct synopses that minimize non-Euclidean errors. The experimental evaluation over the Hadoop distributed processing framework showed linear scalability with both the data and cluster size; when the whole execution fits in the cluster and all workers can run fully in parallel, a 20× speedup in synopsis construction is observed. The second part of the thesis targets the problem in an IoT streaming environment.
The IoT era has brought forth a computing paradigm shift from traditional high-end servers to “edge” devices of limited processing and memory capabilities. Thus, algorithms designed for such architectures should be “cheap” in time complexity and have a minimal memory footprint. Moreover, in many streaming scenarios, fresh data tend to be prioritized. The sliding-window model is an important case of stream processing, where only the most recent elements remain active and the rest are discarded. Because the available memory in IoT scenarios is typically much smaller than the window size, queries are answered from compact synopses that are maintained in an online fashion. For the efficient construction of such synopses, wavelet-based algorithms are presented. The proposed algorithms provide deterministic guarantees and near-exact results for a variety of data distributions and query workloads. en
heal.advisorName Koziris, Nectarios
heal.committeeMemberName Koziris, Nectarios
heal.committeeMemberName Tsoumakos, Dimitrios
heal.committeeMemberName Tsanakas, Panayiotis
heal.committeeMemberName Deligiannakis, Antonios
heal.committeeMemberName Papavassiliou, Symeon
heal.committeeMemberName Pagourtzis, Aris
heal.committeeMemberName Kotidis, Yannis
heal.academicPublisher School of Electrical and Computer Engineering el
heal.academicPublisherID ntua
heal.fullTextAvailability false

