heal.abstract |
The purpose of this diploma thesis is the implementation of a MapReduce framework on a Network-on-Chip (NoC) that has DSM characteristics. MapReduce is a programming model for processing large data sets with a parallel, distributed algorithm running on a large number of processing nodes. Our objective was to determine the feasibility of implementing MapReduce on a many-core embedded system such as the NoC described, and to evaluate its performance in terms of scalability. Furthermore, we wanted to exploit the platform’s characteristics in order to provide synchronization and communication among the cores. The proposed framework performed exceptionally well, achieving speedups of up to 85.2x on 36 cores compared to sequential code. Finally, we analyzed the framework’s behaviour while scaling different parameters.
This thesis consists of five chapters. Chapter 1 introduces many-core systems, Big Data, and MapReduce as a parallel programming method. Chapter 2 presents the most popular MapReduce frameworks, alternative methods for parallel processing, and the limitations of MapReduce, and then discusses our objectives. Chapter 3 describes the platform on which the proposed framework was implemented and gives a detailed overview of the framework. Chapter 4 presents our test configuration and the results of our simulations, accompanied by an analysis. Finally, Chapter 5 concludes our work and presents topics and ideas that require further study. |
en |