heal.abstract |
Cloud computing is becoming one of the most attractive solutions for application execution, owing to the flexibility and efficiency it offers. Cloud computing refers to the on-demand provision of system resources, especially computing power and storage. These resources typically reside on individual servers in data centers and/or server farms around the world. Proper and efficient management of those resources is crucial from both the providers' and the end-users' points of view, since it can dramatically improve performance and reduce costs for all parties involved. However, orchestrating cloud computing resources is not straightforward, due to i) the large number of available optimization knobs, ii) the different levels at which these optimizations can be applied (from server-level to cluster-level to application-level) and iii) the interrelationship between software and hardware optimizations, combined with the extreme hardware heterogeneity found in today's data-center environments. On top of that, novel computing paradigms are emerging (e.g., hardware disaggregation), which expose additional optimization knobs and thus further complicate the problem of efficient resource orchestration.
In this dissertation, we examine the applicability of deep learning techniques to system optimization in cloud architectures. Given the sheer number and multi-level nature of the available optimization knobs, which tend to be unmanageable with conventional, human-driven orchestration mechanisms, deep learning approaches appear unavoidable if the full potential of modern cloud systems is to be exploited. First, we investigate the use of state-of-the-art neural networks for system monitoring, which forms an integral part of modern cloud infrastructures. Then, we examine the application of ML-driven orchestration at different levels of the hierarchy, i.e., application-driven automatic optimizations, cluster-level application deployment and orchestration, and system-level control of running applications. To this end, we propose three frameworks, namely Rusty, Adrias and Sparkle, which employ deep-learning-driven optimizations at different levels of the hierarchy. |
en |