Resource aware GPU scheduling in kubernetes infrastructure

Φερίκογλου, Άγγελος; Ferikoglou, Angelos

dc.contributor.author	Φερίκογλου, Άγγελος	el
dc.contributor.author	Ferikoglou, Angelos	en
dc.date.accessioned	2020-10-20T07:22:50Z
dc.date.available	2020-10-20T07:22:50Z
dc.identifier.uri	https://dspace.lib.ntua.gr/xmlui/handle/123456789/51534
dc.identifier.uri	http://dx.doi.org/10.26240/heal.ntua.19232
dc.rights	Attribution-NonCommercial-NoDerivatives 3.0 Greece	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/gr/	*
dc.subject	Cloud computing	en
dc.subject	GPU	en
dc.subject	Resource-aware	en
dc.subject	Kubernetes	en
dc.subject	Scheduling	en
dc.title	Resource aware GPU scheduling in kubernetes infrastructure	en
heal.type	bachelorThesis
heal.classification	Computer Engineering	en
heal.language	el
heal.language	en
heal.access	free
heal.recordProvider	ntua	el
heal.publicationDate	2020-07-31
heal.abstract	An ever-increasing number of Artificial Intelligence (AI) and Machine Learning (ML) workloads are deployed and executed on the Cloud. To serve and manage these large computational demands effectively, data center operators and cloud providers have provisioned GPU resources at the scale of thousands of nodes. Since GPUs are relatively new to the cloud stack, support for efficient GPU management is lacking: state-of-the-art schedulers and orchestrators treat GPUs only as a specific resource constraint, ignoring both their unique characteristics and the properties of the applications that use them. In addition, users tend to request more GPU resources than they actually need, leading to resource under-utilization. In this thesis, we design a resource-aware GPU scheduling system that efficiently colocates applications arriving at a data center on the same card. We integrate our solution with Kubernetes, one of the most widely used cloud orchestration frameworks. We show that our scheduler achieves better quality of service (QoS) and higher resource utilization than state-of-the-art schedulers for a variety of representative ML cloud workloads.	en
heal.advisorName	Σούντρης, Δημήτριος	el
heal.committeeMemberName	Πνευματικάκος, Διονύσιος	el
heal.committeeMemberName	Σούντρης, Δημήτριος	el
heal.committeeMemberName	Γκούμας, Γεώργιος	el
heal.academicPublisher	National Technical University of Athens. School of Electrical and Computer Engineering. Division of Computer Science	en
heal.academicPublisherID	ntua
heal.numberOfPages	96 p.	en
heal.fullTextAvailability	false