HEAL DSpace

Total fuel oil consumption minimization using reinforcement learning


Show simple item record

dc.contributor.author Mesolongitis, Konstantinos en
dc.contributor.author Μεσολογγίτης, Κωνσταντίνος el
dc.date.accessioned 2023-05-31T10:53:30Z
dc.date.available 2023-05-31T10:53:30Z
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/57782
dc.identifier.uri http://dx.doi.org/10.26240/heal.ntua.25479
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Greece
dc.rights Attribution-NoDerivs 3.0 Greece
dc.rights.uri http://creativecommons.org/licenses/by-nd/3.0/gr/
dc.subject Machine Learning en
dc.subject Reinforcement Learning en
dc.subject Fuel Oil Consumption Minimization en
dc.title Total fuel oil consumption minimization using reinforcement learning en
heal.type bachelorThesis
heal.classification Machine Learning en
heal.classification Reinforcement Learning en
heal.language en
heal.access free
heal.recordProvider ntua el
heal.publicationDate 2022-09-01
heal.abstract The purpose of this research was to determine whether Reinforcement Learning (RL) could produce remarkable results in assisting an autonomous ship to minimize the Total Fuel Oil Consumption (TFOC). A second, equally important objective, contingent on the first, was to identify the most suitable RL agent for this task. In this particular case study, the environment consisted of the sea area that the ship can explore, the existing boundaries (such as islands, rocky islets and land, where access is forbidden) and the weather conditions. The first part of the work concerned the construction of the environment. A grid-world environment was selected, and the test route represented a short journey between two ports in the area of the Faroe Islands: Torshavn and Krambatangi. There was no particular reason for choosing this exact course other than that it was the first environment constructed; since it was capable of acting as a proof of concept for the needs of the project, it was kept as the final environment. The boundaries were transformed with a computationally cost-effective technique developed for the purpose of this task. To achieve this, some accuracy regarding the positions of the boundaries was sacrificed, a decision made in view of the massive number of computations executed during a Reinforcement Learning experiment; it did not substantially affect the problem at hand. Since permitting an actual ship to explore its environment for the sake of learning is an unnecessary and costly enterprise (ship rental, fuel costs, crew-related costs, ...), simulations are implemented instead (a toy environment of this kind is sketched below the record). A reliable fuel consumption model was mandatory for the success of this project. To obtain one, data acquired from a shipping company were used to train an Artificial Neural Network (ANN). A Long Short-Term Memory (LSTM) network was implemented, as it has proven superior at tracking a time-series outcome, especially when the previous observations are not independent from the current ones. The following features were used as inputs: Speed Over Ground, Significant Wave Height, Draught AFT, Draught FWD and Distance Over Ground (a model sketch also follows the record). A weather-approximating function was designed to be applied to the environment. A storm center that interfered with the minimum-distance route was chosen, in order to examine whether the agent was capable of learning to choose the longer but cheaper route, the one on which the minimum amount of fuel is consumed. Since the environment's state space was very large, a Deep Q-Network (DQN) agent appeared to be the best option for this project: a value-approximating agent was required in order to predict the values of states that had not been experienced before. Nevertheless, the limits of the Q-learning agent (whose standard update rule is recalled below the record) were also put to the test, and the improved, more sophisticated Rainbow agent was employed in order to detect the best available option. The results revealed the dominance of the Rainbow agent. The Q-learning agent, as expected, was limited by the size of the state space, while the DQN agent was extremely unstable and sensitive to hyperparameter tuning. This application also demonstrates the feasibility of minimizing the Total Fuel Oil Consumption (TFOC) using Reinforcement Learning (RL). en
heal.advisorName Papalambrou, George en
heal.committeeMemberName Papadopoulos, Christos en
heal.committeeMemberName Themelis, Nikolaos en
heal.academicPublisher National Technical University of Athens. School of Naval Architecture and Marine Engineering. Division of Marine Engineering en
heal.academicPublisherID ntua
heal.numberOfPages 75 p. en
heal.fullTextAvailability false
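
As a rough illustration of the grid-world and the weather-approximating function described in the abstract, the sketch below places a hypothetical Gaussian storm on a small grid so that cells near the storm center burn more fuel per step. The grid size, port positions, storm parameters and cost scaling are all illustrative assumptions, not values from the thesis.

```python
# Toy grid-world sketch: a ship moves cell by cell between two ports, and a
# Gaussian "storm" inflates the per-step fuel cost near its center, so the
# fuel-minimizing route can differ from the shortest one. All numbers here
# (grid size, ports, storm center/width, costs) are illustrative assumptions.
import numpy as np

GRID = (20, 20)
START, GOAL = (0, 0), (19, 19)          # stand-ins for Torshavn / Krambatangi
STORM_CENTER = np.array([10.0, 10.0])   # storm placed on the shortest path
STORM_WIDTH = 4.0
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # N, S, W, E moves

def fuel_cost(cell):
    """Base fuel cost per step, amplified near the storm center."""
    d = np.linalg.norm(np.asarray(cell, dtype=float) - STORM_CENTER)
    return 1.0 + 4.0 * np.exp(-(d / STORM_WIDTH) ** 2)

def step(state, action_idx):
    """One transition: clamp to the grid, pay fuel, stop at the goal port.
    Forbidden cells (islands, rocky islets) would be masked out here too."""
    dr, dc = ACTIONS[action_idx]
    nr = min(max(state[0] + dr, 0), GRID[0] - 1)
    nc = min(max(state[1] + dc, 0), GRID[1] - 1)
    next_state = (nr, nc)
    reward = -fuel_cost(next_state)   # maximizing reward = minimizing fuel
    return next_state, reward, next_state == GOAL
```

An agent trained against such a reward is pushed to trade extra distance for calmer water, exactly the behavior the abstract set out to test.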
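The fuel-consumption model can be sketched in the same spirit: a small LSTM regressor over the five input features named in the abstract. The column names, look-back window and layer sizes below are assumptions for illustration; the thesis data and exact architecture are not reproduced here.

```python
# Minimal LSTM fuel-consumption sketch, assuming a pandas DataFrame `df`
# holding the five features from the abstract plus a hypothetical target.
import numpy as np
import pandas as pd
from tensorflow import keras

FEATURES = ["speed_over_ground", "significant_wave_height",
            "draught_aft", "draught_fwd", "distance_over_ground"]
TARGET = "fuel_oil_consumption"   # hypothetical target column name
WINDOW = 12                       # assumed look-back length in time steps

def make_windows(df: pd.DataFrame):
    """Slice the time series into (WINDOW, n_features) samples and align
    each sample with the fuel consumption of the following step."""
    X = df[FEATURES].to_numpy(dtype="float32")
    y = df[TARGET].to_numpy(dtype="float32")
    xs = np.stack([X[i:i + WINDOW] for i in range(len(X) - WINDOW)])
    return xs, y[WINDOW:]

model = keras.Sequential([
    keras.layers.Input(shape=(WINDOW, len(FEATURES))),
    keras.layers.LSTM(64),            # carries dependence on past observations
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),            # predicted fuel oil consumption
])
model.compile(optimizer="adam", loss="mse")

# Usage with real data: xs, ys = make_windows(df); model.fit(xs, ys, epochs=50)
```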
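Finally, the tabular Q-learning agent that the abstract reports as limited by the state-space size follows the standard textbook update rule (generic Q-learning, not a thesis-specific variant):

Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]

Because one value is stored per state-action pair, the table grows with the grid, which is why a value-approximating DQN (and ultimately the Rainbow agent) was preferred for the large state space.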

