HEAL DSpace

An RBF based neuro-dynamic approach for the control of stochastic dynamic systems

dc.contributor.author Patrinos, PK en
dc.contributor.author Sarimveis, H en
dc.date.accessioned 2014-03-01T02:49:59Z
dc.date.available 2014-03-01T02:49:59Z
dc.date.issued 2005 en
dc.identifier.issn 14746670 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/34856
dc.relation.uri http://www.scopus.com/inward/record.url?eid=2-s2.0-79960740645&partnerID=40&md5=ccdec55739708756eadadfcd9d67027e en
dc.subject Markov decision processes en
dc.subject Optimal control en
dc.subject Radial base function networks en
dc.subject Uncertain dynamic systems en
dc.subject.other Bellman equations en
dc.subject.other Compact representation en
dc.subject.other Management problems en
dc.subject.other Markov Decision Processes en
dc.subject.other Neuro dynamic programming en
dc.subject.other Optimal controls en
dc.subject.other Policy evaluation en
dc.subject.other Policy iteration en
dc.subject.other Radial base function en
dc.subject.other Stochastic dynamic systems en
dc.subject.other Uncertain dynamic systems en
dc.subject.other Automation en
dc.subject.other Control en
dc.subject.other Dynamic programming en
dc.subject.other Inventory control en
dc.subject.other Learning algorithms en
dc.subject.other Markov processes en
dc.subject.other Radial basis function networks en
dc.subject.other Process control en
dc.title An RBF based neuro-dynamic approach for the control of stochastic dynamic systems en
heal.type conferenceItem en
heal.publicationDate 2005 en
heal.abstract This paper presents a neuro-dynamic programming methodology for the control of Markov decision processes. The proposed method can be considered a variant of optimistic policy iteration, in which radial basis function (RBF) networks are employed as a compact representation of the cost-to-go function and λ-LSPE is used for policy evaluation. We also emphasize the reformulation of the Bellman equation around the post-decision state in order to circumvent the calculation of the expectation. The proposed algorithm is applied to a retailer-inventory management problem. Copyright © 2005 IFAC. en
heal.journalName IFAC Proceedings Volumes (IFAC-PapersOnline) en
dc.identifier.volume 16 en
dc.identifier.spage 52 en
dc.identifier.epage 57 en


Files in this item

There are no files associated with this item.