dc.contributor.author |
Patrinos, PK |
en |
dc.contributor.author |
Sarimveis, H |
en |
dc.date.accessioned |
2014-03-01T02:49:59Z |
|
dc.date.available |
2014-03-01T02:49:59Z |
|
dc.date.issued |
2005 |
en |
dc.identifier.issn |
1474-6670 |
en |
dc.identifier.uri |
https://dspace.lib.ntua.gr/xmlui/handle/123456789/34856 |
|
dc.relation.uri |
http://www.scopus.com/inward/record.url?eid=2-s2.0-79960740645&partnerID=40&md5=ccdec55739708756eadadfcd9d67027e |
en |
dc.subject |
Markov decision processes |
en |
dc.subject |
Optimal control |
en |
dc.subject |
Radial base function networks |
en |
dc.subject |
Uncertain dynamic systems |
en |
dc.subject.other |
Bellman equations |
en |
dc.subject.other |
Compact representation |
en |
dc.subject.other |
Management problems |
en |
dc.subject.other |
Markov Decision Processes |
en |
dc.subject.other |
Neuro dynamic programming |
en |
dc.subject.other |
Optimal controls |
en |
dc.subject.other |
Policy evaluation |
en |
dc.subject.other |
Policy iteration |
en |
dc.subject.other |
Radial base function |
en |
dc.subject.other |
Stochastic dynamic systems |
en |
dc.subject.other |
Uncertain dynamic systems |
en |
dc.subject.other |
Automation |
en |
dc.subject.other |
Control |
en |
dc.subject.other |
Dynamic programming |
en |
dc.subject.other |
Inventory control |
en |
dc.subject.other |
Learning algorithms |
en |
dc.subject.other |
Markov processes |
en |
dc.subject.other |
Radial basis function networks |
en |
dc.subject.other |
Process control |
en |
dc.title |
An RBF based neuro-dynamic approach for the control of stochastic dynamic systems |
en |
heal.type |
conferenceItem |
en |
heal.publicationDate |
2005 |
en |
heal.abstract |
This paper presents a neuro-dynamic programming methodology for the control of Markov decision processes. The proposed method can be considered a variant of optimistic policy iteration, in which radial basis function (RBF) networks serve as a compact representation of the cost-to-go function and the λ-LSPE algorithm is used for policy evaluation. We also emphasize the reformulation of the Bellman equation around the post-decision state, which circumvents the calculation of the expectation. The proposed algorithm is applied to a retailer-inventory management problem. Copyright © 2005 IFAC. |
en |
heal.journalName |
IFAC Proceedings Volumes (IFAC-PapersOnline) |
en |
dc.identifier.volume |
16 |
en |
dc.identifier.spage |
52 |
en |
dc.identifier.epage |
57 |
en |