HEAL DSpace

Improved QMDP policy for partially observable Markov decision processes in large domains: Embedding exploration dynamics

DSpace/Manakin Repository

Show simple item record

dc.contributor.author Apostolikas, G en
dc.contributor.author Tzafestas, S en
dc.date.accessioned 2014-03-01T02:42:48Z
dc.date.available 2014-03-01T02:42:48Z
dc.date.issued 2004 en
dc.identifier.issn 1079-8587 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/31089
dc.relation.uri http://www.scopus.com/inward/record.url?eid=2-s2.0-4344697161&partnerID=40&md5=e3fd8e8490f9ff67ea9d6f90b0bbd0e5 en
dc.subject Action selection en
dc.subject POMDP en
dc.subject QMDP en
dc.subject.classification Automation & Control Systems en
dc.subject.classification Computer Science, Artificial Intelligence en
dc.subject.other Computational complexity en
dc.subject.other Costs en
dc.subject.other Embedded systems en
dc.subject.other Evolutionary algorithms en
dc.subject.other Feedback en
dc.subject.other Information retrieval en
dc.subject.other Learning systems en
dc.subject.other Statistical methods en
dc.subject.other Vectors en
dc.subject.other Action Selection en
dc.subject.other Optimal policy en
dc.subject.other Partially Observable Markov Decision Processes (POMDP) en
dc.subject.other QMDP en
dc.subject.other Markov processes en
dc.title Improved QMDP policy for partially observable Markov decision processes in large domains: Embedding exploration dynamics en
heal.type conferenceItem en
heal.language English en
heal.publicationDate 2004 en
heal.abstract Artificial Intelligence techniques have traditionally focused on domains in which the full state of the world is known to the system at every time step. Such domains can be modeled as Markov Decision Processes (MDPs). Action-selection and planning policies for MDPs have been studied extensively, and several efficient methods exist. In real-world problems, however, information needed for action selection is often missing. The theory of Partially Observable Markov Decision Processes (POMDPs) covers the problem domain in which the full state of the environment is not directly perceivable by the agent. Current algorithms for the exact solution of POMDPs are applicable only to domains with a small number of states. To cope with larger state spaces, a number of methods that achieve sub-optimal solutions exist; among these, the QMDP approach appears to be the best. We introduce a novel technique, called Explorative QMDP (EQMDP), which constitutes an important enhancement of the QMDP method. To the best of the authors' knowledge, EQMDP is currently the most efficient method applicable to large POMDP domains. en
heal.publisher AUTOSOFT PRESS en
heal.journalName Intelligent Automation and Soft Computing en
dc.identifier.isi ISI:000223356100002 en
dc.identifier.volume 10 en
dc.identifier.issue 3 en
dc.identifier.spage 209 en
dc.identifier.epage 220 en
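The abstract above describes the QMDP baseline that EQMDP extends. As a minimal illustrative sketch (the record does not specify the paper's exploration dynamics, so only the standard QMDP rule is shown, with hypothetical toy numbers): QMDP assumes the underlying MDP's Q-values are known and selects the action maximizing the belief-weighted Q-value, a* = argmax_a Σ_s b(s) Q(s, a).

```python
import numpy as np

def qmdp_action(belief, q_table):
    """Return the QMDP action: argmax over actions of the
    belief-weighted MDP Q-values, sum_s b(s) * Q(s, a)."""
    return int(np.argmax(belief @ q_table))

# Toy 2-state, 2-action example (hypothetical values):
q = np.array([[1.0, 0.0],   # Q(s0, a0), Q(s0, a1)
              [0.0, 2.0]])  # Q(s1, a0), Q(s1, a1)
b = np.array([0.6, 0.4])    # belief over {s0, s1}
qmdp_action(b, q)           # weighted values [0.6, 0.8] -> action 1
```

Because this rule never values information-gathering actions, it can act poorly under high state uncertainty — the limitation the paper's exploration dynamics are designed to address.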


Files in this item

Files  Size  Format  View

There are no files associated with this item.

This item appears in the following collection(s)
