dc.contributor.author | Apostolikas, G | en
dc.contributor.author | Tzafestas, S | en
dc.date.accessioned | 2014-03-01T02:42:48Z |
dc.date.available | 2014-03-01T02:42:48Z |
dc.date.issued | 2004 | en
dc.identifier.issn | 1079-8587 | en
dc.identifier.uri | https://dspace.lib.ntua.gr/xmlui/handle/123456789/31089 |
dc.relation.uri | http://www.scopus.com/inward/record.url?eid=2-s2.0-4344697161&partnerID=40&md5=e3fd8e8490f9ff67ea9d6f90b0bbd0e5 | en
dc.subject | Action selection | en
dc.subject | POMDP | en
dc.subject | QMDP | en
dc.subject.classification | Automation & Control Systems | en
dc.subject.classification | Computer Science, Artificial Intelligence | en
dc.subject.other | Computational complexity | en
dc.subject.other | Costs | en
dc.subject.other | Embedded systems | en
dc.subject.other | Evolutionary algorithms | en
dc.subject.other | Feedback | en
dc.subject.other | Information retrieval | en
dc.subject.other | Learning systems | en
dc.subject.other | Statistical methods | en
dc.subject.other | Vectors | en
dc.subject.other | Action Selection | en
dc.subject.other | Optimal policy | en
dc.subject.other | Partially Observable Markov Decision Processes (POMDP) | en
dc.subject.other | QMDP | en
dc.subject.other | Markov processes | en
dc.title | Improved QMDP policy for partially observable Markov decision processes in large domains: Embedding exploration dynamics | en
heal.type | conferenceItem | en
heal.language | English | en
heal.publicationDate | 2004 | en
heal.abstract | Artificial Intelligence techniques have primarily focused on domains in which the state of the world is known to the system at all times. Such domains can be modeled as a Markov Decision Process (MDP). Action and planning policies for MDPs have been studied extensively, and several efficient methods exist. In real-world problems, however, information needed for action selection is often missing. The theory of Partially Observable Markov Decision Processes (POMDPs) covers the problem domain in which the full state of the environment is not directly perceivable by the agent. Current algorithms for the exact solution of POMDPs are applicable only to domains with a small number of states. To cope with larger state spaces, a number of methods that achieve sub-optimal solutions exist, and among these the QMDP approach appears to be the best. We introduce a novel technique, called Explorative QMDP (EQMDP), which constitutes an important enhancement of the QMDP method. To the best of the authors' knowledge, EQMDP is currently the most efficient method applicable to large POMDP domains. | en
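Note: for readers unfamiliar with the baseline method named in the abstract, standard QMDP (Littman, Cassandra, and Kaelbling, 1995) solves the underlying fully observable MDP and then selects, at each belief state, the action maximizing the belief-weighted MDP Q-values. The sketch below illustrates that baseline rule only; it is not code from the paper, and the exploration dynamics of EQMDP are not reproduced here. All function names, array shapes, and parameter values are illustrative assumptions.

```python
import numpy as np

def mdp_q_values(T, R, gamma=0.95, eps=1e-6):
    """Value iteration for the underlying fully observable MDP.

    T: shape (A, S, S), T[a, s, s'] = P(s' | s, a).
    R: shape (S, A), immediate rewards.
    Returns Q of shape (S, A).
    """
    S, A = R.shape
    V = np.zeros(S)
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_s' T[a, s, s'] * V[s']
        Q = R + gamma * np.einsum('ast,t->sa', T, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < eps:
            return Q
        V = V_new

def qmdp_action(belief, Q):
    """QMDP action selection: act as if all state uncertainty
    vanishes after one step, choosing argmax_a sum_s b(s) Q(s, a)."""
    return int(np.argmax(belief @ Q))

# Tiny 2-state, 2-action example (numbers are illustrative only).
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0], [0.0, 1.0]])
Q = mdp_q_values(T, R)
print(qmdp_action(np.array([0.7, 0.3]), Q))
```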
heal.publisher | AUTOSOFT PRESS | en
heal.journalName | Intelligent Automation and Soft Computing | en
dc.identifier.isi | ISI:000223356100002 | en
dc.identifier.volume | 10 | en
dc.identifier.issue | 3 | en
dc.identifier.spage | 209 | en
dc.identifier.epage | 220 | en