HEAL DSpace

Temporal differences learning with the conjugate gradient algorithm

dc.contributor.author Falas, T en
dc.contributor.author Stafylopatis, A-G en
dc.date.accessioned 2014-03-01T02:42:02Z
dc.date.available 2014-03-01T02:42:02Z
dc.date.issued 2001 en
dc.identifier.uri https://dspace.lib.ntua.gr/xmlui/handle/123456789/30732
dc.subject Conjugate gradient en
dc.subject Neural networks en
dc.subject Reinforcement learning en
dc.subject Temporal differences en
dc.subject Time series prediction en
dc.subject.other Backpropagation en
dc.subject.other Computational complexity en
dc.subject.other Gradient methods en
dc.subject.other Learning systems en
dc.subject.other Optimization en
dc.subject.other Probability en
dc.subject.other Sensitivity analysis en
dc.subject.other Time series analysis en
dc.subject.other Conjugate gradient algorithm en
dc.subject.other Generalization ability en
dc.subject.other Learning speed en
dc.subject.other Reinforcement learning en
dc.subject.other Temporal differences en
dc.subject.other Learning algorithms en
dc.title Temporal differences learning with the conjugate gradient algorithm en
heal.type conferenceItem en
heal.identifier.primary 10.1109/IJCNN.2001.939012 en
heal.identifier.secondary http://dx.doi.org/10.1109/IJCNN.2001.939012 en
heal.publicationDate 2001 en
heal.abstract This paper investigates the use of the Conjugate Gradient (CG) algorithm in comparison to the traditional backpropagation (BP) algorithm, as applied to the Temporal Differences (TD) method for reinforcement learning. Time series prediction is the application domain examined. Simple time series (linear, sinusoidal, etc.) as well as more complex ones drawn from real data (stock market indices) are used as benchmark problems. The performance measures used are learning speed, generalization ability, and sensitivity to user-set parameters. Preliminary experimental results suggest that the performance of TD learning (both learning speed and generalization ability) can be significantly improved when the CG algorithm is employed instead of the traditional BP algorithm. In addition, as expected, the CG algorithm proved more robust and less dependent on user-set training parameters and initial conditions, especially for rather complicated time series. The use of the CG algorithm in TD learning is therefore promising for real-life applications in time series prediction. en
heal.journalName Proceedings of the International Joint Conference on Neural Networks en
dc.identifier.doi 10.1109/IJCNN.2001.939012 en
dc.identifier.volume 1 en
dc.identifier.spage 171 en
dc.identifier.epage 176 en


Files in this item

There are no files associated with this item.
