
Atiya, A.F., Parlos, A.G. and Ingber, L. (2003). A reinforcement learning method based on adaptive simulated annealing, Proceedings of the 46th International Midwest Symposium on Circuits and Systems, Cairo, Egypt, pp. 121-124.

Barto, A., Sutton, R. and Anderson, C. (1983). Neuronlike adaptive elements that can solve difficult learning control problems, IEEE Transactions on Systems, Man, and Cybernetics 13(5): 834-847, DOI: 10.1109/TSMC.1983.6313077.

Cichosz, P. (1995). Truncating temporal differences: On the efficient implementation of TD(λ) for reinforcement learning, Journal of Artificial Intelligence Research 2: 287-318, DOI: 10.1613/jair.135.

Crook, P. and Hayes, G. (2003). Learning in a state of confusion: Perceptual aliasing in grid world navigation, Technical Report EDI-INF-RR-0176, University of Edinburgh, Edinburgh.

Ernst, D., Geurts, P. and Wehenkel, L. (2005). Tree-based batch mode reinforcement learning, Journal of Machine Learning Research 6: 503-556.

Forbes, J.R.N. (2002). Reinforcement Learning for Autonomous Vehicles, Ph.D. thesis, University of California, Berkeley, CA.

Gelly, S. and Silver, D. (2007). Combining online and offline knowledge in UCT, Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, USA, pp. 273-280.

Kaelbling, L.P., Littman, M.L. and Moore, A.W. (1996). Reinforcement learning: A survey, Journal of Artificial Intelligence Research 4: 237-285, DOI: 10.1613/jair.301.

Krawiec, K., Jaśkowski, W.G. and Szubert, M.G. (2011). Evolving small-board Go players using coevolutionary temporal difference learning with archives, International Journal of Applied Mathematics and Computer Science 21(4): 717-731, DOI: 10.2478/v10006-011-0057-3.

Lagoudakis, M. and Parr, R. (2003). Least-squares policy iteration, Journal of Machine Learning Research 4: 1107-1149.

Lanzi, P. (2000). Adaptive agents with reinforcement learning and internal memory, From Animals to Animats 6: Proceedings of the Sixth International Conference on Simulation of Adaptive Behavior, Cambridge, MA, USA, pp. 333-342.

Lin, L.-J. (1993). Reinforcement Learning for Robots Using Neural Networks, Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA.

Markowska-Kaczmar, U. and Kwaśnicka, H. (2005). Neural Networks Applications, Wrocław University of Technology Press, Wrocław, (in Polish).

Moore, A. and Atkeson, C. (1993). Prioritized sweeping: Reinforcement learning with less data and less time, Machine Learning 13(1): 103-130, DOI: 10.1007/BF00993104.

Moriarty, D., Schultz, A. and Grefenstette, J. (1999). Evolutionary algorithms for reinforcement learning, Journal of Artificial Intelligence Research 11: 241-276, DOI: 10.1613/jair.613.

Peng, J. and Williams, R. (1993). Efficient learning and planning within the Dyna framework, Adaptive Behavior 1(4): 437-454, DOI: 10.1177/105971239300100403.

Reynolds, S. (2002). Experience stack reinforcement learning for off-policy control, Technical Report CSRP-02-01, University of Birmingham, Birmingham, ftp://ftp.cs.bham.ac.uk/pub/tech-reports/2002/CSRP-02-01.ps.gz.

Riedmiller, M. (2005). Neural reinforcement learning to swing-up and balance a real pole, Proceedings of the 2005 IEEE International Conference on Systems, Man and Cybernetics, Big Island, HI, USA, pp. 3191-3196.

Rummery, G. and Niranjan, M. (1994). On-line Q-learning using connectionist systems, Technical Report CUED/F-INFENG/TR 166, Cambridge University, Cambridge.

Smart, W. and Kaelbling, L. (2002). Effective reinforcement learning for mobile robots, Proceedings of the International Conference on Robotics and Automation, Washington, DC, USA, pp. 3404-3410.

Sutton, R. (1990). Integrated architectures for learning, planning, and reacting based on approximating dynamic programming, Proceedings of the Seventh International Conference on Machine Learning, Austin, TX, USA, pp. 216-224.

Sutton, R. (1991). Planning by incremental dynamic programming, Proceedings of the 8th International Workshop on Machine Learning, Evanston, IL, USA, pp. 353-357.

Sutton, R. and Barto, A. (1998). Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA.

Vanhulsel, M., Janssens, D. and Vanhoof, K. (2009). Simulation of sequential data: An enhanced reinforcement learning approach, Expert Systems with Applications 36(4): 8032-8039, DOI: 10.1016/j.eswa.2008.10.056.

Watkins, C. (1989). Learning from Delayed Rewards, Ph.D. thesis, Cambridge University, Cambridge.

Whiteson, S. (2012). Evolutionary computation for reinforcement learning, in M. Wiering and M. van Otterlo (Eds.), Reinforcement Learning: State of the Art, Springer, Berlin, pp. 325-358, DOI: 10.1007/978-3-642-27645-3_10.

Whiteson, S. and Stone, P. (2006). Evolutionary function approximation for reinforcement learning, Journal of Machine Learning Research 7: 877-917.

Ye, C., Young, N.H.C. and Wang, D. (2003). A fuzzy controller with supervised learning assisted reinforcement learning algorithm for obstacle avoidance, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 33(1): 17-27, DOI: 10.1109/TSMCB.2003.808179.

Zajdel, R. (2012). Fuzzy epoch-incremental reinforcement learning algorithm, in L. Rutkowski, M. Korytkowski, R. Scherer, R. Tadeusiewicz, L.A. Zadeh and J.M. Zurada (Eds.), Artificial Intelligence and Soft Computing, Lecture Notes in Computer Science, Vol. 7267, Springer-Verlag, Berlin/Heidelberg, pp. 359-366, DOI: 10.1007/978-3-642-29347-4_42.

eISSN:
2083-8492
ISSN:
1641-876X
Language:
English
Publication frequency:
4 times per year
Journal subjects:
Mathematics, Applied Mathematics