
Ashby, W. R. 1960. Design for a Brain. Springer Science & Business Media. doi:10.1007/978-94-015-1320-3

Barto, A. G.; Singh, S.; and Chentanez, N. 2004. Intrinsically motivated learning of hierarchical collections of skills. In Proc. 3rd Int. Conf. Development Learn, 112-119.

Cañamero, D. 1997. Modeling motivations and emotions as a basis for intelligent behavior. In Proceedings of the first international conference on Autonomous agents, 148-155. ACM. doi:10.1145/267658.267688

Dawkins, R. 1976. The Selfish Gene. Oxford University Press, Oxford, UK.

Dayan, P., and Hinton, G. E. 1996. Varieties of Helmholtz machine. Neural Networks 9(8):1385-1403. doi:10.1016/S0893-6080(96)00009-3

Doya, K., and Uchibe, E. 2005. The cyber rodent project: Exploration of adaptive mechanisms for self-preservation and self-reproduction. Adaptive Behavior 13(2):149-160. doi:10.1177/105971230501300206

Elfwing, S.; Uchibe, E.; Doya, K.; and Christensen, H. I. 2005. Biologically inspired embodied evolution of survival. In Evolutionary Computation, 2005. The 2005 IEEE Congress on, volume 3, 2210-2216. IEEE.

Hester, T., and Stone, P. 2012. Learning and using models. In Reinforcement Learning. Springer. 111-141. doi:10.1007/978-3-642-27645-3_4

Jordan, M. I.; Ghahramani, Z.; Jaakkola, T. S.; and Saul, L. K. 1999. An introduction to variational methods for graphical models. Machine learning 37(2):183-233. doi:10.1023/A:1007665907178

Kaelbling, L. P.; Littman, M. L.; and Moore, A. W. 1996. Reinforcement learning: A survey. arXiv preprint cs/9605103. doi:10.1613/jair.301

Kappen, H. J.; Gómez, V.; and Opper, M. 2012. Optimal control as a graphical model inference problem. Machine learning 87(2):159-182. doi:10.1007/s10994-012-5278-7

Keramati, M., and Gutkin, B. S. 2011. A reinforcement learning theory for homeostatic regulation. In Advances in Neural Information Processing Systems, 82-90.

Kingma, D., and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

Kingma, D. P., and Welling, M. 2013. Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114.

Konidaris, G., and Barto, A. 2006. An adaptive robot motivational system. In From Animals to Animats 9. Springer. 346-356. doi:10.1007/11840541_29

Lange, S.; Riedmiller, M.; and Voigtländer, A. 2012. Autonomous reinforcement learning on raw visual input data in a real world application. In Neural Networks (IJCNN), The 2012 International Joint Conference on, 1-8. IEEE. doi:10.1109/IJCNN.2012.6252823

Lin, L.-J. 1992. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine learning 8(3-4):293-321. doi:10.1007/BF00992699

McFarland, D., and Bösser, T. 1993. Intelligent behavior in animals and robots. MIT Press.

McFarland, D., and Houston, A. 1981. Quantitative ethology. Pitman Advanced Pub. Program.

McFarland, D., and Spier, E. 1997. Basic cycles, utility and opportunism in self-sufficient robots. Robotics and Autonomous Systems 20(2):179-190. doi:10.1016/S0921-8890(96)00069-3

Meyer, J.-A., and Guillot, A. 1991. Simulation of adaptive behavior in animats: Review and prospect. In J.-A. Meyer and S.W. Wilson (Eds.) From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, 2-14.

Mnih, A., and Gregor, K. 2014. Neural variational inference and learning in belief networks. arXiv preprint arXiv:1402.0030.

Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A. A.; Veness, J.; Bellemare, M. G.; Graves, A.; Riedmiller, M.; Fidjeland, A. K.; Ostrovski, G.; et al. 2015. Human-level control through deep reinforcement learning. Nature 518(7540):529-533. doi:10.1038/nature14236. PMID: 25719670

Nakamura, M., and Yamakawa, H. 2016. A Game-Engine-Based Learning Environment Framework for Artificial General Intelligence. In International Conference on Neural Information Processing, 351-356. Springer. doi:10.1007/978-3-319-46687-3_39

Ng, A. Y.; Harada, D.; and Russell, S. 1999. Policy invariance under reward transformations: Theory and application to reward shaping. In ICML, volume 99, 278-287.

Ogata, T., and Sugano, S. 1997. Emergence of Robot Behavior Based on Self-Preservation. Research Methodology and Embodiment of Mechanical System. Journal of the Robotics Society of Japan 15(5):710-721. doi:10.7210/jrsj.15.710

Omohundro, S. M. 2008. The Basic AI Drives. In Artificial General Intelligence, 2008: Proceedings of the First AGI Conference, volume 171, 483. IOS Press.

Pfeifer, R., and Scheier, C. 1999. Understanding intelligence. MIT Press.

Ranganath, R.; Gerrish, S.; and Blei, D. M. 2013. Black box variational inference. arXiv preprint arXiv:1401.0118.

Rawlik, K.; Toussaint, M.; and Vijayakumar, S. 2013. On stochastic optimal control and reinforcement learning by approximate inference. In Proceedings of the Twenty-Third international joint conference on Artificial Intelligence, 3052-3056. AAAI Press. doi:10.15607/RSS.2012.VIII.045

Rummery, G. A., and Niranjan, M. 1994. On-line Q-learning using connectionist systems. University of Cambridge, Department of Engineering.

Rusu, A. A.; Vecerik, M.; Rothörl, T.; Heess, N.; Pascanu, R.; and Hadsell, R. 2016. Sim-to-real robot learning from pixels with progressive nets. arXiv preprint arXiv:1610.04286.

Sibly, R., and McFarland, D. 1976. On the fitness of behavior sequences. American Naturalist 601-617. doi:10.1086/283093

Spier, E. 1997. From reactive behaviour to adaptive behaviour: motivational models for behaviour in animals and robots. Ph.D. Dissertation, University of Oxford.

Toda, M. 1962. The design of a fungus-eater: A model of human behavior in an unsophisticated environment. Behavioral Science 7(2):164-183. doi:10.1002/bs.3830070203

Toda, M. 1982. Man, robot, and society: Models and speculations. M. Nijhoff Pub. doi:10.1007/978-94-017-5358-6

Todorov, E. 2008. General duality between optimal control and estimation. In Decision and Control, 2008. CDC 2008. 47th IEEE Conference on, 4286-4292. IEEE. doi:10.1109/CDC.2008.4739438

Toussaint, M.; Harmeling, S.; and Storkey, A. 2006. Probabilistic inference for solving (PO)MDPs. Informatics research report 0934, University of Edinburgh.

Toussaint, M. 2009. Robot trajectory optimization using approximate inference. In Proceedings of the 26th Annual International Conference on Machine Learning, 1049-1056. ACM. doi:10.1145/1553374.1553508

Vlassis, N., and Toussaint, M. 2009. Model-free reinforcement learning as mixture learning. In Proceedings of the 26th Annual International Conference on Machine Learning, 1081-1088. ACM. doi:10.1145/1553374.1553512

Walter, W. 1953. The living brain. Norton.

Young, J. Z. 1966. The Memory System of the Brain. Oxford University Press. doi:10.1525/9780520346468
