Aarts, E. H. L., and Lenstra, J. K., eds. 1997. Local Search in Combinatorial Optimization. Discrete Mathematics and Optimization. Chichester, England: Wiley-Interscience.
Banzhaf, W.; Nordin, P.; Keller, R. E.; and Francone, F. D. 1998. Genetic Programming. San Francisco, CA: Morgan Kaufmann.
Barron, A. R. 1985. Logically Smooth Density Estimation. Ph.D. Dissertation, Stanford University.
Berry, D. A., and Fristedt, B. 1985. Bandit Problems: Sequential Allocation of Experiments. London: Chapman and Hall. doi:10.1007/978-94-015-3711-7
Brafman, R. I., and Tennenholtz, M. 2002. R-max - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning. Journal of Machine Learning Research 3:213-231.
Cover, T. M., and Thomas, J. A. 2006. Elements of Information Theory. Wiley-Interscience, 2nd edition.
Dearden, R.; Friedman, N.; and Andre, D. 1999. Model-Based Bayesian Exploration. In Proc. 15th Conference on Uncertainty in Artificial Intelligence (UAI-99), 150-159.
Duff, M. 2002. Optimal Learning: Computational Procedures for Bayes-Adaptive Markov Decision Processes. Ph.D. Dissertation, Department of Computer Science, University of Massachusetts Amherst.
Dzeroski, S.; de Raedt, L.; and Driessens, K. 2001. Relational Reinforcement Learning. Machine Learning 43:7-52. doi:10.1023/A:1007694015589
Fishman, G. 2003. Monte Carlo. Springer.
Givan, R.; Dean, T.; and Greig, M. 2003. Equivalence Notions and Model Minimization in Markov Decision Processes. Artificial Intelligence 147(1-2):163-223. doi:10.1016/S0004-3702(02)00376-4
Goertzel, B., and Pennachin, C., eds. 2007. Artificial General Intelligence. Springer. doi:10.1007/978-3-540-68677-4
Gordon, G. 1999. Approximate Solutions to Markov Decision Processes. Ph.D. Dissertation, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.
Grünwald, P. D. 2007. The Minimum Description Length Principle. Cambridge, MA: The MIT Press. doi:10.7551/mitpress/4643.001.0001
Guyon, I., and Elisseeff, A., eds. 2003. Variable and Feature Selection. JMLR Special Issue. MIT Press.
Hastie, T.; Tibshirani, R.; and Friedman, J. H. 2001. The Elements of Statistical Learning. Springer. doi:10.1007/978-0-387-21606-5
Hutter, M. 2005. Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Berlin: Springer. 300 pages. http://www.hutter1.net/ai/uaibook.htm
Hutter, M. 2007. Universal Algorithmic Intelligence: A Mathematical Top-Down Approach. In Artificial General Intelligence. Berlin: Springer. 227-290. doi:10.1007/978-3-540-68677-4_8
Hutter, M. 2009a. Feature Dynamic Bayesian Networks. In Proc. 2nd Conf. on Artificial General Intelligence (AGI'09), volume 8, 67-73. Atlantis Press. doi:10.2991/agi.2009.6
Hutter, M. 2009b. Feature Markov Decision Processes. In Proc. 2nd Conf. on Artificial General Intelligence (AGI'09), volume 8, 61-66. Atlantis Press. doi:10.2991/agi.2009.30
Hutter, M. 2009c. Feature Reinforcement Learning: Part II: Structured MDPs. In progress. Will extend Hutter (2009a). doi:10.2478/v10229-011-0002-8
Kaelbling, L. P.; Littman, M. L.; and Cassandra, A. R. 1998. Planning and Acting in Partially Observable Stochastic Domains. Artificial Intelligence 101:99-134. doi:10.1016/S0004-3702(98)00023-X
Kearns, M. J., and Singh, S. 1998. Near-Optimal Reinforcement Learning in Polynomial Time. In Proc. 15th International Conf. on Machine Learning, 260-268. San Francisco, CA: Morgan Kaufmann.
Koza, J. R. 1992. Genetic Programming. The MIT Press.
Kumar, P. R., and Varaiya, P. P. 1986. Stochastic Systems: Estimation, Identification, and Adaptive Control. Englewood Cliffs, NJ: Prentice Hall.
Legg, S., and Hutter, M. 2007. Universal Intelligence: A Definition of Machine Intelligence. Minds & Machines 17(4):391-444. doi:10.1007/s11023-007-9079-x
Legg, S. 2008. Machine Super Intelligence. Ph.D. Dissertation, IDSIA, Lugano.
Li, M., and Vitányi, P. M. B. 2008. An Introduction to Kolmogorov Complexity and its Applications. Berlin: Springer, 3rd edition. doi:10.1007/978-0-387-49820-1
Liang, P., and Jordan, M. 2008. An Asymptotic Analysis of Generative, Discriminative, and Pseudolikelihood Estimators. In Proc. 25th International Conf. on Machine Learning (ICML'08), volume 307, 584-591. ACM. doi:10.1145/1390156.1390230
Liu, J. S. 2002. Monte Carlo Strategies in Scientific Computing. Springer.
Lusena, C.; Goldsmith, J.; and Mundhenk, M. 2001. Nonapproximability Results for Partially Observable Markov Decision Processes. Journal of Artificial Intelligence Research 14:83-103. doi:10.1613/jair.714
MacKay, D. J. C. 2003. Information Theory, Inference, and Learning Algorithms. Cambridge: Cambridge University Press.
Madani, O.; Hanks, S.; and Condon, A. 2003. On the Undecidability of Probabilistic Planning and Related Stochastic Optimization Problems. Artificial Intelligence 147:5-34. doi:10.1016/S0004-3702(02)00378-8
McCallum, A. K. 1996. Reinforcement Learning with Selective Perception and Hidden State. Ph.D. Dissertation, Department of Computer Science, University of Rochester.
Ng, A. Y.; Coates, A.; Diel, M.; Ganapathi, V.; Schulte, J.; Tse, B.; Berger, E.; and Liang, E. 2004. Autonomous Inverted Helicopter Flight via Reinforcement Learning. In ISER, volume 21 of Springer Tracts in Advanced Robotics, 363-372. Springer. doi:10.1007/11552246_35
Pankov, S. 2008. A Computational Approximation to the AIXI Model. In Proc. 1st Conference on Artificial General Intelligence, volume 171, 256-267.
Pearlmutter, B. A. 1989. Learning State Space Trajectories in Recurrent Neural Networks. Neural Computation 1(2):263-269. doi:10.1162/neco.1989.1.2.263
Poland, J., and Hutter, M. 2006. Universal Learning of Repeated Matrix Games. In Proc. 15th Annual Machine Learning Conf. of Belgium and The Netherlands (Benelearn'06), 7-14.
Poupart, P.; Vlassis, N. A.; Hoey, J.; and Regan, K. 2006. An Analytic Solution to Discrete Bayesian Reinforcement Learning. In Proc. 23rd International Conf. on Machine Learning (ICML'06), volume 148, 697-704. Pittsburgh, PA: ACM. doi:10.1145/1143844.1143932
Puterman, M. L. 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming. New York, NY: Wiley. doi:10.1002/9780470316887
Raedt, L. D.; Hammer, B.; Hitzler, P.; and Maass, W., eds. 2008. Recurrent Neural Networks - Models, Capacities, and Applications, volume 08041 of Dagstuhl Seminar Proceedings. IBFI, Schloss Dagstuhl, Germany.
Ring, M. 1994. Continual Learning in Reinforcement Environments. Ph.D. Dissertation, University of Texas, Austin.
Ross, S., and Pineau, J. 2008. Model-Based Bayesian Reinforcement Learning in Large Structured Domains. In Proc. 24th Conference in Uncertainty in Artificial Intelligence (UAI'08), 476-483. Helsinki: AUAI Press.
Ross, S.; Pineau, J.; Paquet, S.; and Chaib-draa, B. 2008. Online Planning Algorithms for POMDPs. Journal of Artificial Intelligence Research 32:663-704. doi:10.1613/jair.2567
Russell, S. J., and Norvig, P. 2003. Artificial Intelligence: A Modern Approach. Englewood Cliffs, NJ: Prentice-Hall, 2nd edition.
Sanner, S., and Boutilier, C. 2009. Practical Solution Techniques for First-Order MDPs. Artificial Intelligence 173(5-6):748-788. doi:10.1016/j.artint.2008.11.003
Schmidhuber, J. 2004. Optimal Ordered Problem Solver. Machine Learning 54(3):211-254. doi:10.1023/B:MACH.0000015880.99707.b2
Schwarz, G. 1978. Estimating the Dimension of a Model. Annals of Statistics 6(2):461-464. doi:10.1214/aos/1176344136
Singh, S.; Littman, M.; Jong, N.; Pardoe, D.; and Stone, P. 2003. Learning Predictive State Representations. In Proc. 20th International Conference on Machine Learning (ICML'03), 712-719.
Strehl, A. L.; Diuk, C.; and Littman, M. L. 2007. Efficient Structure Learning in Factored-State MDPs. In Proc. 22nd AAAI Conference on Artificial Intelligence, 645-650. Vancouver, BC: AAAI Press.
Sutton, R. S., and Barto, A. G. 1998. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press. doi:10.1109/TNN.1998.712192
Szita, I., and Lőrincz, A. 2008. The Many Faces of Optimism: A Unifying Approach. In Proc. 25th International Conf. on Machine Learning (ICML'08), volume 307.
Wallace, C. S. 2005. Statistical and Inductive Inference by Minimum Message Length. Berlin: Springer.
Willems, F. M. J.; Shtarkov, Y. M.; and Tjalkens, T. J. 1997. Reflections on the Prize Paper: The Context-Tree Weighting Method: Basic Properties. IEEE Information Theory Society Newsletter, 20-27.
Wolpert, D. H., and Macready, W. G. 1997. No Free Lunch Theorems for Optimization. IEEE Transactions on Evolutionary Computation 1(1):67-82. doi:10.1109/4235.585893