Volume 9 (2018): Issue 1 (March 2018)
Journal Details
Format: Journal
eISSN: 1946-0163
First Published: 23 Nov 2011
Publication timeframe: 2 times per year
Languages: English
Access type: Open Access

Towards General Evaluation of Intelligent Systems: Lessons Learned from Reproducing AIQ Test Results

Published Online: 07 Mar 2018
Volume & Issue: Volume 9 (2018) - Issue 1 (March 2018)
Page range: 1 - 54
Received: 17 Feb 2017
Accepted: 06 Feb 2018
Abstract

This paper attempts to replicate the results of evaluating several artificial agents using the Algorithmic Intelligence Quotient test originally reported by Legg and Veness. Three experiments were conducted: one using default settings, one in which the action space was varied, and one in which the observation space was varied. While the performance of freq, Q0, Qλ, and HLQλ corresponded well with the original results, the values obtained for MC-AIXI differed. Varying the observation space seems to have no qualitative impact on the results, as originally reported, while (contrary to the original results) varying the action space seems to have some impact. An analysis of how modifying the parameters of MC-AIXI affects its performance in the default settings was carried out with the help of data mining techniques used to identify high-performing configurations. Overall, the Algorithmic Intelligence Quotient test seems to be reliable; however, as a general artificial intelligence evaluation method it has several limits. The test is dependent on the chosen reference machine and is also sensitive to changes to its settings. It brings out some differences among agents; however, since these differences are limited in size, the test setting may not yet be sufficiently complex. A demanding parameter sweep is needed to thoroughly evaluate configurable agents, which, together with the test format, further highlights the computational requirements of an agent. These and other issues are discussed in the paper along with proposals suggesting how to alleviate them. An implementation of some of the proposals is also demonstrated.

References

Besold, T.; Hernández-Orallo, J.; and Schmid, U. 2015. Can Machine Intelligence be Measured in the Same Way as Human intelligence? KI - Künstliche Intelligenz 29(3):291-297. doi:10.1007/s13218-015-0361-4

Breiman, L.; Friedman, J. H.; Olshen, R. A.; and Stone, C. J. 1984. Classification and Regression Trees. Belmont: Thomson Wadsworth.

Bringsjord, S., and Schimanski, B. 2003. What Is Artificial Intelligence? Psychometric AI as an Answer. In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI’03), 887-893.

de Mey, M. 1992. The Cognitive Paradigm. Chicago and London: University of Chicago Press.

Dennett, D. C. 1991. Consciousness Explained. London: Penguin Books.

Descartes, R. 1637. A Discourse on Method. Oxford: Oxford University Press.

Dowe, D. L., and Hájek, A. R. 1998. A Non-Behavioural, Computational Extension to the Turing Test. In Proceedings of International Conference on Computational Intelligence & Multimedia Applications (ICCIMA’98), Gippsland, Australia, 101-106.

Goertzel, B. 2010. Toward a Formal Characterization of Real-World General Intelligence. In Baum, E.; Hutter, M.; and Kitzelmann, E., eds., Proceedings of the 3rd Conference on Artificial General Intelligence, AGI 2010, 19-24. Amsterdam-Beijing-Paris: Atlantis Press. doi:10.2991/agi.2010.17

Goertzel, B. 2014. Artificial General Intelligence: Concept, State of the Art, and Future Prospects. Journal of Artificial General Intelligence 5(1):1-48. doi:10.2478/jagi-2014-0001

Harnad, S. 1991. Other Bodies, Other Minds: A Machine Incarnation of an Old Philosophical Problem. Minds and Machines 1(1):43-54.

Hernández-Orallo, J., and Dowe, D. L. 2010. Measuring Universal Intelligence: Towards an Anytime Intelligence Test. Artificial Intelligence 174(18):1508-1539. doi:10.1016/j.artint.2010.09.006

Hernández-Orallo, J. 2000. Beyond the Turing Test. Journal of Logic, Language and Information 9(4):447-466. doi:10.1023/A:1008367325700

Hernández-Orallo, J. 2010. A (hopefully) Unbiased Universal Environment Class for Measuring Intelligence of Biological and Artificial Systems. In Baum, E.; Hutter, M.; and Kitzelmann, E., eds., Proceedings of the 3rd Conference on Artificial General Intelligence, AGI 2010, 182-183. Amsterdam-Beijing-Paris: Atlantis Press. doi:10.2991/agi.2010.18

Hernández-Orallo, J. 2015. C-Tests Revisited: Back and Forth with Complexity. In Bieger, J.; Goertzel, B.; and Potapov, A., eds., Proceedings of the 8th Conference on Artificial General Intelligence, AGI 2015, volume 9205 of Lecture Notes in Artificial Intelligence, 272-282. Berlin: Springer. doi:10.1007/978-3-319-21365-1_28

Hernández-Orallo, J. 2017. The Measure of All Minds. Cambridge: Cambridge University Press. doi:10.1017/9781316594179

Hibbard, B. 2009. Bias and No Free Lunch in Formal Measures of Intelligence. Journal of Artificial General Intelligence 1(1):54-61. doi:10.2478/v10229-011-0004-6

Hothorn, T.; Hornik, K.; and Zeileis, A. 2006. Unbiased Recursive Partitioning: A Conditional Inference Framework. Journal of Computational and Graphical Statistics 15(3):651-674. doi:10.1198/106186006X133933

Hutter, M., and Legg, S. 2007. Temporal Difference Updating without a Learning Rate. In Platt, J. C.; Koller, D.; Singer, Y.; and Roweis, S. T., eds., Advances in Neural Information Processing Systems 20, 705-712. Curran Associates, Inc.

Insa-Cabrera, J.; Dowe, D. L.; España-Cubillo, S.; Hernández-Lloreda, M. V.; and Hernández-Orallo, J. 2011. Comparing Humans and AI Agents. In Schmidhuber, J.; Thórisson, K. R.; and Looks, M., eds., Proceedings of the 4th Conference on Artificial General Intelligence, AGI 2011, volume 6830 of Lecture Notes in Artificial Intelligence, 122-132. Berlin: Springer. doi:10.1007/978-3-642-22887-2_13

Legg, S., and Hutter, M. 2007a. A Collection of Definitions of Intelligence. In Goertzel, B., and Wang, P., eds., Advances in Artificial General Intelligence: Concepts, Architectures and Algorithms, volume 157 of Frontiers in Artificial Intelligence and Applications. Amsterdam: IOS Press. 17-24.

Legg, S., and Hutter, M. 2007b. Universal Intelligence: A Definition of Machine Intelligence. Minds and Machines 17(4):391-444. doi:10.1007/s11023-007-9079-x

Legg, S., and Veness, J. 2011. AIQ: Algorithmic Intelligence Quotient [source codes]. https://github.com/mathemajician/AIQ. Accessed: 2017-06-26.

Legg, S., and Veness, J. 2013. An Approximation of the Universal Intelligence Measure. In Dowe, D. L., ed., Algorithmic Probability and Friends. Bayesian Prediction and Artificial Intelligence, volume 7070 of Lecture Notes in Computer Science. Berlin: Springer. 236-249.

Müller, U. 1993. dev/lang/brainfuck-2.lha in Aminet. http://aminet.net/package.php?package=dev/lang/brainfuck-2.lha. Accessed: 2017-06-26.

Schweizer, P. 2012. The Externalist Foundations of a Truly Total Turing Test. Minds and Machines 22(3):191-212. doi:10.1007/s11023-012-9272-4

Searle, J. R. 1980. Minds, Brains, and Programs. Behavioral and Brain Sciences 3(3):417-457. doi:10.1017/S0140525X00005756

Sun, R. 2007. The Importance of Cognitive Architectures: An Analysis Based on CLARION. Journal of Experimental & Theoretical Artificial Intelligence 19(2):159-193. doi:10.1080/09528130701191560

Turing, A. M. 1950. Computing Machinery and Intelligence. Mind 59(236):433-460. doi:10.1093/mind/LIX.236.433

Vadinský, O. 2015. Towards an Artificially Intelligent System: Possibilities of General Evaluation of Hybrid Paradigm. In Besold, T. R.; Lamb, L. C.; Icard, T.; and Miikkulainen, R., eds., Proceedings of the 10th International Workshop on Neural-Symbolic Learning and Reasoning NeSy’15, 23-29. Buenos Aires: IJCAI.

Veness, J.; Ng, K. S.; Hutter, M.; Uther, W.; and Silver, D. 2011. A Monte Carlo AIXI Approximation. Journal of Artificial Intelligence Research 40(1):95-142. doi:10.1613/jair.3125

Watkins, C. 1989. Learning from Delayed Rewards. Ph.D. Dissertation, Kings College, Cambridge, England.
