Metrics for Assessing Generalization of Deep Reinforcement Learning in Parameterized Environments


Robert Kirk, Amy Zhang, Edward Grefenstette, and Tim Rocktäschel. A Survey of Zero-shot Generalisation in Deep Reinforcement Learning. Journal of Artificial Intelligence Research, 76:201–264, January 2023. ISSN 1076-9757. doi:10.1613/jair.1.14174.

Katsuhiko Ogata. Modern Control Engineering. Prentice Hall, 2010. ISBN 978-0-13-615673-4.

Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. Second edition. The MIT Press, 2018. ISBN 978-0-262-03924-6.

Dimitri P. Bertsekas. Reinforcement Learning and Optimal Control. Athena Scientific, 2019. ISBN 978-1-886529-39-7.

Hiroki Furuta et al. Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning. In Proceedings of the 38th International Conference on Machine Learning, pages 3541–3552. PMLR, July 2021.

Richard S. Sutton, Michael H. Bowling, and Patrick M. Pilarski. The Alberta Plan for AI Research, August 2022.

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal Policy Optimization Algorithms. arXiv:1707.06347 [cs], August 2017.

John Schulman, Sergey Levine, Philipp Moritz, Michael I. Jordan, and Pieter Abbeel. Trust Region Policy Optimization. arXiv:1502.05477 [cs], April 2017.

Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. arXiv:1801.01290 [cs, stat], August 2018.

Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. arXiv:1509.02971 [cs, stat], July 2019.

Assaf Hallak, Dotan Di Castro, and Shie Mannor. Contextual Markov Decision Processes, February 2015.

Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan P. Adams, and Sergey Levine. Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability, July 2021.

Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. OpenAI Gym. arXiv:1606.01540 [cs], June 2016.

Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Siqi Liu, Steven Bohez, Josh Merel, Tom Erez, Timothy Lillicrap, Nicolas Heess, and Yuval Tassa. dm_control: Software and tasks for continuous control. Software Impacts, 6:100022, November 2020. ISSN 2665-9638. doi:10.1016/j.simpa.2020.100022.

Karl Cobbe, Chris Hesse, Jacob Hilton, and John Schulman. Leveraging Procedural Generation to Benchmark Reinforcement Learning. In Proceedings of the 37th International Conference on Machine Learning, pages 2048–2056. PMLR, November 2020.

Kevin Frans and Phillip Isola. Powderworld: A Platform for Understanding Generalization via Rich Task Distributions, November 2022.

Farama Foundation. Gymnasium, 2023. URL https://gymnasium.farama.org/.

Sumukh Aithal K, Dhruva Kashyap, and Natarajan Subramanyam. Robustness to Augmentations as a Generalization Metric. arXiv:2101.06459 [cs], January 2021.

OpenAI: Ilge Akkaya et al. Solving Rubik’s Cube with a Robot Hand, October 2019.

Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, and Sergey Levine. Soft Actor-Critic Algorithms and Applications. arXiv:1812.05905 [cs, stat], January 2019.

Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, and Shimon Whiteson. A Survey of Meta-Reinforcement Learning, January 2023.

Charles Packer, Katelyn Gao, Jernej Kos, Philipp Krähenbühl, Vladlen Koltun, and Dawn Song. Assessing Generalization in Deep Reinforcement Learning. March 2019.

Jianda Chen and Sinno Pan. Learning representations via a robust behavioral metric for deep reinforcement learning. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 36654–36666. Curran Associates, Inc., 2022.

Sam Witty, Jun K. Lee, Emma Tosch, Akanksha Atrey, Kaleigh Clary, Michael L. Littman, and David Jensen. Measuring and characterizing generalization in deep reinforcement learning. Applied AI Letters, 2(4), December 2021. ISSN 2689-5595. doi:10.1002/ail2.45.

Karl Cobbe, Oleg Klimov, Chris Hesse, Taehoon Kim, and John Schulman. Quantifying Generalization in Reinforcement Learning. In Proceedings of the 36th International Conference on Machine Learning, pages 1282–1289. PMLR, May 2019.

Qucheng Peng, Zhengming Ding, Lingjuan Lyu, Lichao Sun, and Chen Chen. RAIN: RegulArization on Input and Network for Black-Box Domain Adaptation. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pages 4118–4126. International Joint Conferences on Artificial Intelligence Organization, 2023. ISBN 978-1-956792-03-4. doi:10.24963/ijcai.2023/458. URL https://www.ijcai.org/proceedings/2023/458.

Qucheng Peng, Ce Zheng, and Chen Chen. Source-free Domain Adaptive Human Pose Estimation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 4826–4836, 2023. URL https://openaccess.thecvf.com/content/ICCV2023/html/Peng_Source-free_Domain_Adaptive_Human_Pose_Estimation_ICCV_2023_paper.html.

Xingyou Song, Yilun Du, and Jacob Jackson. An Empirical Study on Hyperparameters and their Interdependence for RL Generalization. June 2019.

Aravind Rajeswaran, Kendall Lowrey, Emanuel V. Todorov, and Sham M Kakade. Towards Generalization and Simplicity in Continuous Control. In Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017.

Philipp Moritz et al. Ray: A distributed framework for emerging AI applications. In Proceedings of the 13th USENIX Conference on Operating Systems Design and Implementation, OSDI’18, pages 561–577, USA, October 2018. USENIX Association. ISBN 978-1-931971-47-8.

Stephanie C. Y. Chan, Samuel Fishman, John Canny, Anoop Korattikara, and Sergio Guadarrama. Measuring the Reliability of Reinforcement Learning Algorithms, February 2020.

Denis Yarats, Rob Fergus, Alessandro Lazaric, and Lerrel Pinto. Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning. In Deep RL Workshop NeurIPS 2021, 2021.

Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, and Timothy Lillicrap. Mastering Diverse Domains through World Models, January 2023.
