Discrete Uncertainty Quantification For Offline Reinforcement Learning

In many Reinforcement Learning (RL) tasks, the classical online interaction of the learning agent with the environment is impractical because such interaction is expensive or dangerous. In these cases, previously gathered data can be used, giving rise to what is typically called Offline RL. However, this type of learning faces many challenges, mostly derived from the fact that the exploration/exploitation trade-off is overshadowed. In addition, the historical data is usually biased by the way it was obtained, typically by a sub-optimal controller, producing a distributional shift between the historical data and the data required to learn the optimal policy. In this paper, we present a novel approach to deal with the uncertainty arising from the absence or sparse presence of some state-action pairs in the learning data. Our approach is based on shaping the reward perceived from the environment to ensure the task is solved. We present the approach and show that combining it with classic online RL methods makes them perform as well as state-of-the-art Offline RL algorithms such as CQL and BCQ. Finally, we show that using our method on top of established offline learning algorithms can improve them.
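
As an illustration of reward shaping driven by how rarely a state-action pair appears in a discrete offline dataset, the sketch below subtracts a count-based penalty from each logged reward. This is not the formulation from the paper, which the abstract does not specify. The function name `shape_rewards`, the `penalty` parameter, and the 1/sqrt(count) uncertainty proxy are all assumptions made for this example.

```python
from collections import Counter
import math


def shape_rewards(transitions, penalty=1.0):
    """Penalize rewards of rarely observed (state, action) pairs.

    `transitions` is a list of (state, action, reward, next_state) tuples
    with hashable discrete states and actions. The 1/sqrt(count) term is
    an assumed uncertainty proxy, not the paper's definition.
    """
    counts = Counter((s, a) for s, a, _, _ in transitions)
    shaped = []
    for s, a, r, s_next in transitions:
        # Uncertainty shrinks as the pair is seen more often in the data.
        uncertainty = 1.0 / math.sqrt(counts[(s, a)])
        shaped.append((s, a, r - penalty * uncertainty, s_next))
    return shaped


# Toy dataset: pair (0, 1) is seen four times, pair (1, 0) only once.
dataset = [
    (0, 1, 1.0, 1),
    (0, 1, 1.0, 1),
    (0, 1, 1.0, 1),
    (0, 1, 1.0, 1),
    (1, 0, 1.0, 2),
]
shaped = shape_rewards(dataset, penalty=1.0)
```

The shaped transitions could then be passed to a standard tabular learner such as Q-learning. The sparsely observed pair keeps less of its original reward, which discourages the learned policy from relying on poorly supported actions.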

Language: English

Publication frequency: 4 times per year
Journal subjects: Computer Science, Databases and Data Mining, Artificial Intelligence

José Luis Pérez

Javier Corrochano

Javier García

Rubén Majadas

Cristina Ibañez-Llano

Sergio Pérez

Fernando Fernández

Published: 30 Oct 2023

Page range: 273 - 287

Received: 02 Jun 2023

Accepted: 07 Oct 2023

DOI: https://doi.org/10.2478/jaiscr-2023-0019

Keywords: Off-line Reinforcement Learning, uncertainty quantification, Machine Learning

© 2023 José Luis Pérez et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
