Markov Decision Processes (
This study investigates non-stationary finite-horizon
To account for fluctuations in rewards over time, the authors define new non-stationary finite-horizon
The proposed method calculates the optimal investment values that maximize the expected total return while accounting for the time value of money.
No previous study has examined dynamic finite-horizon problems that account for temporal fluctuations in rewards.
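
As an illustration of the kind of computation described above, the following is a minimal sketch of standard backward induction for a finite-horizon problem whose rewards and transitions vary with the time step, with a discount factor standing in for the time value of money. It is not the authors' formulation: the function name `backward_induction`, the array layout, the discount factor `gamma`, and the single-state toy investment data are all assumptions made for illustration.

```python
import numpy as np


def backward_induction(P, R, gamma):
    """Solve a non-stationary finite-horizon MDP by backward induction.

    P     : array (T, S, A, S), P[t, s, a, s2] = transition probability at step t
    R     : array (T, S, A),    R[t, s, a]     = immediate reward at step t
    gamma : per-step discount factor (assumed stand-in for time value of money)

    Returns V with shape (T + 1, S) and a time-indexed policy with shape (T, S).
    """
    T, S, A = R.shape
    V = np.zeros((T + 1, S))            # terminal value V[T] is assumed to be 0
    policy = np.zeros((T, S), dtype=int)
    for t in range(T - 1, -1, -1):
        # Q[s, a] = R_t(s, a) + gamma * sum_{s2} P_t(s2 | s, a) * V_{t+1}(s2)
        Q = R[t] + gamma * (P[t] @ V[t + 1])
        V[t] = Q.max(axis=1)
        policy[t] = Q.argmax(axis=1)
    return V, policy


# Hypothetical toy data: one state, two actions (0 = hold, 1 = invest),
# three periods, and an investment reward that fluctuates over time.
T, S, A = 3, 1, 2
R = np.zeros((T, S, A))
R[:, 0, 1] = [1.0, -1.0, 2.0]           # reward for investing at t = 0, 1, 2
P = np.ones((T, S, A, S))               # single state, so every transition stays put
gamma = 0.5

V, policy = backward_induction(P, R, gamma)
print(V[0, 0])                          # discounted value at the first period
print(policy[:, 0].tolist())            # chosen action at each period
```

In this toy instance the reward for investing changes sign across periods, so the computed policy invests in the first and last periods and holds in between, which is why the policy is indexed by time rather than by state alone.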