Allamigeon, X., Boyet, M., Gaubert, S. (2021). Piecewise Affine Dynamical Models of Petri Nets–Application to Emergency Call Centers. Fundamenta Informaticae, 183(3–4), 169–201. DOI: 10.3233/FI-2021-2086.
Asadi, A., Pinkley, S.N., Mes, M. (2022). A Markov decision process approach for managing medical drone deliveries. Expert Systems With Applications, 204, 117490. DOI: 10.1016/j.eswa.2022.117490.
Bellman, R. (1958). Dynamic programming and stochastic control processes. Information and Control, 1(3), 228–239. DOI: 10.1016/S0019-9958(58)80003-0.
Bertsekas, D. (2012). Dynamic programming and optimal control: Volume I. Athena Scientific.
Bertsimas, D., Mišić, V.V. (2016). Decomposable Markov decision processes: A fluid optimization approach. Operations Research, 64(6), 1537–1555. DOI: 10.1287/opre.2016.1531.
Dulac-Arnold, G., Levine, N., Mankowitz, D.J., Li, J., Paduraru, C., Gowal, S., Hester, T. (2021). Challenges of real-world reinforcement learning: Definitions, benchmarks and analysis. Machine Learning, 110(9), 2419–2468. DOI: 10.1007/s10994-021-05961-4.
El Akraoui, B., Daoui, C., Larach, A. (2022). Decomposition Methods for Solving Finite-Horizon Large MDPs. Journal of Mathematics, 2022, Article ID 8404716. DOI: 10.1155/2022/8404716.
Emadi, H., Atkins, E., Rastgoftar, H. (2022). A Finite-State Fixed-Corridor Model for UAS Traffic Management. arXiv preprint arXiv:2204.05517.
Feinberg, E.A. (2016). Optimality conditions for inventory control. In Optimization Challenges in Complex, Networked and Risky Systems (pp. 14–45). INFORMS. DOI: 10.1287/educ.2016.0145.
Hordijk, A., Kallenberg, L.C.M. (1984). Transient policies in discrete dynamic programming: Linear programming including suboptimality tests and additional constraints. Mathematical Programming, 30(1), 46–70. DOI: 10.1007/BF02591798.
Howard, R.A. (1960). Dynamic programming and Markov processes. MIT Press, Cambridge, MA. https://books.google.co.ma/books?id=fXJEAAAAIAAJ.
Kallenberg, L.C.M. (1983). Linear programming and finite Markovian control problems, Math. Centre Tracts, 148, 1–245.
Larach, A., Chafik, S., Daoui, C. (2017). Accelerated decomposition techniques for large discounted Markov decision processes. Journal of Industrial Engineering International, 13(4), 417–426. DOI: 10.1007/s40092-017-0197-7.
Mao, W., Zheng, Z., Wu, F., Chen, G. (2018). Online Pricing for Revenue Maximization with Unknown Time Discounting Valuations. IJCAI, 440–446. DOI: 10.24963/ijcai.2018/61.
Pavitsos, A., Kyriakidis, E.G. (2009). Markov decision models for the optimal maintenance of a production unit with an upstream buffer. Computers & Operations Research, 36(6), 1993–2006. DOI: 10.1016/j.cor.2008.06.014.
Peng, H., Cheng, Y., Li, X. (2023). Real-Time Pricing Method for Spot Cloud Services with Non-Stationary Excess Capacity. Sustainability, 15(4), 3363.
Puterman, M.L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons Inc. DOI: 10.1002/9780470316887.
Rimélé, A., Grangier, P., Gamache, M., Gendreau, M., Rousseau, L.-M. (2021). E-commerce warehousing: Learning a storage policy. arXiv preprint arXiv:2101.08828. DOI: 10.48550/arXiv.2101.08828.
Spieksma, F., Nunez-Queija, R. (2015). Markov Decision Processes. Adaptation of the text by R. Nunez-Queija, 55.
Sutton, R.S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3(1), 9–44. DOI: 10.1007/BF00115009.
White III, C.C., White, D.J. (1989). Markov decision processes. European Journal of Operational Research, 39(1), 1–16. DOI: 10.1016/0377-2217(89)90348-2.
Wu, Y., Zhang, J., Ravey, A., Chrenko, D., Miraoui, A. (2020). Real-time energy management of photovoltaic-assisted electric vehicle charging station by Markov decision process. Journal of Power Sources, 476, 228504.
Ye, G., Lin, Q., Juang, T.-H., Liu, H. (2020). Collision-free Navigation of Human-centered Robots via Markov Games. 2020 IEEE International Conference on Robotics and Automation (ICRA), 11338–11344. DOI: 10.1109/ICRA40945.2020.9196810.
Ye, Y. (2011). The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate. Mathematics of Operations Research, 36(4), 593–603. DOI: 10.1287/moor.1110.0516.
Zhang, Y., Kim, C.-W., Tee, K.F. (2017). Maintenance management of offshore structures using Markov process model with random transition probabilities. Structure and Infrastructure Engineering, 13(8), 1068–1080. DOI: 10.1080/15732479.2016.1236393.