Allgower, F. and Zheng, A. (2012). Nonlinear Model Predictive Control, Springer, New York, NY.
Ariño, C., Querol, A. and Sala, A. (2017). Shape-independent model predictive control for Takagi–Sugeno fuzzy systems, Engineering Applications of Artificial Intelligence 65(1): 493–505, DOI: 10.1016/j.engappai.2017.07.011.
Armesto, L., Girbés, V., Sala, A., Zima, M. and Šmídl, V. (2015). Duality-based nonlinear quadratic control: Application to mobile robot trajectory-following, IEEE Transactions on Control Systems Technology 23(4): 1494–1504, DOI: 10.1109/TCST.2014.2377631.
Armesto, L., Moura, J., Ivan, V., Erden, M.S., Sala, A. and Vijayakumar, S. (2018). Constraint-aware learning of policies by demonstration, International Journal of Robotics Research 37(13–14): 1673–1689, DOI: 10.1177/0278364918784354.
Bertsekas, D.P. (2017). Dynamic Programming and Optimal Control, Vol. 1, 4th Edn, Athena Scientific, Belmont, MA.
Bertsekas, D.P. (2019). Reinforcement Learning and Optimal Control, Athena Scientific, Belmont, MA.
Busoniu, L., Babuska, R., De Schutter, B. and Ernst, D. (2010). Reinforcement Learning and Dynamic Programming Using Function Approximators, CRC Press, Boca Raton, FL.
Cervellera, C., Wen, A. and Chen, V.C. (2007). Neural network and regression spline value function approximations for stochastic dynamic programming, Computers & Operations Research 34(1): 70–90, DOI: 10.1016/j.cor.2005.02.043.
De Farias, D.P. and Van Roy, B. (2003). The linear programming approach to approximate dynamic programming, Operations Research 51(6): 850–865, DOI: 10.1287/opre.51.6.850.24925.
Deisenroth, M.P., Neumann, G. and Peters, J. (2013). A survey on policy search for robotics, Foundations and Trends in Robotics 2(1–2): 1–142, DOI: 10.1561/2300000021.
Díaz, H., Armesto, L. and Sala, A. (2019). Metodología de programación dinámica aproximada para control óptimo basada en datos [Approximate dynamic programming methodology for data-based optimal control], Revista Iberoamericana de Automática e Informática Industrial 16(3): 273–283, DOI: 10.4995/riai.2019.10379.
Díaz, H., Armesto, L. and Sala, A. (2020). Fitted Q-function control methodology based on Takagi–Sugeno systems, IEEE Transactions on Control Systems Technology 28(2): 477–488, DOI: 10.1109/TCST.2018.2885689.
Lagoudakis, M.G. and Parr, R. (2003). Least-squares policy iteration, Journal of Machine Learning Research 4(Dec): 1107–1149.
Lewis, F.L. and Liu, D. (2013). Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, Wiley, Hoboken, NJ, DOI: 10.1002/9781118453988.
Lewis, F.L. and Vrabie, D. (2009). Reinforcement learning and adaptive dynamic programming for feedback control, IEEE Circuits and Systems Magazine 9(3): 32–50, DOI: 10.1109/MCAS.2009.933854.
Lewis, F., Vrabie, D. and Syrmos, V. (2012). Optimal Control, 3rd Edn, John Wiley & Sons, Hoboken, NJ, DOI: 10.1002/9781118122631.
Liu, D., Wei, Q., Wang, D., Yang, X. and Li, H. (2017). Adaptive Dynamic Programming with Applications in Optimal Control, Springer, Berlin, DOI: 10.1007/978-3-319-50815-3.
Manne, A.S. (1960). Linear programming and sequential decisions, Management Science 6(3): 259–267, DOI: 10.1287/mnsc.6.3.259.
Marsh, L.C. and Cormier, D.R. (2001). Spline Regression Models, Number 137, Sage, Thousand Oaks, CA, DOI: 10.4135/9781412985901.
Munos, R., Baird, L.C. and Moore, A.W. (1999). Gradient descent approaches to neural-net-based solutions of the Hamilton–Jacobi–Bellman equation, International Joint Conference on Neural Networks, Washington, DC, USA, Vol. 3, pp. 2152–2157.
Munos, R. and Szepesvári, C. (2008). Finite-time bounds for fitted value iteration, Journal of Machine Learning Research 9(May): 815–857.
Powell, W.B. (2011). Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2nd Edn, Wiley, Hoboken, NJ, DOI: 10.1002/9781118029176.
Preitl, S., Precup, R.-E., Preitl, Z., Vaivoda, S., Kilyeni, S. and Tar, J.K. (2007). Iterative feedback and learning control. Servo systems applications, IFAC Proceedings Volumes 40(8): 16–27.
Rantzer, A. (2006). Relaxed dynamic programming in switching systems, IEE Proceedings: Control Theory and Applications 153(5): 567–574, DOI: 10.1049/ip-cta:20050094.
Robles, R., Sala, A. and Bernal, M. (2019). Performance-oriented quasi-LPV modeling of nonlinear systems, International Journal of Robust and Nonlinear Control 29(5): 1230–1248, DOI: 10.1002/rnc.4444.
Sutton, R.S. and Barto, A.G. (2018). Reinforcement Learning: An Introduction, 2nd Edn, MIT Press, Cambridge, MA.
Tan, K., Zhao, S. and Xu, J. (2007). Online automatic tuning of a proportional integral derivative controller based on an iterative learning control approach, IET Control Theory & Applications 1(1): 90–96, DOI: 10.1049/iet-cta:20050004.
Zajdel, R. (2013). Epoch-incremental reinforcement learning algorithms, International Journal of Applied Mathematics and Computer Science 23(3): 623–635, DOI: 10.2478/amcs-2013-0047.
Zhao, D., Liu, J., Wu, R., Cheng, D. and Tang, X. (2019). An active exploration method for data efficient reinforcement learning, International Journal of Applied Mathematics and Computer Science 29(2): 351–362, DOI: 10.2478/amcs-2019-0026.