[Agullo, E., Demmel, J., Dongarra, J., Hadri, B., Kurzak, J., Langou, J., Ltaief, H., Luszczek, P. and Tomov, S. (2009). Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, Journal of Physics: Conference Series180(1): 012037.10.1088/1742-6596/180/1/012037]Search in Google Scholar
[Amdahl, G.M. (1967). Validity of the single processor approach to achieving large scale computing capabilities, Proceedings of the Spring Joint Computer Conference, AFIPS’67 (Spring), Atlantic City, NJ, USA, pp. 483–485.10.1145/1465482.1465560]Search in Google Scholar
[Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A. and Sorensen, D. (1999). LAPACK Users’ Guide, 3rd Edn., SIAM, Philadelphia, PA.10.1137/1.9780898719604]Search in Google Scholar
[Buttari, A., Langou, J., Kurzak, J. and Dongarra, J. (2009). A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing35(1): 38–53.10.1016/j.parco.2008.10.002]Search in Google Scholar
[Bylina, B. (2018). The block WZ factorization, Journal of Computational and Applied Mathematics331(C): 119–132.10.1016/j.cam.2017.10.004]Search in Google Scholar
[Bylina, B. and Bylina, J. (2007). Incomplete WZ factorization as an alternative method of preconditioning for solving Markov chains, in R. Wyrzykowski et al. (Eds.), PPAM, Lecture Notes in Computer Science, Vol. 4967, Springer, Berlin/Heidelberg, pp. 99–107.10.1007/978-3-540-68111-3_11]Search in Google Scholar
[Bylina, B. and Bylina, J. (2009). Influence of preconditioning and blocking on accuracy in solving Markovian models, International Journal of Applied Mathematics and Computer Science19(2): 207–217, DOI: 10.2478/v10006-009-0017-3.10.2478/v10006-009-0017-3]Open DOISearch in Google Scholar
[Bylina, B. and Bylina, J. (2015). Strategies of parallelizing nested loops on the multicore architectures on the example of the WZ factorization for the dense matrices, in M. Ganzha et al. (Eds.), Proceedings of the 2015 Federated Conference on Computer Science and Information Systems, Annals of Computer Science and Information Systems, Vol. 5, IEEE, Piscataway, NJ, pp. 629–639.10.15439/2015F354]Search in Google Scholar
[Donfack, S., Dongarra, J., Faverge, M., Gates, M., Kurzak, J., Luszczek, P. and Yamazaki, I. (2015). A survey of recent developments in parallel implementations of Gaussian elimination, Concurrency and Computation: Practice and Experience27(5): 1292–1309.10.1002/cpe.3306]Search in Google Scholar
[Dongarra, J., DuCroz, J., Duff, I.S. and Hammarling, S. (1990). A set of level-3 basic linear algebra subprograms, ACM Transactions on Mathematics Software16(1): 1–17.10.1145/77626.79170]Search in Google Scholar
[Dongarra, J.J., Faverge, M., Ltaief, H. and Luszczek, P. (2013). Achieving numerical accuracy and high performance using recursive tile LU factorization, Concurrency and Computation: Practice and Experience26(6): 1408–1431.10.1002/cpe.3110]Search in Google Scholar
[Dumas, J.G., Gautier, T., Pernet, C., Roch, J.L. and Sultan, Z. (2016). Recursion based parallelization of exact dense linear algebra routines for Gaussian elimination, Parallel Computing57: 235–249.10.1016/j.parco.2015.10.003]Search in Google Scholar
[Evans, D. and Hatzopoulos, M. (1979). A parallel linear system solver, International Journal of Computer Mathematics7(3): 227–238.10.1080/00207167908803174]Search in Google Scholar
[Flynn, M.J. (1972). Some computer organizations and their effectiveness, IEEE Transactions on Computers21(9): 948–960.10.1109/TC.1972.5009071]Search in Google Scholar
[García, I., Merelo, J., Bruguera, J. and Zapata, E. (1990). Parallel quadrant interlocking factorization on hypercube computers, Parallel Computing15(1–3): 87–100.10.1016/0167-8191(90)90033-6]Search in Google Scholar
[Gustavson, F.G. (1997). Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM Journal of Research and Development41(6): 737–756.10.1147/rd.416.0737]Search in Google Scholar
[Intel (2019). Math Kernel Library, https://software.intel.com/en-us/mkl.]Search in Google Scholar
[Kurzak, J., Langou, J., Langou, C.D.J., Ltaief, H., Luszczek, P., Yarkhan, A., Haidar, A., Hoffman, J., Agullo, P.D.E., Buttari, A. and Hadri, B. (2010). PLASMA Users’ Guide: Parallel Linear Algebra Software for Multicore Architectures, Version 2.3., http://icl.cs.utk.edu/projectsfiles/plasma/pdf/users_guide.pdf.]Search in Google Scholar
[Marqués, M., Quintana-Ortí, G., Quintana-Ortí, E.S. and van de Geijn, R.A. (2011). Using desktop computers to solve large-scale dense linear algebra problems, The Journal of Supercomputing58(2): 145–150.10.1007/s11227-010-0394-2]Search in Google Scholar
[Rao, S.C.S. (1997). Existence and uniqueness of WZ factorization, Parallel Computing23(8): 1129–1139.10.1016/S0167-8191(97)00042-2]Search in Google Scholar
[Yalamov, P. and Evans, D. (1995). The WZ matrix factorisation method, Parallel Computing21(7): 1111–1120.10.1016/0167-8191(94)00088-R]Search in Google Scholar
[Yarkhan, A., Kurzak, J., Luszczek, P. and Dongarra, J. (2017). Porting the PLASMA numerical library to the OpenMP standard, International Journal of Parallel Programming45(3): 612–633, DOI:10.1007/s10766-016-0441-6.10.1007/s10766-016-0441-6]Open DOISearch in Google Scholar