Code generation approaches for parallel geometric multigrid solvers

eISSN: 1844-0835
Language: English

Publication timeframe: Volume Open
Journal subjects: Mathematics, General

Published online: 28 Dec 2020

Page range: 123–152

Received: 10 July 2019

Accepted: 16 Dec 2019

DOI: https://doi.org/10.2478/auom-2020-0038

Keywords: code generation, domain-specific languages, multigrid solvers, elliptic PDEs

© 2020 Harald Köstler et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
