Minisymposium Presentation
Optimizing CRK-HACC for Performance Portability Using SYCL
Presenter
Esteban Rangel joined the Computational Science (CPS) division at Argonne National Laboratory as a staff scientist in 2021. He received his PhD in Computer Science from Northwestern University in 2018 and then became a postdoc at the Argonne Leadership Computing Facility (ALCF). He began contributing to the HACC codebase as a graduate student, when much of his PhD thesis work consisted of designing and implementing scalable analysis software for N-body cosmological simulations.
Description
In this talk, we discuss the development of the SYCL implementation of CRK-HACC, an extreme-scale cosmological simulation code with physics for resolving gas hydrodynamics. We describe our CUDA-to-SYCL migration pipeline for producing function objects and detail how we achieved a high level of “performance portability” across GPUs from AMD, Intel, and NVIDIA. Doing so required us to develop an abstraction over multiple “shuffle” implementations: the sycl::select_from_group function from SYCL 2020, a shuffle operation emulated via work-group local memory, and a highly specialized shuffle operation implemented in assembly (vISA) for Intel GPUs. To facilitate code maintainability, we also created abstractions for host-side code that is shared across HIP, SYCL, and CUDA. We believe our techniques will generalize well to other application domains and provide a balance of maintainability and performance portability.
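The idea of a single shuffle abstraction with interchangeable backends can be illustrated with a small host-side model in standard C++. This is only a sketch under stated assumptions, not the CRK-HACC code: a sub-group is modeled as a fixed-size array with one value per lane, and the names `SelectBackend`, `LocalMemBackend`, `lane_shuffle`, and `neighbor_sum` are hypothetical. The first backend stands in for a direct shuffle such as sycl::select_from_group; the second mimics an emulation through scratch memory, where every lane first publishes its value and then reads the value of its source lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical host-side model of a sub-group: one value per lane.
constexpr std::size_t kLanes = 8;
using Lanes   = std::array<float, kLanes>;
using LaneIds = std::array<std::size_t, kLanes>;

// Hypothetical backend tags; a real code would pick one per target GPU.
struct SelectBackend {};    // stands in for a direct shuffle (e.g. sycl::select_from_group)
struct LocalMemBackend {};  // stands in for a shuffle emulated via local memory

// Direct backend: each lane reads the value held by its source lane.
inline Lanes lane_shuffle(SelectBackend, const Lanes& x, const LaneIds& src) {
    Lanes out{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        out[lane] = x[src[lane] % kLanes];
    }
    return out;
}

// Emulated backend: publish every lane's value to scratch, then read.
inline Lanes lane_shuffle(LocalMemBackend, const Lanes& x, const LaneIds& src) {
    Lanes scratch{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        scratch[lane] = x[lane];  // phase 1: each lane writes its own slot
    }
    // On a GPU, a work-group barrier would separate the two phases.
    Lanes out{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        out[lane] = scratch[src[lane] % kLanes];  // phase 2: read source slot
    }
    return out;
}

// Kernel-like routine written once against the abstraction:
// each lane adds the value of its right-hand neighbor (with wraparound).
template <class Backend>
Lanes neighbor_sum(const Lanes& x) {
    LaneIds src{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        src[lane] = (lane + 1) % kLanes;
    }
    const Lanes neighbor = lane_shuffle(Backend{}, x, src);
    Lanes out{};
    for (std::size_t lane = 0; lane < kLanes; ++lane) {
        out[lane] = x[lane] + neighbor[lane];
    }
    return out;
}
```

Calling `neighbor_sum<SelectBackend>(x)` and `neighbor_sum<LocalMemBackend>(x)` on the same input should give identical results, which is the property that lets kernel code stay unchanged while the backend is chosen per target.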