Paper
Performance Analysis and Optimizations of ERO2.0 Fusion Code
Presenter
I have been a researcher in the Computer Science department of the Barcelona Supercomputing Center (BSC) since 2006. My research interests include load balancing, parallel programming, performance analysis, and optimization. I co-lead the Best Practices for Performance and Productivity (BePPP) group at BSC. BePPP aims to be the bridge between scientific domain researchers and computer scientists, promoting best practices that help programmers productively (re)structure their codes for high efficiency and portability. The group also captures fundamental co-design input and forwards it to the appropriate system software and architecture teams, so that their developments target the most useful directions.
Description
In this paper, we present a thorough performance analysis of ERO2.0, a highly parallel Monte Carlo code for modeling global erosion and redeposition in fusion devices. The study shows that the main bottleneck preventing the code from using its resources efficiently is load imbalance at different levels. This imbalance is inherent to the problem being solved: particle transport and deposition. Based on the findings of the analysis, we also describe the optimizations implemented in the code to improve its performance on HPC clusters. The proposed optimizations rely on MPI and OpenMP features, which makes them portable across architectures, and achieve a 3.34x speedup.
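To illustrate why load imbalance matters for a particle-based Monte Carlo code, the following is a minimal sketch and not the ERO2.0 implementation. It assumes hypothetical per-particle costs (a few expensive particles among many cheap ones) and compares a static block distribution with a simple "next free worker" distribution, similar in spirit to OpenMP's `schedule(dynamic)`. The metric used, average busy time divided by maximum busy time, is a common way to quantify load balance; the function names and the cost values are assumptions made for this example only.

```python
import heapq


def load_balance_efficiency(busy_times):
    """Average busy time divided by maximum busy time (1.0 = perfectly balanced)."""
    return sum(busy_times) / (len(busy_times) * max(busy_times))


def static_assignment(costs, n_workers):
    """Split particles into contiguous, equal-sized blocks, one block per worker."""
    block = -(-len(costs) // n_workers)  # ceiling division
    return [sum(costs[w * block:(w + 1) * block]) for w in range(n_workers)]


def dynamic_assignment(costs, n_workers):
    """Hand each particle, in order, to whichever worker becomes free first."""
    heap = [(0.0, w) for w in range(n_workers)]  # (busy time so far, worker id)
    heapq.heapify(heap)
    for cost in costs:
        busy, worker = heapq.heappop(heap)
        heapq.heappush(heap, (busy + cost, worker))
    # Return busy times ordered by worker id.
    return [busy for busy, _ in sorted(heap, key=lambda entry: entry[1])]


# Hypothetical costs: two long-lived particles followed by six short-lived ones.
costs = [10.0, 10.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

static = static_assignment(costs, 4)
dynamic = dynamic_assignment(costs, 4)

print("static :", static, load_balance_efficiency(static))
print("dynamic:", dynamic, load_balance_efficiency(dynamic))
```

In this toy case the slowest worker finishes after 20 time units with the static split and after 10 with the dynamic one, so the elapsed time halves without changing the total amount of work. The actual optimizations and the 3.34x figure reported above come from the paper's own MPI and OpenMP changes, which this sketch does not reproduce.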