Paper

AP2E - ACM Papers Session 2E

Tuesday, June 4, 2024
14:00 - 15:30 CEST
HG E 3


Presentations

14:00 - 14:30 CEST
Efficient Computation of Large-Scale Statistical Solutions to Incompressible Fluid Flows

This work presents the development, performance analysis, and subsequent optimization of a GPU-based spectral hyperviscosity solver for turbulent flows described by the three-dimensional incompressible Navier-Stokes equations. The method solves for the fluid velocity fields directly in Fourier space, eliminating the need to solve a large-scale linear system of equations for the pressure field. Special focus is placed on the communication-intensive transpose operation required by the fast Fourier transform under distributed-memory parallelism. After multiple iterations of benchmarking and improving the code, the simulation achieves close to optimal performance on the Piz Daint supercomputer, even outperforming the Cray MPI implementation on Piz Daint in its communication routines. This near-optimal performance enables the computation of large-scale statistical solutions of incompressible fluid flows in three space dimensions.
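
As a rough illustration of why working in Fourier space removes the pressure solve: for each wavevector k the incompressibility constraint becomes k · û = 0, so the pressure gradient can be eliminated by projecting each Fourier coefficient onto the plane orthogonal to k. The following is a minimal pure-Python sketch of that projection for a single mode. The names `leray_project` and `divergence` are hypothetical and are not taken from the presented solver, which runs on GPUs with distributed FFTs.

```python
def leray_project(k, u_hat):
    """Project one Fourier coefficient onto the divergence-free subspace.

    k      -- wavevector (kx, ky, kz) as real numbers
    u_hat  -- complex Fourier coefficient of the velocity (ux, uy, uz)
    Hypothetical helper for illustration only.
    """
    k2 = sum(ki * ki for ki in k)
    if k2 == 0:
        # The mean mode k = 0 carries no divergence and is left unchanged.
        return tuple(u_hat)
    # Component of u_hat along k: the part a pressure gradient would cancel.
    k_dot_u = sum(ki * ui for ki, ui in zip(k, u_hat))
    return tuple(ui - ki * k_dot_u / k2 for ki, ui in zip(k, u_hat))


def divergence(k, u_hat):
    """Fourier-space divergence k . u_hat (the factor i is omitted)."""
    return sum(ki * ui for ki, ui in zip(k, u_hat))


# Example mode with arbitrary, made-up values.
k = (1.0, 2.0, -1.0)
u_hat = (1 + 2j, -0.5j, 3.0)
projected = leray_project(k, u_hat)
```

After the projection, `divergence(k, projected)` vanishes up to rounding error, and applying `leray_project` a second time leaves the coefficient unchanged.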

Tobias Rohner and Siddhartha Mishra (ETH Zurich)
With Thorsten Kurth (NVIDIA Inc.)
14:30 - 15:00 CEST
Efficient Parallel Strategies For Conjugate Heat Transfer Problems

Temperature boundary conditions in thermal fluids have conventionally been approached as Robin-type boundary conditions. However, with the emergence of supercomputing capabilities, there is the opportunity to explore the solution of heat transfer in the surrounding domains and establish a strong coupling with the temperature equation in the fluid, giving rise to what is known as Conjugate Heat Transfer problems. This paper introduces two strategies based on volume and surface algebraic couplings, solved using either a block Gauss-Seidel method or a block Jacobi method. The volume coupling implies solving the heat transfer problem in the fluid and solid monolithically and coupling it to the Navier-Stokes equations solved in the fluid. In the case of surface coupling, on the other hand, the Boussinesq system is solved within the fluid and then coupled to the solid through their shared interface. A comparative analysis of these approaches is presented, considering both algorithmic and computational performance within the framework of a multi-code coupling strategy. When such problems are executed in parallel, one must decide how to distribute the cores among the coupled codes. We propose a method that overloads the computational nodes, allowing each code to use all of the available resources. To enhance efficiency, the overloading approach is implemented with a barrier, using the DLB library, to mitigate the busy wait induced by MPI subroutines during data exchange. A practical example demonstrates a nearly twofold speedup of the proposed method over a classical approach when employing volume coupling.
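
The difference between the two coupling iterations named above can be sketched on a toy problem. In the example below, two scalar unknowns stand in for the coupled blocks (for instance a fluid and a solid solve). This is an assumption made purely for illustration; the function `solve_coupled` and the coefficients are hypothetical and do not come from the paper. With block Jacobi, both blocks use the previous iterate and could therefore run concurrently; with block Gauss-Seidel, the second block uses the freshly updated first block.

```python
def solve_coupled(a, b, method, tol=1e-12, max_iter=1000):
    """Iterate a 2x2 coupled system a @ x = b, one scalar per 'block'.

    method -- "jacobi": both blocks use the previous iterate
              "gauss-seidel": block 2 uses the updated value of block 1
    Returns (x1, x2, iterations). Hypothetical helper for illustration.
    """
    if method not in ("jacobi", "gauss-seidel"):
        raise ValueError("unknown method: " + method)
    x1, x2 = 0.0, 0.0
    for iteration in range(1, max_iter + 1):
        # Block 1 solve, using the latest available value of block 2.
        x1_new = (b[0] - a[0][1] * x2) / a[0][0]
        # Block 2 sees either the old or the new value of block 1.
        x1_seen = x1 if method == "jacobi" else x1_new
        x2_new = (b[1] - a[1][0] * x1_seen) / a[1][1]
        change = max(abs(x1_new - x1), abs(x2_new - x2))
        x1, x2 = x1_new, x2_new
        if change < tol:
            return x1, x2, iteration
    raise RuntimeError("coupling iteration did not converge")


# Made-up, diagonally dominant coefficients; exact solution is (0.1, 0.6).
a = [[4.0, 1.0], [2.0, 3.0]]
b = [1.0, 2.0]
jacobi = solve_coupled(a, b, "jacobi")
gauss_seidel = solve_coupled(a, b, "gauss-seidel")
```

On this toy system both variants converge to the same solution, and Gauss-Seidel needs fewer iterations, while Jacobi exposes more concurrency between the blocks. How this trade-off plays out for the actual coupled codes is what the paper's comparison addresses.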

Guillaume Houzeaux, Simon Santoso, Marta Garcia-Gasulla, Cristóbal Samaniego, and Hadrien Calmet (Barcelona Supercomputing Center)
With Thorsten Kurth (NVIDIA Inc.)
15:00 - 15:30 CEST
Parallel Algorithms for Intersection Computation

This paper discusses parallel algorithms for computing intersections between pairs of meshes. We used parallel intersection algorithms to compute interpolation weights in coupled solvers which are part of multi-physics simulations. We present a parallel algorithm for computing intersections that has linear computational complexity. We analyze the computation and communication complexities of this algorithm, along with lower bounds for parallel intersection computation. The algorithm has low contention and can be executed on many-core CPUs or offloaded to GPUs. We present strong scaling results for this algorithm on a heterogeneous machine with multiple GPUs per node.
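
To make the idea of linear-time mesh intersection and interpolation weights concrete, here is a serial, one-dimensional sketch. It is not the algorithm presented in the paper, whose details are not given in this abstract; it only illustrates how a sweep over two sorted meshes finds all overlapping cell pairs in time proportional to the total number of cells. The function names and the example meshes are hypothetical.

```python
def interval_mesh_overlaps(nodes_a, nodes_b):
    """Overlap lengths between cells of two sorted 1D meshes.

    nodes_a, nodes_b -- strictly increasing node coordinates; cell i spans
                        nodes[i] .. nodes[i + 1]
    Returns a list of (cell_a, cell_b, overlap_length) with positive length.
    Each loop step advances one cell, so the work is
    O(len(nodes_a) + len(nodes_b)).
    """
    overlaps = []
    i, j = 0, 0
    while i < len(nodes_a) - 1 and j < len(nodes_b) - 1:
        lo = max(nodes_a[i], nodes_b[j])
        hi = min(nodes_a[i + 1], nodes_b[j + 1])
        if hi > lo:
            overlaps.append((i, j, hi - lo))
        # Advance the cell that ends first; it cannot overlap later cells.
        if nodes_a[i + 1] <= nodes_b[j + 1]:
            i += 1
        else:
            j += 1
    return overlaps


def interpolation_weights(nodes_a, nodes_b):
    """Fraction of each target cell (mesh b) covered by each source cell (mesh a)."""
    return [
        (i, j, length / (nodes_b[j + 1] - nodes_b[j]))
        for i, j, length in interval_mesh_overlaps(nodes_a, nodes_b)
    ]


# Two made-up meshes covering the same interval [0, 2].
weights = interpolation_weights([0.0, 1.0, 2.0], [0.0, 0.5, 2.0])
```

When both meshes cover the same interval, the weights belonging to each target cell sum to one, which is the property a conservative interpolation between coupled solvers relies on.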

Aparna Sasidharan (Illinois Institute of Technology)
With Thorsten Kurth (NVIDIA Inc.)