
Paper

AP2B - ACM Papers Session 2B

Tuesday, June 4, 2024
14:00 - 15:30 CEST
HG F 3


Session Chair


Presentations

14:00 - 14:30 CEST
MultIO: A Framework for Message-Driven Data Routing For Weather and Climate Simulations

In numerical weather prediction and high-performance computing, the primary computational bottleneck has gradually shifted from floating-point arithmetic to the throughput of data to and from storage. This phenomenon is commonly referred to as the I/O performance gap. We present MultIO, a set of software libraries that provide two mechanisms to mitigate this effect: an asynchronous I/O-server to decouple data output from model computations, and user-programmable processing pipelines that operate on model output directly. MultIO is a metadata-driven, message-based system: the I/O-server and processing pipelines handle and operate on discrete self-describing messages. The behaviour of the I/O-server, the data routing decisions and the selection of actions are driven by the metadata attached to each message. The user may control the type and amount of post-processing by setting the message metadata via the Fortran/C/Python APIs, and by configuring a processing pipeline of actions. Users can also implement custom actions to be incorporated into the pipelines. The MultIO system has been used with the NEMOv4 model to implement the upcoming ocean re-analysis dataset, which will feed into the production runs of the next-generation global re-analysis dataset, ERA6. It has also been used to move computation closer to the model for climate runs at scale in the nextGEMS and Destination Earth projects.
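
The idea of a metadata-driven, message-based pipeline can be illustrated with a minimal Python sketch. This is not the MultIO API: the `Message` class, the `select`, `scale` and `sink` actions, and the `run_pipeline` helper are all hypothetical names chosen for illustration. It assumes only what the abstract states, namely that messages are self-describing and that the attached metadata decides which actions apply.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    """A self-describing message: metadata plus a payload of field values."""
    metadata: dict
    payload: list


# An action takes a message and returns a (possibly new) message,
# or None to stop the message from travelling further down the pipeline.
Action = Callable[[Message], Optional[Message]]


def select(**wanted) -> Action:
    """Hypothetical action: pass on only messages whose metadata matches."""
    def action(msg: Message) -> Optional[Message]:
        if all(msg.metadata.get(k) == v for k, v in wanted.items()):
            return msg
        return None
    return action


def scale(factor: float) -> Action:
    """Hypothetical custom action: multiply every payload value by a factor."""
    def action(msg: Message) -> Optional[Message]:
        return Message(dict(msg.metadata), [v * factor for v in msg.payload])
    return action


def sink(store: list) -> Action:
    """Hypothetical output action: append the message to an in-memory store."""
    def action(msg: Message) -> Optional[Message]:
        store.append(msg)
        return msg
    return action


def run_pipeline(actions: list, msg: Message) -> Optional[Message]:
    """Apply the actions in order; stop as soon as one of them drops the message."""
    for act in actions:
        msg = act(msg)
        if msg is None:
            return None
    return msg


archive = []
pipeline = [select(param="sst"), scale(0.5), sink(archive)]

# Only the first message matches the selection and reaches the sink.
run_pipeline(pipeline, Message({"param": "sst", "step": 1}, [2.0, 4.0]))
run_pipeline(pipeline, Message({"param": "salinity", "step": 1}, [35.0]))
```

In this sketch the routing decision lives entirely in the metadata: the model side only labels its output, and changing the post-processing means reconfiguring the list of actions rather than touching the code that emits the messages.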

Domokos Sarmany, Mirco Valentini, Pedro Maciel, Philipp Geier, Simon Smart, Razvan Aguridan, James Hawkes, and Tiago Quintino (ECMWF)
With Thorsten Kurth (NVIDIA Inc.)
14:30 - 15:00 CEST
Reducing the Impact of I/O Contention in Numerical Weather Prediction Workflows at Scale Using DAOS

Operational Numerical Weather Prediction (NWP) workflows are highly data-intensive. Data volumes have increased by many orders of magnitude over the last 40 years, and are expected to continue to do so, especially given the upcoming adoption of machine learning in forecast processes. Parallel POSIX-compliant file systems have been the dominant paradigm in data storage and exchange in HPC workflows for many years. This paper presents ECMWF's move beyond the POSIX paradigm, implementing a backend for its storage library to support DAOS, a novel high-performance object store designed for massively distributed Non-Volatile Memory. This system is demonstrated to outperform the highly mature and optimised POSIX backend when used under high load and contention, as is typical of forecast workflow I/O patterns. This work constitutes a significant step forward, beyond the performance constraints imposed by POSIX semantics.

Nicolau Manubens Gil (ECMWF, EPCC); Simon D. Smart, Emanuele Danovaro, and Tiago Quintino (ECMWF); and Adrian Jackson (EPCC)
15:00 - 15:30 CEST
Towards a GPU-Parallelization of the neXtSIM-DG Dynamical Core

The cryosphere plays a significant role in Earth's climate system. Therefore, an accurate simulation of sea ice is of great importance to improve climate projections. To enable higher-resolution simulations, graphics processing units (GPUs) have become increasingly attractive as they offer higher peak floating-point performance and better energy efficiency than CPUs. However, making use of this theoretical peak performance, which is based on massive data parallelism, usually requires more care and effort in the implementation. In recent years, a number of frameworks have become available that promise to simplify general-purpose GPU programming. In this work, we compare multiple such frameworks, including CUDA, SYCL, Kokkos and PyTorch, for the parallelization of neXtSIM-DG, a finite-element-based dynamical core for sea ice. We evaluate the different approaches according to their usability and performance.

Robert Jendersie (Otto-von-Guericke-Universität Magdeburg); Christian Lessig (ECMWF, Otto-von-Guericke-Universität Magdeburg); and Thomas Richter (Otto-von-Guericke-Universität Magdeburg)