Paper

MultIO: A Framework for Message-Driven Data Routing For Weather and Climate Simulations

Tuesday, June 4, 2024, 14:00 - 14:30 CEST
Climate, Weather and Earth Sciences
Chemistry and Materials
Computer Science and Applied Mathematics
Humanities and Social Sciences
Engineering
Life Sciences
Physics

Presenter

Domokos Sármány - ECMWF

Domokos Sármány has more than 15 years of academic and industrial expertise in scientific computing, numerical modelling and message-driven data routing. Domokos is currently leading the Model Data Services team at ECMWF. The team's main focus is efficient, asynchronous data output and on-the-fly post-processing for Earth-system models, as well as providing a plugin framework for coupling Earth-system models with impact models. Domokos is also responsible for leading the development and integration of MultIO -- the set of software libraries used for data output and in-memory data processing in ECMWF's weather forecasting system.

Description

In numerical weather prediction and high-performance computing, the primary computational bottleneck has gradually shifted from floating-point arithmetic to the throughput of data to and from storage, a phenomenon commonly referred to as the I/O performance gap. We present MultIO, a set of software libraries that provide two mechanisms to mitigate this effect: an asynchronous I/O-server that decouples data output from model computations, and user-programmable processing pipelines that operate directly on model output. MultIO is a metadata-driven, message-based system: the I/O-server and processing pipelines operate on discrete, self-describing messages, and the behaviour of the I/O-server, the data routing decisions and the selection of actions are all driven by the metadata attached to each message. Users control the type and amount of post-processing by setting the message metadata via the Fortran/C/Python APIs and by configuring a processing pipeline of actions; they can also implement custom actions and incorporate them into the pipelines. MultIO has been used with the NEMOv4 model to produce the upcoming ocean re-analysis dataset, which will feed into the production runs of the next-generation global re-analysis dataset, ERA6. It has also been used to move computation closer to the model for climate runs at scale in the nextGEMS and Destination Earth projects.
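
The two mechanisms described above can be illustrated with a small, self-contained Python sketch. It does not use the MultIO API: every name in it (`Message`, `select`, `scale`, `sink`, `run_pipeline`, `io_server`) is hypothetical and chosen only for illustration, and the metadata keys and field values are made up. The sketch assumes that a message is a metadata dictionary plus a payload, that a pipeline is an ordered list of actions, and that an action may drop a message by returning `None`.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Message:
    """A self-describing message: metadata plus field values (hypothetical)."""
    metadata: dict
    payload: list


# An action takes a message and returns a (possibly new) message,
# or None to stop routing that message any further.
Action = Callable[[Message], Optional[Message]]


def select(**criteria) -> Action:
    """Pass on only messages whose metadata matches all the criteria."""
    def action(msg: Message) -> Optional[Message]:
        if all(msg.metadata.get(k) == v for k, v in criteria.items()):
            return msg
        return None
    return action


def scale(factor: float) -> Action:
    """Example post-processing step: multiply every value by a factor."""
    def action(msg: Message) -> Optional[Message]:
        return Message(dict(msg.metadata), [v * factor for v in msg.payload])
    return action


def sink(store: list) -> Action:
    """Stand-in for writing to storage: append the message to a list."""
    def action(msg: Message) -> Optional[Message]:
        store.append(msg)
        return msg
    return action


def run_pipeline(actions: list, msg: Message) -> Optional[Message]:
    """Feed one message through the actions in order, stopping if dropped."""
    current: Optional[Message] = msg
    for act in actions:
        current = act(current)
        if current is None:
            return None
    return current


def io_server(inbox: queue.Queue, actions: list) -> None:
    """Consume messages on a separate thread until a None sentinel arrives."""
    while True:
        msg = inbox.get()
        if msg is None:
            break
        run_pipeline(actions, msg)


written: list = []
pipeline = [select(param="sst"), scale(0.5), sink(written)]

inbox: queue.Queue = queue.Queue()
server = threading.Thread(target=io_server, args=(inbox, pipeline))
server.start()

# The "model" hands messages over and carries on without waiting for output.
inbox.put(Message({"param": "sst", "step": 1}, [2.0, 4.0]))
inbox.put(Message({"param": "salinity", "step": 1}, [35.0]))
inbox.put(None)  # sentinel: no more output
server.join()
```

In this sketch the routing decision is taken purely from the metadata (`param`), so the producer never needs to know which actions will run, and the queue is what decouples the producer from the output work. A real I/O-server would typically run in separate processes or on dedicated nodes rather than in a thread.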

Authors