Minisymposium

MS1F - Energy Efficiency and Carbon Footprint of Earth System Modeling in Times of Spiraling Electricity Prices and Net-Zero Goals

Monday, June 3, 2024
11:30 - 13:30 CEST
HG D 1.2

Session Chair

Description

Rising energy costs and the drive to reach net-zero have recently brought energy-efficient high-performance computing (HPC) back to the forefront. In no application area is this topic more pressing than in Earth system modeling (ESM), which comes at a huge computational expense and where there is a strong moral imperative to minimize the negative impact of the modeling itself. Atmospheric models are progressing towards cloud-resolving scales that require exascale computing, while electricity is becoming scarcer. Computing centers are being challenged by rising energy budgets while transitioning to dramatically more powerful HPC platforms. Are the ESM and HPC communities doing enough to address these challenges? We have invited four speakers to talk about a wide range of energy and climate footprint perspectives, ranging from the tools needed for energy consumption analysis, to the concrete carbon footprint analysis of high-resolution global simulations on a state-of-the-art computing platform, and finally to the societal obligation our community faces to lead as a role model. While we can highlight our best efforts towards energy efficiency and carbon prudence, we will leave it to the audience to decide if we are living up to our obligation to the planet while still promoting this important research area.

Presentations

11:30 - 12:00 CEST
Walking the Walk: The ESM Community must be a Role Model in Energy Efficiency

Recently, in the face of mounting concern about climate change, the energy demand (and carbon footprint) of rapidly growing IT sectors such as cloud computing, cryptocurrencies, and generative AI has come under increasing scrutiny. Compared to these global IT trends, the carbon emissions of Earth system modeling activities, conducted by comparatively few scientists on HPC systems, might seem like small potatoes. But as ESMs continue to grow larger, more complex, and more data intensive, the amount of energy needed to run them continues to increase. The climate community must lead by example by measuring, reporting, and reducing these emissions. Measuring energy consumption will require improved infrastructure instrumentation and community-wide agreement on end-to-end metrics. Reporting will require institutional transparency and integrity. Realizing reductions through improved computational efficiency will be comparatively harder, and will require investment and innovation in algorithmic, software, and computational technologies.

Richard Loft (AreandDee LLC) and William Sawyer (ETH Zurich / CSCS)
With Thorsten Kurth (NVIDIA Inc.)
12:00 - 12:30 CEST
Energy Efficiency Analysis and Optimization for Present and Future, Possibly Heterogeneous, HPC Systems

This talk is split into two parts: a summary of the work at DKRZ on the energy efficiency of the present supercomputer Levante, and first insights from testing modular climate simulations on state-of-the-art hardware. The first part gives an overview of job performance monitoring, the energy measurement infrastructure, and the tuning options identified. Significant improvements in energy efficiency have already been deployed, based on both systematic benchmarking of characteristic workloads and investigation of idle node configurations. The second part shows first results on the potential of executing the functional units of modular simulations on their most suitable hardware in a heterogeneous cluster. Modules of the climate simulation ICON are first benchmarked individually, and are ultimately to be coupled again and distributed to their most energy-efficient platforms.

Pay Giesselmann (DKRZ)
With Thorsten Kurth (NVIDIA Inc.)
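
The energy measurement mentioned in the abstract above can be illustrated with a minimal sketch: given power samples for a job, energy-to-solution is the time integral of power. This is not the DKRZ tooling. The function names, the sampling format, and the sample values are hypothetical, and trapezoidal integration is simply one common choice.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power samples (watts) over time (seconds) with the trapezoidal rule.

    Hypothetical helper: assumes samples are ordered by time and that both
    sequences have the same length.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def joules_to_kwh(joules: float) -> float:
    """Convert joules to kilowatt-hours (1 kWh = 3.6e6 J)."""
    return joules / 3.6e6


if __name__ == "__main__":
    # Made-up samples for illustration only: one reading every 10 seconds.
    times = [0.0, 10.0, 20.0]
    watts = [100.0, 200.0, 100.0]
    print(joules_to_kwh(energy_joules(times, watts)), "kWh")
```

Comparing this quantity across node configurations or hardware platforms for the same workload is one way to rank them by energy efficiency.
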
12:30 - 13:00 CEST
EMOI: CSCS Extensible Monitoring and Observability Infrastructure

The Swiss National Supercomputing Centre (CSCS) is expanding its computational capabilities with the Alps architecture, an HPE Cray EX system incorporating around 5000 GH200 modules in addition to the pre-existing nodes. This expansion poses monitoring challenges due to hardware heterogeneity: the system includes AMD Rome CPUs, AMD MI250X and MI300 GPUs, NVIDIA A100 GPUs, and the Arm-based Grace Hopper GH200. Measures that decrease power usage can help reduce the operational costs and environmental impact of supercomputers. To address these challenges, CSCS has developed an Extensible Monitoring and Observability Infrastructure (EMOI), designed to manage the substantial data influx and provide insightful analysis of the infrastructure's behavior. EMOI integrates with Cray System Management (CSM) and the Cray System Monitoring Application (SMA), emphasizing a Kafka-centric approach for enhanced interoperability. We will delve into the structure and quality of the collected datasets, focusing on power consumption data. We hope that our experience will be beneficial not only to CSCS but also to other HPE/Cray sites facing similar challenges in supercomputing infrastructure management.

Jean-Guillaume Piccinali and Jonathan Coles (ETH Zurich / CSCS)
With Thorsten Kurth (NVIDIA Inc.)
13:00 - 13:30 CEST
The Energy Efficiency and Carbon Footprint of ICON Climate Model Simulations at CSCS

In an environmental impact case study of computing at the Swiss National Supercomputing Centre (CSCS), we consider the energy consumption of climate simulations currently planned on the new CSCS "Alps" computing platform. We use the Icosahedral Non-hydrostatic (ICON) model, which has been extensively refactored within the EXCLAIM project (https://c2sm.ethz.ch/research/exclaim.html) to improve its performance on the already highly energy-efficient NVIDIA Hopper Graphics Processing Units (GPUs). ICON simulations will use a substantial fraction of the overall Alps resources once the system is fully online. We briefly present these optimization efforts as well as our latest ICON benchmarks, and then extrapolate to the overall energy consumption needed for the scientific use cases planned within the EXCLAIM project. The energy projections for these runs are crucial for budgeting electricity, but accounting for the carbon footprint is also part of our societal obligation. As part of an ongoing carbon footprint study, we outline the energy efficiency of the CSCS infrastructure -- its power usage effectiveness and the power efficiency of the computing platform -- and quantify the CO2-equivalent emissions for these simulations. We leave it to the audience to decide if we are living up to our environmental obligations while still promoting this important research area.

William Sawyer (ETH Zurich / CSCS)
With Thorsten Kurth (NVIDIA Inc.)
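
The chain from compute energy to CO2-equivalent emissions described in the ICON abstract above can be sketched with the usual back-of-the-envelope relation: facility energy is IT energy multiplied by the power usage effectiveness (PUE), and emissions are facility energy multiplied by the carbon intensity of the electricity supply. The sketch below is an illustration under stated assumptions, not the method or the results of the study. All function names and all numbers are hypothetical placeholders.

```python
def it_energy_kwh(nodes: int, hours: float, avg_node_power_kw: float) -> float:
    """Energy drawn by the compute nodes themselves, in kWh (hypothetical helper)."""
    return nodes * hours * avg_node_power_kw


def co2e_kg(it_kwh: float, pue: float, grid_intensity_g_per_kwh: float) -> float:
    """Estimate CO2-equivalent emissions in kilograms.

    pue: power usage effectiveness, i.e. total facility energy divided by
         IT equipment energy (always >= 1).
    grid_intensity_g_per_kwh: grams of CO2-equivalent per kWh of electricity.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    facility_kwh = it_kwh * pue  # adds cooling and power-distribution overhead
    return facility_kwh * grid_intensity_g_per_kwh / 1000.0  # grams -> kilograms


if __name__ == "__main__":
    # Placeholder inputs for illustration only; not measured values.
    energy = it_energy_kwh(nodes=100, hours=24.0, avg_node_power_kw=2.0)
    print(co2e_kg(energy, pue=1.2, grid_intensity_g_per_kwh=50.0), "kg CO2e")
```

A fuller accounting would also include embodied emissions from manufacturing the hardware, which this operational estimate leaves out.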