
Minisymposium Presentation

Managing Converged HPC and Cloud Architectures with CSM/OCHAMI in OpenCUBE

Wednesday, June 5, 2024
11:30 - 12:00 CEST
Climate, Weather and Earth Sciences
Chemistry and Materials
Computer Science and Applied Mathematics
Humanities and Social Sciences
Engineering
Life Sciences
Physics

Presenter

Nina Mujkanovic - HPE

Nina Mujkanovic joined the EMEA Research Lab in 2017, where she contributes to various European-funded projects as a research engineer. Prior to joining Cray, she was part of the HPC system administration team at the University of Bern in Switzerland, where she also earned an M.Sc. in computer science. During her studies, she specialized in advanced information processing, with a particular emphasis on machine learning; her thesis developed a deep neural network for detecting pathologies in the retina. Nina’s interests include deep learning, high-performance computing, containerization, orchestration, and open source.

Description

The convergence of cloud and HPC technologies has become a major theme in recent years. Virtualization and orchestration are increasingly used to offer an integrated workflow experience across heterogeneous platforms, be it a supercomputer or a web service. Within the OpenCUBE project, we aim to develop an innovative full-stack solution for a European cloud computing blueprint that bridges this continuum while incorporating European Processor Initiative hardware.

Cray System Management (CSM) is a cloud-based system management platform that merges microservices and cloud technologies with HPC software to enable the management of large-scale supercomputers. OCHAMI, launched as an open community effort by LANL, LBNL, NERSC, CSCS, HPE, and Bristol University, extends the CSM implementation to offer additional, tailored solutions.

In this talk, we explore the differences and commonalities between the cloud and HPC approaches to computing. We present the new OpenCUBE software and hardware stack, centered around CSM and OCHAMI in combination with a high-performance interconnect, and discuss how it is intended to solve cloud/HPC integration issues at the architecture and cluster management levels. Finally, we give an outlook on current and future developments in the field.

Authors