
P46 - Scaling Laws for Machine-Learned Reconstruction


Description

Machine Learning (ML) methods have been successfully applied to various High Energy Physics (HEP) problems, such as particle identification, event reconstruction, jet tagging, and anomaly detection. However, the relationship between model size, i.e., the number of model parameters, and physics performance on different HEP tasks is not well understood. In this work, we empirically determine the scaling laws for commonly used ML model architectures, such as Graph Neural Networks (GNNs) and Transformers, on a challenging ML problem from HEP. The goal is to find out how much physics performance can be gained by increasing the model size, as opposed to investigating more complex model architectures. We also take into account memory usage and computational complexity, which are not directly related to model size. High Performance Computing resources are used to train and optimize the models on large-scale HEP datasets for supervised learning. We evaluate model performance in terms of accuracy, efficiency, and inference speed, and we observe that the optimal model size varies with the complexity and structure of the input data. Our work demonstrates the potential and the challenges of applying ML methods to HEP problems, and contributes to the advancement of both fields.
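As a rough illustration of what empirically determining a scaling law can involve, the sketch below fits a pure power law, loss ≈ a · N^(-alpha), to pairs of model size N and validation loss by linear regression in log-log space. The power-law form, the function names, and the data points are assumptions made for this example; the numbers are synthetic and are not results from this work.

```python
import numpy as np


def fit_power_law(n_params, loss):
    """Fit loss ~ a * n_params**(-alpha) by least squares in log-log space.

    Returns (a, alpha). The pure power-law form is an assumption of this
    sketch, not necessarily the functional form used by the authors.
    """
    log_n = np.log(np.asarray(n_params, dtype=float))
    log_loss = np.log(np.asarray(loss, dtype=float))
    # np.polyfit returns coefficients from highest degree to lowest,
    # so for degree 1 the result is [slope, intercept].
    slope, intercept = np.polyfit(log_n, log_loss, 1)
    return np.exp(intercept), -slope


def gain_from_scaling(alpha, factor):
    """Fractional loss reduction predicted when model size grows by `factor`."""
    return 1.0 - factor ** (-alpha)


# Synthetic, noise-free data generated from a known power law,
# used only to show that the fit recovers the exponent.
sizes = np.array([1e5, 1e6, 1e7, 1e8])
true_a, true_alpha = 5.0, 0.3
losses = true_a * sizes ** (-true_alpha)

a_hat, alpha_hat = fit_power_law(sizes, losses)
gain_10x = gain_from_scaling(alpha_hat, 10.0)
print(f"a = {a_hat:.3f}, alpha = {alpha_hat:.3f}, gain from 10x size = {gain_10x:.1%}")
```

With a fitted exponent in hand, `gain_from_scaling` gives a first estimate of how much a larger model might help, which can then be weighed against the extra memory and compute. Real measurements are noisy and may saturate at large sizes, so a fit like this is only a starting point.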

Presenter(s)

Presenter

Eric Wulff - CERN

Eric Wulff holds an MSc in Engineering Physics from Lund University and is a fellow in the IT department at CERN. He is the Task Leader for the use case on LHC collision event reconstruction at the European Center of Excellence in Exascale Computing (CoE RAISE). His experience includes large-scale distributed training and hyperparameter optimization of AI models on supercomputers, as well as using quantum computing for deep learning (DL) based algorithms. Prior to joining CERN, Eric was a Machine Learning Engineer at Axis Communications, where he worked on object detection and video analytics using DL techniques.

Authors