Minisymposium Presentation

Julia-Based Multitask Surrogate Models for Heterogeneous Data Generated by Physical Models

Tuesday, June 4, 2024
17:30 - 18:00 CEST
Climate, Weather and Earth Sciences
Chemistry and Materials
Computer Science and Applied Mathematics
Humanities and Social Sciences
Engineering
Life Sciences
Physics

Description

Physical data is increasingly openly accessible, though it can be difficult to rank the accuracy of different information sources definitively. We demonstrate that multitask Gaussian process regression can leverage “datasets of opportunity” to construct surrogate models efficiently. In particular, we consider training sets constructed from coupled-cluster (CC) data and from density functional theory (DFT) data generated with multiple exchange-correlation functional approximations. The cost of a CC calculation scales as N^7, where N is the number of atoms in the system, whereas DFT exhibits comparatively tractable N^3 scaling. We report that multitask surrogates can predict at CC-level accuracy while reducing data generation cost by more than an order of magnitude. This interdisciplinary effort has been facilitated by Julia packages for atomistic computation and for the custom design of optimization and Gaussian process models. If time permits, we will discuss extending our computational models to produce calibrated uncertainty indicators for each prediction.
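
To make the multitask idea concrete, the following is a minimal sketch of multitask Gaussian process regression, written in Python with NumPy rather than with the Julia packages used in the talk. It assumes an intrinsic coregionalization kernel (a task covariance matrix `B` multiplied by a squared-exponential input kernel), which the abstract does not specify. The functions `expensive` and `cheap` are hypothetical one-dimensional stand-ins for a costly and an inexpensive data source, and the hyperparameters are fixed by hand rather than optimized.

```python
import numpy as np

def rbf(xa, xb, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def icm_kernel(xa, ta, xb, tb, B, lengthscale=1.0):
    """Assumed multitask covariance: B[t, t'] * k(x, x')."""
    return B[np.ix_(ta, tb)] * rbf(xa, xb, lengthscale)

def multitask_gp_predict(x, t, y, x_new, t_new, B, noise=1e-4, lengthscale=1.0):
    """Posterior mean and variance of a zero-mean multitask GP."""
    K = icm_kernel(x, t, x, t, B, lengthscale) + noise * np.eye(len(x))
    K_s = icm_kernel(x_new, t_new, x, t, B, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    prior_var = B[t_new, t_new]  # k(x, x) = 1 for the squared-exponential kernel
    var = prior_var - np.sum(v * v, axis=0)
    return mean, var

# Hypothetical toy data sources: task 0 is "expensive", task 1 is "cheap".
def expensive(x):
    return np.sin(x)

def cheap(x):
    return np.sin(x) + 0.2 * np.cos(2.0 * x)

x_hi = np.array([0.5, 2.5, 4.5])      # few expensive samples
x_lo = np.linspace(0.0, 6.0, 13)      # many cheap samples
x = np.concatenate([x_hi, x_lo])
t = np.concatenate([np.zeros(3, dtype=int), np.ones(13, dtype=int)])
y = np.concatenate([expensive(x_hi), cheap(x_lo)])

# Assumed task covariance: the two sources are strongly but not perfectly correlated.
B = np.array([[1.0, 0.9], [0.9, 1.0]])

x_new = np.linspace(0.0, 6.0, 25)
t_new = np.zeros(25, dtype=int)       # predict at the expensive-task level
mean, var = multitask_gp_predict(x, t, y, x_new, t_new, B)
```

Setting `B` to the identity matrix decouples the tasks, so the cheap samples no longer inform the expensive-task prediction; the off-diagonal entries of `B` are what allow the inexpensive data to stand in for additional costly samples.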

Authors