Paper

Hybrid Parallel Tucker Decomposition of Streaming Data

Tuesday, June 4, 2024, 14:00 - 14:30 CEST
Climate, Weather and Earth Sciences
Chemistry and Materials
Computer Science and Applied Mathematics
Humanities and Social Sciences
Engineering
Life Sciences
Physics

Presenter

Eric T. Phipps, Sandia National Laboratories

Eric joined Sandia National Laboratories in September 2002 and is currently a principal member of the technical staff in the Scalable Algorithms Department. Eric’s research focuses on developing new capabilities for predictive simulation and analysis in Sandia’s large-scale parallel application codes using techniques based on automatic differentiation and template-based generic programming. His work has recently emphasized developing tools and techniques relevant to emerging extreme-scale computer architectures. He is the lead developer for several related software packages in Trilinos including the Sacado automatic differentiation, Stokhos embedded uncertainty quantification, and LOCA continuation/bifurcation analysis packages. Finally, he leads the development of the GenTen software for tensor-based analysis on emerging extreme-scale architectures.

Description

Tensor decompositions have emerged as powerful tools for multivariate data analysis, providing the foundation of numerous analysis methods. The Tucker decomposition in particular has been shown to be quite effective at compressing high-dimensional scientific data sets. However, applying these techniques to modern scientific simulation data is difficult because of the massive data volumes these codes can produce. Doing so requires scalable tensor decomposition methods that can exploit the hybrid parallelism available on modern computing architectures and that support in situ processing, so that decompositions can be computed as the simulations generate data. In this work, we address these challenges by presenting a first-ever hybrid parallel and performance-portable approach for Tucker decomposition of both batch and streaming data. Our work is based on the TuckerMPI package, which provides scalable, distributed-memory Tucker decomposition techniques, as well as on prior work on a sequential streaming Tucker decomposition algorithm. We extend TuckerMPI to hybrid parallelism through the use of the Kokkos/Kokkos-Kernels performance portability packages, develop a hybrid parallel streaming Tucker decomposition algorithm, and demonstrate the performance and portability of these approaches on a variety of large-scale scientific data sets on both CPU and GPU architectures.
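
For readers unfamiliar with the Tucker format, the following is a minimal single-process sketch in Python with NumPy of a sequentially truncated higher-order SVD, one standard way to compute a Tucker decomposition. It is illustrative only: it is not taken from TuckerMPI, Kokkos, or the streaming algorithm described above, and the function names (`unfold`, `mode_product`, `st_hosvd`, `reconstruct`) are hypothetical. It assumes the whole tensor fits in memory and that the target ranks are supplied by the caller.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize `tensor` so that `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along `mode` (the n-mode product)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def st_hosvd(tensor, ranks):
    """Sequentially truncated HOSVD; returns (core, factors)."""
    core = tensor
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of the current mode unfolding.
        u, _, _ = np.linalg.svd(unfold(core, mode), full_matrices=False)
        u = u[:, :rank]
        factors.append(u)
        # Shrink the working tensor along this mode before moving on.
        core = mode_product(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Expand a Tucker decomposition back to a full tensor."""
    full = core
    for mode, u in enumerate(factors):
        full = mode_product(full, u, mode)
    return full
```

As a usage illustration, a 20 x 15 x 10 tensor with exact multilinear rank (3, 4, 2) holds 3000 values, while its Tucker form stores a 3 x 4 x 2 core plus three factor matrices, 164 values in total. Calling `st_hosvd(x, (3, 4, 2))` on such a tensor and passing the result to `reconstruct` should recover `x` up to floating-point error.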

Authors