

AP1E - ACM Papers Session 1E

Monday, June 3, 2024, 17:00-18:00 CEST
HG E 3




Presentations

17:00-17:30 CEST
Toward Improving Boussinesq Flow Simulations by Learning with Compressible Flow

In computational fluid dynamics, the Boussinesq approximation is a popular model for the numerical simulation of natural convection problems. Although using the Boussinesq approximation leads to significant performance gains over a full-fledged compressible flow simulation, the model is plausible only for scenarios where the temperature differences are relatively small, which limits its applicability. This paper bridges the gap between Boussinesq flow and compressible flow via deep learning: we introduce a computationally efficient CNN-based framework that corrects Boussinesq flow simulations by learning from the full compressible model. Based on a modified U-Net architecture and incorporating a weighted physics penalty loss, our model is trained and evaluated on a specific natural convection problem. Our results show that by correcting Boussinesq simulations with the trained network, we can enhance the accuracy of the velocity, temperature, and pressure variables over the Boussinesq baseline, even for cases beyond the regime of validity of the Boussinesq approximation.

Nurshat Mangnike and David Hyde (Vanderbilt University)
With Thorsten Kurth (NVIDIA Inc.)
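As a rough illustration of the weighted physics penalty loss mentioned in the abstract (a minimal sketch, not the authors' implementation; the function name, the residual argument, and the weight `lam` are all hypothetical), such an objective typically sums a data-fit term and a weighted penalty on a physics residual:

```python
import numpy as np

def weighted_physics_loss(pred, target, residual, lam=0.1):
    """Hypothetical training objective: mean-squared data-fit error
    plus a weighted penalty on a precomputed physics residual."""
    data_loss = np.mean((pred - target) ** 2)   # match the compressible reference
    physics_loss = np.mean(residual ** 2)       # penalize violation of the governing equations
    return data_loss + lam * physics_loss
```

The weight `lam` trades off fidelity to the reference data against physical consistency; the paper's actual weighting scheme may differ.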
17:30-18:00 CEST
SoftCache: A Software Cache for PCIe-Attached Hardware Accelerators

Hardware accelerators are used to speed up computationally expensive applications. Offloading tasks to accelerator cards requires data to be transferred between the memory of the host and the external memory of the accelerator card; this data movement becomes the bottleneck for increasing accelerator performance. Here, we explore the use of a software cache to optimize communication and alleviate the data-movement bottleneck by transparently exploiting locality and data reuse. We present a generic, application-agnostic framework, dubbed SoftCache, that can be used with GPU and FPGA accelerator cards. SoftCache exploits locality to optimize data movement in a non-intrusive manner (i.e., no algorithmic changes are necessary) and allows the programmer to tune the cache size, organization, and replacement policy toward the application needs. Each cache line can store data of any size, thereby eliminating the need for separate caches for different data types. We used a phylogenetic application to showcase SoftCache. Phylogenetics studies the evolutionary history and relationships among different species or groups of organisms. The phylogenetic application implements a tree-search algorithm to create and evaluate phylogenetic trees, while hardware accelerators are used to reduce the computation time of probability vectors at every tree node. Using SoftCache, we observed that the total number of bytes transferred during a complete run of the application was reduced by as much as 89%, resulting in up to 1.7x (81% of the theoretical peak) and 3.5x (75% of the theoretical peak) higher accelerator performance (as seen by the application) for a GPU and an FPGA accelerator, respectively.

Steven Wijnja and Nikolaos Alachiotis (University of Twente)
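The lookup-before-transfer idea behind SoftCache can be sketched as follows (a minimal illustration, not the actual SoftCache API; the class and method names are hypothetical). Before copying data over PCIe, the host first checks a software cache keyed by a data identifier; only on a miss is a transfer performed, with an LRU policy standing in for the tunable replacement policies the abstract describes:

```python
from collections import OrderedDict

class SoftCacheSketch:
    """Hypothetical host-side software cache for accelerator data.
    Each entry may hold a payload of any size; on a miss, load_fn
    stands in for the host-to-device PCIe transfer."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.lines = OrderedDict()  # key -> payload (any size)
        self.transfers = 0          # count of simulated PCIe transfers

    def get(self, key, load_fn):
        if key in self.lines:            # hit: no transfer needed
            self.lines.move_to_end(key)  # mark as most recently used
            return self.lines[key]
        data = load_fn(key)              # miss: perform the transfer
        self.transfers += 1
        self.lines[key] = data
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)  # evict least recently used
        return data
```

Counting `transfers` with and without the cache mirrors the paper's bytes-transferred measurement; the real framework additionally lets the programmer tune cache organization and supports variable-size cache lines.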