Minisymposium

MS4F - High Performance Graph Analytics

Tuesday, June 4, 2024
16:00
-
18:00
CEST
HG D 1.2


Session Chair

Description

Estimating the structure of, partitioning, and analyzing graphs are critical tasks in a plethora of applications. Problems in domains such as image processing, social network analysis, and classification via neural networks are often formulated as graph problems. At the same time, graph-analytic methods traditionally serve as important subtasks that enable the parallelization or the complexity reduction of an entire algorithmic workflow. This minisymposium samples recent advances in methods for graphs emerging from large-scale data, with a focus on performant, efficient, and scalable algorithms.

Presentations

16:00
-
16:30
CEST
Parallel Algorithms for Dynamic Graph Clustering

We consider the problem of incremental graph clustering, where the graph to be clustered is given as a sequence of disjoint subsets of the edge set. The problem appears when dealing with graphs that are created over time, such as online social networks where new users appear continuously, or protein interaction networks when new proteins are discovered. For very large graphs, it is computationally too expensive to repeatedly apply standard clustering algorithms. Instead, algorithms whose time complexity depends only on the size of the incoming subset of edges at every step are needed. At the same time, such algorithms should find clusterings whose quality is close to that produced by offline algorithms. We discuss the computational model and present an incremental clustering algorithm, along with its parallel implementation. The scalability results suggest that our method is well suited for clustering massive graphs with acceptable running times while retaining a large fraction of the clustering quality.
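The computational model described above can be illustrated with a toy sketch (this is an assumption-laden illustration, not the speakers' algorithm): edges arrive in batches, and only the endpoints of newly arrived edges are revisited, so the cost of each step depends on the batch size rather than the total graph size.

```python
from collections import defaultdict

def incremental_cluster(edge_batches):
    """Toy incremental clustering sketch: every node starts in its own
    cluster; when a batch of edges arrives, only the endpoints of the
    new edges are revisited with one local label-propagation sweep, so
    per-step cost depends on the batch size alone."""
    adj = defaultdict(set)   # adjacency accumulated over time
    label = {}               # node -> current cluster label
    for batch in edge_batches:
        touched = set()
        for u, v in batch:
            adj[u].add(v)
            adj[v].add(u)
            label.setdefault(u, u)
            label.setdefault(v, v)
            touched.update((u, v))
        # revisit only the endpoints of the new edges
        for node in touched:
            counts = defaultdict(int)
            for nbr in adj[node]:
                counts[label[nbr]] += 1
            if counts:
                # adopt the most common label among neighbours
                label[node] = max(counts, key=counts.get)
    return label
```

Feeding two edge batches that each form a triangle groups each triangle into one cluster while never rescanning the older part of the graph.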

Johannes Langguth (Simula Research Laboratory, University of Bergen)
With Thorsten Kurth (NVIDIA Inc.)
16:30
-
17:00
CEST
Faster Local Motif Clustering via Maximum Flows

Local clustering aims to identify a cluster within a given graph that includes a designated seed node or a significant portion of a group of seed nodes. This cluster should be well-characterized, i.e., in the context of motifs such as edges or triangles, it should have a high number of internal motifs and a low number of external motifs (motif conductance). In this talk, we present SOCIAL, a state-of-the-art algorithm for local motif clustering that optimizes for motif conductance based on a local hypergraph model representation of the problem and an adapted version of the max-flow quotient-cut improvement algorithm (MQI). In our experiments with the triangle motif, SOCIAL produces local clusters with superior solution quality compared to alternative approaches, while being up to multiple orders of magnitude faster.
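The objective the talk optimizes can be made concrete with a small sketch (a simplified illustration under the common definition of motif conductance, not the SOCIAL implementation): for the triangle motif, count triangles cut by the cluster boundary and divide by the smaller of the triangle-endpoint volumes inside and outside the cluster.

```python
from itertools import combinations

def triangle_conductance(edges, cluster):
    """Toy triangle-motif conductance: (# triangles with nodes on both
    sides of the cluster boundary) divided by min(# triangle endpoints
    inside the cluster, # triangle endpoints outside)."""
    adj, nodes = {}, set()
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
        nodes.update((u, v))
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for a, b, c in combinations(sorted(nodes), 3):
        if b in adj[a] and c in adj[a] and c in adj[b]:  # found a triangle
            inside = sum(x in cluster for x in (a, b, c))
            vol_in += inside
            vol_out += 3 - inside
            if 0 < inside < 3:  # triangle straddles the boundary
                cut += 1
    return cut / min(vol_in, vol_out)
```

On two triangles joined by a single edge, taking one triangle as the cluster cuts no triangle and yields conductance 0, while a cluster that splits a triangle pays for the cut motif.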

Adil Chhabra, Marcelo Fonseca Faraj, and Christian Schulz (University of Heidelberg)
17:00
-
17:30
CEST
Algebraic Programming for Graph & Machine Learning

Algebraic Programming, or ALP for short, allows for programming with explicit algebraic structures. Such structures range from the simplistic, such as associative binary operators, to richer constructs such as semirings. This algebraic knowledge, given to ALP by the programmer, percolates through the ALP framework and allows it to auto-parallelise, as well as perform other types of automatic program transformations for achieving high performance. Originating from the GraphBLAS, ALP's initial and most mature interface concerns generalized linear algebra. This talk focuses on two recent works related to the use of ALP/GraphBLAS within high-performance machine learning. First, we shall overview how the ALP framework achieves interoperability with Spark, making ALP/GraphBLAS algorithms accessible from within Spark, a standard framework for Big Data analytics. Second, the talk details the ALP/Pregel implementation, which relies on ALP/GraphBLAS to simulate, at high efficiency and at full scalability, vertex-centric programming. Performance comparisons of canonical machine learning algorithms such as PageRank versus the standard GraphX library on top of Spark display significant performance gains of up to 21x on two nodes. ALP/Pregel is furthermore shown to auto-parallelise well on shared-memory architectures; a vertex-centric PageRank-like algorithm achieves a 5.7x speedup versus a highly optimised parallel linear algebraic PageRank.
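The idea of programming against an explicit semiring can be sketched in a few lines of plain Python (this is an illustration of the GraphBLAS-style concept only; it does not use the actual ALP API): a single generic matrix-vector product, parameterised by the semiring's add, multiply, and identity, yields different graph algorithms depending on the algebra supplied.

```python
def mxv(triples, vec, add, mul, zero):
    """Generic sparse matrix-vector product over a user-supplied
    semiring (add, mul, zero). The explicit algebraic structure is
    what a framework like ALP/GraphBLAS can exploit to transform and
    auto-parallelise the computation."""
    n = max(i for i, _, _ in triples) + 1
    out = [zero] * n
    for i, j, a in triples:  # matrix stored as (row, col, value) triples
        out[i] = add(out[i], mul(a, vec[j]))
    return out

# Min-plus semiring: one mxv step relaxes every edge once,
# i.e. a single Bellman-Ford iteration for shortest paths.
INF = float("inf")
edges = [(1, 0, 2.0), (2, 1, 3.0), (2, 0, 10.0)]  # (to, from, weight)
dist = [0.0, INF, INF]  # distances from source node 0
for _ in range(2):
    relaxed = mxv(edges, dist, min, lambda a, b: a + b, INF)
    dist = [min(d, r) for d, r in zip(dist, relaxed)]
```

Swapping in a (+, *) semiring over the same `mxv` kernel would instead give the matrix-vector product at the core of a PageRank iteration, which is the sense in which one algebraic kernel serves many graph algorithms.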

Albert-Jan Yzelman, Aristeidis Mastoras, and Alberto Scolari (Huawei)
17:30
-
18:00
CEST
3S in Distributed Graph Neural Networks: Sparse Communication, Sampling, and Scalability

This talk will focus on distributed-memory parallel algorithms for graph neural network (GNN) training. We will first focus on utilizing sparse matrix primitives to parallelize mini-batch training based on node-wise and layer-wise sampling. Then, we will illustrate techniques based on sparsity-aware sparse-matrix-times-dense-matrix multiplication algorithms to accelerate both full-graph and mini-batch sampling-based training.
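The sparse-matrix-times-dense-matrix (SpMM) primitive mentioned above is the neighbourhood-aggregation kernel of a GNN layer. A minimal serial sketch, assuming a CSR-stored adjacency matrix (this illustrates the primitive itself, not the distributed, sparsity-aware algorithms of the talk):

```python
def spmm(indptr, indices, data, H):
    """Sparse (CSR) adjacency matrix times dense feature matrix H:
    row i of the output aggregates the feature rows of node i's
    neighbours, weighted by the corresponding matrix entries."""
    n_rows = len(indptr) - 1
    n_feat = len(H[0])
    out = [[0.0] * n_feat for _ in range(n_rows)]
    for i in range(n_rows):
        # nonzeros of row i live in positions indptr[i]..indptr[i+1]
        for k in range(indptr[i], indptr[i + 1]):
            j, a = indices[k], data[k]
            for f in range(n_feat):
                out[i][f] += a * H[j][f]
    return out
```

In distributed training, both the sparse adjacency and the dense features are partitioned across processes, and exploiting the sparsity pattern keeps the communication for this kernel proportional to the nonzeros actually needed rather than to whole feature-matrix blocks.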

Aydin Buluc (Lawrence Berkeley National Laboratory, UC Berkeley); Alok Tripathy (UC Berkeley); and Katherine Yelick (UC Berkeley, Lawrence Berkeley National Laboratory)