P13 - A Distributed LogStore Design with Multi-Reader, Multi-Writer Semantics for Streaming Applications
Description
In this work we describe the design and implementation of a distributed logstore for storing events from streaming applications such as telemetry and satellite remote sensing. The logstore provides multi-writer, multi-reader (MWMR) semantics and totally orders events using timestamps as keys. Our implementation uses a distributed clock synchronization algorithm to synchronize all processes on a cluster with respect to a master process. Because the logstore is designed to support streaming applications that run for long durations and sample data at constant rates, we use two levels of buffering to reduce the total number of disk accesses: events are buffered in CPU memory and in NVMe files before eventually reaching disk. Timer threads running in the background control the flushing of data between memory and disk and also handle memory management, making it possible to stream several gigabytes of data over long periods of time. The logstore implementation is hybrid (multi-process and multi-threaded); we used multi-threaded RPCs, MPI, Pthreads, and Argobots. All I/O in the logstore is performed using Parallel HDF5. We also implemented a KeyValueStore interface to the logstore for client applications.
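The abstract does not specify which clock synchronization algorithm the logstore uses. As a purely illustrative sketch of the general idea of synchronizing worker clocks against a master process, the following MPI program estimates each rank's offset from rank 0 via a round-trip probe (Cristian-style), using MPI_Wtime as the local clock source; all names and the choice of algorithm here are assumptions, not the presented implementation.

```cpp
// Hypothetical sketch: estimate each rank's clock offset relative to a master
// rank via round-trip timing, so local timestamps can be mapped onto a common
// timeline and used as totally ordered event keys.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int kMaster = 0;
    double offset = 0.0;  // local_clock + offset approximates the master clock

    if (rank == kMaster) {
        // Master answers each worker's probe with its current clock reading.
        for (int w = 1; w < size; ++w) {
            double dummy;
            MPI_Recv(&dummy, 1, MPI_DOUBLE, w, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            double now = MPI_Wtime();
            MPI_Send(&now, 1, MPI_DOUBLE, w, 1, MPI_COMM_WORLD);
        }
    } else {
        double t0 = MPI_Wtime();  // probe sent
        MPI_Send(&t0, 1, MPI_DOUBLE, kMaster, 0, MPI_COMM_WORLD);
        double master_now;
        MPI_Recv(&master_now, 1, MPI_DOUBLE, kMaster, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        double t1 = MPI_Wtime();  // reply received
        // Assume the master's reading corresponds to the midpoint of the round trip.
        offset = master_now - (t0 + t1) / 2.0;
        std::printf("rank %d: estimated offset %.9f s (rtt %.9f s)\n",
                    rank, offset, t1 - t0);
    }

    // A synchronized timestamp usable as an ordered event key.
    double synced_time = MPI_Wtime() + offset;
    (void)synced_time;

    MPI_Finalize();
    return 0;
}
```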
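To make the two-level buffering and timer-driven flushing concrete, here is a minimal single-node sketch, not the presented implementation: writers append timestamp-keyed events to an in-memory buffer, and a background timer thread periodically flushes them to an NVMe staging file, from which a second stage (omitted) would migrate data to disk. The class and file names are hypothetical, and the actual logstore uses Pthreads/Argobots, MPI, and Parallel HDF5 rather than std::thread and std::ofstream.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <fstream>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

struct Event {
    uint64_t timestamp;         // key used for total ordering
    std::vector<char> payload;  // opaque event data
};

class LogBuffer {
public:
    LogBuffer(std::string nvme_path, std::chrono::milliseconds interval)
        : nvme_path_(std::move(nvme_path)), interval_(interval),
          timer_(&LogBuffer::timerLoop, this) {}

    ~LogBuffer() {
        {
            std::lock_guard<std::mutex> lk(mtx_);
            stop_ = true;
        }
        cv_.notify_one();
        timer_.join();
        flush();  // drain whatever is left before shutdown
    }

    // Writers append events; the multimap keeps them ordered by timestamp.
    void append(Event e) {
        std::lock_guard<std::mutex> lk(mtx_);
        uint64_t ts = e.timestamp;
        buffer_.emplace(ts, std::move(e));
    }

private:
    // Background timer thread: wakes every `interval_` and flushes the
    // in-memory buffer to the NVMe staging file, bounding memory usage.
    void timerLoop() {
        std::unique_lock<std::mutex> lk(mtx_);
        while (!stop_) {
            cv_.wait_for(lk, interval_, [this] { return stop_.load(); });
            flushLocked();
        }
    }

    void flush() {
        std::lock_guard<std::mutex> lk(mtx_);
        flushLocked();
    }

    // Caller must hold mtx_. Appends events in timestamp order to the staging
    // file; a later stage would move staged data to disk (e.g. via HDF5).
    void flushLocked() {
        if (buffer_.empty()) return;
        std::ofstream out(nvme_path_, std::ios::binary | std::ios::app);
        for (auto& kv : buffer_) {
            const Event& e = kv.second;
            uint64_t len = e.payload.size();
            out.write(reinterpret_cast<const char*>(&e.timestamp), sizeof e.timestamp);
            out.write(reinterpret_cast<const char*>(&len), sizeof len);
            out.write(e.payload.data(), static_cast<std::streamsize>(len));
        }
        buffer_.clear();
    }

    std::string nvme_path_;
    std::chrono::milliseconds interval_;
    std::multimap<uint64_t, Event> buffer_;  // ordered by timestamp key
    std::mutex mtx_;
    std::condition_variable cv_;
    std::atomic<bool> stop_{false};
    std::thread timer_;  // declared last so it starts after other members exist
};

int main() {
    // Hypothetical staging path; the flush interval trades memory for I/O rate.
    LogBuffer log("/mnt/nvme/logstore_stage.bin", std::chrono::milliseconds(500));
    for (uint64_t t = 0; t < 1000; ++t)
        log.append({t, std::vector<char>(64, 'x')});
}  // destructor stops the timer thread and drains remaining events
```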
Presenter(s)
Presenter
I graduated from the University of Illinois Urbana-Champaign in 2018 with a Ph.D. in Computer Science. My advisor was Dr. Marc Snir, and as part of my thesis I developed a multi-threaded data partitioner for meshes and random point clouds. After working for three years as an R&D Engineer at Ansys, Inc., I joined the Illinois Institute of Technology, Chicago, in 2022 as an R&D developer. My research interests are in the areas of Parallel Algorithms, HPC Software Architecture, and Performance Modeling.