Various applications model problems as streaming graphs, which need to quickly
apply a stream of updates and run algorithms on the updated graph. Furthermore,
many dynamic real-world graphs, such as social networks, follow a skewed
distribution of vertex degrees, where there are a few high-degree vertices and
many low-degree vertices.

Existing static graph-processing systems optimized for graph skewness achieve
high performance and low space usage by preprocessing a cache-efficient graph
partitioning based on vertex degree. In the streaming setting, the whole graph
is not available upfront, however, so finding an optimal partitioning is not
feasible in the presence of updates. As a result, existing streaming
graph-processing systems take a "one-size-fits-all" approach, leaving
performance on the table.

We present Terrace, a system for streaming graphs that uses a hierarchical data
structure design to store a vertex's neighbors in different data structures
depending on the degree of the vertex. This multi-level structure enables
Terrace to dynamically partition vertices based on their degrees and adapt to
skewness in the underlying graph.

Our experiments show that Terrace supports faster batch insertions for batch
sizes up to 1M when compared to Aspen, a state-of-the-art graph streaming
system.
On graph query algorithms, Terrace is between 1.7x--2.6x faster
than Aspen and between 0.5x--1.3x as fast as Ligra, a
state-of-the-art static graph-processing system.

Presented at SIGMOD 2021

For a video of the talk see here

The code can be found on github https://github.com/PASSIONLab/terrace

We describe RADICS: Runtime Assurance of Distributed Intelligent Control Systems, which combines a Simplex-based, black-box monitor with a white-box monitor to ensure correct behavior and good performance of AI systems. The black-box monitor allows the system to detect when the AI controller is on a failing trajectory and use a provably safe, but less performant algorithm, to right the system. The white-box monitor predicts when the AI controller will be put on such a trajectory before it happens and helps maximize the performance of the overall system. We describe the overall approach in detail and implement a simple version of it on a case study into controlling the lights in a small traffic grid.

Presented at DSML 2021

For more information see http://www.dsn.jhu.edu/radics/

For a video of the talk see here

The ideal data structure for storing dynamic graphs would support fast updates as well as fast
range
queries which underlie graph traversals such as breadth-first search. The Packed Memory Array
(PMA)
seems like a good candidate for this setting because it supports fast updates as well as
cache-efficient range queries. Concurrently updating a PMA raises challenges, however, because
an
update may require rewriting the entire structure.

This paper introduces a parallel PMA with intra- and inter-operation parallelism and
deadlock-free
polylogarithmicspan operations. Our main observation is that the PMA is well-suited to
concurrent
updates despite occasionally requiring a rewrite of the entire structure because 1) most of the
updates only write to a small part of the structure and 2) the worst case is highly parallel and
cache-efficient.

To evaluate our data structure, we implemented Parallel Packed Compressed Sparse Row (PPCSR), a
dynamic-graph processing framework that extends the Ligra interface with graph updates. We show
that
PPCSR is on average about 1.6x faster on graph kernels than Aspen, a state-of-the-art
graph-streaming system. PPCSR achieves up to 80 million updates per second and is 2 – 5x faster
than
Aspen on most batch sizes. Finally, PPCSR is competitive with Ligra and Ligra+, two
state-of-the-art
static graph-processing frameworks.

For a video of the talk see here

The accuracy and computational efficiency of image alignment directly affects
the advancement of connectomics, a field which seeks to understand the
structure of the brain through electron microscopy.

We introduce the algorithms Quilter and Stacker that are designed to perform 2D
and 3D alignment respectively on petabyte-scale data sets from connectomics.
Quilter and Stacker are efficient, scalable, and simple to deploy on hardware
ranging from a researcher's laptop to a large computing cluster. On a single
18-core cloud machine each algorithm achieves throughputs of more than 1
TB/hr; when combined the algorithms produce an end-to-end alignment pipeline
that processes data at a rate of 0.82 TB/hr - an over 10x improvement
from previous systems. This efficiency comes from both traditional
optimizations and from the use of ``Frugal Snap Judgments'' to judiciously
exploit performance--accuracy trade offs.

A high-throughput image-alignment pipeline was implemented using the Quilter
and Stacker algorithms and its performance was evaluated using three datasets
whose size ranged from 550GB to 38TB. The full alignment pipeline achieved
a throughput of 0.6-0.8 TB/hr and 1.4-1.5 TB/hr on an 18-core and
112-core shared-memory multicore, respectively. On a supercomputing cluster
with 200 nodes and 1600 total cores, the pipeline achieved a throughput of
21.4 TB/hr.

This work won Best Student Paper at HPEC 2020.

This work was presented as a Poster at PPOPP 2019.

Perhaps the most popular sparse graph storage format is Compressed Sparse Row (CSR). CSR excels
at
storing graphs compactly with minimal overhead, allowing for fast traversals, lookups, and basic
graph computations such as PageRank. Since elements in CSR format are packed together, additions
and
deletions often require time linear in the size of the graph.

We introduce a new dynamic sparse
graph representation called Packed Compressed Sparse Row (PCSR), based on an array-based dynamic
data structure called the Packed Memory Array. PCSR is similar to CSR, but leaves spaces between
elements, allowing for asymptotically faster insertions and deletions in exchange for a constant
factor slowdown in traversals and a constant factor increase in space overhead.

Our contributions
are twofold. We describe PCSR and review the theoretical guarantees for update, insert, and
search
for PCSR. We also implemented PCSR as well as other basic graph storage formats and report our
findings on a variety of benchmarks. PCSR supports inserts orders of magnitude faster than CSR
and
is only a factor of two slower on graph traversals. Our results suggest that PCSR is a
lightweight
dynamic graph representation that supports fast inserts and competitive searches.

Energy efficiency is a key challenge for building modern sustainable societies. World’s energy
consumption is expected to grow annually by 1.6 %, increasing pressure for utilities and
governments
to fulfill demand and raising significant challenges in generation, distribution, and storage of
electricity. In this context, accurate predictions and understanding of population dynamics and
their relation to electricity demand dynamics is of high relevance.

We introduce a simple machine learning (ML) method for day-ahead predictions of hourly energy
consumption, based on population and electricity demand dynamics. We use anonymized mobile phone
records (CDRs) and historical energy records from a small European country. CDRs are large-scale
data that is collected passively and on a regular basis by mobile phone carriers, including time
and
location of calls and text messages, as well as phones’ countries of origin. We show that simple
support vector machine (SVM) autoregressive models are capable of baseline energy demand
predictions
with accuracies below 3 % percentage error and active population predictions below 10 %
percentage
error. Moreover, we show that population dynamics from mobile phone records contain information
additional to that of electricity demand records, which can be exploited to improve prediction
performance. Finally, we illustrate how the joint analysis of population and electricity
dynamics
elicits insights into the relation between population and electricity demand segments, allowing
for
potential demand management interventions and policies beyond reactive supply-side operations.

Since the publication of PPCSR above, I have continued working on more efficient data structures and systems to improve the performance of dynamic graph analytics by using techniques including parallelization, data hierarchy, and compression.

For my senior research paper in Computer Science, I studied a variant of the Traveling Salesman Problem.

For my senior seminar paper in Discrete Math, I studied Anti-magic graphs.