Projects

Publications

Terrace: A Hierarchical Graph Container for Skewed Dynamic Graphs

    Abstract

    Various applications model problems as streaming graphs, which need to quickly apply a stream of updates and run algorithms on the updated graph. Furthermore, many dynamic real-world graphs, such as social networks, follow a skewed distribution of vertex degrees, where there are a few high-degree vertices and many low-degree vertices.

    Existing static graph-processing systems optimized for graph skewness achieve high performance and low space usage by preprocessing a cache-efficient graph partitioning based on vertex degree. In the streaming setting, the whole graph is not available upfront, however, so finding an optimal partitioning is not feasible in the presence of updates. As a result, existing streaming graph-processing systems take a "one-size-fits-all" approach, leaving performance on the table.

    We present Terrace, a system for streaming graphs that uses a hierarchical data structure design to store a vertex's neighbors in different data structures depending on the degree of the vertex. This multi-level structure enables Terrace to dynamically partition vertices based on their degrees and adapt to skewness in the underlying graph.

    Our experiments show that Terrace supports faster batch insertions for batch sizes up to 1M when compared to Aspen, a state-of-the-art graph streaming system. On graph query algorithms, Terrace is between 1.7x--2.6x faster than Aspen and between 0.5x--1.3x as fast as Ligra, a state-of-the-art static graph-processing system.



RADICS: Runtime Assurance of Distributed Intelligent Control Systems

    Abstract

    We describe RADICS: Runtime Assurance of Distributed Intelligent Control Systems, which combines a Simplex-based, black-box monitor with a white-box monitor to ensure correct behavior and good performance of AI systems. The black-box monitor allows the system to detect when the AI controller is on a failing trajectory and use a provably safe, but less performant algorithm, to right the system. The white-box monitor predicts when the AI controller will be put on such a trajectory before it happens and helps maximize the performance of the overall system. We describe the overall approach in detail and implement a simple version of it on a case study into controlling the lights in a small traffic grid.



A Parallel Packed Memory Array to Store Dynamic Graphs

    Abstract

    The ideal data structure for storing dynamic graphs would support fast updates as well as fast range queries which underlie graph traversals such as breadth-first search. The Packed Memory Array (PMA) seems like a good candidate for this setting because it supports fast updates as well as cache-efficient range queries. Concurrently updating a PMA raises challenges, however, because an update may require rewriting the entire structure.

    This paper introduces a parallel PMA with intra- and inter-operation parallelism and deadlock-free polylogarithmicspan operations. Our main observation is that the PMA is well-suited to concurrent updates despite occasionally requiring a rewrite of the entire structure because 1) most of the updates only write to a small part of the structure and 2) the worst case is highly parallel and cache-efficient.

    To evaluate our data structure, we implemented Parallel Packed Compressed Sparse Row (PPCSR), a dynamic-graph processing framework that extends the Ligra interface with graph updates. We show that PPCSR is on average about 1.6x faster on graph kernels than Aspen, a state-of-the-art graph-streaming system. PPCSR achieves up to 80 million updates per second and is 2 – 5x faster than Aspen on most batch sizes. Finally, PPCSR is competitive with Ligra and Ligra+, two state-of-the-art static graph-processing frameworks.



High-Throughput Image Alignment for Connectomics using Frugal Snap Judgments

    Abstract

    The accuracy and computational efficiency of image alignment directly affects the advancement of connectomics, a field which seeks to understand the structure of the brain through electron microscopy.

    We introduce the algorithms Quilter and Stacker that are designed to perform 2D and 3D alignment respectively on petabyte-scale data sets from connectomics. Quilter and Stacker are efficient, scalable, and simple to deploy on hardware ranging from a researcher's laptop to a large computing cluster. On a single 18-core cloud machine each algorithm achieves throughputs of more than 1 TB/hr; when combined the algorithms produce an end-to-end alignment pipeline that processes data at a rate of 0.82 TB/hr - an over 10x improvement from previous systems. This efficiency comes from both traditional optimizations and from the use of ``Frugal Snap Judgments'' to judiciously exploit performance--accuracy trade offs.

    A high-throughput image-alignment pipeline was implemented using the Quilter and Stacker algorithms and its performance was evaluated using three datasets whose size ranged from 550GB to 38TB. The full alignment pipeline achieved a throughput of 0.6-0.8 TB/hr and 1.4-1.5 TB/hr on an 18-core and 112-core shared-memory multicore, respectively. On a supercomputing cluster with 200 nodes and 1600 total cores, the pipeline achieved a throughput of 21.4 TB/hr.


    • Presented as a Poster at PPOPP 2019.
    • Presented as a paper at HPEC 2020, where it won the best student paper award.

Packed Compressed Sparse Row: A Dynamic Graph Representation

    Abstract

    Perhaps the most popular sparse graph storage format is Compressed Sparse Row (CSR). CSR excels at storing graphs compactly with minimal overhead, allowing for fast traversals, lookups, and basic graph computations such as PageRank. Since elements in CSR format are packed together, additions and deletions often require time linear in the size of the graph.

    We introduce a new dynamic sparse graph representation called Packed Compressed Sparse Row (PCSR), based on an array-based dynamic data structure called the Packed Memory Array. PCSR is similar to CSR, but leaves spaces between elements, allowing for asymptotically faster insertions and deletions in exchange for a constant factor slowdown in traversals and a constant factor increase in space overhead.

    Our contributions are twofold. We describe PCSR and review the theoretical guarantees for update, insert, and search for PCSR. We also implemented PCSR as well as other basic graph storage formats and report our findings on a variety of benchmarks. PCSR supports inserts orders of magnitude faster than CSR and is only a factor of two slower on graph traversals. Our results suggest that PCSR is a lightweight dynamic graph representation that supports fast inserts and competitive searches.



Electricity Demand and Population Dynamics Prediction from Mobile Phone Metadata

    Abstract

    Energy efficiency is a key challenge for building modern sustainable societies. World’s energy consumption is expected to grow annually by 1.6 %, increasing pressure for utilities and governments to fulfill demand and raising significant challenges in generation, distribution, and storage of electricity. In this context, accurate predictions and understanding of population dynamics and their relation to electricity demand dynamics is of high relevance.

    We introduce a simple machine learning (ML) method for day-ahead predictions of hourly energy consumption, based on population and electricity demand dynamics. We use anonymized mobile phone records (CDRs) and historical energy records from a small European country. CDRs are large-scale data that is collected passively and on a regular basis by mobile phone carriers, including time and location of calls and text messages, as well as phones’ countries of origin. We show that simple support vector machine (SVM) autoregressive models are capable of baseline energy demand predictions with accuracies below 3 % percentage error and active population predictions below 10 % percentage error. Moreover, we show that population dynamics from mobile phone records contain information additional to that of electricity demand records, which can be exploited to improve prediction performance. Finally, we illustrate how the joint analysis of population and electricity dynamics elicits insights into the relation between population and electricity demand segments, allowing for potential demand management interventions and policies beyond reactive supply-side operations.



Projects

Streaming Sparse Graphs using Efficient Dynamic Sets

    Abstract

    We present the SSTGraph framework for the storage and analysis of dynamic graphs. Its performance matches or exceeds state-of-the-art static graph engines and supports streaming updates. SSTGraph builds on top of the tinyset parallel, dynamic set data structure. Tinyset implements set membership in a shallow hierarchy of sorted packed memory arrays to achieve logarithmic time access and updates, and it scans in optimal linear time. Tinyset uses space comparable to that of systems that use data compression while avoiding compression’s computation and serialization overhead.

    SSTGraph outperforms other streaming, dynamic graph engines on a suite of four graph algorithms. Our evaluation includes a comparison with the Aspen streaming graph system. SSTGraph reduces runtime by 40% on average, updates are 2x-5x faster on batch sizes up to 10 million, and graph are smaller. The partitioned data structure scales well and runs on billion edge graphs in just 15 GB of memory.


Small Ordered Dynamic Sets Using Compressed Packed Memory Arrays

    Abstract

    The Packed Memory Array (PMA) data structure maintains a dynamic set of 𝑛 elements in sorted order in a Θ(𝑛)-sized array by interspersing Θ(𝑛) empty spaces among elements. In practice, the PMA supports scans much faster than pointer-based structures such as trees because it stores elements contiguously in sorted order.

    This paper improves PMAs by 1) increasing update throughput with batch updates and 2) reducing space usage with data compression. Beyond fast search, insert, and deletions, efficient advanced operations such as batch updates and maps are a critical part of large-scale data-processing frameworks. Furthermore, the PMA’s asymptotics depend on a constant factor of extra space, which in practice may be infeasible on large datasets.

    Specifically, this paper introduces 1) a batch-update algorithm for PMAs and 2) a compressed PMA, which improve update throughput and space usage, respectively. Our experiments show that the PMA with batch updates supports up to 122 million updates per second. It achieves 1.6× and 1.3× faster update throughput compared toAspen and PAM, two state-of-the-art dynamic set data structures. ThePMA supports scans, which underlie maps, at over 10 billion elements per second and achieves 2.2× and 63× speedup over Aspen and PAM, respectively, on scans. Finally, the compressed PMA reduces the space usage of the PMA by about 3× while maintaining competitive update and scan throughputs.


Tourist Path Optimization Problem

    For my senior research paper in Computer Science, I studied a variant of the Traveling Salesman Problem.

Anti-magic and Edge Graceful Graphs

    For my senior seminar paper in Discrete Math, I studied Anti-magic graphs.