GraphZeppelin - Streaming Graph Connectivity at Scale
Abstract
I presented GraphZeppelin, a high-performance graph stream processing system, at SIGMOD 2022. On large, dense graph streams it outperforms the state of the art: it is both more compact and faster than the best existing stream processing systems. Systems takeaway: use linear sketching (a form of lossy compression) to improve the scalability of your graph systems. Theory takeaway: prove external-memory I/O bounds for your semi-streaming algorithms, because in practice you can probably only run them out-of-core. Data science takeaway: large, dense graphs exist, and we should study them and build tools to analyze them.
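To make the systems takeaway a little more concrete, here is a toy illustration of what "linear" buys you in a sketch. It is a minimal example under stated assumptions, not GraphZeppelin's actual data structure: the names `ToySketch`, `edge_id`, and `checksum` are hypothetical, and the sketch can only recover an edge when exactly one edge survives. Each vertex keeps a constant-size summary of its incident edges. Because the summary is linear over XOR, merging two vertex summaries cancels the edge between them and leaves only the edges that leave the merged pair.

```python
import hashlib


def edge_id(u, v, n):
    """Map an undirected edge {u, v} (u != v) on n vertices to a nonzero integer."""
    a, b = min(u, v), max(u, v)
    return a * n + b + 1  # +1 keeps every id nonzero


def checksum(idx):
    """64-bit hash of an edge id, used to check that a sketch holds a single edge."""
    digest = hashlib.blake2b(idx.to_bytes(8, "little"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


class ToySketch:
    """Constant-size XOR summary of a set of edge ids (illustrative only)."""

    def __init__(self, alpha=0, gamma=0):
        self.alpha = alpha  # XOR of the edge ids currently in the set
        self.gamma = gamma  # XOR of their checksums

    def toggle(self, idx):
        """Insert the edge if absent, delete it if present (same operation)."""
        self.alpha ^= idx
        self.gamma ^= checksum(idx)

    def merge(self, other):
        """Linearity: the sketch of a combined vector is the XOR of the sketches."""
        return ToySketch(self.alpha ^ other.alpha, self.gamma ^ other.gamma)

    def recover(self):
        """Return the edge id if the checksum confirms a single edge, else None."""
        if self.alpha != 0 and checksum(self.alpha) == self.gamma:
            return self.alpha
        return None


# Path graph 0 - 1 - 2 on n = 4 vertices; every edge updates both endpoints.
n = 4
sketches = [ToySketch() for _ in range(n)]
for u, v in [(0, 1), (1, 2)]:
    idx = edge_id(u, v, n)
    sketches[u].toggle(idx)
    sketches[v].toggle(idx)

# Merging vertices 0 and 1 cancels edge (0, 1); only edge (1, 2) remains.
merged = sketches[0].merge(sketches[1])
recovered = merged.recover()
print(recovered == edge_id(1, 2, n))
```

The summary stays two integers per vertex no matter how many edges arrive, which is the sense in which sketching is lossy compression. A practical system would need many such summaries per vertex, with sampling, to recover an edge when more than one crosses the cut.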