Stop Thinking, Just Do!

Sungsoo Kim's Blog

Low overhead self-optimizing storage for compression


24 July 2022



Abstract

Embeddable database management systems can handle large amounts of data in resource-constrained environments, such as laptops. This data is often stored in the cloud, where disk usage determines the cost, so efficient storage is essential. For that reason, the data is compressed using lightweight compression methods such as run-length encoding (RLE). Is it possible to make RLE compression more effective by automatically reordering the data? This talk will describe the different reordering methods evaluated in my master's thesis. Then, I will give an overview of how the reordering heuristic was implemented in DuckDB and discuss the results of the experiments on the Public BI benchmark.
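
To make the question in the abstract concrete, here is a minimal Python sketch of why row order matters for RLE. It is not the heuristic from the thesis and not DuckDB's implementation; the helper names (rle_encode, rle_decode, total_runs), the toy table, and the use of a plain lexicographic sort as the reordering step are all assumptions chosen for illustration.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out


def total_runs(rows, num_columns):
    """Sum the number of RLE runs over every column of a row-oriented table."""
    return sum(
        len(rle_encode([row[col] for row in rows]))
        for col in range(num_columns)
    )


# Hypothetical two-column table (country, status) in arrival order.
rows = [
    ("DE", "open"),
    ("NL", "closed"),
    ("DE", "closed"),
    ("NL", "open"),
    ("DE", "open"),
    ("NL", "closed"),
]

# Assumed stand-in for a reordering method: sort rows lexicographically,
# which groups equal values in the leading column into long runs.
reordered = sorted(rows)

runs_before = total_runs(rows, 2)
runs_after = total_runs(reordered, 2)
print(runs_before, runs_after)
```

Fewer runs means fewer (value, run_length) pairs to store. In this toy table the sort only helps the first column; the second column keeps the same number of runs, which hints at why choosing a good order across many columns is a harder problem than sorting on one of them.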

