Article Source
Low overhead self-optimizing storage for compression
Abstract
Embeddable database management systems can handle large amounts of data in limited environments, such as laptops. This data is often stored in the cloud, where disk usage determines the cost, meaning efficient storage is essential. Therefore, the data is compressed using lightweight compression methods like run-length encoding (RLE). Is it possible to make RLE compression more effective by automatically reordering the data? This talk will describe the different reordering methods evaluated in my master thesis. Then, I will give an overview of how the reordering heuristic was implemented in DuckDB and discuss the results of the experiments on the Public BI benchmark.