HYRISE
In-Memory Hybrid Storage Engine
Traditional databases are separated into ones for current data from the day-to-day business processes and ones for reporting and analytics. For fast moving businesses, moving data from one silo to another is cumbersome and takes too much time. As a result, the new data arriving in the reporting system is already old by the time it is loaded. HYRISE proposes a new way to solve this problem: It analyzes the query input and reorganizes the stored data in different dimensions. In detail, HYRISE partitions the layout of the underlying tables in a vertical and horizontal manner depending on the input to this layout management component. The workload is specified as a set of queries and weights and is processed by calculating the layout dependent costs for those queries. Based on our cost-model we can now calculate the best set of partitions for this input workload. This optimization allows great speed improvements compared to traditional storage models. For details, please refer to our PVLDB publications “HYRISE - A Main Memory Hybrid Storage Engine” (VLDB’11) and “Fast Updates on Read-Optimized Databases Using Multi-Core CPUs” (VLDB’12).
Hyrise is completely open source and available on Github: https://github.com/hyrise/hyrise
Project Team: Prof. Dr. h.c. Hasso Plattner, David Schwalb, Martin Faust, Martin Boissier, Carsten Meyer, Stefan Klauck, Markus Dreseler, Dr. Matthias Uflacker
Related Research Area: In-Memory Data Management for Enterprise Systems
Project Period: since Oct 2008
HYRISE-NV
HYRISE-NV is a database storage engine that completely maintains table and index structures on NVRAM. Our architecture updates the database state and index structures transactionally consistent on NVRAM using multi-version data structures, allowing to instantly recover databases independent of their size. For index structures, we use multi-versioned b-trees and adapted hashmaps to provide failure-atomic updates on NVM. We evaluated HYRISE-NV on DRAM and with a hardware-based emulation of NVM against the TPC-C benchmark. HYRISE-NV recovers databases independent of their size, allowing the recovery of a table with 10 million rows in less than 100ms. Comparing HYRISE NV with a traditional log-based approach, we report reduced transaction latencies as we can refrain from batching transactions.
HYRISE-R - Lazy Master Replication for Enterprise Applications
HYRISE-R is a lazy master replication system for the in-memory database HYRISE By setting up a snapshot-based HYRISE cluster, we can show an increase in both performance by distributing queries over multiple instances and availability by utilizing the redundancy of the cluster structure.
Dynamic Data Tiering
With the recent rise of new storage tiers as PCIe-connected Phase Change Memory cards and NVRAM as well as new partitioning schemes as Actual/Historical Partitioning, this project revisits in-memory storage formats. This includes, e.g., horizontal per-column partitioning to differentiate between columns often accessed by analytical queries and columns solely accessed by transactional queries as well as a hybrid storage format, which improves tuple materialization from secondary storage. Focussing on mixed workloads, less relevant data is put on secondary storage tiers without jeopardizing the original performance superiority of in-memory databases. Those new storage formats are both applicable to an Actual/Historical partitioning as well as a more general partitioning scheme.
Selected Publications
-
Martin Grund, Jens Krüger, Hasso Plattner, Alexander Zeier, Philippe Cudre-Mauroux, Samuel Madden. HYRISE: a main memory hybrid storage engine. VLDB, 2011.
-
Jens Krueger, Changkyu Kim, Martin Grund, Nadathur Satish, David Schwalb, Jatin Chhugani, Hasso Plattner, Pradeep Dubey, Alexander Zeier. Fast updates on read-optimized databases using multi-core CPUs. VLDB, 2012.
-
Martin Grund, Philippe Cudré-Mauroux, Jens Krüger, Samuel Madden, Hasso Plattner. An overview of HYRISE-a Main Memory Hybrid Storage Engine. IEEE Data Eng. Bull., 2012
-
Johannes Wust, Joos-Hendrick Boese, Frank Renkes, Sebastian Blessing, Jens Krueger, Hasso Plattner. Efficient Logging for Enterprise Workloads on Column-Oriented In-Memory Databases. CIKM, 2012.
-
Martin Faust, David Schwalb, Jens Krueger, Hasso Plattner. Fast Lookups for In-Memory Column Stores: Group-Key Indices, Lookup and Maintenance. ADMS, 2012.
-
David Schwalb, Martin Faust, Jens Krueger, Hasso Plattner. Physical Column Organization in In-Memory Column Stores. DASFAA, 2013.
-
Johannes Wust, Martin Grund, Hasso Plattner. TAMEX: a Task-Based Query Execution Framework for Mixed Enterprise Workloads on In-Memory Databases. IMDM, 2013.
-
David Schwalb, Martin Faust, Johannes Wust, Martin Grund, Hasso Plattner. Efficient Transaction Processing for Hyrise in Mixed Workload Environments. IMDM/VLDB, 2014.
-
Johannes Wust, Martin Grund, Kai Hoewelmeyer, David Schwalb, Hasso Plattner. Concurrent Execution of Mixed Enterprise Workloads on In-Memory Databases. DASFAA, 2014.
-
David Schwalb, Markus Dreseler, Matthias Uflacker, Hasso Plattner. NVC-Hashmap: A Persistent and Concurrent Hash-Map For Non-Volatile Memories. IMDM/VLDB, 2015.
-
David Schwalb, Tim Berning, Martin Faust, Markus Dreseler, Hasso Plattner. nvm_malloc: Memory Allocation for NVRAM. ADMS/VLDB, 2015.
-
Carsten Meyer, Martin Boissier, Adrian Michaud, Jan Ole Vollmer, Ken Taylor, David Schwalb, Matthias Uflacker and Kurt Roedszus. Dynamic and Transparent Data Tiering for In-Memory Databases in Mixed Workload Environments. ADMS/VLDB, 2015.
-
David Schwalb, Jan Kossmann, Stefan Klauck, Martin Faust, Hasso Plattner. Hyrise-R: Scale-out and Hot-Standby through Lazy Master Replication for Enterprise Applications. IMDM/VLDB, 2015.