Data Mesh and Lakehouse
Abstract
Data mesh is an emerging data management architecture that makes it easier for organizations to collaborate on data. At the same time, in the technology layer beneath it, a new class of data management systems has been gaining traction: lakehouse systems, which combine the capabilities of data warehouses and data lakes and bring high-performance query processing and governance to large-scale data stored in open formats. These systems, such as Delta Lake and Apache Hudi, let organizations simplify their data architectures, because they no longer have to manage separate data warehouses that load subsets of their data to achieve governance or performance. In addition, through new open source projects like Delta Sharing, these systems can directly support data mesh architectures not just within a single technology platform, but across clouds and vendors. I’ll talk about some of the recent work to support these patterns in the Delta Lake open source project and how data mesh and lakehouse systems can combine to create dramatically simpler data architectures.