Article Source
Open-source Change Data Capture With Debezium
- CMU Database Group - Vaccination Database Tech Talks - Booster (2022)
- Speakers: Gunnar Morling (Red Hat Debezium)
- March 14, 2022
Abstract
Change Data Capture (CDC) is a big enabler for your data: by reacting to changes in your database in real time, CDC comes in handy for implementing a wide range of use cases, such as low-latency data updates from OLTP data stores to OLAP systems, caches, or search indexes; data exchange between microservices; building audit logs; and many more.
In this talk you’ll learn about Debezium, a distributed open-source log-based CDC platform for a variety of databases, such as Postgres, MySQL, Cassandra, MongoDB, and Vitess. We’ll not only explore what makes Debezium and CDC so interesting from a user’s perspective, but we’ll also dive into some of the technical challenges we encountered while implementing Debezium, such as preventing unbounded growth of WAL files in Postgres, keeping track of the schema of captured tables as DDL statements come in, and strategies for snapshotting your initial data set before capturing data changes from transaction logs.
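
To make the cache-update use case from the abstract concrete, here is a minimal sketch (not taken from the talk) of a consumer that applies Debezium-style change events to an in-memory cache. It relies on the documented change event envelope fields `op`, `before`, and `after`, where `op` is `c` (create), `u` (update), `d` (delete), or `r` (snapshot read). The function name `apply_change_event`, the `id` key column, and the sample rows are assumptions made up for illustration; a real consumer would read events from Kafka or another sink rather than from a list.

```python
def apply_change_event(cache, event, key_field="id"):
    """Apply one Debezium-style change event to a dict keyed by primary key.

    `key_field` is a hypothetical primary-key column name used for this sketch.
    """
    # With schemas enabled, the JSON envelope is wrapped in {"schema": ..., "payload": ...};
    # without schemas, the envelope fields are at the top level.
    payload = event.get("payload", event)
    op = payload["op"]
    if op in ("c", "u", "r"):
        # Creates, updates, and snapshot reads carry the new row state in "after".
        row = payload["after"]
        cache[row[key_field]] = row
    elif op == "d":
        # Deletes carry the old row state in "before".
        row = payload["before"]
        cache.pop(row[key_field], None)
    # Other event types (for example truncate) are ignored in this sketch.
    return cache


# Hypothetical sample events, in the order they would appear in the change stream.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "Alice"}},
    {"op": "u", "before": {"id": 1, "name": "Alice"}, "after": {"id": 1, "name": "Alicia"}},
    {"payload": {"op": "r", "before": None, "after": {"id": 2, "name": "Bob"}}},
    {"op": "d", "before": {"id": 2, "name": "Bob"}, "after": None},
]

cache = {}
for event in events:
    apply_change_event(cache, event)
```

Because every event carries the full row state, the consumer stays stateless apart from the cache itself; the same loop handles both the initial snapshot (`r` events) and the subsequent stream of changes from the transaction log.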