# Integrations — Kafka
## Integration overview

The `tdp-kafka` chart supports integration with relational databases through Debezium connectors running on top of Kafka Connect.
### What is Kafka Connect and Debezium?
Kafka Connect is a framework in the Kafka ecosystem for moving data between Kafka and external systems in a scalable and fault-tolerant manner, without writing code. Each integration is declared as a connector.
Debezium is a set of Kafka Connect connectors specialized in Change Data Capture (CDC): instead of periodically querying the database, Debezium reads the database transaction log (WAL in PostgreSQL, binlog in MySQL, etc.) and publishes each insert, update, or delete as a Kafka event — in near real-time and with minimal impact on the source database.
Database → transaction log → Debezium → Kafka Topic → consumers
CDC is superior to periodic polling because: it captures all changes (including deletions), operates in real time, does not overload the database with full-scan queries, and preserves the order of changes.
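Each captured change is published using the standard Debezium event envelope. A simplified sketch of one event (field names follow the Debezium envelope format; the actual values shown are illustrative):

```json
{
  "before": null,
  "after": { "id": 42, "status": "NEW" },
  "source": { "connector": "postgresql", "db": "appdb", "table": "orders" },
  "op": "c",
  "ts_ms": 1700000000000
}
```

The `op` field indicates the operation: `c` (create), `u` (update), `d` (delete), or `r` (snapshot read). For updates, both `before` and `after` are populated, which lets consumers compute diffs without querying the source database.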
### Supported databases

| Database | Chart key |
|---|---|
| PostgreSQL | `kafkaConnects.postgres` |
| MySQL | `kafkaConnects.mysql` |
| SQL Server | `kafkaConnects.sqlserver` |
| Oracle | `kafkaConnects.oracle` |
## Configuration structure

Each integration follows the same hierarchy:

```yaml
kafkaConnects:
  <type>:
    enabled: true          # enables the KafkaConnect cluster
    connect:               # Kafka Connect cluster settings
      name: ...
      replicas: ...
      bootstrapServers: ...
      kafkaVersion: ...
    connectors:            # one or more connectors in this cluster
      <logical-name>:
        enabled: true
        name: ...          # KafkaConnector resource name
        topicPrefix: ...   # prefix for generated topics
        <type>:            # database-specific parameters
          ...
```
## Credential management

The chart offers two options for database credentials:

| Option | Parameter | When to use |
|---|---|---|
| Create Secret automatically | `createSecret: true` + credentials in values | Environments without external Secret management |
| Use existing Secret | `createSecret: false` + `secretName: <name>` | Environments with Vault, Sealed Secrets, or custom management |
When using `createSecret: true`, credentials are stored in the values applied to the release. Do not commit passwords to public or shared Git repositories; use private values files or targeted `--set` flags in CI/CD pipelines.
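For example, to reference a Secret managed outside the chart, combine the two parameters from the table above (the Secret name below is hypothetical; the surrounding keys follow the hierarchy shown in the configuration structure):

```yaml
kafkaConnects:
  postgres:
    connectors:
      appdb:
        postgres:
          createSecret: false
          secretName: pg-debezium-credentials   # pre-existing Secret in the release namespace
```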
## PostgreSQL

### Database prerequisites

Debezium uses PostgreSQL logical replication. Before enabling the connector:

- Configure `wal_level = logical` in PostgreSQL
- Create a user with the replication privilege and access to the monitored tables
- Enable logical publication (`CREATE PUBLICATION`) or let Debezium create it automatically
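The steps above can be sketched in SQL. This is a minimal, illustrative setup (user name, password, and publication scope are assumptions; adapt to your security policy):

```sql
-- Requires a server restart to take effect
ALTER SYSTEM SET wal_level = 'logical';

-- CDC user with the replication privilege and read access to monitored tables
CREATE USER debezium WITH REPLICATION PASSWORD '<database-password>';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO debezium;

-- Optional: create the publication yourself instead of letting Debezium do it
CREATE PUBLICATION dbz_publication FOR ALL TABLES;
```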
```yaml
kafkaConnects:
  postgres:
    enabled: true
    connect:
      name: connect-debezium-postgres
      replicas: 1
      bootstrapServers: <kafka-cluster-name>-kafka-bootstrap:9092
      kafkaVersion: "4.1.0"
    connectors:
      appdb:
        enabled: true
        name: debezium-postgres-appdb
        topicPrefix: "pgapp"
        postgres:
          createSecret: true
          host: "<postgresql-host>.<namespace>.svc.cluster.local"
          port: "5432"
          dbname: "appdb"
          user: "debezium"
          password: "<database-password>"
```
Generated topics follow the pattern `<topicPrefix>.<schema>.<table>`. For example, with `topicPrefix: pgapp`, the table `public.orders` will generate the topic `pgapp.public.orders`.
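To inspect the captured events, you can consume the generated topic from inside the Kafka cluster. A sketch for a Strimzi-based deployment (pod name and script path are illustrative and may differ in your environment):

```shell
# Read change events for public.orders from the beginning of the topic
kubectl -n <namespace> exec -it <kafka-cluster-name>-kafka-0 -- \
  /opt/kafka/bin/kafka-console-consumer.sh \
  --bootstrap-server localhost:9092 \
  --topic pgapp.public.orders \
  --from-beginning
```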
## MySQL

### Database prerequisites

Debezium reads the MySQL binlog. Ensure that:

- `log_bin` is enabled in MySQL
- `binlog_format = ROW`
- The CDC user has the following privileges: `SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT`
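A minimal SQL sketch of the user setup (user name and host pattern are assumptions; the binlog settings are shown as comments because they belong in the server configuration, not in SQL):

```sql
-- mysqld configuration (illustrative):
--   log_bin       = mysql-bin
--   binlog_format = ROW
CREATE USER 'debezium'@'%' IDENTIFIED BY '<database-password>';
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE, REPLICATION CLIENT
  ON *.* TO 'debezium'@'%';
```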
```yaml
kafkaConnects:
  mysql:
    enabled: true
    connect:
      name: connect-debezium-mysql
      replicas: 1
      bootstrapServers: <kafka-cluster-name>-kafka-bootstrap:9092
      kafkaVersion: "4.1.0"
    connectors:
      appdb:
        enabled: true
        name: debezium-mysql-appdb
        topicPrefix: "mysqlapp"
        mysql:
          createSecret: true
          host: "<mysql-host>.<namespace>.svc.cluster.local"
          port: "3306"
          user: "debezium"
          password: "<database-password>"
          databaseIncludeList: "appdb"
```
The `databaseIncludeList` field restricts capture to the listed databases (comma-separated).
## SQL Server

### Database prerequisites

Debezium uses SQL Server's native CDC. For each database/table to monitor:

- Enable CDC on the database: `EXEC sys.sp_cdc_enable_db`
- Enable CDC on the tables: `EXEC sys.sp_cdc_enable_table`
- The CDC user must have read permission on the CDC tables (`cdc.*`)
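For example, enabling CDC on a hypothetical `dbo.orders` table in the `AppDb` database (schema and table names are illustrative):

```sql
USE AppDb;

-- Step 1: enable CDC at the database level
EXEC sys.sp_cdc_enable_db;

-- Step 2: enable CDC for each monitored table
EXEC sys.sp_cdc_enable_table
  @source_schema = N'dbo',
  @source_name   = N'orders',
  @role_name     = NULL;   -- no gating role; control access via explicit permissions
```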
```yaml
kafkaConnects:
  sqlserver:
    enabled: true
    connect:
      name: connect-debezium-sqlserver
      replicas: 1
      bootstrapServers: <kafka-cluster-name>-kafka-bootstrap:9092
      kafkaVersion: "4.1.0"
    connectors:
      appdb:
        enabled: true
        name: debezium-sqlserver-appdb
        topicPrefix: "mssqlapp"
        sqlserver:
          createSecret: true
          host: "<sqlserver-host>.<namespace>.svc.cluster.local"
          port: "1433"
          user: "debezium"
          password: "<database-password>"
          databaseNames: "AppDb"
```
The `databaseNames` field lists the SQL Server databases to monitor (comma-separated).
## Oracle

### Database prerequisites

Debezium uses Oracle's LogMiner to read the redo log. The following are required:

- Database in `ARCHIVELOG` mode
- LogMiner enabled and configured
- A user with the `LOGMINING` privilege, `SELECT_CATALOG_ROLE`, and access to the monitored tables and LogMiner views
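An abridged sketch of the user setup for a common user in a multitenant (CDB) deployment. The user name is an assumption, and the full grant list required by the Debezium Oracle connector is considerably longer (it includes SELECT on several `V$` LogMiner views); consult the Debezium Oracle connector documentation for the complete set:

```sql
-- Common CDC user visible in all containers (name is illustrative)
CREATE USER c##dbzuser IDENTIFIED BY "<database-password>" CONTAINER=ALL;

GRANT CREATE SESSION TO c##dbzuser CONTAINER=ALL;
GRANT LOGMINING, SELECT_CATALOG_ROLE TO c##dbzuser CONTAINER=ALL;
GRANT SELECT ANY TRANSACTION TO c##dbzuser CONTAINER=ALL;
-- ...plus SELECT on the monitored tables and on the LogMiner views
```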
```yaml
kafkaConnects:
  oracle:
    enabled: true
    connect:
      name: connect-debezium-oracle
      replicas: 1
      bootstrapServers: <kafka-cluster-name>-kafka-bootstrap:9092
      kafkaVersion: "4.1.0"
    connectors:
      appdb:
        enabled: true
        name: debezium-oracle-appdb
        topicPrefix: "oraapp"
        oracle:
          createSecret: true
          host: "<oracle-host>.<namespace>.svc.cluster.local"
          port: "1521"
          user: "debezium"
          password: "<database-password>"
          dbname: "ORCLCDB"
          pdbName: "ORCLPDB1"
```
| Field | Description |
|---|---|
| `dbname` | Container Database (CDB) name |
| `pdbName` | Pluggable Database (PDB) name where the monitored tables reside |
The Oracle JDBC driver is not distributed with Debezium due to license restrictions. Check the chart image documentation to confirm how the driver is made available in your version of tdp-kafka.
## Deploy

```shell
helm upgrade --install <release> \
  oci://registry.tecnisys.com.br/tdp/charts/tdp-kafka \
  -n <namespace> \
  -f values.yaml \
  -f values-connectors.yaml
```
## Verification

```shell
# List all connectors in the namespace
kubectl -n <namespace> get kafkaconnectors

# Inspect the status of a specific connector
kubectl -n <namespace> describe kafkaconnector <connector-name>

# View Kafka Connect pod logs
kubectl -n <namespace> logs -l strimzi.io/cluster=connect-debezium-postgres
```
The `status.connectorStatus.connector.state` field should show `RUNNING` for a healthy connector.
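To read that field directly instead of scanning the `describe` output, a JSONPath query works (the path matches the status field named above):

```shell
# Prints the connector state, e.g. RUNNING or FAILED
kubectl -n <namespace> get kafkaconnector <connector-name> \
  -o jsonpath='{.status.connectorStatus.connector.state}'
```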
Integration with Apache Ranger for authorization policies on Kafka topics is outside the scope of this chart; treat it as an advanced environment configuration, performed directly on the Ranger plugin for Kafka.