Integrations — Delta Lake
Integration overview
The tdp-deltalake chart requires access to S3-compatible storage (Apache Ozone S3 Gateway, MinIO, or another S3 endpoint) and can be integrated with the main processing and orchestration tools in TDP.
S3 / MinIO
Maintenance CronJobs require a Secret with S3 access credentials.
Create the Secret
kubectl -n <namespace> create secret generic s3-credentials \
  --from-literal=access-key='<ACCESS_KEY>' \
  --from-literal=secret-key='<SECRET_KEY>'
Store credentials in a separate values file (outside Git) or in an existing Kubernetes Secret. Never commit them to the repository.
Configure the S3 endpoint
maintenance:
  spark:
    config:
      "spark.hadoop.fs.s3a.endpoint": "http://<s3-host>.<namespace>.svc.cluster.local:9000"
      "spark.hadoop.fs.s3a.path.style.access": "true"
S3 parameters
| Parameter | Description | Example |
|---|---|---|
| spark.hadoop.fs.s3a.endpoint | S3 endpoint URL | http://ozone-s3g.<ns>.svc.cluster.local:9000 |
| spark.hadoop.fs.s3a.path.style.access | Force path-style (required for MinIO/Ozone) | "true" |
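These settings can be kept in their own values file, such as the values-integration.yaml passed to helm under "Combining values files". The following is a minimal sketch that generates such a file. The Service name minio, the namespace storage, and the variables S3_HOST and S3_NAMESPACE are illustrative assumptions, not defaults of the chart.

```shell
# Illustrative values only: replace with the Service name and namespace
# of your own S3 endpoint.
S3_HOST="minio"
S3_NAMESPACE="storage"

# Write the S3A settings into a dedicated values file.
# The heredoc is unquoted so the two variables are expanded.
cat > values-integration.yaml <<EOF
maintenance:
  spark:
    config:
      "spark.hadoop.fs.s3a.endpoint": "http://${S3_HOST}.${S3_NAMESPACE}.svc.cluster.local:9000"
      "spark.hadoop.fs.s3a.path.style.access": "true"
EOF
```

The file contains no credentials, so unlike the access keys it can be versioned alongside the rest of the deployment configuration.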
Trino
Trino can query Delta Lake tables via the Delta Lake connector. Configuration is done in the tdp-trino chart; see Trino configuration.
Spark
Delta Lake tables can be processed directly by Spark. Configuration is done in the tdp-spark chart — see Spark configuration.
Airflow
Airflow can orchestrate pipelines that read and write Delta Lake tables. Connection configuration is done in the tdp-airflow chart — see Integrations — Airflow.
Combining values files
Several values files can be passed to the same command. When the same key is set in more than one file, the value from the last -f file takes precedence.
helm upgrade --install <release> \
  oci://registry.tecnisys.com.br/tdp/charts/tdp-deltalake \
  -n <namespace> \
  -f my-values.yaml \
  -f values-integration.yaml