ClickHouse configuration
The tdp-clickhouse chart packages ClickHouse 25.8.11.66 for Kubernetes using the Altinity ClickHouse Operator (ClickHouseInstallation).
This page focuses on base component configuration: installation, main parameters, persistence, resources, and a high availability scenario as a starting example.
Security and integration details are covered on dedicated pages, linked from the Security and Integrations sections below.
What is ClickHouse?
ClickHouse is a high-performance columnar database for analytics.
Instead of storing data row by row, it stores column by column. That favors fast reads for aggregation, reporting, and analytical queries over large datasets.
In TDP Kubernetes, ClickHouse is typically used as a destination for processed data, then queried by tools such as Superset, CloudBeaver, and Trino.
See ClickHouse — Concepts for a full overview of the tool, its architecture, and how it works.
How ClickHouse runs on Kubernetes
In TDP Kubernetes, ClickHouse is deployed with the Altinity ClickHouse Operator, using the same declarative approach as other platform operators (similar in spirit to Kafka with Strimzi). Instead of manually creating Pods and Services, you declare the desired state in a ClickHouseInstallation (CHI) resource, and the operator reconciles the cluster to match it.
In the release namespace you will typically see:
| Component | Description |
|---|---|
| Altinity Operator | Manages the ClickHouseInstallation lifecycle |
| ClickHouseInstallation | Resource defining shards, replicas, storage, and cluster configuration |
| ClickHouse pods | Data pods created from the CHI |
| Services | Endpoints to access ClickHouse |
| PVCs | Persistent volumes for data |
| Optional jobs | Auxiliary resources for TDP user, S3, and cleanup |
Prerequisites
- Kubernetes and Helm 3
- Default StorageClass in the cluster, or tdp-clickhouse.clickhouse.persistence.storageClass set explicitly
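One way to check the StorageClass prerequisite is to list the classes and filter for the default annotation. The sketch below assumes `kubectl` access to the cluster. The helper names `list_storage_classes` and `default_storage_classes` are illustrative and not part of the chart.

```shell
# Print one "name<TAB>is-default" line per StorageClass.
# The second column is "true" only when the standard annotation
# storageclass.kubernetes.io/is-default-class is set to "true".
list_storage_classes() {
  kubectl get storageclass \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}'
}

# Reduce the listing above to the names of the default classes.
default_storage_classes() {
  awk -F '\t' '$2 == "true" { print $1 }'
}

# Usage (requires cluster access):
#   list_storage_classes | default_storage_classes
```

If the pipeline prints nothing, no default class is defined and the storageClass key should be set explicitly.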
Installation (OCI)
Use the production registry:
helm upgrade --install <release> \
  oci://registry.tecnisys.com.br/tdp/charts/tdp-clickhouse \
  -n <namespace> --create-namespace
Configuration
All chart options are in values.yaml. In practice, the most important keys are under:
- tdp-clickhouse.commonLabels
- tdp-clickhouse.operator.*
- tdp-clickhouse.clickhouse.*
- tdp-clickhouse.cleanup.*
- TDPConfigurations.s3Connection.* when integrating with S3
Main parameters
| Parameter | Description | Default (reference) |
|---|---|---|
| tdp-clickhouse.clickhouse.enabled | Enable cluster | true |
| tdp-clickhouse.clickhouse.persistence.* | PVC: size, storageClass, and accessModes | run helm show values |
| tdp-clickhouse.clickhouse.resources | CPU/memory requests and limits | run helm show values |
| tdp-clickhouse.clickhouse.defaultUser.password | default user password | "" |
| tdp-clickhouse.clickhouse.extraUsers | Extra users and profiles via XML | run helm show values |
| tdp-clickhouse.clickhouse.tdpUser.enabled | TDP user with grants job | false |
| tdp-clickhouse.operator.enabled | Enable Altinity Operator | true |
| tdp-clickhouse.cleanup.enabled | Enable pre-delete cleanup job | true |
| TDPConfigurations.s3Connection.* | Optional S3 bucket/Secret integration | disabled by default |
Common labels
tdp-clickhouse.commonLabels adds labels to resources created by the chart. Useful when the cluster uses inventory, observability, or governance conventions.
Operator
tdp-clickhouse.operator.enabled controls installing the Altinity Operator as a chart dependency. In most environments the default true is sufficient.
ClickHouse: persistence and resources
These parameters are usually the first ones installers adjust:
tdp-clickhouse:
  clickhouse:
    persistence:
      enabled: true
      size: 5Gi
      storageClass: ""
      accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        cpu: "1"
        memory: "2Gi"
      limits:
        cpu: "2"
        memory: "4Gi"
Use the tdp-clickhouse.clickhouse.resources and tdp-clickhouse.clickhouse.persistence prefixes. Do not use bare tdp-clickhouse.resources.
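A quick way to catch the bare-prefix mistake before running Helm is to scan the values file for resources or persistence keys sitting directly under tdp-clickhouse. This is only a sketch: it assumes two-space YAML indentation, and `find_misplaced_keys` is a hypothetical helper, not something the chart provides.

```shell
# Read a values file on stdin and report resources/persistence keys that
# sit directly under tdp-clickhouse instead of tdp-clickhouse.clickhouse.
# Assumes two-space indentation; deeper keys (for example keeper.resources)
# are indented further and are therefore ignored.
find_misplaced_keys() {
  awk '
    /^[^ #]/ { in_chart = ($0 ~ /^tdp-clickhouse:/) }
    in_chart && /^  (resources|persistence):/ {
      sub(/^  /, "")
      sub(/:.*/, "")
      print "tdp-clickhouse." $0 " should be tdp-clickhouse.clickhouse." $0
    }
  '
}

# Usage:
#   find_misplaced_keys < values.yaml
```

No output means no misplaced key was found under the stated indentation assumption.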
Optional TDP user
When tdp-clickhouse.clickhouse.tdpUser.enabled: true, the chart creates a Secret (tdp-clickhouse-tdp-user) and runs a post-install/post-upgrade Job to apply grants.
This helps avoid using the default user for TDP ecosystem applications.
High availability (HA)
The chart supports ClickHouse clusters with multiple shards and replicas, with replication coordinated by ClickHouse Keeper.
The reference topology below is a starting example for HA environments: 2 shards × 2 replicas = 4 data pods.
The HA example on this page is not mandatory and should not be copied blindly to every environment. Use it as a base and adapt resources, storage, topology, and access policies to your environment.
Core concepts
| Concept | Purpose | When to use |
|---|---|---|
| Shard | Splits data horizontally across nodes | When a single node is insufficient for data volume |
| Replica | Keeps an identical copy of the shard | When you need fault tolerance and read continuity |
| ClickHouse Keeper | Coordinates replication between pods | Required when replicasCount > 1 |
To benefit from replication, tables must use the ReplicatedMergeTree engine or an equivalent variant. Plain MergeTree tables are not replicated automatically.
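To make the engine requirement concrete, the sketch below prints a CREATE TABLE statement for a replicated table. The columns and the helper name `replicated_table_ddl` are hypothetical. It assumes a cluster named tdp-clickhouse and the {shard} and {replica} macros, as configured in the HA example values on this page. The Keeper path layout is a common convention, not a chart requirement.

```shell
# Print DDL for a replicated table. {shard} and {replica} are ClickHouse
# macros expanded by the server, not shell variables.
# $1 = database, $2 = table, $3 = cluster name (default: tdp-clickhouse)
replicated_table_ddl() {
  db="$1"
  table="$2"
  cluster="${3:-tdp-clickhouse}"
  cat <<EOF
CREATE TABLE IF NOT EXISTS ${db}.${table} ON CLUSTER '${cluster}'
(
    event_time DateTime,
    user_id    UInt64,
    payload    String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/${db}/${table}', '{replica}')
ORDER BY (event_time, user_id)
EOF
}

# Usage (requires cluster access and a running ClickHouse pod in $POD):
#   replicated_table_ddl default events | \
#     kubectl -n <namespace> exec -i "$POD" -- clickhouse-client
```

The same statement with ENGINE = MergeTree would create an independent, unreplicated table on each pod.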
2×2 topology (4 pods)
Shard 1 ──┬── Replica 1 (pod: chi-<release>-tdp-clickhouse-0-0)
          └── Replica 2 (pod: chi-<release>-tdp-clickhouse-0-1)
Shard 2 ──┬── Replica 1 (pod: chi-<release>-tdp-clickhouse-1-0)
          └── Replica 2 (pod: chi-<release>-tdp-clickhouse-1-1)
ClickHouse Keeper ── (pod: tdp-clickhouse-keeper-0)
Each replica pair keeps its data in sync, with Keeper coordinating the replication. Losing one pod does not stop access to the corresponding shard, provided the table uses a replicated engine.
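The pod names in the diagram follow a shard and replica index pattern, so the expected list can be generated for any topology. The helper below is a sketch that simply reproduces the naming shown above; `expected_pods` is an illustrative name, and the real names should always be confirmed with kubectl get pods.

```shell
# Print the data pod names expected for a given topology, using the
# chi-<release>-tdp-clickhouse-<shard>-<replica> pattern from the diagram.
# $1 = release name, $2 = shardsCount, $3 = replicasCount
expected_pods() {
  release="$1"
  shards="$2"
  replicas="$3"
  s=0
  while [ "$s" -lt "$shards" ]; do
    r=0
    while [ "$r" -lt "$replicas" ]; do
      echo "chi-${release}-tdp-clickhouse-${s}-${r}"
      r=$((r + 1))
    done
    s=$((s + 1))
  done
}

# Usage: list the four pods of the 2x2 example for a release named "demo".
expected_pods demo 2 2
```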
Example values-ha.yaml
Create a separate file for HA, e.g. values-ha.yaml, so this scenario does not mix with base component values.
1. Cluster topology and resources
tdp-clickhouse:
  clickhouse:
    antiAffinity: false # set true if you have enough nodes
    shardsCount: 2
    replicasCount: 2
    persistence:
      enabled: true
      size: 5Gi
      storageClass: ""
      accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        cpu: "1"
        memory: "512Mi"
      limits:
        cpu: "2"
        memory: "1Gi"
2. ClickHouse cluster configuration
    # continues under tdp-clickhouse.clickhouse
    configuration:
      clusters:
        - name: tdp-clickhouse
          layout:
            shardsCount: 2
            replicasCount: 2
      zookeeper:
        nodes:
          - host: tdp-clickhouse-keeper-0.tdp-clickhouse-keeper-service.<namespace>.svc.cluster.local
            port: 2181
          # If keeper.replicaCount=3, add the extra nodes:
          # - host: tdp-clickhouse-keeper-1.tdp-clickhouse-keeper-service.<namespace>.svc.cluster.local
          #   port: 2181
          # - host: tdp-clickhouse-keeper-2.tdp-clickhouse-keeper-service.<namespace>.svc.cluster.local
          #   port: 2181
      settings:
        distributed_ddl:
          enable: true
        max_replicated_merges_in_queue: 16
        max_replicated_sends_in_queue: 16
        background_pool_size: 16
        background_schedule_pool_size: 16
        distributed_replica_error_cap: 0.1
        distributed_replica_error_half_life: 60
      macros:
        cluster: tdp-clickhouse
        shard: '{shard}'
        replica: '{replica}'
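When the number of Keeper pods changes, the zookeeper.nodes list has to change with it. The sketch below generates the entries from the host naming used in the example above. The helper name `keeper_nodes` is hypothetical, and the output still needs to be indented to fit under nodes in values-ha.yaml.

```shell
# Print one zookeeper node entry per Keeper pod, following the
# tdp-clickhouse-keeper-<index>.tdp-clickhouse-keeper-service.<namespace>
# host pattern and port 2181 from the example configuration.
# $1 = namespace, $2 = number of Keeper pods
keeper_nodes() {
  namespace="$1"
  count="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    printf '%s\n' \
      "- host: tdp-clickhouse-keeper-${i}.tdp-clickhouse-keeper-service.${namespace}.svc.cluster.local" \
      "  port: 2181"
    i=$((i + 1))
  done
}

# Usage: entries for three Keeper pods in a namespace called "tdp".
keeper_nodes tdp 3
```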
3. ClickHouse Keeper
  # continues under tdp-clickhouse
  keeper:
    enabled: true
    replicaCount: 1 # use an odd number: 1 or 3
    image: registry.tecnisys.com.br/community/images/clickhouse/clickhouse-keeper
    tag: "25.8.11.66"
    localStorage:
      size: 5Gi
      storageClass: ""
    resources:
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"
    settings:
      coordination:
        session_timeout_ms: 30000
        operation_timeout_ms: 10000
        heart_beat_interval_ms: 3000
      logging:
        level: INFO
    zoneSpread: false # set true in multi-zone clusters
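The advice to use an odd replicaCount follows from majority quorum: ClickHouse Keeper is Raft-based, so it stays available only while more than half of its pods are up. The arithmetic below is a sketch with illustrative helper names.

```shell
# Number of Keeper pods that must be up for the ensemble to make progress.
keeper_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of Keeper pods that can be lost without losing quorum.
keeper_tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3; do
  echo "${n} Keeper pod(s): quorum $(keeper_quorum "$n"), tolerates $(keeper_tolerated_failures "$n") failure(s)"
done
# 1 Keeper pod(s): quorum 1, tolerates 0 failure(s)
# 2 Keeper pod(s): quorum 2, tolerates 0 failure(s)
# 3 Keeper pod(s): quorum 2, tolerates 1 failure(s)
```

Two pods tolerate no more failures than one, which is why the example suggests 1 or 3.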
4. default user password via Secret (optional but recommended)
To set a password for the default user in this scenario, use a Kubernetes Secret instead of plain text in values:
kubectl -n <namespace> create secret generic tdp-clickhouse-credentials \
  --from-literal=password='<strong-password>'
tdp-clickhouse:
  clickhouse:
    configuration:
      users:
        default/password:
          valueFrom:
            secretKeyRef:
              name: tdp-clickhouse-credentials
              key: password
For LDAP access control, allowExternalAccess, hostIP, and credential best practices, see Security — ClickHouse.
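As a sketch of the Secret step, the helpers below generate a random password and create or update the Secret idempotently. The function names are illustrative. The Secret name and key match the example above, and the password policy (24 random bytes, base64-encoded) is an assumption to adapt to your own rules.

```shell
# Generate a 32-character password from 24 random bytes.
gen_password() {
  head -c 24 /dev/urandom | base64 | tr -d '\n'
}

# Create the Secret, or update it if it already exists, by rendering the
# manifest client-side and applying it.
# $1 = namespace, $2 = password
apply_password_secret() {
  namespace="$1"
  password="$2"
  kubectl -n "$namespace" create secret generic tdp-clickhouse-credentials \
    --from-literal=password="$password" \
    --dry-run=client -o yaml | kubectl -n "$namespace" apply -f -
}

# Usage (requires cluster access):
#   apply_password_secret <namespace> "$(gen_password)"
```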
Apply HA configuration
helm upgrade --install <release> \
  oci://registry.tecnisys.com.br/tdp/charts/tdp-clickhouse \
  -n <namespace> \
  -f values-ha.yaml
Verification after deploy
After deployment you should see the ClickHouse pods, the chi resource, and, if Keeper is enabled, the Keeper pod:
kubectl -n <namespace> get pods
kubectl -n <namespace> get chi
Verify ClickHouse sees the cluster and all nodes:
POD=$(kubectl -n <namespace> get pods \
  -l clickhouse.altinity.com/chi=<chi-name> \
  -o jsonpath='{.items[0].metadata.name}')
kubectl -n <namespace> exec -it "$POD" -- \
  clickhouse-client -q "SELECT cluster, shard_num, replica_num, host_name FROM system.clusters"
Expected output lists the configured cluster hosts. In the 2×2 example there are 4 hosts.
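The host count can also be checked in a script instead of by eye. The sketch below compares the number of rows in system.clusters with shards × replicas. The helper names are hypothetical, and the cluster name tdp-clickhouse is taken from the example values on this page.

```shell
# Hosts expected for a topology. $1 = shardsCount, $2 = replicasCount
expected_hosts() {
  echo $(( $1 * $2 ))
}

# Count the hosts ClickHouse reports for one cluster.
# $1 = namespace, $2 = pod name, $3 = cluster name
cluster_host_count() {
  namespace="$1"
  pod="$2"
  cluster="$3"
  kubectl -n "$namespace" exec "$pod" -- \
    clickhouse-client -q "SELECT count() FROM system.clusters WHERE cluster = '${cluster}'"
}

# Succeed only when the reported count matches the topology.
# $1 = reported count, $2 = shardsCount, $3 = replicasCount
hosts_match() {
  [ "$1" -eq "$(expected_hosts "$2" "$3")" ]
}

# Usage (requires cluster access):
#   hosts_match "$(cluster_host_count <namespace> "$POD" tdp-clickhouse)" 2 2 \
#     && echo "topology OK"
```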
Main HA parameters
| Parameter | Description | Default |
|---|---|---|
| tdp-clickhouse.clickhouse.shardsCount | Number of shards | 1 |
| tdp-clickhouse.clickhouse.replicasCount | Replicas per shard | 1 |
| tdp-clickhouse.clickhouse.antiAffinity | Spread pods across physical nodes | false |
| tdp-clickhouse.keeper.enabled | Enable ClickHouse Keeper | false |
| tdp-clickhouse.keeper.replicaCount | Number of Keeper pods | 1 |
| tdp-clickhouse.keeper.image | Keeper image | registry.tecnisys.com.br/community/images/clickhouse/clickhouse-keeper |
| tdp-clickhouse.keeper.localStorage.size | Keeper storage | 5Gi |
| tdp-clickhouse.keeper.zoneSpread | Spread Keeper across zones | false |
Integrations
For S3 bucket, connection Secret, S3 storage policy, and consumption by other TDP tools, see Integrations — ClickHouse.
Security
For LDAP authentication, password management, default user access control, and additional profiles, see Security — ClickHouse.
Access
Port-forward (HTTP)
kubectl -n <namespace> port-forward svc/<service-name> 8123:8123
curl "http://localhost:8123/?query=SELECT%201"
<service-name> is the HTTP Service generated by the chart or release. Confirm with kubectl get svc -n <namespace>.
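For queries longer than SELECT 1, hand-encoding the URL becomes error-prone. The wrapper below is a sketch that lets curl do the encoding. It assumes the port-forward above is running and that the user needs no password. `ch_http_query` and the DRY_RUN switch are illustrative, not part of the chart.

```shell
# Send a query to the ClickHouse HTTP interface through the port-forward.
# curl URL-encodes the query (-G with --data-urlencode).
# With DRY_RUN=1 the command is printed instead of executed.
# $1 = SQL query, $2 = local port (default: 8123)
ch_http_query() {
  query="$1"
  port="${2:-8123}"
  set -- curl -sS -G "http://localhost:${port}/" --data-urlencode "query=${query}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage (requires the port-forward to be active):
#   ch_http_query "SELECT version()"
```

If the default user has a password, curl's --user option can supply the credentials.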
Troubleshooting
kubectl -n <namespace> get pods
kubectl -n <namespace> get chi
kubectl -n <namespace> logs deployment/<operator-deployment-name>
kubectl -n <namespace> logs -l clickhouse.altinity.com/chi=<chi-name>
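When the namespace has many pods, it helps to narrow the list to the ones that are not ready before reading logs. The filter below is a sketch that parses the default kubectl get pods columns (NAME, READY, STATUS). `not_ready_pods` is a hypothetical helper name.

```shell
# Read "kubectl get pods --no-headers" output on stdin and print the names
# of pods whose READY column is not n/n. Completed pods (finished Jobs)
# are skipped because 0/1 is their normal final state.
not_ready_pods() {
  awk '$3 != "Completed" {
    split($2, ready, "/")
    if (ready[1] != ready[2]) print $1
  }'
}

# Usage (requires cluster access):
#   kubectl -n <namespace> get pods --no-headers | not_ready_pods
#   kubectl -n <namespace> describe pod <pod-name>
#   kubectl -n <namespace> get events --sort-by=.lastTimestamp
```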
Uninstall
helm -n <namespace> uninstall <release>