Version 3.0.0

ClickHouse configuration

The tdp-clickhouse chart packages ClickHouse 25.8.11.66 for Kubernetes using the Altinity ClickHouse Operator (ClickHouseInstallation).

This page focuses on base component configuration: installation, main parameters, persistence, resources, and a high availability scenario as a starting example.

Security and integration details are on dedicated pages: Security — ClickHouse and Integrations — ClickHouse.

What is ClickHouse?

ClickHouse is a high-performance columnar database for analytics.

Instead of storing data row by row, it stores column by column. That favors fast reads for aggregation, reporting, and analytical queries over large datasets.

In TDP Kubernetes, ClickHouse is typically used as a destination for processed data, then queried by tools such as Superset, CloudBeaver, and Trino.

Learn more

See ClickHouse — Concepts for a full overview of the tool, its architecture, and how it works.

How ClickHouse runs on Kubernetes

In TDP Kubernetes, ClickHouse is deployed with the Altinity ClickHouse Operator, using the same declarative approach as other platform operators (similar in spirit to Kafka with Strimzi). Instead of manually creating Pods and Services, you declare the desired state in a ClickHouseInstallation (CHI) resource, and the operator reconciles the cluster to match it.

In the release namespace you will typically see:

| Component | Description |
| --- | --- |
| Altinity Operator | Manages the ClickHouseInstallation lifecycle |
| ClickHouseInstallation | Resource defining shards, replicas, storage, and cluster configuration |
| ClickHouse pods | Data pods created from the CHI |
| Services | Endpoints to access ClickHouse |
| PVCs | Persistent volumes for data |
| Optional jobs | Auxiliary resources for TDP user, S3, and cleanup |

Prerequisites

  • Kubernetes and Helm 3
  • Default StorageClass in the cluster or tdp-clickhouse.clickhouse.persistence.storageClass set explicitly
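
One way to confirm the StorageClass prerequisite before installing is to ask the cluster which class is marked as default. The helper below is a minimal sketch: the function name is hypothetical, and it relies only on kubectl get storageclass printing "(default)" next to the name of the default class.

```shell
# Hypothetical helper: print the name of the default StorageClass, if any.
# kubectl marks the default class as "<name> (default)" in its table output.
default_storage_class() {
  kubectl get storageclass --no-headers 2>/dev/null |
    awk '/\(default\)/ { print $1; exit }'
}

# Usage: warn when nothing is returned.
# sc=$(default_storage_class)
# [ -n "$sc" ] || echo "No default StorageClass: set tdp-clickhouse.clickhouse.persistence.storageClass"
```

If the function prints nothing, set tdp-clickhouse.clickhouse.persistence.storageClass explicitly in your values.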

Installation (OCI)

Use the production registry:

Terminal input
helm upgrade --install <release> \
oci://registry.tecnisys.com.br/tdp/charts/tdp-clickhouse \
-n <namespace> --create-namespace

Configuration

All chart options are in values.yaml. In practice, the most important keys are under:

  • tdp-clickhouse.commonLabels
  • tdp-clickhouse.operator.*
  • tdp-clickhouse.clickhouse.*
  • tdp-clickhouse.cleanup.*
  • TDPConfigurations.s3Connection.* when integrating with S3

Main parameters

| Parameter | Description | Default (reference) |
| --- | --- | --- |
| tdp-clickhouse.clickhouse.enabled | Enable cluster | true |
| tdp-clickhouse.clickhouse.persistence.* | PVC: size, storageClass, and accessModes | run helm show values |
| tdp-clickhouse.clickhouse.resources | CPU/memory requests and limits | run helm show values |
| tdp-clickhouse.clickhouse.defaultUser.password | default user password | "" |
| tdp-clickhouse.clickhouse.extraUsers | Extra users and profiles via XML | run helm show values |
| tdp-clickhouse.clickhouse.tdpUser.enabled | TDP user with grants job | false |
| tdp-clickhouse.operator.enabled | Enable Altinity Operator | true |
| tdp-clickhouse.cleanup.enabled | Enable pre-delete cleanup job | true |
| TDPConfigurations.s3Connection.* | Optional S3 bucket/Secret integration | disabled by default |
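
Several defaults in the table point to helm show values. A minimal sketch of how that lookup could be done follows; the helper name is hypothetical, the chart reference is the one used in the installation command, and the amount of context printed (8 lines) is an arbitrary choice.

```shell
# Chart reference taken from the installation command on this page.
CHART="oci://registry.tecnisys.com.br/tdp/charts/tdp-clickhouse"

# Hypothetical helper: print the chart's default values around a given key.
# $1: key to search for, e.g. "persistence", "resources", "extraUsers"
show_chart_defaults() {
  helm show values "$CHART" | grep -n -A 8 -- "$1"
}

# Usage:
# show_chart_defaults persistence
# helm show values "$CHART" > values-default.yaml   # keep a full copy for reference
```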

Common labels

tdp-clickhouse.commonLabels adds labels to resources created by the chart. Useful when the cluster uses inventory, observability, or governance conventions.
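
As an illustration, the sketch below writes a small values file that sets two labels. It assumes commonLabels accepts a plain key/value map; the helper name, file name, label keys, and label values are all hypothetical.

```shell
# Hypothetical helper: write a values file that sets commonLabels.
# Assumption: commonLabels is a key/value map; the labels are examples only.
# $1: output file (default: values-labels.yaml)
write_labels_values() {
  cat > "${1:-values-labels.yaml}" <<'EOF'
tdp-clickhouse:
  commonLabels:
    environment: staging
    team: data-platform
EOF
}

# Usage:
# write_labels_values values-labels.yaml
# then pass the file to helm upgrade --install with -f values-labels.yaml
```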

Operator

tdp-clickhouse.operator.enabled controls installing the Altinity Operator as a chart dependency. In most environments the default true is sufficient.

ClickHouse: persistence and resources

These parameters are usually the first ones installers adjust:

tdp-clickhouse:
  clickhouse:
    persistence:
      enabled: true
      size: 5Gi
      storageClass: ""
      accessModes: ["ReadWriteOnce"]

    resources:
      requests:
        cpu: "1"
        memory: "2Gi"
      limits:
        cpu: "2"
        memory: "4Gi"

Attention

Use the tdp-clickhouse.clickhouse.resources and tdp-clickhouse.clickhouse.persistence prefixes. Do not use bare tdp-clickhouse.resources.

Optional TDP user

When tdp-clickhouse.clickhouse.tdpUser.enabled: true, the chart creates a Secret (tdp-clickhouse-tdp-user) and runs a post-install/post-upgrade Job to apply grants.

This helps avoid using the default user for TDP ecosystem applications.
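
After enabling the TDP user, it can be useful to confirm that the Secret and the grants Job exist. The helpers below are a sketch with hypothetical names. They assume the Secret is named exactly tdp-clickhouse-tdp-user, as stated above, and they list only the key names so that no credential is printed.

```shell
# Hypothetical helper: list the key names stored in the TDP user Secret
# without printing the secret values themselves.
# $1: namespace of the release
tdp_user_secret_keys() {
  kubectl -n "$1" get secret tdp-clickhouse-tdp-user \
    -o go-template='{{range $k, $v := .data}}{{$k}}{{"\n"}}{{end}}'
}

# Hypothetical helper: show Jobs in the namespace, including the grants Job.
# $1: namespace of the release
tdp_user_jobs() {
  kubectl -n "$1" get jobs
}
```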

High availability (HA)

The chart supports ClickHouse clusters with multiple shards and replicas, with replication coordinated by ClickHouse Keeper.

The reference topology below is a starting example for HA environments: 2 shards × 2 replicas = 4 data pods.

Starting example

The HA example on this page is not mandatory and should not be copied blindly to every environment. Use it as a base and adapt resources, storage, topology, and access policies to your environment.

Core concepts

| Concept | Purpose | When to use |
| --- | --- | --- |
| Shard | Splits data horizontally across nodes | When a single node is insufficient for data volume |
| Replica | Keeps an identical copy of the shard | When you need fault tolerance and read continuity |
| ClickHouse Keeper | Coordinates replication between pods | Required when replicasCount > 1 |

Tables must be replicated

To benefit from replication, tables must use the ReplicatedMergeTree engine or an equivalent variant. Plain MergeTree tables are not replicated automatically.
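
A replicated table definition could look like the sketch below. The table names, columns, and helper name are hypothetical. It assumes the cluster is named tdp-clickhouse and that the shard and replica macros are defined, as in the HA example on this page. The Keeper path shown is a common convention, not something this chart mandates.

```shell
# Hypothetical helper: print DDL for a replicated local table plus a
# Distributed table that routes queries across all shards.
replicated_table_ddl() {
  cat <<'SQL'
CREATE TABLE IF NOT EXISTS default.events_local ON CLUSTER 'tdp-clickhouse'
(
    event_time DateTime,
    user_id    UInt64,
    action     String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/{database}/{table}', '{replica}')
ORDER BY (event_time, user_id);

CREATE TABLE IF NOT EXISTS default.events ON CLUSTER 'tdp-clickhouse'
AS default.events_local
ENGINE = Distributed('tdp-clickhouse', 'default', 'events_local', rand());
SQL
}

# Usage (POD is any ClickHouse pod of the release):
# replicated_table_ddl | kubectl -n "$NAMESPACE" exec -i "$POD" -- clickhouse-client --multiquery
```

Writes go to default.events and are spread across shards, while each events_local table is kept in sync between the replicas of its shard.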

2×2 topology (4 pods)

Shard 1 ──┬── Replica 1 (pod: chi-<release>-tdp-clickhouse-0-0)
          └── Replica 2 (pod: chi-<release>-tdp-clickhouse-0-1)

Shard 2 ──┬── Replica 1 (pod: chi-<release>-tdp-clickhouse-1-0)
          └── Replica 2 (pod: chi-<release>-tdp-clickhouse-1-1)

ClickHouse Keeper ── (pod: tdp-clickhouse-keeper-0)

Each replica pair syncs data via Keeper. Losing one pod does not stop access to the corresponding shard, provided the table uses a replicated engine.
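
The pod count is simply shards × replicas. The sketch below enumerates the names that follow the pattern in the diagram, which helps when comparing against kubectl get pods. The helper name is hypothetical, and real pod names may carry an additional ordinal suffix depending on the operator version, so treat the output as name prefixes.

```shell
# Hypothetical helper: list expected data pod names for a topology,
# following the chi-<release>-tdp-clickhouse-<shard>-<replica> pattern.
# $1: release name, $2: shardsCount, $3: replicasCount
expected_pods() {
  release=$1; shards=$2; replicas=$3
  s=0
  while [ "$s" -lt "$shards" ]; do
    r=0
    while [ "$r" -lt "$replicas" ]; do
      echo "chi-${release}-tdp-clickhouse-${s}-${r}"
      r=$((r + 1))
    done
    s=$((s + 1))
  done
}

# Usage:
# expected_pods my-release 2 2
```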

Example values-ha.yaml

Create a separate file for HA, e.g. values-ha.yaml, so this scenario does not mix with base component values.

1. Cluster topology and resources

tdp-clickhouse:
  clickhouse:
    antiAffinity: false # set true if you have enough nodes
    shardsCount: 2
    replicasCount: 2

    persistence:
      enabled: true
      size: 5Gi
      storageClass: ""
      accessModes: ["ReadWriteOnce"]

    resources:
      requests:
        cpu: "1"
        memory: "512Mi"
      limits:
        cpu: "2"
        memory: "1Gi"

2. ClickHouse cluster configuration

    configuration:
      clusters:
        - name: tdp-clickhouse
          layout:
            shardsCount: 2
            replicasCount: 2

      zookeeper:
        nodes:
          - host: tdp-clickhouse-keeper-0.tdp-clickhouse-keeper-service.<namespace>.svc.cluster.local
            port: 2181
          # If keeper.replicaCount=3, add the extra nodes:
          # - host: tdp-clickhouse-keeper-1.tdp-clickhouse-keeper-service.<namespace>.svc.cluster.local
          #   port: 2181
          # - host: tdp-clickhouse-keeper-2.tdp-clickhouse-keeper-service.<namespace>.svc.cluster.local
          #   port: 2181

      settings:
        distributed_ddl:
          enable: true
        max_replicated_merges_in_queue: 16
        max_replicated_sends_in_queue: 16
        background_pool_size: 16
        background_schedule_pool_size: 16
        distributed_replica_error_cap: 0.1
        distributed_replica_error_half_life: 60

      macros:
        cluster: tdp-clickhouse
        shard: '{shard}'
        replica: '{replica}'

3. ClickHouse Keeper

  keeper:
    enabled: true
    replicaCount: 1 # use an odd number: 1 or 3

    image: registry.tecnisys.com.br/community/images/clickhouse/clickhouse-keeper
    tag: "25.8.11.66"

    localStorage:
      size: 5Gi
      storageClass: ""

    resources:
      requests:
        cpu: "1"
        memory: "1Gi"
      limits:
        cpu: "1"
        memory: "1Gi"

    settings:
      coordination:
        session_timeout_ms: 30000
        operation_timeout_ms: 10000
        heart_beat_interval_ms: 3000
      logging:
        level: INFO

    zoneSpread: false # set true in multi-zone clusters
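
The advice to use an odd replicaCount follows from majority quorum: Keeper stays available only while more than half of its nodes are up. The sketch below (hypothetical helper name) does that arithmetic and shows that a fourth node tolerates no more failures than three.

```shell
# Hypothetical helper: number of Keeper nodes that can fail while a
# majority quorum (floor(n/2) + 1) is still reachable.
# $1: keeper replicaCount
keeper_tolerated_failures() {
  n=$1
  quorum=$(( n / 2 + 1 ))
  echo $(( n - quorum ))
}

# Usage:
# for n in 1 2 3 4 5; do echo "$n nodes: $(keeper_tolerated_failures "$n") failure(s) tolerated"; done
```

With replicaCount 1 the Keeper is a single point of failure for replication metadata, so 3 is the smallest value that survives the loss of one Keeper pod.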

4. default user password via Secret (optional but recommended)

To set a password for the default user in this scenario, use a Kubernetes Secret instead of plain text in values:

Terminal input
kubectl -n <namespace> create secret generic tdp-clickhouse-credentials \
  --from-literal=password='<strong-password>'

tdp-clickhouse:
  clickhouse:
    configuration:
      users:
        default/password:
          valueFrom:
            secretKeyRef:
              name: tdp-clickhouse-credentials
              key: password
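
The <strong-password> placeholder can be filled with a randomly generated value rather than a hand-typed one. The helper below is one possible sketch (hypothetical name) that reads from /dev/urandom and prints hexadecimal characters, which avoids quoting problems in shells and YAML.

```shell
# Hypothetical helper: print a random hexadecimal password.
# $1: number of random bytes (default 16, which yields 32 hex characters)
gen_password() {
  bytes=${1:-16}
  head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Usage with the Secret created above (NAMESPACE is the release namespace):
# kubectl -n "$NAMESPACE" create secret generic tdp-clickhouse-credentials \
#   --from-literal=password="$(gen_password)"
```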

For LDAP access control, allowExternalAccess, hostIP, and credential best practices, see Security — ClickHouse.

Apply HA configuration

Terminal input
helm upgrade --install <release> \
oci://registry.tecnisys.com.br/tdp/charts/tdp-clickhouse \
-n <namespace> \
-f values-ha.yaml
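
Because the operator creates the data pods after Helm returns, a script that continues with further steps may want to block until they are ready. A minimal sketch using kubectl wait is shown below. The helper name and the 600s default timeout are assumptions, and the command fails immediately if no pods carry the label yet, so a short delay or retry may be needed.

```shell
# Hypothetical helper: wait until all pods of a CHI report Ready.
# $1: namespace, $2: CHI name (see kubectl get chi), $3: timeout (default 600s)
wait_for_clickhouse() {
  kubectl -n "$1" wait --for=condition=Ready pod \
    -l "clickhouse.altinity.com/chi=$2" --timeout="${3:-600s}"
}

# Usage:
# wait_for_clickhouse my-namespace my-chi 900s
```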

Verification after deploy

After deployment, you should see the ClickHouse pods, the CHI resource, and, if enabled, the Keeper pod:

Terminal input
kubectl -n <namespace> get pods
kubectl -n <namespace> get chi

Verify ClickHouse sees the cluster and all nodes:

Terminal input
POD=$(kubectl -n <namespace> get pods \
-l clickhouse.altinity.com/chi=<chi-name> \
-o jsonpath='{.items[0].metadata.name}')
kubectl -n <namespace> exec -it "$POD" -- \
clickhouse-client -q "SELECT cluster, shard_num, replica_num, host_name FROM system.clusters"

Expected output lists the configured cluster hosts. In the 2×2 example there are 4 hosts.
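
The host count can also be checked automatically. The sketch below (hypothetical helper name) counts rows in system.clusters for the cluster named tdp-clickhouse, as configured in the HA example, and compares the result with shards × replicas. Filtering by cluster name matters because system.clusters may list other clusters as well.

```shell
# Hypothetical helper: compare the number of hosts ClickHouse sees in the
# tdp-clickhouse cluster with the expected shards x replicas value.
# $1: namespace, $2: any ClickHouse pod, $3: expected host count
check_cluster_size() {
  ns=$1; pod=$2; expected=$3
  actual=$(kubectl -n "$ns" exec "$pod" -- clickhouse-client -q \
    "SELECT count() FROM system.clusters WHERE cluster = 'tdp-clickhouse'")
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $actual hosts in cluster tdp-clickhouse"
  else
    echo "MISMATCH: expected $expected hosts, found $actual"
    return 1
  fi
}

# Usage for the 2x2 example:
# check_cluster_size my-namespace "$POD" 4
```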

Main HA parameters

| Parameter | Description | Default |
| --- | --- | --- |
| tdp-clickhouse.clickhouse.shardsCount | Number of shards | 1 |
| tdp-clickhouse.clickhouse.replicasCount | Replicas per shard | 1 |
| tdp-clickhouse.clickhouse.antiAffinity | Spread pods across physical nodes | false |
| tdp-clickhouse.keeper.enabled | Enable ClickHouse Keeper | false |
| tdp-clickhouse.keeper.replicaCount | Number of Keeper pods | 1 |
| tdp-clickhouse.keeper.image | Keeper image | registry.tecnisys.com.br/community/images/clickhouse/clickhouse-keeper |
| tdp-clickhouse.keeper.localStorage.size | Keeper storage | 5Gi |
| tdp-clickhouse.keeper.zoneSpread | Spread Keeper across zones | false |

Integrations

For S3 bucket, connection Secret, S3 storage policy, and consumption by other TDP tools, see Integrations — ClickHouse.

Security

For LDAP authentication, password management, default user access control, and additional profiles, see Security — ClickHouse.

Access

Port-forward (HTTP)

Terminal input
kubectl -n <namespace> port-forward svc/<service-name> 8123:8123
# In a second terminal (port-forward keeps running in the foreground):
curl "http://localhost:8123/?query=SELECT%201"

<service-name> is the HTTP Service generated by the chart or release. Confirm with kubectl get svc -n <namespace>.
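
Once a password is set for the default user, unauthenticated requests are rejected. The sketch below sends a query through the forwarded port using HTTP basic authentication and a POST body, both of which the ClickHouse HTTP interface accepts. The helper name and the CH_USER, CH_PASSWORD, and CH_PORT environment variables are hypothetical.

```shell
# Hypothetical helper: run a query over the ClickHouse HTTP interface.
# Reads credentials from CH_USER / CH_PASSWORD and the local port from CH_PORT.
# $1: SQL query text
ch_query() {
  curl -sS --user "${CH_USER:-default}:${CH_PASSWORD:-}" \
    --data-binary "$1" "http://localhost:${CH_PORT:-8123}/"
}

# Usage (with the port-forward running in another terminal):
# CH_PASSWORD='...' ch_query "SELECT version()"
```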

Troubleshooting

Terminal input
kubectl -n <namespace> get pods
kubectl -n <namespace> get chi
kubectl -n <namespace> logs deployment/<operator-deployment-name>
kubectl -n <namespace> logs -l clickhouse.altinity.com/chi=<chi-name>
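
When pods stay Pending or the CHI does not reach a healthy state, the resource description, PVC status, and recent events usually narrow down the cause (for example an unbound volume or insufficient node resources). The helper below is a sketch with a hypothetical name that collects these in one call.

```shell
# Hypothetical helper: gather common diagnostics for a ClickHouse release.
# $1: namespace, $2: CHI name (see kubectl get chi)
ch_diagnostics() {
  ns=$1; chi=$2
  kubectl -n "$ns" describe chi "$chi"
  kubectl -n "$ns" get pvc
  kubectl -n "$ns" get events --sort-by=.lastTimestamp | tail -n 20
}

# Usage:
# ch_diagnostics my-namespace my-chi
```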

Uninstall

Terminal input
helm -n <namespace> uninstall <release>