Airflow Configuration

Learn more

See Apache Airflow — Concepts for a complete overview of the tool, its architecture and how it works.

The tdp-airflow chart is an umbrella chart: upstream Airflow options are nested under the tdp-airflow key.

The project packages Apache Airflow 3.0.2 for Kubernetes, with KubernetesExecutor as the default executor.

Default configuration

Default behavior of the tdp-airflow chart:

  • Executor: KubernetesExecutor
  • Database: embedded PostgreSQL (subchart), with tdp-airflow.postgresql.enabled: true by default
  • DAGs: PVC persistence enabled by default (tdp-airflow.dags.persistence.enabled: true, typical size 5Gi)
  • Logs: PVC persistence disabled by default (tdp-airflow.logs.persistence.enabled: false); enable only if the StorageClass supports the required access mode (generally RWX)
Terminal input
helm upgrade --install <release> oci://registry.tecnisys.com.br/tdp/charts/tdp-airflow \
  -n <namespace> --create-namespace \
  -f my-values.yaml
Tip

The first installation may take a few minutes (images, migrations, hooks). Use --wait with an appropriate timeout (e.g. --timeout 15m) to avoid premature Helm failure.

Access

The API Server is exposed via a Service (typically ClusterIP).

To access it from your machine (e.g. during development):

Terminal input
kubectl -n <namespace> port-forward svc/<release>-api-server 8080:8080

Database configuration

Airflow requires a relational database to store its metadata: registered DAGs, execution history, connections, variables, and users.

The chart supports two modes — to understand when to use each one, see Internal versus external PostgreSQL on the General Configuration page.

Embedded PostgreSQL (default)

Recommended for development and testing:

tdp-airflow:
  postgresql:
    enabled: true
Attention

The embedded PostgreSQL is convenient for development.

For production, prefer an external PostgreSQL.

External PostgreSQL

Disable the subchart and fill in tdp-airflow.data with the external PostgreSQL connection details (structure and comments available via helm show values).

When using the TDP integration helpers, also configure TDPConfigurations.externalDatabase:

tdp-airflow:
  postgresql:
    enabled: false

  data:
    metadataSecretName: "<release>-airflow-database"
    metadataConnection:
      user: airflow
      pass: ""
      protocol: postgresql
      host: "<postgres-service>.<namespace>.svc.cluster.local"
      port: 5432
      db: airflow
      sslmode: disable

TDPConfigurations:
  externalDatabase:
    enabled: true
    recreate: false
    externalSecret:
      releaseName: "<tdp-postgresql-release>"
      area: "<area>"

With TDPConfigurations.externalDatabase.enabled: true, the chart uses the provided release and area to align the database, user, and metadata Secret with the TDP stack.

Tip

Prefer setting metadataSecretName and leaving pass empty when TDP jobs are responsible for generating the connection Secret.
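For cases where the Secret is created by hand rather than by TDP jobs, the manifest below is a rough sketch. It follows the upstream Apache Airflow chart convention, in which the Secret named by metadataSecretName holds a single key, connection, containing the full database URI. Whether the tdp-airflow helpers expect exactly this layout is an assumption; check helm show values for your version before relying on it. All bracketed values are placeholders.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: "<release>-airflow-database"
  namespace: "<namespace>"
type: Opaque
stringData:
  # Key name "connection" follows the upstream chart convention (assumed to apply to tdp-airflow)
  connection: "postgresql://airflow:<password>@<postgres-service>.<namespace>.svc.cluster.local:5432/airflow?sslmode=disable"
```

Keep a manifest like this out of version control, or generate it from a secret manager, since it carries the database password.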

LDAP Authentication

LDAP is optional and disabled by default.

Configuration uses tdp-airflow.ldap and variables injected via tdp-airflow.extraEnv.

Detailed documentation

See Security — Airflow for the bind Secret and LDAP configuration examples.

DAG persistence (PVC)

By default the chart already creates a PVC for DAGs. Adjust the size, StorageClass, or access mode according to your cluster:

tdp-airflow:
  dags:
    persistence:
      enabled: true
      size: 5Gi
      storageClassName: ""
      accessMode: ReadWriteOnce
Access mode

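With KubernetesExecutor, multiple pods need to read the same DAGs.

An RWO volume can be mounted from only one node at a time. If the components that read the DAGs (scheduler, task pods, and so on) may be scheduled on different nodes, use a StorageClass that supports ReadWriteMany (RWX), as permitted by your environment.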
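As a minimal sketch of the RWX case, the values below reuse the same keys as the default block and only switch the access mode and StorageClass. The name <rwx-storage-class> is a placeholder, not a chart default; substitute a class in your cluster that actually supports ReadWriteMany.

```yaml
tdp-airflow:
  dags:
    persistence:
      enabled: true
      size: 5Gi
      # Placeholder: a StorageClass in your cluster that supports ReadWriteMany
      storageClassName: "<rwx-storage-class>"
      # Allows pods on different nodes to mount the same DAGs volume
      accessMode: ReadWriteMany
```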

Log persistence (PVC)

Chart default: enabled: false. If you enable it, evaluate the volume access mode: logs are written by multiple components and RWO is usually inadequate.

tdp-airflow:
  logs:
    persistence:
      enabled: false
      size: 10Gi
      storageClassName: ""
Attention

With RWO, log persistence is usually inadequate.

Prefer RWX or another logging strategy (e.g. an external logging stack) suited to your environment.
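If you do enable log persistence, the override could look like the sketch below. It uses only the keys shown in the default block; <rwx-storage-class> is a placeholder for a StorageClass that supports RWX. Whether the chart exposes a separate access-mode key for logs is not stated here, so confirm it with helm show values.

```yaml
tdp-airflow:
  logs:
    persistence:
      enabled: true
      size: 10Gi
      # Placeholder: a StorageClass in your cluster that supports ReadWriteMany
      storageClassName: "<rwx-storage-class>"
```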

S3 connection (TDP)

Configure this section when Airflow needs to access the cluster's object storage — for example, so that DAGs are stored in Apache Ozone instead of a PVC, or so that Airflow operators can read and write files to S3-compatible storage.

To understand the concept and when an S3 connection is required, see S3-compatible storage (Ozone) on the General Configuration page.

With TDPConfigurations.s3Connection.enabled: true, the chart creates a Secret with S3-compatible connection parameters for TDP integrations. The example uses placeholders; do not commit real credentials in plain text:

TDPConfigurations:
  s3Connection:
    enabled: true
    secretName: "<s3-connection-secret>"
    name: "<connection-name>"
    type: "aws"
    accessKey: "<s3-access-key>"
    secretKey: "<s3-secret-key>"
    uri: "https://<s3-endpoint>"

Additional environment variables

Use tdp-airflow.extraEnv to reference Secrets or inject other variables (e.g. LDAP password):

tdp-airflow:
  extraEnv: |
    - name: EXAMPLE_ENV
      valueFrom:
        secretKeyRef:
          name: <secret-name>
          key: <secret-key>

Python dependencies and image

With KubernetesExecutor, each task runs in a new pod.

Installing packages via an init container on every deploy can significantly increase startup time.

The standard production approach is a custom image with dependencies pre-installed; define the same image in tdp-airflow.images.airflow and tdp-airflow.images.pod_template.
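A sketch of pointing both keys at the same custom image is shown below. The repository and tag sub-keys follow the upstream Apache Airflow chart; that the umbrella chart passes them through unchanged is an assumption, and the image name and tag are placeholders.

```yaml
tdp-airflow:
  images:
    # Image used by the long-running Airflow components
    airflow:
      repository: "<registry>/<custom-airflow-image>"
      tag: "<tag>"
    # Image used for task pods launched by KubernetesExecutor; keep it identical
    pod_template:
      repository: "<registry>/<custom-airflow-image>"
      tag: "<tag>"
```

Keeping the two entries identical means tasks run with the same dependencies the scheduler used when parsing the DAG.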

For additional options (hooks, images), see the package values export (helm show values) and the upstream Apache Airflow chart documentation.

Main parameters

| Parameter | Description | Default value |
|---|---|---|
| tdp-airflow.enabled | Enable Airflow deployment | true |
| tdp-airflow.executor | Executor | KubernetesExecutor |
| tdp-airflow.config.core.default_timezone | Default timezone | America/Sao_Paulo |
| tdp-airflow.apiServer.service.type | API Server Service type | ClusterIP |
| tdp-airflow.postgresql.enabled | Embedded PostgreSQL | true |
| tdp-airflow.dags.persistence.enabled | DAGs PVC | true |
| tdp-airflow.dags.persistence.size | DAGs PVC size | 5Gi |
| tdp-airflow.logs.persistence.enabled | Logs PVC | false |
| tdp-airflow.logs.persistence.size | Logs PVC size | 10Gi |
| tdp-airflow.ldap.enabled | LDAP | false |
| tdp-airflow.ldap.apiServerConfig | Flask-AppBuilder snippet | See helm show values for your version |
| tdp-airflow.extraEnv | Extra variables (e.g. Secret) | "" |
| tdp-airflow.data | Metadata / external DB | Disabled by default; structure in helm show values |
| TDPConfigurations.externalDatabase.enabled | TDP external DB helpers | false |
| TDPConfigurations.s3Connection.enabled | TDP S3 connection Secret | false |
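Each dotted parameter name maps to nested keys in the values file. As an illustration only, the fragment below overrides two of the listed parameters; the values UTC and NodePort are arbitrary examples, not recommendations.

```yaml
tdp-airflow:
  config:
    core:
      # tdp-airflow.config.core.default_timezone
      default_timezone: "UTC"
  apiServer:
    service:
      # tdp-airflow.apiServer.service.type
      type: NodePort
```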