Security — Spark
The tdp-spark chart accesses S3/MinIO storage via the S3A protocol. The access credentials in values.yaml are placeholders only; supply the real values through a private values file or your environment's secret management mechanism.
S3/S3A credentials
S3/MinIO access credentials are injected via spark.sparkConf, hadoopConfig, or customSparkConfig.properties. With spark.sparkConf:
spark:
  sparkConf:
    "spark.hadoop.fs.s3a.access.key": "<ACCESS_KEY>"
    "spark.hadoop.fs.s3a.secret.key": "<SECRET_KEY>"
    "spark.hadoop.fs.s3a.endpoint": "http://<s3-endpoint>:<port>"
    "spark.hadoop.fs.s3a.path.style.access": "true"
    "spark.hadoop.fs.s3a.connection.ssl.enabled": "false"
Alternatively, the same properties can be defined in hadoopConfig (rendered as core-site.xml):
hadoopConfig:
  "fs.s3a.access.key": "<ACCESS_KEY>"
  "fs.s3a.secret.key": "<SECRET_KEY>"
  "fs.s3a.endpoint": "http://<s3-endpoint>:<port>"
  "fs.s3a.path.style.access": "true"
Or via customSparkConfig.properties (rendered as spark-defaults.conf):
customSparkConfig:
  properties: |
    spark.hadoop.fs.s3a.access.key=<ACCESS_KEY>
    spark.hadoop.fs.s3a.secret.key=<SECRET_KEY>
    spark.hadoop.fs.s3a.endpoint=http://<s3-endpoint>:<port>
    spark.hadoop.fs.s3a.path.style.access=true
Integration with Ozone (TDP)
For Ozone installed with TDP, the internal S3 Gateway endpoint follows the pattern:
http://<release>-s3g-rest.<namespace>.svc.cluster.local:9878
The credentials must match the ozone-s3-credentials Secret configured in Ozone. See Security — Apache Ozone for details on the Secret.
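As a sketch, the endpoint pattern above could be wired into the chart values as follows. The release name `ozone` and the namespace `tdp` are assumed names chosen for illustration, not values taken from the chart; substitute those of your own Ozone installation. The credential placeholders stand for the keys held in the ozone-s3-credentials Secret.

```yaml
# Illustrative values only: "ozone" (release) and "tdp" (namespace) are assumed names.
spark:
  sparkConf:
    # Internal S3 Gateway endpoint: http://<release>-s3g-rest.<namespace>.svc.cluster.local:9878
    "spark.hadoop.fs.s3a.endpoint": "http://ozone-s3g-rest.tdp.svc.cluster.local:9878"
    # Same path-style and plain-HTTP settings as in the generic S3A example
    "spark.hadoop.fs.s3a.path.style.access": "true"
    "spark.hadoop.fs.s3a.connection.ssl.enabled": "false"
    # Must match the keys stored in the ozone-s3-credentials Secret
    "spark.hadoop.fs.s3a.access.key": "<ACCESS_KEY>"
    "spark.hadoop.fs.s3a.secret.key": "<SECRET_KEY>"
```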
Integration with Jupyter and Airflow
The chart renders specific configuration files for integration with other services:
- integration.jupyter.sparkConfig → spark-defaults.conf for the JupyterLab environment
- integration.airflow.sparkConfig → default settings for job submission via Airflow
These configurations also inherit the S3A properties defined above when present.
Best practices
| Aspect | Recommendation |
|---|---|
| Credentials | Do not commit access.key and secret.key values to a Git repository |
| Environment separation | Use distinct values files for development and production |
| Rotation | Update the values and run helm upgrade to apply new credentials |
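
One way to follow these recommendations is to keep the credentials in a separate values file that is never committed. The sketch below assumes a hypothetical file name, `values-private.yaml`, excluded from version control (for example through `.gitignore`). It overrides only the two secret properties; everything else stays in the versioned values file.

```yaml
# values-private.yaml (hypothetical file name; keep it out of Git)
# Overrides only the credential placeholders from the versioned values file.
# Replace the placeholders with the real keys in your local copy.
spark:
  sparkConf:
    "spark.hadoop.fs.s3a.access.key": "<ACCESS_KEY>"
    "spark.hadoop.fs.s3a.secret.key": "<SECRET_KEY>"
```

When several values files are passed to Helm, later files take precedence. Passing this file last, for example `helm upgrade <release> <chart> -f values.yaml -f values-private.yaml`, lets it override the placeholders. Rotating credentials then amounts to editing this one file and running the upgrade again.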
Troubleshooting
| Problem | Probable cause | Solution |
|---|---|---|
| Spark jobs fail with S3 access error | Missing credentials or incorrect endpoint | Check spark.hadoop.fs.s3a.* in the applied values |
| Connection refused to S3 | Incorrect S3 service address | Verify the Ozone service name and namespace in the cluster |
| AccessDeniedException | Credentials without permission on the bucket | Review permissions in MinIO/Ozone for the configured user |
For the full list of parameters, use helm show values on the version of the chart you installed.