Rollback
This guide describes how to revert upgrades of TDP Kubernetes components, both via Helm and via ArgoCD. A rollback quickly restores a previous version after failures or unexpected behavior following an upgrade.
Before performing any rollback, back up persistent data. Depending on the component, a rollback may cause data loss if schema migrations are incompatible with previous versions.
Data Backup before Rollback
Before starting the rollback, protect persistent data by creating PVC snapshots.
Identify Component PVCs
kubectl get pvc -n <namespace> -l app.kubernetes.io/instance=<release>
Create Volume Snapshots
kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: <release>-pre-rollback-snapshot
  namespace: <namespace>
spec:
  volumeSnapshotClassName: <snapshot-class>
  source:
    persistentVolumeClaimName: <pvc-name>
EOF
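When a release owns several PVCs, the manifest above has to be repeated once per claim. The following is a minimal sketch of one way to generate those manifests; `snapshot_manifests` is a hypothetical helper name and the `-pre-rollback` suffix is an illustrative naming choice, not a TDP convention.

```shell
#!/bin/sh
# Print one VolumeSnapshot manifest per PVC name read from stdin.
# Arguments: <namespace> <snapshot-class>.
# The "-pre-rollback" name suffix is an assumption made for this sketch.
snapshot_manifests() (
  namespace=$1
  snapshot_class=$2
  while IFS= read -r pvc; do
    # Skip blank lines in the input.
    [ -n "$pvc" ] || continue
    cat <<EOF
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: ${pvc}-pre-rollback
  namespace: ${namespace}
spec:
  volumeSnapshotClassName: ${snapshot_class}
  source:
    persistentVolumeClaimName: ${pvc}
EOF
  done
)

# Usage (placeholders as in the commands above):
#   kubectl get pvc -n <namespace> -l app.kubernetes.io/instance=<release> \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' \
#     | snapshot_manifests <namespace> <snapshot-class> \
#     | kubectl apply -f -
```

Reviewing the generated YAML before piping it into kubectl apply is a reasonable precaution, since the helper does not check that the PVCs or the snapshot class exist.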
Verify Snapshot Status
kubectl get volumesnapshot -n <namespace>
Ensure the snapshot has readyToUse: true status before proceeding.
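Snapshots are not always ready immediately, so the check may need to be repeated. As a sketch, the polling can be scripted as below; `wait_for_snapshot` is a hypothetical helper, and the default of 30 attempts at 10-second intervals is an arbitrary assumption to adjust for your storage backend.

```shell
#!/bin/sh
# Poll a VolumeSnapshot until .status.readyToUse is "true".
# Arguments: <snapshot-name> <namespace> [attempts] [delay-seconds].
# The defaults (30 attempts, 10 seconds apart) are illustrative only.
wait_for_snapshot() (
  name=$1
  namespace=$2
  attempts=${3:-30}
  delay=${4:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    ready=$(kubectl get volumesnapshot "$name" -n "$namespace" \
      -o jsonpath='{.status.readyToUse}' 2>/dev/null)
    if [ "$ready" = "true" ]; then
      echo "snapshot $name is ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "snapshot $name not ready after $attempts attempts" >&2
  return 1
)

# Usage:
#   wait_for_snapshot <release>-pre-rollback-snapshot <namespace> || exit 1
```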
Rollback via Helm
Check Revision History
Use the helm history command to view all revisions of a release:
helm history <release> -n <namespace>
The output will display a table with columns: REVISION, UPDATED, STATUS, CHART, APP VERSION, and DESCRIPTION. Identify the revision you want to revert to.
Example output:
REVISION  UPDATED              STATUS      CHART            APP VERSION  DESCRIPTION
1         2025-01-15 10:30:00  superseded  tdp-kafka-1.0.0  3.5.1        Install complete
2         2025-02-10 14:00:00  superseded  tdp-kafka-1.0.0  3.5.1        Upgrade complete
3         2025-03-01 09:15:00  deployed    tdp-kafka-2.0.0  3.6.0        Upgrade complete
Rollback to a Specific Revision
To revert to a specific revision:
helm rollback <release> <revision> -n <namespace>
For example, to revert tdp-kafka to revision 2:
helm rollback tdp-kafka 2 -n <namespace>
Rollback to the Previous Revision
To revert to the immediately previous revision (without specifying the number):
helm rollback <release> -n <namespace>
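If you prefer to pass the target revision explicitly, for example to log it before rolling back, it can be read from the history table. This is a minimal sketch that assumes the default table output of helm history, where the revision number is the first column; `previous_revision` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `helm history` table output on stdin and print the revision number
# listed immediately before the most recent one. Exits non-zero when the
# history contains fewer than two revisions.
previous_revision() {
  awk '$1 ~ /^[0-9]+$/ { prev = last; last = $1 }
       END { if (prev == "") exit 1; print prev }'
}

# Usage:
#   target=$(helm history <release> -n <namespace> | previous_revision) || exit 1
#   echo "rolling back to revision $target"
#   helm rollback <release> "$target" -n <namespace>
```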
Rollback with Pod Recreation
In some cases, it may be necessary to force recreation of all pods during rollback:
helm rollback <release> <revision> -n <namespace> --recreate-pods
Verify Rollback Result
After rollback, confirm the release was reverted successfully:
helm status <release> -n <namespace>
helm history <release> -n <namespace>
The last entry in the history should indicate STATUS: deployed with the description Rollback to <revision>.
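Both checks can be combined into a single pass/fail step, for instance in a runbook script. The sketch below assumes the plain-text output of helm status (a line starting with STATUS:) and the default description text written by helm rollback; `verify_rollback` is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only if the release is deployed and the newest history entry
# was created by a rollback. Arguments: <release> <namespace>.
verify_rollback() (
  release=$1
  namespace=$2
  # helm status prints a "STATUS: <state>" line in its text output.
  helm status "$release" -n "$namespace" \
    | grep -q '^STATUS: deployed' || return 1
  # The last row of the history table describes the newest revision.
  helm history "$release" -n "$namespace" \
    | tail -n 1 | grep -q 'Rollback to'
)

# Usage:
#   verify_rollback <release> <namespace> && echo "rollback verified"
```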
Rollback via ArgoCD
Rollback via ArgoCD consists of reverting the Git repository state to the previous version of the manifests.
Revert the Manifest in Git
The recommended approach is to revert the commit that introduced the upgrade:
git log --oneline -10 # Identify the upgrade commit
git revert <commit-hash>
git push origin main
ArgoCD will detect the change in the repository and synchronize the cluster automatically (if automatic synchronization is enabled).
Force Synchronization after Reversion
If automatic synchronization is disabled, force synchronization manually:
argocd app sync tdp-kafka
Rollback via ArgoCD CLI
ArgoCD also allows reverting the last synchronization operation:
argocd app rollback tdp-kafka
ArgoCD CLI rollback only reverts the cluster state; the Git repository is not changed. ArgoCD refuses to roll back an application while automated sync is enabled, and once automated sync is turned back on it restores the undesired version from Git. Always revert in Git as well to ensure consistency.
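One way to keep automated sync from interfering with a CLI rollback is to switch the application to manual sync first. The sketch below is an illustration under that assumption; `argocd_cli_rollback` is a hypothetical helper name, and the history ID is taken from the output of argocd app history.

```shell
#!/bin/sh
# Disable automated sync, then roll the application back to a history ID.
# Arguments: <app-name> <history-id>.
argocd_cli_rollback() (
  app=$1
  history_id=$2
  # Stop here if the sync policy cannot be changed.
  argocd app set "$app" --sync-policy none || return 1
  argocd app rollback "$app" "$history_id"
)

# Usage:
#   argocd app history tdp-kafka        # list deployment history IDs
#   argocd_cli_rollback tdp-kafka <history-id>
```

Automated sync stays disabled after this helper runs. Re-enable it only after the Git repository has been reverted to match the cluster.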
Verify State after Rollback
argocd app get tdp-kafka
The status should indicate Synced and Healthy after rollback completion.
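Instead of re-running the status command by hand, the CLI can block until the application reaches that state. A minimal sketch, assuming a manual sync is wanted first; `argocd_sync_and_wait` is a hypothetical helper name and the 300-second default timeout is an arbitrary choice.

```shell
#!/bin/sh
# Trigger a sync, then wait until the application is Synced and Healthy.
# Arguments: <app-name> [timeout-seconds] (default 300, illustrative).
argocd_sync_and_wait() (
  app=$1
  timeout=${2:-300}
  argocd app sync "$app" || return 1
  # Returns non-zero if the app is not Synced and Healthy within the timeout.
  argocd app wait "$app" --sync --health --timeout "$timeout"
)

# Usage:
#   argocd_sync_and_wait tdp-kafka 600 || echo "rollback did not converge" >&2
```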
Common Rollback Scenarios
Scenario 1: Pod Initialization Failure
Symptom: pods remain in CrashLoopBackOff or Error state after upgrade.
Diagnosis:
kubectl describe pod -n <namespace> -l app.kubernetes.io/instance=<release>
kubectl logs -n <namespace> -l app.kubernetes.io/instance=<release> --previous
Solution: roll back to the previous version:
helm rollback <release> -n <namespace>
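To confirm the symptom before and after the rollback, the failing pods can be filtered out of the pod listing. This sketch assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...); `failing_pods` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the names
# of pods whose STATUS column is CrashLoopBackOff or Error.
failing_pods() {
  awk '$3 == "CrashLoopBackOff" || $3 == "Error" { print $1 }'
}

# Usage:
#   kubectl get pods -n <namespace> \
#     -l app.kubernetes.io/instance=<release> --no-headers | failing_pods
```

An empty result after the rollback suggests the pods have left the failing states, although they may still be starting up.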
Scenario 2: Database Schema Incompatibility
Symptom: schema migration errors in component logs.
Diagnosis:
kubectl logs -n <namespace> -l app.kubernetes.io/instance=<release> | grep -i "migration\|schema\|database"
Solution:
- Restore the database backup (PVC snapshot)
- Perform release rollback:
  helm rollback <release> -n <namespace>
- Verify the component initializes correctly with the restored database
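Restoring from a PVC snapshot means creating a PVC whose dataSource points at the VolumeSnapshot, in the same namespace as the snapshot. The sketch below only prints such a manifest. `pvc_from_snapshot` is a hypothetical helper, and the ReadWriteOnce access mode, the storage class, and the size are assumptions that must match the original claim. It also assumes that no PVC with the target name exists, which usually means scaling the workload down and removing the old claim first, or restoring under a new name.

```shell
#!/bin/sh
# Print a PVC manifest that restores data from an existing VolumeSnapshot.
# Arguments: <pvc-name> <namespace> <snapshot-name> <storage-class> <size>.
# ReadWriteOnce is an assumption; use the access mode of the original PVC.
pvc_from_snapshot() (
  pvc=$1
  namespace=$2
  snapshot=$3
  storage_class=$4
  size=$5
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${pvc}
  namespace: ${namespace}
spec:
  storageClassName: ${storage_class}
  dataSource:
    name: ${snapshot}
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: ${size}
EOF
)

# Usage:
#   pvc_from_snapshot <pvc-name> <namespace> \
#     <release>-pre-rollback-snapshot <storage-class> <size> \
#     | kubectl apply -f -
```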
Scenario 3: Inter-Component Connectivity Issues
Symptom: components cannot communicate after partial upgrade.
Diagnosis:
kubectl get endpoints -n <namespace>
kubectl get svc -n <namespace>
Solution: verify all dependent components were upgraded to compatible versions. If necessary, roll back all affected components.
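A Service whose endpoints list is empty is a common sign that its backing pods are not ready or that its selector no longer matches them. As a sketch, such services can be picked out of the endpoints listing, assuming the default kubectl output where an empty ENDPOINTS column is printed as `<none>`; `services_without_endpoints` is a hypothetical helper name.

```shell
#!/bin/sh
# Read `kubectl get endpoints --no-headers` output on stdin and print the
# names of entries that have no backing addresses.
services_without_endpoints() {
  awk '$2 == "<none>" { print $1 }'
}

# Usage:
#   kubectl get endpoints -n <namespace> --no-headers | services_without_endpoints
```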
Scenario 4: Performance Degradation
Symptom: elevated latency or excessive resource consumption after upgrade.
Diagnosis:
kubectl top pods -n <namespace> -l app.kubernetes.io/instance=<release>
kubectl describe pod -n <namespace> -l app.kubernetes.io/instance=<release>
Solution: check configuration differences between versions and adjust resources in values.yaml. If the problem persists, perform a rollback.
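One way to check the configuration differences is to compare the user-supplied values stored with two revisions of the release. The sketch below uses helm get values with its --revision flag; `diff_release_values` is a hypothetical helper name. It covers only values passed at install or upgrade time, not changes to chart defaults.

```shell
#!/bin/sh
# Show a unified diff of the user-supplied values of two release revisions.
# Arguments: <release> <namespace> <old-revision> <new-revision>.
# Exit status follows diff: 0 if identical, 1 if different, >1 on error.
diff_release_values() (
  release=$1
  namespace=$2
  old=$3
  new=$4
  tmp=$(mktemp -d) || return 2
  status=0
  helm get values "$release" -n "$namespace" --revision "$old" > "$tmp/old.yaml" &&
    helm get values "$release" -n "$namespace" --revision "$new" > "$tmp/new.yaml" &&
    diff -u "$tmp/old.yaml" "$tmp/new.yaml" || status=$?
  rm -rf "$tmp"
  return "$status"
)

# Usage:
#   diff_release_values <release> <namespace> 2 3
```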
Partial Rollback (Individual Component)
In scenarios where only one component has issues after a general upgrade, it is possible to roll back only that component.
Identify the Problematic Component
kubectl get pods -n <namespace> --field-selector=status.phase!=Running
helm list -n <namespace> --failed
Individual Rollback via Helm
helm rollback <problematic-release> -n <namespace>
Verify Compatibility
After partial rollback, verify the reverted component is compatible with the other updated versions:
kubectl logs -n <namespace> -l app.kubernetes.io/instance=<problematic-release> --tail=50
kubectl get pods -n <namespace>
Partial rollback may cause incompatibilities between components if there are version dependencies. Consult the compatibility matrix before proceeding.
Rollback Best Practices
- Always back up persistent data before starting rollback
- Document the cause of the rollback for future reference
- Test upgrades in a staging environment before upgrading production to reduce the need for rollbacks
- Monitor logs after rollback to confirm stability
- Keep Git consistent with the cluster state -- if using ArgoCD, always revert in Git
- Do not skip revisions -- in case of multiple rollbacks, revert one revision at a time