Version 3.0.0

Rollback

This guide describes how to revert upgrades of TDP Kubernetes components, via Helm or ArgoCD. Rollback lets you quickly restore a previous version after failures or unexpected behavior following an upgrade.

Attention

Before any rollback, back up persistent data. Depending on the component, a rollback can cause data loss if schema migrations are incompatible with earlier versions.

Data backup before rollback

Before starting rollback, protect persistent data by creating PVC snapshots.

Identify the component’s PVCs

Terminal input
kubectl get pvc -n <namespace> -l app.kubernetes.io/instance=<release>

Create volume snapshots

Terminal input
kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: <release>-pre-rollback-snapshot
  namespace: <namespace>
spec:
  volumeSnapshotClassName: <snapshot-class>
  source:
    persistentVolumeClaimName: <pvc-name>
EOF

Check snapshot status

Terminal input
kubectl get volumesnapshot -n <namespace>

Ensure the snapshot has readyToUse: true before proceeding.
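The readiness check can also be scripted instead of polled by hand. The following is a minimal sketch, assuming kubectl 1.23 or later (which supports `--for=jsonpath`); `wait_for_snapshot` is a hypothetical helper name and the 300s default timeout is an arbitrary choice.

```shell
# Block until the snapshot reports readyToUse=true, or fail after the timeout.
# Arguments: <snapshot-name> <namespace> [timeout, default 300s]
wait_for_snapshot() {
  kubectl wait "volumesnapshot/$1" -n "$2" \
    --for=jsonpath='{.status.readyToUse}'=true \
    --timeout="${3:-300s}"
}
```

Usage: wait_for_snapshot <release>-pre-rollback-snapshot <namespace>. A non-zero exit status means the snapshot did not become ready in time, so the rollback should not proceed.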

Rollback via Helm

List revision history

Use helm history to list all revisions of a release:

Terminal input
helm history <release> -n <namespace>

The output shows columns: REVISION, UPDATED, STATUS, CHART, APP VERSION, and DESCRIPTION. Identify the revision to roll back to.

Example output:

REVISION    UPDATED                STATUS        CHART              APP VERSION    DESCRIPTION
1           2025-01-15 10:30:00    superseded    tdp-kafka-1.0.0    3.5.1          Install complete
2           2025-02-10 14:00:00    superseded    tdp-kafka-1.0.0    3.5.1          Upgrade complete
3           2025-03-01 09:15:00    deployed      tdp-kafka-2.0.0    3.6.0          Upgrade complete
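When the history is long, the target revision can be picked out by chart version rather than by eye. The sketch below reads the table printed by helm history on standard input; `latest_revision_for_chart` is a hypothetical helper, and it assumes the chart name and version appear as a single whitespace-delimited field, as in the example output.

```shell
# Print the newest revision whose CHART column equals the given chart string.
# Reads `helm history` table output on stdin; exits 1 if nothing matches.
latest_revision_for_chart() {
  awk -v chart="$1" '
    NR > 1 {
      # Skip the header row; scan every field after REVISION for the chart.
      for (i = 2; i <= NF; i++) if ($i == chart) rev = $1
    }
    END { if (rev != "") print rev; else exit 1 }
  '
}
```

Usage: helm history tdp-kafka -n <namespace> | latest_revision_for_chart tdp-kafka-1.0.0. With the example history above, this prints 2.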

Roll back to a specific revision

To revert to a specific revision:

Terminal input
helm rollback <release> <revision> -n <namespace>

For example, to roll tdp-kafka back to revision 2:

Terminal input
helm rollback tdp-kafka 2 -n <namespace>

Roll back to the previous revision

To revert to the immediately previous revision (without specifying the number):

Terminal input
helm rollback <release> -n <namespace>

Rollback with pod recreation

Sometimes you need to force all pods to be recreated during rollback:

Terminal input
helm rollback <release> <revision> -n <namespace> --recreate-pods

Verify rollback result

After rollback, confirm the release was reverted successfully:

Terminal input
helm status <release> -n <namespace>
helm history <release> -n <namespace>

The latest history entry should show STATUS: deployed with the description "Rollback to" followed by the target revision number.
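This check can be automated, for example in a pipeline step. The sketch below inspects the last row of the helm history table on standard input; `rollback_succeeded` is a hypothetical helper, and it assumes the description text has the form "Rollback to N" as shown by Helm 3.

```shell
# Succeed (exit 0) only if the newest history row is a deployed rollback.
# Reads `helm history` table output on stdin.
rollback_succeeded() {
  awk '
    NR > 1 && NF > 0 { last = $0 }   # remember the last non-empty data row
    END {
      if (last ~ /[ \t]deployed[ \t]/ && last ~ /Rollback to [0-9]+/) exit 0
      exit 1
    }
  '
}
```

Usage: helm history <release> -n <namespace> | rollback_succeeded && echo "rollback confirmed".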

Rollback via ArgoCD

ArgoCD rollback means reverting the Git repository to the previous manifest state.

Revert the manifest in Git

The recommended approach is to revert the commit that introduced the upgrade:

Terminal input
git log --oneline -10  # Identify the upgrade commit
git revert <commit-hash>
git push origin main

ArgoCD will detect the repository change and sync the cluster automatically (if automatic sync is enabled).

Force sync after revert

If automatic sync is disabled, trigger a manual sync:

Terminal input
argocd app sync tdp-kafka

Rollback via Argo CD CLI

Argo CD can also revert the last sync operation:

Terminal input
argocd app rollback tdp-kafka

Note

Argo CD CLI rollback only reverts cluster state; the Git repository is unchanged. Argo CD rejects a rollback while automated sync is enabled, and once automated sync is turned back on it restores the undesired version from Git. Always revert in Git as well for consistency.
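To target a specific earlier deployment rather than the previous one, the rollback command accepts a history ID. The sketch below is one possible sequence, not a prescribed procedure: `argocd_rollback_to` is a hypothetical helper, and it assumes that switching the application to manual sync is acceptable, since Argo CD rejects rollbacks while automated sync is enabled.

```shell
# Roll an Argo CD application back to a specific history ID.
# Arguments: <app-name> <history-id>
# History IDs are listed by: argocd app history <app-name>
argocd_rollback_to() {
  # Assumption: manual sync is acceptable until Git has been reverted.
  argocd app set "$1" --sync-policy none &&
    argocd app rollback "$1" "$2"
}
```

Usage: argocd_rollback_to tdp-kafka <history-id>. Re-enable automated sync only after the Git revert has been pushed.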

Verify state after rollback

Terminal input
argocd app get tdp-kafka

Status should show Synced and Healthy after rollback completes.
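Instead of re-running the status command until it settles, the CLI can block until the application converges. A minimal sketch follows; `wait_for_app` is a hypothetical helper and the 300-second default is an arbitrary choice.

```shell
# Wait until the application is both Synced and Healthy, or time out.
# Arguments: <app-name> [timeout in seconds, default 300]
wait_for_app() {
  argocd app wait "$1" --sync --health --timeout "${2:-300}"
}
```

Usage: wait_for_app tdp-kafka 600. A non-zero exit status indicates the application did not reach Synced and Healthy within the timeout.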

Common rollback scenarios

Scenario 1: Pods fail to start

Symptom: Pods stay in CrashLoopBackOff or Error after upgrade.

Diagnosis:

Terminal input
kubectl describe pod -n <namespace> -l app.kubernetes.io/instance=<release>
kubectl logs -n <namespace> -l app.kubernetes.io/instance=<release> --previous

Solution: Roll back to the previous version:

Terminal input
helm rollback <release> -n <namespace>

Scenario 2: Database schema incompatibility

Symptom: Schema migration errors in component logs.

Diagnosis:

Terminal input
kubectl logs -n <namespace> -l app.kubernetes.io/instance=<release> | grep -i "migration\|schema\|database"

Solution:

  1. Restore the database from a backup (PVC snapshot) that predates the schema migration

  2. Roll back the release:

    Terminal input
    helm rollback <release> -n <namespace>

  3. Verify the component starts correctly with the restored database
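Restoring from a VolumeSnapshot means creating a new PVC whose dataSource points at the snapshot. The sketch below only renders the manifest so it can be reviewed before being applied; `render_restore_pvc` is a hypothetical helper, and the ReadWriteOnce access mode is an assumption that should be matched to the original claim.

```shell
# Print a PVC manifest that restores from a VolumeSnapshot.
# Arguments: <new-pvc-name> <namespace> <snapshot-name> <storage-class> <size>
render_restore_pvc() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $1
  namespace: $2
spec:
  storageClassName: $4
  dataSource:
    name: $3
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce   # assumption: match the original PVC's access mode
  resources:
    requests:
      storage: $5     # must be at least the size of the snapshot's source volume
EOF
}
```

Usage: render_restore_pvc <pvc-name> <namespace> <snapshot-name> <storage-class> <size> | kubectl apply -f -. For a StatefulSet, the claim name generally has to match the name derived from its volumeClaimTemplates, which usually means scaling the workload down and deleting the existing PVC before applying the restored one.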

Scenario 3: Connectivity issues between components

Symptom: Components cannot communicate after a partial upgrade.

Diagnosis:

Terminal input
kubectl get endpoints -n <namespace>
kubectl get svc -n <namespace>

Solution: Ensure all dependent components are on compatible versions. If needed, roll back all affected components.

Scenario 4: Performance degradation

Symptom: High latency or excessive resource usage after upgrade.

Diagnosis:

Terminal input
kubectl top pods -n <namespace> -l app.kubernetes.io/instance=<release>
kubectl describe pod -n <namespace> -l app.kubernetes.io/instance=<release>

Solution: Compare configuration between versions and adjust resources in values.yaml. If the issue persists, perform a rollback.
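One way to compare configuration between versions is to diff the values stored with each Helm revision. The sketch below is illustrative: `diff_revision_values` is a hypothetical helper, and it compares only user-supplied values (add `--all` to the helm get values calls to include chart defaults).

```shell
# Show a unified diff of the values used by two revisions of a release.
# Arguments: <release> <namespace> <old-revision> <new-revision>
# Exit status follows diff: 0 = identical, 1 = differences found.
diff_revision_values() {
  _old=$(mktemp) && _new=$(mktemp) || return 2
  helm get values "$1" -n "$2" --revision "$3" -o yaml > "$_old"
  helm get values "$1" -n "$2" --revision "$4" -o yaml > "$_new"
  diff -u "$_old" "$_new"
  _status=$?
  rm -f "$_old" "$_new"   # clean up temporary files before returning
  return "$_status"
}
```

Usage: diff_revision_values tdp-kafka <namespace> 2 3. Lines prefixed with - come from the older revision and lines prefixed with + from the newer one.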

Partial rollback (single component)

When only one component fails after a broader upgrade, you can roll back just that component.

Identify the failing component

Terminal input
kubectl get pods -n <namespace> --field-selector=status.phase!=Running
helm list -n <namespace> --failed
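A pod in CrashLoopBackOff still reports phase Running, so the field selector above can miss it. As a complement, the sketch below filters the default kubectl get pods table for pods that are not fully ready; `unhealthy_pods` is a hypothetical helper and it assumes the default column order (NAME, READY, STATUS).

```shell
# Print the names of pods that are not fully ready or not in Running status.
# Reads `kubectl get pods` table output on stdin; Completed pods are ignored.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column has the form <ready>/<total>
    if ($3 != "Completed" && (ready[1] != ready[2] || $3 != "Running")) print $1
  }'
}
```

Usage: kubectl get pods -n <namespace> | unhealthy_pods.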

Partial rollback via Helm

Terminal input
helm rollback <failing-release> -n <namespace>

Check compatibility

After partial rollback, verify the reverted component is compatible with the other upgraded versions:

Terminal input
kubectl logs -n <namespace> -l app.kubernetes.io/instance=<failing-release> --tail=50
kubectl get pods -n <namespace>

Attention

Partial rollback can cause version skew between components when dependencies exist. Review the compatibility matrix before proceeding.

Rollback best practices

  1. Always back up persistent data before starting rollback
  2. Document the cause of the rollback for future reference
  3. Test upgrades in staging before production to reduce the need for rollbacks
  4. Monitor logs after rollback to confirm stability
  5. Keep Git aligned with cluster state — with ArgoCD, always revert in Git
  6. Do not skip revisions — for multiple rollbacks, revert one revision at a time