Backup and Restore

This guide covers the backup strategy for a cluster managed by this project.

What needs backing up?

The key insight of a GitOps-managed cluster is that the Git repository is the backup for all configuration: ArgoCD can reconstruct every declaratively managed resource from the repo.

What is not in Git and needs separate backup:

| Data | Where it lives | Backup approach |
| --- | --- | --- |
| Persistent Volume data | Longhorn volumes on NVMe | Longhorn snapshots/backups |
| Sealed Secrets private key | kube-system namespace | Manual export |
| Admin passwords | admin-auth secrets (manual) | Re-create from password manager |
| ArgoCD initial admin secret | argo-cd namespace | Regenerated on install |

Longhorn volume snapshots

Longhorn supports both snapshots (local, on the same nodes) and backups (to external storage like NFS or S3).

Create a snapshot

Via the Longhorn UI at https://longhorn.your-domain.com:

  1. Navigate to Volumes.

  2. Click on the volume name.

  3. Click Take Snapshot.

Via kubectl:

kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-snapshot
  namespace: my-namespace
spec:
  volumeSnapshotClassName: longhorn-snapshot
  source:
    persistentVolumeClaimName: my-pvc
EOF

The longhorn-snapshot VolumeSnapshotClass is deployed by this project at kubernetes-services/additions/longhorn/volume-snapshot-class.yaml.
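
To restore from a snapshot, Kubernetes lets a new PVC name a VolumeSnapshot as its dataSource. The following is a minimal sketch, not something shipped by this project: the function name render_restore_pvc, the longhorn StorageClass name, the ReadWriteOnce access mode, and the example names and size are all assumptions to adapt. The function only prints a manifest, so nothing changes in the cluster until its output is piped to kubectl.

```shell
# Print a PVC manifest that restores from an existing VolumeSnapshot.
# Args: <snapshot-name> <new-pvc-name> <namespace> <size>
# Assumption: the StorageClass is named "longhorn"; override with STORAGE_CLASS.
render_restore_pvc() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $2
  namespace: $3
spec:
  storageClassName: ${STORAGE_CLASS:-longhorn}
  dataSource:
    name: $1
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: $4
EOF
}

# Example (the size must be at least that of the source volume):
#   render_restore_pvc my-snapshot my-pvc-restored my-namespace 1Gi | kubectl apply -f -
```

Before applying, it is worth confirming that the snapshot reports readyToUse: true, for example with kubectl get volumesnapshot my-snapshot -n my-namespace.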

Set up recurring snapshots

In the Longhorn UI, configure recurring snapshots under Volume → Recurring Jobs. This can be set per-volume or globally.
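
Recurring jobs can also be expressed as a Longhorn RecurringJob custom resource, which would let the schedule live in Git alongside the rest of the configuration. The sketch below makes several assumptions that should be checked against the installed Longhorn version: the longhorn.io/v1beta2 API version, the longhorn-system namespace, and the use of the default group (which, as far as Longhorn documents it, covers volumes that have no explicit recurring-job assignment). The function name and the example schedule are hypothetical.

```shell
# Print a RecurringJob manifest for scheduled snapshots.
# Args: <job-name> <cron-expression> <snapshots-to-retain>
# Assumptions: Longhorn runs in longhorn-system and serves longhorn.io/v1beta2.
render_recurring_snapshot_job() {
  cat <<EOF
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: $1
  namespace: longhorn-system
spec:
  task: snapshot
  cron: "$2"
  retain: $3
  concurrency: 1
  groups:
    - default
EOF
}

# Example: nightly at 02:00, keeping the last 7 snapshots.
#   render_recurring_snapshot_job nightly-snapshot "0 2 * * *" 7 | kubectl apply -f -
```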

Back up to NFS

To back up Longhorn volumes to an NFS target:

  1. In the Longhorn UI, go to Settings → Backup Target.

  2. Set the backup target URL: nfs://nas.local:/backup/longhorn

  3. Save.

Now you can create backups (not just snapshots) that are stored externally.
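
Backups can be triggered from kubectl as well. Longhorn's CSI snapshot support selects the behaviour through a type parameter on the VolumeSnapshotClass: snap for in-cluster snapshots and bak for backups to the configured backup target. The sketch below prints a class of the second kind. The class name longhorn-backup is hypothetical and not part of this project, and the driver name and parameter should be verified against the installed Longhorn version.

```shell
# Print a VolumeSnapshotClass whose snapshots become Longhorn backups.
# Args: <class-name>
render_backup_snapshot_class() {
  cat <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: $1
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: bak
EOF
}

# Example:
#   render_backup_snapshot_class longhorn-backup | kubectl apply -f -
```

A VolumeSnapshot that sets volumeSnapshotClassName to this class would then be stored on the backup target rather than on the cluster nodes.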

Sealed Secrets key backup

If you rebuild the cluster, the sealed-secrets controller generates a new keypair. Unless you restore the original private key, the existing SealedSecret YAML files in the repo can no longer be decrypted.

Export the key

kubectl get secret -n kube-system \
  -l sealedsecrets.bitnami.com/sealed-secrets-key \
  -o yaml > sealed-secrets-key-backup.yaml

Warning

Store this file securely (e.g. in a password manager or on an encrypted drive), never in Git. It can decrypt all your SealedSecrets.
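
One pitfall with the export: when the label selector matches nothing, kubectl still exits successfully and writes an empty list, so the backup file exists but contains no key. The helper below is a small sanity check under that assumption. The function name check_key_backup is hypothetical; it only looks for a tls.key entry (the controller's keys are stored as TLS secrets) and restricts the file permissions.

```shell
# Sanity-check a sealed-secrets key export and lock down its permissions.
# Args: <backup-file>
check_key_backup() {
  backup_file=${1:-}
  if [ ! -s "$backup_file" ]; then
    echo "missing or empty: $backup_file" >&2
    return 1
  fi
  # A usable export contains at least one tls.key entry.
  if ! grep -q 'tls\.key:' "$backup_file"; then
    echo "no tls.key found in $backup_file" >&2
    return 1
  fi
  # Readable by the owner only.
  chmod 600 "$backup_file" || return 1
  echo "ok: $backup_file"
}

# Example:
#   check_key_backup sealed-secrets-key-backup.yaml
```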

Restore after rebuild

Before ArgoCD deploys the sealed-secrets controller on a new cluster:

kubectl apply -f sealed-secrets-key-backup.yaml

The new controller picks up the restored key when it starts and can then decrypt the existing SealedSecrets.
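
If the controller was already running when the key was applied, it will not notice the restored key until it restarts. The sketch below checks that a key secret is present and then deletes the controller pod so that its Deployment recreates it. The function name is hypothetical, and the default pod label name=sealed-secrets-controller is an assumption taken from the upstream manifests; Helm-based installs may label the pod differently, so check with kubectl get pods -n kube-system --show-labels first.

```shell
# Restart the sealed-secrets controller so it reloads keys from the cluster.
# Args: [pod-label-selector]  (default is an assumption, see above)
reload_sealed_secrets_controller() {
  selector=${1:-name=sealed-secrets-controller}
  # Show the key secrets that are currently present; stop on an API error.
  kubectl get secret -n kube-system \
    -l sealedsecrets.bitnami.com/sealed-secrets-key || return 1
  # Deleting the pod makes the Deployment start a fresh one.
  kubectl delete pod -n kube-system -l "$selector"
}

# Example:
#   reload_sealed_secrets_controller
```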

etcd backup and restore

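K3s stores cluster state in an embedded etcd datastore, or in SQLite on a default single-server install. Backing up etcd preserves the full cluster state, including all Kubernetes objects. The k3s etcd-snapshot commands in this section apply only when the cluster uses embedded etcd.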

Create an etcd snapshot

ssh node01 sudo k3s etcd-snapshot save --name manual-$(date +%Y%m%d)

Snapshots are stored at /var/lib/rancher/k3s/server/db/snapshots/ on the control plane node.
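
Because these snapshots sit on the same node as the datastore, losing that node's disk loses both. One way to keep an off-node copy is to stream the directory over SSH, sketched below. The function name pull_etcd_snapshots and the SNAPSHOT_DIR override are hypothetical, and the sketch assumes that passwordless sudo and tar are available on the node.

```shell
# Copy all etcd snapshots from a node into a local directory.
# Args: <node> <local-destination-dir>
pull_etcd_snapshots() {
  node=$1
  dest=$2
  snap_dir=${SNAPSHOT_DIR:-/var/lib/rancher/k3s/server/db/snapshots}
  mkdir -p "$dest" || return 1
  # tar runs under sudo on the node because the directory is root-owned;
  # the archive is streamed over SSH and unpacked locally.
  ssh "$node" sudo tar -C "$snap_dir" -cf - . | tar -C "$dest" -xf -
}

# Example:
#   pull_etcd_snapshots node01 ./etcd-snapshots
```

Snapshots contain every Kubernetes object, Secrets included, so the local copy deserves the same care as the sealed-secrets key backup.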

List snapshots

ssh node01 sudo k3s etcd-snapshot list

Configure automatic snapshots

K3s supports automatic etcd snapshots. Add to /etc/rancher/k3s/config.yaml on the control plane:

etcd-snapshot-schedule-cron: "0 */6 * * *"  # every 6 hours
etcd-snapshot-retention: 10

Restart K3s to apply:

ssh node01 sudo systemctl restart k3s
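
When the file is edited by script rather than by hand, it helps to make the change idempotent so repeated runs do not duplicate keys. The helper below is a sketch under that assumption; the function name ensure_snapshot_schedule is hypothetical, and it leaves any existing value for either key untouched. If this project's Ansible playbooks also manage the file, setting the values there is likely the better choice.

```shell
# Append the snapshot schedule settings to a K3s config file if absent.
# Args: [config-file]  (defaults to the K3s config path; run as root there)
ensure_snapshot_schedule() {
  config=${1:-/etc/rancher/k3s/config.yaml}
  touch "$config" || return 1
  # Only append a key when no line already sets it.
  grep -q '^etcd-snapshot-schedule-cron:' "$config" ||
    echo 'etcd-snapshot-schedule-cron: "0 */6 * * *"' >> "$config"
  grep -q '^etcd-snapshot-retention:' "$config" ||
    echo 'etcd-snapshot-retention: 10' >> "$config"
}

# Example (on the control plane node, as root):
#   ensure_snapshot_schedule && systemctl restart k3s
```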

Restore from snapshot

Warning

Restoring replaces the entire cluster state. All changes since the snapshot are lost.

ssh node01
sudo systemctl stop k3s
sudo k3s server --cluster-reset --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/<snapshot-name>
sudo systemctl start k3s
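
The same steps can be wrapped in a function that checks the snapshot exists before stopping K3s, so that a mistyped name does not cause downtime. This is a sketch for a single control plane node, run on that node; the function name restore_etcd_snapshot and the SNAPSHOT_DIR override are hypothetical. Clusters with several server nodes need additional steps on the other servers that are not covered here.

```shell
# Restore K3s embedded etcd from a named snapshot, with a pre-flight check.
# Args: <snapshot-name>
restore_etcd_snapshot() {
  snapshot_name=${1:-}
  snap_dir=${SNAPSHOT_DIR:-/var/lib/rancher/k3s/server/db/snapshots}
  if [ -z "$snapshot_name" ]; then
    echo "usage: restore_etcd_snapshot <snapshot-name>" >&2
    return 2
  fi
  # Verify the file first; the directory is only readable by root.
  if ! sudo test -f "$snap_dir/$snapshot_name"; then
    echo "snapshot not found: $snap_dir/$snapshot_name" >&2
    return 1
  fi
  # Each step runs only if the previous one succeeded.
  sudo systemctl stop k3s &&
    sudo k3s server --cluster-reset \
      --cluster-reset-restore-path="$snap_dir/$snapshot_name" &&
    sudo systemctl start k3s
}

# Example:
#   restore_etcd_snapshot <snapshot-name>
```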

Disaster recovery

To rebuild a cluster from scratch:

  1. Flash and provision nodes (see tutorials).

  2. Run ansible-playbook pb_all.yml -e do_flash=true.

  3. Restore the sealed-secrets key (if backed up).

  4. Re-create the admin-auth secrets (see Bootstrap the Cluster).

  5. ArgoCD auto-syncs all services from Git.

  6. Restore Longhorn volumes from NFS backups (if configured).

Once ArgoCD has finished syncing, the cluster should be fully operational within minutes; only persistent data requires explicit restoration.
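
A quick way to confirm the rebuild is to look at the nodes, the ArgoCD applications, and the volume claims. The function below is a minimal sketch: the name verify_recovery is hypothetical, and it assumes ArgoCD runs in the argo-cd namespace, as listed in the table at the top of this guide.

```shell
# Print the state of nodes, ArgoCD applications and PVCs after a rebuild.
verify_recovery() {
  # All nodes should report Ready.
  kubectl get nodes || return 1
  # Applications should converge to Synced and Healthy.
  kubectl get applications.argoproj.io -n argo-cd || return 1
  # Claims should be Bound once volumes have been restored.
  kubectl get pvc --all-namespaces
}

# Example:
#   verify_recovery
```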