Introduction
Backing up etcd is crucial for OpenShift operators, because etcd holds the state and configuration of the entire cluster. With a recent backup in hand, you can restore the cluster's data after a disaster and return to service quickly, with minimal data loss and downtime.
Procedure
Process Explanation
The backup runs every night as a Kubernetes CronJob. The job starts an oc debug session on each control-plane node and runs the backup script there, so that all running etcd members are backed up at close to the same point in time.
NOTE: The backup files are written to the control-plane nodes themselves. Take a snapshot of the control-plane servers or, if those servers are physical, make sure the backup files are synced to a remote location.
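The per-node process described above can also be sketched as an explicit loop, which is easier to read and reports failures per node. This is only an illustration, not the command the CronJob runs: the function name backup_control_plane and the BACKUP_DIR variable are hypothetical, while the node label, script path, and backup directory are the ones used later in this post.

```shell
#!/usr/bin/env bash
# Sketch: run the backup script on every control-plane node, one node at a time.
# BACKUP_DIR is a hypothetical variable; its default is the path used in the CronJob.
BACKUP_DIR="${BACKUP_DIR:-/home/core/backup/}"

backup_control_plane() {
  failed=0
  # "-o name" prints each node as "node/<name>", which oc debug accepts directly.
  for node in $(oc get nodes -l node-role.kubernetes.io/master --no-headers -o name); do
    echo "Backing up etcd on ${node}"
    # oc debug starts a temporary pod on the node; chroot /host switches to the host filesystem.
    if ! oc debug "$node" -- chroot /host sudo -E /usr/local/bin/cluster-backup.sh "$BACKUP_DIR"; then
      echo "Backup failed on ${node}" >&2
      failed=1
    fi
  done
  # Non-zero if at least one node failed.
  return "$failed"
}
```

Calling backup_control_plane from a session that is logged in with sufficient privileges runs the backup on each node in turn and keeps going even if one node fails.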
Apply the backup yamls
First, create the project for the etcd backup:
$ oc new-project etcd-backup
If the project inherits a default node selector (for example, worker nodes only), clear the annotation so that pods in this project can run on the control-plane nodes:
$ oc annotate namespace etcd-backup openshift.io/node-selector= --overwrite
Create the Service Account
01_sa-etcd-backup.yaml
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: openshift-backup
  namespace: etcd-backup
  labels:
    app: openshift-backup
Apply the SA:
$ oc apply -f 01_sa-etcd-backup.yaml
Create the ClusterRole
02_clusterrole.yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-etcd-backup
rules:
  - apiGroups: [""]
    resources:
      - "nodes"
    verbs: ["get", "list"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "pods/log"
    verbs: ["get", "list", "create", "delete", "watch"]
Apply the ClusterRole:
$ oc apply -f 02_clusterrole.yaml
Create the ClusterRoleBinding
03_clusterrolebinding.yaml
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: openshift-backup
  labels:
    app: openshift-backup
subjects:
  - kind: ServiceAccount
    name: openshift-backup
    namespace: etcd-backup
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-etcd-backup
Apply the ClusterRoleBinding yaml
$ oc apply -f 03_clusterrolebinding.yaml
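Before moving on, it can be useful to confirm that the binding gives the service account the permissions defined in the ClusterRole. The sketch below uses oc auth can-i with impersonation. It assumes your own user is allowed to impersonate service accounts (cluster-admin is), and the function names can_i and check_backup_rbac are hypothetical helpers, not part of oc.

```shell
#!/usr/bin/env bash
# Sketch: check a few of the verbs granted by the cluster-etcd-backup ClusterRole.
SA="system:serviceaccount:etcd-backup:openshift-backup"

can_i() {
  # Prints "yes" when the impersonated service account may perform the action.
  oc auth can-i "$1" "$2" --as="$SA" 2>/dev/null
}

check_backup_rbac() {
  missing=0
  # Each entry is verb:resource, taken from the rules in 02_clusterrole.yaml.
  for pair in get:nodes list:nodes create:pods delete:pods; do
    verb=${pair%%:*}
    resource=${pair#*:}
    if [ "$(can_i "$verb" "$resource")" = "yes" ]; then
      echo "ok: ${verb} ${resource}"
    else
      echo "MISSING: ${verb} ${resource}"
      missing=1
    fi
  done
  # Non-zero if any permission is missing.
  return "$missing"
}
```

Running check_backup_rbac prints one line per permission and returns a non-zero status if anything is missing, which usually points to a typo in the subject or roleRef of the binding.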
Apply the correct SCC to the service account
To allow the service account to run the backup script at the host level, grant it the privileged SCC:
$ oc adm policy add-scc-to-user privileged -z openshift-backup -n etcd-backup
Create the cronjob
04_cronjob.yml
---
kind: CronJob
# batch/v1 is available from Kubernetes 1.21 (OpenShift 4.8);
# batch/v1beta1 was removed in Kubernetes 1.25.
apiVersion: batch/v1
metadata:
  name: openshift-backup
  namespace: etcd-backup
  labels:
    app: openshift-backup
spec:
  # Run every night at 23:56.
  schedule: "56 23 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 5
  failedJobsHistoryLimit: 5
  jobTemplate:
    metadata:
      labels:
        app: openshift-backup
    spec:
      backoffLimit: 0
      template:
        metadata:
          labels:
            app: openshift-backup
        spec:
          containers:
            - name: backup
              image: "registry.redhat.io/openshift4/ose-cli"
              command:
                - "/bin/bash"
                - "-c"
                # For each control-plane node: run the backup script on the host,
                # then delete backup files last modified more than one minute ago.
                - oc get no -l node-role.kubernetes.io/master --no-headers -o name | xargs -I {} -- oc debug {} -- bash -c 'chroot /host sudo -E /usr/local/bin/cluster-backup.sh /home/core/backup/ && chroot /host sudo -E find /home/core/backup/ -type f -mmin +"1" -delete'
          restartPolicy: "Never"
          terminationGracePeriodSeconds: 30
          activeDeadlineSeconds: 500
          dnsPolicy: "ClusterFirst"
          serviceAccountName: "openshift-backup"
          serviceAccount: "openshift-backup"
Apply the cronjob.
$ oc apply -f 04_cronjob.yml
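The cleanup step in the CronJob command relies on find with -mmin, which matches files by the age of their last modification in minutes. With a threshold of 1, backups from earlier nights are removed on every run, so only the files just written survive. The sketch below shows the same technique in a throwaway directory so you can see the effect before choosing a longer retention window. The function name prune_old_backups and the 60-minute threshold are illustrative assumptions, and touch -d requires GNU coreutils.

```shell
#!/usr/bin/env bash
# Sketch: delete files older than a given number of minutes, as the CronJob does.
prune_old_backups() {
  # $1 = backup directory, $2 = age threshold in minutes
  find "$1" -type f -mmin +"$2" -delete
}

# Demonstration in a temporary directory (nothing on a real node is touched).
demo_dir=$(mktemp -d)
touch "$demo_dir/snapshot_new.db"                    # modified just now
touch -d '2 hours ago' "$demo_dir/snapshot_old.db"   # modified 120 minutes ago

prune_old_backups "$demo_dir" 60   # remove anything older than 60 minutes
ls "$demo_dir"                     # only snapshot_new.db remains
rm -rf "$demo_dir"
```

To keep a week of backups on the nodes, for example, the threshold in the CronJob could be raised to 10080 minutes, at the cost of more disk usage on each control-plane node.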
Test
To verify that the CronJob works without waiting for the middle of the night, start a single job from it with the create job command. This runs the backup pod in the cluster and, if everything is working, the pod finishes with a Completed status.
Create the job as follows:
$ oc create job backup --from=cronjob/openshift-backup
Verify that the pod is not in a failed state; it should be Running and then Completed:
$ oc get pods -n etcd-backup
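Instead of polling oc get pods by hand, the check above can be scripted with oc wait and oc logs. This is a minimal sketch: the function name verify_backup_job is hypothetical, and the 600 second timeout is an assumption chosen to be longer than the activeDeadlineSeconds of 500 set in the CronJob.

```shell
#!/usr/bin/env bash
# Sketch: wait for the test job to complete, then show the end of its log.
verify_backup_job() {
  job="${1:-backup}"       # job name used in "oc create job backup ..."
  ns="${2:-etcd-backup}"   # project created at the start of this post
  if oc wait --for=condition=complete "job/${job}" -n "$ns" --timeout=600s; then
    # Print the last lines of the backup output for a quick sanity check.
    oc logs "job/${job}" -n "$ns" --tail=20
  else
    echo "Job ${job} did not complete; inspect it with: oc describe job/${job} -n ${ns}" >&2
    return 1
  fi
}
```

Calling verify_backup_job with no arguments checks the job named backup in the etcd-backup project. A non-zero return means the job failed or did not finish within the timeout.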
Summary
At Octopus Computer Solutions, we prioritize the resilience and security of your Kubernetes deployments. Through our open-source-driven approach, we treat regular OpenShift etcd backups as a core part of a comprehensive backup solution, so that your critical cluster data is always protected.
We have additional OpenShift-related posts; feel free to read and share them.