kubernetes > backup and restore the master
This post covers how to back up and restore a Kubernetes master. The prerequisite is familiarity with launching a cluster using kubeadm.
Backing up a master involves taking a snapshot of the etcd server and backing up the TLS certificates of the cluster components.
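Both of these live on the master itself; when the cluster is bootstrapped with kubeadm, etcd and the other control plane components run as static pods. As a quick sanity check before backing anything up, we could list them (the tier=control-plane label used below is an assumption based on the labels kubeadm normally applies):
# list the control plane pods, including etcd
networkandcode@master $ kubectl get pods -n kube-system -l tier=control-plane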
On a typical cluster bootstrapped with kubeadm, we can find all the TLS certificates at the following path on the master
networkandcode@master $ ls /etc/kubernetes/pki
apiserver.crt apiserver-kubelet-client.crt etcd front-proxy-client.key
apiserver-etcd-client.crt apiserver-kubelet-client.key front-proxy-ca.crt sa.key
apiserver-etcd-client.key ca.crt front-proxy-ca.key sa.pub
apiserver.key ca.key front-proxy-client.crt
And there is a separate subdirectory for the etcd TLS certificates
networkandcode@master $ ls /etc/kubernetes/pki/etcd
ca.crt ca.key healthcheck-client.crt healthcheck-client.key peer.crt peer.key server.crt server.key
Let’s create a directory for backing up these certificates, and then copy them there
networkandcode@master $ mkdir backup-k8s-tls-certs
networkandcode@master $ cp -r /etc/kubernetes/pki backup-k8s-tls-certs/
networkandcode@master $ ls backup-k8s-tls-certs/
pki
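Optionally, we could also bundle this directory into a dated archive, which makes it easier to copy the backup off the master; the archive name below is arbitrary
# pack the certificate backup with a timestamp in its name
networkandcode@master $ tar czf k8s-pki-backup-$(date +%F).tar.gz backup-k8s-tls-certs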
Now let’s create a separate directory for backing up etcd
networkandcode@master $ mkdir backup-etcd
We need to install the etcdctl tool to back up etcd easily; one way to do this is with the Go toolchain
# install etcdctl
networkandcode@master$ go get github.com/coreos/etcd/etcdctl
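If a Go toolchain isn't available on the master, etcdctl can instead be taken from an official etcd release tarball; the version below is just an example and should be matched to the etcd version running in the cluster
# download a release tarball and copy etcdctl into the PATH (version is an assumption)
networkandcode@master $ curl -LO https://github.com/etcd-io/etcd/releases/download/v3.4.3/etcd-v3.4.3-linux-amd64.tar.gz
networkandcode@master $ tar xzf etcd-v3.4.3-linux-amd64.tar.gz
networkandcode@master $ mv etcd-v3.4.3-linux-amd64/etcdctl /usr/local/bin/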
We may now back up the etcd server using etcdctl
networkandcode@master $ ETCDCTL_API=3 \
etcdctl snapshot save backup-etcd/snapshot.db \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
--key=/etc/kubernetes/pki/etcd/healthcheck-client.key
{"level":"info","ts":1582355056.9277308,"caller":"snapshot/v3_snapshot.go:110","msg":"created temporary dbfile","path":"backup-etcd/snapshot.db.part"}
{"level":"info","ts":1582355056.9408722,"caller":"snapshot/v3_snapshot.go:121","msg":"fetching snapshot","endpoint":"127.0.0.1:2379"}
{"level":"info","ts":1582355057.017826,"caller":"snapshot/v3_snapshot.go:134","msg":"fetched snapshot","endpoint":"127.0.0.1:2379","took":0.09000424}
{"level":"info","ts":1582355057.0179946,"caller":"snapshot/v3_snapshot.go:143","msg":"saved","path":"backup-etcd/snapshot.db"}
Snapshot saved at backup-etcd/snapshot.db
networkandcode@master $ ls backup-etcd -ltr
total 2592
-rw------- 1 root root 2650144 Feb 22 08:49 snapshot.db
In the command above, we first set ETCDCTL_API to version 3 and then save the server’s snapshot at the mentioned path. The --endpoints parameter refers to the address and port of the etcd server.
We also pass three extra parameters (--cacert, --cert and --key) that point to the relevant etcd TLS certificates at their respective paths.
The snapshot is now saved. So by now we have a backup of both the cluster TLS certificates and the etcd server.
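Before moving on, we could optionally verify the snapshot file with etcdctl's status subcommand, which prints its hash, revision, number of keys and size
# sanity check the saved snapshot
networkandcode@master $ ETCDCTL_API=3 etcdctl snapshot status backup-etcd/snapshot.db -w table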
We are now going to simulate a failure by resetting the master
networkandcode@master$ kubeadm reset -f
The kubectl commands wouldn't work now as they did before
networkandcode@master $ kubectl get nodes
The connection to the server 172.16.0.17:6443 was refused - did you specify the right host or port?
The pki directory would be empty
networkandcode@master $ ls /etc/kubernetes/pki
networkandcode@master $
We may now copy the PKI certificates from the backup directory
networkandcode@master $ cp -r backup-k8s-tls-certs/pki /etc/kubernetes/
networkandcode@master $ ls /etc/kubernetes/pki
apiserver.crt apiserver.key ca.crt front-proxy-ca.crt front-proxy-client.key
apiserver-etcd-client.crt apiserver-kubelet-client.crt ca.key front-proxy-ca.key sa.key
apiserver-etcd-client.key apiserver-kubelet-client.key etcd front-proxy-client.crt sa.pub
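As an optional check that the restored certificate authority is intact, we could inspect it with openssl
# print the subject and validity dates of the restored cluster CA
networkandcode@master $ openssl x509 -in /etc/kubernetes/pki/ca.crt -noout -subject -dates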
We may now restore etcd
networkandcode@master $ ETCDCTL_API=3 \
etcdctl snapshot restore backup-etcd/snapshot.db
{"level":"info","ts":1582365349.7406623,"caller":"snapshot/v3_snapshot.go:287","msg":"restoring snapshot","path":"backup-etcd/snapshot.db","wal-dir":"default.etcd/member/wal","data-dir":"default.etcd","snap-dir":"default.etcd/member/snap"}
{"level":"info","ts":1582365349.82723,"caller":"membership/cluster.go:375","msg":"added member","cluster-id":"cdf818194e3a8c32","local-member-id":"0","added-peer-id":"8e9e05c52164694d","added-peer-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":1582365349.8592656,"caller":"snapshot/v3_snapshot.go:300","msg":"restored snapshot","path":"backup-etcd/snapshot.db","wal-dir":"default.etcd/member/wal","data-dir":"default.etcd","snap-dir":"default.etcd/member/snap"}
The command above should have created the following files
networkandcode@master $ tree default.etcd/
default.etcd/
└── member
    ├── snap
    │   ├── 0000000000000001-0000000000000001.snap
    │   └── db
    └── wal
        └── 0000000000000000-0000000000000000.wal

3 directories, 3 files
We may now move these files to /var/lib/etcd, so that kubeadm would use this data for the etcd server it creates
networkandcode@master $ mv default.etcd/member /var/lib/etcd/
networkandcode@master $ tree /var/lib/etcd
/var/lib/etcd
└── member
    ├── snap
    │   ├── 0000000000000001-0000000000000001.snap
    │   └── db
    └── wal
        └── 0000000000000000-0000000000000000.wal

3 directories, 3 files
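As an aside, etcdctl also accepts a --data-dir flag on restore, which would write the restored data directly into place and make the mv step above unnecessary; note that the target directory must not already exist when this runs
# alternative: restore straight into etcd's default data directory
networkandcode@master $ ETCDCTL_API=3 etcdctl snapshot restore backup-etcd/snapshot.db --data-dir /var/lib/etcd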
The cluster can now be bootstrapped again using kubeadm; however, we need to include a flag so that the preflight check does not fail on the non-empty /var/lib/etcd directory and the existing etcd data is reused
networkandcode@master $ kubeadm init --ignore-preflight-errors=DirAvailable--var-lib-etcd
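If kubectl complains about credentials after the re-init, the admin kubeconfig regenerated by kubeadm can be copied into place again, as the usual kubeadm init output suggests
# point kubectl at the freshly generated admin kubeconfig
networkandcode@master $ mkdir -p $HOME/.kube
networkandcode@master $ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
networkandcode@master $ chown $(id -u):$(id -g) $HOME/.kube/config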
The master is recovered and shows the nodes again
master $ kubectl get no
NAME STATUS ROLES AGE VERSION
master Ready master 33m v1.14.0
node01 Ready <none> 31m v1.14.0
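To confirm that the rest of the cluster state came back with the etcd data, we could also list pods across all namespaces; anything that was deployed before the reset should reappear here
# system pods and previously deployed workloads should be listed again
networkandcode@master $ kubectl get pods --all-namespaces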
So we recovered the master, and when we initialized it again using kubeadm, we didn't have to join the node to the cluster with kubeadm join, since the node information was restored from the etcd snapshot and the worker's kubelet credentials are still valid against the restored CA.
–end-of-post–