In a previous article I wrote about using Rook to deploy a Ceph storage cluster within Minikube (link below). The original post described what Rook can provide and demonstrated the ease of quickly setting up an all-in-one Ceph cluster. However, I wanted to explore Rook further in a multi-node configuration and see how it integrates with applications in Kubernetes.
First I needed to set up a base Kubernetes environment consisting of one master and three worker nodes. I used the following steps on all nodes to prepare them for Kubernetes: add the hostnames to the hosts file, disable SELinux and swap, enable br_netfilter, install supporting utilities, enable the Kubernetes repo, install Docker, install the Kubernetes binaries, and enable/disable the relevant services.
# echo "10.0.0.81 kube-master" >> /etc/hosts # echo "10.0.0.82 kube-node1" >> /etc/hosts # echo "10.0.0.83 kube-node2" >> /etc/hosts # echo "10.0.0.84 kube-node3" >> /etc/hosts # setenforce 0 # sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux # swapoff -a # sed -i.bak -r 's/(.+ swap .+)/#\1/' /etc/fstab # modprobe br_netfilter # echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables # echo 'br_netfilter' > /etc/modules-load.d/netfilter.conf # echo net.bridge.bridge-nf-call-iptables=1 >> /etc/sysctl.d/10-bridge-nf-call-iptables.conf # dnf install -y yum-utils device-mapper-persistent-data lvm2 # dnf install docker # cat > /etc/yum.repos.d/kubernetes.repo <[kubernetes] > name=Kubernetes > baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64 > enabled=1 > gpgcheck=1 > repo_gpgcheck=1 > gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg > https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg > EOF # dnf install -y kubelet kubeadm kubectl # systemctl enable docker ; systemctl start docker ; systemctl enable kubelet ; systemctl start kubelet ; systemctl stop firewalld ; systemctl disable firewalld
Once the prerequisites are met on each node, let's initialize the cluster on the master node:
# kubeadm init --apiserver-advertise-address=10.0.0.81 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.13.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-master localhost] and IPs [10.0.0.81 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-master localhost] and IPs [10.0.0.81 127.0.0.1 ::1]
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.0.81]
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 19.511836 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kube-master" as an annotation
[mark-control-plane] Marking the node kube-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kube-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: etmucm.238nrw6a48yu0njb
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node as root:

  kubeadm join 10.0.0.81:6443 --token etmucm.238nrw6a48yu0njb --discovery-token-ca-cert-hash sha256:963d6d9d31f2db9debfaa600ef802d05c448f7dc9e9cb92aec268cf2a8cfee7b
After the master is up and running, you can join the remaining nodes using the join command that was presented in the output when you initialized the master:
# kubeadm join 10.0.0.81:6443 --token etmucm.238nrw6a48yu0njb --discovery-token-ca-cert-hash sha256:963d6d9d31f2db9debfaa600ef802d05c448f7dc9e9cb92aec268cf2a8cfee7b
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "10.0.0.81:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.0.0.81:6443"
[discovery] Requesting info from "https://10.0.0.81:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.0.0.81:6443"
[discovery] Successfully established connection with API Server "10.0.0.81:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kube-node1" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
I like to do some housekeeping once all my nodes are joined, which includes enabling scheduling on the master and labeling the worker nodes as such:
# kubectl taint node kube-master node-role.kubernetes.io/master:NoSchedule-
# kubectl label node kube-node1 node-role.kubernetes.io/worker=worker
# kubectl label node kube-node2 node-role.kubernetes.io/worker=worker
# kubectl label node kube-node3 node-role.kubernetes.io/worker=worker
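If you want to confirm the housekeeping took effect, a quick check of the taints and labels might look like the following sketch:

# kubectl describe node kube-master | grep Taints     # should no longer show NoSchedule
# kubectl get nodes --show-labels                     # worker label should appear on the three nodes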
With all the nodes joined and labeled, you should have a cluster that looks like this:
# kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
kube-master   Ready    master   19h   v1.13.2
kube-node1    Ready    worker   19h   v1.13.2
kube-node2    Ready    worker   19h   v1.13.2
kube-node3    Ready    worker   17h   v1.13.2
Next, let's deploy Flannel for networking:
# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
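Flannel runs as a DaemonSet in the kube-system namespace, so an easy way to confirm the pod network is up is to check that a flannel pod is running on each node and that all nodes report Ready. A quick sketch (pod names and labels vary between flannel releases):

# kubectl get pods -n kube-system -o wide | grep flannel
# kubectl get nodes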
And finally, let's deploy Rook and the Ceph cluster using the familiar steps from my previous article:
# git clone https://github.com/rook/rook.git
# cd ./rook/cluster/examples/kubernetes/ceph
# sed -i.bak s+/var/lib/rook+/data/rook+g cluster.yaml
# kubectl create -f operator.yaml
# kubectl create -f cluster.yaml
# kubectl create -f toolbox.yaml
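The operator, monitor, and OSD pods take a few minutes to come up. While waiting, something like the following can be used to watch progress (a sketch, assuming the namespaces used by the example manifests of this Rook version; the operator namespace may differ in other releases):

# kubectl get pods -n rook-ceph-system            # the Rook operator and agents
# kubectl get pods -n rook-ceph -w                # watch the mon, mgr, and osd pods come up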
Once all the containers have spun up, you should have something that looks like the following:
# kubectl get pod -n rook-ceph -o wide
NAME                                      READY   STATUS      RESTARTS   AGE   IP            NODE          NOMINATED NODE   READINESS GATES
rook-ceph-mgr-a-8649f78d9b-txsfm          1/1     Running     1          19h   10.244.2.12   kube-node2    <none>           <none>
rook-ceph-mon-a-598b7bd4cd-kpxnx          1/1     Running     0          19h   10.244.0.3    kube-master   <none>           <none>
rook-ceph-mon-c-759b8984f5-ggzjb          1/1     Running     1          19h   10.244.2.15   kube-node2    <none>           <none>
rook-ceph-mon-d-77d55dcddf-mwnf8          1/1     Running     0          16h   10.244.3.3    kube-node3    <none>           <none>
rook-ceph-osd-0-77b448bbcc-mdhsw          1/1     Running     1          19h   10.244.2.14   kube-node2    <none>           <none>
rook-ceph-osd-1-65db4b7c5d-hgfcj          1/1     Running     0          16h   10.244.1.8    kube-node1    <none>           <none>
rook-ceph-osd-2-5b475cb56c-x5w6n          1/1     Running     0          19h   10.244.0.5    kube-master   <none>           <none>
rook-ceph-osd-3-657789944d-swjxd          1/1     Running     0          16h   10.244.3.6    kube-node3    <none>           <none>
rook-ceph-osd-prepare-kube-master-tlhxf   0/2     Completed   0          16h   10.244.0.6    kube-master   <none>           <none>
rook-ceph-osd-prepare-kube-node1-lgtrf    0/2     Completed   0          16h   10.244.1.12   kube-node1    <none>           <none>
rook-ceph-osd-prepare-kube-node2-5tbt6    0/2     Completed   0          16h   10.244.2.17   kube-node2    <none>           <none>
rook-ceph-osd-prepare-kube-node3-rrp4z    0/2     Completed   0          16h   10.244.3.5    kube-node3    <none>           <none>
rook-ceph-tools-76c7d559b6-7kprh          1/1     Running     0          16h   10.0.0.84     kube-node3    <none>           <none>
And of course we can validate the Ceph cluster is up and healthy via the toolbox container as well:
# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
# ceph status
  cluster:
    id:     4be6e204-3d82-4cc4-9ea4-57f0e71f99c5
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum d,a,c
    mgr: a(active)
    osd: 4 osds: 4 up, 4 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   17 GiB used, 123 GiB / 140 GiB avail
    pgs:

# ceph osd tree
ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
-1       0.13715 root default
-4       0.03429     host kube-master
 2   hdd 0.03429         osd.2            up  1.00000 1.00000
-3       0.03429     host kube-node1
 1   hdd 0.03429         osd.1            up  1.00000 1.00000
-2       0.03429     host kube-node2
 0   hdd 0.03429         osd.0            up  1.00000 1.00000
-9       0.03429     host kube-node3
 3   hdd 0.03429         osd.3            up  1.00000 1.00000
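The toolbox has the full Ceph CLI available, so beyond ceph status you can dig into capacity and per-OSD detail while you're in there. A few examples to run inside the toolbox pod:

# ceph df              # cluster-wide and per-pool capacity/usage
# ceph osd df tree     # per-OSD utilization laid out along the CRUSH hierarchy
# ceph health detail   # more detail whenever health is not HEALTH_OK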
Everything we have done up to this point has been very similar to what I did in the previous article with Minikube, except instead of a single node we have a multi-node configuration. Now let's take it a step further and get an application to use our Ceph storage cluster.
The first step in Kubernetes is to create a storageclass.yaml that defines a Ceph-backed storage class. Populate storageclass.yaml with the following:
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  # The value of "clusterNamespace" MUST be the same as the one in which your rook cluster exist
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs
# Optional, default reclaimPolicy is "Delete". Other options are: "Retain", "Recycle" as documented in https://kubernetes.io/docs/concepts/storage/storage-classes/
Next, let's create the storage class using the YAML we just wrote and set it as the default:
# kubectl create -f storageclass.yaml
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created
# kubectl get storageclass
NAME              PROVISIONER          AGE
rook-ceph-block   ceph.rook.io/block   61s
# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/rook-ceph-block patched
# kubectl get storageclass
NAME                        PROVISIONER          AGE
rook-ceph-block (default)   ceph.rook.io/block   3m30s
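With rook-ceph-block set as the default storage class, any PersistentVolumeClaim that does not name a storage class will now be provisioned from the Ceph pool. As a quick illustration (the claim name test-claim is hypothetical and not part of the Rook examples), a minimal PVC could look like this:

# cat <<EOF | kubectl create -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
# kubectl get pvc test-claim     # should show STATUS Bound with STORAGECLASS rook-ceph-block
# kubectl delete pvc test-claim  # clean up the test claim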
Now that we have a storage class that uses Ceph as the backend, we need an application to consume it. Thankfully the Rook git repo includes a couple of examples: Wordpress and MySQL. Let's go ahead and create those apps with the following:
# cd ./rook/cluster/examples/kubernetes
# kubectl create -f mysql.yaml
service/wordpress-mysql created
persistentvolumeclaim/mysql-pv-claim created
deployment.apps/wordpress-mysql created
# kubectl create -f wordpress.yaml
service/wordpress created
persistentvolumeclaim/wp-pv-claim created
deployment.extensions/wordpress created
We can confirm our two applications are running with the following:
# kubectl get pods -n default -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP            NODE         NOMINATED NODE   READINESS GATES
wordpress-7b6c4c79bb-7b4dq         1/1     Running   0          68s     10.244.1.14   kube-node1   <none>           <none>
wordpress-mysql-6887bf844f-2m4h4   1/1     Running   0          2m47s   10.244.1.13   kube-node1   <none>           <none>
Now let's confirm they are actually using our Ceph storage class:
# kubectl get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
mysql-pv-claim   Bound    pvc-0c0be0ec-2317-11e9-a462-5254003ede95   20Gi       RWO            rook-ceph-block   3m36s
wp-pv-claim      Bound    pvc-46b4b266-2317-11e9-a462-5254003ede95   20Gi       RWO            rook-ceph-block   118s
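If you want to see the claims from the Ceph side as well, the block provisioner creates an RBD image in the replicapool pool for each bound PVC. From the toolbox pod, something like this should list them (a sketch; the image names are generated by Rook and will differ, so <image-name> is a placeholder):

# rbd ls -p replicapool                  # list the images backing the PVCs
# rbd info -p replicapool <image-name>   # size, features, etc. for a single image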
And let's also confirm Wordpress is up and running from a user perspective. Note that in this example we do not have an external IP and can only access the service via the cluster IP:
# kubectl get svc wordpress
NAME        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
wordpress   LoadBalancer   10.104.120.47   <pending>     80:32592/TCP   19m
# curl -v http://10.104.120.47
* About to connect() to 10.104.120.47 port 80 (#0)
*   Trying 10.104.120.47...
* Connected to 10.104.120.47 (10.104.120.47) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.104.120.47
> Accept: */*
>
< HTTP/1.1 302 Found
< Date: Mon, 28 Jan 2019 16:30:16 GMT
< Server: Apache/2.4.10 (Debian)
< X-Powered-By: PHP/5.6.28
< Expires: Wed, 11 Jan 1984 05:00:00 GMT
< Cache-Control: no-cache, must-revalidate, max-age=0
< Location: http://10.104.120.47/wp-admin/install.php
< Content-Length: 0
< Content-Type: text/html; charset=UTF-8
<
* Connection #0 to host 10.104.120.47 left intact
We can see from the above output that we do connect but get a 302 redirect to the install page, since Wordpress still needs to be configured. But it does confirm our applications are up and using the Ceph storage class.
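Since the LoadBalancer never receives an external IP in this bare-metal setup, one simple way to reach the Wordpress installer from the master (or anywhere with a kubeconfig) is to port-forward the service and browse to localhost. A sketch, where the local port 8080 is an arbitrary choice:

# kubectl port-forward svc/wordpress 8080:80
# # then, from another terminal or a browser on the same host:
# curl -I http://localhost:8080/wp-admin/install.php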
To clean up the previous exercise, let's do the following:
# kubectl delete -f wordpress.yaml
service "wordpress" deleted
persistentvolumeclaim "wp-pv-claim" deleted
deployment.extensions "wordpress" deleted
# kubectl delete -f mysql.yaml
service "wordpress-mysql" deleted
persistentvolumeclaim "mysql-pv-claim" deleted
deployment.apps "wordpress-mysql" deleted
# kubectl delete -n rook-ceph cephblockpools.ceph.rook.io replicapool
cephblockpool.ceph.rook.io "replicapool" deleted
# kubectl delete storageclass rook-ceph-block
storageclass.storage.k8s.io "rook-ceph-block" deleted
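To double-check that the storage was actually released, you can verify that no claims or volumes are left behind (the Ceph cluster itself stays up, since we only removed the pool and the storage class):

# kubectl get pvc -n default   # should return no resources
# kubectl get pv               # should return no resources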
The above example was just a simple demonstration of the capabilities Rook and Ceph bring to Kubernetes from a block storage perspective, but it leaves one wondering what other possibilities there might be.
Further Reading:
Rook: https://github.com/rook/rook
Kubernetes: https://kubernetes.io/
Previous Article: https://www.linkedin.com/pulse/deploying-ceph-rook-benjamin-schmaus/