Saturday, May 21, 2022

Check For Expired Certificates on OpenShift


OpenShift has a lot of certificates associated with the services it runs.  With that in mind it makes sense to check on those certificates every once in a while with some kind of simple report.  I have had customers make this request on occasion, and it got me thinking about a quick and dirty way to visualize this.  The following blog shows the fruits of that effort in a simple bash script.

First let's go ahead and create the certs-expired.sh script.  Note the quoted heredoc delimiter, which keeps the shell from expanding the script's variables while writing the file: 

$ cat << 'EOF' > ~/certs-expired.sh
#!/bin/bash

format="%-8s%-8s%-60s%-26s%-60s\n"
printf "$format" STATE DAYS NAME EXPIRY NAMESPACE
printf "$format" ----- ---- ---- ------ ---------

oc get secrets -A -o go-template='{{range .items}}{{if eq .type "kubernetes.io/tls"}}{{.metadata.namespace}}{{" "}}{{.metadata.name}}{{" "}}{{index .data "tls.crt"}}{{"\n"}}{{end}}{{end}}' | while read namespace name cert
do
  certdate=$(echo "$cert" | base64 -d | openssl x509 -noout -enddate | cut -d= -f2)
  epochcertdate=$(date -d "$certdate" +"%s")
  currentdate=$(date +%s)
  if ((epochcertdate > currentdate)); then
    datediff=$((epochcertdate-currentdate))
    state="OK"
  else
    state="EXPIRED"
    datediff=$((currentdate-epochcertdate))
  fi
  days=$((datediff/86400))
  printf "$format" "$state" "$days" "$name" "$certdate" "$namespace" 
done

EOF

The script assumes the oc binary is in the current PATH and that the KUBECONFIG environment variable has been set; this ensures the oc command inside the script can pull the appropriate data.
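
A minimal way to satisfy those prerequisites might look like the following, where the kubeconfig path is just a placeholder for your own:

$ export KUBECONFIG=~/path/to/kubeconfig   # example path to your cluster's kubeconfig
$ chmod +x ~/certs-expired.sh

With those in place we can run the script.  I chose to simply invoke it with bash rather than rely on the execute bit, and when we execute the script we see the output below: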

$ bash certs-expired.sh 
STATE   DAYS    NAME                                                        EXPIRY                    NAMESPACE                                                   
-----   ----    ----                                                        ------                    ---------                                                   
OK      715     openshift-apiserver-operator-serving-cert                   May  5 21:33:47 2024 GMT  openshift-apiserver-operator                                
OK      3635    etcd-client                                                 May  3 21:13:54 2032 GMT  openshift-apiserver                                         
OK      715     serving-cert                                                May  5 21:33:52 2024 GMT  openshift-apiserver                                         
OK      715     serving-cert                                                May  5 21:33:59 2024 GMT  openshift-authentication-operator                           
OK      715     v4-0-config-system-serving-cert                             May  5 21:33:49 2024 GMT  openshift-authentication                                    
OK      715     cloud-credential-operator-serving-cert                      May  5 21:33:50 2024 GMT  openshift-cloud-credential-operator                         
OK      715     machine-approver-tls                                        May  5 21:33:48 2024 GMT  openshift-cluster-machine-approver                          
OK      715     node-tuning-operator-tls                                    May  5 21:33:47 2024 GMT  openshift-cluster-node-tuning-operator                      
OK      715     samples-operator-tls                                        May  5 21:37:44 2024 GMT  openshift-cluster-samples-operator                          
OK      715     cluster-storage-operator-serving-cert                       May  5 21:33:55 2024 GMT  openshift-cluster-storage-operator                          
OK      715     csi-snapshot-webhook-secret                                 May  5 21:33:47 2024 GMT  openshift-cluster-storage-operator                          
OK      715     serving-cert                                                May  5 21:33:54 2024 GMT  openshift-cluster-storage-operator                          
OK      715     cluster-version-operator-serving-cert                       May  5 21:33:52 2024 GMT  openshift-cluster-version                                   
OK      15      kube-controller-manager-client-cert-key                     Jun  5 21:33:41 2022 GMT  openshift-config-managed                                    
OK      15      kube-scheduler-client-cert-key                              Jun  5 21:33:34 2022 GMT  openshift-config-managed                                    
OK      715     config-operator-serving-cert                                May  5 21:33:47 2024 GMT  openshift-config-operator                                   
OK      3635    etcd-client                                                 May  3 21:13:54 2032 GMT  openshift-config                                            
OK      3635    etcd-metric-client                                          May  3 21:13:54 2032 GMT  openshift-config                                            
OK      3635    etcd-metric-signer                                          May  3 21:13:54 2032 GMT  openshift-config                                            
OK      3635    etcd-signer                                                 May  3 21:13:54 2032 GMT  openshift-config                                            
OK      715     serving-cert                                                May  5 21:41:37 2024 GMT  openshift-console-operator                                  
OK      715     console-serving-cert                                        May  5 21:42:15 2024 GMT  openshift-console                                           
OK      715     openshift-controller-manager-operator-serving-cert          May  5 21:33:47 2024 GMT  openshift-controller-manager-operator                       
OK      715     serving-cert                                                May  5 21:33:56 2024 GMT  openshift-controller-manager                                
OK      715     metrics-tls                                                 May  5 21:33:58 2024 GMT  openshift-dns-operator                                      
OK      715     dns-default-metrics-tls                                     May  5 21:34:59 2024 GMT  openshift-dns                                               
OK      3635    etcd-client                                                 May  3 21:13:54 2032 GMT  openshift-etcd-operator                                     
OK      715     etcd-operator-serving-cert                                  May  5 21:33:57 2024 GMT  openshift-etcd-operator                                     
OK      3635    etcd-client                                                 May  3 21:13:54 2032 GMT  openshift-etcd                                              
OK      1080    etcd-peer-asus-vm1.kni.schmaustech.com                      May  5 21:51:28 2025 GMT  openshift-etcd                                              
OK      1080    etcd-peer-asus1-vm2.kni.schmaustech.com                     May  5 21:33:23 2025 GMT  openshift-etcd                                              
OK      1080    etcd-peer-asus1-vm3.kni.schmaustech.com                     May  5 21:33:24 2025 GMT  openshift-etcd                                              
OK      1080    etcd-serving-asus-vm1.kni.schmaustech.com                   May  5 21:51:28 2025 GMT  openshift-etcd                                              
OK      1080    etcd-serving-asus1-vm2.kni.schmaustech.com                  May  5 21:33:23 2025 GMT  openshift-etcd                                              
OK      1080    etcd-serving-asus1-vm3.kni.schmaustech.com                  May  5 21:33:24 2025 GMT  openshift-etcd                                              
OK      1080    etcd-serving-metrics-asus-vm1.kni.schmaustech.com           May  5 21:51:27 2025 GMT  openshift-etcd                                              
OK      1080    etcd-serving-metrics-asus1-vm2.kni.schmaustech.com          May  5 21:33:23 2025 GMT  openshift-etcd                                              
OK      1080    etcd-serving-metrics-asus1-vm3.kni.schmaustech.com          May  5 21:33:24 2025 GMT  openshift-etcd                                              
OK      715     serving-cert                                                May  5 21:33:59 2024 GMT  openshift-etcd                                              
OK      715     image-registry-operator-tls                                 May  5 21:33:58 2024 GMT  openshift-image-registry                                    
OK      715     metrics-tls                                                 May  5 21:33:55 2024 GMT  openshift-ingress-operator                                  
OK      715     router-ca                                                   May  5 21:35:59 2024 GMT  openshift-ingress-operator                                  
OK      715     router-certs-default                                        May  5 21:36:01 2024 GMT  openshift-ingress                                           
OK      715     router-metrics-certs-default                                May  5 21:36:00 2024 GMT  openshift-ingress                                           
OK      715     openshift-insights-serving-cert                             May  5 21:33:51 2024 GMT  openshift-insights                                          
OK      15      aggregator-client-signer                                    Jun  6 16:21:59 2022 GMT  openshift-kube-apiserver-operator                           
OK      715     kube-apiserver-operator-serving-cert                        May  5 21:33:54 2024 GMT  openshift-kube-apiserver-operator                           
OK      350     kube-apiserver-to-kubelet-signer                            May  6 21:09:57 2023 GMT  openshift-kube-apiserver-operator                           
OK      350     kube-control-plane-signer                                   May  6 21:09:57 2023 GMT  openshift-kube-apiserver-operator                           
OK      3635    loadbalancer-serving-signer                                 May  3 21:09:52 2032 GMT  openshift-kube-apiserver-operator                           
OK      3635    localhost-recovery-serving-signer                           May  3 21:33:29 2032 GMT  openshift-kube-apiserver-operator                           
OK      3635    localhost-serving-signer                                    May  3 21:09:50 2032 GMT  openshift-kube-apiserver-operator                           
OK      105     node-system-admin-client                                    Sep  3 21:33:40 2022 GMT  openshift-kube-apiserver-operator                           
OK      350     node-system-admin-signer                                    May  6 21:33:29 2023 GMT  openshift-kube-apiserver-operator                           
OK      3635    service-network-serving-signer                              May  3 21:09:51 2032 GMT  openshift-kube-apiserver-operator                           
OK      15      aggregator-client                                           Jun  6 16:21:59 2022 GMT  openshift-kube-apiserver                                    
OK      15      check-endpoints-client-cert-key                             Jun  5 21:33:46 2022 GMT  openshift-kube-apiserver                                    
OK      15      control-plane-node-admin-client-cert-key                    Jun  5 21:33:53 2022 GMT  openshift-kube-apiserver                                    
OK      3635    etcd-client                                                 May  3 21:13:54 2032 GMT  openshift-kube-apiserver                                    
OK      3635    etcd-client-10                                              May  3 21:13:54 2032 GMT  openshift-kube-apiserver                                    
OK      3635    etcd-client-11                                              May  3 21:13:54 2032 GMT  openshift-kube-apiserver                                    
OK      3635    etcd-client-12                                              May  3 21:13:54 2032 GMT  openshift-kube-apiserver                                    
OK      3635    etcd-client-8                                               May  3 21:13:54 2032 GMT  openshift-kube-apiserver                                    
OK      3635    etcd-client-9                                               May  3 21:13:54 2032 GMT  openshift-kube-apiserver                                    
OK      15      external-loadbalancer-serving-certkey                       Jun  5 21:33:52 2022 GMT  openshift-kube-apiserver                                    
OK      15      internal-loadbalancer-serving-certkey                       Jun  5 21:33:34 2022 GMT  openshift-kube-apiserver                                    
OK      15      kubelet-client                                              Jun  5 21:33:34 2022 GMT  openshift-kube-apiserver                                    
OK      3635    localhost-recovery-serving-certkey                          May  3 21:33:29 2032 GMT  openshift-kube-apiserver                                    
OK      3635    localhost-recovery-serving-certkey-10                       May  3 21:33:29 2032 GMT  openshift-kube-apiserver                                    
OK      3635    localhost-recovery-serving-certkey-11                       May  3 21:33:29 2032 GMT  openshift-kube-apiserver                                    
OK      3635    localhost-recovery-serving-certkey-12                       May  3 21:33:29 2032 GMT  openshift-kube-apiserver                                    
OK      3635    localhost-recovery-serving-certkey-8                        May  3 21:33:29 2032 GMT  openshift-kube-apiserver                                    
OK      3635    localhost-recovery-serving-certkey-9                        May  3 21:33:29 2032 GMT  openshift-kube-apiserver                                    
OK      15      localhost-serving-cert-certkey                              Jun  5 21:33:34 2022 GMT  openshift-kube-apiserver                                    
OK      15      service-network-serving-certkey                             Jun  5 21:33:33 2022 GMT  openshift-kube-apiserver                                    
OK      15      csr-signer                                                  Jun  6 16:26:40 2022 GMT  openshift-kube-controller-manager-operator                  
OK      45      csr-signer-signer                                           Jul  6 16:22:14 2022 GMT  openshift-kube-controller-manager-operator                  
OK      715     kube-controller-manager-operator-serving-cert               May  5 21:33:57 2024 GMT  openshift-kube-controller-manager-operator                  
OK      15      csr-signer                                                  Jun  6 16:26:40 2022 GMT  openshift-kube-controller-manager                           
OK      15      kube-controller-manager-client-cert-key                     Jun  5 21:33:41 2022 GMT  openshift-kube-controller-manager                           
OK      715     serving-cert                                                May  5 21:33:51 2024 GMT  openshift-kube-controller-manager                           
OK      715     serving-cert-2                                              May  5 21:33:51 2024 GMT  openshift-kube-controller-manager                           
OK      715     serving-cert-3                                              May  5 21:33:51 2024 GMT  openshift-kube-controller-manager                           
OK      715     serving-cert-4                                              May  5 21:33:51 2024 GMT  openshift-kube-controller-manager                           
OK      715     serving-cert-5                                              May  5 21:33:51 2024 GMT  openshift-kube-controller-manager                           
OK      715     serving-cert-6                                              May  5 21:33:51 2024 GMT  openshift-kube-controller-manager                           
OK      715     serving-cert-7                                              May  5 21:33:51 2024 GMT  openshift-kube-controller-manager                           
OK      715     kube-scheduler-operator-serving-cert                        May  5 21:33:50 2024 GMT  openshift-kube-scheduler-operator                           
OK      15      kube-scheduler-client-cert-key                              Jun  5 21:33:34 2022 GMT  openshift-kube-scheduler                                    
OK      715     serving-cert                                                May  5 21:33:59 2024 GMT  openshift-kube-scheduler                                    
OK      715     serving-cert-3                                              May  5 21:33:59 2024 GMT  openshift-kube-scheduler                                    
OK      715     serving-cert-4                                              May  5 21:33:59 2024 GMT  openshift-kube-scheduler                                    
OK      715     serving-cert-5                                              May  5 21:33:59 2024 GMT  openshift-kube-scheduler                                    
OK      715     serving-cert-6                                              May  5 21:33:59 2024 GMT  openshift-kube-scheduler                                    
OK      715     serving-cert-7                                              May  5 21:33:59 2024 GMT  openshift-kube-scheduler                                    
OK      715     serving-cert                                                May  5 21:34:00 2024 GMT  openshift-kube-storage-version-migrator-operator            
OK      725     diskmaker-metric-serving-cert                               May 15 23:33:46 2024 GMT  openshift-local-storage                                     
OK      715     baremetal-operator-webhook-server-cert                      May  5 21:36:34 2024 GMT  openshift-machine-api                                       
OK      715     cluster-autoscaler-operator-cert                            May  5 21:34:01 2024 GMT  openshift-machine-api                                       
OK      715     cluster-baremetal-operator-tls                              May  5 21:33:58 2024 GMT  openshift-machine-api                                       
OK      715     cluster-baremetal-webhook-server-cert                       May  5 21:33:48 2024 GMT  openshift-machine-api                                       
OK      715     machine-api-controllers-tls                                 May  5 21:33:47 2024 GMT  openshift-machine-api                                       
OK      715     machine-api-operator-tls                                    May  5 21:33:56 2024 GMT  openshift-machine-api                                       
OK      715     machine-api-operator-webhook-cert                           May  5 21:33:53 2024 GMT  openshift-machine-api                                       
OK      715     proxy-tls                                                   May  5 21:34:00 2024 GMT  openshift-machine-config-operator                           
OK      715     marketplace-operator-metrics                                May  5 21:33:50 2024 GMT  openshift-marketplace                                       
OK      715     alertmanager-main-tls                                       May  5 21:45:20 2024 GMT  openshift-monitoring                                        
OK      715     cluster-monitoring-operator-tls                             May  5 21:33:52 2024 GMT  openshift-monitoring                                        
OK      715     grafana-tls                                                 May  5 21:45:20 2024 GMT  openshift-monitoring                                        
OK      715     kube-state-metrics-tls                                      May  5 21:35:59 2024 GMT  openshift-monitoring                                        
OK      715     node-exporter-tls                                           May  5 21:35:59 2024 GMT  openshift-monitoring                                        
OK      715     openshift-state-metrics-tls                                 May  5 21:35:58 2024 GMT  openshift-monitoring                                        
OK      715     prometheus-adapter-tls                                      May  5 21:35:59 2024 GMT  openshift-monitoring                                        
OK      715     prometheus-k8s-thanos-sidecar-tls                           May  5 21:45:22 2024 GMT  openshift-monitoring                                        
OK      715     prometheus-k8s-tls                                          May  5 21:45:21 2024 GMT  openshift-monitoring                                        
OK      715     prometheus-operator-tls                                     May  5 21:35:43 2024 GMT  openshift-monitoring                                        
OK      715     telemeter-client-tls                                        May  5 21:37:44 2024 GMT  openshift-monitoring                                        
OK      715     thanos-querier-tls                                          May  5 21:35:58 2024 GMT  openshift-monitoring                                        
OK      715     metrics-daemon-secret                                       May  5 21:33:56 2024 GMT  openshift-multus                                            
OK      715     multus-admission-controller-secret                          May  5 21:33:48 2024 GMT  openshift-multus                                            
OK      3635    etcd-client                                                 May  3 21:13:54 2032 GMT  openshift-oauth-apiserver                                   
OK      715     serving-cert                                                May  5 21:34:01 2024 GMT  openshift-oauth-apiserver                                   
OK      715     catalog-operator-serving-cert                               May  5 21:33:47 2024 GMT  openshift-operator-lifecycle-manager                        
OK      715     olm-operator-serving-cert                                   May  5 21:33:48 2024 GMT  openshift-operator-lifecycle-manager                        
OK      714     packageserver-service-cert                                  May  4 21:34:44 2024 GMT  openshift-operator-lifecycle-manager                        
OK      0       pprof-cert                                                  May 21 18:30:03 2022 GMT  openshift-operator-lifecycle-manager                        
OK      3635    ovn-ca                                                      May  3 21:27:45 2032 GMT  openshift-ovn-kubernetes                                    
OK      167     ovn-cert                                                    Nov  5 09:27:45 2022 GMT  openshift-ovn-kubernetes                                    
OK      715     ovn-master-metrics-cert                                     May  5 21:33:53 2024 GMT  openshift-ovn-kubernetes                                    
OK      715     ovn-node-metrics-cert                                       May  5 21:33:49 2024 GMT  openshift-ovn-kubernetes                                    
OK      3635    signer-ca                                                   May  3 21:27:46 2032 GMT  openshift-ovn-kubernetes                                    
OK      167     signer-cert                                                 Nov  5 09:27:46 2022 GMT  openshift-ovn-kubernetes                                    
OK      715     serving-cert                                                May  5 21:33:54 2024 GMT  openshift-service-ca-operator                               
OK      775     signing-key                                                 Jul  4 21:33:37 2024 GMT  openshift-service-ca                                        
OK      725     noobaa-db-serving-cert                                      May 15 23:42:26 2024 GMT  openshift-storage                                           
OK      725     noobaa-mgmt-serving-cert                                    May 15 23:42:26 2024 GMT  openshift-storage                                           
OK      725     noobaa-operator-service-cert                                May 16 06:23:29 2024 GMT  openshift-storage                                           
OK      725     noobaa-s3-serving-cert                                      May 15 23:42:26 2024 GMT  openshift-storage                                           
OK      725     ocs-storagecluster-cos-ceph-rgw-tls-cert                    May 15 23:41:32 2024 GMT  openshift-storage                                           
OK      725     odf-console-serving-cert                                    May 15 23:27:38 2024 GMT  openshift-storage   

The output of the script is simple.  The first column contains the state of the certificate: if it is still valid the field says OK, and if it has expired the field says EXPIRED.  The next column tells us how many days remain until the certificate expires, or, for an expired certificate, how many days ago it expired.  The third column gives the certificate's name, the fourth gives the actual expiry date, and the last column provides the namespace the certificate lives in.
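
If the goal is simply to spot trouble, the report can be narrowed with a little awk.  The sketch below assumes the column layout shown above (STATE first, DAYS second) and uses an arbitrary 30 day threshold:

$ bash certs-expired.sh | awk 'NR<=2 || $1=="EXPIRED" || $2+0 < 30'

This keeps the header lines plus any certificate that has already expired or expires within the next 30 days.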

Again, this is just a simple script, but it provides an example of how we can surface this type of information.  However, if one has a fleet of clusters, then configuring the Red Hat Advanced Cluster Management Certificate Policy Controller might be a more effective method for managing expired certificates.

Friday, May 20, 2022

Install OpenShift with Agent Installer


There are so many ways to install OpenShift: Assisted Installer, UPI, IPI, Red Hat Advanced Cluster Management and ZTP.  However I have always longed for a single ISO image I could just boot on my physical hardware and have it form an OpenShift cluster.  Well, that dream is on course to become a reality with the Agent Installer, a tool that can generate an ephemeral OpenShift installation image.  In the following blog I will demonstrate how to use this early incarnation of the tool.

As I stated, the Agent Installer generates a single ISO image that one would use to boot all of the nodes that will be part of a newly deployed cluster.  Keep in mind this example may change some as the code gets developed and merged into the mainstream OpenShift installer, but if one is interested in exploring this new method, the following is a preview of what is to come.

The first step in trying out the Agent Installer is to grab the OpenShift installer source code from GitHub and check out the commit containing the agent installer changes:

$ git clone https://github.com/openshift/installer
Cloning into 'installer'...
remote: Enumerating objects: 204497, done.
remote: Counting objects: 100% (210/210), done.
remote: Compressing objects: 100% (130/130), done.
remote: Total 204497 (delta 99), reused 153 (delta 70), pack-reused 204287
Receiving objects: 100% (204497/204497), 873.44 MiB | 10.53 MiB/s, done.
Resolving deltas: 100% (132947/132947), done.
Updating files: 100% (86883/86883), done.

$ git checkout 88db7ef
Updating files: 100% (23993/23993), done.
Note: switching to '88db7ef'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:

  git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at 88db7eff2 Fix unnecessary delays in start-cluster-installation

$ git branch
* (HEAD detached at 88db7eff2)
  master
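
The hack/build.sh script we are about to run checks for a minimum Go version of 1.17 (visible in its output below), so it can be worth confirming the local toolchain before kicking off the build:

$ go version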

Once we have the source code checked out we need to go ahead and build the OpenShift install binary:

$ hack/build.sh
+ minimum_go_version=1.17
++ go version
++ cut -d ' ' -f 3
+ current_go_version=go1.17.7
++ version 1.17.7
++ IFS=.
++ printf '%03d%03d%03d\n' 1 17 7
++ unset IFS
++ version 1.17
++ IFS=.
++ printf '%03d%03d%03d\n' 1 17
++ unset IFS
+ '[' 001017007 -lt 001017000 ']'
+ make -C terraform all
make: Entering directory '/home/bschmaus/installer/terraform'
cd providers/alicloud; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-alicloud "$path"; \
zip -1j ../../bin/terraform-provider-alicloud.zip ../../bin/terraform-provider-alicloud;
  adding: terraform-provider-alicloud (deflated 81%)
cd providers/aws; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-aws "$path"; \
zip -1j ../../bin/terraform-provider-aws.zip ../../bin/terraform-provider-aws;
  adding: terraform-provider-aws (deflated 75%)
cd providers/azureprivatedns; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-azureprivatedns "$path"; \
zip -1j ../../bin/terraform-provider-azureprivatedns.zip ../../bin/terraform-provider-azureprivatedns;
  adding: terraform-provider-azureprivatedns (deflated 62%)
cd providers/azurerm; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-azurerm "$path"; \
zip -1j ../../bin/terraform-provider-azurerm.zip ../../bin/terraform-provider-azurerm;
  adding: terraform-provider-azurerm (deflated 77%)
cd providers/azurestack; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-azurestack "$path"; \
zip -1j ../../bin/terraform-provider-azurestack.zip ../../bin/terraform-provider-azurestack;
  adding: terraform-provider-azurestack (deflated 64%)
cd providers/google; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-google "$path"; \
zip -1j ../../bin/terraform-provider-google.zip ../../bin/terraform-provider-google;
  adding: terraform-provider-google (deflated 68%)
cd providers/ibm; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-ibm "$path"; \
zip -1j ../../bin/terraform-provider-ibm.zip ../../bin/terraform-provider-ibm;
  adding: terraform-provider-ibm (deflated 67%)
cd providers/ignition; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-ignition "$path"; \
zip -1j ../../bin/terraform-provider-ignition.zip ../../bin/terraform-provider-ignition;
  adding: terraform-provider-ignition (deflated 61%)
cd providers/ironic; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-ironic "$path"; \
zip -1j ../../bin/terraform-provider-ironic.zip ../../bin/terraform-provider-ironic;
  adding: terraform-provider-ironic (deflated 60%)
cd providers/libvirt; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-libvirt "$path"; \
zip -1j ../../bin/terraform-provider-libvirt.zip ../../bin/terraform-provider-libvirt;
  adding: terraform-provider-libvirt (deflated 61%)
cd providers/local; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-local "$path"; \
zip -1j ../../bin/terraform-provider-local.zip ../../bin/terraform-provider-local;
  adding: terraform-provider-local (deflated 59%)
cd providers/nutanix; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-nutanix "$path"; \
zip -1j ../../bin/terraform-provider-nutanix.zip ../../bin/terraform-provider-nutanix;
  adding: terraform-provider-nutanix (deflated 60%)
cd providers/openstack; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-openstack "$path"; \
zip -1j ../../bin/terraform-provider-openstack.zip ../../bin/terraform-provider-openstack;
  adding: terraform-provider-openstack (deflated 62%)
cd providers/ovirt; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-ovirt "$path"; \
zip -1j ../../bin/terraform-provider-ovirt.zip ../../bin/terraform-provider-ovirt;
  adding: terraform-provider-ovirt (deflated 66%)
cd providers/random; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-random "$path"; \
zip -1j ../../bin/terraform-provider-random.zip ../../bin/terraform-provider-random;
  adding: terraform-provider-random (deflated 59%)
cd providers/vsphere; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-vsphere "$path"; \
zip -1j ../../bin/terraform-provider-vsphere.zip ../../bin/terraform-provider-vsphere;
  adding: terraform-provider-vsphere (deflated 68%)
cd providers/vsphereprivate; \
if [ -f main.go ]; then path="."; else path=./vendor/`grep _ tools.go|awk '{ print $2 }'|sed 's|"||g'`; fi; \
go build -ldflags "-s -w" -o ../../bin/terraform-provider-vsphereprivate "$path"; \
zip -1j ../../bin/terraform-provider-vsphereprivate.zip ../../bin/terraform-provider-vsphereprivate;
  adding: terraform-provider-vsphereprivate (deflated 69%)
cd terraform; \
go build -ldflags "-s -w" -o ../bin/terraform ./vendor/github.com/hashicorp/terraform
make: Leaving directory '/home/bschmaus/installer/terraform'
+ copy_terraform_to_mirror
++ go env GOOS
++ go env GOARCH
+ TARGET_OS_ARCH=linux_amd64
+ rm -rf '/home/bschmaus/installer/pkg/terraform/providers/mirror/*/'
+ find /home/bschmaus/installer/terraform/bin/ -maxdepth 1 -name 'terraform-provider-*.zip' -exec bash -c '
      providerName="$(basename "$1" | cut -d - -f 3 | cut -d . -f 1)"
      targetOSArch="$2"
      dstDir="${PWD}/pkg/terraform/providers/mirror/openshift/local/$providerName"
      mkdir -p "$dstDir"
      echo "Copying $providerName provider to mirror"
      cp "$1" "$dstDir/terraform-provider-${providerName}_1.0.0_${targetOSArch}.zip"
    ' shell '{}' linux_amd64 ';'
Copying alicloud provider to mirror
Copying aws provider to mirror
Copying azureprivatedns provider to mirror
Copying azurerm provider to mirror
Copying azurestack provider to mirror
Copying google provider to mirror
Copying ibm provider to mirror
Copying ignition provider to mirror
Copying ironic provider to mirror
Copying libvirt provider to mirror
Copying local provider to mirror
Copying nutanix provider to mirror
Copying openstack provider to mirror
Copying ovirt provider to mirror
Copying random provider to mirror
Copying vsphere provider to mirror
Copying vsphereprivate provider to mirror
+ mkdir -p /home/bschmaus/installer/pkg/terraform/providers/mirror/terraform/
+ cp /home/bschmaus/installer/terraform/bin/terraform /home/bschmaus/installer/pkg/terraform/providers/mirror/terraform/
+ MODE=release
++ git rev-parse --verify 'HEAD^{commit}'
+ GIT_COMMIT=d74e210f30edf110764d87c8223a18b8a9952253
++ git describe --always --abbrev=40 --dirty
+ GIT_TAG=unreleased-master-6040-gd74e210f30edf110764d87c8223a18b8a9952253
+ DEFAULT_ARCH=amd64
+ GOFLAGS=-mod=vendor
+ LDFLAGS=' -X github.com/openshift/installer/pkg/version.Raw=unreleased-master-6040-gd74e210f30edf110764d87c8223a18b8a9952253 -X github.com/openshift/installer/pkg/version.Commit=d74e210f30edf110764d87c8223a18b8a9952253 -X github.com/openshift/installer/pkg/version.defaultArch=amd64'
+ TAGS=
+ OUTPUT=bin/openshift-install
+ export CGO_ENABLED=0
+ CGO_ENABLED=0
+ case "${MODE}" in
+ LDFLAGS=' -X github.com/openshift/installer/pkg/version.Raw=unreleased-master-6040-gd74e210f30edf110764d87c8223a18b8a9952253 -X github.com/openshift/installer/pkg/version.Commit=d74e210f30edf110764d87c8223a18b8a9952253 -X github.com/openshift/installer/pkg/version.defaultArch=amd64 -s -w'
+ TAGS=' release'
+ test '' '!=' y
+ go generate ./data
writing assets_vfsdata.go
+ echo ' release'
+ grep -q libvirt
+ go build -mod=vendor -ldflags ' -X github.com/openshift/installer/pkg/version.Raw=unreleased-master-6040-gd74e210f30edf110764d87c8223a18b8a9952253 -X github.com/openshift/installer/pkg/version.Commit=d74e210f30edf110764d87c8223a18b8a9952253 -X github.com/openshift/installer/pkg/version.defaultArch=amd64 -s -w' -tags ' release' -o bin/openshift-install ./cmd/openshift-install 

Once the OpenShift install binary is built, we next need to create a manifests directory under the installer directory.  In this manifests directory we will create six files that give the Agent Installer the blueprint of what our cluster should look like.  First let's create the directory:

$ pwd 
/home/bschmaus/installer

$ mkdir manifests

With the directory created we can move on to creating the agent cluster install resource file.  This file specifies the cluster's configuration, such as the number of control plane and/or worker nodes, the API and ingress VIPs, and the cluster networking.  In my example I will be deploying a 3 node compact cluster which references a cluster deployment named kni22:

$ cat << EOF > ./manifests/agent-cluster-install.yaml
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
  name: kni22
  namespace: kni22
spec:
  apiVIP: 192.168.0.125
  ingressVIP: 192.168.0.126
  clusterDeploymentRef:
    name: kni22
  imageSetRef:
    name: openshift-v4.10.0
  networking:
    clusterNetwork:
    - cidr: 10.128.0.0/14
      hostPrefix: 23
    serviceNetwork:
    - 172.30.0.0/16
  provisionRequirements:
    controlPlaneAgents: 3
    workerAgents: 0 
  sshPublicKey: 'INSERT PUBLIC SSH KEY HERE'
EOF

Next we will create the cluster deployment resource file which defines the cluster name, domain, and other details:

$ cat << EOF > ./manifests/cluster-deployment.yaml
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: kni22
  namespace: kni22
spec:
  baseDomain: schmaustech.com
  clusterInstallRef:
    group: extensions.hive.openshift.io
    kind: AgentClusterInstall
    name: kni22-agent-cluster-install
    version: v1beta1
  clusterName: kni22
  controlPlaneConfig:
    servingCertificates: {}
  platform:
    agentBareMetal:
      agentSelector:
        matchLabels:
          bla: aaa
  pullSecretRef:
    name: pull-secret
EOF

Moving on, we now create the cluster image set resource file, which contains the OpenShift release image information such as the repository and image name.  This determines the version of OpenShift that gets deployed for our 3 node compact cluster.  In this example we are using 4.10.10:

$ cat << EOF > ./manifests/cluster-image-set.yaml
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: ocp-release-4.10.10-x86-64-for-4.10.0-0-to-4.11.0-0
spec:
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.10.10-x86_64
EOF
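
If there is any doubt about what a given release image contains, oc can inspect it directly; depending on the local container auth setup a pull secret may be required:

$ oc adm release info quay.io/openshift-release-dev/ocp-release:4.10.10-x86_64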

Next we define the infrastructure environment file, which contains the information needed for pulling OpenShift onto the target host nodes we are deploying to:

$ cat << EOF > ./manifests/infraenv.yaml 
apiVersion: agent-install.openshift.io/v1beta1
kind: InfraEnv
metadata:
  name: kni22
  namespace: kni22
spec:
  clusterRef:
    name: kni22  
    namespace: kni22
  pullSecretRef:
    name: pull-secret
  sshAuthorizedKey: 'INSERT PUBLIC SSH KEY HERE'
  nmStateConfigLabelSelector:
    matchLabels:
      kni22-nmstate-label-name: kni22-nmstate-label-value
EOF

The next file is the nmstate configuration file, which provides the network details for each of the hosts that will be booted using the ISO image we are going to create.  Since we are deploying a 3 node compact cluster, the file below contains three nmstate configurations.  Each configuration is for one node and assigns a static IP address on the node's enp2s0 interface matching the MAC address defined.  This allows the ISO to boot up without necessarily requiring DHCP in the environment, which is something a lot of customers are looking for.  Again, my example has 3 configurations, but if we had worker nodes we would add those in too.  Let's go ahead and create the file:

$ cat << EOF > ./manifests/nmstateconfig.yaml
---
apiVersion: agent-install.openshift.io/v1beta1
kind: NMStateConfig
metadata:
  name: mynmstateconfig01
  namespace: openshift-machine-api
  labels:
    kni22-nmstate-label-name: kni22-nmstate-label-value
spec:
  config:
    interfaces:
      - name: enp2s0
        type: ethernet
        state: up
        mac-address: 52:54:00:e7:05:72
        ipv4:
          enabled: true
          address:
            - ip: 192.168.0.116
              prefix-length: 24
          dhcp: false
    dns-resolver:
      config:
        server:
          - 192.168.0.10
    routes:
      config:
        - destination: 0.0.0.0/0
          next-hop-address: 192.168.0.1
          next-hop-interface: enp2s0
          table-id: 254
  interfaces:
    - name: "enp2s0"
      macAddress: 52:54:00:e7:05:72
---
apiVersion: agent-install.openshift.io/v1beta1
kind: NMStateConfig
metadata:
  name: mynmstateconfig02
  namespace: openshift-machine-api
  labels:
    kni22-nmstate-label-name: kni22-nmstate-label-value
spec:
  config:
    interfaces:
      - name: enp2s0
        type: ethernet
        state: up
        mac-address: 52:54:00:95:fd:f3
        ipv4:
          enabled: true
          address:
            - ip: 192.168.0.117
              prefix-length: 24
          dhcp: false
    dns-resolver:
      config:
        server:
          - 192.168.0.10
    routes:
      config:
        - destination: 0.0.0.0/0
          next-hop-address: 192.168.0.1
          next-hop-interface: enp2s0
          table-id: 254
  interfaces:
    - name: "enp2s0"
      macAddress: 52:54:00:95:fd:f3
---
apiVersion: agent-install.openshift.io/v1beta1
kind: NMStateConfig
metadata:
  name: mynmstateconfig03
  namespace: openshift-machine-api
  labels:
    kni22-nmstate-label-name: kni22-nmstate-label-value
spec:
  config:
    interfaces:
      - name: enp2s0
        type: ethernet
        state: up
        mac-address: 52:54:00:e8:b9:18
        ipv4:
          enabled: true
          address:
            - ip: 192.168.0.118
              prefix-length: 24
          dhcp: false
    dns-resolver:
      config:
        server:
          - 192.168.0.10
    routes:
      config:
        - destination: 0.0.0.0/0
          next-hop-address: 192.168.0.1
          next-hop-interface: enp2s0
          table-id: 254
  interfaces:
    - name: "enp2s0"
      macAddress: 52:54:00:e8:b9:18
EOF
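
For reference, the 52:54:00 prefixed MAC addresses above correspond to the NICs of the libvirt guests that will boot this image later in the post.  A handy way to collect them on the hypervisor is virsh domiflist, which lists each guest's interfaces along with their MAC addresses, for example:

# virsh domiflist asus3-vm1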

The final file we need to create is the pull-secret resource file which contains the pull-secret values so that our cluster can pull in the required OpenShift images to instantiate the cluster:

$ cat << EOF > ./manifests/pull-secret.yaml 
apiVersion: v1
kind: Secret
type: kubernetes.io/dockerconfigjson
metadata:
  name: pull-secret
  namespace: kni22
stringData:
  .dockerconfigjson: 'INSERT JSON FORMATTED PULL-SECRET'
EOF
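
The pull secret itself comes from the Red Hat console.  Before pasting it in, it can be worth verifying that the downloaded file (assumed here to be ~/pull-secret.json) is valid JSON, for example with jq:

$ jq -c . ~/pull-secret.json > /dev/null && echo "pull secret parses cleanly"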

At this point we should now have our six required files defined to build our Agent Installer ISO:

$ ls -1 ./manifests/
agent-cluster-install.yaml
cluster-deployment.yaml
cluster-image-set.yaml
infraenv.yaml
nmstateconfig.yaml
pull-secret.yaml 
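
Since the examples above use placeholder text for the SSH key and pull secret, a quick grep can confirm the placeholders were actually replaced before we generate the image:

$ grep -rl 'INSERT' ./manifests/ || echo "all placeholders have been replaced"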

We are now ready to use the OpenShift install binary we compiled earlier with the Agent Installer code to generate our ephemeral OpenShift ISO.  We do this by issuing the following command, which introduces the agent option.  This in turn reads in the manifest details we generated, downloads the corresponding RHCOS image, and then injects our details into the image, writing out a file called agent.iso:

$ bin/openshift-install agent create image 
INFO adding MAC interface map to host static network config - Name:  enp2s0  MacAddress: 52:54:00:e7:05:72 
INFO adding MAC interface map to host static network config - Name:  enp2s0  MacAddress: 52:54:00:95:fd:f3 
INFO adding MAC interface map to host static network config - Name:  enp2s0  MacAddress: 52:54:00:e8:b9:18 
INFO[0000] Adding NMConnection file <enp2s0 .nmconnection="">  pkg=manifests
INFO[0000] Adding NMConnection file <enp2s0 .nmconnection="">  pkg=manifests
INFO[0001] Adding NMConnection file <enp2s0 .nmconnection="">  pkg=manifests
INFO[0001] Start configuring static network for 3 hosts  pkg=manifests
INFO[0001] Adding NMConnection file <enp2s0 .nmconnection="">  pkg=manifests
INFO[0001] Adding NMConnection file <enp2s0 .nmconnection="">  pkg=manifests
INFO[0001] Adding NMConnection file <enp2s0 .nmconnection="">  pkg=manifests
INFO Obtaining RHCOS image file from 'https://rhcos-redirector.apps.art.xq1c.p1.openshiftapps.com/art/storage/releases/rhcos-4.11/411.85.202203181601-0/x86_64/rhcos-411.85.202203181601-0-live.x86_64.iso' 
INFO   

Once the agent create image command completes we are left with an agent.iso image which is in fact our OpenShift install ISO:

$ ls -l ./output/
total 1073152
-rw-rw-r--. 1 bschmaus bschmaus 1098907648 May 20 08:55 agent.iso

Since the nodes I will be using to demonstrate this 3 node compact cluster are virtual machines all on the same KVM hypervisor, I will go ahead and copy the agent.iso image over to that host:

$ scp ./output/agent.iso root@192.168.0.22:/var/lib/libvirt/images/
root@192.168.0.22's password: 
agent.iso   
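
Since the image is over a gigabyte, an optional checksum on each end confirms the copy arrived intact; run it locally and then as root on the hypervisor:

$ sha256sum ./output/agent.iso
# sha256sum /var/lib/libvirt/images/agent.iso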

With the image moved over to the hypervisor host, I went ahead and ensured each virtual machine we are using (asus3-vm[1-3]) has the agent.iso image attached.  Further, the hosts are configured to boot off the ISO if the disk is empty.  We can confirm everything is ready with the following output:

# virsh list --all
 Id   Name        State
----------------------------
 -    asus3-vm1   shut off
 -    asus3-vm2   shut off
 -    asus3-vm3   shut off
 -    asus3-vm4   shut off
 -    asus3-vm5   shut off
 -    asus3-vm6   shut off

# virsh domblklist asus3-vm1
 Target   Source
---------------------------------------------------
 sda      /var/lib/libvirt/images/asus3-vm1.qcow2
 sdb      /var/lib/libvirt/images/agent.iso

# virsh domblklist asus3-vm2
 Target   Source
---------------------------------------------------
 sda      /var/lib/libvirt/images/asus3-vm2.qcow2
 sdb      /var/lib/libvirt/images/agent.iso

# virsh domblklist asus3-vm3
 Target   Source
---------------------------------------------------
 sda      /var/lib/libvirt/images/asus3-vm3.qcow2
 sdb      /var/lib/libvirt/images/agent.iso
 

# virsh start asus3-vm1
Domain asus3-vm1 started

Once the first virtual machine is started we can switch over to the console and watch it boot up:


During the boot process the system will come up to a standard login prompt on the console.  Then, in the background, the host will start pulling in the required containers to run the familiar Assisted Installer UI.  I gave this process about 5 minutes before I attempted to access the web UI.  To access the web UI we can point our browser at the IP address of the node we just booted on port 8080:
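
In my environment that meant browsing to a URL along these lines, assuming the first node booted is the one that was assigned 192.168.0.116 in the NMStateConfig above:

http://192.168.0.116:8080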


We should see a visible kni22 cluster with a status of draft because no nodes have been associated with it yet.  Next we will click on kni22 to bring us into the configuration:


We can see the familiar Assisted Installer discovery screen, and we can also see our first host is listed.  At this point let's turn on the other two nodes that will make up our 3 node compact cluster and let them also boot from the agent ISO we created.
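
Back on the KVM hypervisor that is simply a matter of starting the remaining guests, just as we did for the first node:

# virsh start asus3-vm2
# virsh start asus3-vm3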

After the other two nodes have booted we should see them appear in the web UI.  We can also see that the node names are all localhost.  This is because I set static IP addresses in the nmstateconfig.yaml above; if we had gone with DHCP, the names would have been set by DHCP.  Nevertheless, we can go ahead and edit each hostname, set it to the proper name, and click next to continue:


There will be an additional configuration page where other configuration items are set and could be changed if needed, but we will click next through that screen to bring us to the summary page:



If everything looks correct we can go ahead and click on the install cluster button to start the deployment:




At this point the cluster installation begins.  I should point out, however, that we will not be able to watch the installation complete from the web UI.  The reason is that the other two nodes will get their RHCOS images written to disk, reboot, and then instantiate part of the cluster.  At that point the first node, the one running the web UI, will also get its RHCOS image written to disk and reboot.  After that the web UI is no longer available to watch.  With that in mind I recommend grabbing the kubeconfig for the cluster by clicking on the download kubeconfig button.

Once the web UI is no longer accessible we can monitor the installation from the command line using the kubeconfig we downloaded.  First let's see where the nodes are at:

$ export KUBECONFIG=/home/bschmaus/kubeconfig-kni22

$ oc get nodes
NAME        STATUS   ROLES           AGE   VERSION
asus3-vm1   Ready    master,worker   2m    v1.23.5+9ce5071
asus3-vm2   Ready    master,worker   29m   v1.23.5+9ce5071
asus3-vm3   Ready    master,worker   29m   v1.23.5+9ce5071

All the nodes are in a ready state and marked as both control plane and worker.  Now let's see where the cluster operators are at:

$ oc get co 
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.10   False       True          True       18m     WellKnownAvailable: The well-known endpoint is not yet available: need at least 3 kube-apiservers, got 2
baremetal                                  4.10.10   True        False         False      17m     
cloud-controller-manager                   4.10.10   True        False         False      29m     
cloud-credential                           4.10.10   True        False         False      34m     
cluster-autoscaler                         4.10.10   True        False         False      16m     
config-operator                            4.10.10   True        False         False      18m     
console                                    4.10.10   True        False         False      4m33s   
csi-snapshot-controller                    4.10.10   True        False         False      18m     
dns                                        4.10.10   True        False         False      17m     
etcd                                       4.10.10   True        True          False      16m     NodeInstallerProgressing: 1 nodes are at revision 0; 2 nodes are at revision 4; 0 nodes have achieved new revision 5
image-registry                             4.10.10   True        False         False      9m44s   
ingress                                    4.10.10   True        False         False      11m     
insights                                   4.10.10   True        False         False      12m     
kube-apiserver                             4.10.10   True        True          False      4m29s   NodeInstallerProgressing: 1 nodes are at revision 0; 2 nodes are at revision 6
kube-controller-manager                    4.10.10   True        True          False      14m     NodeInstallerProgressing: 1 nodes are at revision 0; 2 nodes are at revision 7
kube-scheduler                             4.10.10   True        True          False      14m     NodeInstallerProgressing: 1 nodes are at revision 0; 2 nodes are at revision 6
kube-storage-version-migrator              4.10.10   True        False         False      18m     
machine-api                                4.10.10   True        False         False      7m53s   
machine-approver                           4.10.10   True        False         False      17m     
machine-config                             4.10.10   True        False         False      17m     
marketplace                                4.10.10   True        False         False      16m     
monitoring                                 4.10.10   True        False         False      5m52s   
network                                    4.10.10   True        True          False      19m     DaemonSet "openshift-multus/network-metrics-daemon" is not available (awaiting 1 nodes)...
node-tuning                                4.10.10   True        False         False      15m     
openshift-apiserver                        4.10.10   True        False         False      4m45s   
openshift-controller-manager               4.10.10   True        False         False      15m     
openshift-samples                          4.10.10   True        False         False      7m37s   
operator-lifecycle-manager                 4.10.10   True        False         False      17m     
operator-lifecycle-manager-catalog         4.10.10   True        False         False      17m     
operator-lifecycle-manager-packageserver   4.10.10   True        False         False      11m     
service-ca                                 4.10.10   True        False         False      19m     
storage                                    4.10.10   True        False         False      19m  
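
Rather than re-running oc get co by hand, something like watch can poll it on an interval (30 seconds here is arbitrary), and oc get clusterversion gives a one-line summary of the overall install progress:

$ watch -n 30 oc get co
$ oc get clusterversion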

The cluster operators are still rolling out, so let's give it a few more minutes and check again:

$ oc get co
NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication                             4.10.10   True        False         False      13m     
baremetal                                  4.10.10   True        False         False      31m     
cloud-controller-manager                   4.10.10   True        False         False      43m     
cloud-credential                           4.10.10   True        False         False      49m     
cluster-autoscaler                         4.10.10   True        False         False      31m     
config-operator                            4.10.10   True        False         False      33m     
console                                    4.10.10   True        False         False      19m     
csi-snapshot-controller                    4.10.10   True        False         False      33m     
dns                                        4.10.10   True        False         False      32m     
etcd                                       4.10.10   True        False         False      31m     
image-registry                             4.10.10   True        False         False      24m     
ingress                                    4.10.10   True        False         False      26m     
insights                                   4.10.10   True        False         False      27m     
kube-apiserver                             4.10.10   True        False         False      19m     
kube-controller-manager                    4.10.10   True        False         False      29m     
kube-scheduler                             4.10.10   True        False         False      28m     
kube-storage-version-migrator              4.10.10   True        False         False      33m     
machine-api                                4.10.10   True        False         False      22m     
machine-approver                           4.10.10   True        False         False      32m     
machine-config                             4.10.10   True        False         False      32m     
marketplace                                4.10.10   True        False         False      31m     
monitoring                                 4.10.10   True        False         False      20m     
network                                    4.10.10   True        False         False      34m     
node-tuning                                4.10.10   True        False         False      30m     
openshift-apiserver                        4.10.10   True        False         False      19m     
openshift-controller-manager               4.10.10   True        False         False      29m     
openshift-samples                          4.10.10   True        False         False      22m     
operator-lifecycle-manager                 4.10.10   True        False         False      32m     
operator-lifecycle-manager-catalog         4.10.10   True        False         False      32m     
operator-lifecycle-manager-packageserver   4.10.10   True        False         False      26m     
service-ca                                 4.10.10   True        False         False      34m     
storage                                    4.10.10   True        False         False      34m   

At this point our cluster installation is complete.  However, I forgot to mention that while the web UI was up we should have ssh'd to the bootstrap node and shelled into the running assisted installer container to retrieve our kubeadmin password under the /data directory.   I purposely skipped that part so I could show how we can simply reset the kubeadmin password instead.
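
For completeness, that retrieval would have looked something like the commands below.  Treat this strictly as a sketch: the exact container name and the file layout under /data are assumptions, so you would need to poke around a bit:

$ ssh core@<bootstrap-node-ip>
$ sudo podman ps                                # locate the assisted installer container
$ sudo podman exec -it <container-id> ls /data  # kubeadmin credentials live under this directory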

First I want to thank Andrew Block for his write-up on how to do this here.  Let's go ahead and create the kubeadmin-rotate.go file in a kuberotate directory:

$ mkdir ~/kuberotate
$ cd ~/kuberotate

$ cat << EOF > ./kubeadmin-rotate.go 
package main

import (
    "fmt"
    "crypto/rand"
    "golang.org/x/crypto/bcrypt"
    b64 "encoding/base64"
    "math/big"
)

// generateRandomPasswordHash generates a hash of a random ASCII password
// 5char-5char-5char-5char
func generateRandomPasswordHash(length int) (string, string, error) {
    const (
        lowerLetters = "abcdefghijkmnopqrstuvwxyz"
        upperLetters = "ABCDEFGHIJKLMNPQRSTUVWXYZ"
        digits       = "23456789"
        all          = lowerLetters + upperLetters + digits
    )
    var password string
    for i := 0; i < length; i++ {
        n, err := rand.Int(rand.Reader, big.NewInt(int64(len(all))))
        if err != nil {
            return "", "", err
        }
        newchar := string(all[n.Int64()])
        if password == "" {
            password = newchar
        }
        if i < length-1 {
            n, err = rand.Int(rand.Reader, big.NewInt(int64(len(password)+1)))
            if err != nil {
                return "", "", err
            }
            j := n.Int64()
            password = password[0:j] + newchar + password[j:]
        }
    }
    pw := []rune(password)
    for _, replace := range []int{5, 11, 17} {
        pw[replace] = '-'
    }
    bytes, err := bcrypt.GenerateFromPassword([]byte(string(pw)), bcrypt.DefaultCost)
    if err != nil {
        return "", "", err
    }
    return string(pw), string(bytes), nil
}

func main() {
    password, hash, err := generateRandomPasswordHash(23)
    if err != nil {
        fmt.Println(err.Error())
        return
    }
    fmt.Printf("Actual Password: %s\n", password)
    fmt.Printf("Hashed Password: %s\n", hash)
    fmt.Printf("Data to Change in Secret: %s\n", b64.StdEncoding.EncodeToString([]byte(hash)))
}
EOF

Next let's go ahead and initialize our Go project:

$ go mod init kuberotate
go: creating new go.mod: module kuberotate

With the project initialized, let's pull in the module dependencies by executing go mod tidy, which will fetch the bcrypt module:

$ go mod tidy
go: finding module for package golang.org/x/crypto/bcrypt
go: found golang.org/x/crypto/bcrypt in golang.org/x/crypto v0.0.0-20220518034528-6f7dac969898

And finally, since I just want to run the program rather than compile it, I will simply execute go run kubeadmin-rotate.go, which prints out the password, a bcrypt hash of the password and a base64 encoded version of that hash:

$ go run kubeadmin-rotate.go 
Actual Password: gWdYr-62GLh-QIynG-Boj7n
Hashed Password: $2a$10$DN48Jp4YkuEEVMWZNyOR2.LkLn1ZZOJOtzR8c9detf1lVAQ2iVQGK
Data to Change in Secret: JDJhJDEwJERONDhKcDRZa3VFRVZNV1pOeU9SMi5Ma0xuMVpaT0pPdHpSOGM5ZGV0ZjFsVkFRMmlWUUdL
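
As a quick sanity check, the base64 string can be decoded to confirm it round-trips back to the bcrypt hash before we patch anything:

$ echo "JDJhJDEwJERONDhKcDRZa3VFRVZNV1pOeU9SMi5Ma0xuMVpaT0pPdHpSOGM5ZGV0ZjFsVkFRMmlWUUdL" | base64 -d
$2a$10$DN48Jp4YkuEEVMWZNyOR2.LkLn1ZZOJOtzR8c9detf1lVAQ2iVQGK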

The last step is to patch the kubeadmin secret in the kube-system namespace with the base64 encoded hash:

$ oc patch secret -n kube-system kubeadmin --type json -p '[{"op": "replace", "path": "/data/kubeadmin", "value": "JDJhJDEwJERONDhKcDRZa3VFRVZNV1pOeU9SMi5Ma0xuMVpaT0pPdHpSOGM5ZGV0ZjFsVkFRMmlWUUdL"}]'
secret/kubeadmin patched
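
Before heading over to the console we could also verify the new password from the CLI; the API endpoint below is a placeholder for your own cluster:

$ oc login -u kubeadmin -p 'gWdYr-62GLh-QIynG-Boj7n' https://api.<cluster-domain>:6443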

Now we can go over to the OpenShift console and see if we can log in.   Sure enough, with the password generated above we can, confirming that our 3 node OpenShift cluster installed by the agent installer is ready to be used for workloads:


Hopefully this blog was useful in providing a preview of what the agent installer will look like.  Keep in mind the code is under rapid development, so things could change, but change is always good!

Tuesday, May 17, 2022

Rook on Single Node OpenShift


Recently a lot of customers have been asking how to configure storage on a Single Node OpenShift (SNO) deployment.   When dealing with a compact or multi-node OpenShift deployment I have always relied on OpenShift Data Foundation (ODF) as the underpinning of my storage requirements; after all, it provides block, object and file storage from a single deployment.  In a SNO deployment, however, ODF is not really an option due to the way the operator has been designed.  There is still a way to get some resemblance of ODF without a lot of hassle in a SNO environment.  The following blog demonstrates an unsupported way of getting the dynamic block storage I need in my SNO cluster using Rook.

Before we begin, let's quickly go over the environment of this SNO cluster.   As with any SNO cluster it is a single node acting as both the control plane and worker node.  This particular deployment was based on OpenShift 4.10.11 and deployed using the Assisted Installer at cloud.redhat.com.

$ oc get nodes
NAME                            STATUS   ROLES           AGE   VERSION
master-0.sno3.schmaustech.com   Ready    master,worker   3h    v1.23.5+9ce5071

Inside the node, via the debug pod, we can see that we have an extra 160GB disk available to use for our Rook deployment:

$ oc debug node/master-0.sno3.schmaustech.com
Starting pod/master-0sno3schmaustechcom-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.0.206
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0   120G  0 disk 
|-sda1   8:1    0     1M  0 part 
|-sda2   8:2    0   127M  0 part 
|-sda3   8:3    0   384M  0 part /boot
`-sda4   8:4    0 119.5G  0 part /sysroot
sdb      8:16   0   160G  0 disk 
sr0     11:0    1   999M  0 rom
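
One thing worth noting before handing the disk to Rook: the OSD prepare job will only consume a device that has no existing partition table or filesystem signatures, so it can be worth checking the disk first and, only if old signatures are present, wiping it (which is destructive):

sh-4.4# lsblk -f /dev/sdb
sh-4.4# wipefs -a /dev/sdb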

Now that we have covered the environment background, let's go ahead and start configuring Rook on the SNO node.   The first step will be to create the custom resource definitions that Rook requires before the operator and the Ceph cluster can be deployed.   Since I have not changed the crds.yaml file from its defaults, we can consume it directly from the Rook GitHub repository and apply it to our SNO node:

$ oc create -f https://raw.githubusercontent.com/rook/rook/master/deploy/examples/crds.yaml 
customresourcedefinition.apiextensions.k8s.io/cephblockpoolradosnamespaces.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephblockpools.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephbucketnotifications.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephbuckettopics.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclients.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephclusters.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystemmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystems.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephfilesystemsubvolumegroups.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephnfses.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectrealms.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstores.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectstoreusers.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzonegroups.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephobjectzones.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/cephrbdmirrors.ceph.rook.io created
customresourcedefinition.apiextensions.k8s.io/objectbucketclaims.objectbucket.io created
customresourcedefinition.apiextensions.k8s.io/objectbuckets.objectbucket.io created

With the custom resource definitions applied, we can move on to adding the common resources that are necessary to start the operator and the Ceph cluster.  Again, since I am not changing anything in the defaults, we can apply them directly from the Rook GitHub repository to the SNO node:
  
$ oc create -f https://raw.githubusercontent.com/rook/rook/master/deploy/examples/common.yaml 
namespace/rook-ceph created
clusterrole.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/cephfs-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/psp:rook created
clusterrole.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrole.rbac.authorization.k8s.io/rbd-external-provisioner-runner created
clusterrole.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
clusterrole.rbac.authorization.k8s.io/rook-ceph-global created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrole.rbac.authorization.k8s.io/rook-ceph-mgr-system created
clusterrole.rbac.authorization.k8s.io/rook-ceph-object-bucket created
clusterrole.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrole.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
clusterrolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-global created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-cluster created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-object-bucket created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-system created
clusterrolebinding.rbac.authorization.k8s.io/rook-ceph-system-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-cephfs-provisioner-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-plugin-sa-psp created
clusterrolebinding.rbac.authorization.k8s.io/rook-csi-rbd-provisioner-sa-psp created
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/00-rook-privileged created
role.rbac.authorization.k8s.io/cephfs-external-provisioner-cfg created
role.rbac.authorization.k8s.io/rbd-csi-nodeplugin created
role.rbac.authorization.k8s.io/rbd-external-provisioner-cfg created
role.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
role.rbac.authorization.k8s.io/rook-ceph-mgr created
role.rbac.authorization.k8s.io/rook-ceph-osd created
role.rbac.authorization.k8s.io/rook-ceph-purge-osd created
role.rbac.authorization.k8s.io/rook-ceph-rgw created
role.rbac.authorization.k8s.io/rook-ceph-system created
rolebinding.rbac.authorization.k8s.io/cephfs-csi-provisioner-role-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-nodeplugin-role-cfg created
rolebinding.rbac.authorization.k8s.io/rbd-csi-provisioner-role-cfg created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cluster-mgmt created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter created
rolebinding.rbac.authorization.k8s.io/rook-ceph-cmd-reporter-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-default-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-mgr-system created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-osd-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-purge-osd created
rolebinding.rbac.authorization.k8s.io/rook-ceph-purge-osd-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-rgw created
rolebinding.rbac.authorization.k8s.io/rook-ceph-rgw-psp created
rolebinding.rbac.authorization.k8s.io/rook-ceph-system created
serviceaccount/rook-ceph-cmd-reporter created
serviceaccount/rook-ceph-mgr created
serviceaccount/rook-ceph-osd created
serviceaccount/rook-ceph-purge-osd created
serviceaccount/rook-ceph-rgw created
serviceaccount/rook-ceph-system created
serviceaccount/rook-csi-cephfs-plugin-sa created
serviceaccount/rook-csi-cephfs-provisioner-sa created
serviceaccount/rook-csi-rbd-plugin-sa created
serviceaccount/rook-csi-rbd-provisioner-sa created

Next we need to create the Rook operator.yaml file which will be used to configure the Rook operator:

$ cat << EOF > ~/operator.yaml
kind: SecurityContextConstraints
apiVersion: security.openshift.io/v1
metadata:
  name: rook-ceph
allowPrivilegedContainer: true
allowHostDirVolumePlugin: true
allowHostPID: false
allowHostNetwork: false
allowHostPorts: false
priority:
allowedCapabilities: ["MKNOD"]
allowHostIPC: true
readOnlyRootFilesystem: false
requiredDropCapabilities: []
defaultAddCapabilities: []
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
volumes:
  - configMap
  - downwardAPI
  - emptyDir
  - hostPath
  - persistentVolumeClaim
  - projected
  - secret
users:
  - system:serviceaccount:rook-ceph:rook-ceph-system 
  - system:serviceaccount:rook-ceph:default 
  - system:serviceaccount:rook-ceph:rook-ceph-mgr 
  - system:serviceaccount:rook-ceph:rook-ceph-osd 
  - system:serviceaccount:rook-ceph:rook-ceph-rgw 
---
kind: SecurityContextConstraints
apiVersion: security.openshift.io/v1
metadata:
  name: rook-ceph-csi
allowPrivilegedContainer: true
allowHostNetwork: true
allowHostDirVolumePlugin: true
priority:
allowedCapabilities: ["SYS_ADMIN"]
allowHostPorts: true
allowHostPID: true
allowHostIPC: true
readOnlyRootFilesystem: false
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
fsGroup:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:
  - configMap
  - projected
  - emptyDir
  - hostPath
users:
  - system:serviceaccount:rook-ceph:rook-csi-rbd-plugin-sa 
  - system:serviceaccount:rook-ceph:rook-csi-rbd-provisioner-sa 
  - system:serviceaccount:rook-ceph:rook-csi-cephfs-plugin-sa 
  - system:serviceaccount:rook-ceph:rook-csi-cephfs-provisioner-sa 
  - system:serviceaccount:rook-ceph:rook-csi-nfs-plugin-sa 
  - system:serviceaccount:rook-ceph:rook-csi-nfs-provisioner-sa 
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: rook-ceph-operator-config
  namespace: rook-ceph 
data:
  ROOK_LOG_LEVEL: "INFO"
  ROOK_CSI_ENABLE_CEPHFS: "true"
  ROOK_CSI_ENABLE_RBD: "true"
  ROOK_CSI_ENABLE_NFS: "false"
  ROOK_CSI_ENABLE_GRPC_METRICS: "false"
  CSI_ENABLE_ENCRYPTION: "false"
  CSI_PROVISIONER_REPLICAS: "2"
  CSI_ENABLE_CEPHFS_SNAPSHOTTER: "true"
  CSI_ENABLE_RBD_SNAPSHOTTER: "true"
  CSI_FORCE_CEPHFS_KERNEL_CLIENT: "true"
  CSI_RBD_FSGROUPPOLICY: "ReadWriteOnceWithFSType"
  CSI_CEPHFS_FSGROUPPOLICY: "ReadWriteOnceWithFSType"
  CSI_NFS_FSGROUPPOLICY: "ReadWriteOnceWithFSType"
  ROOK_CSI_ALLOW_UNSUPPORTED_VERSION: "false"
  CSI_PLUGIN_ENABLE_SELINUX_HOST_MOUNT: "false"
  CSI_PLUGIN_PRIORITY_CLASSNAME: "system-node-critical"
  CSI_PROVISIONER_PRIORITY_CLASSNAME: "system-cluster-critical"
  ROOK_OBC_WATCH_OPERATOR_NAMESPACE: "true"
  ROOK_ENABLE_DISCOVERY_DAEMON: "false"
  CSI_ENABLE_VOLUME_REPLICATION: "false"
  ROOK_CEPH_COMMANDS_TIMEOUT_SECONDS: "15"
  CSI_ENABLE_CSIADDONS: "false"
  CSI_GRPC_TIMEOUT_SECONDS: "150"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-operator
  namespace: rook-ceph 
  labels:
    operator: rook
    storage-backend: ceph
    app.kubernetes.io/name: rook-ceph
    app.kubernetes.io/instance: rook-ceph
    app.kubernetes.io/component: rook-ceph-operator
    app.kubernetes.io/part-of: rook-ceph-operator
spec:
  selector:
    matchLabels:
      app: rook-ceph-operator
  replicas: 1
  template:
    metadata:
      labels:
        app: rook-ceph-operator
    spec:
      serviceAccountName: rook-ceph-system
      containers:
        - name: rook-ceph-operator
          image: rook/ceph:v1.9.3
          args: ["ceph", "operator"]
          securityContext:
            runAsNonRoot: true
            runAsUser: 2016
            runAsGroup: 2016
          volumeMounts:
            - mountPath: /var/lib/rook
              name: rook-config
            - mountPath: /etc/ceph
              name: default-config-dir
            - mountPath: /etc/webhook
              name: webhook-cert
          ports:
            - containerPort: 9443
              name: https-webhook
              protocol: TCP
          env:
            - name: ROOK_CURRENT_NAMESPACE_ONLY
              value: "false"
            - name: ROOK_DISCOVER_DEVICES_INTERVAL
              value: "60m"
            - name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
              value: "true"
            - name: ROOK_ENABLE_SELINUX_RELABELING
              value: "true"
            - name: ROOK_ENABLE_FSGROUP
              value: "true"
            - name: ROOK_DISABLE_DEVICE_HOTPLUG
              value: "false"
            - name: DISCOVER_DAEMON_UDEV_BLACKLIST
              value: "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
            - name: ROOK_ENABLE_MACHINE_DISRUPTION_BUDGET
              value: "false"
            - name: ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS
              value: "5"
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
      volumes:
        - name: rook-config
          emptyDir: {}
        - name: default-config-dir
          emptyDir: {}
        - name: webhook-cert
          emptyDir: {}

EOF

With the Rook operator.yaml saved we can now apply it to the SNO node and after a few minutes validate that the Rook operator is running:

$ oc create -f operator.yaml 
securitycontextconstraints.security.openshift.io/rook-ceph created
securitycontextconstraints.security.openshift.io/rook-ceph-csi created
configmap/rook-ceph-operator-config created
deployment.apps/rook-ceph-operator created

$ oc get pods -n rook-ceph
NAME                                                              READY   STATUS      RESTARTS   AGE
rook-ceph-operator-84bf68d9bd-lv9l9                               1/1     Running     0          1m
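
If the operator pod does not reach a Running state, its logs are the first place to look:

$ oc -n rook-ceph logs deploy/rook-ceph-operator -f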

Finally we get to the heart of this configuration, which is the cluster.yaml file.   In this file we need to make some modifications since we only have a single node for the Ceph deployment via Rook.   Here are the settings I modified from the defaults:
  • osd_pool_default_size needs to be 1
  • mon count needs to be 1 and allowMultiplePerNode needs to be true
  • mgr count needs to be 1 and allowMultiplePerNode needs to be true
  • storage device needs to be set to extra disk available (in my case sdb)
  • osdsPerDevice needs to be 1
  • managePodBudgets and manageMachineDisruptionBudgets both set to false
We also need to keep in mind that this is not a redundant configuration with OSD replication across many nodes, so it is really just a configuration of convenience that provides dynamic storage for any applications requiring persistent volume claims.  We can proceed by saving out the cluster.yaml with the updates mentioned above:

$ cat << EOF > ~/cluster.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: rook-config-override
  namespace: rook-ceph
data:
  config: |
    [global]
    osd_pool_default_size = 1
---
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph 
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v16.2.7
    allowUnsupported: false
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  continueUpgradeAfterChecksEvenIfNotHealthy: false
  waitTimeoutForHealthyOSDInMinutes: 10
  mon:
    count: 1
    allowMultiplePerNode: true
  mgr:
    count: 1
    allowMultiplePerNode: true
    modules:
      - name: pg_autoscaler
        enabled: true
  dashboard:
    enabled: true
    ssl: true
  monitoring:
    enabled: false
  network:
    connections:
      encryption:
        enabled: false
      compression:
        enabled: false
  crashCollector:
    disable: false
  cleanupPolicy:
    confirmation: ""
    sanitizeDisks:
      method: quick
      dataSource: zero
      iteration: 1
    allowUninstallWithVolumes: false
  annotations:
  labels:
  resources:
  removeOSDsIfOutAndSafeToRemove: false
  priorityClassNames:
    mon: system-node-critical
    osd: system-node-critical
    mgr: system-cluster-critical
  storage: 
    useAllNodes: true
    useAllDevices: false
    devices:
    - name: "sdb"
    config:
      osdsPerDevice: "1"
    onlyApplyOSDPlacement: false
  disruptionManagement:
    managePodBudgets: false
    osdMaintenanceTimeout: 30
    pgHealthCheckTimeout: 0
    manageMachineDisruptionBudgets: false
    machineDisruptionBudgetNamespace: openshift-machine-api
  healthCheck:
    daemonHealth:
      mon:
        disabled: false
        interval: 45s
      osd:
        disabled: false
        interval: 60s
      status:
        disabled: false
        interval: 60s
    livenessProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false
    startupProbe:
      mon:
        disabled: false
      mgr:
        disabled: false
      osd:
        disabled: false

EOF

Now with the cluster.yaml saved we can apply it to the cluster and let the Rook operator do the work of creating the Ceph cluster on our SNO node:

$ oc create -f cluster.yaml 
configmap/rook-config-override created
cephcluster.ceph.rook.io/rook-ceph created

After a few minutes, depending on the speed of the SNO deployment, we can validate that the Ceph cluster is up and deployed on our SNO node:

$ oc get pods -n rook-ceph
NAME                                                              READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-provisioner-7577bb4d59-kxmq8                     6/6     Running     0          118s
csi-cephfsplugin-x2njd                                            3/3     Running     0          118s
csi-rbdplugin-provisioner-847b498845-7z5qc                        6/6     Running     0          119s
csi-rbdplugin-tlw5d                                               3/3     Running     0          119s
rook-ceph-crashcollector-master-0.sno3.schmaustech.com-858mmbx2   1/1     Running     0          48s
rook-ceph-mgr-a-57fbb7fb47-9rjl5                                  1/1     Running     0          81s
rook-ceph-mon-a-d94d79bb5-l6f8p                                   1/1     Running     0          110s
rook-ceph-operator-84bf68d9bd-qkj6k                               1/1     Running     0          17m
rook-ceph-osd-0-6c98c84f66-96l5q                                  1/1     Running     0          48s
rook-ceph-osd-prepare-master-0.sno3.schmaustech.com-l5s8t         0/1     Completed   0          60s
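
Besides watching the pods, the CephCluster resource itself reports a phase and health summary, which is handy on a single node where things can take a little while to settle:

$ oc -n rook-ceph get cephcluster rook-ceph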

We can see from the above output that the pods for both the single mon and the single osd are running, along with the additional services for our single node Ceph cluster.   We can further validate that the Ceph cluster is healthy by deploying a Ceph toolbox pod.   For that we will just use the toolbox.yaml from the Rook GitHub repository.  Once we create the deployment we can validate the pod is running by filtering for rook-ceph-tools in the rook-ceph namespace:

$ oc create -f https://raw.githubusercontent.com/rook/rook/master/deploy/examples/toolbox.yaml
deployment.apps/rook-ceph-tools created

$ oc get pods -n rook-ceph| grep rook-ceph-tools
rook-ceph-tools-d6d7c985c-6zwc7                                   1/1     Running     0          54s

Now let's use the running toolbox to check on the Ceph cluster by issuing an exec command to it and passing in ceph status:

$ oc -n rook-ceph exec -it rook-ceph-tools-d6d7c985c-6zwc7 -- ceph status
  cluster:
    id:     c54ad01e-e9f8-48c9-806b-a7d4748eb977
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum a (age 7m)
    mgr: a(active, since 5m)
    osd: 1 osds: 1 up (since 5m), 1 in (since 6m)
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   4.8 MiB used, 160 GiB / 160 GiB avail
    pgs: 
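
A couple of other quick checks from the same toolbox pod can confirm the OSD layout and raw capacity:

$ oc -n rook-ceph exec -it rook-ceph-tools-d6d7c985c-6zwc7 -- ceph osd tree
$ oc -n rook-ceph exec -it rook-ceph-tools-d6d7c985c-6zwc7 -- ceph df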

Sure enough, our Ceph cluster is up and running in a healthy state.   Let's move on now to confirm we can consume storage from it.   To do that we need to set up a storageclass configuration like the example below, which will create a Ceph RBD block storageclass:

$ cat << EOF > ~/storageclass.yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 1
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
    clusterID: rook-ceph
    pool: replicapool
    imageFormat: "2"
    imageFeatures: layering
    csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
    csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
    csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
    csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
    csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
    csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
    csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
EOF

Once we have saved the storageclass.yaml file, let's go ahead and apply it to the SNO node and then check that the storageclass was created:

$ oc create -f storageclass.yaml
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created

$ oc get sc
NAME              PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   3s

Now that we have a storageclass created I like to do one more thing to ensure any outstanding persistent volume claims get fulfilled by the storageclass automatically.  To do this I will patch the storageclass to be the default:

$ oc patch storageclass  rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/rook-ceph-block patched

$ oc get sc
NAME                        PROVISIONER                  RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
rook-ceph-block (default)   rook-ceph.rbd.csi.ceph.com   Delete          Immediate           true                   8m41s

At this point everything is configured to consume storage.  What I did in the example below was kick off an installation of Red Hat Advanced Cluster Management on my SNO node because I knew it would need a persistent volume.  Once the installation completed, I confirmed by looking at the persistent volumes that one had indeed been created from our storageclass, which gets its storage from the Ceph cluster:

$ oc get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                             STORAGECLASS      REASON   AGE
pvc-791551c6-cdc4-4c9e-9692-c9622dbef4e8   10Gi       RWO            Delete           Bound    open-cluster-management/search-redisgraph-pvc-0   rook-ceph-block            45s
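
If you would rather not install a whole operator just to exercise the new storageclass, a small throwaway claim like the one below (the name and size are arbitrary) should also bind against rook-ceph-block:

$ cat << EOF > ~/test-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rook-block-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
EOF

$ oc create -f test-pvc.yaml
$ oc get pvc rook-block-test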


Hopefully this was a helpful blog in providing a dynamic, almost ODF-like experience for a SNO node deployment!