Thursday, May 27, 2021

Deploying Single Node OpenShift (SNO) from Existing OpenShift Cluster


 

Making SNO in the summer has never been easier with a little help from Hive and the Assisted Installer operators in OpenShift.  If this sounds like something of interest then please read on as I step through the method to get a Single Node OpenShift (SNO) cluster deployed from an existing OpenShift cluster.

The first thing I will need for this procedure is an existing OpenShift cluster running 4.8.  In my case I am using a pre-release version, 4.8.0-fc.3, running on an existing SNO cluster which is itself a virtual machine.  I will also need another unused virtual node that will become my new SNO OpenShift cluster.

Now that I have identified my environment, let's go ahead and start the configuration process.  First we need to enable and configure the Local-Storage operator so that we can provide some PVs that can be consumed by the Assisted Installer operator for its Postgres and bucket storage requirements.  Note that any dynamic storage provider can be used for this, but in my environment Local-Storage made the most sense.  First let's create the local-storage-operator.yaml:

$ cat << EOF > ~/local-storage-operator.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-local-storage
spec: {}
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-local-storage
  namespace: openshift-local-storage
spec:
  targetNamespaces:
  - openshift-local-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: local-storage-operator
  namespace: openshift-local-storage
spec:
  channel: "4.7"
  installPlanApproval: Automatic
  name: local-storage-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Now let's use the local-storage-operator.yaml file we created to install the operator:

$ oc create -f ~/local-storage-operator.yaml 
namespace/openshift-local-storage created
operatorgroup.operators.coreos.com/openshift-local-storage created
subscription.operators.coreos.com/local-storage-operator created

Once the operator is created, within a few minutes we should see a running pod in the openshift-local-storage namespace:

$ oc get pods -n openshift-local-storage
NAME                                      READY   STATUS    RESTARTS   AGE
local-storage-operator-845457cd85-ttb8g   1/1     Running   0          37s

Now that the operator is installed and running we can go ahead and create a hive-local-storage.yaml to consume the disks we have assigned to our worker nodes.  In my example, since I have a single master/worker virtual machine, I went ahead and added a bunch of small qcow2 disks.  The device paths might vary depending on the environment, but the rest of the content should be similar to the following:

$ cat << EOF > ~/hive-local-storage.yaml
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: fs
  namespace: openshift-local-storage
spec:
  logLevel: Normal
  managementState: Managed
  storageClassDevices:
    - devicePaths:
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
        - /dev/sde
        - /dev/sdf
        - /dev/sdg
        - /dev/sdh
        - /dev/sdi
        - /dev/sdj
        - /dev/sdk
        - /dev/sdl
        - /dev/sdm
      fsType: ext4
      storageClassName: local-storage
      volumeMode: Filesystem
EOF
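
As an aside, the /dev/sdX device paths above map to a handful of small qcow2 disks I attached to the hub virtual machine ahead of time.  A minimal sketch of creating and attaching one such disk with qemu-img and virsh is shown below; the image path, 20G size, sdb target and the "hub-vm" domain name are all assumptions, and the attach-disk bus/driver options may need adjusting for your environment:

# qemu-img create -f qcow2 /var/lib/libvirt/images/local-storage-sdb.qcow2 20G
# virsh attach-disk hub-vm /var/lib/libvirt/images/local-storage-sdb.qcow2 sdb --driver qemu --subdriver qcow2 --persistent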

With the hive-local-storage.yaml created we can now create the resource:

$ oc create -f hive-local-storage.yaml 
localvolume.local.storage.openshift.io/fs created

Once it has been created we can verify everything is working properly by looking at the additional pods that are now running in the openshift-local-storage namespace:

$ oc get pods -n openshift-local-storage
NAME                                      READY   STATUS    RESTARTS   AGE
fs-local-diskmaker-nv5xr                  1/1     Running   0          46s
fs-local-provisioner-9dt2m                1/1     Running   0          46s
local-storage-operator-845457cd85-ttb8g   1/1     Running   0          4m25s


We can also confirm if our disks were picked up by looking at the PVs available on the cluster and the local-storage storageclass that is now defined:

$ oc get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS    REASON   AGE
local-pv-188fc254   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-6d45f357   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-96d2cc66   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-99a52316   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-9e0442ea   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-c061aa19   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-c26659da   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-d08519a8   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-d2f2a467   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-d4a12edd   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-f5e1ca69   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-ffdb70b    10Gi       RWO            Delete           Available           local-storage            33s

$ oc get sc
NAME            PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-storage   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  72s

Because I want PVCs to automatically get their storage from the local-storage storageclass, I am going to go ahead and patch the storageclass, setting it as the default.  I can confirm the change by looking at the storageclasses again:

$ oc patch storageclass local-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/local-storage patched
$ oc get sc
NAME                      PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-storage (default)   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  2m14s

Now that we have the local-storage configured we can move on to getting Hive installed.  Let's go ahead and create the hive-operator.yaml below:

$ cat << EOF > ~/hive-operator.yaml
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hive-operator
  namespace: openshift-operators
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: hive-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: hive-operator.v1.1.4
EOF

And then let's use oc create with the yaml we just created to install the Hive operator:

$ oc create -f hive-operator.yaml
subscription.operators.coreos.com/hive-operator created

We can confirm the Hive operator is installed by listing the operators and specifically grabbing the Hive entry.  If we look at the pods under the hive namespace we can see there are none yet, which is completely normal:

$ oc get operators hive-operator.openshift-operators
NAME                                AGE
hive-operator.openshift-operators   2m28s
$ oc get pods -n hive
No resources found in hive namespace.
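
Another hedged sanity check, assuming the Hive ClusterServiceVersion landed in the openshift-operators namespace where we created the subscription, is to look for it there:

$ oc get csv -n openshift-operators | grep hive-operator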

One thing the Hive operator does seem to do is create an assisted-installer namespace.  This namespace causes an issue with Postgres once the Assisted-Installer operator is installed, as identified in BZ#1951812.  Because of that we are going to delete the assisted-installer namespace; it will get recreated in the next steps:

$ oc delete namespace assisted-installer
namespace "assisted-installer" deleted

Now we are ready to install the Assisted-Installer operator.  Before we can install the operator, though, we need to create a catalog source resource file like the one below:

$ cat << EOF > ~/assisted-installer-catsource.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: assisted-installer
  labels:
    name: assisted-installer
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: assisted-service
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/ocpmetal/assisted-service-index:latest
EOF

We also need to create the Assisted-Installer operator subscription yaml:

$ cat << EOF > ~/assisted-installer-operator.yaml
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: assisted-service-operator
  namespace: assisted-installer
spec:
  targetNamespaces:
  - assisted-installer
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: assisted-service-operator
  namespace: assisted-installer 
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: assisted-service-operator
  source: assisted-service
  sourceNamespace: openshift-marketplace
EOF

With both files created we can go ahead and run oc create against them: first the Assisted-Installer catalog source file and then the Assisted-Installer subscription file that installs the operator:

$ oc create -f assisted-installer-catsource.yaml
namespace/assisted-installer created
catalogsource.operators.coreos.com/assisted-service created

$ oc create -f assisted-installer-operator.yaml
operatorgroup.operators.coreos.com/assisted-service-operator created
subscription.operators.coreos.com/assisted-service-operator created

We can confirm the operator is installed by looking at the running pods under the assisted-installer namespace:

$ oc get pods -n assisted-installer
NAME                                         READY   STATUS    RESTARTS   AGE
assisted-service-operator-579679d899-x982l   1/1     Running   0          56s

Finally, to complete the installation of the Assisted-Installer, we need to configure an agent service config resource like the example below.  The storage sizes can be larger if needed, but I am using 20Gi since that is the largest volume size available from the local-storage I configured in my environment:

$ cat << EOF > ~/assisted-installer-agentserviceconfig.yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
  name: agent
spec:
  databaseStorage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
  filesystemStorage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
  osImages:
    - openshiftVersion: '4.8'
      rootFSUrl: >-
        https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/4.8.0-fc.3/rhcos-live-rootfs.x86_64.img
      url: >-
        https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/4.8.0-fc.3/rhcos-4.8.0-fc.3-x86_64-live.x86_64.iso
      version: 48.84.202105062123-0
EOF

Once we have created the agent service config we can go ahead and apply it to the hub cluster:

$ oc create -f ~/assisted-installer-agentserviceconfig.yaml
agentserviceconfig.agent-install.openshift.io/agent created

We can confirm everything is running by looking at both the pods and the PVCs under the assisted-installer namespace:

$ oc get pods -n assisted-installer
NAME                                         READY   STATUS    RESTARTS   AGE
assisted-service-b7dc8b8d7-2cztd             1/2     Running   1          53s
assisted-service-operator-579679d899-x982l   1/1     Running   0          3m50s

$ oc get pvc -n assisted-installer
NAME               STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS    AGE
assisted-service   Bound    local-pv-99a52316   20Gi       RWO            local-storage   87s
postgres           Bound    local-pv-6d45f357   20Gi       RWO            local-storage   87s

At this point we have configured and confirmed all of the operators required to do a deployment of OpenShift with the Assisted-Installer.  This configuration will allow us to deploy any one of the following OpenShift deployment types: Multi-Node-IPv4, SNO-IPv4, Multi-Node-IPv6, SNO-IPv6 and SNO-Dual-Stack.  For demonstration purposes I will be using the SNO-IPv4 deployment type.

Before we start the deployment I need to create some resource yamls that we will apply to the hub cluster to enable the deployment process.  The first file is the cluster imageset yaml, which tells the Assisted-Installer which OpenShift release we are going to use.  In my example we will be using 4.8.0-fc.3.  Create the following assisted-installer-clusterimageset.yaml and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-clusterimageset.yaml
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-v4.8.0
  namespace: assisted-installer
spec:
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.8.0-fc.3-x86_64
EOF

$ oc create -f ~/assisted-installer-clusterimageset.yaml
clusterimageset.hive.openshift.io/openshift-v4.8.0 created

The next resource file we need is the Assisted-Installer pull secret.  This contains the pull secret used to authenticate when pulling images from Quay during deployment.  Note that "OPENSHIFT-PULL-SECRET-HERE" should be replaced with a real pull secret from cloud.redhat.com.  Create the following assisted-installer-secrets.yaml and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: assisted-deployment-pull-secret
  namespace: assisted-installer
stringData:
  .dockerconfigjson: 'OPENSHIFT-PULL-SECRET-HERE'
EOF
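
One hedged way to drop the real pull secret into the file before applying it, assuming the secret was downloaded from cloud.redhat.com as a single-line JSON file at ~/pull-secret.json (both the path and the single-line format are assumptions), would be:

$ PULL_SECRET=$(cat ~/pull-secret.json)
$ sed -i "s|OPENSHIFT-PULL-SECRET-HERE|${PULL_SECRET}|" ~/assisted-installer-secrets.yaml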

$ oc create -f ~/assisted-installer-secrets.yaml
secret/assisted-deployment-pull-secret created

Next we need a resource file that defines the SSH private key to be used.  This private key will enable us to log in to the OpenShift nodes we deploy should we ever need to troubleshoot the cluster nodes.  Create the assisted-installer-sshprivate.yaml and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-sshprivate.yaml
apiVersion: v1
kind: Secret
metadata:
  name: assisted-deployment-ssh-private-key
  namespace: assisted-installer
stringData:
  ssh-privatekey: |-
    -----BEGIN OPENSSH PRIVATE KEY-----
    b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAABlwAAAAdzc2gtcn
    NhAAAAAwEAAQAAAYEA7uOSmvd8CgAUDaqGheAUcBsEOOoFAZYqtLKL9N0HameO6Fhv1t/l
    a4tG8BQMiu3pm5DWpRrq/O12OjjVDOHHSjwcMX/qfn8OKNVtPVq0SMZRbbkkpnK2WLMwLg
    ...
    8QT4AK4mb7H8tHo1RQkOB4foAQwPLXHvRBHrEGXnIugAeCszn8twZruRtcoX2jRiw7MS8B
    R+AuTLBeBwEXYGoxFhsaLhiCVUueEKJDUt66tVCr3ovvz8eapWv1LUM2QGeP56Z5QUsIrl
    wJwTtficCtwxK0XL+gJro9qYslbX2XxVD67goxVecIfNVmxtZ8KHeo6ICLkhOJjTAveAm+
    tF77qty2d0d0UAAAAXYnNjaG1hdXNAcmhlbDgtb2NwLWF1dG8BAgME
    -----END OPENSSH PRIVATE KEY-----
type: Opaque
EOF
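
If a dedicated key pair does not exist yet, a minimal sketch of generating one with ssh-keygen is shown below; the output path and comment are assumptions.  The resulting private key is what goes into the secret above, and the matching .pub content is what gets used for the public key fields later on:

$ ssh-keygen -t rsa -b 4096 -N '' -C 'sno-deploy' -f ~/.ssh/sno-key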

$ oc create -f ~/assisted-installer-sshprivate.yaml
secret/assisted-deployment-ssh-private-key created

Next we need an agent cluster install resource configured.  This file contains some of the networking details one might find in the install-config.yaml when doing an OpenShift IPI installation.  Generate the assisted-installer-agentclusterinstall.yaml file and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-agentclusterinstall.yaml
---
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
  name: test-cluster-virtual-aci
  namespace: assisted-installer
spec:
  clusterDeploymentRef:
    name: test-cluster-virtual
  imageSetRef:
    name: openshift-v4.8.0
  networking:
    clusterNetwork:
      - cidr: "10.128.0.0/14"
        hostPrefix: 23
    serviceNetwork:
      - "172.30.0.0/16"
    machineNetwork:
      - cidr: "192.168.0.0/24"
  provisionRequirements:
    controlPlaneAgents: 1
  sshPublicKey: 'ssh-rsa AAB3NzaC1yc2EAAAADAQABAAABgQDu45Ka93wKABQNqoaF4BRwGwQ46gUBliq0sov03QdqZ47oWG/W3+Vri0bwFAyK7embkNalGur87XY6ONUM4cdKPBwxf+p+fw4o1W09WrRIxlFtuSSmcrZYszAuD3NlHDS2WbXRHbeaMz8QsA9qNceZZj7PyB+fNULJYS2iNtlIING4DZTvoEr6KCe2cOIdDnjOW0xNng+ejKMe3vszwutLqgPMwQXS2tqJSHOMIS1kDdLFd3ZIT25xvORoNC/PWy5NAkxp9gJbrYEDlcvy7eUxqLAM9lTCWJQ0l31gqIQDOkGmS7M41x3nJ6kG1ilLn1miMfwrGKDramp7+8ejfr0QBspikp9S4Tmj6s0PA/lEL6EST4N12sr+GsJkHqhWa+HwutcMbg0qIFwhUMFN689Q84Tz9abSPnQipn09xNv1YtAEQLJxypZj3NB6/ZYZXWjB/IGgbJ4tifD1cTYdSA1A4UeNyvhIFj9yyjRF8mucGUR873xSMTlcLIq3V0aZqXU= bschmaus@rhel8-ocp-auto'
EOF
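
Note that the sshPublicKey above must be the public half of the private key we stored in the assisted-deployment-ssh-private-key secret earlier.  If only the private key file is at hand, the matching public key can be printed with ssh-keygen -y (the key path here is an assumption):

$ ssh-keygen -y -f ~/.ssh/sno-key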

$ oc create -f ~/assisted-installer-agentclusterinstall.yaml
agentclusterinstall.extensions.hive.openshift.io/test-cluster-virtual-aci created

Next we need the cluster deployment yaml, which defines the cluster we are going to deploy.  Create the following assisted-installer-clusterdeployment.yaml file and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-clusterdeployment.yaml
---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: test-cluster-virtual
  namespace: assisted-installer
spec:
  baseDomain: schmaustech.com
  clusterName: kni3
  controlPlaneConfig:
    servingCertificates: {}
  installed: false
  clusterInstallRef:
    group: extensions.hive.openshift.io
    kind: AgentClusterInstall
    name: test-cluster-virtual-aci
    version: v1beta1
  platform:
    agentBareMetal: 
      agentSelector:
        matchLabels:
          bla: "aaa"
  pullSecretRef:
    name: assisted-deployment-pull-secret
EOF

$ oc create -f ~/assisted-installer-clusterdeployment.yaml
clusterdeployment.hive.openshift.io/test-cluster-virtual created

Last but not least we have the infrastructure environment (InfraEnv) file, which ties many of the previous resources together.  Create the assisted-installer-infraenv.yaml file below and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-infraenv.yaml
---
apiVersion: agent-install.openshift.io/v1beta1 
kind: InfraEnv
metadata:
  name: test-cluster-virtual-infraenv
  namespace: assisted-installer
spec:
  clusterRef:
    name: test-cluster-virtual
    namespace: assisted-installer
  sshAuthorizedKey: 'ssh-rsa AAB3NzaC1yc2EAAAADAQABAAABgQDu45Ka93wKABQNqoaF4BRwGwQ46gUBliq0sov03QdqZ47oWG/W3+Vri0bwFAyK7embkNalGur87XY6ONUM4cdKPBwxf+p+fw4o1W09WrRIxlFtuSSmcrZYszAuD3NlHDS2WbXRHbeaMz8QsA9qNceZZj7PyB+fNULJYS2iNtlIING4DZTvoEr6KCe2cOIdDnjOW0xNng+ejKMe3vszwutLqgPMwQXS2tqJSHOMIS1kDdLFd3ZIT25xvORoNC/PWy5NAkxp9gJbrYEDlcvy7eUxqLAM9lTCWJQ0l31gqIQDOkGmS7M41x3nJ6kG1ilLn1miMfwrGKDramp7+8ejfr0QBspikp9S4Tmj6s0PA/lEL6EST4N12sr+GsJkHqhWa+HwutcMbg0qIFwhUMFN689Q84Tz9abSPnQipn09xNv1YtAEQLJxypZj3NB6/ZYZXWjB/IGgbJ4tifD1cTYdSA1A4UeNyvhIFj9yyjRF8mucGUR873xSMTlcLIq3V0aZqXU= bschmaus@rhel8-ocp-auto'
  agentLabelSelector:
    matchLabels:
      bla: aaa
  pullSecretRef:
    name: assisted-deployment-pull-secret
EOF

$ oc create -f ~/assisted-installer-infraenv.yaml
infraenv.agent-install.openshift.io/test-cluster-virtual-infraenv created
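
Before grabbing the ISO it can be worth confirming the discovery image was actually generated.  A hedged way to do that, assuming the InfraEnv exposes an ImageCreated condition as current assisted-service builds do, is:

$ oc get infraenv test-cluster-virtual-infraenv -n assisted-installer -o jsonpath='{.status.conditions[?(@.type=="ImageCreated")].message}'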

Once all of the resource files have been applied to the hub cluster we should be able to extract the RHCOS LiveOS ISO download URL for the image we will use to boot up the single node for our spoke SNO IPv4 deployment.  We can do that by running the following command:

$ oc get infraenv test-cluster-virtual-infraenv -o jsonpath='{.status.isoDownloadURL}' -n assisted-installer
https://assisted-service-assisted-installer.apps.kni1.schmaustech.com/api/assisted-install/v1/clusters/b38c1d3e-e460-4111-a35f-4a8d79203585/downloads/image.iso?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiYjM4YzFkM2UtZTQ2MC00MTExLWEzNWYtNGE4ZDc5MjAzNTg1In0.0sjy-0I9DstyaRA8oIUF9ByyUe31Kl6rUpVzBXSsO9mFfqLCDtF-Rh2NCWvVtjKyd4BZ7Zo5ZUIMsEtHX5sKWg

Now that we know the URL to the ISO image we can pull that image down to a location that can be accessed by our remote physical node via virtual media (iDRAC/BMC).  In my case, since the spoke SNO node I am using is a virtual machine, I will be running a wget command on the hypervisor host where my virtual machine resides and storing the ISO under the /var/lib/libvirt/images path on that host:

# pwd
/var/lib/libvirt/images

# wget --no-check-certificate https://assisted-service-assisted-installer.apps.kni1.schmaustech.com/api/assisted-install/v1/clusters/b38c1d3e-e460-4111-a35f-4a8d79203585/downloads/image.iso?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiYjM4YzFkM2UtZTQ2MC00MTExLWEzNWYtNGE4ZDc5MjAzNTg1In0.0sjy-0I9DstyaRA8oIUF9ByyUe31Kl6rUpVzBXSsO9mFfqLCDtF-Rh2NCWvVtjKyd4BZ7Zo5ZUIMsEtHX5sKWg -O discover.iso
--2021-05-26 15:16:13--  https://assisted-service-assisted-installer.apps.kni1.schmaustech.com/api/assisted-install/v1/clusters/b38c1d3e-e460-4111-a35f-4a8d79203585/downloads/image.iso?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiYjM4YzFkM2UtZTQ2MC00MTExLWEzNWYtNGE4ZDc5MjAzNTg1In0.0sjy-0I9DstyaRA8oIUF9ByyUe31Kl6rUpVzBXSsO9mFfqLCDtF-Rh2NCWvVtjKyd4BZ7Zo5ZUIMsEtHX5sKWg
Resolving assisted-service-assisted-installer.apps.kni1.schmaustech.com (assisted-service-assisted-installer.apps.kni1.schmaustech.com)... 192.168.0.204
Connecting to assisted-service-assisted-installer.apps.kni1.schmaustech.com (assisted-service-assisted-installer.apps.kni1.schmaustech.com)|192.168.0.204|:443... connected.
WARNING: The certificate of ‘assisted-service-assisted-installer.apps.kni1.schmaustech.com’ is not trusted.
WARNING: The certificate of ‘assisted-service-assisted-installer.apps.kni1.schmaustech.com’ hasn't got a known issuer.
HTTP request sent, awaiting response... 200 OK
Length: 109111296 (104M) [application/octet-stream]
Saving to: ‘discover.iso’

discover.iso                                         100%[=====================================================================================================================>] 104.06M   104MB/s    in 1.0s    

2021-05-26 15:16:14 (104 MB/s) - ‘discover.iso’ saved [109111296/109111296]

# ls -l *.iso
-rw-r--r--. 1 root root 109111296 May 26 15:16 discover.iso

Now that I have the image on the local hypervisor I can edit the virtual machine definition to make sure its CDROM device points at that ISO path so the node will boot into the RHCOS LiveOS ISO:

# virsh list --all
 Id   Name       State
---------------------------
 -    nuc4-vm1   shut off

# virsh dumpxml nuc4-vm1 | sed "/<disk type='file' device='cdrom'>/a \ \ \ \ <source file='/var/lib/libvirt/images/discover.iso'/>" | virsh define /dev/stdin
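
Alternatively, if the domain already has a CDROM device defined, virsh change-media can insert the ISO without editing the XML by hand; the sda target name here is an assumption and may differ in your domain:

# virsh change-media nuc4-vm1 sda /var/lib/libvirt/images/discover.iso --insert --config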

# virsh start nuc4-vm1
Domain nuc4-vm1 started

At this point we can watch the RHCOS LiveOS ISO boot from the virtual machine's console.  If this is being done on a real server one could watch from the server's BMC interface and/or iDRAC console if it is a Dell server.



Once the RHCOS LiveOS ISO boots it will pull down an RHCOS image that gets written to the local disk of the node.  At this point we can shift back to the CLI and watch the progress of the install by watching the status of the agent cluster install using the syntax below.  One of the first things we notice is that during the initial install an agent approval is required:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The installation is pending on the approval of 1 agents",
  "reason": "UnapprovedAgents",
  "status": "False",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not yet started",
  "reason": "InstallationNotStarted",
  "status": "False",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation is waiting to start or in progress",
  "reason": "InstallationNotStopped",
  "status": "False",
  "type": "Stopped"
}

We can view that approval requirement by looking at the agent from another viewpoint using the syntax below.  Notice it says the agent is not approved, and until it is approved the installation will wait and not continue.

$ oc get agents.agent-install.openshift.io -n assisted-installer  -o=jsonpath='{range .items[*]}{"\n"}{.spec.clusterDeploymentName.name}{"\n"}{.status.inventory.hostname}{"\n"}{range .status.conditions[*]}{.type}{"\t"}{.message}{"\n"}{end}'

test-cluster-virtual
master-0.kni5.schmaustech.com
SpecSynced	The Spec has been successfully applied
Connected	The agent's connection to the installation service is unimpaired
ReadyForInstallation	The agent is not approved
Validated	The agent's validations are passing
Installed	The installation has not yet started

We can view the agent's approval state yet another way by listing the agents for the cluster:

$ oc get agents.agent-install.openshift.io -n assisted-installer
NAME                                   CLUSTER                APPROVED
e4117b8b-a2ef-45df-baf0-2ebc6ae1bf8e   test-cluster-virtual   false

Let's go ahead and approve this agent by patching its approved field to true using the syntax below:

$ oc -n assisted-installer patch agents.agent-install.openshift.io e4117b8b-a2ef-45df-baf0-2ebc6ae1bf8e -p '{"spec":{"approved":true}}' --type merge
agent.agent-install.openshift.io/e4117b8b-a2ef-45df-baf0-2ebc6ae1bf8e patched
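
Since the agent name is a generated UUID, a hedged shortcut for approving whatever agents exist in the namespace, without copying the UUID by hand, would be something like:

$ oc get agents.agent-install.openshift.io -n assisted-installer -o name | xargs -I{} oc -n assisted-installer patch {} -p '{"spec":{"approved":true}}' --type merge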

Now that the approval has been made, the cluster can continue with the installation process:

$ oc get agents.agent-install.openshift.io -n assisted-installer  -o=jsonpath='{range .items[*]}{"\n"}{.spec.clusterDeploymentName.name}{"\n"}{.status.inventory.hostname}{"\n"}{range .status.conditions[*]}{.type}{"\t"}{.message}{"\n"}{end}'

test-cluster-virtual
master-0.kni5.schmaustech.com
SpecSynced	The Spec has been successfully applied
Connected	The agent's connection to the installation service is unimpaired
ReadyForInstallation	The agent cannot begin the installation because it has already started
Validated	The agent's validations are passing
Installed	The installation is in progress: Host is preparing for installation

We can now see that the cluster is being prepared for installation:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T17:50:12Z",
  "lastTransitionTime": "2021-05-27T17:50:12Z",
  "message": "The cluster requirements are met",
  "reason": "ClusterAlreadyInstalling",
  "status": "True",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-27T17:50:12Z",
  "lastTransitionTime": "2021-05-27T17:50:12Z",
  "message": "The installation is in progress: Preparing cluster for installation",
  "reason": "InstallationInProgress",
  "status": "False",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation is waiting to start or in progress",
  "reason": "InstallationNotStopped",
  "status": "False",
  "type": "Stopped"
}

As we wait a little longer we can see the installation process has begun.  The complete installation took about 70 minutes in my virtualized environment:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T17:50:12Z",
  "lastTransitionTime": "2021-05-27T17:50:12Z",
  "message": "The cluster requirements are met",
  "reason": "ClusterAlreadyInstalling",
  "status": "True",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-27T17:52:00Z",
  "lastTransitionTime": "2021-05-27T17:52:00Z",
  "message": "The installation is in progress: Installation in progress",
  "reason": "InstallationInProgress",
  "status": "False",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation is waiting to start or in progress",
  "reason": "InstallationNotStopped",
  "status": "False",
  "type": "Stopped"
}

As we continue to watch the status of the cluster installation via the agent cluster install we can see that the installation process is in the finalization phase:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T17:50:12Z",
  "lastTransitionTime": "2021-05-27T17:50:12Z",
  "message": "The cluster requirements are met",
  "reason": "ClusterAlreadyInstalling",
  "status": "True",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-27T18:37:20Z",
  "lastTransitionTime": "2021-05-27T18:37:20Z",
  "message": "The installation is in progress: Finalizing cluster installation",
  "reason": "InstallationInProgress",
  "status": "False",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation is waiting to start or in progress",
  "reason": "InstallationNotStopped",
  "status": "False",
  "type": "Stopped"
}

And finally, after about 70 minutes, we can see the cluster has completed installation:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T18:50:00Z",
  "lastTransitionTime": "2021-05-27T18:50:00Z",
  "message": "The cluster installation stopped",
  "reason": "ClusterInstallationStopped",
  "status": "True",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-27T18:50:00Z",
  "lastTransitionTime": "2021-05-27T18:50:00Z",
  "message": "The installation has completed: Cluster is installed",
  "reason": "InstallationCompleted",
  "status": "True",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-27T18:50:00Z",
  "lastTransitionTime": "2021-05-27T18:50:00Z",
  "message": "The installation has stopped because it completed successfully",
  "reason": "InstallationCompleted",
  "status": "True",
  "type": "Stopped"
}

Now let's validate that the cluster is indeed installed and functioning correctly.  To do this we first need to extract the kubeconfig secret from our hub cluster and then point the KUBECONFIG variable at it:

$ oc get secret -n assisted-installer test-cluster-virtual-admin-kubeconfig -o json | jq -r '.data.kubeconfig' | base64 -d > /tmp/sno-spoke-kubeconfig 
$ export KUBECONFIG=/tmp/sno-spoke-kubeconfig
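
As an aside, the kubeadmin console password for the new cluster can be pulled from the hub in a similar fashion.  Run this against the hub cluster (that is, before switching KUBECONFIG as above); the secret name follows the same <clusterdeployment>-admin-password naming pattern and is an assumption on my part:

$ oc get secret -n assisted-installer test-cluster-virtual-admin-password -o jsonpath='{.data.password}' | base64 -d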

Now let's run some oc commands.  First we will look at the nodes with a wide view:

$ oc get nodes -o wide
NAME                            STATUS   ROLES           AGE   VERSION                INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION          CONTAINER-RUNTIME
master-0.kni5.schmaustech.com   Ready    master,worker   53m   v1.21.0-rc.0+291e731   192.168.0.200   <none>        Red Hat Enterprise Linux CoreOS 48.84.202105062123-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-90.rhaos4.8.git07becf8.el8

Next we will confirm all the cluster operators are up and available:

$ oc get co
NAME                                       VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-fc.3   True        False         False      9m35s
baremetal                                  4.8.0-fc.3   True        False         False      30m
cloud-credential                           4.8.0-fc.3   True        False         False      48m
cluster-autoscaler                         4.8.0-fc.3   True        False         False      30m
config-operator                            4.8.0-fc.3   True        False         False      50m
console                                    4.8.0-fc.3   True        False         False      9m46s
csi-snapshot-controller                    4.8.0-fc.3   True        False         False      9m23s
dns                                        4.8.0-fc.3   True        False         False      22m
etcd                                       4.8.0-fc.3   True        False         False      31m
image-registry                             4.8.0-fc.3   True        False         False      21m
ingress                                    4.8.0-fc.3   True        False         False      14m
insights                                   4.8.0-fc.3   True        False         False      13m
kube-apiserver                             4.8.0-fc.3   True        False         False      22m
kube-controller-manager                    4.8.0-fc.3   True        False         False      22m
kube-scheduler                             4.8.0-fc.3   True        False         False      29m
kube-storage-version-migrator              4.8.0-fc.3   True        False         False      31m
machine-api                                4.8.0-fc.3   True        False         False      30m
machine-approver                           4.8.0-fc.3   True        False         False      48m
machine-config                             4.8.0-fc.3   True        False         False      20m
marketplace                                4.8.0-fc.3   True        False         False      30m
monitoring                                 4.8.0-fc.3   True        False         False      9m24s
network                                    4.8.0-fc.3   True        False         False      51m
node-tuning                                4.8.0-fc.3   True        False         False      22m
openshift-apiserver                        4.8.0-fc.3   True        False         False      22m
openshift-controller-manager               4.8.0-fc.3   True        False         False      30m
openshift-samples                          4.8.0-fc.3   True        False         False      21m
operator-lifecycle-manager                 4.8.0-fc.3   True        False         False      30m
operator-lifecycle-manager-catalog         4.8.0-fc.3   True        False         False      48m
operator-lifecycle-manager-packageserver   4.8.0-fc.3   True        False         False      6m34s
service-ca                                 4.8.0-fc.3   True        False         False      50m
storage                                    4.8.0-fc.3   True        False         False      30m

And finally we can check the cluster version:

$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-fc.3   True        False         6m26s   Cluster version is 4.8.0-fc.3

Everything in the installation appears to be working for this SNO-based deployment!