Thursday, May 27, 2021

Deploying Single Node OpenShift (SNO) from Existing OpenShift Cluster


 

Making SNO in the summer has never been easier with a little help from the Hive and Assisted Installer operators in OpenShift. If this sounds like something of interest, then please read on as I step through the method to get a Single Node OpenShift (SNO) cluster deployed from an existing OpenShift cluster.

The first thing I need for this procedure is an existing OpenShift cluster running 4.8.  In my case I am using the pre-release version 4.8.0-fc.3 running on an existing SNO cluster, which is itself a virtual machine.  I will also need another unused virtual node that will become my new SNO OpenShift cluster.

Now that I have identified my environment, let's go ahead and start the configuration process.  First we need to enable and configure the Local Storage operator so that we can provide some PVs that the Assisted Installer (AI) operator can consume for its Postgres and bucket requirements.  Note that any dynamic storage provider can be used for this, but in my environment Local Storage made the most sense.  First let's create the local-storage-operator.yaml:

$ cat << EOF > ~/local-storage-operator.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-local-storage
spec: {}
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-local-storage
  namespace: openshift-local-storage
spec:
  targetNamespaces:
  - openshift-local-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: local-storage-operator
  namespace: openshift-local-storage
spec:
  channel: "4.7"
  installPlanApproval: Automatic
  name: local-storage-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Now let's use the local-storage-operator.yaml file we created to install the operator:

$ oc create -f ~/local-storage-operator.yaml 
namespace/openshift-local-storage created
operatorgroup.operators.coreos.com/openshift-local-storage created
subscription.operators.coreos.com/local-storage-operator created

Once the operator is created, within a few minutes we should see a running pod in the openshift-local-storage namespace:

$ oc get pods -n openshift-local-storage
NAME                                      READY   STATUS    RESTARTS   AGE
local-storage-operator-845457cd85-ttb8g   1/1     Running   0          37s

Now that the operator is installed and running, we can go ahead and configure a hive-local-storage.yaml to consume any of the disks we have assigned on our worker nodes.  In my example, since I have a single master/worker virtual machine, I went ahead and added a bunch of small qcow2 disks.  The device paths might vary depending on the environment, but the rest of the content should be similar to the following:

$ cat << EOF > ~/hive-local-storage.yaml
apiVersion: local.storage.openshift.io/v1
kind: LocalVolume
metadata:
  name: fs
  namespace: openshift-local-storage
spec:
  logLevel: Normal
  managementState: Managed
  storageClassDevices:
    - devicePaths:
        - /dev/sdb
        - /dev/sdc
        - /dev/sdd
        - /dev/sde
        - /dev/sdf
        - /dev/sdg
        - /dev/sdh
        - /dev/sdi
        - /dev/sdj
        - /dev/sdk
        - /dev/sdl
        - /dev/sdm
      fsType: ext4
      storageClassName: local-storage
      volumeMode: Filesystem
EOF
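
For reference, the small qcow2 disks mentioned above were presented to my master/worker virtual machine from the hypervisor.  In a libvirt environment something like the following would create and attach one of them (a sketch only; the VM name, image path and size are illustrative, and each additional disk would use the next free target such as sdc, sdd and so on):

# qemu-img create -f qcow2 /var/lib/libvirt/images/sno-extra-disk1.qcow2 20G
# virsh attach-disk kni1-vm1 /var/lib/libvirt/images/sno-extra-disk1.qcow2 sdb --persistent --subdriver qcow2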

With the hive-local-storage.yaml created we can now create the resource:

$ oc create -f hive-local-storage.yaml 
localvolume.local.storage.openshift.io/fs created

Once it has been created, we can verify everything is working properly by looking at the additional pods that are now running in the openshift-local-storage namespace:

$ oc get pods -n openshift-local-storage
NAME                                      READY   STATUS    RESTARTS   AGE
fs-local-diskmaker-nv5xr                  1/1     Running   0          46s
fs-local-provisioner-9dt2m                1/1     Running   0          46s
local-storage-operator-845457cd85-ttb8g   1/1     Running   0          4m25s


We can also confirm that our disks were picked up by looking at the PVs available on the cluster and the local-storage storageclass that is now defined:

$ oc get pv
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS    REASON   AGE
local-pv-188fc254   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-6d45f357   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-96d2cc66   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-99a52316   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-9e0442ea   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-c061aa19   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-c26659da   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-d08519a8   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-d2f2a467   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-d4a12edd   20Gi       RWO            Delete           Available           local-storage            33s
local-pv-f5e1ca69   10Gi       RWO            Delete           Available           local-storage            33s
local-pv-ffdb70b    10Gi       RWO            Delete           Available           local-storage            33s

$ oc get sc
NAME            PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-storage   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  72s

Because I want PVCs to automatically get their storage from the local-storage storageclass, I am going to go ahead and patch the storageclass to make it the default.  I can confirm the change by looking at the storageclasses again:

$ oc patch storageclass local-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/local-storage patched
$ oc get sc
NAME                      PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-storage (default)   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  2m14s

Now that we have local-storage configured, we can move on to getting Hive installed.  Let's go ahead and create the hive-operator.yaml below:

$ cat << EOF > ~/hive-operator.yaml
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: hive-operator
  namespace: openshift-operators
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: hive-operator
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: hive-operator.v1.1.4
EOF

And then let's use oc create with the yaml we created to install the Hive operator:

$ oc create -f hive-operator.yaml
subscription.operators.coreos.com/hive-operator created

We can confirm the Hive operator is installed by listing the operators and specifically looking for the Hive operator.  If we look at the pods under the hive namespace we can see there are none yet, and this is completely normal:

$ oc get operators hive-operator.openshift-operators
NAME                                AGE
hive-operator.openshift-operators   2m28s
$ oc get pods -n hive
No resources found in hive namespace.
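
If we want a stronger signal than the operator simply being listed, we can also check that its ClusterServiceVersion reports a Succeeded phase (I omit the output here since the exact CSV details will vary by environment):

$ oc get csv -n openshift-operators | grep hive-operator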

One thing the Hive operator does seem to do is create an assisted-installer namespace.  This namespace causes an issue with Postgres once the Assisted-Installer operator is installed, as identified in BZ#1951812.  Because of that we are going to delete the assisted-installer namespace.  It will get recreated in the next steps:

$ oc delete namespace assisted-installer
namespace "assisted-installer" deleted

Now we are ready to install the Assisted-Installer operator.  Before we can install the operator though, we need to create a catalog source resource file like the one below:

$ cat << EOF > ~/assisted-installer-catsource.yaml
---
apiVersion: v1
kind: Namespace
metadata:
  name: assisted-installer
  labels:
    name: assisted-installer
---
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: assisted-service
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/ocpmetal/assisted-service-index:latest
EOF

We also need to create the Assisted-Installer operator subscription yaml:

$ cat << EOF > ~/assisted-installer-operator.yaml
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: assisted-service-operator
  namespace: assisted-installer
spec:
  targetNamespaces:
  - assisted-installer
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: assisted-service-operator
  namespace: assisted-installer 
spec:
  channel: alpha
  installPlanApproval: Automatic
  name: assisted-service-operator
  source: assisted-service
  sourceNamespace: openshift-marketplace
EOF

With both files created, we can go ahead and run oc create against them: first the Assisted-Installer catalog source file and then the Assisted-Installer subscription file that will install the operator:

$ oc create -f assisted-installer-catsource.yaml
namespace/assisted-installer created
catalogsource.operators.coreos.com/assisted-service created

$ oc create -f assisted-installer-operator.yaml
operatorgroup.operators.coreos.com/assisted-service-operator created
subscription.operators.coreos.com/assisted-service-operator created

We can confirm the operator is installed by looking at the running pods under the assisted-installer namespace:

$ oc get pods -n assisted-installer
NAME                                         READY   STATUS    RESTARTS   AGE
assisted-service-operator-579679d899-x982l   1/1     Running   0          56s

Finally, to complete the installation of the Assisted-Installer we need to configure an agent service config file like the example below.  The storage sizes can be larger if needed, but I am using 20Gi since that is the volume size available from the local-storage I configured in my environment:

$ cat << EOF > ~/assisted-installer-agentserviceconfig.yaml
apiVersion: agent-install.openshift.io/v1beta1
kind: AgentServiceConfig
metadata:
  name: agent
spec:
  databaseStorage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
  filesystemStorage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
  osImages:
    - openshiftVersion: '4.8'
      rootFSUrl: >-
        https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/4.8.0-fc.3/rhcos-live-rootfs.x86_64.img
      url: >-
        https://mirror.openshift.com/pub/openshift-v4/dependencies/rhcos/pre-release/4.8.0-fc.3/rhcos-4.8.0-fc.3-x86_64-live.x86_64.iso
      version: 48.84.202105062123-0
EOF

Once we have created the agent service config we can go ahead and apply it to the hub cluster:

$ oc create -f ~/assisted-installer-agentserviceconfig.yaml
agentserviceconfig.agent-install.openshift.io/agent created

We can confirm everything is running by looking at the pods under the assisted-installer namespace and also at the PVCs consumed in that namespace:

$ oc get pods -n assisted-installer
NAME                                         READY   STATUS    RESTARTS   AGE
assisted-service-b7dc8b8d7-2cztd             1/2     Running   1          53s
assisted-service-operator-579679d899-x982l   1/1     Running   0          3m50s

$ oc get pvc -n assisted-installer
NAME               STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS    AGE
assisted-service   Bound    local-pv-99a52316   20Gi       RWO            local-storage   87s
postgres           Bound    local-pv-6d45f357   20Gi       RWO            local-storage   87s

At this point we have configured and confirmed all the operators required to do a deployment of OpenShift with the Assisted-Installer.  This configuration will allow us to deploy any one of the following OpenShift deployment types: Multi-Node-IPv4, SNO-IPv4, Multi-Node-IPv6, SNO-IPv6 and SNO-Dual-Stack.  For demonstration purposes I will be using the SNO-IPv4 deployment type.

Before we start the deployment, I need to create some resource yamls that we will apply to the hub cluster to enable the deployment process.  The first file is the cluster imageset yaml, which tells the Assisted-Installer which OpenShift release we are going to use.  In my example we will be using 4.8.0-fc.3.  Create the following assisted-installer-clusterimageset.yaml and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-clusterimageset.yaml
apiVersion: hive.openshift.io/v1
kind: ClusterImageSet
metadata:
  name: openshift-v4.8.0
  namespace: assisted-installer
spec:
  releaseImage: quay.io/openshift-release-dev/ocp-release:4.8.0-fc.3-x86_64
EOF

$ oc create -f ~/assisted-installer-clusterimageset.yaml
clusterimageset.hive.openshift.io/openshift-v4.8.0 created

The next resource file we need is the Assisted-Installer pull secret.  This contains the pull secret used to authenticate when pulling down images from Quay during deployment.  Note that "OPENSHIFT-PULL-SECRET-HERE" should be replaced with a real pull secret from cloud.redhat.com.  Create the following assisted-installer-secrets.yaml and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: assisted-deployment-pull-secret
  namespace: assisted-installer
stringData:
  .dockerconfigjson: 'OPENSHIFT-PULL-SECRET-HERE'
EOF

$ oc create -f ~/assisted-installer-secrets.yaml
secret/assisted-deployment-pull-secret created
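
As an aside, if you would rather not paste the pull secret contents into the yaml, the same secret could instead be created directly from the pull secret file downloaded from cloud.redhat.com (the ./pull-secret.json path here is just an assumed location for that file):

$ oc create secret generic assisted-deployment-pull-secret -n assisted-installer --from-file=.dockerconfigjson=./pull-secret.json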

Next we need a resource file that defines the SSH private key to be used.  This private key will enable us to log in to the OpenShift nodes we deploy should we ever need to troubleshoot them.  Create the assisted-installer-sshprivate.yaml and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-sshprivate.yaml
apiVersion: v1
kind: Secret
metadata:
  name: assisted-deployment-ssh-private-key
  namespace: assisted-installer
stringData:
  ssh-privatekey: |-
    -----BEGIN OPENSSH PRIVATE KEY-----
    b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAABlwAAAAdzc2gtcn
    NhAAAAAwEAAQAAAYEA7uOSmvd8CgAUDaqGheAUcBsEOOoFAZYqtLKL9N0HameO6Fhv1t/l
    a4tG8BQMiu3pm5DWpRrq/O12OjjVDOHHSjwcMX/qfn8OKNVtPVq0SMZRbbkkpnK2WLMwLg
    ...
    8QT4AK4mb7H8tHo1RQkOB4foAQwPLXHvRBHrEGXnIugAeCszn8twZruRtcoX2jRiw7MS8B
    R+AuTLBeBwEXYGoxFhsaLhiCVUueEKJDUt66tVCr3ovvz8eapWv1LUM2QGeP56Z5QUsIrl
    wJwTtficCtwxK0XL+gJro9qYslbX2XxVD67goxVecIfNVmxtZ8KHeo6ICLkhOJjTAveAm+
    tF77qty2d0d0UAAAAXYnNjaG1hdXNAcmhlbDgtb2NwLWF1dG8BAgME
    -----END OPENSSH PRIVATE KEY-----
type: Opaque
EOF

$ oc create -f ~/assisted-installer-sshprivate.yaml
secret/assisted-deployment-ssh-private-key created

Next we need an agent cluster install resource configured.  This file contains some of the networking details one might find in the install-config.yaml when doing an OpenShift IPI installation.  Generate the assisted-installer-agentclusterinstall.yaml file and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-agentclusterinstall.yaml
---
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
metadata:
  name: test-cluster-virtual-aci
  namespace: assisted-installer
spec:
  clusterDeploymentRef:
    name: test-cluster-virtual
  imageSetRef:
    name: openshift-v4.8.0
  networking:
    clusterNetwork:
      - cidr: "10.128.0.0/14"
        hostPrefix: 23
    serviceNetwork:
      - "172.30.0.0/16"
    machineNetwork:
      - cidr: "192.168.0.0/24"
  provisionRequirements:
    controlPlaneAgents: 1
  sshPublicKey: 'ssh-rsa AAB3NzaC1yc2EAAAADAQABAAABgQDu45Ka93wKABQNqoaF4BRwGwQ46gUBliq0sov03QdqZ47oWG/W3+Vri0bwFAyK7embkNalGur87XY6ONUM4cdKPBwxf+p+fw4o1W09WrRIxlFtuSSmcrZYszAuD3NlHDS2WbXRHbeaMz8QsA9qNceZZj7PyB+fNULJYS2iNtlIING4DZTvoEr6KCe2cOIdDnjOW0xNng+ejKMe3vszwutLqgPMwQXS2tqJSHOMIS1kDdLFd3ZIT25xvORoNC/PWy5NAkxp9gJbrYEDlcvy7eUxqLAM9lTCWJQ0l31gqIQDOkGmS7M41x3nJ6kG1ilLn1miMfwrGKDramp7+8ejfr0QBspikp9S4Tmj6s0PA/lEL6EST4N12sr+GsJkHqhWa+HwutcMbg0qIFwhUMFN689Q84Tz9abSPnQipn09xNv1YtAEQLJxypZj3NB6/ZYZXWjB/IGgbJ4tifD1cTYdSA1A4UeNyvhIFj9yyjRF8mucGUR873xSMTlcLIq3V0aZqXU= bschmaus@rhel8-ocp-auto'
EOF

$ oc create -f ~/assisted-installer-agentclusterinstall.yaml
agentclusterinstall.extensions.hive.openshift.io/test-cluster-virtual-aci created

Next we need the cluster deployment yaml, which defines the cluster we are going to deploy.  Create the following assisted-installer-clusterdeployment.yaml file and then apply it to the hub cluster:

$ cat << EOF > ~/assisted-installer-clusterdeployment.yaml
---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: test-cluster-virtual
  namespace: assisted-installer
spec:
  baseDomain: schmaustech.com
  clusterName: kni3
  controlPlaneConfig:
    servingCertificates: {}
  installed: false
  clusterInstallRef:
    group: extensions.hive.openshift.io
    kind: AgentClusterInstall
    name: test-cluster-virtual-aci
    version: v1beta1
  platform:
    agentBareMetal: 
      agentSelector:
        matchLabels:
          bla: "aaa"
  pullSecretRef:
    name: assisted-deployment-pull-secret
EOF

$ oc create -f ~/assisted-installer-clusterdeployment.yaml
clusterdeployment.hive.openshift.io/test-cluster-virtual created

Last but not least we have an infrastructure environment file which binds a lot of the previous files together.   Create the assisted-installer-infraenv.yaml file below and then apply it to the hub cluster: 

$ cat << EOF > ~/assisted-installer-infraenv.yaml
---
apiVersion: agent-install.openshift.io/v1beta1 
kind: InfraEnv
metadata:
  name: test-cluster-virtual-infraenv
  namespace: assisted-installer
spec:
  clusterRef:
    name: test-cluster-virtual
    namespace: assisted-installer
  sshAuthorizedKey: 'ssh-rsa AAB3NzaC1yc2EAAAADAQABAAABgQDu45Ka93wKABQNqoaF4BRwGwQ46gUBliq0sov03QdqZ47oWG/W3+Vri0bwFAyK7embkNalGur87XY6ONUM4cdKPBwxf+p+fw4o1W09WrRIxlFtuSSmcrZYszAuD3NlHDS2WbXRHbeaMz8QsA9qNceZZj7PyB+fNULJYS2iNtlIING4DZTvoEr6KCe2cOIdDnjOW0xNng+ejKMe3vszwutLqgPMwQXS2tqJSHOMIS1kDdLFd3ZIT25xvORoNC/PWy5NAkxp9gJbrYEDlcvy7eUxqLAM9lTCWJQ0l31gqIQDOkGmS7M41x3nJ6kG1ilLn1miMfwrGKDramp7+8ejfr0QBspikp9S4Tmj6s0PA/lEL6EST4N12sr+GsJkHqhWa+HwutcMbg0qIFwhUMFN689Q84Tz9abSPnQipn09xNv1YtAEQLJxypZj3NB6/ZYZXWjB/IGgbJ4tifD1cTYdSA1A4UeNyvhIFj9yyjRF8mucGUR873xSMTlcLIq3V0aZqXU= bschmaus@rhel8-ocp-auto'
  agentLabelSelector:
    matchLabels:
      bla: aaa
  pullSecretRef:
    name: assisted-deployment-pull-secret
EOF

$ oc create -f ~/assisted-installer-infraenv.yaml
infraenv.agent-install.openshift.io/test-cluster-virtual-infraenv created

Once all of the resource files have been applied to the hub cluster, we should be able to extract the RHCOS live ISO download URL for the image we will use to boot the single node for our spoke SNO IPv4 deployment.  We can do that by running the following command:

$ oc get infraenv test-cluster-virtual-infraenv -o jsonpath='{.status.isoDownloadURL}' -n assisted-installer
https://assisted-service-assisted-installer.apps.kni1.schmaustech.com/api/assisted-install/v1/clusters/b38c1d3e-e460-4111-a35f-4a8d79203585/downloads/image.iso?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiYjM4YzFkM2UtZTQ2MC00MTExLWEzNWYtNGE4ZDc5MjAzNTg1In0.0sjy-0I9DstyaRA8oIUF9ByyUe31Kl6rUpVzBXSsO9mFfqLCDtF-Rh2NCWvVtjKyd4BZ7Zo5ZUIMsEtHX5sKWg
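
Since we need this URL in the next step, it can be convenient to capture it into a variable rather than copying it by hand (purely a convenience; it is the same jsonpath query as above):

$ ISO_URL=$(oc get infraenv test-cluster-virtual-infraenv -n assisted-installer -o jsonpath='{.status.isoDownloadURL}')
$ echo $ISO_URL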

Now that we know the URL of the ISO image, we can pull that image down to a location that can be accessed by our remote node, for example via virtual media (iDRAC/BMC) on a physical server.  In my case, since the spoke SNO node I am using is a virtual machine, I will run a wget command on the hypervisor host where my virtual machine resides and store the ISO under the /var/lib/libvirt/images path on that host:

# pwd
/var/lib/libvirt/images

# wget --no-check-certificate https://assisted-service-assisted-installer.apps.kni1.schmaustech.com/api/assisted-install/v1/clusters/b38c1d3e-e460-4111-a35f-4a8d79203585/downloads/image.iso?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiYjM4YzFkM2UtZTQ2MC00MTExLWEzNWYtNGE4ZDc5MjAzNTg1In0.0sjy-0I9DstyaRA8oIUF9ByyUe31Kl6rUpVzBXSsO9mFfqLCDtF-Rh2NCWvVtjKyd4BZ7Zo5ZUIMsEtHX5sKWg -O discover.iso
--2021-05-26 15:16:13--  https://assisted-service-assisted-installer.apps.kni1.schmaustech.com/api/assisted-install/v1/clusters/b38c1d3e-e460-4111-a35f-4a8d79203585/downloads/image.iso?api_key=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJjbHVzdGVyX2lkIjoiYjM4YzFkM2UtZTQ2MC00MTExLWEzNWYtNGE4ZDc5MjAzNTg1In0.0sjy-0I9DstyaRA8oIUF9ByyUe31Kl6rUpVzBXSsO9mFfqLCDtF-Rh2NCWvVtjKyd4BZ7Zo5ZUIMsEtHX5sKWg
Resolving assisted-service-assisted-installer.apps.kni1.schmaustech.com (assisted-service-assisted-installer.apps.kni1.schmaustech.com)... 192.168.0.204
Connecting to assisted-service-assisted-installer.apps.kni1.schmaustech.com (assisted-service-assisted-installer.apps.kni1.schmaustech.com)|192.168.0.204|:443... connected.
WARNING: The certificate of ‘assisted-service-assisted-installer.apps.kni1.schmaustech.com’ is not trusted.
WARNING: The certificate of ‘assisted-service-assisted-installer.apps.kni1.schmaustech.com’ hasn't got a known issuer.
HTTP request sent, awaiting response... 200 OK
Length: 109111296 (104M) [application/octet-stream]
Saving to: ‘discover.iso’

discover.iso                                         100%[=====================================================================================================================>] 104.06M   104MB/s    in 1.0s    

2021-05-26 15:16:14 (104 MB/s) - ‘discover.iso’ saved [109111296/109111296]

# ls -l *.iso
-rw-r--r--. 1 root root 109111296 May 26 15:16 discover.iso

Now that I have the image on the local hypervisor, I can edit the virtual machine definition to point its CDROM device at that ISO path so the node will boot to the RHCOS live ISO:

# virsh list --all
 Id   Name       State
---------------------------
 -    nuc4-vm1   shut off

# virsh dumpxml nuc4-vm1 | sed '/^    <disk device="cdrom" type="file">/a \ \ \ \ <source file="/var/lib/libvirt/images/discover.iso"></source>' | virsh define /dev/stdin

# virsh start nuc4-vm1
Domain nuc4-vm1 started

At this point we can watch the RHCOS live ISO boot from the virtual machine's console.  If this is being done on a physical server, one could watch from the server's BMC interface (or the iDRAC console if it is a Dell server).
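
Since my spoke node is a KVM virtual machine, one way to watch that boot (assuming the virt-viewer package is available on the hypervisor) is simply to open the graphical console:

# virt-viewer nuc4-vm1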



Once the RHCOS live ISO boots, it will pull down an RHCOS image that will be written to the local disk of the node.  At this point we can shift back to the CLI and watch the progress of the install by checking the status of the agent cluster install using the syntax below.  One of the first things we notice is that an agent approval is required before the installation can proceed:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The installation is pending on the approval of 1 agents",
  "reason": "UnapprovedAgents",
  "status": "False",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not yet started",
  "reason": "InstallationNotStarted",
  "status": "False",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation is waiting to start or in progress",
  "reason": "InstallationNotStopped",
  "status": "False",
  "type": "Stopped"
}

We can view that approval requirement from another angle by looking at the agent resource itself with the syntax below.  Notice it says the agent is not approved, and until it is, the installation will wait and not continue.

$ oc get agents.agent-install.openshift.io -n assisted-installer  -o=jsonpath='{range .items[*]}{"\n"}{.spec.clusterDeploymentName.name}{"\n"}{.status.inventory.hostname}{"\n"}{range .status.conditions[*]}{.type}{"\t"}{.message}{"\n"}{end}'

test-cluster-virtual
master-0.kni5.schmaustech.com
SpecSynced	The Spec has been successfully applied
Connected	The agent's connection to the installation service is unimpaired
ReadyForInstallation	The agent is not approved
Validated	The agent's validations are passing
Installed	The installation has not yet started

We can view the agents' approval state yet another way by listing the agents for the cluster:

$ oc get agents.agent-install.openshift.io -n assisted-installer
NAME                                   CLUSTER                APPROVED
e4117b8b-a2ef-45df-baf0-2ebc6ae1bf8e   test-cluster-virtual   false
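
Because this is a SNO deployment there is only a single agent, so rather than copying the UID by hand we could also capture the agent name into a variable (a convenience only; the patch below uses the literal name, but $AGENT would work the same way):

$ AGENT=$(oc get agents.agent-install.openshift.io -n assisted-installer -o jsonpath='{.items[0].metadata.name}')
$ echo $AGENT
e4117b8b-a2ef-45df-baf0-2ebc6ae1bf8e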

Let's go ahead and approve this agent by patching its approved field to true using the syntax below:

$ oc -n assisted-installer patch agents.agent-install.openshift.io e4117b8b-a2ef-45df-baf0-2ebc6ae1bf8e -p '{"spec":{"approved":true}}' --type merge
agent.agent-install.openshift.io/e4117b8b-a2ef-45df-baf0-2ebc6ae1bf8e patched

Now that the approval has been made, the cluster can continue with the installation process:

$ oc get agents.agent-install.openshift.io -n assisted-installer  -o=jsonpath='{range .items[*]}{"\n"}{.spec.clusterDeploymentName.name}{"\n"}{.status.inventory.hostname}{"\n"}{range .status.conditions[*]}{.type}{"\t"}{.message}{"\n"}{end}'

test-cluster-virtual
master-0.kni5.schmaustech.com
SpecSynced	The Spec has been successfully applied
Connected	The agent's connection to the installation service is unimpaired
ReadyForInstallation	The agent cannot begin the installation because it has already started
Validated	The agent's validations are passing
Installed	The installation is in progress: Host is preparing for installation

We can now see that the cluster is being prepared for installation:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T17:50:12Z",
  "lastTransitionTime": "2021-05-27T17:50:12Z",
  "message": "The cluster requirements are met",
  "reason": "ClusterAlreadyInstalling",
  "status": "True",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-27T17:50:12Z",
  "lastTransitionTime": "2021-05-27T17:50:12Z",
  "message": "The installation is in progress: Preparing cluster for installation",
  "reason": "InstallationInProgress",
  "status": "False",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation is waiting to start or in progress",
  "reason": "InstallationNotStopped",
  "status": "False",
  "type": "Stopped"
}

As we wait a little longer, we can see the installation process has begun.  The complete installation took about 70 minutes in my virtualized environment:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T17:50:12Z",
  "lastTransitionTime": "2021-05-27T17:50:12Z",
  "message": "The cluster requirements are met",
  "reason": "ClusterAlreadyInstalling",
  "status": "True",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-27T17:52:00Z",
  "lastTransitionTime": "2021-05-27T17:52:00Z",
  "message": "The installation is in progress: Installation in progress",
  "reason": "InstallationInProgress",
  "status": "False",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation is waiting to start or in progress",
  "reason": "InstallationNotStopped",
  "status": "False",
  "type": "Stopped"
}

As we continue to watch the status of the cluster installation via the agent cluster install we can see that the installation process is in the finalization phase:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T17:50:12Z",
  "lastTransitionTime": "2021-05-27T17:50:12Z",
  "message": "The cluster requirements are met",
  "reason": "ClusterAlreadyInstalling",
  "status": "True",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-27T18:37:20Z",
  "lastTransitionTime": "2021-05-27T18:37:20Z",
  "message": "The installation is in progress: Finalizing cluster installation",
  "reason": "InstallationInProgress",
  "status": "False",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation is waiting to start or in progress",
  "reason": "InstallationNotStopped",
  "status": "False",
  "type": "Stopped"
}

And finally, after about 70 minutes, we can see the cluster has completed installation:

$ oc get agentclusterinstalls test-cluster-virtual-aci -o json -n assisted-installer | jq '.status.conditions[]'
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The Spec has been successfully applied",
  "reason": "SyncOK",
  "status": "True",
  "type": "SpecSynced"
}
{
  "lastProbeTime": "2021-05-27T18:50:00Z",
  "lastTransitionTime": "2021-05-27T18:50:00Z",
  "message": "The cluster installation stopped",
  "reason": "ClusterInstallationStopped",
  "status": "True",
  "type": "RequirementsMet"
}
{
  "lastProbeTime": "2021-05-27T17:43:30Z",
  "lastTransitionTime": "2021-05-27T17:43:30Z",
  "message": "The cluster's validations are passing",
  "reason": "ValidationsPassing",
  "status": "True",
  "type": "Validated"
}
{
  "lastProbeTime": "2021-05-27T18:50:00Z",
  "lastTransitionTime": "2021-05-27T18:50:00Z",
  "message": "The installation has completed: Cluster is installed",
  "reason": "InstallationCompleted",
  "status": "True",
  "type": "Completed"
}
{
  "lastProbeTime": "2021-05-26T20:07:00Z",
  "lastTransitionTime": "2021-05-26T20:07:00Z",
  "message": "The installation has not failed",
  "reason": "InstallationNotFailed",
  "status": "False",
  "type": "Failed"
}
{
  "lastProbeTime": "2021-05-27T18:50:00Z",
  "lastTransitionTime": "2021-05-27T18:50:00Z",
  "message": "The installation has stopped because it completed successfully",
  "reason": "InstallationCompleted",
  "status": "True",
  "type": "Stopped"
}

Now let's validate that the cluster is indeed installed and functioning correctly.  To do this we first need to extract the kubeconfig secret from our hub cluster and then set it as the KUBECONFIG variable:

$ oc get secret -n assisted-installer test-cluster-virtual-admin-kubeconfig -o json | jq -r '.data.kubeconfig' | base64 -d > /tmp/sno-spoke-kubeconfig 
$ export KUBECONFIG=/tmp/sno-spoke-kubeconfig
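
As an alternative to the jq and base64 pipeline, the same kubeconfig can be pulled out with oc extract (the key name comment it prints goes to stderr, so only the kubeconfig contents land in the file):

$ oc extract secret/test-cluster-virtual-admin-kubeconfig -n assisted-installer --keys=kubeconfig --to=- > /tmp/sno-spoke-kubeconfig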

Now let's run some oc commands.  First we will look at the nodes with a wide view:

$ oc get nodes -o wide
NAME                            STATUS   ROLES           AGE   VERSION                INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                                                       KERNEL-VERSION          CONTAINER-RUNTIME
master-0.kni5.schmaustech.com   Ready    master,worker   53m   v1.21.0-rc.0+291e731   192.168.0.200   <none>        Red Hat Enterprise Linux CoreOS 48.84.202105062123-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-90.rhaos4.8.git07becf8.el8

Next we will confirm all the cluster operators are up and available:

$ oc get co
NAME                                       VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-fc.3   True        False         False      9m35s
baremetal                                  4.8.0-fc.3   True        False         False      30m
cloud-credential                           4.8.0-fc.3   True        False         False      48m
cluster-autoscaler                         4.8.0-fc.3   True        False         False      30m
config-operator                            4.8.0-fc.3   True        False         False      50m
console                                    4.8.0-fc.3   True        False         False      9m46s
csi-snapshot-controller                    4.8.0-fc.3   True        False         False      9m23s
dns                                        4.8.0-fc.3   True        False         False      22m
etcd                                       4.8.0-fc.3   True        False         False      31m
image-registry                             4.8.0-fc.3   True        False         False      21m
ingress                                    4.8.0-fc.3   True        False         False      14m
insights                                   4.8.0-fc.3   True        False         False      13m
kube-apiserver                             4.8.0-fc.3   True        False         False      22m
kube-controller-manager                    4.8.0-fc.3   True        False         False      22m
kube-scheduler                             4.8.0-fc.3   True        False         False      29m
kube-storage-version-migrator              4.8.0-fc.3   True        False         False      31m
machine-api                                4.8.0-fc.3   True        False         False      30m
machine-approver                           4.8.0-fc.3   True        False         False      48m
machine-config                             4.8.0-fc.3   True        False         False      20m
marketplace                                4.8.0-fc.3   True        False         False      30m
monitoring                                 4.8.0-fc.3   True        False         False      9m24s
network                                    4.8.0-fc.3   True        False         False      51m
node-tuning                                4.8.0-fc.3   True        False         False      22m
openshift-apiserver                        4.8.0-fc.3   True        False         False      22m
openshift-controller-manager               4.8.0-fc.3   True        False         False      30m
openshift-samples                          4.8.0-fc.3   True        False         False      21m
operator-lifecycle-manager                 4.8.0-fc.3   True        False         False      30m
operator-lifecycle-manager-catalog         4.8.0-fc.3   True        False         False      48m
operator-lifecycle-manager-packageserver   4.8.0-fc.3   True        False         False      6m34s
service-ca                                 4.8.0-fc.3   True        False         False      50m
storage                                    4.8.0-fc.3   True        False         False      30m

And finally we can check the cluster version:

$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-fc.3   True        False         6m26s   Cluster version is 4.8.0-fc.3

Everything in the installation appears to be working for this SNO-based deployment!

Saturday, May 01, 2021

Configuring Noobaa S3 Storage for Red Hat Advanced Cluster Management Observability

 


Never mind other storage vendors: Noobaa in OpenShift Container Storage (OCS) can provide all the object storage Red Hat Advanced Cluster Management Observability will ever need.  In the following blog I will demonstrate how to configure the Noobaa backend in OCS to be used by Red Hat Advanced Cluster Management Observability.

Red Hat Advanced Cluster Management consists of several multicluster components, which are used to access and manage a fleet of OpenShift clusters.  With the observability service enabled, you can use Red Hat Advanced Cluster Management to gain insight into and optimize a fleet of managed clusters.

First let's discuss some assumptions I make about the setup:

- This is a 3 master, 3 (or more) worker OpenShift cluster

- OCP 4.6.19 (or higher) with OCS 4.6.4 in a hyperconverged configuration

- RHACM 2.2.2 is installed on the same cluster

With the above assumptions stated, let's move on to configuring a Noobaa object bucket.  The first thing we need to do is create a resource yaml file that will create our object bucket claim.  Below is an example:

$ cat << EOF > ~/noobaa-object-storage.yaml
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: obc-schmaustech
spec:
  generateBucketName: obc-schmaustech-bucket
  storageClassName: openshift-storage.noobaa.io
EOF

Once we have created our object bucket resource yaml, we need to go ahead and create it in our OpenShift cluster with the following command:

$ oc create -f ~/noobaa-object-storage.yaml
objectbucketclaim.objectbucket.io/obc-schmaustech created


Once the object bucket resource is created we can see it by listing current object buckets:

$ oc get objectbucket
NAME                          STORAGE-CLASS                 CLAIM-NAMESPACE   CLAIM-NAME        RECLAIM-POLICY   PHASE   AGE
obc-default-obc-schmaustech   openshift-storage.noobaa.io   default           obc-schmaustech   Delete           Bound   30s

There are some bits of information we need to gather from the object bucket we created in order to build the thanos-object-storage resource yaml required for our Observability configuration.  Those bits are found by describing the object bucket and its associated secret.  First let's look at the object bucket itself:

$ oc describe objectbucket obc-default-obc-schmaustech
Name:         obc-default-obc-schmaustech
Namespace:    
Labels:       app=noobaa
              bucket-provisioner=openshift-storage.noobaa.io-obc
              noobaa-domain=openshift-storage.noobaa.io
Annotations:  <none>
API Version:  objectbucket.io/v1alpha1
Kind:         ObjectBucket
Metadata:
  Creation Timestamp:  2021-05-01T00:12:54Z
  Finalizers:
    objectbucket.io/finalizer
  Generation:  1
  Managed Fields:
    API Version:  objectbucket.io/v1alpha1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"objectbucket.io/finalizer":
        f:labels:
          .:
          f:app:
          f:bucket-provisioner:
          f:noobaa-domain:
      f:spec:
        .:
        f:additionalState:
          .:
          f:account:
          f:bucketclass:
          f:bucketclassgeneration:
        f:claimRef:
          .:
          f:apiVersion:
          f:kind:
          f:name:
          f:namespace:
          f:uid:
        f:endpoint:
          .:
          f:additionalConfig:
          f:bucketHost:
          f:bucketName:
          f:bucketPort:
          f:region:
          f:subRegion:
        f:reclaimPolicy:
        f:storageClassName:
      f:status:
        .:
        f:phase:
    Manager:         noobaa-operator
    Operation:       Update
    Time:            2021-05-01T00:12:54Z
  Resource Version:  4864265
  Self Link:         /apis/objectbucket.io/v1alpha1/objectbuckets/obc-default-obc-schmaustech
  UID:               9c7eddae-4453-439b-826f-f226513d78f4
Spec:
  Additional State:
    Account:                obc-account.obc-schmaustech-bucket-f6508472-4ba6-405d-9e39-881b45a7344e.608c9d05@noobaa.io
    Bucketclass:            noobaa-default-bucket-class
    Bucketclassgeneration:  1
  Claim Ref:
    API Version:  objectbucket.io/v1alpha1
    Kind:         ObjectBucketClaim
    Name:         obc-schmaustech
    Namespace:    default
    UID:          e123d2c8-2f9d-4f39-9a83-ede316b8a5fe
  Endpoint:
    Additional Config:
    Bucket Host:       s3.openshift-storage.svc
    Bucket Name:       obc-schmaustech-bucket-f6508472-4ba6-405d-9e39-881b45a7344e
    Bucket Port:       443
    Region:            
    Sub Region:        
  Reclaim Policy:      Delete
  Storage Class Name:  openshift-storage.noobaa.io
Status:
  Phase:  Bound
Events:   <none>

In the object bucket describe output we are specifically interested in the bucket name and the bucket host.  Below, let's capture the bucket name, assign it to a variable, and then echo it out to confirm the variable was set correctly:

$ BUCKET_NAME=`oc describe objectbucket obc-default-obc-schmaustech|grep 'Bucket Name'|cut -d: -f2|tr -d " "`
$ echo $BUCKET_NAME
obc-schmaustech-bucket-f6508472-4ba6-405d-9e39-881b45a7344e

Let's do the same thing for the bucket host information.  Again we will assign it to a variable and then echo the variable to confirm it was set correctly:

$ BUCKET_HOST=`oc describe objectbucket obc-default-obc-schmaustech|grep 'Bucket Host'|cut -d: -f2|tr -d " "`
$ echo $BUCKET_HOST
s3.openshift-storage.svc
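
As an alternative to grepping the describe output, the ObjectBucketClaim also creates a ConfigMap of the same name that exposes these values directly, so (assuming that behavior in your OCS version) the same variables could be set with jsonpath:

$ BUCKET_NAME=`oc get configmap obc-schmaustech -o jsonpath='{.data.BUCKET_NAME}'`
$ BUCKET_HOST=`oc get configmap obc-schmaustech -o jsonpath='{.data.BUCKET_HOST}'`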

After we have gathered the bucket name and bucket host, we also need to get the access and secret keys for our bucket.  These are stored in a secret which has the same name as the metadata name defined in our original object bucket claim resource file we created above.  In our example the metadata name was obc-schmaustech.  Let's show that secret below:

$ oc get secret obc-schmaustech
NAME              TYPE     DATA   AGE
obc-schmaustech   Opaque   2      117s

The access and secret keys are contained in the secret resource, and we can see them if we get the secret and ask for the yaml version of the output as we have done below:

$ oc get secret obc-schmaustech -o yaml
apiVersion: v1
data:
  AWS_ACCESS_KEY_ID: V3M2TmpGdWVLd3Vjb2VoTHZVTUo=
  AWS_SECRET_ACCESS_KEY: ck4vOTBaM2NkZWJvOVJLQStaYlBsK3VveWZOYmFpN0s0OU5KRFVKag==
kind: Secret
metadata:
  creationTimestamp: "2021-05-01T00:12:54Z"
  finalizers:
  - objectbucket.io/finalizer
  labels:
    app: noobaa
    bucket-provisioner: openshift-storage.noobaa.io-obc
    noobaa-domain: openshift-storage.noobaa.io
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:AWS_ACCESS_KEY_ID: {}
        f:AWS_SECRET_ACCESS_KEY: {}
      f:metadata:
        f:finalizers:
          .: {}
          v:"objectbucket.io/finalizer": {}
        f:labels:
          .: {}
          f:app: {}
          f:bucket-provisioner: {}
          f:noobaa-domain: {}
        f:ownerReferences:
          .: {}
          k:{"uid":"e123d2c8-2f9d-4f39-9a83-ede316b8a5fe"}:
            .: {}
            f:apiVersion: {}
            f:blockOwnerDeletion: {}
            f:controller: {}
            f:kind: {}
            f:name: {}
            f:uid: {}
      f:type: {}
    manager: noobaa-operator
    operation: Update
    time: "2021-05-01T00:12:54Z"
  name: obc-schmaustech
  namespace: default
  ownerReferences:
  - apiVersion: objectbucket.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: ObjectBucketClaim
    name: obc-schmaustech
    uid: e123d2c8-2f9d-4f39-9a83-ede316b8a5fe
  resourceVersion: "4864261"
  selfLink: /api/v1/namespaces/default/secrets/obc-schmaustech
  uid: eda5cd99-dc57-4c7b-acf3-377343d6fef8
type: Opaque

The access and secret keys are base64 encoded, so we need to decode them when we gather them.  As we did with the bucket name and bucket host, we will assign them to variables.  First let's pull the access key out of the yaml, decode it, assign it to a variable and confirm the variable has the access key content:

$ AWS_ACCESS_KEY_ID=`oc get secret obc-schmaustech -o yaml|grep -m1 AWS_ACCESS_KEY_ID|cut -d: -f2|tr -d " "| base64 -d`
$ echo $AWS_ACCESS_KEY_ID
Ws6NjFueKwucoehLvUMJ

We will do the same for the secret key and verify again:

$ AWS_SECRET_ACCESS_KEY=`oc get secret obc-schmaustech -o yaml|grep -m1 AWS_SECRET_ACCESS_KEY|cut -d: -f2|tr -d " "| base64 -d`
$ echo $AWS_SECRET_ACCESS_KEY
rN/90Z3cdebo9RKA+ZbPl+uoyfNbai7K49NJDUJj
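
The same values can also be pulled straight out of the secret with jsonpath, which avoids the grep and cut gymnastics (just an alternative way to arrive at the same result):

$ AWS_ACCESS_KEY_ID=`oc get secret obc-schmaustech -o jsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d`
$ AWS_SECRET_ACCESS_KEY=`oc get secret obc-schmaustech -o jsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d`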

Now that we have our four variables containing the bucket name, bucket host, access key and secret key, we are ready to create the thanos-object-storage resource yaml file we need to start the configuration and deployment of the Red Hat Advanced Cluster Management Observability component.  This file provides the observability service with the information about the S3 object storage.  Below is how we can create the file, noting that the shell will substitute the variable values into the resource definition:

$ cat << EOF > ~/thanos-object-storage.yaml
apiVersion: v1
kind: Secret
metadata:
  name: thanos-object-storage
type: Opaque
stringData:
  thanos.yaml: |
    type: s3
    config:
      bucket: $BUCKET_NAME
      endpoint: $BUCKET_HOST
      insecure: false
      access_key: $AWS_ACCESS_KEY_ID
      secret_key: $AWS_SECRET_ACCESS_KEY
      trace:
        enable: true
      http_config:
        insecure_skip_verify: true
EOF

Once we have the definition created we can go ahead and create the open-cluster-management-observability namespace:

$ oc create namespace open-cluster-management-observability
namespace/open-cluster-management-observability created

Next we want to assign the cluster's pull-secret to the DOCKER_CONFIG_JSON variable:

$ DOCKER_CONFIG_JSON=`oc extract secret/pull-secret -n openshift-config --to=-`
# .dockerconfigjson
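
The multiclusterobservability resource we create shortly references an imagePullSecret named multiclusterhub-operator-pull-secret.  If that secret does not already exist in the new namespace, it can be created from the variable we just set; this mirrors the step in the RHACM documentation, so adjust or skip it if your hub already provides the secret:

$ oc create secret generic multiclusterhub-operator-pull-secret -n open-cluster-management-observability --from-literal=.dockerconfigjson="$DOCKER_CONFIG_JSON" --type=kubernetes.io/dockerconfigjson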

At this point we can go ahead and create the thanos-object-storage resource from the yaml file we created:

$ oc create -f thanos-object-storage.yaml -n open-cluster-management-observability
secret/thanos-object-storage created

Once the thanos-object-storage resource is created we can create a multiclusterobservability resource yaml file like the example below.  Notice that it references the thanos-object-storage resource we created above:

$ cat << EOF > ~/multiclusterobservability_cr.yaml
apiVersion: observability.open-cluster-management.io/v1beta1
kind: MultiClusterObservability
metadata:
  name: observability #Your customized name of MulticlusterObservability CR
spec:
  availabilityConfig: High             # Available values are High or Basic
  imagePullPolicy: Always
  imagePullSecret: multiclusterhub-operator-pull-secret
  observabilityAddonSpec:              # The ObservabilityAddonSpec is the global settings for all managed clusters
    enableMetrics: true                # EnableMetrics indicates the observability addon push metrics to hub server
    interval: 60                       # Interval for the observability addon push metrics to hub server
  retentionResolution1h: 5d            # How long to retain samples of 1 hour in bucket
  retentionResolution5m: 3d
  retentionResolutionRaw: 1d
  storageConfigObject:                 # Specifies the storage to be used by Observability
    metricObjectStorage:
      name: thanos-object-storage
      key: thanos.yaml
EOF

We can cat out the file to confirm it looks correct:

$ cat multiclusterobservability_cr.yaml
apiVersion: observability.open-cluster-management.io/v1beta1
kind: MultiClusterObservability
metadata:
  name: observability #Your customized name of MulticlusterObservability CR
spec:
  availabilityConfig: High             # Available values are High or Basic
  imagePullPolicy: Always
  imagePullSecret: multiclusterhub-operator-pull-secret
  observabilityAddonSpec:              # The ObservabilityAddonSpec is the global settings for all managed clusters
    enableMetrics: true                # EnableMetrics indicates the observability addon push metrics to hub server
    interval: 60                       # Interval for the observability addon push metrics to hub server
  retentionResolution1h: 5d            # How long to retain samples of 1 hour in bucket
  retentionResolution5m: 3d
  retentionResolutionRaw: 1d
  storageConfigObject:                 # Specifies the storage to be used by Observability
    metricObjectStorage:
      name: thanos-object-storage
      key: thanos.yaml

At this point we can double check that nothing is running under the open-cluster-management-observability namespace:

$ oc get pods -n open-cluster-management-observability
No resources found in open-cluster-management-observability namespace.


Once we have confirmed there are no resources running, we can apply the multiclusterobservability resource file we created to start the deployment of the observability components:

$ oc apply -f multiclusterobservability_cr.yaml
multiclusterobservability.observability.open-cluster-management.io/observability created

It will take a few minutes for the associated pods to come up, but once they do, looking at the pods under the open-cluster-management-observability namespace should show something like the following:

$  oc get pods -n open-cluster-management-observability
NAME                                                              READY   STATUS    RESTARTS   AGE
alertmanager-0                                                    2/2     Running   0          97s
alertmanager-1                                                    2/2     Running   0          73s
alertmanager-2                                                    2/2     Running   0          57s
grafana-546fb568b4-bqn22                                          2/2     Running   0          97s
grafana-546fb568b4-hxpcz                                          2/2     Running   0          97s
observability-observatorium-observatorium-api-85cf58bd8d-nlpxf    1/1     Running   0          74s
observability-observatorium-observatorium-api-85cf58bd8d-qtm98    1/1     Running   0          74s
observability-observatorium-thanos-compact-0                      1/1     Running   0          74s
observability-observatorium-thanos-query-58dc8c8ccb-4p6l8         1/1     Running   0          74s
observability-observatorium-thanos-query-58dc8c8ccb-6tmvd         1/1     Running   0          74s
observability-observatorium-thanos-query-frontend-f8869cdf66c2c   1/1     Running   0          74s
observability-observatorium-thanos-query-frontend-f8869cdfstwrg   1/1     Running   0          75s
observability-observatorium-thanos-receive-controller-56c9x6tt5   1/1     Running   0          74s
observability-observatorium-thanos-receive-default-0              1/1     Running   0          74s
observability-observatorium-thanos-receive-default-1              1/1     Running   0          56s
observability-observatorium-thanos-receive-default-2              1/1     Running   0          37s
observability-observatorium-thanos-rule-0                         2/2     Running   0          74s
observability-observatorium-thanos-rule-1                         2/2     Running   0          49s
observability-observatorium-thanos-rule-2                         2/2     Running   0          32s
observability-observatorium-thanos-store-memcached-0              2/2     Running   0          74s
observability-observatorium-thanos-store-memcached-1              2/2     Running   0          70s
observability-observatorium-thanos-store-memcached-2              2/2     Running   0          66s
observability-observatorium-thanos-store-shard-0-0                1/1     Running   0          75s
observability-observatorium-thanos-store-shard-1-0                1/1     Running   0          74s
observability-observatorium-thanos-store-shard-2-0                1/1     Running   0          75s
observatorium-operator-797ddbd9d-kqpm6                            1/1     Running   0          98s
rbac-query-proxy-769b5dbcc5-qprrr                                 1/1     Running   0          85s
rbac-query-proxy-769b5dbcc5-s5rbm                                 1/1     Running   0          91s

With the pods running under the open-cluster-management-observability namespace, we can confirm that the observability service is running by logging into the Red Hat Advanced Cluster Management console and going to Observe environments.  In the upper right hand corner of the screen a Grafana link should now be present, like in the screenshot below:


Once you click on the Grafana link, the following observability dashboard will appear, and it may already be showing metrics from the cluster collections:


If we click on the CPU metric we can even see the breakdown of what is using the CPU of the local-cluster:


At this point we can conclude that the Red Hat Advanced Cluster Management Observability component is installed successfully and using the Noobaa S3 object bucket we created.