Monday, March 18, 2019

Stacking OpenShift with Rook and CNV


In previous blogs I worked with Rook/Ceph on Kubernetes, demonstrating how to set up a Ceph cluster and even replace failed OSDs. With that in mind I wanted to shift gears a bit and bring things more into alignment with OpenShift and Container Native Virtualization (CNV).

The following blog will guide us through a simple OpenShift deployment with Rook/Ceph and CNV configured. I will also demonstrate the use of a Rook PVC that provides the back-end storage for a CNV-deployed virtual instance.

The configuration for this lab is four virtual machines: one node acts as both master and compute, and the other three nodes are compute nodes. Each node has a base install of Red Hat Enterprise Linux 7, and the physical host they run on allows for nested virtualization.

Before we start installing the various software, let's do a bit of user setup to ensure the install runs smoothly. The next few steps need to be done on all nodes to create a user named origin (this could be any non-root user) with sudo rights that do not require a password:

# useradd origin
# passwd origin
# echo -e 'Defaults:origin !requiretty\norigin ALL = (root) NOPASSWD:ALL' | tee /etc/sudoers.d/openshift 
# chmod 440 /etc/sudoers.d/openshift

Then, as the origin user, we need to perform the following steps to set up keyless authentication from the master node to the rest of the nodes that will make up the cluster:

# ssh-keygen -q -N ""
# vi /home/origin/.ssh/config
Host ocp-master
    Hostname ocp-master.schmaustech.com
    User origin
Host ocp-node1
    Hostname ocp-node1.schmaustech.com
    User origin
Host ocp-node2
    Hostname ocp-node2.schmaustech.com
    User origin
Host ocp-node3
    Hostname ocp-node3.schmaustech.com
    User origin

# chmod 600 /home/origin/.ssh/config
# ssh-copy-id ocp-master
# ssh-copy-id ocp-node1
# ssh-copy-id ocp-node2
# ssh-copy-id ocp-node3

Now we can move on to enabling the necessary repositories on all nodes so we have access to the packages required for installation:

[origin@ocp-master ~]$ sudo subscription-manager repos --enable=rhel-7-server-rpms --enable=rhel-7-server-extras-rpms --enable=rhel-7-server-rh-common-rpms --enable=rhel-7-server-ose-3.11-rpms --enable=rhel-7-server-ansible-2.6-rpms --enable=rhel-7-server-cnv-1.4-tech-preview-rpms

Next, let's install the initial required packages on all the nodes:

[origin@ocp-master ~]$ sudo yum -y install openshift-ansible docker-1.13.1 kubevirt-ansible kubevirt-virtctl

On the master node let's configure the Ansible hosts file for our OpenShift installation. The following is the example I used; I simply replaced /etc/ansible/hosts with it.

[OSEv3:children]
masters
nodes
etcd
[OSEv3:vars]
# admin user created in previous section
ansible_ssh_user=origin
ansible_become=true
oreg_url=registry.access.redhat.com/openshift3/ose-${component}:${version}
openshift_deployment_type=openshift-enterprise
#  use HTPasswd for authentication
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]
# define default sub-domain for Master node
openshift_master_default_subdomain=apps.schmaustech.com
# allow unencrypted connection within cluster
openshift_docker_insecure_registries=172.30.0.0/16
[masters]
ocp-master.schmaustech.com openshift_schedulable=true containerized=false
[etcd]
ocp-master.schmaustech.com
[nodes]
# defined values for [openshift_node_group_name] in the file below
# [/usr/share/ansible/openshift-ansible/roles/openshift_facts/defaults/main.yml]
ocp-master.schmaustech.com openshift_node_group_name='node-config-all-in-one'
ocp-node1.schmaustech.com openshift_node_group_name='node-config-compute'
ocp-node2.schmaustech.com openshift_node_group_name='node-config-compute'
ocp-node3.schmaustech.com openshift_node_group_name='node-config-compute'

With the Ansible host file in place we are ready to run the OpenShift prerequisite playbook:

[origin@ocp-master ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml

Once the prerequisite playbook executes successfully we can then run the OpenShift deploy cluster playbook:

[origin@ocp-master ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml

Let's validate OpenShift is up and running:

[origin@ocp-master ~]$ oc get nodes
NAME         STATUS    ROLES                  AGE       VERSION
ocp-master   Ready     compute,infra,master   15h       v1.11.0+d4cacc0
ocp-node1    Ready     compute                14h       v1.11.0+d4cacc0
ocp-node2    Ready     compute                14h       v1.11.0+d4cacc0
ocp-node3    Ready     compute                14h       v1.11.0+d4cacc0

[origin@ocp-master ~]$ oc get pods --all-namespaces -o wide
NAMESPACE                           NAME                                           READY     STATUS      RESTARTS   AGE       IP              NODE         NOMINATED NODE
default                             docker-registry-1-g4hgd                        1/1       Running     0          14h       10.128.0.4      ocp-master   <none>
default                             registry-console-1-zwhrd                       1/1       Running     0          14h       10.128.0.6      ocp-master   <none>
default                             router-1-v8pkp                                 1/1       Running     0          14h       192.168.3.100   ocp-master   <none>
kube-service-catalog                apiserver-gxjst                                1/1       Running     0          14h       10.128.0.17     ocp-master   <none>
kube-service-catalog                controller-manager-2v6qs                       1/1       Running     3          14h       10.128.0.18     ocp-master   <none>
openshift-ansible-service-broker    asb-1-d8clq                                    1/1       Running     0          14h       10.128.0.21     ocp-master   <none>
openshift-console                   console-566f847459-pk52j                       1/1       Running     0          14h       10.128.0.12     ocp-master   <none>
openshift-monitoring                alertmanager-main-0                            3/3       Running     0          14h       10.128.0.14     ocp-master   <none>
openshift-monitoring                alertmanager-main-1                            3/3       Running     0          14h       10.128.0.15     ocp-master   <none>
openshift-monitoring                alertmanager-main-2                            3/3       Running     0          14h       10.128.0.16     ocp-master   <none>
openshift-monitoring                cluster-monitoring-operator-79d6c544f5-c8rfs   1/1       Running     0          14h       10.128.0.7      ocp-master   <none>
openshift-monitoring                grafana-8497b48bd5-bqzxb                       2/2       Running     0          14h       10.128.0.10     ocp-master   <none>
openshift-monitoring                kube-state-metrics-7d8b57fc8f-ktdq4            3/3       Running     0          14h       10.128.0.19     ocp-master   <none>
openshift-monitoring                node-exporter-5gmbc                            2/2       Running     0          14h       192.168.3.103   ocp-node3    <none>
openshift-monitoring                node-exporter-fxthd                            2/2       Running     0          14h       192.168.3.102   ocp-node2    <none>
openshift-monitoring                node-exporter-gj27b                            2/2       Running     0          14h       192.168.3.101   ocp-node1    <none>
openshift-monitoring                node-exporter-r6vjs                            2/2       Running     0          14h       192.168.3.100   ocp-master   <none>
openshift-monitoring                prometheus-k8s-0                               4/4       Running     1          14h       10.128.0.11     ocp-master   <none>
openshift-monitoring                prometheus-k8s-1                               4/4       Running     1          14h       10.128.0.13     ocp-master   <none>
openshift-monitoring                prometheus-operator-5677fb6f87-4czth           1/1       Running     0          14h       10.128.0.8      ocp-master   <none>
openshift-node                      sync-7rqcb                                     1/1       Running     0          14h       192.168.3.103   ocp-node3    <none>
openshift-node                      sync-829ql                                     1/1       Running     0          14h       192.168.3.101   ocp-node1    <none>
openshift-node                      sync-mwq6v                                     1/1       Running     0          14h       192.168.3.102   ocp-node2    <none>
openshift-node                      sync-vc4hw                                     1/1       Running     0          15h       192.168.3.100   ocp-master   <none>
openshift-sdn                       ovs-n55b8                                      1/1       Running     0          14h       192.168.3.101   ocp-node1    <none>
openshift-sdn                       ovs-nvtgq                                      1/1       Running     0          14h       192.168.3.103   ocp-node3    <none>
openshift-sdn                       ovs-t8dgh                                      1/1       Running     0          14h       192.168.3.102   ocp-node2    <none>
openshift-sdn                       ovs-wgw2v                                      1/1       Running     0          15h       192.168.3.100   ocp-master   <none>
openshift-sdn                       sdn-7r9kn                                      1/1       Running     0          14h       192.168.3.101   ocp-node1    <none>
openshift-sdn                       sdn-89284                                      1/1       Running     0          15h       192.168.3.100   ocp-master   <none>
openshift-sdn                       sdn-hmgjg                                      1/1       Running     0          14h       192.168.3.103   ocp-node3    <none>
openshift-sdn                       sdn-n7lzh                                      1/1       Running     0          14h       192.168.3.102   ocp-node2    <none>
openshift-template-service-broker   apiserver-md5sr                                1/1       Running     0          14h       10.128.0.22     ocp-master   <none>
openshift-web-console               webconsole-674f79b6fc-cjrhw                    1/1       Running     0          14h       10.128.0.9      ocp-master   <none>

With OpenShift up and running we can move on to installing the Rook/Ceph cluster. The first step is to clone the Rook Git repo down to the master node and make an adjustment for the kubelet plugins. Please note I am cloning a colleague's Rook clone here, not the Rook project directly:

[origin@ocp-master ~]$ git clone https://github.com/ksingh7/ocp4-rook.git
[origin@ocp-master ~]$ sed -i.bak s+/etc/kubernetes/kubelet-plugins/volume/exec+/usr/libexec/kubernetes/kubelet-plugins/volume/exec+g /home/origin/ocp4-rook/ceph/operator.yaml

With the repository cloned we can now apply the security context constraints needed by the Rook pods using scc.yaml and then launch the Rook operator with operator.yaml:

[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/scc.yaml
[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/operator.yaml

Let's validate the Rook operator came up:

[origin@ocp-master ~]$ oc get pods -n rook-ceph-system 
NAME                                 READY     STATUS    RESTARTS   AGE
rook-ceph-agent-77x5n                1/1       Running   0          1h
rook-ceph-agent-cdvqr                1/1       Running   0          1h
rook-ceph-agent-gz7tl                1/1       Running   0          1h
rook-ceph-agent-rsbwh                1/1       Running   0          1h
rook-ceph-operator-b76466dcd-zmscb   1/1       Running   0          1h
rook-discover-6p5ht                  1/1       Running   0          1h
rook-discover-fnrf4                  1/1       Running   0          1h
rook-discover-grr5w                  1/1       Running   0          1h
rook-discover-mllt7                  1/1       Running   0          1h

Once the operator is up we can proceed with deploying the Ceph cluster, and once that is up, deploy the Ceph toolbox pod:

[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/cluster.yaml  
[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/toolbox.yaml

Let's validate the Ceph cluster is up:

[origin@ocp-master ~]$ oc get pods -n rook-ceph
NAME                                     READY     STATUS      RESTARTS   AGE
rook-ceph-mgr-a-785ddd6d6c-d4w56         1/1       Running     0          1h
rook-ceph-mon-a-67855c796b-sdvqm         1/1       Running     0          1h
rook-ceph-mon-b-6d58cd7656-xkrdz         1/1       Running     0          1h
rook-ceph-mon-c-869b8d9d9-m7544          1/1       Running     0          1h
rook-ceph-osd-0-d6cbd5776-987p9          1/1       Running     0          1h
rook-ceph-osd-1-cfddf997-pzq69           1/1       Running     0          1h
rook-ceph-osd-2-79fc94c6d5-krtnj         1/1       Running     0          1h
rook-ceph-osd-3-f9b55c4d6-7jp7c          1/1       Running     0          1h
rook-ceph-osd-prepare-ocp-master-ztmhs   0/2       Completed   0          1h
rook-ceph-osd-prepare-ocp-node1-mgbcd    0/2       Completed   0          1h
rook-ceph-osd-prepare-ocp-node2-98rtw    0/2       Completed   0          1h
rook-ceph-osd-prepare-ocp-node3-ngscg    0/2       Completed   0          1h
rook-ceph-tools                          1/1       Running     0          1h

Let's also validate from the Ceph toolbox that the cluster health is OK:

[origin@ocp-master ~]$ oc -n rook-ceph rsh rook-ceph-tools
sh-4.2# ceph status
  cluster:
    id:     6ddab3e4-1730-412f-89b8-0738708adac8
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum b,a,c
    mgr: a(active)
    osd: 4 osds: 4 up, 4 in
 
  data:
    pools:   1 pools, 100 pgs
    objects: 281  objects, 1.1 GiB
    usage:   51 GiB used, 169 GiB / 220 GiB avail
    pgs:     100 active+clean


Now that we have confirmed the Ceph cluster is deployed, let's configure a Ceph storage class and make it the default storage class for the environment:

[origin@ocp-master ~]$ oc create -f /home/origin/ocp4-rook/ceph/storageclass.yaml
[origin@ocp-master ~]$ oc patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

And now if we display the storage classes we can see Rook/Ceph is our default:

[origin@ocp-master ~]$ oc get storageclass
NAME                        PROVISIONER          AGE
rook-ceph-block (default)   ceph.rook.io/block   6h


Proceeding with our stack installation, let's get CNV installed. Reusing the Ansible inventory we built earlier for OpenShift makes this a relatively easy task:

[origin@ocp-master ~]$ oc login -u system:admin
[origin@ocp-master ~]$ cd /usr/share/ansible/kubevirt-ansible
[origin@ocp-master ~]$ ansible-playbook -i /etc/ansible/hosts -e @vars/cnv.yml playbooks/kubevirt.yml -e apb_action=provision

Once the installation completes, let's run the following command to ensure the pods for CNV are up:

[origin@ocp-master ~]$ oc get pods --all-namespaces -o wide|egrep "kubevirt|cdi"
cdi                                 cdi-apiserver-7bfd97d585-tqjgt                 1/1       Running     0          6h        10.129.0.11     ocp-node3    
cdi                                 cdi-deployment-6689fcb476-4klcj                1/1       Running     0          6h        10.131.0.12     ocp-node1    
cdi                                 cdi-operator-5889d7588c-wvgl4                  1/1       Running     0          6h        10.130.0.12     ocp-node2    
cdi                                 cdi-uploadproxy-79c9fb9f59-pkskw               1/1       Running     0          6h        10.129.0.13     ocp-node3    
cdi                                 virt-launcher-f29vm-h6mc9                      1/1       Running     0          6h        10.129.0.15     ocp-node3    
kubevirt-web-ui                     console-854d4585c8-hgdhv                       1/1       Running     0          6h        10.129.0.10     ocp-node3    
kubevirt-web-ui                     kubevirt-web-ui-operator-6b4574bb95-bmsw7      1/1       Running     0          6h        10.130.0.11     ocp-node2    
kubevirt                            kubevirt-cpu-node-labeller-fvx9n               1/1       Running     0          6h        10.128.0.29     ocp-master   
kubevirt                            kubevirt-cpu-node-labeller-jr858               1/1       Running     0          6h        10.131.0.13     ocp-node1    
kubevirt                            kubevirt-cpu-node-labeller-tgq5g               1/1       Running     0          6h        10.129.0.14     ocp-node3    
kubevirt                            kubevirt-cpu-node-labeller-xqpbl               1/1       Running     0          6h        10.130.0.13     ocp-node2    
kubevirt                            virt-api-865b95d544-hg58l                      1/1       Running     0          6h        10.129.0.8      ocp-node3    
kubevirt                            virt-api-865b95d544-jrkxh                      1/1       Running     0          6h        10.131.0.10     ocp-node1    
kubevirt                            virt-controller-5c89d4978d-q79lh               1/1       Running     0          6h        10.130.0.8      ocp-node2    
kubevirt                            virt-controller-5c89d4978d-t58l7               1/1       Running     0          6h        10.130.0.10     ocp-node2    
kubevirt                            virt-handler-gblbk                             1/1       Running     0          6h        10.128.0.28     ocp-master   
kubevirt                            virt-handler-jnwx6                             1/1       Running     0          6h        10.130.0.9      ocp-node2    
kubevirt                            virt-handler-r94fb                             1/1       Running     0          6h        10.129.0.9      ocp-node3    
kubevirt                            virt-handler-z7775                             1/1       Running     0          6h        10.131.0.11     ocp-node1    
kubevirt                            virt-operator-68984b585c-265bq                 1/1       Running     0          6h        10.129.0.7      ocp-node3    

Now that CNV is up and running, let's pull down a Fedora 29 image and upload it into a PVC of the default storage class, which of course is Rook/Ceph:

[origin@ocp-master ~]$ curl -L -o /home/origin/f29.qcow2 http://ftp.usf.edu/pub/fedora/linux/releases/29/Cloud/x86_64/images/Fedora-Cloud-Base-29-1.2.x86_64.qcow2
[origin@ocp-master ~]$ virtctl image-upload --pvc-name=f29vm --pvc-size=5Gi --image-path=/home/origin/f29.qcow2 --uploadproxy-url=https://`oc describe route cdi-uploadproxy-route|grep Endpoints|cut -f2` --insecure
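If grepping the route description feels brittle, the uploadproxy host can also be pulled with a jsonpath query. This is just a sketch of an alternative; it assumes the cdi-uploadproxy-route route lives in the cdi namespace, so adjust the namespace to your deployment:

```shell
# Look up the uploadproxy route host directly (namespace is an assumption)
UPLOADPROXY=$(oc get route cdi-uploadproxy-route -n cdi -o jsonpath='{.spec.host}')

# Same upload as above, using the resolved host
virtctl image-upload --pvc-name=f29vm --pvc-size=5Gi \
  --image-path=/home/origin/f29.qcow2 \
  --uploadproxy-url=https://${UPLOADPROXY} --insecure
```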

We can execute the following to confirm the PVC has been created:

[origin@ocp-master ~]$ oc get pvc
NAME      STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
f29vm     Bound     pvc-4815df9e-4987-11e9-a732-525400767d62   5Gi        RWO            rook-ceph-block   6h

Besides the PVC we will also need a virtual machine configuration YAML file. The one below is the example that will be used in this demonstration:

apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  creationTimestamp: null
  labels:
    kubevirt-vm: f29vm
  name: f29vm
spec:
  running: true
  template:
    metadata:
      creationTimestamp: null
      labels:
        kubevirt.io/domain: f29vm
    spec:
      domain:
        cpu:
          cores: 2
        devices:
          disks:
          - disk:
              bus: virtio
            name: osdisk
            volumeName: osdisk
          - disk:
              bus: virtio
            name: cloudinitdisk
            volumeName: cloudinitvolume
          interfaces:
          - name: default
            bridge: {}
        resources:
          requests:
            memory: 1024M
      terminationGracePeriodSeconds: 0
      volumes:
      - name: osdisk
        persistentVolumeClaim:
          claimName: f29vm
      - name: cloudinitdisk
        cloudInitNoCloud:
          userData: |-
            #cloud-config
            password: ${PASSWORD}
            disable_root: false
            chpasswd: { expire: False }
            ssh_authorized_keys:
            - "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDUs1KbLraX74mBM/ksoGwbsEejfpCVeMzbW7JLJjGXF8G1jyVAE3T0Uf5mO8nbNOfkjAjw24lxSsEScF2wslBzA5MIm+GB6Z+ZzR55FcRlZeouGVrfLmb67mYc2c/F/mq35TruHdRk2G5Y0+6cf8cfDs414+yiVA0heHQvWNfO7kb1z9kIOhyD6OOwdNT5jK/1O0+p6SdP+pEal51BsEf6GRGYLWc9SLIEcqtjoprnundr5UPvmC1l/pkqFQigMehwhthrdXC4GseWiyj9CnBkccxQCKvHjzko/wqsWGQLwDG3pBsHhthvbY0G5+VPB9a8YV58WJhC6nHpUTDA8jpB origin@ocp-master"
      networks:
      - name: default
        pod: {}

At this point we have all the necessary components to launch our containerized virtual machine instance. The following command does the creation using the YAML file we created in the previous step:

[origin@ocp-master ~]$ oc create -f /home/origin/f29vm.yaml

There are multiple ways to validate that the virtual machine has been instantiated. I like to do the following to confirm the instance is running and has an IP address:

[origin@ocp-master ~]$ oc get vms
NAME      AGE       RUNNING   VOLUME
f29vm     6h        true      
[origin@ocp-master ~]$ oc get vmi
NAME      AGE       PHASE     IP            NODENAME
f29vm     6h        Running   10.129.0.15   ocp-node3

As a final step, we can log into the instance, assuming an SSH key was set in the YAML file:

[origin@ocp-master ~]$ ssh -i /home/origin/.ssh/id_rsa -o "StrictHostKeyChecking=no" fedora@10.129.0.15
[fedora@f29vm ~]$ cat /etc/fedora-release
Fedora release 29 (Twenty Nine)

Hopefully this demonstrated how easy it is to get OpenShift, Rook and CNV up and running, and how one can leverage Rook storage as the backend for a virtual instance spun up in CNV. What is awesome is that I have taken the steps above and put them into a DCI job where I can automatically rerun the deployment against newer versions of the code base for testing. If you are not familiar with DCI, I will leave you with this teaser link: https://doc.distributed-ci.io/

Wednesday, January 30, 2019

Replace Failed OSD in Rook Deployed Ceph


If you have been reading some of my recent articles on Rook, you have seen how to install a Ceph cluster with Rook on Kubernetes. This article extends that Kubernetes installation and discusses how to replace a failed OSD in the Ceph cluster.

First let's review our current running Ceph cluster, observing the rook-ceph-system and rook-ceph namespaces and, inside the toolbox, the Ceph status:

# kubectl get pods --all-namespaces -o wide
NAMESPACE          NAME                                      READY   STATUS      RESTARTS   AGE    IP            NODE          NOMINATED NODE   READINESS GATES
kube-system        coredns-86c58d9df4-22fps                  1/1     Running     4          3d2h   10.244.3.55   kube-node3               
kube-system        coredns-86c58d9df4-jp2zb                  1/1     Running     6          3d2h   10.244.2.66   kube-node2               
kube-system        etcd-kube-master                          1/1     Running     3          3d5h   10.0.0.81     kube-master              
kube-system        kube-apiserver-kube-master                1/1     Running     3          3d5h   10.0.0.81     kube-master              
kube-system        kube-controller-manager-kube-master       1/1     Running     5          3d5h   10.0.0.81     kube-master              
kube-system        kube-flannel-ds-amd64-5m9x5               1/1     Running     6          3d5h   10.0.0.83     kube-node2               
kube-system        kube-flannel-ds-amd64-7xgf4               1/1     Running     3          3d5h   10.0.0.81     kube-master              
kube-system        kube-flannel-ds-amd64-dhdzm               1/1     Running     5          3d2h   10.0.0.84     kube-node3               
kube-system        kube-flannel-ds-amd64-m6fx5               1/1     Running     3          3d5h   10.0.0.82     kube-node1               
kube-system        kube-proxy-bnbzn                          1/1     Running     3          3d5h   10.0.0.82     kube-node1               
kube-system        kube-proxy-gjxlg                          1/1     Running     4          3d2h   10.0.0.84     kube-node3               
kube-system        kube-proxy-kkxdb                          1/1     Running     3          3d5h   10.0.0.81     kube-master              
kube-system        kube-proxy-knzsl                          1/1     Running     6          3d5h   10.0.0.83     kube-node2               
kube-system        kube-scheduler-kube-master                1/1     Running     4          3d5h   10.0.0.81     kube-master              
rook-ceph-system   rook-ceph-agent-748v8                     1/1     Running     0          103m   10.0.0.83     kube-node2               
rook-ceph-system   rook-ceph-agent-9vznf                     1/1     Running     0          103m   10.0.0.82     kube-node1               
rook-ceph-system   rook-ceph-agent-hfdv6                     1/1     Running     0          103m   10.0.0.81     kube-master              
rook-ceph-system   rook-ceph-agent-lfh7m                     1/1     Running     0          103m   10.0.0.84     kube-node3               
rook-ceph-system   rook-ceph-operator-76cf7f88f-qmvn5        1/1     Running     0          103m   10.244.1.65   kube-node1               
rook-ceph-system   rook-discover-25h5z                       1/1     Running     0          103m   10.244.1.66   kube-node1               
rook-ceph-system   rook-discover-dcm7k                       1/1     Running     0          103m   10.244.0.41   kube-master              
rook-ceph-system   rook-discover-t4qs7                       1/1     Running     0          103m   10.244.3.61   kube-node3               
rook-ceph-system   rook-discover-w2nv5                       1/1     Running     0          103m   10.244.2.72   kube-node2               
rook-ceph          rook-ceph-mgr-a-8649f78d9b-k6gwl          1/1     Running     0          100m   10.244.3.62   kube-node3               
rook-ceph          rook-ceph-mon-a-576d9d49cc-q9pm6          1/1     Running     0          101m   10.244.0.42   kube-master              
rook-ceph          rook-ceph-mon-b-85f7b6cb6b-pnrhs          1/1     Running     0          101m   10.244.1.67   kube-node1               
rook-ceph          rook-ceph-mon-c-668f7f658d-hjf2v          1/1     Running     0          101m   10.244.2.74   kube-node2               
rook-ceph          rook-ceph-osd-0-6f76d5cc4c-t75gg          1/1     Running     0          100m   10.244.2.76   kube-node2               
rook-ceph          rook-ceph-osd-1-5759cd47c4-szvfg          1/1     Running     0          100m   10.244.3.64   kube-node3               
rook-ceph          rook-ceph-osd-2-6d69b78fbf-7s4bm          1/1     Running     0          100m   10.244.0.44   kube-master              
rook-ceph          rook-ceph-osd-3-7b457fc56d-22gw6          1/1     Running     0          100m   10.244.1.69   kube-node1               
rook-ceph          rook-ceph-osd-prepare-kube-master-72kfz   0/2     Completed   0          100m   10.244.0.43   kube-master              
rook-ceph          rook-ceph-osd-prepare-kube-node1-jp68h    0/2     Completed   0          100m   10.244.1.68   kube-node1               
rook-ceph          rook-ceph-osd-prepare-kube-node2-j89pc    0/2     Completed   0          100m   10.244.2.75   kube-node2               
rook-ceph          rook-ceph-osd-prepare-kube-node3-drh4t    0/2     Completed   0          100m   10.244.3.63   kube-node3               
rook-ceph          rook-ceph-tools-76c7d559b6-qvh2r          1/1     Running     0          6s     10.0.0.82     kube-node1               

# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

# ceph status
  cluster:
    id:     edc7cac7-21a3-45ae-80a9-5d470afb7576
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum c,a,b
    mgr: a(active)
    osd: 4 osds: 4 up, 4 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0  objects, 0 B
    usage:   17 GiB used, 123 GiB / 140 GiB avail
    pgs:     
 
# ceph osd tree  
ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF 
-1       0.13715 root default                                 
-5       0.03429     host kube-master                         
 2   hdd 0.03429         osd.2            up  1.00000 1.00000 
-4       0.03429     host kube-node1                          
 3   hdd 0.03429         osd.3            up  1.00000 1.00000 
-2       0.03429     host kube-node2                          
 0   hdd 0.03429         osd.0            up  1.00000 1.00000 
-3       0.03429     host kube-node3                          
 1   hdd 0.03429         osd.1            up  1.00000 1.00000 

At this point the Ceph cluster is clean and in a healthy state. However, I am going to introduce some chaos that will cause osd.1 to go down. Since this is a virtual lab, I am going to simply kill the OSD process and clear out osd.1's data to mimic a failed drive.
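For reference, the failure injection looked roughly like the following, run on the node hosting osd.1 (kube-node3 here). The process pattern and data path are assumptions based on a default Rook host-path deployment of this era (dataDirHostPath of /var/lib/rook), so treat this as a sketch and adjust to your environment:

```shell
# Kill the running ceph-osd daemon for OSD 1 (match pattern is an assumption)
sudo pkill -f 'ceph-osd.*--id 1'

# Wipe the OSD's backing data so it looks like a failed/blank drive
# (path assumes Rook's default dataDirHostPath of /var/lib/rook)
sudo rm -rf /var/lib/rook/osd1/*
```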

Now when we look at the cluster state in the toolbox we can see osd.1 is down:

# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

# ceph status
  cluster:
    id:     edc7cac7-21a3-45ae-80a9-5d470afb7576
    health: HEALTH_WARN
            1 osds down
            1 host (1 osds) down
 
  services:
    mon: 3 daemons, quorum c,a,b
    mgr: a(active)
    osd: 4 osds: 3 up, 4 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0  objects, 0 B
    usage:   17 GiB used, 123 GiB / 140 GiB avail
    pgs:     
 
[root@kube-node1 /]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF 
-1       0.13715 root default                                 
-5       0.03429     host kube-master                         
 2   hdd 0.03429         osd.2            up  1.00000 1.00000 
-4       0.03429     host kube-node1                          
 3   hdd 0.03429         osd.3            up  1.00000 1.00000 
-2       0.03429     host kube-node2                          
 0   hdd 0.03429         osd.0            up  1.00000 1.00000 
-3       0.03429     host kube-node3                          
 1   hdd 0.03429         osd.1          down  1.00000 1.00000 

Given I removed the contents of the OSD, let's go ahead and replace the failed drive. The first step is to go into the toolbox and run the usual commands to remove a Ceph OSD from the cluster:

# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

# ceph osd out osd.1
marked out osd.1. 

# ceph osd crush remove osd.1
removed item id 1 name 'osd.1' from crush map

# ceph auth del osd.1
updated

# ceph osd rm osd.1
removed osd.1

Let's exit out of the toolbox, go back to the master node command line, and delete the Ceph OSD 1 deployment:

# kubectl delete deployment -n rook-ceph rook-ceph-osd-1
deployment.extensions "rook-ceph-osd-1" deleted

Now would be the time to replace the physically failed disk. In my case the disk is still good; I just simulated the failure by downing the OSD process and removing the data.
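If you are reusing the same disk, or the replacement disk has ever held data, it is worth zapping it first so the rook-ceph-osd-prepare job sees a clean device. A minimal sketch, assuming the disk shows up as /dev/vdb (a hypothetical device name; double-check before running, as this is destructive):

```shell
# Destroy any existing partition table and GPT structures on the disk
# (/dev/vdb is a hypothetical device name for this lab)
sudo sgdisk --zap-all /dev/vdb

# Clear the start of the disk to remove leftover filesystem/LVM signatures
sudo dd if=/dev/zero of=/dev/vdb bs=1M count=100 oflag=direct,dsync
```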

To get the new disk back into the cluster we only need to restart the rook-ceph-operator pod, which we can do in Kubernetes with the following scale deployment commands:

# kubectl scale deployment rook-ceph-operator --replicas=0 -n rook-ceph-system
deployment.extensions/rook-ceph-operator scaled

# kubectl get pods --all-namespaces -o wide|grep operator

# kubectl scale deployment rook-ceph-operator --replicas=1 -n rook-ceph-system
deployment.extensions/rook-ceph-operator scaled

# kubectl get pods --all-namespaces -o wide|grep operator
rook-ceph-system   rook-ceph-operator-76cf7f88f-g9pxr        0/1     ContainerCreating   0          2s    <none>        kube-node2    <none>           <none>

When the rook-ceph-operator restarts, it re-runs each rook-ceph-osd-prepare container, which scans its host for any disks that should be incorporated into the cluster based on the original cluster.yaml settings used when the Ceph cluster was deployed with Rook. In this case it will see the new disk on kube-node3 and incorporate it as OSD 1.

We can confirm this by checking that a new container for OSD 1 was spawned, and by logging into the toolbox and running the familiar Ceph commands:

# kubectl get pods -n rook-ceph -o wide
NAME                                      READY   STATUS      RESTARTS   AGE     IP            NODE          NOMINATED NODE   READINESS GATES
rook-ceph-mgr-a-8649f78d9b-k6gwl          1/1     Running     0          110m    10.244.3.62   kube-node3    <none>           <none>
rook-ceph-mon-a-576d9d49cc-q9pm6          1/1     Running     0          110m    10.244.0.42   kube-master   <none>           <none>
rook-ceph-mon-b-85f7b6cb6b-pnrhs          1/1     Running     0          110m    10.244.1.67   kube-node1    <none>           <none>
rook-ceph-mon-c-668f7f658d-hjf2v          1/1     Running     0          110m    10.244.2.74   kube-node2    <none>           <none>
rook-ceph-osd-0-6f76d5cc4c-t75gg          1/1     Running     0          109m    10.244.2.76   kube-node2    <none>           <none>
rook-ceph-osd-1-69f5d5ffd-kndd7           1/1     Running     0          67s     10.244.3.68   kube-node3    <none>           <none>
rook-ceph-osd-2-6d69b78fbf-7s4bm          1/1     Running     0          109m    10.244.0.44   kube-master   <none>           <none>
rook-ceph-osd-3-7b457fc56d-22gw6          1/1     Running     0          109m    10.244.1.69   kube-node1    <none>           <none>
rook-ceph-osd-prepare-kube-master-n2t7g   0/2     Completed   0          79s     10.244.0.47   kube-master   <none>           <none>
rook-ceph-osd-prepare-kube-node1-ttznt    0/2     Completed   0          77s     10.244.1.72   kube-node1    <none>           <none>
rook-ceph-osd-prepare-kube-node2-9kxcl    0/2     Completed   0          75s     10.244.2.79   kube-node2    <none>           <none>
rook-ceph-osd-prepare-kube-node3-cpf4s    0/2     Completed   0          73s     10.244.3.66   kube-node3    <none>           <none>
rook-ceph-tools-76c7d559b6-qvh2r          1/1     Running     0          9m28s   10.0.0.82     kube-node1    <none>           <none>

# ceph status
  cluster:
    id:     edc7cac7-21a3-45ae-80a9-5d470afb7576
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum c,a,b
    mgr: a(active)
    osd: 4 osds: 4 up, 4 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0  objects, 0 B
    usage:   17 GiB used, 123 GiB / 140 GiB avail
    pgs:     

# ceph osd tree 
ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF 
-1       0.13715 root default                                 
-5       0.03429     host kube-master                         
 2   hdd 0.03429         osd.2            up  1.00000 1.00000 
-4       0.03429     host kube-node1                          
 3   hdd 0.03429         osd.3            up  1.00000 1.00000 
-2       0.03429     host kube-node2                          
 0   hdd 0.03429         osd.0            up  1.00000 1.00000 
-3       0.03429     host kube-node3                          
 1       0.03429         osd.1            up  1.00000 1.00000 

As you can see, replacing a failed OSD with Rook is about as uneventful as replacing one in a conventionally deployed Ceph cluster. Hopefully this demonstration proved the point.

Further Reading:

Rook: https://github.com/rook/rook


Rook & Ceph on Kubernetes


In a previous article I wrote about using Rook to deploy a Ceph storage cluster within Minikube (link below). That post described what Rook provides and demonstrated how easily an all-in-one Ceph cluster can be set up. However, I wanted to explore Rook further in a multi-node configuration and see how it integrates with applications in Kubernetes.

First I needed to set up a base Kubernetes environment consisting of one master and three worker nodes. I used the following steps on all nodes to prepare them for Kubernetes: add each hostname to the hosts file, disable SELinux and swap, enable br_netfilter, install supporting utilities, enable the Kubernetes repo, install Docker, install the Kubernetes binaries, and enable or disable the relevant services.

# echo "10.0.0.81   kube-master" >> /etc/hosts
# echo "10.0.0.82   kube-node1" >> /etc/hosts
# echo "10.0.0.83   kube-node2" >> /etc/hosts
# echo "10.0.0.84   kube-node3" >> /etc/hosts
# setenforce 0
# sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
# swapoff -a
# sed -i.bak -r 's/(.+ swap .+)/#\1/' /etc/fstab
# modprobe br_netfilter
# echo '1' > /proc/sys/net/bridge/bridge-nf-call-iptables
# echo 'br_netfilter' > /etc/modules-load.d/netfilter.conf
# echo net.bridge.bridge-nf-call-iptables=1 >> /etc/sysctl.d/10-bridge-nf-call-iptables.conf
# dnf install -y yum-utils device-mapper-persistent-data lvm2
# dnf install -y docker
# cat > /etc/yum.repos.d/kubernetes.repo <<EOF
> [kubernetes]
> name=Kubernetes
> baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
> enabled=1
> gpgcheck=1
> repo_gpgcheck=1
> gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
>         https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
> EOF
# dnf install -y kubelet kubeadm kubectl
# systemctl enable docker ; systemctl start docker ; systemctl enable kubelet ; systemctl start kubelet ; systemctl stop firewalld ; systemctl disable firewalld
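The swap step above is easy to get wrong, so the sed expression that masks the fstab entry can be dry-run against a scratch copy first (the device names below are made up for illustration):

```shell
# Dry-run the swap-masking sed expression against a scratch fstab copy.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
/dev/mapper/rhel-root   /       xfs     defaults        0 0
/dev/mapper/rhel-swap   swap    swap    defaults        0 0
EOF
sed -i.bak -r 's/(.+ swap .+)/#\1/' "$tmp"
grep swap "$tmp"    # shows the swap entry commented out
```

The -i.bak suffix keeps an untouched copy, the same safety net the real command leaves behind as /etc/fstab.bak.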

Once the prerequisites are met on each node, let's initialize the cluster on the master node:

# kubeadm init --apiserver-advertise-address=10.0.0.81 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.13.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kube-master localhost] and IPs [10.0.0.81 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kube-master localhost] and IPs [10.0.0.81 127.0.0.1 ::1]
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kube-master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.0.0.81]
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 19.511836 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kube-master" as an annotation
[mark-control-plane] Marking the node kube-master as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kube-master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: etmucm.238nrw6a48yu0njb
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join 10.0.0.81:6443 --token etmucm.238nrw6a48yu0njb --discovery-token-ca-cert-hash sha256:963d6d9d31f2db9debfaa600ef802d05c448f7dc9e9cb92aec268cf2a8cfee7b

After the master is up and running you can join the remaining nodes using the following command which was presented in the output when you initialized the master:

# kubeadm join 10.0.0.81:6443 --token etmucm.238nrw6a48yu0njb --discovery-token-ca-cert-hash sha256:963d6d9d31f2db9debfaa600ef802d05c448f7dc9e9cb92aec268cf2a8cfee7b
[preflight] Running pre-flight checks
[discovery] Trying to connect to API Server "10.0.0.81:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.0.0.81:6443"
[discovery] Requesting info from "https://10.0.0.81:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.0.0.81:6443"
[discovery] Successfully established connection with API Server "10.0.0.81:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "kube-node1" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

I like to do some housekeeping once all my nodes are joined, which includes enabling scheduling on the master (by removing its NoSchedule taint) and labeling the worker nodes as such:

# kubectl taint node kube-master node-role.kubernetes.io/master:NoSchedule-
# kubectl label node kube-node1 node-role.kubernetes.io/worker=worker
# kubectl label node kube-node2 node-role.kubernetes.io/worker=worker
# kubectl label node kube-node3 node-role.kubernetes.io/worker=worker
 
Once you have joined the nodes you should have a cluster that looks like this:

# kubectl get nodes
NAME          STATUS   ROLES    AGE   VERSION
kube-master   Ready    master   19h   v1.13.2
kube-node1    Ready    worker   19h   v1.13.2
kube-node2    Ready    worker   19h   v1.13.2
kube-node3    Ready    worker   17h   v1.13.2
 
Next let's deploy Flannel for networking:

# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

And finally let's deploy Rook and the Ceph cluster using the familiar steps from my previous article:

# git clone https://github.com/rook/rook.git
# cd ./rook/cluster/examples/kubernetes/ceph
# sed -i.bak s+/var/lib/rook+/data/rook+g cluster.yaml
# kubectl create -f operator.yaml
# kubectl create -f cluster.yaml
# kubectl create -f toolbox.yaml
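As an aside, the sed command above uses + as the substitution delimiter so the slashes inside the paths need no escaping; the trick is easy to verify against a sample cluster.yaml line:

```shell
# With '+' as the delimiter, the '/' characters in the paths need no escaping.
echo '  dataDirHostPath: /var/lib/rook' | sed 's+/var/lib/rook+/data/rook+g'
```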
 
Once all the containers have spun up you should have something that looks like the following:
 
# kubectl get pod -n rook-ceph -o wide
NAME                                      READY   STATUS      RESTARTS   AGE   IP            NODE          NOMINATED NODE   READINESS GATES
rook-ceph-mgr-a-8649f78d9b-txsfm          1/1     Running     1          19h   10.244.2.12   kube-node2    <none>           <none>
rook-ceph-mon-a-598b7bd4cd-kpxnx          1/1     Running     0          19h   10.244.0.3    kube-master   <none>           <none>
rook-ceph-mon-c-759b8984f5-ggzjb          1/1     Running     1          19h   10.244.2.15   kube-node2    <none>           <none>
rook-ceph-mon-d-77d55dcddf-mwnf8          1/1     Running     0          16h   10.244.3.3    kube-node3    <none>           <none>
rook-ceph-osd-0-77b448bbcc-mdhsw          1/1     Running     1          19h   10.244.2.14   kube-node2    <none>           <none>
rook-ceph-osd-1-65db4b7c5d-hgfcj          1/1     Running     0          16h   10.244.1.8    kube-node1    <none>           <none>
rook-ceph-osd-2-5b475cb56c-x5w6n          1/1     Running     0          19h   10.244.0.5    kube-master   <none>           <none>
rook-ceph-osd-3-657789944d-swjxd          1/1     Running     0          16h   10.244.3.6    kube-node3    <none>           <none>
rook-ceph-osd-prepare-kube-master-tlhxf   0/2     Completed   0          16h   10.244.0.6    kube-master   <none>           <none>
rook-ceph-osd-prepare-kube-node1-lgtrf    0/2     Completed   0          16h   10.244.1.12   kube-node1    <none>           <none>
rook-ceph-osd-prepare-kube-node2-5tbt6    0/2     Completed   0          16h   10.244.2.17   kube-node2    <none>           <none>
rook-ceph-osd-prepare-kube-node3-rrp4z    0/2     Completed   0          16h   10.244.3.5    kube-node3    <none>           <none>
rook-ceph-tools-76c7d559b6-7kprh          1/1     Running     0          16h   10.0.0.84     kube-node3    <none>           <none>

And of course we can validate the Ceph cluster is up and healthy via the toolbox container as well:

# kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash

# ceph status
  cluster:
    id:     4be6e204-3d82-4cc4-9ea4-57f0e71f99c5
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum d,a,c
    mgr: a(active)
    osd: 4 osds: 4 up, 4 in
 
  data:
    pools:   0 pools, 0 pgs
    objects: 0  objects, 0 B
    usage:   17 GiB used, 123 GiB / 140 GiB avail
    pgs:     
 
# ceph osd tree
ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
-1       0.13715 root default                                 
-4       0.03429     host kube-master                         
 2   hdd 0.03429         osd.2            up  1.00000 1.00000
-3       0.03429     host kube-node1                          
 1   hdd 0.03429         osd.1            up  1.00000 1.00000
-2       0.03429     host kube-node2                          
 0   hdd 0.03429         osd.0            up  1.00000 1.00000
-9       0.03429     host kube-node3                          
 3   hdd 0.03429         osd.3            up  1.00000 1.00000

Everything we have done up to this point has been very similar to what I did in the previous article with Minikube, except we now have a multi-node configuration instead of a single node. Let's take it a step further and have an application consume our Ceph storage cluster.

The first step in Kubernetes is to create a storageclass.yaml that uses Ceph. Populate storageclass.yaml with the following:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
   name: rook-ceph-block
provisioner: ceph.rook.io/block
parameters:
  blockPool: replicapool
  # The value of "clusterNamespace" MUST be the same as the namespace in which your Rook cluster exists
  clusterNamespace: rook-ceph
  # Specify the filesystem type of the volume. If not specified, it will use `ext4`.
  fstype: xfs
# Optional, default reclaimPolicy is "Delete". Other options are: "Retain", "Recycle" as documented in https://kubernetes.io/docs/concepts/storage/storage-classes/
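With the storage class in place, a workload requests Ceph-backed storage simply by naming the class in a claim. Here is a minimal sketch of such a PVC; the claim name and size are placeholders for illustration, not part of the Rook examples:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim            # hypothetical name
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi            # example size
```

Because we also mark the class as the cluster default below, claims that omit storageClassName entirely will land on it as well.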

Next let's create the storage class using the YAML we just wrote and set it as the default:

# kubectl create -f storageclass.yaml
cephblockpool.ceph.rook.io/replicapool created
storageclass.storage.k8s.io/rook-ceph-block created

# kubectl get storageclass
NAME              PROVISIONER          AGE
rook-ceph-block   ceph.rook.io/block   61s

# kubectl patch storageclass rook-ceph-block -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/rook-ceph-block patched

# kubectl get storageclass
NAME                        PROVISIONER          AGE
rook-ceph-block (default)   ceph.rook.io/block   3m30s

Now that we have a storage class backed by Ceph, we need an application to consume it. Thankfully the Rook git repo includes a couple of examples: WordPress and MySQL. Let's go ahead and create those apps as follows:

# cd ./rook/cluster/examples/kubernetes

# kubectl create -f mysql.yaml
service/wordpress-mysql created
persistentvolumeclaim/mysql-pv-claim created
deployment.apps/wordpress-mysql created

# kubectl create -f wordpress.yaml
service/wordpress created
persistentvolumeclaim/wp-pv-claim created
deployment.extensions/wordpress created

We can confirm that our two applications are running:

# kubectl get pods -n default -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP            NODE         NOMINATED NODE   READINESS GATES
wordpress-7b6c4c79bb-7b4dq         1/1     Running   0          68s     10.244.1.14   kube-node1   <none>           <none>
wordpress-mysql-6887bf844f-2m4h4   1/1     Running   0          2m47s   10.244.1.13   kube-node1   <none>           <none>

Now let's confirm they are actually using our Ceph storage class:

# kubectl get pvc

NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
mysql-pv-claim   Bound    pvc-0c0be0ec-2317-11e9-a462-5254003ede95   20Gi       RWO            rook-ceph-block   3m36s
wp-pv-claim      Bound    pvc-46b4b266-2317-11e9-a462-5254003ede95   20Gi       RWO            rook-ceph-block   118s

And let's also confirm WordPress is up and running from a user perspective. Note that in this example we do not have an external IP and can only access the service via the cluster IP:

# kubectl get svc wordpress
NAME        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
wordpress   LoadBalancer   10.104.120.47   <pending>     80:32592/TCP   19m
# curl -v http://10.104.120.47
* About to connect() to 10.104.120.47 port 80 (#0)
*   Trying 10.104.120.47...
* Connected to 10.104.120.47 (10.104.120.47) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.104.120.47
> Accept: */*
>
< HTTP/1.1 302 Found
< Date: Mon, 28 Jan 2019 16:30:16 GMT
< Server: Apache/2.4.10 (Debian)
< X-Powered-By: PHP/5.6.28
< Expires: Wed, 11 Jan 1984 05:00:00 GMT
< Cache-Control: no-cache, must-revalidate, max-age=0
< Location: http://10.104.120.47/wp-admin/install.php
< Content-Length: 0
< Content-Type: text/html; charset=UTF-8
<
* Connection #0 to host 10.104.120.47 left intact

We can see from the output above that we do connect but receive a 302 since WordPress still needs to be configured. It does confirm, however, that our applications are up and using the Ceph storage class.

To clean up the previous exercise, let's do the following:

# kubectl delete -f wordpress.yaml
service "wordpress" deleted
persistentvolumeclaim "wp-pv-claim" deleted
deployment.extensions "wordpress" deleted

# kubectl delete -f mysql.yaml
service "wordpress-mysql" deleted
persistentvolumeclaim "mysql-pv-claim" deleted
deployment.apps "wordpress-mysql" deleted

# kubectl delete -n rook-ceph cephblockpools.ceph.rook.io replicapool
cephblockpool.ceph.rook.io "replicapool" deleted

# kubectl delete storageclass rook-ceph-block
storageclass.storage.k8s.io "rook-ceph-block" deleted

The above example was just a simple demonstration of the block storage capabilities Rook/Ceph brings to Kubernetes, but it leaves one wondering what other possibilities there might be.

Further Reading:

Rook: https://github.com/rook/rook
Kubernetes: https://kubernetes.io/
Previous Article:  https://www.linkedin.com/pulse/deploying-ceph-rook-benjamin-schmaus/

Monday, May 25, 2015

UCS Vmedia Policy with XML & Perl


In Cisco UCS there is a concept called a vMedia policy.  This policy lets you designate a bootable ISO image served from another host over HTTP and make it available as a boot device for a Cisco UCS blade in UCSM.  The following script is a rough Perl framework that uses XML API calls to configure such a policy.  It could be extended to take inputs for the variables I pre-populate here as an example of how the framework works.


#!/usr/bin/perl
use strict;
use LWP::UserAgent;
use HTTP::Request::Common;
my $ucs = "https:///nuova";       # This is the URL to your UCSM IP
my $username = "admin";           # Admin or other user to manage UCSM
my $password = "password";        # Password for user above
my $server = "ls-servername0001"; # Server name as defined in UCS convention
my $server2 = "servername0001";   # Server name from friendly view
my $policyname = "$server2";      # Policy name (derived from server name in this example)
my $mntename = "$server2";        # Mount name (derived from server name in this example)
my $type = "cdd";                 # Policy mount type (in this case CD-ROM)
my $image = "$server2.iso";       # Name of ISO image
my $imagepath = "/";              # Image path within the URL of remote host serving ISO
my $mountproto = "http";          # Protocol used to access ISO image
my $remotehost = "";              # Remote host IP serving ISO image
my $serverdn = "org-root/org-corp/$server";     # Server DN within UCSM  

###  Everything below remains constant###
### Get Cookie ###

my ($xmlout,@xmlout,$cookie);
my $login = qq(<aaaLogin inName="$username" inPassword="$password" />);
my $userAgent = LWP::UserAgent->new;
my $response = $userAgent->request(POST $ucs, Content_Type => 'text/xml', Content => $login);

(@xmlout)= split(/\s+/,$response->content);

### Process Cookie ###

foreach $xmlout (@xmlout) {
        if ($xmlout =~ /outCookie/) {
                $cookie=$xmlout;
                $cookie =~ s/outCookie=\"|\"//g;
                print "$cookie\n";
        }
}

###Setup Vmedia Policy String###
# NOTE: the XML payloads in this script were stripped when the post was
# published; they are reconstructed below as illustrative configConfMo
# calls -- verify the exact class and attribute names against the Cisco
# UCSM XML API reference for your firmware release.

my $policydn = "org-root/org-corp/mnt-cfg-policy-$policyname";
my $crpolicy = qq(<configConfMo cookie="$cookie" dn="$policydn" inHierarchical="false"><inConfig><cimcvmediaMountConfigPolicy dn="$policydn" name="$policyname" /></inConfig></configConfMo>);

###Configure Mount Policy Within Vmedia Policy String###

my $mountentry = qq(<configConfMo cookie="$cookie" dn="$policydn/cfg-mnt-entry-$mntename" inHierarchical="false"><inConfig><cimcvmediaConfigMountVmediaEntry dn="$policydn/cfg-mnt-entry-$mntename" mappingName="$mntename" deviceType="$type" imageFileName="$image" imagePath="$imagepath" mountProtocol="$mountproto" remoteIpAddress="$remotehost" /></inConfig></configConfMo>);

###Add Vmedia Policy String###

my $addvmedia = qq(<configConfMo cookie="$cookie" dn="$serverdn" inHierarchical="false"><inConfig><lsServer dn="$serverdn" vmediaPolicyName="$policyname" /></inConfig></configConfMo>);

###Execute Setup Vmedia Policy String###

$response = $userAgent->request(POST $ucs, Content_Type => 'text/xml', Content => $crpolicy);
(@xmlout)= split(/\s+/,$response->content);
printxml();
print "\n";

###Execute Configure Mount Policy Within Vmedia Policy String###

$response = $userAgent->request(POST $ucs, Content_Type => 'text/xml', Content => $mountentry);
(@xmlout)= split(/\s+/,$response->content);
printxml();
print "\n";

###Execute Add Vmedia Policy String###

$response = $userAgent->request(POST $ucs, Content_Type => 'text/xml', Content => $addvmedia);
(@xmlout)= split(/\s+/,$response->content);
printxml();
print "\n";
exit;

###Parse XML response###

sub printxml {
        foreach $xmlout (@xmlout) {
                if ($xmlout =~ /=/) {
                        $xmlout =~ s/\'|\"|\/|\>//g;
                        $xmlout =~ s/=/ = /g;
                        print "$xmlout\n";
                }
        }
}
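The cookie handling above amounts to pulling the outCookie attribute out of the aaaLogin response. The same extraction can be done with a single sed expression, shown here against a canned response (the cookie value is invented):

```shell
# Extract the outCookie attribute from a sample aaaLogin response.
response='<aaaLogin response="yes" outCookie="1481121349/abcdef12" outRefreshPeriod="600" />'
echo "$response" | sed -n 's/.*outCookie="\([^"]*\)".*/\1/p'
```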

Sunday, May 24, 2015

Deleting Duplicate Hosts In Satellite or Spacewalk


Sometimes when you register hosts to Satellite or Spacewalk you end up with duplicate registrations: an old profile that no longer checks in and sits orphaned, and a new one that checks in and receives updates and packages. I saw this behavior a lot in environments where people were using Vagrant and/or OpenStack and would continuously relaunch the same host with the same hostname and register it to Satellite.

The script below can be used to clean out those duplicate hosts and can be run daily from cron. It assumes you run it as the root user and have configured root to run spacecmd without specifying a username and password at the command line. This was tested with Satellite 5.6.

#!/usr/bin/perl
### Delete duplicate hosts in Satellite or Spacewalk via Spacecmd ###
@duphosts = `spacecmd -q system_list | uniq -d`;
foreach $system (@duphosts) {
        chomp($system);
        $spacecmd = `spacecmd system_details $system 2>&1 |grep $system |grep =|sed 's/^.*=/=/'`;
        $spacecmd =~ s/\s+//g;
        $spacecmd =~ s/=//g;
        @ids = split(/\,/,$spacecmd);
        $count=0;
        foreach $ids (@ids) {
                chomp($ids);
                $count++;
                # Keep the final ID in the list (the active registration)
                # and delete the rest.
                if ($count > $#ids) {
                        print "Duplicates removed for system: $system\n";
                        last;
                }
                $cmd = `spacecmd -y system_delete $ids`;
                print "$cmd\n";
                sleep (2);
        }
        print "Cleanup of $system complete...\n";
}
print "Cleanup of Satellite complete!\n";
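The detection step works because `spacecmd -q system_list` prints one system name per line; once sorted, `uniq -d` emits only the names that appear more than once. With canned input:

```shell
# uniq -d keeps only lines that repeat in sorted input -- these are the
# duplicate hostnames the cleanup script iterates over.
printf 'db01\ndb01\ndb01\nweb01\nweb01\nweb02\n' | sort | uniq -d
```

Note that uniq -d is only reliable on sorted input, which is why piping through sort first is a sensible safeguard.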

Syncing Redhat Repos With Pulp


The following is a basic installation and configuration guide for setting up Pulp to pull package channels from Red Hat's CDN (Content Delivery Network), so that you can leverage Spacewalk, SUSE Manager, or another repository manager that would not normally be able to access Red Hat directly.

Assumptions: this guide assumes you are installing Pulp on Red Hat Enterprise Linux 6.6, although I don't see why it would not work on Red Hat 7.0 or another Linux distro for that matter. The only changes would be switching from init scripts to systemd in the documentation below.

1)  Register the host with Redhat directly to receive its updates:
      #subscription-manager register --force
      #subscription-manager refresh
      #subscription-manager subscribe --auto

2) Run an update to confirm you can access the Redhat repos properly:
     #yum upgrade

3) Install the Pulp repo and the Linux Epel repo as you will need packages from both:
    #rpm -Uvh https://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
    #rpm -Uvh http://repos.fedorapeople.org/repos/pulp/pulp/rhel-pulp.repo

4) Install, enable and install mongodb server:
    #yum install mongodb-server
    #service mongod start
    #chkconfig mongod on

5) Install, enable and start qpidd:
    #yum install qpid-cpp-server qpid-cpp-server-store
    #service qpidd start
    #chkconfig qpidd on

6) Install Pulp group of packages:
    #yum groupinstall pulp-server-qpid

7) Run Pulp database setup to populate Pulp database:
    #sudo -u apache pulp-manage-db

8) Enable and start the web service:
    #service httpd start
    #chkconfig httpd on

9) Enable and start pulp workers, celery beat and pulp resource manager:
    #chkconfig pulp_workers on
    #service pulp_workers start
    #chkconfig pulp_celerybeat on
    #service pulp_celerybeat start
    #chkconfig pulp_resource_manager on 
    #service pulp_resource_manager start

10) Install pulp-admin packages:
      #yum groupinstall pulp-admin

11) Install Pulp consumer qpid package:
      #yum groupinstall pulp-consumer-qpid

12) Edit Pulp admin.conf, consumer.conf and agent.conf to your specifications:
      #vi /etc/pulp/admin/admin.conf
      #vi /etc/pulp/consumer/consumer.conf
      #vi /etc/pulp/agent/agent.conf

At this point Pulp should be ready to consume something from the Redhat Content Delivery Network.   Lets see what setting up a sync looks like in the following steps.

1) Create a repo with the feed location and the correct certs and keys related to accessing that repo feed.
    #pulp-admin rpm repo create --repo-id=rhel-6-server-rpms --feed=https://cdn.redhat.com/content/dist/rhel/server/6/6Server/x86_64/os --feed-ca-cert=/etc/rhsm/ca/redhat-uep.pem --feed-key=/etc/pki/entitlement/5161085288703435774-key.pem --feed-cert=/etc/pki/entitlement/5161085288703435774.pem

2) (Optional) Configure the repo you created with the number of download workers and max download speed.   This helps if you are pulling packages down over a smaller WAN link and do not want to saturate it.
    #pulp-admin rpm repo update --max-speed=14000 --repo-id=rhel-6-server-rpms
    #pulp-admin rpm repo update --max-downloads=2 --repo-id=rhel-6-server-rpms

3) Configure the repo so that it is served up via the web server so it can be consumed via HTTP:
    #pulp-admin rpm repo update --repo-id=rhel-6-server-rpms  --serve-http=true

4) Sync the repo from the source, in this case in our example Redhat:
    #pulp-admin rpm repo sync run --repo-id=rhel-6-server-rpms

5) (Optional) Set up the sync from step 4 as a cron job to periodically sync newer packages and keep your Pulp repo up to date.
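A crontab entry along these lines would handle step 5; the 2 a.m. schedule and log path are just examples, and it assumes root holds a valid pulp-admin login so no credentials are prompted for:

```
# /etc/cron.d/pulp-sync -- nightly repo sync at 02:00
0 2 * * * root /usr/bin/pulp-admin rpm repo sync run --repo-id=rhel-6-server-rpms >> /var/log/pulp-sync.log 2>&1
```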


Friday, January 02, 2015

Verifying Firmware (OBP) on Oracle (Sun Microsystems) Hardware


To verify the firmware of a Sun Microsystems server use one of the following options:

At the OBP, use the '.version' command:

ok .version
Firmware CORE Release 1.0.18 created 2002/5/23 18:22
Release 4.0 Version 18 created 2002/05/23 18:22
cPOST version 1.0.18 created 2002/5/23
CORE 1.0.18 2002/05/23 18:22
ok

When running Solaris, use the prtconf(1M) command:

# prtconf -V
OBP 4.0.18 2002/05/23 18:22
#

Powersearch Old Files with Bash Waste Script


Waste is a simple script that makes it easier to search for old files under a path on Linux systems. It can display listings sorted by day or by size, and search for files that are a specific number of days old or within a range of days. It simplifies the find command by wrapping it in an easy-to-use script, and its main appeal is as a cleanup aid for directories whose files need to be purged over time.
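The script below leans on find's -daystart and -mtime tests; their range semantics are worth checking once with files whose timestamps are set artificially (the file names here are arbitrary):

```shell
# -mtime +N matches files older than N days, -mtime -N files newer than
# N days; combining both reproduces the script's -r range behavior.
dir=$(mktemp -d)
touch -d '5 days ago'  "$dir/five"
touch -d '15 days ago' "$dir/fifteen"
touch -d '25 days ago' "$dir/twentyfive"
find "$dir" -type f -daystart -mtime +10 -mtime -20
```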

#!/bin/bash
if [[ $# = 0 || $1 = "-h" || "$#" -lt 4 ]]; then
        echo "Usage: [-d|-s] [-r {start end}|-o {start}] directory"
        echo " -h displays the help"
        echo " -d display listing sorted by day"
        echo " -s display listing sorted by size"
        echo " -r display files between a specific number of days old"
        echo " -o display files older than a specific number of days"
        echo " Example1: waste -d -r 10 20 /home"
        echo " Example2: waste -s -o 30 /process"
        exit
fi
case "$1" in
        '-s')
                case "$2" in
                        '-r')
                                if [ "$#" -lt 5 ]; then
                                        echo "Usage: [-d|-s] [-r {start end}|-o {start}] directory"
                                        exit
                                fi
                                printf "Date\t\tSize\tDirectory/File\n"
                                /usr/bin/find "$5" -maxdepth 1 -type d -daystart -mtime +$3 -mtime -$4 -printf "%CY-%Cm-%Cd\t" -exec /usr/bin/du.new -s --block-size=M "{}" \; | /bin/sort -k2nr
                                ;;
                        '-o')
                                if [ "$#" -lt 4 ]; then
                                        echo "Usage: [-d|-s] [-r {start end}|-o {start}] directory"
                                        exit
                                fi
                                printf "Date\t\tSize\tDirectory/File\n"
                                /usr/bin/find "$4" -maxdepth 1 -type d -daystart -mtime +$3 -printf "%CY-%Cm-%Cd\t" -exec /usr/bin/du.new -s --block-size=M "{}" \; | /bin/sort -k2nr
                                ;;
                esac
        ;;
        '-d')
                case "$2" in
                        '-r')
                                if [ "$#" -lt 5 ]; then
                                        echo "Usage: [-d|-s] [-r {start end}|-o {start}] directory"
                                        exit
                                fi
                                printf "Date\t\tSize\tDirectory/File\n"
                                /usr/bin/find "$5" -maxdepth 1 -type d -daystart -mtime +$3 -mtime -$4 -printf "%CY-%Cm-%Cd\t" -exec /usr/bin/du.new -s --block-size=M "{}" \; | /bin/sort -k2nr | /bin/sort
                                ;;
                        '-o')
                                if [ "$#" -lt 4 ]; then
                                        echo "Usage: [-d|-s] [-r {start end}|-o {start}] directory"
                                        exit
                                fi
                                printf "Date\t\tSize\tDirectory/File\n"
                                /usr/bin/find "$4" -maxdepth 1 -type d -daystart -mtime +$3 -printf "%CY-%Cm-%Cd\t" -exec /usr/bin/du.new -s --block-size=M "{}" \; | /bin/sort -k2nr | /bin/sort
                                ;;
                esac
        ;;
esac 
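Under the hood, waste is simply composing GNU find's -daystart and -mtime tests. The sketch below (the scratch paths and the 15-day timestamp are illustrative) shows the equivalent raw invocation for listing directories between 10 and 20 days old:

```shell
# Build a scratch tree with one directory aged 15 days (GNU touch -d)
# and one fresh directory, then list entries between 10 and 20 days old.
tmp=$(mktemp -d)
mkdir "$tmp/old" "$tmp/new"
touch -d "15 days ago" "$tmp/old"

# -daystart measures age from the start of today; +10/-20 bound the range
find "$tmp" -maxdepth 1 -type d -daystart -mtime +10 -mtime -20
```

Only the aged directory is printed; replacing the pair of -mtime tests with a single -mtime +30 gives the script's -o behavior.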

Wednesday, December 31, 2014

x86 Hardware RAID Traps with BS_RAID_CHK


The following script is designed to run on Solaris x86 or Red Hat systems with LSI and Adaptec hardware RAID controllers.  It checks for degraded states on those controllers using mpt-status, arcconf or raidctl depending on the hardware vendor and OS.   If a degraded state is found, it sends an SNMP trap to your trap collector.

#!/usr/bin/perl
#################################################################
# Script checks status of hardware RAID on x86 hardware         #
# Supported OSes: Solaris x86, Red Hat                          #
# Supported controllers: LSI & Adaptec                          #
# Note: Requires mpt-status for LSI                             #
# Note: Requires arcconf for Adaptec                            #
# Note: Requires raidctl for Solaris x86                        #
# Sends a trap on degraded state                                #
#################################################################
use strict;
my $prefix = "bs_raid_chk";
my $servicechk = "unix_traps";
my $community = "asdpublic";
my $manager = "10.66.65.23";
my $raidctl = '/usr/sbin/raidctl';
my $mptstatus = '/usr/sbin/mpt-status';
my $arcconf = '/usr/StorMan/arcconf';
my (@components,@command);
my ($num,$status,$sendtrap,$volume);
### If Solaris System use this check ###
if (`uname -a` =~ /SunOS/) {
    if ( -e $raidctl) {
        @command = `$raidctl -S`;
        foreach (@command) {
            chomp();
            @components = split();
            $num = $components[1] + 2;
            $status = "$components[$#components] - Controller: $components[0], RAID: $components[$num], Number of Disks: $components[1]\n";
            if ($_ =~ /DEGRADED/) {
                system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                system "/bin/touch /tmp/$prefix.CRITICAL";
                system "/usr/sfw/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1005 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                exit;
            } elsif ($_ =~ /SYNC/) {
                system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                system "/bin/touch /tmp/$prefix.WARNING";
                system "/usr/sfw/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1006 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                exit;
            } elsif ($_ =~ /OPTIMAL/) {
                if ( !-e "/tmp/$prefix.OK" ) {
                    system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                    system "/bin/touch /tmp/$prefix.OK";
                    system "/usr/sfw/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1007 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                }
            }
        }
    }
}
### If Linux system use this check  ###
if (`uname -a` =~ /Linux/) {
    # If system has LSI controller, then mptstatus should be installed
    if (-e $mptstatus) {
        my $modstatus = `/sbin/lsmod |grep mptctl|wc -l`;
        chomp($modstatus);
        if ($modstatus eq "0") {
            my $modload = `/sbin/modprobe mptctl`;
            $modstatus = `/sbin/lsmod |grep mptctl|wc -l`;
            chomp($modstatus);
            if ($modstatus eq "0") { print "ABORT: Failed to load mptctl module.\n";exit;}
        }
        my $controller = `$mptstatus -p -s|grep Found`;
        chomp($controller);
        my ($id,$junk) = split(/,/,$controller);
        $id =~ s/Found SCSI id=//g;
        @command = `$mptstatus -i $id -s`;
        $status="";
        foreach (@command) {
            chomp();
            $status = "$status $_";   
        }
        $status = "$status";
        #print "$status\n";
        foreach (@command) {
            chomp();
            if ( $_ =~ /DEGRADED/ ) {
                system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                system "/bin/touch /tmp/$prefix.CRITICAL";   
                system "/usr/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1005 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                exit;
            } elsif ($_ =~ /SYNC/ ) {
                system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                system "/bin/touch /tmp/$prefix.WARNING";
                system "/usr/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1006 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                exit;
            } elsif ($_ =~ /OPTIMAL/ ) {
                if ( !-e "/tmp/$prefix.OK" ) {
                    system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                    system "/bin/touch /tmp/$prefix.OK";
                    system "/usr/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1007 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                }
            }
        }
    }
    # if system has Adaptec controller then arcconf should be installed
    if ( -e $arcconf ) {
        @command = `$arcconf getconfig 1|grep Status|grep :`;
        foreach (@command) {
            if (( $_ =~ /Controller Status/ ) && ($_ !~ /Optimal/ )) {
                $status = "Controller not optimal";
                system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                system "/bin/touch /tmp/$prefix.CRITICAL";
                system "/usr/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1005 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                exit;
            }
            if (( $_ =~ /  Status  / ) && ($_ !~ /Optimal/ )) {
                $status = "Battery not optimal";
                system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                system "/bin/touch /tmp/$prefix.WARNING";
                system "/usr/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1006 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                exit;
            }
            if (( $_ =~ /Status of logical device/) && ($_ !~ /Optimal/ )) {
                $status = "Logical HW RAID Volume not optimal";
                system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                system "/bin/touch /tmp/$prefix.CRITICAL";
                system "/usr/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1005 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                exit;
            }
            if ( $_ =~ /Optimal/ ) {
                if ( !-e "/tmp/$prefix.OK" ) {
                    $status = "Hardware RAID - OK";
                    system "/bin/rm /tmp/$prefix.* >/dev/null 2>&1";
                    system "/bin/touch /tmp/$prefix.OK";
                    system "/usr/bin/snmptrap -v 2c -c $community $manager '' .1.3.6.1.4.1.11.2.17.1.0.1007 .1.3.6.1.4.1.11.2.17 s \"$servicechk\" .1.3.6.1.4.1.11.2.17 s \"$status\"";
                }
            }
        }
    }   
}
exit; 
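The /tmp/$prefix.* marker files in the script act as a simple debounce: a trap is only re-sent when the state actually changes, not on every run. That pattern isolated as a sketch (the demo path and the echo standing in for the real snmptrap call are illustrative):

```shell
# Debounce sketch: a marker file records the last reported state, so an
# OK trap fires once on recovery instead of on every scheduled run.
prefix=/tmp/bs_raid_chk_demo
rm -f "$prefix".*
report_ok() {
    if [ ! -e "$prefix.OK" ]; then
        rm -f "$prefix".*        # clear any CRITICAL/WARNING markers
        touch "$prefix.OK"
        echo "trap sent"         # stand-in for the real snmptrap call
    fi
}
report_ok    # first run after recovery: prints "trap sent"
report_ok    # subsequent runs: marker exists, prints nothing
```

The same marker-file swap happens for the CRITICAL and WARNING states, except those branches always send because the script exits immediately after.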


Perl Script to Generate Logon.bat for SAMBA Users


The following script will generate a vanilla logon.bat file for SAMBA users. 

#!/usr/bin/perl
################################
# Usage: smb-logon-script      #
################################

$startpath="/data/smb-logon-scripts";
$endpath="/data/netlogon/scripts";
$smbhost = "sambahost.domain.com";


@alpha = ("g".."t","v".."z","aa".."zz");
if (! defined $ARGV[0] ) {
        print " Usage: smb-logon-script \n";
        exit;
}
$username = $ARGV[0];
@group = `/usr/bin/getent group|/bin/grep -w $username|/bin/cut -d: -f1 -`;
$counter=0;
open FILE, ">$startpath/$username.bat.unix";
foreach $group (@group) {
        chomp($group);
        print FILE "net use $alpha[$counter]: \\\\$smbhost\\$group\n";
        $counter++;
}
close (FILE);
$convert = `/usr/bin/unix2dos < $startpath/$username.bat.unix > $endpath/$username.bat`;
exit;
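To illustrate the mapping the script produces, here is a sketch in shell with hypothetical group names, drawing drive letters from the same g-onward sequence:

```shell
# Sketch of the generated logon.bat content: one "net use" line per
# group, assigned sequential drive letters (group names are hypothetical).
gen_logon() {
    letters="g h i j k l m n o p"
    i=1
    for grp in "$@"; do
        letter=$(echo $letters | cut -d' ' -f$i)
        printf 'net use %s: \\\\sambahost.domain.com\\%s\n' "$letter" "$grp"
        i=$((i+1))
    done
}
gen_logon finance engineering shared
```

A user in the groups finance, engineering and shared would get drives g:, h: and i: mapped to the matching shares on the SAMBA host.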

OpenStack Neutron Distributed Virtual Routing Architectural Overview (Icehouse vs Juno)

Layer 3 Routing in Icehouse:


Layer 3 Routing in Juno with DVR:


Sun Microsystems SunFire V100

Here is a trip down memory lane today with the old SunFire V100.    


And under the cover:




Ceph Architectural Overview


Monday, December 29, 2014

Persistently Bind Tape Devices in Solaris via Perl

The following script will look for fiber channel tape devices and then configure the /etc/devlink.tab file with the appropriate information so the tape drives persistently bind to the same device names across reboots on a Solaris server.   This script was tested on Solaris 10.

#!/usr/bin/perl
use strict;
my($junk,$path,$devices,$dev,$file);
my(@devices,@file);
my $date = `date +%m%d%Y`;
chomp($date);
$file = `/usr/bin/cp /etc/devlink.tab /etc/devlink.tab.$date`;
@file = `cat /etc/devlink.tab`;
@file = grep !/type=ddi_byte:tape/, @file;
open (FILE,">/etc/devlink.tab.new");
print FILE @file;
close (FILE);
 
@devices = `ls -l /dev/rmt/*cbn|awk {'print \$9 \$11'}`;
open (FILE,">>/etc/devlink.tab.new");
foreach $devices (@devices) {
        chomp($devices);
        ($dev,$path) = split(/\.\.\/\.\./,$devices);
        $dev =~ s/cbn//g;
        $dev =~ s/\/dev\/rmt\///g;
        $path =~ s/:cbn//g;
        ($junk,$path) = split(/st\@/,$path);
        print FILE "type=ddi_byte:tape;addr=$path;\trmt/$dev\\M0\n";
}
close (FILE);
$file = `/usr/bin/mv /etc/devlink.tab.new /etc/devlink.tab`;
exit;
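The substitutions in the loop reduce each "device name plus symlink target" pairing down to the two fields devlink.tab needs. A standalone sketch of that transformation on one hypothetical entry (the WWN path is illustrative):

```shell
# One hypothetical "name + symlink target" string, as the ls/awk pipeline
# produces it: /dev/rmt/<n>cbn followed by the physical device path.
entry='/dev/rmt/0cbn../../devices/pci@1f,700000/SUNW,qlc@2/fp@0,0/st@w500104f000b1e2d3,0:cbn'

dev=${entry%%../../*}       # /dev/rmt/0cbn  (left of the ../.. split)
path=${entry#*../../}       # devices/.../st@w...,0:cbn (right of it)
dev=${dev#/dev/rmt/}        # 0cbn
dev=${dev%cbn}              # 0
path=${path%:cbn}           # strip the :cbn suffix
path=${path#*st@}           # keep only the target address after st@
printf 'type=ddi_byte:tape;addr=%s;\trmt/%s\\M0\n' "$path" "$dev"
```

The printed line is the devlink.tab entry format the Perl loop writes: the st@ address becomes the addr= key and the rmt number becomes the bound link name.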