Monday, June 17, 2019

Centralized vBMC Controller


In my lab I use KVM virtual machines as my "baremetal" machines for testing OpenStack and OpenShift.  In both cases I need something that provides power management to power the virtual machines off and on during the deployment phases, and this is where Virtual BMC (vBMC) comes in as a handy tool.  However, I really don't want to install vBMC on every physical host that provides my virtual machines.  Thankfully, as this blog will explain, there is a way to run vBMC so that all of the virtual machines can be managed centrally from a single place.

First, let's pick a host that will be our centralized vBMC controller.  This host can be a physical box or a virtual machine; it does not matter.  It does, however, need SSH key authentication to any of the KVM hypervisor hosts that contain virtual machines we wish to control with vBMC.
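
If key-based SSH is not already configured, something along these lines run from the vBMC controller host would set it up.  The root user and hypervisor addresses here are simply assumptions that match the hosts used later in this post:

# ssh-keygen -t rsa
# ssh-copy-id root@192.168.0.4
# ssh-copy-id root@192.168.0.5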

Once I have my vBMC host I will install the required package via rpm, since I did not have a repo that contained the package.  If you do have a repo that contains the package, I would suggest using yum install instead (an example follows the rpm output below):

# rpm -ivh python2-virtualbmc-1.4.0-1.el7.noarch.rpm 
Preparing...                          ################################# [100%]
Updating / installing...
   1:python2-virtualbmc-1.4.0-1.el7   ################################# [100%]
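
For reference, if a repository carrying the package is available, the equivalent yum install would simply be the following (package name assumed to match the RPM above):

# yum install python2-virtualbmc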

Once the package is installed we should be able to run the following command to see the command line usage for vbmc when adding a host.  If you get errors about cliff.app and zmq, please install those packages (python2-cliff.noarch & python2-zmq.x86_64); an example follows the help output below:

# vbmc add --help
usage: vbmc add [-h] [--username USERNAME] [--password PASSWORD] [--port PORT]
                [--address ADDRESS] [--libvirt-uri LIBVIRT_URI]
                [--libvirt-sasl-username LIBVIRT_SASL_USERNAME]
                [--libvirt-sasl-password LIBVIRT_SASL_PASSWORD]
                domain_name

Create a new BMC for a virtual machine instance

positional arguments:
  domain_name           The name of the virtual machine

optional arguments:
  -h, --help            show this help message and exit
  --username USERNAME   The BMC username; defaults to "admin"
  --password PASSWORD   The BMC password; defaults to "password"
  --port PORT           Port to listen on; defaults to 623
  --address ADDRESS     The address to bind to (IPv4 and IPv6 are supported);
                        defaults to ::
  --libvirt-uri LIBVIRT_URI
                        The libvirt URI; defaults to "qemu:///system"
  --libvirt-sasl-username LIBVIRT_SASL_USERNAME
                        The libvirt SASL username; defaults to None
  --libvirt-sasl-password LIBVIRT_SASL_PASSWORD
                        The libvirt SASL password; defaults to None
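
If the command above instead throws import errors for cliff.app or zmq, installing the two packages mentioned earlier should clear it up:

# yum install python2-cliff python2-zmq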


Now let's try adding a virtual machine called kube-master located on a remote hypervisor host:

# vbmc add --username admin --password password --port 6230 --address 192.168.0.10 --libvirt-uri qemu+ssh://root@192.168.0.4/system kube-master

Now let's add a second virtual machine on a different hypervisor.  Notice that I increment the port number: the port is what uniquely identifies the specific virtual machine when we later use ipmi to power it on/off or get its power status:

# vbmc add --username admin --password password --port 6231 --address 192.168.0.10 --libvirt-uri qemu+ssh://root@192.168.0.5/system cube-vm1
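
If there are many virtual machines to register, these add commands lend themselves to a small loop.  A sketch with hypothetical domain names, bumping the port for each VM so every instance stays unique:

port=6232
for vm in cube-vm2 cube-vm3; do
  vbmc add --username admin --password password --port ${port} \
      --address 192.168.0.10 --libvirt-uri qemu+ssh://root@192.168.0.5/system ${vm}
  port=$((port + 1))   # each vBMC instance needs its own IPMI port
done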

Now let's start the vbmc process for each of them and confirm they are up and running:

# vbmc start kube-master
2019-06-17 08:48:05,649.649 6915 INFO VirtualBMC [-] Started vBMC instance for domain kube-master

# vbmc start cube-vm1
2019-06-17 14:49:39,491.491 6915 INFO VirtualBMC [-] Started vBMC instance for domain cube-vm1
# vbmc list
+-------------+---------+--------------+------+
| Domain name | Status  | Address      | Port |
+-------------+---------+--------------+------+
| cube-vm1    | running | 192.168.0.10 | 6231 |
| kube-master | running | 192.168.0.10 | 6230 |
+-------------+---------+--------------+------+
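
One thing worth noting: if firewalld is active on the vBMC host, the UDP ports chosen above have to be opened before remote ipmitool calls will reach the vBMC instances.  A hedged example covering the two ports used so far:

# firewall-cmd --permanent --add-port=6230-6231/udp
# firewall-cmd --reload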

Now that we have added a few virtual machines, let's validate that things are working by powering the hosts up and getting a status.  In this example we will check the power status of kube-master and power it on since it is off:

# ipmitool -I lanplus -H192.168.0.10 -p6230 -Uadmin -Ppassword chassis status
System Power         : off
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : 
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false

# ipmitool -I lanplus -H192.168.0.10 -p6230 -Uadmin -Ppassword chassis power on
Chassis Power Control: Up/On

# ipmitool -I lanplus -H192.168.0.10 -p6230 -Uadmin -Ppassword chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : 
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false

In the next example we will see that cube-vm1 is powered on, and we will power it off:

# ipmitool -I lanplus -H192.168.0.10 -p6231 -Uadmin -Ppassword chassis status
System Power         : on
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : 
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false

# ipmitool -I lanplus -H192.168.0.10 -p6231 -Uadmin -Ppassword chassis power off
Chassis Power Control: Down/Off

# ipmitool -I lanplus -H192.168.0.10 -p6231 -Uadmin -Ppassword chassis status
System Power         : off
Power Overload       : false
Power Interlock      : inactive
Main Power Fault     : false
Power Control Fault  : false
Power Restore Policy : always-off
Last Power Event     : 
Chassis Intrusion    : inactive
Front-Panel Lockout  : inactive
Drive Fault          : false
Cooling/Fan Fault    : false
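
Because every instance shares the same controller address and differs only by port, sweeping the power state of everything the controller manages is a small loop.  A sketch using the two ports from above:

for port in 6230 6231; do
  echo -n "port ${port}: "
  ipmitool -I lanplus -H 192.168.0.10 -p ${port} -U admin -P password chassis power status
done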

Let's summarize what we just did.  We had a vBMC host at IP address 192.168.0.10 where we installed vBMC and configured two different virtual machines, kube-master and cube-vm1, which resided on two completely different hypervisor hosts, 192.168.0.4 and 192.168.0.5 respectively.  This allowed us to remotely power manage those virtual machines without the need to install any additional software on the hypervisor hosts themselves.

Given this flexibility, one could foresee a future centralized vBMC container that could in turn provide power management for any KubeVirt virtual machines deployed within that Kubernetes cluster.  I guess it's only a matter of time.

Tuesday, June 11, 2019

Metal-3 Installer Dev Scripts & Macvtap


Recently I was working with the dev-scripts from the OpenShift Metal3 project located here on GitHub.  I had been using CI automation to run test jobs, which were working perfectly in my simulated virtual baremetal environment.

However, last week my CI broke due to a code change.  Looking through the changes I noticed that they introduced a discovery mechanism that relied on multicast.  Under normal circumstances, using real baremetal rather than virtual baremetal, this issue would never have surfaced.  But in my environment it quickly became clear that multicast traffic was not being passed.

The problem is that in order to leverage PXE booting for my virtual baremetal nodes, I needed a network interface attached to the native VLAN on my physical interface, since PXE traffic cannot be tagged.  The solution was to use macvtap for my virtual baremetal machines.  But as I quickly learned, macvtap by default does not pass multicast.

I determined this by running tcpdump on my bootstrap node, and sure enough I did not see any multicast packets while the master nodes were going through ironic-introspection:

$ sudo tcpdump -n -i any port 5353 | grep 172.22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes

As a quick test to validate my thinking I went ahead and applied the following on the macvtap interface where the virtual bootstrap node runs:

# ip a|grep macvtap0
macvtap0@eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500

# ip link set dev macvtap0 allmulticast on

# ip a|grep macvtap0
macvtap0@eno1: <BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
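
As a side note, if it is not obvious which macvtap device backs the bootstrap VM, virsh can list the interfaces of a running domain; the domain name below is hypothetical, and for a direct-attached NIC the Interface column should show the macvtapN device:

# virsh domiflist bootstrap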


After setting allmulticast on the macvtap device, I went back to tcpdump and found that my device was now passing the multicast traffic I needed for host discovery:

$ sudo tcpdump -n -i any port 5353 | grep 172.22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
07:03:26.186790 IP 172.22.0.55.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:26.186807 IP 172.22.0.55.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:26.186790 IP 172.22.0.55.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:26.188884 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:26.188888 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:26.188894 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:30.659938 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:30.659951 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:30.659938 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:30.660553 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:30.660556 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:30.660561 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:33.976735 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:33.976749 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:33.976735 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:33.978619 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:33.978622 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:33.978632 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:36.077289 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:36.077294 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:36.077289 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:36.077895 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:36.077897 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:36.077900 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:39.395298 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:39.395305 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:39.395298 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:39.396947 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:39.396951 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:39.396956 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)

Furthermore, my deployment completed successfully now that multicast was being passed.  Despite the success, though, this is not the end of the story: I needed this change to be permanent, and any reboot of the bootstrap node would cause the macvtap state to revert to the default of multicast disabled.

The solution was to ensure the following is set on the interface in the bootstrap node's KVM (libvirt) domain XML configuration:

<interface type='direct' trustGuestRxFilters='yes'>
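
For context, that attribute belongs on the interface element of the domain definition (editable with virsh edit).  A minimal sketch, with the source device taken from the macvtap setup above and the mode and model assumed:

<interface type='direct' trustGuestRxFilters='yes'>
  <source dev='eno1' mode='bridge'/>
  <model type='virtio'/>
</interface>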

Hopefully this helps in any situation where macvtap is being used and multicast traffic needs to pass over the interface.