Tuesday, June 11, 2019

Metal-3 Installer Dev Scripts & Macvtap


Recently I was working with the dev-scripts from the OpenShift Metal3 project located here on github.  I had been using CI automation to run test jobs which were working perfectly in my simulated virtual baremetal environment.

However last week my CI broke due to a code change.  Looking through the changes I noticed that they introduced a discovery mechanism that relied on multicast.   Under normal circumstances when using real baremetal and not virtual baremetal this issue would not have been rendered.  But in my environment it quickly became clear that multicast traffic was not being passed.

The problem is that in order to leverage PXE booting for my virtual baremetal nodes I needed to ensure that I had a network interface that was attached to a native vlan on my physical interface since PXE traffic cannot be tagged.  The solution for this issue was to use macvtap for my virtual baremetal machines.   But as I quickly learned macvtap by default does not pass multicast.

I determined this by using tcpdump on my bootstrap node and sure enough I did not see any multicast packets when the master nodes were going through the ironic-introspection:

$ sudo tcpdump -n -i any port 5353 | grep 172.22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes

As a quick test to validate my thinking I went ahead and applied the following on the macvtap interface where the virtual bootstrap node runs:

# ip a|grep macvtap0
macvtap0@eno1: BROADCAST,MULTICAST,UP,LOWER_UP mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500

# ip link set dev macvtap0 allmulticast on

# ip a|grep macvtap0
macvtap0@eno1: BROADCAST,MULTICAST,ALLMULTI,UP,LOWER_UP mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500


After setting the allmulticast on the macvtap device I went back to tcpdump again and found now my device was passing the multicast traffic I needed for host discovery:

$ sudo tcpdump -n -i any port 5353 | grep 172.22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
07:03:26.186790 IP 172.22.0.55.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:26.186807 IP 172.22.0.55.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:26.186790 IP 172.22.0.55.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:26.188884 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:26.188888 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:26.188894 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:30.659938 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:30.659951 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:30.659938 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:30.660553 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:30.660556 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:30.660561 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:33.976735 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:33.976749 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:33.976735 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal._openstack._tcp.local. TXT (QM)? baremetal._openstack._tcp.local. A (QM)? baremetal._openstack._tcp.local. (61)
07:03:33.978619 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:33.978622 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:33.978632 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal._openstack._tcp.local.:6385 0 0, (Cache flush) TXT "ipa_debug=true" "protocol=http", (Cache flush) A 172.22.0.1 (136)
07:03:36.077289 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:36.077294 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:36.077289 IP 172.22.0.58.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:36.077895 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:36.077897 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:36.077900 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:39.395298 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:39.395305 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:39.395298 IP 172.22.0.78.mdns > 224.0.0.251.mdns: 0 [3q] SRV (QM)? baremetal-introspection._openstack._tcp.local. TXT (QM)? baremetal-introspection._openstack._tcp.local. A (QM)? baremetal-introspection._openstack._tcp.local. (75)
07:03:39.396947 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:39.396951 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)
07:03:39.396956 IP 172.22.0.1.mdns > 224.0.0.251.mdns: 0*- [0q] 3/0/1 (Cache flush) SRV baremetal-introspection._openstack._tcp.local.:5050 0 0, (Cache flush) TXT "ipa_debug=1" "ipa_inspection_dhcp_all_interfaces=1" "protocol=http" "ipa_collect_lldp=1", (Cache flush) A 172.22.0.1 (203)

Further my deployment completed successfully given that multicast was being passed.  Despite the success though this is not the end of the story.  The reason is that I needed this change permanent and any reboot of the bootstrap node would cause the macvtap state to go back to the default of multicast disabled.

The solution was to ensure the following is set on the device in the bootstrap nodes kvm XML file configuration:

interface type='direct' trustGuestRxFilters='yes'

Hopefully this helps in any situation when macvtap is being used and multicast traffic is required to pass over the interface.