I have two security groups: Network 1 VMs (a group containing my VMs in the 192.168.1.0/24 network) and Network 2 VMs (the same, for the 192.168.2.0/24 network).
Both are dynamic groups. I select members based on whether the VM name contains -n1 or -n2. (The whole exercise is just for fun/ getting to know this stuff).
I have two firewall rules making use of these groups: one Layer 2 and one Layer 3.
The Layer 2 rule works but the Layer 3 one does not! Weird.
I decided to troubleshoot this via the command line. Figured it would be a good opportunity.
To troubleshoot I have to check the rules on the hosts (because remember, that’s where the firewall is; it’s a kernel module in each host). For that I need to get the host-id. For which I need to get the cluster-id. Sadly there’s no command to list all hosts (or at least I don’t know of any).
# get a list of clusters
nsxmgr63-01> show dfw cluster all
No.  Cluster Name  Cluster Id  Datacenter Name  Firewall Status
1    Cluster One   domain-c15  DC01             Enabled
2    Cluster Two   domain-c17  DC01             Enabled
3    Mgt Cluster   domain-c7   DC01             Enabled

# check each cluster to get the host-ids
nsxmgr63-01> show cluster domain-c15
Datacenter: DC01
Cluster: Cluster One
No.  Host Name           Host Id  Installation Status
1    esx65-11.my.domain  host-19  Enabled

nsxmgr63-01> show cluster domain-c17
Datacenter: DC01
Cluster: Cluster Two
No.  Host Name           Host Id  Installation Status
1    esx65-21.my.domain  host-21  Enabled
So now I have my host-ids.
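Since there is no single command to list all hosts, one way to avoid reading the tables by eye is to capture the CLI output as text and pull the IDs out with a regular expression. The following is only a minimal sketch: the sample strings mirror the output above, the function names are my own (hypothetical), and it assumes cluster IDs always look like `domain-cN` and host IDs like `host-N`.

```python
import re

# Sample text copied from the CLI session above (column spacing is approximate).
CLUSTER_OUTPUT = """\
No.  Cluster Name  Cluster Id  Datacenter Name  Firewall Status
1    Cluster One   domain-c15  DC01             Enabled
2    Cluster Two   domain-c17  DC01             Enabled
3    Mgt Cluster   domain-c7   DC01             Enabled
"""

HOST_OUTPUT = """\
Datacenter: DC01
Cluster: Cluster One
No.  Host Name           Host Id  Installation Status
1    esx65-11.my.domain  host-19  Enabled
"""


def cluster_ids(text):
    """Return every cluster ID (assumed form: domain-c<number>) found in the text."""
    return re.findall(r"\bdomain-c\d+\b", text)


def host_ids(text):
    """Return every host ID (assumed form: host-<number>) found in the text."""
    return re.findall(r"\bhost-\d+\b", text)


if __name__ == "__main__":
    print(cluster_ids(CLUSTER_OUTPUT))
    print(host_ids(HOST_OUTPUT))
```

Matching on the ID pattern rather than splitting on whitespace sidesteps the fact that cluster names such as "Cluster One" contain spaces.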
Let’s also take a look at my VMs (thankfully it’s a short list! I wonder how admins do this in real life):
nsxmgr63-01> show vm vm-40
Datacenter: DC01
Cluster: Cluster One
Host: esx65-11.my.domain
Host-ID: host-19
VM: obsd-n2h11_1
Virtual Nics List:
1.
Vnic Name  obsd-n2h11_1 - Network adapter 1
Vnic Id    50364847-a9d1-3326-a41a-5c62706edc9e.000
Filters    nic-110513-eth0-vmware-sfw.2 <---- (host-19)

nsxmgr63-01> show vm vm-35
Datacenter: DC01
Cluster: Cluster One
Host: esx65-11.my.domain
Host-ID: host-19
VM: obsd-n1h11_1
Virtual Nics List:
1.
Vnic Name  obsd-n1h11_1 - Network adapter 1
Vnic Id    50369375-2fc4-1f85-b85e-f15613751fbf.000
Filters    nic-111445-eth0-vmware-sfw.2 <---- (host-19)

nsxmgr63-01> show vm vm-64
Datacenter: DC01
Cluster: Cluster Two
Host: esx65-21.my.domain
Host-ID: host-21
VM: obsd-n3h21_1
Virtual Nics List:
1.
Vnic Name  obsd-n3h21_1 - Network adapter 1
Vnic Id    503698c0-363b-2c5d-2c73-6182766cd6f1.000
Filters    nic-464408-eth0-vmware-sfw.2 <---- (host-21)

nsxmgr63-01> show vm vm-39
Datacenter: DC01
Cluster: Cluster Two
Host: esx65-21.my.domain
Host-ID: host-21
VM: obsd-n1h21_1
Virtual Nics List:
1.
Vnic Name  obsd-n1h21_1 - Network adapter 1
Vnic Id    50365bd9-f536-095e-eeef-d9f6d81f733b.000
Filters    nic-464142-eth0-vmware-sfw.2 <---- (host-21)
We can see the filters applying to each VM. To summarize:
nic-110513-eth0-vmware-sfw.2 | obsd-n2h11_1 | host-19 (esx65-11)
nic-111445-eth0-vmware-sfw.2 | obsd-n1h11_1 | host-19 (esx65-11)
nic-464408-eth0-vmware-sfw.2 | obsd-n3h21_1 | host-21 (esx65-21)
nic-464142-eth0-vmware-sfw.2 | obsd-n1h21_1 | host-21 (esx65-21)
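Building this summary by hand is error-prone once there are more than a handful of VMs. As a rough sketch, the same three fields can be extracted from captured `show vm` output. The line layout in the sample is my reconstruction of the session above, and `summarize_vm` is a hypothetical helper, not part of any NSX tooling; it also assumes one filter per vNIC.

```python
# Sample text reconstructed from the "show vm vm-40" output above.
SHOW_VM_OUTPUT = """\
Datacenter: DC01
Cluster: Cluster One
Host: esx65-11.my.domain
Host-ID: host-19
VM: obsd-n2h11_1
Virtual Nics List:
1.
Vnic Name  obsd-n2h11_1 - Network adapter 1
Vnic Id    50364847-a9d1-3326-a41a-5c62706edc9e.000
Filters    nic-110513-eth0-vmware-sfw.2
"""


def summarize_vm(text):
    """Pick the host ID, VM name and dvfilter name out of one 'show vm' block."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Host-ID:"):
            fields["host_id"] = line.split(":", 1)[1].strip()
        elif line.startswith("VM:"):
            fields["vm"] = line.split(":", 1)[1].strip()
        elif line.startswith("Filters"):
            # Assumption: the filter name is the first token after the label.
            fields["filter"] = line.split()[1]
    return fields


if __name__ == "__main__":
    info = summarize_vm(SHOW_VM_OUTPUT)
    print(f'{info["filter"]} | {info["vm"]} | {info["host_id"]}')
```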
And are these filters applying on the hosts themselves?
# checking host-19
nsxmgr63-01> show dfw host host-19 summarize-dvfilter
<snip>
Filters:
world 0 <no world>
 port 50331650 vmk0
  vNic slot 0
   name: nic-0-eth4294967295-ESXi-Firewall.0
   agentName: ESXi-Firewall
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failOpen
   slowPathID: none
   filter source: Invalid
 port 50331663 vmk1
  vNic slot 0
   name: nic-0-eth4294967295-ESXi-Firewall.0
   agentName: ESXi-Firewall
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failOpen
   slowPathID: none
   filter source: Invalid
world 110513 vmm0:obsd-21 vcUuid:'50 36 48 47 a9 d1 33 26-a4 1a 5c 62 70 6e dc 9e'
 port 50331666 obsd-21.eth0
  vNic slot 2
   name: nic-110513-eth0-vmware-sfw.2 <----- applied!
   agentName: vmware-sfw
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failClosed
   slowPathID: none
   filter source: Dynamic Filter Creation
  vNic slot 1
   name: nic-110513-eth0-dvfilter-generic-vmware-swsec.1
   agentName: dvfilter-generic-vmware-swsec
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failClosed
   slowPathID: none
   filter source: Alternate Opaque Channel
world 111445 vmm0:obsd-11 vcUuid:'50 36 93 75 2f c4 1f 85-b8 5e f1 56 13 75 1f bf'
 port 50331668 obsd-11.eth0
  vNic slot 2
   name: nic-111445-eth0-vmware-sfw.2 <----- applied!
   agentName: vmware-sfw
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failClosed
   slowPathID: none
   filter source: Dynamic Filter Creation
  vNic slot 1
   name: nic-111445-eth0-dvfilter-generic-vmware-swsec.1
   agentName: dvfilter-generic-vmware-swsec
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failClosed
   slowPathID: none
   filter source: Alternate Opaque Channel

# checking host-21
nsxmgr63-01> show dfw host host-21 summarize-dvfilter
<snip>
Filters:
world 0 <no world>
 port 50331650 vmk0
  vNic slot 0
   name: nic-0-eth4294967295-ESXi-Firewall.0
   agentName: ESXi-Firewall
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failOpen
   slowPathID: none
   filter source: Invalid
 port 50331654 vmk1
  vNic slot 0
   name: nic-0-eth4294967295-ESXi-Firewall.0
   agentName: ESXi-Firewall
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failOpen
   slowPathID: none
   filter source: Invalid
world 464142 vmm0:obsd-n1h21_1 vcUuid:'50 36 5b d9 f5 36 09 5e-ee ef d9 f6 d8 1f 73 3b'
 port 50331656 obsd-n1h21_1.eth0
  vNic slot 2
   name: nic-464142-eth0-vmware-sfw.2 <----- applied!
   agentName: vmware-sfw
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failClosed
   slowPathID: none
   filter source: Dynamic Filter Creation
  vNic slot 1
   name: nic-464142-eth0-dvfilter-generic-vmware-swsec.1
   agentName: dvfilter-generic-vmware-swsec
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failClosed
   slowPathID: none
   filter source: Alternate Opaque Channel
world 464408 vmm0:obsd-n3h21_1 vcUuid:'50 36 98 c0 36 3b 2c 5d-2c 73 61 82 76 6c d6 f1'
 port 50331657 obsd-n3h21_1.eth0
  vNic slot 2
   name: nic-464408-eth0-vmware-sfw.2 <----- applied!
   agentName: vmware-sfw
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failClosed
   slowPathID: none
   filter source: Dynamic Filter Creation
  vNic slot 1
   name: nic-464408-eth0-dvfilter-generic-vmware-swsec.1
   agentName: dvfilter-generic-vmware-swsec
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failClosed
   slowPathID: none
   filter source: Alternate Opaque Channel
Hmm, that too looks fine.
Next I picked up one of the rule sets and explored it further:
nsxmgr63-01> show dfw host host-19 filter nic-111445-eth0-vmware-sfw.2 rules
ruleset domain-c15 {
  # Filter rules
  rule 1003 at 1 inout protocol ipv6-icmp icmptype 135 from any to any accept;
  rule 1003 at 2 inout protocol ipv6-icmp icmptype 136 from any to any accept;
  rule 1002 at 3 inout protocol udp from any to any port 68 accept;
  rule 1002 at 4 inout protocol udp from any to any port 67 accept;
  rule 1010 at 5 inout protocol any from addrset src1010 to addrset src1010 drop; <------ L3
  rule 1001 at 6 inout protocol any from any to any accept;
}

ruleset domain-c15_L2 {
  # Filter rules
  rule 1009 at 1 inout ethertype any from addrset src1009 to addrset src1009 drop; <------ L2
  # internal # rule 1009 at 2 in ethertype any from addrset src1009 to addrset src1009 drop;
  rule 1004 at 3 inout ethertype any from any to any accept;
}
The Layer 3 & Layer 2 rules are in separate rule sets. I have marked the ones which I am interested in. One works, the other doesn’t. So I checked the address sets used by both:
nsxmgr63-01> show dfw host host-19 filter nic-111445-eth0-vmware-sfw.2 addrsets
addrset src1009 {
  mac 00:50:56:b6:25:a4,
  mac 00:50:56:b6:2a:e2,
  mac 00:50:56:b6:ad:b3,
  mac 00:50:56:b6:f2:9a,
}
addrset src1010 {
}
Tada! And there we have the problem. The address set for the Layer 3 rule is empty.
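Spotting an empty address set is easy with two sets, less so with dozens of rules. Here is a small sketch that parses captured `addrsets` output and reports which sets have no members. It assumes the `addrset NAME { entry, entry, }` layout seen above; `parse_addrsets` and `empty_addrsets` are hypothetical names of my own.

```python
import re

# Sample text mirroring the addrsets output above (shortened to two MACs).
ADDRSETS_OUTPUT = """\
addrset src1009 {
mac 00:50:56:b6:25:a4,
mac 00:50:56:b6:2a:e2,
}
addrset src1010 {
}
"""


def parse_addrsets(text):
    """Map each address set name to its list of entries (possibly empty)."""
    sets = {}
    for name, body in re.findall(r"addrset\s+(\S+)\s*\{([^}]*)\}", text):
        # Entries are comma-terminated; drop the empty remainder after the last comma.
        sets[name] = [entry.strip() for entry in body.split(",") if entry.strip()]
    return sets


def empty_addrsets(text):
    """Return the names of address sets that contain no entries."""
    return [name for name, entries in parse_addrsets(text).items() if not entries]


if __name__ == "__main__":
    print(empty_addrsets(ADDRSETS_OUTPUT))
```

A rule whose source and destination both point at an empty set can never match, which is consistent with the Layer 3 rule silently doing nothing.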
I checked this for the other rules too – same situation. I modified my Layer 3 rule to specifically target the subnets:
And the address set for that rule is not empty:
nsxmgr63-01> show dfw host host-19 filter nic-111445-eth0-vmware-sfw.2 addrsets
addrset rdst1012 {
  ip 192.168.1.0/24,
  ip 192.168.2.0/24,
}
addrset rsrc1012 {
  ip 192.168.1.0/24,
  ip 192.168.2.0/24,
}
<snip>
And because of this the firewall rules do work as expected. Hmm.
Next I changed this rule to use a group with my OpenBSD VMs from each network explicitly added to it (i.e. static rather than dynamic membership, in case that was causing the issue). But nope, same result: the address set is empty again. :o)
nsxmgr63-01> show dfw host host-19 filter nic-111445-eth0-vmware-sfw.2 addrsets
<snip>
addrset src1012 {
}
So now I have an idea of the problem. I am not too surprised by this because I vaguely remember reading something about VMware Tools and IP detection inside a VM (i.e. NSX makes use of VMware Tools to know the IP address of a VM) and also because I am aware OpenBSD does not use the official VMware Tools package (it has its own and that only provides a subset of functions).
Googling a bit on this topic I came across the IP address Discovery section in the NSX Admin guide. Prior to NSX 6.2, if VMware Tools wasn’t installed (or was stopped), NSX couldn’t detect the IP address of the VM. From NSX 6.2 onwards it can use DHCP & ARP snooping to work around missing or stopped VMware Tools. The snooping is configured on the host installation page:
I am going to go ahead and enable both on all my clusters.
That helped. But it needs time. Initially the address set was empty. I started pings from one VM to another and the source VM’s IP was discovered and put in the address set; but since the destination VM wasn’t in the set yet, traffic was still being allowed. I stopped pings, started pings, waited a while … tried again … and by then the second VM’s IP too was discovered and put in the address set – effectively blocking communication between them.
addrset src1012 {
  ip 192.168.1.1,
  ip 192.168.2.1,
}
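Why traffic kept flowing while only one IP had been discovered follows from the shape of the rule: `from addrset X to addrset X drop` only matches when both source and destination are in the set, otherwise the packet falls through to the default accept rule. The snippet below is a deliberately simplified model of that logic, not how the DFW kernel module is actually implemented; `verdict` is a hypothetical helper.

```python
def verdict(src_ip, dst_ip, addrset):
    """Simplified model: drop only if BOTH ends are in the address set,
    otherwise fall through to the default 'accept' rule."""
    if src_ip in addrset and dst_ip in addrset:
        return "drop"
    return "accept"


if __name__ == "__main__":
    src, dst = "192.168.1.1", "192.168.2.1"

    # Nothing discovered yet: the drop rule cannot match.
    print(verdict(src, dst, set()))

    # Only the pinging VM has been snooped: still no match.
    print(verdict(src, dst, {"192.168.1.1"}))

    # Both IPs discovered: the drop rule finally applies.
    print(verdict(src, dst, {"192.168.1.1", "192.168.2.1"}))
```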
Side by side I installed a Windows 8.1 VM with VMware Tools etc and tested to see if it was being automatically picked up (I did this before enabling the snooping above). It was. In fact its IPv6 address too was discovered via VMware Tools and added to the list:
addrset src1010 {
  ip 192.168.1.1,
  ip 192.168.1.2,
  ip 192.168.1.11, <--- Windows 8.1 + VMware Tools
  ip 192.168.2.1,
  ip fe80::2570:6577:3d58:f198, <--- Windows 8.1 + VMware Tools
}
Nice! Picked up something interesting today.