
Yay! (VXLAN)

I decided to take a break from my NSX reading and just go ahead and set up a VXLAN in my test lab. Just go with a hunch of what I think the options should be based on what the menus ask me and what I have read so far. Take a leap! :)

*Ahem* The above is actually incorrect, and I am an idiot. A super huge idiot! Each VM is actually just pinging itself and not the other. Unbelievable! And to think that I got all excited thinking I managed to do something without reading the docs etc. The steps below are incomplete. I should just delete this post, but I wrote this much and had a moment of excitement that day … so am just leaving it as it is with this note. 

Above we have two OpenBSD VMs running in my nested ESXi hypervisors. 

  • obsd-01 is running on host 1, which is on network
  • obsd-02 is running on host 2, which is on network 
  • Note that each host is on a separate L3 network.
  • Each host is in a cluster of its own (doesn’t matter but just mentioning) and they connect to the same VDS.
  • In that VDS there’s a port group for VMs and that’s where obsd-01 and obsd-02 connect to. 
  • Without NSX, since the hosts are on separate networks, the two VMs wouldn’t be able to see each other. 
  • With NSX, I am able to create a VXLAN network on the VDS such that both VMs are now on the same network.
    • I put the VMs on a network so that’s my overlay network. 
    • VXLANs are basically port groups within your NSX enhanced VDS. The same way you don’t specify IP/ network information on the VMware side when creating a regular portgroup, you don’t do anything when creating the VXLAN portgroup either. All that is within the VMs on the portgroup.
  • A VDS uses VMKernel ports (vmk ports) to carry out the actual traffic. These are virtual ports bound to the physical NICs on an ESXi host, and there can be multiple vmk ports per VDS for various tasks (vMotion, FT, etc). Similar to this we need to create a new vmk port for the host to connect into the VTEP used by the VXLAN. 
    • Unlike regular vmk ports though we don’t create and assign IP addresses manually. Instead we either use DHCP or create an IP pool when configuring the VXLAN for a cluster. (It is possible to specify a static IP either via DHCP reservation or as mentioned in the install guide). 
    • Each cluster uses one VDS for its VXLAN traffic. This can be a pre-existing VDS – there’s nothing special about it, you just point to it when enabling VXLAN on a cluster, and the vmk port is created on this VDS. NSX automatically creates another portgroup, which is where the vmk port is assigned. (A quick way to verify the new vmk port from the host’s shell is shown after this list.)
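
As a sanity check after enabling VXLAN on a cluster, the new vmk port (and the IP it picked up from the pool or DHCP) can be viewed from the host’s shell. These are plain esxcli commands, nothing NSX specific:

    # list all VMkernel interfaces on the host, along with the portgroup each sits on and its MTU
    esxcli network ip interface list

    # show the IPv4 address each vmk interface has (pool/DHCP assigned in the VXLAN case)
    esxcli network ip interface ipv4 get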

And that’s where I am so far. After doing this I went through the chapter for configuring VXLAN in the install guide and I was pretty much on the right track. Take a look at that chapter for more screenshots and info. 

Yay, my first VXLAN! :o)

p.s. I went ahead with OpenBSD in my nested environment coz (a) I like OpenBSD (though I have never got to play around much with it); (b) it has a simple & fast install process and I am familiar with it; (c) the ISO file is small, so doesn’t take much space in my ISO library; (d) OpenBSD comes with VMware tools as part of the kernel, so nothing additional to install; (e) I so love that it still has a simple rc based system and none of that systemd stuff that newer Linux distributions have (not that there’s anything wrong with systemd just that I am unfamiliar with it and rc is way simpler for my needs); (f) the base install has manpages for all the commands unlike minimal Linux ISOs that usually seem to skip these; (g) take a look at this memory usage! :o)

p.p.s. Remember to disable the PF firewall via pfctl -d.
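
In case it helps, here’s the sort of thing I do on each OpenBSD VM – the rc.conf.local line is only if you want PF to stay off across reboots:

    # disable PF right away
    pfctl -d

    # keep PF disabled at boot (rc.conf.local overrides the rc.conf defaults)
    echo 'pf=NO' >> /etc/rc.conf.local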

Yay again! :o)

Update: Short-lived excitement sadly. A while later the VMs stopped communicating. Turns out VMware Workstation doesn’t support an MTU larger than 1500 bytes, and VXLAN requires 1600 bytes. So the VTEP interfaces of both ESXi hosts are unable to talk to each other. Bummer!

Update 2: I finally got this working. Turns out I had missed some stuff; and also I had to make some changes to allow VMware Workstation to work with larger MTU sizes. I’ll blog about this in a later post.
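
For anyone wanting to test the same thing in their lab, the usual check is a vmkping between the VTEP vmk interfaces with the don’t-fragment bit set and a payload sized for a 1600 byte MTU. The vmk number and IP below are placeholders for your own VTEP interface and the remote host’s VTEP:

    # -I = source vmk (the VTEP), -d = don't fragment, -s = payload size
    # 1572 bytes of payload + 28 bytes of IP/ICMP headers = 1600 bytes on the wire
    vmkping -I vmk3 -d -s 1572 192.168.50.12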

vMotion NIC load balancing fails even though there is an active link

The other day I blogged about how I had a host whose vMotion VMkernel interface seemed to be broken. Any vMotion attempts to it would hang at 14%.

At that time I logged on to the destination host, then used vmkping with the -I switch (to explicitly specify the vMotion VMkernel interface of the destination host), and found that I couldn’t ping the VMkernel interface of the other hosts. These hosts could ping each other but couldn’t ping the destination host.
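
For reference, this is roughly what I mean – run from the ESXi shell of the destination host, with the vmk number and IP below being placeholders for its vMotion interface and the vMotion address of one of the other hosts:

    # ping another host's vMotion address, sourcing the packet from this host's own vMotion VMkernel interface
    vmkping -I vmk2 10.10.20.31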

The VMkernel interface is backed by two physical NICs. I found that if I removed one of the physical NICs from the VMkernel it worked. Interestingly this link wasn’t showing any CDP info either, so it looked like something was wrong with it (the physical NIC shows as unclaimed coz the screenshot was taken after I moved it to unclaimed).

Missing CDP info

So the first question is why did the VMkernel fail when only one of the physical NICs failed? Since the other physical NIC backing the VMkernel NIC is still active shouldn’t it have continued working?

The reason why it failed is that by default network failover detection is via “Link status only”. This only detects failures of the link itself – like say the cable is broken, the switch is down, or the NIC has failed – while failures such as the link being connected but blocked by the switch are not detected. In my case, as you can see from the screenshot above, the link status is connected – so the host doesn’t consider the link failed even though it isn’t actually working, and thus continues to use it.

Next I discovered that other hosts too similarly had their second vMotion physical NIC in a failed state as above yet they weren’t failing like this host. The simple explanation for this is that the host above somehow selected the faulty physical NIC as the one to use, didn’t detect it as failed and so continued to use it; whereas other hosts were more lucky and chose the physical NIC that works alright, so didn’t have any issues.

I am not sure that’s the entire answer though. For one, the host that failed was ESXi 5.5 and using a distributed switch, while the other two hosts were ESXi 4.0 and using standard switches. Did that make a difference?

The default load balancing method for both standard and distributed switches is the same. (For a standard switch you check this under the vSwitch properties on the host. For a distributed switch you check this under the portgroup in the Networking section of vSphere (web) client).
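
As an aside, for a standard switch the same teaming and failover settings can also be read from the ESXi shell (the vSwitch name below is a placeholder); for a distributed switch the web client is the easier place to look:

    esxcli network vswitch standard policy failover get --vswitch-name=vSwitch0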

default load balancing

Load balancing is what I am concerned about here because that’s what the hosts should be using to balance between both the NICs. That’s what the host will be using to select the physical NIC to use for that particular traffic flow. The load balancing method is the same between standard and distributed switches, yet why were the distributed switch/ESXi 5.5 hosts behaving differently?

I am still not sure of an answer but I have my theory. My theory is that since a distributed switch is across multiple hosts, the load balancing method (above) of choosing a route based on virtual port ID comes into play. Here are screenshots from two of my hosts connected to the same distributed switch port group, for instance:

port number

As you can see the virtual port number is different for the VMkernel NIC of each host. So each host could potentially use a different underlying physical NIC depending on how the load balancing algorithm maps it.

But what about a standard switch? Since the standard switch is only on the host, and the only VMkernel NIC connected to it (in the case of vMotion) is the single VMkernel NIC I have assigned for vMotion, there is no load balancing algorithm coming into play! If, instead of a VMkernel NIC, I had a Virtual Machine network, then the virtual port number matters because there are multiple VMs connecting to the various port numbers; but that doesn’t matter for VMkernel NICs as there is only one of them.

And so my theory is that for a VMkernel NIC (such as vMotion) backed by multiple physical NICs and using the default load balancing algorithm of virtual port ID – all traffic by default goes into one of the physical NICs and the other physical NIC is never used unless the chosen one fails. And that is why my hosts using the standard switches were always using the same physical NIC (am guessing the lower numbered one as that’s what both hosts chose) while hosts using distributed switches would have chosen different physical NICs per host.

That’s all! Just thought I’d put this out there in case anyone else has the same question.

Number of IPv4 routes did not match

Was creating / migrating some ESXi hosts during the week and came across the above error “Number of IPv4 routes did not match” when checking for host profile compliance of one of the hosts. All network settings of this host appeared to be the same as the rest so I was stumped as to what could be wrong. Via a VMware KB article I came across the esxcfg-route command that helped identify the problem. To run this command SSH into the host:
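
    # run as-is, this just prints the default gateway
    esxcfg-route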

By default the command only outputs the default gateway but you can pass it the -l switch to list all routes:
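
    # list the full VMkernel routing table rather than just the default gateway
    esxcfg-route -l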

In my case the above output was from one of the hosts, while the following was from the non-compliant host:

Notice the vmk2 interface has the wrong network. Not sure how that happened. Oddly the GUI didn’t show this incorrect network but obviously something was corrupt somewhere.

To fix that I thought I’d remove the vmk2 interface and re-add it. Big mistake! Possibly because its network was the same as that of the management network, removing this interface caused the host to lose connectivity from vCenter. I could ping it but couldn’t connect to it via SSH, vSphere Client, or vCenter. Finally I had to reset the network via the DCUI – it’s under “Network Restore Options”. I tried “Restore vDS” first, which didn’t help, so I did a “Restore Standard Switch”. This is very useful – it creates a new standard switch and moves the Management Network onto that so you get connectivity to the host. This way I was able to reconnect to the host, but then I stumbled upon a new problem.

The host didn’t have the vmk2 interface any more but when I tried to recreate it I got an error that the interface already exists. But no, it does not – the GUI has no trace of it! Some forum posts suggested restarting the vCenter service as that clears its cache and puts it in sync with the hosts but that didn’t help either. Then I came across this post which showed me that it is possible for the host to still have the VMkernel port but vCenter to not know of it. For this the esxcli command is your friend. To list all VMkernel ports on a host do the following:
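
    # lists every VMkernel interface the host itself knows about (output will differ per host)
    esxcli network ip interface list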

After that, removing the VMkernel interface can be done by a variant of same command:
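
    # vmk2 here is a placeholder for whichever interface is stale
    esxcli network ip interface remove --interface-name=vmk2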

Now I could re-add the interface via vSphere and get the host into compliance.

Before I conclude this post though, a few notes on the commands above.

If you have PowerCLI installed you can run all the esxcli commands via the Get-EsxCli cmdlet. For example:
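
    # host name below is a placeholder; this is the older "method call" flavour of Get-EsxCli
    $esxcli = Get-EsxCli -VMHost (Get-VMHost "esx01.mydomain.local")

    # equivalent of running "esxcli network ip interface list" on the host
    $esxcli.network.ip.interface.list()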

If I wanted to remove the interface via PowerCLI the command would be slightly different:
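
    # using the -V2 argument style (needs a reasonably recent PowerCLI); host name and vmk number are placeholders
    $esxcli = Get-EsxCli -VMHost (Get-VMHost "esx01.mydomain.local") -V2
    $arguments = $esxcli.network.ip.interface.remove.CreateArgs()
    $arguments.interfacename = "vmk2"
    $esxcli.network.ip.interface.remove.Invoke($arguments)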

I would have written more on the esxcli command itself but this excellent blog post covers it all. It’s an all powerful command that can be used to manage many aspects of the ESXi host, even set it in maintenance mode!
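
For instance, something like this should do the maintenance mode bit (just a sketch; the usual caveats about putting a host in maintenance mode apply):

    esxcli system maintenanceMode get
    esxcli system maintenanceMode set --enable true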

Heck you can even use esxcli to upgrade from one ESXi version to another. It is also possible to run the esxcli command from a remote computer (Windows or Linux) by installing the vSphere CLI tools on that computer. Additionally, there’s also the vSphere Management Assistant (VMA) which is a virtual appliance that offers command line tools.
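
The upgrade is along these lines – the offline bundle path below is a placeholder for whatever depot zip you actually have, and the profile name to use comes from the first command:

    # list the image profiles available in the offline bundle
    esxcli software sources profile list -d /vmfs/volumes/datastore1/ESXi-offline-bundle.zip

    # update the host to one of those profiles
    esxcli software profile update -d /vmfs/volumes/datastore1/ESXi-offline-bundle.zip -p <profile-name>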

The esxcli is also useful if you want to kill a VM. For instance the following lists all running VMs on a host:
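
    # shows the Display Name and World ID of each running VM
    esxcli vm process list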

If that VM were stuck for some reason and cannot be stopped or restarted via vSphere it’s very useful to know the esxcli command can be used to kill the VM (has happened a couple of times to me in the past):
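
    # World ID comes from "esxcli vm process list"; 12345 is obviously a placeholder
    esxcli vm process kill --type=soft --world-id=12345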

Regarding the type of killing you can do:

There are three types of VM kills that can be attempted: [soft, hard, force]. Users should always attempt ‘soft’ kills first, which will give the VMX process a chance to shutdown cleanly (like kill or kill -SIGTERM). If that does not work move to ‘hard’ kills which will shutdown the process immediately (like kill -9 or kill -SIGKILL). ‘force’ should be used as a last resort attempt to kill the VM. If all three fail then a reboot is required.

Another command line option is vim-cmd which I stumbled upon from one of the links above. I haven’t used it much so as a reference to myself here’s a blog post explaining it in detail.
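
A couple of invocations that seem handy (again, noting them here mainly as a reference to myself):

    # list all registered VMs along with their VM IDs
    vim-cmd vmsvc/getallvms

    # power off a VM by its VM ID
    vim-cmd vmsvc/power.off <vmid>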

Lastly there’s also a bunch of esxcfg-* commands, one of which we came across above.

I haven’t used these much. They seem to be present for compatibility reasons, dating back to the classic ESX days. The remote vSphere CLI offers the same commands with a vicfg- prefix instead of esxcfg-, and both sets are gradually being superseded by esxcli. For instance, the job of esxcfg-vmknic is now done by esxcli network ip interface, as we saw above.

That’s all for now!

Update: Thought I’d use this post to keep track of other useful commands.

To get IPv4 address details:
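
    esxcli network ip interface ipv4 get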

Replace ipv4 with ipv6 in the above command if that’s what you want.

To set an IPv4 address:
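
    # interface name, address, and netmask below are placeholders
    esxcli network ip interface ipv4 set --interface-name=vmk0 --ipv4=10.10.10.51 --netmask=255.255.255.0 --type=static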

To ping an address from the host:
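
    # target IP is a placeholder
    esxcli network diag ping --host 10.10.10.1 --count 3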

Change keyboard layout:

Get current keyboard layout:
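
    esxcli system settings keyboard layout get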

List available layouts:
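
    esxcli system settings keyboard layout list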

Set a new layout:
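
    # the layout string must match one from the list; "US Default" is just an example
    esxcli system settings keyboard layout set --layout="US Default"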

Remotely enable SSH

The esxcli commands are cool but you need to enable SSH each time you want to connect to the host and run these (unless you install the CLI tools on your machine). If you have PowerCLI though you can enable SSH remotely.

To list the services:
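
    # host name is a placeholder
    Get-VMHostService -VMHost (Get-VMHost "esx01.mydomain.local")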

To enable SSH and the ESXi shell:
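
    # TSM-SSH is the SSH service, TSM the ESXi shell; host name is a placeholder
    Get-VMHostService -VMHost (Get-VMHost "esx01.mydomain.local") | Where-Object { $_.Key -eq "TSM-SSH" -or $_.Key -eq "TSM" } | Start-VMHostService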