Migrating VMkernel port from Standard to Distributed Switch fails

I am putting a link to the official VMware documentation on this as I Googled it just to confirm to myself I am not doing anything wrong! What I need to do is migrate the physical NICs and Management/ VM Network VMkernel NIC from a standard switch to a distributed switch. Process is simple and straight-forward, and one that I have done numerous times; yet it fails for me now!

Here’s a copy paste from the documentation:

  1. Navigate to Home > Inventory > Networking.
  2. Right-click the dVswitch.
  3. If the host is already added to the dVswitch, click Manage Hosts, else Click Add Host.
  4. Select the host(s), click Next.
  5. Select the physical adapters ( vmnic) to use for the vmkernel, click Next.
  6. Select the Virtual adapter ( vmk) to migrate and click Destination port group field. For each adapter, select the correct port group from dropdown, Click Next.
  7. Click Next to omit virtual machine networking migration.
  8. Click Finish after reviewing the new vmkernel and Uplink assignment.
  9. The wizard and the job completes moving both the vmk interface and the vmnic to the dVswitch.

Basically add physical NICs to the distributed switch & migrate vmk NICs as part of the process. For good measure I usually migrate only one physical NIC from the standard switch to the distributed switch, and then separately migrate the vmk NICs. 

Here’s what happens when I am doing the above now. (Note: now. I never had an issue with this earlier. Am guessing it must be some bug in a newer 5.5 update, or something’s wrong in the underlying network at my firm. I don’t think it’s the networking coz I got my network admins to take a look, and I tested that all NICs on the host have connectivity to the outside world (did this by making each NIC the active one and disabling the others)). 

First it’s stuck in progress:

And then vCenter cannot see the host any more:

Oddly I can still ping the host on the vmk NIC IP address. However I can’t SSH into it, so the Management bits are what seem to be down. The host has connectivity to the outside world because it passes the Management network tests from DCUI (which I can connect to via iLO). I restarted the Management agents too, but nope – cannot SSH or get vCenter to see the host. Something in the migration step breaks things. Only solution is to reboot and then vCenter can see the host.

Here’s what I did to workaround anyways. 

First I moved one physical NIC to the distributed switch.

Then I created a new management portgroup and VMkernel NIC on that for management traffic. Assigned it a temporary IP.

Next I opened a console to the host. Here’s the current config on the host:

The interface vmk0 (or its IPv4 address rather) is what I wanted to migrate. The interface vmk4 is what I created temporarily. 

I now removed the IPv4 address of the existing vmk NIC and assigned that to the new one. Also, confirmed the changes just to be sure. As soon as I did so vCenter picked up the changes. I then tried to move the remaining physical NIC over to the distributed switch, but that failed. Gave an error that the existing connection was forcibly closed by the host. So I rebooted the host. Post-reboot I found that the host now thought it had no IP, even though it was responding to the old IP via the new vmk. So this approach was a no-go (but still leaving it here as a reminder to myself that this does not work)

I now migrated vmk0 from the standard switch to the distributed switch. As before, this will fail – vCenter will lose connectivity to the ESX host. But that’s why I have a console open. As expected the output of esxcli network ip interface list shows me that vmk0 hasn’t moved to the distributed switch:

So now I go ahead and remove the IPv4 address of vmk0 and assign that to vmk4 (the new one). Also confirmed the changes. 

Next I rebooted (reboot) the host, and via the CLI I removed vmk0 (for some reason the GUI showed both vmk0 and vmk4 with the same IP I assigned above). 

Reboot again!

Post-reboot I can go back to the GUI and move the remaining physical NIC over to the distributed switch. :) Yay!