ESXi host – cannot install HA – no space left on device

These are less of notes and more of links and what I did when I encountered this issue. Just for my future self.

At work we had a host which was giving HA errors. The message was along the lines that vCenter could not contact HA. So I tried reconfiguring it for HA (right click the host and select “Reconfigure for vSphere HA”) upon which I got a new error: Cannot install the vCenter Server agent service. Cannot upload agent.

Initially I thought it must just be a permissions issue. But it wasn’t so.

To investigate further I tried logging on to the server. I couldn’t enable SSH and ESXi Shell from the Configuration tab – it gave me an error. So I iLO’d into the server DCUI and enabled SSH and ESXi Shell. SSH still refused to let me in, and when I’d press Alt+F1 on the console to get the login prompt it was filled with messages like these: /bin/sh cant fork. Initially I thought it might be to do with HP AMS memory leak (see this and this) but it wasn’t.

I pressed Alt+F12 to see the on-screen logs. It was filled with messages like these:

Blimey!

There was nothing more I could do here basically. Couldn’t login to the server at all, heck I couldn’t even Shutdown/ Restart it gracefully via F12 in DCUI (nothing would happen). So I cold booted it and that got it working. 

It’s been about 2 hours since I did that and the server seems stable so maybe it was a one-off thing. I looked at more logs though and here’s what I found.

/var/log/syslog.log

(Contains: Management service initialization, watchdogs, scheduled tasks and DCUI use)

/var/log/vmkwarning.log

(Contains: A summary of Warning and Alert log messages excerpted from the VMkernel logs)

/var/log/vob.log

(Contains: VMkernel Observation events)

/var/log/vmkernel.log

(Contains: Core VMkernel logs, including device discovery, storage and networking device and driver events, and virtual machine startup)

/var/log/hostd.log

(Contains: Host management service logs, including virtual machine and host Task and Events, communication with the vSphere Client and vCenter Server vpxa agent, and SDK connections.)

From these logs one thing was clear. The ESXi RAMdisk hosting the root filesystem had run out of inodes. Possibly caused by the SFCB service. Because of this the root filesystem had run out of space and everything was failing. Great!

In Linux I am used to the df command to check filesystem usage. But in ESXi df only seems to give info on the mounted filesystems whereas vdf gives the local filesystems (like RAMdisks and Tardisks (whatever that is)).

The vdf output after the reboot looked fine. To check the inode usage use the stat command.
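
From memory the two commands are along these lines (the -h and -f switches are how I recall them):

    vdf -h       # ramdisk/ tardisk usage with human readable sizes
    stat -f /    # filesystem stats for the root ramdisk, including total and free inodes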

Or use esxcli. It gives you the free space as well as the inode count!
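
Something like this, if I remember the namespace right:

    esxcli system visorfs ramdisk list    # each ramdisk with its size, used space, and inode counts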

Note to self: Make a habit of using the esxcli command as that seems to be the VMware preferred way of doing things. Plus it’s one command with various namespaces you can use for networking and other info.

In my case things look to be fine now.

KB 2037798 talks about this problem. Apparently it is fixed via a patch released in 2013, and as far as I can tell we are properly patched so we shouldn’t have been hit by this issue. If it happens again though the same KB article talks about creating a separate RAMdisk for SFCB so even if it eats up all the inodes your root file system isn’t affected. This involves creating a new RAMdisk at boot time by modifying rc.local (nice!). The esxcli command can be used to create a new ramdisk and mount it at the mount point required by SFCB:
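
I haven’t had to do it, but going by the KB the command placed in /etc/rc.local.d/local.sh would be along these lines – the sizes, permissions and target path below are illustrative, check the KB for the exact values:

    esxcli system visorfs ramdisk add --name=sfcb --min-size=0 --max-size=250 --permissions=0755 --target=/var/run/sfcb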

Turns out such an issue can also occur because of SNMP. Or if you have an HP Gen8 blade server then coz of the hpHelper.log file, which is fixed via a patch from HP (this server was a Gen8 blade but it didn’t have this log file). KB 2040707 too talks about this. Didn’t help much in my case as that didn’t seem to be my issue.

Two useful links for future reference are:

That’s all for now.

p.s. I keep talking about SFCB above but have no idea what it is. Turns out it is the CIM server for ESXi. Found this blog post on it. 

vMotion fails at 14% – ESX hosts failed to connect over the VMotion network

On a newly built host today vMotion was failing while migrating VMs to this host. vMotion would get stuck at 14% and then fail with the above error. Found this VMware KB article – it was informative but didn’t help. From this article I learnt though that I can use vmkping with the -I switch to specify an interface to use while pinging. This is handy when you want to ping a remote address via a specific interface – say, the vMotion IP address of a remote host, via the vMotion VMKernel of this host. Usually vmkping automatically selects an interface on the network you are trying to ping but it’s possible you are using the same subnet for vMotion and many other services.
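
For example (the interface name and IP below are made up – use your local vMotion vmk and the remote host’s vMotion IP):

    vmkping -I vmk2 10.10.20.12    # ping the remote vMotion IP via this host's vMotion VMkernel interface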

Anyhow, in my case I noticed that if I removed one of the underlying physical adapters I was able to vmkping. So add that to the list of things to try if you too are in a similar situation. Odd that it failed though! I would have thought a failed physical adapter means it will just try a different one? Clearly in my case the other adapter was working.

I don’t know more details but it could be that the physical NIC was up but the switch was blocking? Not sure.

ESXi 6 seems to have a hard minimum requirement of 4GB RAM

With 2GB RAM the boot up process hangs at “User loaded successfully”

With 2.5GB RAM the boot up process goes past the above but throws many errors and eventually gives a screen with no text like below.

You can log in to the system by pressing F2 but all the network configuration options are grayed out. This is the case even if you boot up with 4GB RAM, configure the network, then reduce RAM and reboot. The options are grayed out and you can’t ping the host. Trying to start the management services manually (enable ESXi Shell, press Alt+F1, type /sbin/services.sh restart) doesn’t work either.

With 3GB of RAM the boot up process hangs a while at “Running rabbitmqproxy start” but then pulls through. No IP address though and I can’t configure anything (even if I configure initially by booting up with 4GB RAM). I tried repairing network settings but that simply fails.

Didn’t try with 3.5GB RAM!

With 4GB of RAM the boot up process goes normally. In my case since I was playing with the host by increasing RAM in 500MB increments the network was still bust after booting with 4GB RAM. But I was able to restore the network and get it working.

So it looks like 4GB of RAM is more of a hard minimum requirement for ESXi 6. That sucks! I am able to install ESXi 5.5 with 4GB RAM and then decrease it to 2GB post-install with no issues. In my case all these hosts run in a laptop anyway so I just need them up and running with the bare minimum. Guess I won’t be able to do that with ESXi 6 (unless someone has a workaround, I didn’t search much on this topic – just noticed today and played around a bit).

Notes on vSphere High Availability (HA)

Just some notes on vSphere HA as I read along on that. Nothing new here …

Starting with vSphere 5.0 HA has a Master/ Slave model. One ESXi host is elected as a Master, the rest are Slaves. The Master is the one with the largest number of datastores connected to it; if all ESXi hosts have the same number of datastores connected to them, the Master is the one with the largest Managed Object ID (MOID). Note that the MOID is interpreted lexically – so an MOID 99 is larger than 100. PowerCLI can be used to view the MOIDs:
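
Something like this (the MOID is the part at the end of the Id property):

    Get-VMHost | Select-Object Name, Id    # Id looks like HostSystem-host-1234; host-1234 is the MOID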

Also, the MOID is a vCenter specific construct. Whenever a host, VM, datastore, etc is added to vCenter it is assigned an MOID. For instance here are the MOIDs of my datastores:
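
Along the same lines:

    Get-Datastore | Select-Object Name, Id    # e.g. Datastore-datastore-401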

Although I haven’t used this it’s also possible to find MOIDs via the vSphere Managed Object Browser. See this KB article for more info.

Back to the topic – the above is how a Master is elected. There’s only one Master per cluster. When it comes to HA, the Fault Domain Manager (FDM) on this Master is responsible for most of the tasks (which is why even if vCenter is down for a while HA can continue working). vCenter checks with the Master and the Master communicates with vCenter to keep each other abreast of the cluster situation.

  • FDM is installed at /opt/vmware/fdm/fdm/
  • FDM config files are at /etc/opt/vmware/fdm/

The Master monitors the Slave hosts and if a Slave goes down/ is unreachable the Master is responsible for starting these Protected VMs elsewhere. The Master is also responsible for keeping the Slaves abreast of the cluster configuration.

Slaves are limited to monitoring the VMs running on them. Slaves monitor the VM health and if a Protected VM powers down they inform the Master so it can be restarted. (Note on Protected VMs: once you enable VM monitoring on a cluster or set a VM as Protected, the VM must be powered off and powered on to be protected). Slaves also keep in touch with each other and if they find the Master is down they conduct an election to select a new Master.

The only time vCenter communicates with Slaves is when a new Master needs to be elected or when the Master reports a Slave as missing and so vCenter tries to contact it.

Slaves send network heartbeats to the Master every second. When a Master stops receiving heartbeats from a Slave it knows it is offline or partitioned/ isolated. Similarly when a Slave stops receiving heartbeats from a Master it knows the Master is offline or partitioned/ isolated.

  • If a Slave is cut off from all other hosts (Master and Slaves) it is considered isolated (caveat: you can also specify up to 10 isolation IP addresses to ping – if these are reachable but the Master and Slaves are not, the Slave does not consider itself isolated, only partitioned).
  • If a Slave is cut off from the Master but not from all of the Slaves (i.e. it still has contact with some Slaves) then it is considered partitioned.

In the past if a Slave were isolated/ partitioned the Master would consider it as offline and restart its Protected VMs elsewhere. Starting with vSphere 5.0 the Master also sends a ping (ICMP packet) to the Slave to see if it responds and uses datastore heartbeats to verify the Slave is really down. It could be that the Management network is down but the VM and storage networks are up, so the VMs are still functioning as expected.

Datastore heartbeats work thus (and remember they are only used in case of isolation/ partition scenarios):

  • When enabling HA for a cluster, a datastore is automatically selected (or can be selected manually by the user) to be used for datastore heartbeats.
  • On this datastore a folder called .vSphere-HA is created within which a sub-folder of name FDM-<Fault Domain ID>-<vCenter Server Name> is created. (Such a name allows the same datastore to be used by multiple clusters).
  • Each host creates a file with its MOID name in this sub-folder.
  • This is the host-X-hb file, created by each host (you can check the /var/log/fdm.log file on each host to see it creating this file). When a Slave does not get heartbeats from a Master it updates its own file (and also checks the timestamp of the Master’s file – if that has updates it means the Master is alive). Similarly, when a Master does not hear from a Slave it checks the Slave’s file to see if there are updates. This is how datastore heartbeats work.
  • If a Slave is network partitioned – i.e. it cannot contact the Master – but can see some of the other Slaves, the Master and Slave can conclude that each other is still alive from the datastore heartbeats as above.
    • If the Master is down – i.e. the Slaves think they are partitioned because actually the Master is down – they can now elect a new Master since there are no datastore heartbeats from the Master.
    • If the Slave is down – i.e. the Master is not getting any datastore heartbeats from the Slave – then it restarts the Protected VMs on other hosts. (If the Slave were actually up but had lost network access to the datastore and so cannot update heartbeats, it is as good as down because the VMs have probably crashed by now).
  • If a Slave is network isolated – i.e. it cannot contact the Master or any other Slave (nor can it ping the isolation addresses) – then the Slave adds a special bit to its host-X-poweron file in the same sub-folder. This tells the Master that the Slave is network isolated.
    • The Master then locks the file called protectedlist. This is a list of all Protected VMs. Once the Master has locked this file, the Slave knows the Master has taken responsibility for the Protected VMs and the Slave can leave these powered on, shut down, or power off (depending on which of these is selected as the host isolation response when setting up HA).
    • The protectedlist file thus ensures that unless another host has taken over these VMs the current host will not shut down/ power off these.

Two advanced options to keep in mind:

  • I mentioned this earlier: das.isolationAddress[0-9] allow one to specify up to 10 isolation IP addresses to check before a host considers itself isolated.
  • And das.allowNetwork[0-9] allow one to specify up to 10 port groups to use for HA. See this KB article for examples.
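
Both can be set from the cluster’s HA advanced options in the client, or via PowerCLI – a sketch (cluster name and values are made up, and I believe the type to pass is ClusterHA):

    $cluster = Get-Cluster "Production"
    New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name "das.isolationaddress0" -Value "10.50.0.1" -Force -Confirm:$false
    New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name "das.allownetwork0" -Value "Management Network" -Force -Confirm:$false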

Lastly, I haven’t read it fully but this HA Deepdive is a great resource.

vSphere client does not depend on the Inventory Service

On numerous occasions I have noticed my vSphere client always has the correct inventory of objects in vCenter whereas the vSphere web client tends to lag behind. While reading Mastering VMware vSphere 5.5 (a great book if you want to really understand how all this works!) I learnt that that’s because the web client depends on the vCenter Inventory Service as a cache between the web client and vCenter whereas the regular client talks to it directly.

The vCenter Inventory Service does two things – one, it caches inventory objects from vCenter so that each time the web client needs something it doesn’t have to ask vCenter (thus reducing load on vCenter); two, it allows for tags. Having the Inventory Service allows for more web client sessions with less load on the vCenter server.

This also means it’s a good idea to place the Inventory Service with the web client, not with the vCenter server.

vCenter and vSphere editions (5.5)

vCenter editions. Just three.

  • Essentials
  • Foundation
  • Standard

Standard is what you usually want. No limits or restrictions.

Essentials is only available when purchased as part of vSphere Essentials or vSphere Essentials Plus kits. Not sold separately. These kits are targeted for SMBs. Limited to 3 hosts of 2 CPUs each. Self-contained – cannot be used with other editions.

Foundation is also for 3 hosts only.

All editions of vCenter include the Management service, SSO, Inventory service, Orchestrator, Web client – everything. There’s no difference in the components included in each edition.

vSphere is the suite. There are three plus two editions of the vSphere suite.

Two editions are the kits:

  • Essentials
  • Essentials Plus

Three editions are bundled with vCenter Operations Manager:

  • Standard
  • Enterprise
  • Enterprise Plus

The Essentials & Essentials Plus editions only work with vCenter Essentials. The Standard, Enterprise, and Enterprise Plus work with vCenter Foundation or Standard.

Essentials is pretty basic. Remember it is for 3 hosts of 2 CPUs each. Standalone. In addition you don’t get features like vMotion either. All you get is (1) Thin Provisioning, (2) Update Manager, and (3) vStorage APIs for Data Protection (VADP). Note the latter is only APIs. It is not VMware solution vSphere Data Protection (VDP). Also, no VSAN.

Essentials Plus is a bit more than basic. Once again, only for 3 hosts of 2 CPUs each. Standalone. However, in addition to the three features above you also get (4) vSphere Data Protection, (5) High Availability (HA), (6) vMotion, and (7) vSphere Replication. So you get some useful features. In fact, if I had just 3 hosts and I am unlikely to expand further this is the option I would go for – for me vMotion is very useful and so is HA. Sadly, no Distributed Resource Scheduling (DRS). But you do get VSAN.

Moving on to the big boys …

Standard gives you all the above plus useful features like (8) Storage vMotion, (9) Fault Tolerance, and some more (Hot Add & vShield Endpoint). Still no DRS.

Enterprise gives you all the above plus (10) Storage APIs for Array Integration (nice! but useful only in an Enterprise context where you are likely to have a SAN array and need something like this), (11) DRS, (12) DPM, and (13) Storage APIs for Multi-pathing. As expected, features that are more useful when you have a lot of hosts and are in an Enterprise-y setup. Except DRS :) which would have been nice to have in Standard/ Essentials Plus too.

Finally, Enterprise Plus. All the above plus (14) Distributed Switches, (15) Host Profiles, (16) Auto Deploy, (17) Storage DRS – four of my favorite features – and a bunch of others like App HA, Storage IO Control, Network IO Control, etc.

vCenter – Cannot load the users for the selected domain

I spent the better part of today evening trying to sort this issue. But didn’t get anywhere. I don’t want to forget the stuff I learnt while troubleshooting so here’s a blog post.

Today evening I added one of my ESXi hosts to my domain. The other two wouldn’t add, until I discovered that the time on those two hosts was out of sync. I spent some time trying to troubleshoot that but didn’t get anywhere. The NTP client on these hosts was running, the ports were open, the DC (which was also the forest PDC and hence the time keeper) was reachable – but time was still out of sync.

Found an informative VMware KB article. The ntpq command (short for “NTP query”) can be used to see the status of the NTP daemon on the client side. Like thus:
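
On the host itself:

    ntpq -p    # list the peers and their state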

The command has an interactive mode (which you get into if you run it without any switches; read the manpage for more info). The -p switch tells ntpq to output a list of peers and their state. The KB article above suggests running this command every 2 seconds using the watch command but you don’t really need to do that.

Important points about the output of this command:

  • If it says “No association ID's returned” it means the ESXi host cannot reach the NTP server. Considering I didn’t get that, it means I have no connectivity issue.
  • If it says “***Request timed out” it means the response from the NTP server didn’t get through. That’s not my problem either.
  • If there’s an asterisk before the remote server name (like so) it means there is a huge gap between the time on the host and the time given by the NTP server. Because of the huge gap NTP is not changing the time (to avoid any issues caused by a sudden jump in the OS time). Manually restarting the NTP daemon (/etc/init.d/ntpd restart) should sort it out.
    • The output above doesn’t show it but one of my problem hosts had an asterisk. Restarting the daemon didn’t help.

The refid field shows the time stream to which the client is syncing. For instance here’s the w32tm output from my domain:
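
From a domain-joined Windows machine:

    w32tm /monitor    # lists each DC with its RefID, stratum and offset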

Notice the PDC has a refid of LOCL (indicating it is its own time source) while the rest have a refid of the PDC name. My ESXi host has a refid of .INIT. which means it has not received any response from the NTP server (shouldn’t the error message have been something else!?). So that’s the problem in my case.

Obviously the PDC is working because all my Windows machines are keeping correct time from it. So is vCenter. But some of my ESXi hosts aren’t.

I have no idea what’s wrong. After some troubleshooting I left it because that’s when I discovered my domain had some inconsistencies. Fixing those took a while, after which I hit upon a new problem – vCenter clients wouldn’t show me vCenter or any hosts when I login with my domain accounts. Everything appears as expected under the administrator@vsphere.local account but the domain accounts return a blank.

While double-checking that the domain admin accounts still have permissions to vCenter and SSO I came across the following error:

Cannot load the users

Great! (The message is “Cannot load the users for the selected domain“).

I am using the vCenter appliance. Digging through the /var/log/messages on this I found the following entries:

Searched Google a bit but couldn’t find any resolutions. Many blog posts suggested removing vCenter from the domain and re-adding but that didn’t help. Some blog posts (and a VMware KB article) talk about ensuring reverse PTR records exist for the DCs – they do in my case. So I am drawing a blank here.

Odd thing is the appliance is correctly connected to the domain and can read the DCs and get a list of users. The appliance uses Likewise (now called PowerBroker Open) to join itself to the domain and authenticate with it. The /opt/likewise/bin directory has a bunch of commands which I used to verify domain connectivity:
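
For instance (from memory – lw-enum-users is the one I used the most):

    cd /opt/likewise/bin
    ./domainjoin-cli query    # shows the domain the appliance is joined to
    ./lw-enum-users           # enumerates the users the appliance can see in the domain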

All looks well! In fact, I added a user to my domain and re-ran the lw-enum-users command, and it correctly picked up the new user. So the appliance can definitely see my domain and get a list of users from it. The problem appears to be in the upper layers.

In /var/log/vmware/sso/ssoAdminServer.log I found the following each time I’d query the domain for users via the SSO section in the web client:

Makes no sense to me but the problem looks to be in Java/ SSO.

I tried removing AD from the list of identity sources in SSO (in the web client) and re-added it. No luck.

Tried re-adding AD but this time I used an SPN account instead of the machine account. No luck!

Finally I tried adding AD as an LDAP Server just to see if I can get it working somehow – and that clicked! :)

AD as LDAP

So while I didn’t really solve the problem I managed to work around it …

Update: Added the rest of my DCs as time sources to the ESXi hosts and restarted the ntpd service. Maybe that helped, now NTP is working on the hosts.

 

Upgrading iLO firmware manually (working around a stuck HP logo screen when updating)

Past two weeks I have been upgrading the iLO and ROM of all our servers (a bunch of HP DL 360s basically – Gen6 to Gen8) following which I upgrade them from ESXi 4.1 to 5.5. Side by side I have also been upgrading the iLO and ROM of our LeftHand/ StoreVirtual boxes following which I upgrade them from LeftHand OS 8.5 to 12.0. Yes, I’ve been busy!

Interesting thing about the firmware upgrades is that even between servers of the same model, when upgrading with the same Service Pack for Proliant (SPP) CD version, I get different errors. Some odd ones really. For instance some servers simply power off once the SPP CD boots, others give me a Pink Screen of Death, and yet others simply hang with the pulsating HP logo.

Pink Screen

Pulsating Logo

I couldn’t find any solutions for the servers that power off (I used SPP versions 2015.04, 2014.09, 2014.06 and 2013.02 – same results for all). I was able to work around the pink screen by using an older version (for instance, I was using 2015.04 and that failed but 2014.09 worked). And I sort of worked around the pulsating logo problem.

For the pulsating logo issue apparently the fix is to upgrade iLO first and then run the SPP. In my case the servers had really ancient versions of iLO – “1.87 06/03/2009” – so I upgraded them via the iLO webpage. The blog post I linked to before (and also this one) shows a way of updating iLO via SSH but that didn’t work for me sadly. (Could just be the web server I was running. I used TinyWeb to run a small web server off my desktop machine).

Before upgrading iLO via SSH or the webpage, you need to get the iLO firmware first. That should be easy but I had trouble getting it. For anyone else looking for the latest and greatest version of iLO 2 this HP page is what you want (and the “Revision History” tab on that page gives you older versions too). That page lets you download versions of the firmware for flashing via Linux or Windows. I downloaded the Windows version, right-clicked on it (it’s an EXE file) with 7-Zip (any other zip tool should do), and extracted the contents. The result is a file with a name like “ilo2_225.bin”. This is the binary image of the iLO 2 firmware that you can flash via SSH or the webpage.

Flashing via the webpage is easy. Go to the “Administration” tab, click Browse to select this file, and click “Send firmware image”.

Use a modern browser if you can. :-) I used the ancient version of IE on my server and that didn’t do anything, but when I used Firefox I was able to see a progress bar and the firmware actually got updated.

After doing this I was able to run the SPP without any issue.

Another thing I learnt is that for the LeftHand/ StoreVirtual servers, simply upgrading the OS or patching it is enough to upgrade the ROM too. So I could have saved some time for myself with the LeftHand/ StoreVirtual servers by updating the iLO (as above) and upgrading the OS. No need to run the SPP.

On a related note, I had some servers with an “Internal Health LED failed” error even though everything seemed to be alright with them. Upgrading the iLO sorted that out!

And while on the topic of iLO I had some servers whose iLO was not responsive. I couldn’t ping the iLO IP address nor could I connect to it. I was able to fix some of those servers by completely powering off the server, removing the power cables, removing the iLO cable, waiting a few minutes, putting back the power cable and powering on the server, and once it has loaded the OS put in the iLO cable. (I have also read reports on the Internet where there was no need to remove/ re-insert the iLO cable so YMMV).

One server though had no luck – its iLO chip was faulty I guess. I tried to upgrade its iLO firmware and ROM by physically being in front of the server but it would hang at the pulsating logo as above. I think the faulty iLO was causing SPP to fail. Because of the faulty iLO though, ESXi would hang at “loading module ipmi_si_drv” for about 30 minutes each time it would boot (or when I’d run the installer to upgrade to 5.5). The solution is as detailed in this blog post. (Note: the argument is noipmiEnabled – I was mistakenly typing noipmiEnable the first few times and nothing happened). Post-install I configured the VMkernel.Boot.ipmiEnabled advanced configuration option to 0 (I unchecked it). This way I don’t have to enter the boot options each time.
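
(If you’d rather do it from the command line, I believe the equivalent kernel setting can be toggled via esxcli – the setting name below is how I recall it, so double-check before relying on it:)

    esxcli system settings kernel set --setting=ipmiEnabled --value=FALSE
    esxcli system settings kernel list --option=ipmiEnabled    # verify the current value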

That’s all!

VMware/ PowerCLI – find disks used by a template

It’s easy finding the disks used by a VM. Just check its settings via the client or use PowerCLI.

Can’t do the same for a template though as the Get-Template output has no similar property.

But then I came across Get-HardDisk:
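
For a template (the template name is a placeholder):

    Get-Template "Win2012R2-Template" | Get-HardDisk | Select-Object Name, CapacityGB, Filename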

Sweet! Same command works for templates (as above) and VMs:
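
Same thing for a VM (again, the name is made up):

    Get-VM "mds01" | Get-HardDisk | Select-Object Name, CapacityGB, Filename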

That’s all. Hope this helps someone.

Enable & Disable SSH on ESXi host via PowerCLI

I alluded to this in another post but couldn’t find it when I was searching my posts for the cmdlet. So here’s a separate post.

The Get-VMHostService cmdlet is your friend when dealing with services on ESXi hosts. You can use it to view the services thus:
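
Something like this (host name made up):

    Get-VMHost "esx01.mydomain.local" | Get-VMHostService | Select-Object Key, Label, Policy, Running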

To start and stop services we use the Start-VMHostService and Stop-VMHostService cmdlets, but these take (an array of) HostService objects – which is what the Get-VMHostService cmdlet above returns. Here’s how you stop the SSH & ESXi Shell services, for instance:
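
The SSH service has a Key of TSM-SSH and the ESXi Shell a Key of TSM, so (host name made up):

    Get-VMHost "esx01.mydomain.local" | Get-VMHostService |
        Where-Object { $_.Key -eq "TSM" -or $_.Key -eq "TSM-SSH" } |
        Stop-VMHostService -Confirm:$false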

Since the cmdlet takes an array, you can give it HostService objects of multiple hosts. Here’s how I start SSH & ESXi Shell for all hosts:
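
Drop the host filter and every host’s services go down the pipeline:

    Get-VMHost | Get-VMHostService |
        Where-Object { $_.Key -eq "TSM" -or $_.Key -eq "TSM-SSH" } |
        Start-VMHostService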

As an aside here’s a nice post on six different ways to enable SSH on a host. Good one!

Misc ESXi/ vSphere stuff

Just some notes to myself so I can refer to this later.

  • You can only have a maximum of 256 VMFS datastores per ESXi host. (This is one reason why you wouldn’t want to create a LUN/ datastore per VM. Wouldn’t work if you have a lot of VMs!)
    • Other maximums (for vSphere 5.5) can be found at this link.
  • When you create a distributed switch port group there are 3 port binding options:
    • Static Binding (the default): VM NICs are connected to the port group at VM creation and remain so until the VM is removed from the port group. Powering off a VM or disconnecting the NIC from the port group does not remove it from the port group – the port is still kept aside for the VM. What this means is that once you connect a VM to a port it stays with that forever.
      • Since the ports are assigned at VM creation, even if vCenter is down when the VM later powers on/ connects to the port group, it will continue to have network connectivity. (Note the emphasis on “later”. If the VM were already running and vCenter were to go down network traffic isn’t affected in either of the binding options).
    • Dynamic Binding (deprecated): VM NICs are connected to the port group only when the VM is powered on and its NIC connected to the port group. Power off the VM or disconnect the NIC and it is no longer connected to the same port when it comes back on or is reconnected.
      • Since the port binding happens only when the VM is powered on or connected, and the port group resides with vCenter, what this means is that you can only power on / off such VMs via vCenter. If vCenter is off / unreachable when the VM powers on / connects, it will not have network connectivity as it won’t have a port in the port group. (As above, note that this doesn’t affect VMs that are already running).
      • Dynamic Binding is deprecated but is useful when the number of VMs is larger than the number of ports in the port group and not all VMs will be on / connected at the same time.
    • Ephemeral Binding: Similar to Dynamic Binding, VM NICs are connected to the port group only when the VM is powered on and its NIC connected to the port group. Powering off the VM or disconnecting it results in the port being removed from the port group.
      • Although Dynamic and Ephemeral Bindings seem similar, they don’t have similar limitations. Thus while VMs with Dynamic Binding port groups won’t have network connectivity if they are powered on / connected when vCenter is off / unreachable, VMs with Ephemeral Binding have no such limitation. They don’t get a proper port number from the port group, but get a temporary one like h-1 which changes to a proper port number whenever connectivity with vCenter is restored.
      • Below screenshot shows the port numbers of three VMs, each connected to a port group of different binding (Ephemeral, Dynamic, Static from top to bottom) and powered on when the vCenter was unreachable.
      • Although the NIC is unable to get a port – like Dynamic Binding – with an Ephemeral Binding port group the host creates a fake port and connects the VM anyway. 
      • I don’t understand why Dynamic Binding even exists as an option – unless it’s for backward compatibility? Ephemeral Binding seems to have the advantage of Dynamic Binding – ports are created at VM connection / powering on and so you can oversubscribe to a port group – but doesn’t have the disadvantage of lost connectivity when vCenter is off / unreachable. (I assume Ephemeral port groups too can be used for over subscribing, though the official KB articles don’t say anything like this so I could be wrong).
      • Dynamically creating / removing ports from the port group is an expensive operation so Dynamic and Ephemeral Binding port groups have a performance overhead. Static Binding is the preferred one.
      • Also, Ephemeral Binding port groups lose their history and security controls across host reboots. Apparently Dynamic Binding port groups don’t do this as I don’t see any mention of this as a Dynamic Binding limitation anywhere.

That’s all for now!

 

Get ESXi host network info using PowerShell/ PowerCLI

Not an exhaustive post, I am still exploring this stuff.

To get a list of network adapters on a host:
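
(Host name below is a placeholder.)

    Get-VMHostNetworkAdapter -VMHost "esx01.mydomain.local"    # physical vmnics and VMkernel vmk interfaces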

To get a list of virtual switches on a host, with the NICs assigned to these:
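
The Nic property holds the physical NICs backing each vSwitch:

    Get-VirtualSwitch -VMHost "esx01.mydomain.local" | Select-Object Name, Nic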

To get a list of port groups on a host:
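
Something like (the VirtualSwitchName property name is from memory):

    Get-VirtualPortGroup -VMHost "esx01.mydomain.local" | Select-Object Name, VirtualSwitchName, VLanId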

To get a list of port groups, the virtual switches they are mapped to, and the NICs that make up these switches:
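
A sketch that loops over the vSwitches and pulls their port groups and NICs together:

    foreach ($vs in Get-VirtualSwitch -VMHost "esx01.mydomain.local") {
        Get-VirtualPortGroup -VirtualSwitch $vs |
            Select-Object Name, @{N="vSwitch"; E={ $vs.Name }}, @{N="Nic"; E={ $vs.Nic }}
    }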

This essentially combines the first and third cmdlets above.

More later!

ESXi 4.0 “unsupported” mode

At work we still use some ESXi 4.0 hosts so this one’s a reminder to myself as ESXi Shell access works slightly differently with that one.

On ESXi 4.0 once we are on the DCUI screen, pressing Alt+F1 gives access to a different (hidden) console. Whatever you type here seems to have no effect, but if you type the word unsupported and press Enter, you will be prompted to enter the root password and enter the Tech Support Mode (TSM). For screenshots and such of this check out this blog post.

On ESXi 4.1 and above you can enable this via the DCUI. See this KB article for the deets.

On that note here’s a good blog post detailing various ways of enabling SSH access on an ESXi host. Informative.

Number of IPv4 routes did not match

Was creating / migrating some ESXi hosts during the week and came across the above error “Number of IPv4 routes did not match” when checking for host profile compliance of one of the hosts. All network settings of this host appeared to be the same as the rest so I was stumped as to what could be wrong. Via a VMware KB article I came across the esxcfg-route command that helped identify the problem. To run this command, SSH into the host.

By default the command only outputs the default gateway but you can pass it the -l switch to list all routes:
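
That is:

    esxcfg-route       # shows just the default gateway
    esxcfg-route -l    # lists all routes: network, netmask, gateway and interface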

In my case, comparing this output between a compliant host and the non-compliant host showed that the vmk2 interface on the latter had the wrong network. Not sure how that happened. Oddly the GUI didn’t show this incorrect network but obviously something was corrupt somewhere.

To fix that I thought I’ll remove the vmk2 interface and re-add it. Big mistake! Possibly because its network was the same as that of the management network (10.50.0.0/24), removing this interface caused the host to lose connectivity from vCenter. I could ping it but couldn’t connect to it via SSH, vSphere Client, or vCenter. Finally I had to reset the network via the DCUI – it’s under “Network Restore Options”. I tried “Restore vDS” first, didn’t help, so did a “Restore Standard Switch”. This is very useful – it creates a new standard switch and moves the Management Network onto that so you get connectivity to the host. This way I was able to reconnect to the host, but now I stumbled upon a new problem.

The host didn’t have the vmk2 interface any more but when I tried to recreate it I got an error that the interface already exists. But no, it does not – the GUI has no trace of it! Some forum posts suggested restarting the vCenter service as that clears its cache and puts it in sync with the hosts but that didn’t help either. Then I came across this post which showed me that it is possible for the host to still have the VMkernel port but vCenter to not know of it. For this the esxcli command is your friend. To list all VMkernel ports on a host do the following:
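
(This lists every vmk interface the host itself knows about, whatever vCenter thinks:)

    esxcli network ip interface list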

After that, removing the VMkernel interface can be done by a variant of the same command:
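
In my case:

    esxcli network ip interface remove --interface-name=vmk2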

Now I could re-add the interface via vSphere and get the hosts into compliance.

Before I conclude this post though, a few notes on the commands above.

If you have PowerCLI installed you can run all the esxcli commands via the Get-EsxCli cmdlet. For example:
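
(Host name made up; the object exposes the esxcli namespaces as properties:)

    $esxcli = Get-EsxCli -VMHost "esx01.mydomain.local"
    $esxcli.network.ip.interface.list()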

If I wanted to remove the interface via PowerCLI the command would be slightly different:
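
From memory the remove() method just takes the interface name:

    $esxcli.network.ip.interface.remove("vmk2")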

I would have written more on the esxcli command itself but this excellent blog post covers it all. It’s an all powerful command that can be used to manage many aspects of the ESXi host, even set it in maintenance mode!

Heck you can even use esxcli to upgrade from one ESXi version to another. It is also possible to run the esxcli command from a remote computer (Windows or Linux) by installing the vSphere CLI tools on that computer. Additionally, there’s also the vSphere Management Assistant (VMA) which is a virtual appliance that offers command line tools.

The esxcli is also useful if you want to kill a VM. For instance the following lists all running VMs on a host:
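
Namely:

    esxcli vm process list    # shows each running VM with its World ID, display name and config file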

If that VM were stuck for some reason and cannot be stopped or restarted via vSphere it’s very useful to know the esxcli command can be used to kill the VM (has happened a couple of times to me in the past):
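
(The World ID below is made up – take it from the list output above:)

    esxcli vm process kill --type=soft --world-id=123456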

Regarding the type of killing you can do:

There are three types of VM kills that can be attempted: [soft, hard, force]. Users should always attempt ‘soft’ kills first, which will give the VMX process a chance to shutdown cleanly (like kill or kill -SIGTERM). If that does not work move to ‘hard’ kills which will shutdown the process immediately (like kill -9 or kill -SIGKILL). ‘force’ should be used as a last resort attempt to kill the VM. If all three fail then a reboot is required. (required)

Another command line option is vim-cmd which I stumbled upon from one of the links above. I haven’t used it much so as a reference to myself here’s a blog post explaining it in detail.

Lastly there’s also a bunch of esxcfg-* commands, one of whom we came across above.

I haven’t used these much. They seem to be present for compatibility reasons with ESXi 3.x and prior. There are also remote equivalents with a vicfg- prefix (part of the vSphere CLI tools). For instance, esxcfg-vmknic is now replaced with esxcli network ip interface as we saw above.

That’s all for now!

Update: Thought I’d use this post to keep track of other useful commands.

To get IPv4 addresses details:
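
The command:

    esxcli network ip interface ipv4 get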

Replace ipv4 with ipv6 if that’s what you want.

To set an IPv4 address:
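
A sketch (interface, address and mask are placeholders; the switch names are how I recall them):

    esxcli network ip interface ipv4 set --interface-name=vmk0 --ipv4=10.50.0.10 --netmask=255.255.255.0 --type=static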

To ping an address from the host:
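
Either the old familiar vmkping or, if I remember the namespace right, the esxcli equivalent:

    vmkping 10.50.0.1
    esxcli network diag ping --host=10.50.0.1 --count=3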

Change keyboard layout – get the current layout, list the available layouts, and set a new one:
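
All three live under one namespace, if I remember right (the layout name below is just an example):

    esxcli system settings keyboard layout get
    esxcli system settings keyboard layout list
    esxcli system settings keyboard layout set --layout="United Kingdom"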

Remotely enable SSH

The esxcli commands are cool but you need to enable SSH each time you want to connect to the host and run these (unless you install the CLI tools on your machine). If you have PowerCLI though you can enable SSH remotely.

To list the services:

To enable SSH and the ESXi shell:
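
Same cmdlets as in the earlier post – list first, then start the two services (host name made up):

    Get-VMHost "esx01.mydomain.local" | Get-VMHostService
    Get-VMHost "esx01.mydomain.local" | Get-VMHostService |
        Where-Object { $_.Key -eq "TSM" -or $_.Key -eq "TSM-SSH" } |
        Start-VMHostService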

 

Load balancing in vCenter and ESXi

One of the things you can do with a portgroup is define teaming for the underlying physical NICs.

teaming

If you don’t do anything here, the default setting of “Route based on originating virtual port” applies. What this does is quite obvious. Each virtual port on the virtual switch is mapped to a physical NIC behind the scenes; so all traffic to & from that virtual port goes & comes via that physical NIC. Since your virtual NIC connects to a virtual port this is equivalent to saying all traffic for that virtual NIC happens via a particular physical NIC.

In the screenshot above, for instance, I have two physical NICs dvUplink1 and dvUplink2. If I left teaming at the default setting and say I had 4 VMs connecting to 4 virtual ports, chances are two of these VMs will use dvUplink1 and two will use dvUplink2. They will continue using these mappings until one of the dvUplinks dies, in which case the other will take over – so that’s how you get failover.

This is pretty straightforward and easy to set up. And the only disadvantage, if any, is that you are limited to the bandwidth of a single physical NIC. If each of dvUplink1 & dvUplink2 were 1Gb NICs it isn’t as though the underlying VMs had 2Gb (2 NICs x 1Gb each) available to them. Since each VM is mapped to one uplink, 1Gb is all they get.

Moreover, if say two VMs were mapped to an uplink, and one of them was hogging up all the bandwidth of this uplink while the remaining uplink was relatively free, the other VM on this uplink won’t automatically be mapped to the free uplink to make better use of resources. So that’s a bummer too.

A neat thing about “Route based on originating virtual port” is that the virtual port is fixed for the lifetime of the virtual machine so the host doesn’t have to calculate which physical NIC to use each time it receives traffic to & from the virtual machine. Only if the virtual machine is powered off, deleted, or moved to a different host does it get a new virtual port.

The other options are:

  • Route based on MAC hash
  • Route based on IP hash
  • Route based on physical NIC load
  • Explicit failover

We’ll ignore the last one for now – that just tells the host to use the first physical NIC in the list for all VMs.

“Route based on MAC hash” is similar to “Route based on originating virtual port” in that it uses the MAC address of the virtual NIC instead of virtual port. I am not very clear on how this is better than the latter. Since the MAC address of a virtual machine is usually constant (unless it is changed or a different virtual NIC used) all traffic from that MAC address will use the same physical NIC always. Moreover, there is the additional overhead in that the host has to check each packet for the MAC address and decide which physical NIC to use. VMware documentation says it provides a more even distribution of traffic but I am not clear how.

“Route based on physical NIC load” is a good one. It starts off with “Route based on originating virtual port” but if a physical NIC is loaded, then the virtual ports mapped to it are moved to a physical NIC with less load! This load balancing option is only available for distributed switches. Every 30s the distributed switch checks the physical NIC load and if it exceeds 75% then the virtual port of the VM with highest utilization is moved to a different physical NIC. So you have the advantages of “Route based on originating virtual port” with one of its major disadvantages removed.

In fact, except for “Route based on IP hash” none of the other load balancing mechanisms have an option to utilize more than a single physical NIC bandwidth. And “Route based on IP hash” does not do this entirely as you would expect.

“Route based on IP hash”, as the name suggests, does load balancing based on the IP hash of the virtual machine and the remote end it is communicating with. Based on a hash of these two IP addresses all traffic for the communication between these two IPs is sent through one NIC. So if a virtual machine is communicating with two remote servers, it is quite likely that traffic to one server goes through one physical NIC while traffic to the other goes via another physical NIC – thus allowing the virtual machine to use more bandwidth than that of one physical NIC. However – and this is an often overlooked point – all traffic between the virtual server and one remote server is still constrained by the bandwidth of the physical NIC it happens via. Once traffic is mapped to a particular physical NIC, if more bandwidth is required or the physical NIC is loaded, it is not as though an additional physical NIC is used. This is a catch with “Route based on IP hash” that’s worth remembering.

If you select “Route based on IP hash” as a load balancing option you get two warnings:

  • With IP hash load balancing policy, all physical switch ports connected to the active uplinks must be in link aggregation mode.
  • IP hash load balancing should be set for all port groups using the same set of uplinks.

What this means is that unlike the other load balancing schemes where there was no additional configuration required on the physical NICs or the switch(es) they connect to, with “Route based on IP hash” we must combine/ bond/ aggregate the physical NICs as one. There’s a reason for this.

In all the other load balancing options the virtual NIC MAC is associated with one physical NIC (and hence one physical port on the physical switch). So incoming traffic for a VM knows which physical port/ physical NIC to go via. But with “Route based on IP hash” there is no such one to one mapping. This causes havoc with the physical switch. Here’s what happens:

  • Different outgoing traffic flows choose different physical NICs. With each of these packets the physical switch will keep updating its MAC address table with the port the packet came from. So for instance, say the two physical NICs are connected to physical switch Port1 and Port2 and the virtual NIC MAC address is VMAC1. When an outgoing traffic packet goes via the first physical NIC, the switch will update its tables to reflect that VMAC1 is connected to Port1. Subsequent traffic flows might continue using the first physical NIC so all is well. Then say a traffic flow uses the second physical NIC. Now the switch will map VMAC1 to Port2; then a traffic flow could use Port1 so the mapping gets changed to Port1, and then Port2, and so on …
  • When incoming traffic hits the physical switch for MAC address VMAC1, the switch will look up its tables and decide which port to send traffic on. If the current mapping is Port1 traffic will go out via that; if the current mapping is Port2 traffic will go out via that. The important thing to note is that the incoming traffic flow port chosen is not based on the IP hash mapping – it is purely based on whatever physical port the switch currently has mapped for VMAC1.
  • So what’s required is a way of telling the physical switch that the two physical NICs are to be considered as bonded/ aggregated such that traffic from either of those NICs/ ports is to be treated accordingly. And that’s what EtherChannel does. It tells the physical switch that the two ports/ physical NICs are bonded and that it must route incoming traffic to these ports based on an IP hash (which we must tell EtherChannel to use while configuring it).
  • EtherChannel also helps with the MAC address table in that now there can be multiple ports mapped to the same MAC address. Thus in the above example there would now be two mappings VMAC1-Port1 and VMAC1-Port2 instead of them over-writing each other!

“Route based on IP hash” is a complicated load balancing option to implement because of EtherChannel. And as I mentioned above, while it does allow a virtual machine to use more bandwidth than a single physical NIC, an individual traffic flow is still limited to the bandwidth of a single physical NIC. Moreover there is more overhead on the host because it has to calculate the physical NIC used for each traffic flow (essentially each packet).

Prior to vSphere 5.1 only static EtherChannel was supported (unless you use a third party virtual switch such as the Cisco Nexus 1000V). Static EtherChannel means you explicitly bond the physical NICs. But from vSphere 5.1 onwards the inbuilt distributed switch supports LACP (Link Aggregation Control Protocol) which is a way of automatically bonding physical NICs. Enable LACP on both the physical switch and distributed switch and the physical NICs will automatically be bonded.

(To enable LACP on the physical NICs go to the uplink portgroup that these physical NICs are connected to and enable LACP).

That’s it for now!

Update

Came across this blog post which covers pretty much everything I covered above but in much greater detail. A must read!