Contact

Subscribe via Email

Subscribe via RSS

Categories

Creative Commons Attribution 4.0 International License
© Rakhesh Sasidharan

Find the profiles in an offline ESXi update zip file

I use esxcli to manually update our ESXi hosts that don’t have access to VUM (e.g. our DMZ hosts). I do so via command-line:

Usually the VMware page where I download the patch from mentions the profile name, but today I had a patch file and wanted to find the list of profiles it had. 

One way is to open the zip file, then the metadata.zip file in that, and that should contain a list of profiles. Another way is to use esxcli

Screenshot example:

Yay! (VXLAN)

I decided to take a break from my NSX reading and just go ahead and set up a VXLAN in my test lab. Just go with a hunch of what I think the options should be based on what the menus ask me and what I have read so far. Take a leap! :)

*Ahem* The above is actually incorrect, and I am an idiot. A super huge idiot! Each VM is actually just pinging itself and not the other. Unbelievable! And to think that I got all excited thinking I managed to do something without reading the docs etc. The steps below are incomplete. I should just delete this post, but I wrote this much and had a moment of excitement that day … so am just leaving it as it is with this note. 

Above we have two OpenBSD VMs running in my nested EXIi hypervisors. 

  • obsd-01 is running on host 1, which is on network 10.10.3.0/24.
  • obsd-02 is running on host 2, which is on network 10.10.4.0/24. 
  • Note that each host is on a separate L3 network.
  • Each host is in a cluster of its own (doesn’t matter but just mentioning) and they connect to the same VDS.
  • In that VDS there’s a port group for VMs and that’s where obsd-01 and obsd-02 connect to. 
  • Without NSX, since the hosts are on separate networks, the two VMs wouldn’t be able to see each other. 
  • With NSX, I am able to create a VXLAN network on the VDS such that both VMs are now on the same network.
    • I put the VMs on a 192.168.0.0/24 network so that’s my overlay network. 
    • VXLANs are basically port groups within your NSX enhanced VDS. The same way you don’t specify IP/ network information on the VMware side when creating a regular portgroup, you don’t do anything when creating the VXLAN portgroup either. All that is within the VMs on the portgroup.
  • A VDS uses VMKernel ports (vmk ports) to carry out the actual traffic. These are virtual ports bound to the physical NICs on an ESXi host, and there can be multiple vmk ports per VDS for various tasks (vMotion, FT, etc). Similar to this we need to create a new vmk port for the host to connect into the VTEP used by the VXLAN. 
    • Unlike regular vmk ports though we don’t create and assign IP addresses manually. Instead we either use DHCP or create an IP pool when configuring the VXLAN for a cluster. (It is possible to specify a static IP either via DHCP reservation or as mentioned in the install guide). 
    • Each cluster uses one VDS for its VXLAN traffic. This can be a pre-existing VDS – there’s nothing special about it just that you point to it when enabling VXLAN on a cluster; and the vmk port is created on this VDS. NSX automatically creates another portgroup, which is where the vmk port is assigned to. 

And that’s where I am so far. After doing this I went through the chapter for configuring VXLAN in the install guide and I was pretty much on the right track. Take a look at that chapter for more screenshots and info. 

Yay, my first VXLAN! :o)

p.s. I went ahead with OpenBSD in my nested environment coz (a) I like OpenBSD (though I have never got to play around much with it); (b) it has a simple & fast install process and I am familiar with it; (c) the ISO file is small, so doesn’t take much space in my ISO library; (d) OpenBSD comes with VMware tools as part of the kernel, so nothing additional to install; (e) I so love that it still has a simple rc based system and none of that systemd stuff that newer Linux distributions have (not that there’s anything wrong with systemd just that I am unfamiliar with it and rc is way simpler for my needs); (f) the base install has manpages for all the commands unlike minimal Linux ISOs that usually seem to skip these; (g) take a look at this memory usage! :o)

p.p.s. Remember to disable the PF firewall via pfctl -d.

Yay again! :o)

Update: Short-lived excitement sadly. A while later the VMs stopped communicating. Turns out VMware Workstation doesn’t support MTU larger than 1500 bytes, and VXLAN requires 1600 byte. So the VTEP interfaces of both ESXi hosts are unable to talk to each other. Bummer!

Update 2: I finally got this working. Turns out I had missed some stuff; and also I had to make some changes to allows VMware Workstation to with larger MTU sizes. I’ll blog this in a later post

Notes to self while installing NSX 6.3 (part 1)

(No sense or order here. These are just notes I took when installing NSX 6.3 in my home lab, while reading this excellent NSX for Newbies series and the NSX 6.3 install guide from VMware (which I find to be quite informative). Splitting these into parts as I have been typing this for a few days).

You can install NSX Manager in VMware Workstation (rather than in the nested ESXi installation if you are doing it in a home lab). You won’t get a chance to configure the IP address, but you can figure it from your DHCP server. Browse to that IP in a browser and login as username “admin” password “default” (no double quotes). 

If you want to add a certificate from your AD CA to NSX Manager create the certificate as usual in Certificate Manager. Then export the generated certificate and your root CA and any intermediate CA certificates as a “Base-64 encoded X.509 (.CER)” file. Then concatenate all these certificates into a single file (basically, open up Notepad and make a new file that has all these certificates in it). Then you can import it into NSX Manager. (More details here).

During the Host Preparation step on an ESXi 5.5 host it failed with the following error: 

“Could not install image profile: ([], “Error in running [‘/etc/init.d/vShield-Stateful-Firewall’, ‘start’, ‘install’]:\nReturn code: 1\nOutput: vShield-Stateful-Firewall is not running\nwatchdog-dfwpktlogs: PID file /var/run/vmware/watchdog-dfwpktlogs.PID does not exist\nwatchdog-dfwpktlogs: Unable to terminate watchdog: No running watchdog process for dfwpktlogs\nFailed to release memory reservation for vsfwd\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nSet memory minlimit for vsfwd to 256MB\nFailed to set memory reservation for vsfwd to 256MB, trying for 256MB\nFailed to set memory reservation for vsfwd to failsafe value of 256MB\nMemory reservation released for vsfwd\nResource pool ‘host/vim/vmvisor/vsfwd’ released.\nResource pool creation failed. Not starting vShield-Stateful-Firewall\n\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update.”)” Error 3/16/2017 5:17:49 AM esx55-01.fqdn

Initially I thought maybe NSX 6.3 wasn’t compatible with ESXi 5.5 or that I was on an older version of ESXi 5.5 – so I Googled around on pre-requisites (ESXi 5.5 seems to be fine) and also updated ESXi 5.5 to the latest version. Then I took a closer look at the error message above and saw the bit about the 256MB memory reservation. My ESXi 5.5 host only had 3GB RAM (I had installed with 4GB and reduced it to 3GB) so I bumped it up to 4GB RAM and tried again. And voila! the install worked. So NSX 6.3 requires an ESXi 5.5 host with minimum 4GB RAM (well maybe 3.5 GB RAM works too – I was too lazy to try!) :o)

If you want, you can browse to “https://<NSX_MANAGER_IP>/bin/vdn/nwfabric.properties” to manually download the VIBs that get installed as part of the Host Preparation. This is in case you want to do a manual install (thought had crossed my mind as part of troubleshooting above).

NSX Manager is your management layer. You install it first and it communicates with vCenter server. A single NSX Manager install is sufficient. There’s one NSX Manager per vCenter. 

The next step after installing NSX Manager is to install NSX Controllers. These are installed in odd numbers to maintain quorum. This is your control plane. Note: No data traffic flows through the controllers. The NSX Controllers perform many roles and each role has a master controller node (if this node fails another one takes its place via election). 

Remember that in NSX the VXLAN is your data plane. NSX supports three control plane modes: multicast, unicast, and hybrid when it comes to BUM (Broadcast, unknown Unicast, and Multicast) traffic. BUM traffic is basically traffic that doesn’t have a specific Layer 3 destination. (More info: [1], [2], [3] … and many on the Internet but these three are what I came across initially via Google searches).

  • In unicast mode a host replicates all BUM traffic to all other hosts on the same VXLAN and also picks a host in every other VXLAN to do the same for hosts in their VXLANs. Thus there’s no dependence on the underlying hardware. There could, however, be increased traffic as the number of VXLANs increase. Note that in the case of unknown unicast the host first checks with the NSX Controller for more info. (That’s the impression I get at least from the [2] post above – I am not entirely clear). 
  • In multicast mode a host depends on the underlying networking hardware to replicate BUM traffic via multicast. All hosts on all VXLAN segments join multicast groups so any BUM traffic can be replicated by the network hardware to this multicast group. Obviously this mode requires hardware support. Note that multicast is used for both Layer 2 and Layer 3 here. 
  • In hybrid mode some of the BUM traffic replication is handed over to the first hop physical switch (so rather than a host sending unicast traffic to all other hosts connected to the same physical switch it relies on the switch to do this) while the rest of the replication is done by the host to hosts in other VXLANs. Note that multicast is used only for Layer 2 here. Also note that as in the unicast mode, in the case of unknown unicast traffic the Controller is consulted first. 

NSX Edge provides the routing. This is either via the Distributed Logical Router (DLR), which is installed on the hypervisor + a DLR virtual appliance; or via the Edge Services Gateway (ESG), which is a virtual appliance. 

  • A DLR can have up to 8 uplink interfaces and 1000 internal interfaces.
    • A DLR uplink typically connects to an ESG via a Layer 2 logical switch. 
    • DLR virtual appliance can be set up in HA mode – in an active/ standby configuration.
      • Created from NSX Manager?
    • The DLR virtual appliance is the control plane – it supports dynamic routing protocols and exchanges routing updates with Layer 3 devices (usually ESG).
      • Even if this virtual appliance is down the routing isn’t affected. New routes won’t be learnt that’s all.
    • The ESXi hypervisors have DLR VIBs which contain the routing information etc. got from the controllers (note: not from the DLR virtual appliance). This the data layer. Performs ARP lookup, route lookup etc. 
      • The VIBs also add a Logical InterFace (LIF) to the hypervisor. There’s one for each Logical Switch (VXLAN) the host connects to. Each LIF, of each host, is set to the default gateway IP of that Layer 2 segment. 
  • An ESG can have up to 10 uplink and internal interfaces. (With a trunk an ESG can have up to 200 sub-interfaces). 
    • There can be multiple ESG appliances in a datacenter. 
    • Here’s how new routes are learnt: NSX Edge Gateway (EGW) learns a new route -> This is picked up by the DLR virtual appliance as they are connected -> DLR virtual appliance passes this info to the NSX Controllers -> NSX Controllers pass this to the ESXi hosts.
    • The ESG is what connects to the uplink. The DLR connects to ESG via a Logical Switch. 

Logical Switch – this is the switch for a VXLAN. 

NSX Edge provides Logical VPNs, Logical Firewall, and Logical Load Balancer. 

TIL: Control Plane & Data Plane (networking)

Reading a bit of networking stuff, which is new to me, as I am trying to understand and appreciate NSX (instead of already diving into it). Hence a few of these TIL posts like this one and the previous. 

One common term I read in the context of NSX or SDN (Software Defined Networking) in general is “control plane” and “data plane” (a.k.a “forwarding” plane). 

This forum post is a good intro. Basically, when it comes to Networking your network equipment does two sort of things. One is the actual pushing of packets that come to it to others. The other is figuring out what packets need to go where. The latter is where various networking protocols like RIP and EIGRP come in. Control plane traffic is used to update a network device’s routing tables or configuration state, and its processing happens on the network device itself.  Data plane traffic passes through the router. Control plane traffic determines what should be done with the data plane traffic. Another way of thinking about control plan and data planes is where the traffic originates from/ is destined to. Basically, control plane traffic is sent to/ from the network devices to control it (e.g RIP, EIGRP); while data plane traffic is what passes through a network device.

( Control plane traffic doesn’t necessarily mean its traffic for controlling a network device. For example, SSH or Telnet could be used to connect to a network device and control it, but it’s not really in the control plane. These come more under a “management” plane – which may or may not be considered as a separate plane. )

Once you think of network devices along these lines, you can see that a device’s actual work is in the data plane. How fast can it push packets through. Yes, it needs to know where to push packets through to, but the two aren’t tied together. It’s sort of like how one might think of a computer as being hardware (CPU) + software (OS) tied together. If we imagine the two as tied together, then we are limiting ourselves on how much each of these can be pushed. If improvements in the OS require improvements in the CPU then we limit ourselves – the two can only be improved in-step. But if the OS improvements can happen independent of the underlying CPU (yes, a newer CPU might help the OS take advantage of newer features or perform better, but it isn’t a requirement) then OS developers can keep innovating on the OS irrespective of CPU manufacturers. In fact, OS developers can use any CPU as long as there are clearly defined interfaces between the OS and the CPU. Similarly, CPU manufacturers can innovate independent of the OS. Ultimately if we think (very simply) of CPUs as having a function of quickly processing data, and OS as a platform that can make use of a CPU to do various processing tasks, we can see that the two are independent and all that’s required is a set of interfaces between them. This is how things already are with computers so what I mentioned just now doesn’t sound so grand or new, but this wasn’t always the case. 

With SDN we try to decouple the control and data planes. The data plane then is the physical layer comprising of network devices or servers. They are programmable and expose a set of interfaces. The control plane now can be a VM or something independent of the physical hardware of the data plane. It is no longer limited to what a single network device sees. The control plane is aware of the whole infrastructure and accordingly informs/ configures the data plane devices.  

If you want a better explanation of what I was trying to convey above, this article might help. 

In the context of NSX its data plane would be the VXLAN based Logical Switches and the ESXi hosts that make it up. And its control plane would be the NSX Controllers. It’s the NSX Controllers that takes care of knowing what to do with the network traffic. It identifies all these, informs the hosts that are part of the data plane accordingly, and let them do the needful. The NSX Controller VMs are deployed in odd numbers (preferably 3 or higher, though you could get away with 1 too) for HA and cluster quorum (that’s why odd numbers) but they are independent of the data plane. Even if all the NSX Controllers are down the data flow would not be affected


I saw a video from Scott Shenker on the future of networking and the past of protocols. Here’s a link to the slides, and here’s a link to the video on YouTube. I think the video is a must watch. Here’s some of the salient points from the video+slides though – mainly as a reminder to myself (note: since I am not a networking person I am vague at many places as I don’t understand it myself):

  • Layering is a useful thing. Layering is what made networking successful. The TCP/IP model, the OSI model. Basically you don’t try and think of the “networking problem” as a big composite thing, but you break it down into layers with each layer doing one task and the layer above it assuming that the layer below it has somehow solved that problem. It’s similar to Unix pipes and all that. Break the problem into discrete parts with interfaces, and each part does what it does best and assumes the part below it is taking care of what it needs to do. 
  • This layering was useful when it came to the data plane mentioned above. That’s what TCP/IP is all about anyways – getting stuff from one point to another. 
  • The control plane used to be simple. It was just about the L2 or L3 tables – where to send a frame to, or where to send a packet to. Then the control plane got complicated by way of ACLs and all that (I don’t know what all to be honest as I am not a networking person :)). There was no “academic” approach to solving this problem similar to how the data plane was tackled; so we just kept adding more and more protocols to the mix to simply solve each issue as it came along. This made things even more complicated, but that’s OK as the people who manage all these liked the complexity and it worked after all. 
  • A good quote (from Don Norman) – “The ability to master complexity is not the same as the ability to extract simplicity“. Well said! So simple and prescient. 
    • It’s OK if you are only good at mastering complexity. But be aware of that. Don’t be under a misconception that just because you are good at mastering the complexity you can also extract simplicity out of it. That’s the key thing. Don’t fool yourself. :)
  • In the context of the control plane, the thing is we have learnt to master its complexity but not learnt to extract simplicity from it. That’s the key problem. 
    • To give an analogy with programming, we no longer think of programming in terms of machine language or registers or memory spaces. All these are abstracted away. This abstraction means a programmer can focus on tackling the problem in a totally different way compared to how he/ she would have had to approach it if they had to take care of all the underlying issues and figure it out. Abstraction is a very useful tool. E.g. Object Oriented Programming, Garbage Collection. Extract simplicity! 
  • Another good quote (from Barbara Liskov) – “Modularity based on abstraction is the way things get done“.
    • Or put another way :) Abstractions -> Interfaces -> Modularity (you abstract away stuff; provide interfaces between them; and that leads to modularity). 
  • As mentioned earlier the data plan has good abstraction, interfaces, and modularity (the layers). Each layer has well defined interfaces and the actual implementation of how a particular layer gets things done is down to the protocols used in that layer or its implementations. The layers above and below do not care. E.g. Layer 3 (IP) expects Layer 2 to somehow get it’s stuff done. The fact that it uses Ethernet and Frames etc is of no concern to IP. 
  • So, what are the control plane problems in networking? 
    • We need to be able to compute the configuration state of each network device. As in what ACLs are it supposed to be applying, what its forwarding tables are like …
    • We need to be able to do this while operating without communication guarantees. So we have to deal with communication delays or packet drops etc as changes are pushed out. 
    • We also need to be able to do this while operating within the limitations of the protocol we are using (e.g. IP). 
  • Anyone trying to master the control plane has to deal with all three. To give an analogy with programming, it is as though a programmer had to worry about where data is placed in RAM, take care of memory management and process communication etc. No one does that now. It is all magically taken care of by the underlying system (like the OS or the programming language itself). The programmer merely focuses on what they need to do. Something similar is required for the control plane. 
  • What is needed?
    • We need an abstraction for computing the configuration state of each device. [Specification Abstraction]
      • Instead of thinking of how to compute the configuration state of a device or how to change a configuration state, we just declare what we want and it is magically taken care of. You declare how things should be, and the underlying system takes care of making it so. 
      • We think in terms of specifications. If the intention is that Device A should not have access to Device B, we simply specify that in the language of our model without thinking of the how in terms of the underlying physical model. The shift in thinking here is that we view each thing as a layer and only focus on that. To implement a policy that Device A should not have access to Device B we do not need to think of the network structure or the devices in between – all that is just taken care of (by the Network Operating System, so to speak). 
      • This layer is  Network Virtualization. We have a simplified model of the network that we work with and which we specify how it should be, and the Network Virtualization takes care of actually implementing it. 
    • We need an abstraction that captures the lack of communication guarantees- i.e. the distributed state of the system. [Distributed State Abstraction]
      • Instead of thinking how to deal with the distributed network we abstract it away and assume that it is magically taken care of. 
      • Each device has access to an annotated network graph that they can query for whatever info they want. A global network view, so to say. 
      • There is some layer that gathers an overall picture of the network from all the devices and presents this global view to the devices. (We can think of this layer as being a central source of information, but it can be decentralized too. Point is that’s an implementation problem for whoever designs that layer). This layer is the Network Operating System, so to speak. 
    • We need an abstraction of the underlying protocol so we don’t have to deal with it directly.  [Forwarding Abstraction]
      • Network devices have a Management CPU and a Forwarding ASIC. We need an abstraction for both. 
      • The Management CPU abstraction can be anything. The ASIC abstraction is OpenFlow. 
      • This is the layer that closest to the hardware. 
  • SDN abstracts these three things – distribution, forwarding, and configuration. 
    • You have a Control Program that configures an abstract network view based on the operator requirements (note: this doesn’t deal with the underlying hardware at all) ->
    • You have a Network Virtualization layer that takes this abstract network view and maps it to a global view based on the underlying physical hardware (the specification abstraction) ->
    • You have a Network OS that communicates this global network view to all the physical devices to make it happen (the distributed state abstraction (for disseminating the information) and the forwarding abstraction (for configuring the hardware)).
  • Very important: Each piece of the above architecture has a very limited job that doesn’t involve the overall picture. 

From this Whitepaper:

SDN has three layers: (1) an Application layer, (2) a Control layer (the Control Program mentioned above), and (3) an Infrastructure layer (the network devices). 

The Application layer is where business applications reside. These talk to the Control Program in the Control layer via APIs. This way applications can program their network requirements directly. 

OpenFlow (mentioned in Scott’s talk under the ASIC abstraction) is the interface between the control plane and the data/ forwarding place. Rather than paraphrase, let me quote from that whitepaper for my own reference:

OpenFlow is the first standard communications interface defined between the control and forwarding layers of an SDN architecture. OpenFlow allows direct access to and manipulation of the forwarding plane of network devices such as switches and routers, both physical and virtual (hypervisor-based). It is the absence of an open interface to the forwarding plane that has led to the characterization of today’s networking devices as monolithic, closed, and mainframe-like. No other standard protocol does what OpenFlow does, and a protocol like OpenFlow is needed to move network control out of the networking switches to logically centralized control software.

OpenFlow can be compared to the instruction set of a CPU. The protocol specifies basic primitives that can be used by an external software application to program the forwarding plane of network devices, just like the instruction set of a CPU would program a computer system.

OpenFlow uses the concept of flows to identify network traffic based on pre-defined match rules that can be statically or dynamically programmed by the SDN control software. It also allows IT to define how traffic should flow through network devices based on parameters such as usage patterns, applications, and cloud resources. Since OpenFlow allows the network to be programmed on a per-flow basis, an OpenFlow-based SDN architecture provides extremely granular control, enabling the network to respond to real-time changes at the application, user, and session levels. Current IP-based routing does not provide this level of control, as all flows between two endpoints must follow the same path through the network, regardless of their different requirements.

I don’t think OpenFlow is used by NSX though. It is used by Open vSwitch and was used by NVP (Nicira Virtualization Platform – the predecessor of NSX).

Speaking of NVP and NSX: VMware acquired NSX from Nicira (which was a company founded by Martin Casado, Nick McKeown and Scott Shenker – the same Scott Shenker whose video I was watching above). The product was called NVP back then and primarily ran on the Xen hypervisor. VMware renamed it to NSX and it was has two flavors. NSX-V is the version that runs on the VMware ESXi hypervisor, and is in active development. There’s also NSX-MH which is a “multi-hypervisor” version that’s supposed to be able to run on Xen, KVM, etc. but I couldn’t find much information on it. There’s some presentation slides in case anyone’s interested. 

Before I conclude here’s some more blog posts related to all this. They are in order of publishing so we get a feel of how things have progressed. I am starting to get a headache reading all this network stuff, most of which is going above my head, so I am going to take a break here and simply link to the articles (with minimal/ half info) and not go much into it. :)

  • This one talks about how the VXLAN specification doesn’t specify any control plane.
    • There is no way for hosts participating in a VXLAN network to know the MAC addresses of other hosts or VMs in the VXLAN so we need some way of achieving that. 
    • Nicira NVP uses OpenFlow as a control-plane protocol. 
  • This one talks about how OpenFlow is used by Nicira NVP. Some points of interest:
    • Each Open vSwitch (OVS) implementation has 1) a flow-based forwarding module loaded in the kernel; 2) an agent that communicates with the Controller; and 3) an OVS DB daemon that keeps track of of the local configuration. 
    • NVP had clusters of 3 or 5 controllers. These used OpenFlow protocol to download forwarding entries into the OVS and OVSDB (a.k.a. ovsdb-daemon) to configure the OVS itself  (creating/ deleting/ modifying bridges, interfaces, etc). 
    • Read that post on how the forwarding tables and tunnel interfaces are modified as new devices join the overlay network. 
    • Broadcast traffic, unknown Unicast traffic, and Multicast traffic (a.k.a. BUM traffic) can be handled in two ways – either by sending these to an extra server that replicates these to all devices in the overlay network; or the source hypervisor/ physical device can encapsulate the BUM frame and send it as unicast to all the other devices in that overlay. 
  • This one talks about how Nicira NVP seems to be moving away from OpenFlow or supplementing it with something (I am not entirely clear).
    • This is a good read though just that I was lost by this point coz I have been doing this reading for nearly 2 days and it’s starting to get tiring. 

One more post from the author of the three posts above. It’s a good read. Kind of obvious stuff, but good to see in pictures. That author has some informative posts – wish I was more brainy! :)

TIL: VXLAN is a standard

VXLAN == Virtual eXtensible LAN.

While reading about NSX I was under the impression VXLAN is something VMware cooked up and owns (possibly via Nicira, which is where NSX came from). But turns out that isn’t the case. It was originally created by VMware & Cisco (check out this Register article – a good read) and is actually covered under an RFC 7348. The encapsulation mechanism is standardized, and so is the UDP port used for communication (port number 4789 by the way). A lot of vendors now support VXLAN, and similar to NSX being an implementation of VXLAN we also have Open vSwitch. Nice!

(Note to self: got to read more about Open vSwitch. It’s used in XenServer and is a part of Linux. The *BSDs too support it). 

VXLAN is meant to both virtualize Layer 2 and also replace VLANs. You can have up to 16 million VXLANs (the NSX Logical Switches I mentioned earlier). In contrast you are limited to 4094 VLANs. I like the analogy of how VXLAN is to IP addresses how cell phones are to telephone numbers. Prior to cell phones, when everyone had landline numbers, your phone number was tied to your location. If you shifted houses/ locations you got a new phone number. In contrast, with cell phones numbers it doesn’t matter where you are as the number is linked to you, not your location. Similarly with VXLAN your VM IP address is linked to the VM, not its location. 

Update:

  • Found a good whitepaper by Arista on VXLANs. Something I hadn’t realized earlier was that the 24bit VXLAN Network Identifier is called VNI (this is what lets you have 16 millions VXLAN segments/ NSX Logical Switches) and that a VM’s MAC is combined with its VNI – thus allowing multiple VMs with the same MAC address to exist across the network (as long as they are on separate VXNETs). 
  • Also, while I am noting acronyms I might as well also mention VTEPs. These stand for Virtual Tunnel End Points. This is the “thing” that encapsulates/ decapsulates packets for VXLAN. This can be virtual bridges in the hypervisor (ESXi or any other); or even VXLAN aware VM applications or VXLAN capable switching hardware (wasn’t aware of this until I read the Arista whitepaper). 
  • VTEP communicates over UDP. The port number is 4789 (NSX 6.2.3 and later) or 8472 (pre-NSX 6.2.3).
  • A post by Duncan Epping on VXLAN use cases. Probably dated in terms of the VXLAN issues it mentions (traffic tromboning) but I wanted to link it here as (a) it’s a good read and (b) it’s good to know such issues as that will help me better understand why things might be a certain way now (because they are designed to work around such issues). 

vSphere Distributed Switches are Layer 2 devices (doh!)

This is a very basic post. Was trying to read up on NSX and before I could appreciate it I wanted to go down and explore how things are without NSX so I can better understand what NSX is trying to do. I wanted to put it down in writing as I spent some time on it, but there’s nothing new or grand here.

So. vSphere Distributed Switches (VDS). These are Layer 2 switches that exist on each ESX host and which contain port groups that you can connect VMs running on a host onto. In case it wasn’t obvious from the name “switch”, these are Layer 2. Which means that all the hosts connecting to a particular Distributed Switch must be on the same Layer 2. Once you create a Distributed Switch and add ESXi hosts and their physical NICs to it, you can create VMKernel ports for Management, vMotion, Fault Tolerance, etc but these VMKernel ports aren’t used by the port groups you create on the Distributed Switch. The port groups are just like Layer 2 switches – they communicate via broadcasting using the underlying physical NICs that are assigned to the Distributed Switch; but since there’s no IP address as such assigned to a port group there’s no routing involved. (This is an obvious point but I keep forgetting it).

For example say you have two ESX hosts – HostA and HostB – and these are on two separate physical networks (i.e. separated by a router). You create a new Distributed Switch comprising of a physical NIC each from each host. Then you make a port group on this switch and put VM-A on HostA and VM-B on HostB. When creating the Distributed Switch and adding physical NICs to it, VMware doesn’t care if the physical NICs aren’t in the same Layer 2 domain. It will happily add the NICs but when you try to send traffic from VM-A to VM-B it will fail. That’s because when VM-A tries to communicate with VM-B (let’s assume these two VMs know each others MAC address so there’s no need for ARP communication first), VM-A will send Ethernet frames to the Distributed Switch on HostA who will then broadcast it to the Layer 2 network its physical NIC assigned to the Distributed Switch is connected to. Since these broadcasted frames won’t reach the physical NIC of HostB the VM-B there never sees it, and so the two VMs cannot communicate with each other. 

So – keep in mind that all physical NICs connecting to the same Distributed Switch must be on the same Layer 2. If the underlying physical NICs are on separate Layer 3 networks, and these Layer 3 networks have connectivity to each other, it doesn’t matter – the VMs in the port groups will not be able to communicate. 

And this is where NSX comes in. Using the concept of VXLANs, NSX stretches a Layer 2 network across Layer 3. Basically it encapsulates the Layer 2 traffic within Layer 3 packets and gives the illusion of all VMs being on the same Layer 2 network – but this illusion is what Network Virtualization if all about, right? :) VXLAN is an overlay

VXLAN encapsulates Layer 2 frames in UDP packets. The VXLAN is like a tunnel to which all the hosts connecting to this VXLAN hook into. On each host there’s something called a Virtual Tunnel End Point (VTEP) which is the “thing” that actually hooks into the VXLAN. If a VXLAN is a Distributed Switch made up of physical NICs from the host, the VTEP is the VMKernel ports of this Distributed Switch that do the actual communication (like how vMotion traffic between two hosts happens via the VMKernel ports you assign for vMotion). In fact, during an NSX install you install three VIBs on the ESXi hosts – one of these enhances the existing Distributed Switch with VXLAN capabilities (the encapsulation stuff  I mentioned above). 

Once you have NSX you can create multiple Logical Switches. These are basically VXLAN switches that operate like Layer 2 switches but can actually stretch multiple Layer 3 networks. Logical Switches are overlay switches. ;o) Each Logical Switch corresponds to one VXLAN. 

ps. VXLAN is one of the cool features of NSX. The other cool features are the Distributed Logical Router (DLR) and the Distributed Firewall (DFW). I mentioned that a ESXi host has 3 VIBs installed as part of NSX, and that one of them is VXLAN functionality? Well the other two are DLR and DFW (god, so many acronyms!). Prior to DLR if an ESXi host had two VMs connected to different Distributed Switches, and if these two hosts wanted to talk to each other, the traffic would go down from one of the VMs, to the host, to the underlying physical network router, and back to the host and up to the VM on the other Distributed Switch. But with DLR, the ESXi hypervisor kernel can do Layer 3 routing too, so it will simply send traffic directly to the VM in the other Distributed Switch. 

Similarly, DFW just means each ESXi hypervisor can also apply firewall rules to the packets, so you don’t need one centralized firewall place any more. You simply create rules and push it out to the ESXi hosts and they can do firewalling between VMs. Everything is virtual! :)

pps. Some other jargon. East-West traffic means network traffic that’s usually within or between servers (ESXi hosts in our case). North-South traffic means any other network traffic – basically, traffic that goes out of this layer of ESXi hosts. With NSX you try and have more traffic East-West rather than North-South. 

TIL: Transparent Page Sharing (TPS) between VMs is disabled by default

(TIL is short for “Today I Learned” by the way).

I always thought an ESXi host did some page sharing of VM memory between the VMs running on it. The feature is called Transparent Page Sharing (TPS) and it was something I remember from my VMware course and also read in blog posts such as this and this. The idea is that if you have (say) a bunch of Server 2012R2 VMs running on a host, it’s quite likely these VMs have quite a bit of common stuff between them in RAM, so it makes sense for the ESXi host to share that common stuff between the hosts. So even if each VM has (say) 4 GB RAM assigned to it, and there’s about 2GB worth of stuff common between the VMs, the host only needs to use 2GB shared RAM + 2 x 2GB private RAM for a total of 6GB RAM. 

Apart from this as the host is under increased memory pressure it resorts to techniques like ballooning and memory swapping to free up some RAM for itself. 

I even made a script today to list out all the VMs in our environment that have greater than 8GB RAM assigned to them and are powered on and to list the amount of shared RAM (just for my own info). 

Anyhow – around 2015 VMware stopped page sharing of VM memory between VMs. VMware calls this sort of RAM sharing as inter-VM TPS. Apparently this is a security risk and VMware likes to ship their products as secure by default, so via some patches to the 5.x series (and as default in the 6.x series) it turned off inter-VM TPS and introduced some controls that allow IT Admins to turn this on if they so wish. Intra-VM TPS is still enabled – i.e. the ESXi host will do page sharing within each VM – but it not longer does page sharing between VMs by default. 

Using the newly introduced controls, however, it is possible to enable inter-VM TPS for all VMs, or selectively between some VMs. Quoting from this blog post

You can set a virtual machine’s advanced parameter sched.mem.pshare.salt to control its ability to participate in transparent page sharing.  

TPS is only allowed within a virtual machine (intra-VM TPS) by default, because the ESXi host configuration option Mem.ShareForceSalting is set to 2, the sched.mem.pshare.salt is not present in the virtual machine configuration file, and thus the virtual machine salt value is set to unique value. In this case, to allow TPS among a specific set of virtual machines, set the sched.mem.pshare.salt of each virtual machine in the set to an identical value.  

Alternatively, to enable TPS among all virtual machines (inter-VM TPS), you can set Mem.ShareForceSalting to 0, which causes sched.mem.pshare.salt to be ignored and to have no impact.

Or, to enable inter-VM TPS as the default, but yet allow the use of sched.mem.pshare.salt to control the effect of TPS per virtual machine, set the value of Mem.ShareForceSalting to 1. In this case, change the value of sched.mem.pshare.salt per virtual machine to prevent it from sharing with all virtual machines and restrict it to sharing with those that have an identical setting.

Nice! 

I wonder if intra-VM TPS has much memory savings. Looking at the output from my script for our estate I see that many of our server VMs have about half their allocated RAM as shared, so it does make an impact. I guess it will also make a difference when moving to a container architecture wherein a single VM might have many containers. 

I would also like to point out to this blog post and another blog post I came across from it on whether inter-VM TPS even makes much sense in today’s environments and also on the kind of impact it can have during vMotion etc. Good stuff. I am still reading these but wanted to link to them for reference. Mainly – nowadays we have larger page sizes and so the probability of finding an identical page to be shared between two VMs is low; then there is NUMA that places memory pages closer to the CPU and TPS could disrupt that; and also, TPS is a process that runs periodically to compare pages, so there is an operational cost as it runs and finds a match and then does a full compare of the two pages to ensure they are really identical. 

Good to know. 

Reboot a bunch of ESXi hosts one after the other

Not a big deal, I know, but I felt like posting this. :)

Our HP Gen8 ESXi hosts were randomly crashing ever since we applied the latest ESXi 5.5 updates to them in December. Logged a call with HP and turns out until a proper fix is issued by VMware/ HPE we need to change a setting on all our hosts and reboot them. I didn’t want to do it manually, so I used PowerCLI to do it en masse.

Here’s the script I wrote to target Gen8 hosts and make the change:

I could have done the reboot along with this, but I didn’t want to. Instead I copy pasted the list of affected hosts into a text file (called ESXReboot.txt in the script below) and wrote another script to put them into maintenance mode and reboot one by one.

The screenshot output is slightly different from what you would get from the script as I modified it a bit since taking the screenshot. Functionality-wise there’s no change.

VCSA migration – “A problem occurred while logging in. Verify the connection details.”

So, I was trying out a Windows vCenter 5.5 to VCSA 6.5 appliance migration and at the stage where I enter the target ESX host name where the appliance will be deployed to I got the above error.

Wasted the better part of my day troubleshooting this as I could find absolutely no mention of what was causing this. The installer log had the following but that didn’t shed much light either.

Tried stuff like 1) try a different ESX host, 2) update it to a later version (it was 5.5 Build 3568722), 3) turn on the ESX Shell and SSH in case that mattered – but nothing helped!

Nothing came up regarding the “vimService creation failed: Error” line either. But then I began Googling on “vimService” and learnt that it is the vSphere Management SDK and that you access the SDK via a URL like https://servername/sdk. That got me thinking whether the VCSA installer looks to the proxy settings of the machine where I am running it from, so I turned off the proxy settings in IE – and that helped!

Who would have thought. :)

vSphere Replication does not support changing the length of a replicated disk.

Had to extend a VM disk today and got the above error. This is because the VM is replicated via vSphere Replication so you can’t simply extend the disk as you would do for any regular VM.

error-1

Here’s a top level summary of how you do it (based on this KB article).

  1. You have to break the replication. Stop it that is. But doing so deletes the replicated files, so first you want to work around that (as below).
    1. Note the current settings of the replication.
    2. Then pause the replication.
    3. Find out which datastore holds the replicated VM disks.
    4. Rename the replicated VM folder.
    5. Now you can stop the replication because you have kept a copy of the data.
  2. SSH into any ESX host that has access to the above datastore and extend the disk associated with the VMDK via vmkfstools.
  3. Rename the folder back to what it was before.
  4. Recreate the replication, but point the destination to the same datastore as above and select the folder above. vSphere Replication will ask whether you want to use the existing data as seed – answer yes.

That’s it basically.

In terms of the details, I didn’t know how to find which datastore had the replicated VM files. So I SSH’d into one of the hosts in the replicated VM cluster and ran the following:

There must be some better way, but what the heck. Once I found the path above I did the following to find other VMs in it, and using that info I was able to find the datastore name from vSphere client.

You need this datastore name for when setting up a new replication, so you can point to that.

Some more things to keep in mind are the following.

  1. Since we pause the replication rather than stop it, the folder will contain a bunch of hbr* files. Delete those.
  2. The vmkfstools command -X switch takes the new size of the disk. Not the additional amount. So if the disk is 10GB and you want to add 20GB, you specify it the argument as 30GB. If you are getting a “Failed to extend disk : One of the parameters supplied is invalid (1).” error with vmkfstools that’s probably why.

vCenter unable to connects to hosts; vSphere client gives error ‘”ServiceInstance.RetrieveContent” for object “ServiceInstance” on Server “IP-Address” failed’

Our Network team had been making some changes at work and suddenly vCenter in our London office lost connectivity with all the ESX hosts in one of our remote office. Moreover, when trying to connect from the vSphere Client to any of the remote hosts directly we were getting the following error –

client error

Connectivity from vSphere Client in the remote office to the ESX host in the same office was fine; it was only connectivity from other offices to this remote office. So it definitely indicated a network issue.

This KB article is a handy one to know what ports are required by various VMware products. Port 443 is what needs to be open to ESX hosts for vCenter Server to be able to talk to them. I did a telnet from the vCenter server to each of the remote office hosts on port 443 and it went through fine – so wasn’t a firewall issue. (Another post with port numbers, just FYI, is this one).

After a fair bit of troubleshooting we tracked the issue down to MTU.

Digressing into MTUs

Communication between two IP addresses (i.e. layer 3) happens through packets. Thus when my London vCenter Server communicates with my remote office ESX host, the two send TCP/IP packets to each other. When these packets from the vCenter Server reach the switch/ router on the same LAN as the ESX host, it becomes a layer 2 communication (because they are on the same network and it’s a matter of data reaching the ESX host from the switch/ router). In the case of Ethernet, this layer 2 communication happens via Ethernet frames. The frames encapsulate the IP packets – so the switch/ router breaks the packets and fits them into multiple frames, while the ESX host receives these frames and re-assembles the packets (and vice versa). (The picture on this Wikipedia page is worth a look to see the encapsulation). 

How much data can be held by a layer 2 frame is defined by the Maximum Transmission Unit (MTU). Larger MTUs are good because you can carry more data; but they have a downside in that each frame takes longer to be transmitted, and in case of any errors more data has to be re-transmitted when the frame is resent. So a balance is important. In the case of Ethernet, RFC 894 (see errata also) defines the MTU as a maximum of 1500 bytes. In the case of other layer 2 protocols, the MTU varies: for example 4464 bytes for Token Ring; 4352 bytes for FDDI; 9180 bytes for ATM; etc. In the case of Ethernet there are now also jumbo frames, which are frames with an MTU size of 9000 bytes (see this page for a table comparing regular frames and jumbo frames) and are commonly used in iSCSI networks.

Taking the case of Ethernet, assume the MTU of all Ethernet networks is 1500 bytes. So when two devices are conversing with each other over layer 3, and this conversation spans multiple Ethernet networks, it is helpful if the devices know that the MTU of the underlying layer 2 network is 1500 bytes. That way the two devices can keep the size of their layer 3 packets to be less than 1500 bytes. Why? Because if the size of the layer 3 packets are greater than 1500 bytes, then the devices and all the routers/ switches in between will have to fragment (break) the layer 3 packets into smaller packets of less than 1500 bytes to fit it in the Ethernet frame. This is a waste of resources for all, so it’s best if the two devices know of the underlying layer 2 MTU and act accordingly.

Now, note that Ethernet MTUs are defined as a maximum of 1500 bytes. So the MTU for a particular LAN segment can be set to a lower number for whatever reason (maybe there are additional fields in the Ethernet frame and to accommodate these the data portion must be reduced). Similarly, a layer 3 conversation between when two devices can go over a mix of layer 2 networks – Ethernet, Token Ring, etc – each with a different MTU. So what is required for the two devices really is a way of knowing what’s the lowest MTU across all these layer 2 devices, so the two devices can use it as the MTU of the layer 3 packets for their conversation. This is known as the Path MTU or IP MTU – and is basically the smallest MTU of all the underlying layer 2 MTUs over which that conversation traverses. It is discovered through a process known as “Path MTU Discovery” (PMTUD) (check this Wikipedia article, or Google this term to learn more). Very briefly, in the case of IPv4 what happens is that each device sends across packets of increasing size to the other end, with a flag set that says “do not fragment this packet”. Packets of size smaller than the lowest layer 2 MTU will get through, but once the size exceeds the lowest MTU the packet will fail & return because it cannot be fragmented (due to the flag) and so is returned via ICMP to the sender. Thus the Path MTU is discovered. This check happens in both directions.

So we have layer 2 MTUs and layer 3 MTUs. Layer 2 MTUs have a maximum value that is dependent on the layer 2 network technology. But what about the minimum value? RFC 791, which defines the Internet Protocol (the IP in TCP/IP), requires that all devices supporting IP must be able to forward packets of 68 bytes without fragmenting (68 bytes because IP headers take 60 bytes size and layer 2 headers take 8 bytes size minimum) and be able to accept packets of minimum size 576 bytes either as one packet or multiple packets that require assembling. Because of this the minimum layer 2 MTU can be thought of as 68 bytes. In a practical sense, however, most IP devices accept 576 bytes without fragmenting, and since this number is higher than the values for all layer 2 networks the minimum layer 2 & layer 3 MTU can be thought of as 576 bytes.

Just for completeness I will also mention Maximum Segment Size (MSS) which is a layer 4 MTU (of sorts) that defines what’s the maximum TCP segment (which is what a TCP packet is called) that can be accepted by devices. It has a default value of 536 bytes. This is based on the 576 bytes that IP requires hosts to accept at minimum, minus 20 bytes for IP headers and 20 bytes for TCP headers. Idea behind using 576 bytes as the base is that this way the TCP segment can be expected to arrive without fragmenting. In a practical sense again, for TCP/IP traffic over Ethernet (which is the common case), since Ethernet frames have an MTU of 1500, the MSS is usually set to 1500 minus 20 minus 20 = 1460 bytes.

This is a good article I came upon. Just linking it as a reference to myself.

Back to our issue

In our case the router in the remote site had the following set in its configuration:

I am not entirely clear where it was set or why it was set, as that comes under the Network team. What this does though is tell the router not to clear the “Do Not Fragment” (DF) bit in Ethernet frames. If a DF bit is present in a frame then the router will not fragment it if the frame size is larger than the MTU (this is how PMTUD also works). I am not sure why this was set – part of some testing I suppose – but because of this larger frames were not getting through to the other side and hence failing. Our Network team removed this statement and then communication with the ESX hosts started working fine.

I wanted to write more about this statement but I am running out of time. This and this are two good links worth reading for more info. Especially the Scenario 4 section in the second link – that’s pretty much what was happening in our case, I think.

Power cycle/ Reset an HP blade server

Was getting the following error on one of our servers. It’s from ESXi. None of the NICs were working for the server (the NICs seemed to be working, just that the driver wasn’t loading). 

error

Power cycle required. 

I switched off and switched on the server but that didn’t help. Turns out that doesn’t actually power cycle the server (because the server still has power – doh!). What you need to do is do something called an e-fuse reset. This power cycles the blade. You have to do this by opening an SSH session to the Onboard Administrator, finding the bay number of the blade you want to power cycle, and typing the command reset server <bay number>

Good to know!

Note: The command does not appear when you type help, but it’s there:

Also, to get a list of your bays and servers use the show server list command. To do the same for interconnects use the show interconnect list command.

Install a vSphere web-client plugin offline

Trying out vCloud Air at work and I wanted to install the vCloud Air plugin for vSphere web-client. The installation kept failing though. Initially it was due to the vCenter server not having access to the Internet (not your browser, the vSphere web-client itself needs to have access) but even after I specified a proxy (check out this post on how to specify a proxy) and gave vSphere web-client access to the Internet the download would begin and fail. 

install-failed

It’s possible to download the plugin, but how to add it to vSphere web-client?

Through a bit of trial and error I found a way. :)

Turns the plugins are store at C:\Program Files\VMware\Infrastructure\vSphereWebClient\plugin-packages on the server. So all you have to do is:

  1. Download the plugin zip file. 
  2. Create a folder in the above location and extract the zip file to this folder.
  3. Restart the vSphere web-client service. 

And that’s it! Then your plugin will appear under Administration > Client Plug-Ins

plugins

It’s very simple but I couldn’t find any info on how to download and install a plugin when I Googled for it, so thought I’d make a post. Hope it helps someone!

Unable to login to vSphere because the admin@system-domain password cannot be reset

vSphere 5.1 has admin@system-domain as the default admin account. vSphere 5.5 changes that to administrator@vsphere.local. However, if you upgrade from 5.1 to 5.5 the default admin account remains admin@system-domain. Which is fine and dandy until the password for this account expires. Then you are unable to reset or login! See below. :)

Trying to login as usual

1 - login

Password has expired, needs a reset

2 - reset

Reset fails though coz you can only reset for the vsphere.local domain

3 - reset fails

Missed out on taking a screenshot but if you were to try and login with administrator@vsphere.local instead you get an error that the credentials are invalid (because that account doesn’t exist!). So you are stuck!

What do you do?

Solution is to reset the admin password

When you do this vSphere automatically creates the administrator@vsphere.local account. Follow the steps in this KB article.

4 - reset password

Now you can login with administrator@vsphere.local and the generated password.

Downgrading ESXi Host

Today I upgraded one of our hosts to a newer version than what was supported by our vCenter so had to find a way of downgrading it. The host was now at “5.5 Patch 10” (which is after “5.5 Update 3”) which our vCenter version only supported versions prior to “5.5 Update 3”. (See this post for a list of build numbers and versions; see this KB article for why vCenter and the host were now incompatible).

I found this blog post and KB article that talked about downgrading and upgrading. Based on those two here’s what I did to downgrade my host.

First, some terminology. Read this blog post on what VIBs are. At a very high level a VIB file is like a zip file with some metadata and verification thrown in. They are the software packages for ESX (think of it like a .deb or .rpm file). The VIB file contains the actual files on the host that will be replaced. The metadata tells you more about the VIB file – its dependencies, requirements, issues, etc. And the verification bit lets the host verify that the VIB hasn’t been tampered with, and also allows you to have various “levels” of VIBs – those certified by VMware, those certified by partners of VMware, etc – such that you as a System Admin can decide what level of VIBs you want installed on your host.

You can install/ remove/ update VIBs via the command esxcli:

Here’s a short list of the VIBs installed on my host:

Next you have Image Profiles. These are a collection of VIBs. In fact, since any installation of ESXi is a collection of VIBs, an image profile can be thought of as defining an ESXi image. For instance, all the VIBs on my currently installed ESXi server – including 3rd party VIBs – together can be thought of as an image profile. I can then deploy this image profile to other hosts to get the exact configuration on those hosts too.

One thing to keep in mind is that image profiles are not anything tangible. As in they are not files as such, they just define the VIBs that make up the profile.

Lastly you have Software Depots. These are your equivalent of Linux package repositories. They contain VIBs and Image Profiles and are accessible online via HTTP/ HTTPS/ FTP or even offline as a ZIP file (which is a neat thing IMHO). You would point to a software depot – online or offline – and specify an image profile you want, which then pulls in the VIBs you want.

Now back to esxcli. As we saw above this command can be used to list, update, remove etc VIBs. The cool thing though is that it can work with both VIB files and software depots (either online or a ZIP file containing a bunch of VIB files). Here’s the usage for the software vib install command which deals with installing VIBs:

You have two options:

  • The -d switch can be used to specify a software depot (online or offline) along with the -n switch to specify the VIBs to be installed from this depot.
  • Or the -v switch can be used to directly specify VIBs to be installed.

The esxcli command can also work with image profiles.

Here you have just one option (coz like I said you can’t download something called an image profile – you have to necessarily use a software depot). You use the -d switch to specify a depot (online or offline) and the -p switch to specify the image profile you are interested in.

Apart from installing VIBs & image profiles, the esxcli command can also remove and update these. When it comes to image profiles though, the command can also downgrade profiles via an --allow-downgrades switch. So that’s what we use to downgrade ESXi versions. 

First find the ESXi version you want to downgrade to. In my case it was ESXi 5.5 Update 2. Go to My VMware (login with your account) and find the 5.5 Update 2 product. Download the offline bundle – which is a ZIP file (basically an offline software depot). In my case I got a file named “update-from-esxi5.5-5.5_update02-2068190.zip”. Now open this ZIP file and go to the “metadata.zip\profiles” folder in that. This gives you the list of profiles in this depot.

profiles

You can also get the names from a link such as this which gives more info on the release and the image profiles in it. (I came across it by Googling for “ESXi 5.5 Update 2 profile name”).

The profiles with an “s” in them only contain security fixes while the ones without an “s” contain both security and bug fixes. In my case the profile I am looking for is “ESXi-5.5.0-20140902001-standard”. I wasn’t sure if I need to go for the “no-tools” version or not, but figured I’ll stick with the “standard”.

Now, copy the ZIP file you downloaded to the host. Either upload it to the host directly, or to some shared storage, etc.

Then run a command similar to this:

That’s it! Following a host reboot you are now downgraded. Very straight-forward and easy.