Find the profiles in an offline ESXi update zip file

I use esxcli to manually update our ESXi hosts that don’t have access to VUM (e.g. our DMZ hosts). I do so via command-line:
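The command is along these lines (the zip path and profile name below are examples – substitute your own):

    esxcli software profile update -d /vmfs/volumes/datastore1/ESXi550-201703001.zip -p ESXi-5.5.0-20170304001-standard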

Usually the VMware page where I download the patch from mentions the profile name, but today I had a patch file and wanted to find the list of profiles it had. 

One way is to open the zip file, then the metadata.zip file within that, and that should contain a list of profiles. Another way is to use esxcli.
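Something like this (the zip path below is an example, not the actual patch I had):

    esxcli software sources profile list -d /vmfs/volumes/datastore1/ESXi550-201703001.zip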

Yay! (VXLAN) contd. + Notes to self while installing NSX 6.3 (part 2)

In my previous post I said the following (in gray). Here I’d like to add on:

  • A VDS uses VMKernel ports (vmk ports) to carry out the actual traffic. These are virtual ports bound to the physical NICs on an ESXi host, and there can be multiple vmk ports per VDS for various tasks (vMotion, FT, etc). Similar to this we need to create a new vmk port for the host to connect into the VTEP used by the VXLAN. 
    • Unlike regular vmk ports though we don’t create and assign IP addresses manually. Instead we either use DHCP or create an IP pool when configuring the VXLAN for a cluster. (It is possible to specify a static IP either via DHCP reservation or as mentioned in the install guide).
      • The number of vmk ports (and hence IP addresses) corresponds to the number of uplinks. So a host with 2 uplinks will have two VTEP vmk ports, hence two IP addresses taken from the pool. Bear that in mind when creating the pool.
    • Each cluster uses one VDS for its VXLAN traffic. This can be a pre-existing VDS – there’s nothing special about it; you just point to it when enabling VXLAN on a cluster, and the vmk port is created on this VDS. NSX automatically creates another portgroup, which is where the vmk port is assigned.
    • VXLANs are created on this VDS – they are basically portgroups in the VDS. Each VXLAN has an ID – the VXLAN Network Identifier (VNI) – which NSX refers to as segment IDs. 
      • Before creating VXLANs we have to allocate a pool of segment IDs (the VNIs), taking into account any VNIs that may already be in use in the environment.
      • The number of segment IDs is also limited by the fact that a single vCenter only supports a maximum of 10,000 portgroups.
      • The web UI only allows us to configure a single segment ID range, but multiple ranges can be configured via the NSX API.
  • Logical Switch == VXLAN -> which has an ID (called segment ID or VNI) == Portgroup. All of this is in a VDS. 

While installing NSX I came across “Transport Zones”.

Remember ESXi hosts are part of a VDS. VXLANs are created on a VDS. Each VXLAN is a portgroup on this VDS. However, not all hosts need be part of the same VXLANs; but since all hosts are part of the same VDS, and hence have visibility to all the VXLANs, we need some way of marking which hosts are part of a VXLAN. We also need some place to specify whether a VXLAN is in unicast, multicast, or hybrid mode. This is where Transport Zones come in.

If all your VXLANs are going to behave the same way (multicast etc) and have the same hosts, then you just need one transport zone. Else you would create separate zones based on your requirement. (That said, when you create a Logical Switch/ VXLAN you have an option to specify the control plane mode (multicast mode etc). Am guessing that overrides the zone setting, so you don’t need to create separate zones just to specify different modes). 

Note: I keep saying hosts above (last two paragraphs) but that’s not correct. It’s actually clusters. I keep forgetting, so thought I should note it separately here rather than correct my mistake above. 1) VXLANs are configured on clusters, not hosts. 2) All hosts within a cluster must be connected to a common VDS (at least one common VDS, for VXLAN purposes). 3) NSX Controllers are optional and can be skipped if you are using multicast replication? 4) Transport Zones are made up of clusters (i.e. all hosts in a cluster; you cannot pick & choose just some hosts – this makes sense when you think that a cluster is for HA and DRS, so naturally you wouldn’t want to exclude some hosts from where a VM can vMotion to, as this would make things difficult).

Worth keeping in mind: 1) A cluster can belong to multiple transport zones. 2) A logical switch can belong to only one transport zone. 3) A VM cannot be connected to logical switches in different transport zones. 4) A DLR (Distributed Logical Router) cannot connect to logical switches in multiple transport zones. Ditto for an ESG (Edge Services Gateway). 

After creating a transport zone, we can create a Logical Switch. This assigns a segment ID from the pool automatically and this (finally!!) is your VXLAN. Each logical switch creates yet another portgroup. Once you create a logical switch you can assign VMs to it – that basically changes their port group to the one created by the logical switch. Now your VMs will have connectivity to each other even if they are on hosts in separate L3 networks. 

Something I hadn’t realized: 1) Logical Switches are created on Transport Zones. 2) Transport Zones are made up of / can span clusters. 3) Within a cluster the logical switches (VXLANs) are created on the VDS that’s common to the cluster. 4) What I hadn’t realized was this: nowhere in the previous statements did I imply that transport zones are limited to a single VDS. So if a transport zone is made up of multiple clusters, each / some of which have their own common VDS, any logical switch I create will be created on all these VDSes.

Sadly, I don’t feel like saying yay at this point, unlike before. I am too tired. :(

Which also brings me to the question of how I got this working with VMware Workstation. 

By default VMware Workstation emulates an e1000 NIC in the VMs, and this doesn’t support an MTU larger than 1500 bytes. We can edit the .VMX file of a VM and replace “e1000” with “vmxnet3”, swapping the emulated Intel 82545EM Gigabit Ethernet NIC for a paravirtual VMXNET3 NIC. This NIC supports an MTU larger than 1500 bytes, and VXLAN will begin working. One thing though: a quick way of testing whether the VTEP VMkernel NICs are able to talk to each other with a larger MTU is via a command such as ping ++netstack=vxlan -I vmk3 -d -s 1600 xxx.xxx.xxx.xxx. If you do this once you add a VMXNET3 NIC though, it crashes the ESXi host. I don’t know why. It only crashes when using the VXLAN network stack; the same command with any other VMkernel NIC works fine (so I know the MTU part is ok). Also, when testing the Logical Switch connectivity via the Web UI (see example here) there’s no crash with a VXLAN standard test packet – maybe that doesn’t use the VXLAN network stack? I spent a fair bit of time chasing after the ping ++netstack command until I realized that even though it was crashing my host the VXLAN was actually working!
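For reference, the relevant line in the .VMX file looks something like this (ethernet0 is just an example – there’s one such entry per virtual NIC):

    ethernet0.virtualDev = "e1000"

and you’d change it to:

    ethernet0.virtualDev = "vmxnet3"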

Before I conclude a hat-tip to this post for the Web UI test method and also for generally posting how the author set up his NSX test lab. That’s an example of how to post something like this properly, instead of the stream of thoughts my few posts have been. :)

Yay! (VXLAN)

I decided to take a break from my NSX reading and just go ahead and set up a VXLAN in my test lab. Just go with a hunch of what I think the options should be based on what the menus ask me and what I have read so far. Take a leap! :)

*Ahem* The above is actually incorrect, and I am an idiot. A super huge idiot! Each VM is actually just pinging itself and not the other. Unbelievable! And to think that I got all excited thinking I managed to do something without reading the docs etc. The steps below are incomplete. I should just delete this post, but I wrote this much and had a moment of excitement that day … so am just leaving it as it is with this note. 

Above we have two OpenBSD VMs running in my nested ESXi hypervisors.

  • obsd-01 is running on host 1, which is on network 10.10.3.0/24.
  • obsd-02 is running on host 2, which is on network 10.10.4.0/24. 
  • Note that each host is on a separate L3 network.
  • Each host is in a cluster of its own (doesn’t matter but just mentioning) and they connect to the same VDS.
  • In that VDS there’s a port group for VMs and that’s where obsd-01 and obsd-02 connect to. 
  • Without NSX, since the hosts are on separate networks, the two VMs wouldn’t be able to see each other. 
  • With NSX, I am able to create a VXLAN network on the VDS such that both VMs are now on the same network.
    • I put the VMs on a 192.168.0.0/24 network so that’s my overlay network. 
    • VXLANs are basically port groups within your NSX enhanced VDS. The same way you don’t specify IP/ network information on the VMware side when creating a regular portgroup, you don’t do anything when creating the VXLAN portgroup either. All that is within the VMs on the portgroup.
  • A VDS uses VMKernel ports (vmk ports) to carry out the actual traffic. These are virtual ports bound to the physical NICs on an ESXi host, and there can be multiple vmk ports per VDS for various tasks (vMotion, FT, etc). Similar to this we need to create a new vmk port for the host to connect into the VTEP used by the VXLAN. 
    • Unlike regular vmk ports though we don’t create and assign IP addresses manually. Instead we either use DHCP or create an IP pool when configuring the VXLAN for a cluster. (It is possible to specify a static IP either via DHCP reservation or as mentioned in the install guide). 
    • Each cluster uses one VDS for its VXLAN traffic. This can be a pre-existing VDS – there’s nothing special about it; you just point to it when enabling VXLAN on a cluster, and the vmk port is created on this VDS. NSX automatically creates another portgroup, which is where the vmk port is assigned.

And that’s where I am so far. After doing this I went through the chapter for configuring VXLAN in the install guide and I was pretty much on the right track. Take a look at that chapter for more screenshots and info. 

Yay, my first VXLAN! :o)

p.s. I went ahead with OpenBSD in my nested environment coz (a) I like OpenBSD (though I have never got to play around much with it); (b) it has a simple & fast install process and I am familiar with it; (c) the ISO file is small, so doesn’t take much space in my ISO library; (d) OpenBSD comes with VMware tools as part of the kernel, so nothing additional to install; (e) I so love that it still has a simple rc based system and none of that systemd stuff that newer Linux distributions have (not that there’s anything wrong with systemd just that I am unfamiliar with it and rc is way simpler for my needs); (f) the base install has manpages for all the commands unlike minimal Linux ISOs that usually seem to skip these; (g) take a look at this memory usage! :o)

p.p.s. Remember to disable the PF firewall via pfctl -d.

Yay again! :o)

Update: Short-lived excitement sadly. A while later the VMs stopped communicating. Turns out VMware Workstation doesn’t support an MTU larger than 1500 bytes, and VXLAN requires 1600 bytes. So the VTEP interfaces of both ESXi hosts are unable to talk to each other. Bummer!

Update 2: I finally got this working. Turns out I had missed some stuff; and also I had to make some changes to allow VMware Workstation to work with larger MTU sizes. I’ll blog this in a later post.

Useful offline Windows troubleshooting/ fixing tricks

Had a Windows Server 2008 R2 server that started giving a blank screen after the recent Windows update reboot. This was a VM, and it was the same result via the VMware console or RDP. Safe Mode didn’t help either. Bummer!

Since this is a VM I mounted its disk on another 2008 R2 VM and tried to fix the problem offline. Most of my attempts didn’t help but I thought of posting them here for reference. 

Note: In the following examples the broken VM’s disk is mounted to F: drive. 

Recent updates

I used dism to list recent updates and remove them. To list updates from this month (March 2017):
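Something like the below; the Install Time column in the output shows the date, which is how I picked out the recent ones (a sketch – I don’t recall my exact filtering):

    dism /Image:F:\ /Get-Packages /Format:Table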

To remove an update:
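The package name below is a made-up example – use the package identity shown by /Get-Packages:

    dism /Image:F:\ /Remove-Package /PackageName:Package_for_KB4012213~31bf3856ad364e35~amd64~~6.1.1.1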

I did this for each of the updates I had. That didn’t help though. And oddly I found that one of the updates kept re-appearing with a slightly different name (a different number suffixed to it actually) each time I’d remove it. Not sure why that was the case, but I saw that F:\Windows\WinSxS had a file called pending.xml and figured this must be doing something to stop the update from being removed. I couldn’t delete the file in spite of taking ownership and full control, so I opened it in Notepad and cleared all the contents. :o) After that the updates didn’t return, but the machine was still broken.

SFC

I used sfc to check the integrity of all the system files:
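For an offline disk you point sfc at the broken VM’s boot and Windows directories, something like:

    sfc /scannow /offbootdir=F:\ /offwindir=F:\Windows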

No luck with that either!

Event Logs

Maybe the Event Logs have something? These can be found at F:\Windows\System32\Winevt\Logs. Double click the ones of interest to view. 

In my case the Event Logs had nothing! No record at all of the VM starting up or what was causing it to hang. Tough luck!

Bonus info: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Eventlog contains locations of the files backing the Event Logs. Just mentioning it here as I came across this.

Drivers

Could drivers cause any issue? Unlikely. You can’t use dism to query drivers as above but you can check via registry. See this post. Honestly, I didn’t read it much. I didn’t suspect drivers and it seemed too much work fiddling through registry keys and folders. 

Last Known Good Configuration

Whenever I’d boot up the VM I never got the Last Known Good (LKG) Configuration option. I tried pressing F8 a couple of times but it had no effect. So I wondered if I could tweak this via the registry. Turns out I can. And turns out I already knew this just that I had forgotten!

Your current configuration is HKLM\SYSTEM\CurrentControlSet. This is actually a link to HKLM\SYSTEM\ControlSet001 or HKLM\SYSTEM\ControlSet002 or HKLM\SYSTEM\ControlSet003 or … (you get the point). Each ControlSetXXX key is one of your previous configurations. The one that’s actually used can be found via HKLM\SYSTEM\Select. The entry Current points to the number of the ControlSetXXX key in use. The entry LastKnownGood points to the Last Known Good Configuration. Now we know what to do.

  1. Mount the HKLM\SYSTEM hive of the broken VM. All registry hives can be found under %windir%\System32\Config. In my case that translates to the file F:\Windows\System32\Config\SYSTEM.
  2. To mount this file open Registry Editor, select the HKLM hive, and go to File > Load Hive. (This is a good post with screenshots etc).  
  3. Go to the Select key above. Change Current to whatever LastKnownGood was. 
  4. That’s all. Now unload the hive and you are done.
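If you prefer the command line, the same steps with reg.exe would look something like this (TempSystem is just an arbitrary name I picked for the mounted hive, and the /d value is whatever LastKnownGood turned out to be on your system):

    reg load HKLM\TempSystem F:\Windows\System32\Config\SYSTEM
    reg query HKLM\TempSystem\Select
    rem say LastKnownGood was 2; point Current at it
    reg add HKLM\TempSystem\Select /v Current /t REG_DWORD /d 2 /f
    reg unload HKLM\TempSystem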

This helped in my case! I was finally able to move past the blank screen and get a login prompt. Upon login I was also able to download and install all the patches and confirm that the VM is now working fine (took a snapshot of course, just in case!). I have no idea what went wrong, but at least I have the pleasure of being able to fix it. From the post I link to below, I’d say it looks like a registry hive corruption. 

Since I successfully logged in, my machine’s Last Known Good Configuration will be automatically updated by Windows with the current one. Here’s a blog post that explains this in more detail. 

That’s all! Hope this helps someone. 

Notes to self while installing NSX 6.3 (part 1)

(No sense or order here. These are just notes I took when installing NSX 6.3 in my home lab, while reading this excellent NSX for Newbies series and the NSX 6.3 install guide from VMware (which I find to be quite informative). Splitting these into parts as I have been typing this for a few days).

You can install NSX Manager in VMware Workstation (rather than in the nested ESXi installation if you are doing it in a home lab). You won’t get a chance to configure the IP address, but you can figure it out from your DHCP server. Browse to that IP in a browser and login as username “admin” password “default” (no double quotes).

If you want to add a certificate from your AD CA to NSX Manager create the certificate as usual in Certificate Manager. Then export the generated certificate and your root CA and any intermediate CA certificates as a “Base-64 encoded X.509 (.CER)” file. Then concatenate all these certificates into a single file (basically, open up Notepad and make a new file that has all these certificates in it). Then you can import it into NSX Manager. (More details here).
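Instead of Notepad, a one-liner like this does the concatenation too (file names are examples):

    type nsxmanager.cer intermediate.cer rootca.cer > chain.cer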

During the Host Preparation step on an ESXi 5.5 host it failed with the following error: 

“Could not install image profile: ([], "Error in running ['/etc/init.d/vShield-Stateful-Firewall', 'start', 'install']:\nReturn code: 1\nOutput: vShield-Stateful-Firewall is not running\nwatchdog-dfwpktlogs: PID file /var/run/vmware/watchdog-dfwpktlogs.PID does not exist\nwatchdog-dfwpktlogs: Unable to terminate watchdog: No running watchdog process for dfwpktlogs\nFailed to release memory reservation for vsfwd\nResource pool 'host/vim/vmvisor/vsfwd' release failed. retrying..\nResource pool 'host/vim/vmvisor/vsfwd' release failed. retrying..\nResource pool 'host/vim/vmvisor/vsfwd' release failed. retrying..\nResource pool 'host/vim/vmvisor/vsfwd' release failed. retrying..\nResource pool 'host/vim/vmvisor/vsfwd' release failed. retrying..\nSet memory minlimit for vsfwd to 256MB\nFailed to set memory reservation for vsfwd to 256MB, trying for 256MB\nFailed to set memory reservation for vsfwd to failsafe value of 256MB\nMemory reservation released for vsfwd\nResource pool 'host/vim/vmvisor/vsfwd' released.\nResource pool creation failed. Not starting vShield-Stateful-Firewall\n\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update.")” Error 3/16/2017 5:17:49 AM esx55-01.fqdn

Initially I thought maybe NSX 6.3 wasn’t compatible with ESXi 5.5, or that I was on an older version of ESXi 5.5 – so I Googled around on prerequisites (ESXi 5.5 seems to be fine) and also updated ESXi 5.5 to the latest version. Then I took a closer look at the error message above and saw the bit about the 256MB memory reservation. My ESXi 5.5 host only had 3GB RAM (I had installed with 4GB and reduced it to 3GB) so I bumped it up to 4GB RAM and tried again. And voila! The install worked. So NSX 6.3 requires an ESXi 5.5 host with a minimum of 4GB RAM (well, maybe 3.5GB RAM works too – I was too lazy to try!) :o)

If you want, you can browse to “https://<NSX_MANAGER_IP>/bin/vdn/nwfabric.properties” to manually download the VIBs that get installed as part of the Host Preparation. This is in case you want to do a manual install (the thought had crossed my mind as part of troubleshooting the above).

NSX Manager is your management layer. You install it first and it communicates with vCenter server. A single NSX Manager install is sufficient. There’s one NSX Manager per vCenter. 

The next step after installing NSX Manager is to install NSX Controllers. These are installed in odd numbers to maintain quorum. This is your control plane. Note: No data traffic flows through the controllers. The NSX Controllers perform many roles and each role has a master controller node (if this node fails another one takes its place via election). 

Remember that in NSX the VXLAN is your data plane. NSX supports three control plane modes: multicast, unicast, and hybrid when it comes to BUM (Broadcast, unknown Unicast, and Multicast) traffic. BUM traffic is basically traffic that doesn’t have a specific Layer 3 destination. (More info: [1], [2], [3] … and many on the Internet but these three are what I came across initially via Google searches).

  • In unicast mode a host replicates all BUM traffic to all other hosts on the same VXLAN, and also picks a host in every other VXLAN to do the same for hosts in their VXLANs. Thus there’s no dependence on the underlying hardware. There could, however, be increased traffic as the number of VXLANs increases. Note that in the case of unknown unicast the host first checks with the NSX Controller for more info. (That’s the impression I get at least from the [2] post above – I am not entirely clear).
  • In multicast mode a host depends on the underlying networking hardware to replicate BUM traffic via multicast. All hosts on all VXLAN segments join multicast groups so any BUM traffic can be replicated by the network hardware to this multicast group. Obviously this mode requires hardware support. Note that multicast is used for both Layer 2 and Layer 3 here. 
  • In hybrid mode some of the BUM traffic replication is handed over to the first hop physical switch (so rather than a host sending unicast traffic to all other hosts connected to the same physical switch it relies on the switch to do this) while the rest of the replication is done by the host to hosts in other VXLANs. Note that multicast is used only for Layer 2 here. Also note that as in the unicast mode, in the case of unknown unicast traffic the Controller is consulted first. 

NSX Edge provides the routing. This is either via the Distributed Logical Router (DLR), which is installed on the hypervisor + a DLR virtual appliance; or via the Edge Services Gateway (ESG), which is a virtual appliance. 

  • A DLR can have up to 8 uplink interfaces and 1000 internal interfaces.
    • A DLR uplink typically connects to an ESG via a Layer 2 logical switch. 
    • DLR virtual appliance can be set up in HA mode – in an active/ standby configuration.
      • Created from NSX Manager?
    • The DLR virtual appliance is the control plane – it supports dynamic routing protocols and exchanges routing updates with Layer 3 devices (usually ESG).
      • Even if this virtual appliance is down the routing isn’t affected. New routes won’t be learnt, that’s all.
    • The ESXi hypervisors have DLR VIBs which contain the routing information etc. got from the controllers (note: not from the DLR virtual appliance). This is the data layer. It performs ARP lookups, route lookups, etc.
      • The VIBs also add a Logical InterFace (LIF) to the hypervisor. There’s one for each Logical Switch (VXLAN) the host connects to. Each LIF, of each host, is set to the default gateway IP of that Layer 2 segment. 
  • An ESG can have up to 10 uplink and internal interfaces. (With a trunk an ESG can have up to 200 sub-interfaces). 
    • There can be multiple ESG appliances in a datacenter. 
    • Here’s how new routes are learnt: the NSX Edge Services Gateway (ESG) learns a new route -> This is picked up by the DLR virtual appliance as they are connected -> The DLR virtual appliance passes this info to the NSX Controllers -> The NSX Controllers pass this to the ESXi hosts.
    • The ESG is what connects to the uplink. The DLR connects to ESG via a Logical Switch. 

Logical Switch – this is the switch for a VXLAN. 

NSX Edge provides Logical VPNs, Logical Firewall, and Logical Load Balancer. 

Pir Jalani

Before I get busy with my day, I wanted to quickly give a shoutout to this video – Pir Jalani, from Coke Studio (Clinton Cerejo and Mangey ‘Manga’ Khan; music by Clinton Cerejo). It’s a fusion song – a traditional composition featuring lyrics in some Indian language I don’t know, as well as Hindi. That’s what I have been listening to since the night before last, when I first discovered it. I love the mix of the raw singing of Mangey Khan with the softer singing of Clinton Cerejo, and the music – which is sort of the opposite of the raw singing and yet complements it, and the two get along together very well. The song starts off in a very traditional way but quickly develops layers and becomes something else altogether! Loved it! (I especially loved the trombones and trumpets – totally didn’t expect that!)

Coke Studio has some good songs. Here’s a few off the top of my head (note: I have updated this list since my original posting):

  • Bismillah (Kailash Kher, Munawar Masoom; music by Salim-Sulaiman) – such an amazing pious song!
  • Piya Se Naina (Sona Mohapatra; music by Ram Sampath) – a peppy number.
  • Aigiri Nandini (Padma Shri Aruna Sairam, Sona Mohapatra; music by Ram Sampath) – two contrasting styles, singers, voices – what more to say!
  • Madari (Vishal Dadlani, Sonu Kakkar; music by Clinton Cerejo) – a powerful song; both Vishal Dadlani & Sonu Kakkar shine with their voice through this song.
  • Ambwa Taley (Javed Bashir, Humera Channa) – I don’t think I can even describe what I feel about this song; the singing is so strong and touching.
  • Aao Balma (Padmabhushan Ustad Ghulam Mustafa Khan, Murtuza Mustafa, Qadir Mustafa, Rabbani Mustafa, Hasan Mustafa, Faiz Mustafa; music by A.R. Rahman) – I discovered this early morning one day when I was woken up as I was on-call at work and couldn’t go to sleep after that; listening to this just blew my mind and I think I spent the whole day and the next few listening to this on loop.
  • Saathi Salaam (Sawan Khan Manganiyar, Clinton Cerejo; music by Clinton Cerejo) – another good song.
  • Naariyan (Shalmali Kholgade, Karthik, Amit Trivedi; music by Amit Trivedi) – an upbeat number, different to the rest; less Indian sounding. One thing about Amit Trivedi is that you can expect various sounds, different instruments, and he manages to mix them all together. Fun lyrics too, this one!
  • Rabba (Amit Trivedi, Tochi Raina, Jaggi; music by Amit Trivedi) – I wasn’t so hot about this song initially but it slowly catches on to you. 

One thing I noticed (an obvious observation, but I wanted to mention it anyways) is how the headphones I use seem to enhance the music. My favorite way of listening to such music is via the Sennheiser HD 558. These are probably my favorite headphones – not practical to carry around or even use with others around – but they are super comfortable and open-backed (which is why I can’t use them with others around, as they let the music out and also let in sound from outside; but this enhances the sound quality I think) and they just add “something” to the music. It’s like it sets the music/ the instruments “free” – gives them more space, so to say, a wider feeling … difficult to describe. It adds something to the whole experience.

Apart from this I also listen to music via the Sennheiser PXC 550 which I previously mentioned, Bragi Dash, Bose SoundSport, and SoundMagic E10 & E10S (mostly E10S). The order in which I mentioned is the order in which I rank their music quality. It is not a huge difference, but I always notice a difference between these headphones. Each has its pros and cons which is why I use them, so I don’t judge their sound quality difference against them – but until a few years ago (which is when I started noticing this and began investing in good headphones) I wouldn’t have imagined headphones to make that much of a difference (and even now, like I said, it’s not a huge difference – it’s subtle, and may not matter to all, but it matters to me and makes a difference to me in the way I enjoy and appreciate the music). 

Enjoy the music! Such amazing talent.

Update: Some more (non Coke Studio songs):

  • Neeye (Yazin Nizar, Sharanya Srinivas; music by Phani Kalyan) – amazing music, and the male singer has such a wonderful voice!
  • Poori Qaaynaat (Raj Pandit, Vishal Dadlani; music by Salim-Sulaiman) – again, amazing music! The singing is of course great, I loved the Sitar too.

Desktop VDA only listens on port 1494/2598 when the connection comes in

Was troubleshooting a Citrix issue (“Failed with status 1110”) and one of the possibilities was that something is blocking the VDA ports 1494/2598 (two other possibilities seem to be mismatched STAs or issues with the root CA certs – neither seems to be the problem in my case as only one user seems to be affected).

My first response was to fire up telnet and try connecting to 1494/ 2598. That gave me mixed results until I realized that the VDA only starts listening on these ports when a user is about to connect to it. From CTX213761:

Windows 7 – Desktop OS will listen on Port 1494 only when request comes in from StoreFront or WebInterface.
netstat -ano on Windows 7 will not show 1494 | 2598 listening up until the time of ICA launch.
netstat -ano  on Windows 2012R2 – Server OS will be listening on Port 1494 | 2598 regardless.
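In other words, on a Desktop OS something like the below (an illustrative check of mine, not from the CTX article) shows nothing listening until an ICA launch is actually in progress:

    netstat -ano | findstr "1494 2598"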

 Worth keeping in mind. Two takeaways for me:

  1. This doesn’t affect Server OS (so XenApp is unaffected)
  2. So if VDA isn’t listening on port 1494/ 2598 that means it hasn’t received a request from StoreFront/ WebInterface – so there could be communication trouble between STF/ WI and VDA. 

For future reference:

Going through an earlier post of mine about the flow during a Citrix session (and also CTX128909 – a good one by the way, it has a diagram too) I don’t see any step where the StoreFront/ WebInterface talks to the VDA. All the StoreFront communication is with the Delivery Controller or Receiver, so am guessing the VDA starts listening on ports 1494/2598 when the Delivery Controller selects a machine from its Delivery Group and informs the StoreFront/ WebInterface (so it can put this in the ICA file). At this point either the StoreFront or the Delivery Controller talks to the VDA – not sure which one. The troubleshooting flowchart in CTX136668 mentions that one must check whether the VDA and Controller are both listening on port 80 (as that’s the port they use for talking to each other), so my bet is on the Delivery Controller. When the Delivery Controller informs the VDA (via port 80) that it is selected, the VDA starts listening for Receiver connections on ports 1494/ 2598.

Before I conclude – port 2598 is used for Session Reliability. If Session Reliability is enabled only port 2598 is used; else only port 1494 is used. It’s either, not both!

Useful WMIC filters

I have these tabs open in my browser from last month when I was doing some WMI based GPO targeting. Meant to write a blog post but I kept getting sidetracked, and now it’s been nearly a month so I have lost the flow. But I want to put these in the blog as a reference for my future self.
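To give a flavor of the sort of thing these tabs covered, here’s a typical WQL query for a GPO WMI filter – this example (mine, illustrative; not the contents of those tabs) targets 2012 R2 member servers – plus a quick wmic command to check the relevant values on a machine:

    SELECT * FROM Win32_OperatingSystem WHERE Version LIKE "6.3%" AND ProductType = "3"

    wmic os get Caption, Version, ProductType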

That’s all.

Go through a group of servers and find whether a particular patch is installed

Patch Tuesday is upon us. Our pilot group of servers was patched via SCCM, but there were reports that 2012 R2 servers were not picking up one of the patches. I wanted to quickly identify the servers that were missing the patch.

Our pilot servers are in two groups. So I did the following:
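It was something along these lines (the group names and the KB number here are placeholders; assumes the ActiveDirectory module is loaded):

    $servers  = Get-ADGroupMember "Pilot Servers Group 1" | Select-Object -ExpandProperty Name
    $servers += Get-ADGroupMember "Pilot Servers Group 2" | Select-Object -ExpandProperty Name

    $servers | ForEach-Object {
        $server = $_
        # skip servers that aren't reachable
        if (Test-Connection $server -Count 1 -Quiet) {
            $os = Get-WmiObject Win32_OperatingSystem -ComputerName $server
            # 2012 R2 only (version number 6.3.9600)
            if ($os.Version -eq "6.3.9600") {
                $qfe = Get-WmiObject Win32_QuickFixEngineering -ComputerName $server |
                       Where-Object { $_.HotFixID -eq "KB4012213" }
                New-Object PSObject -Property @{
                    Name        = $server
                    InstalledOn = $qfe.InstalledOn   # stays blank if the hotfix isn't installed
                }
            }
        }
    } | Format-Table @{ Name = "Name"; Expression = { $_.Name }; Width = 20 }, InstalledOn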

The first two lines basically enumerate the two groups. If it was just one group I could have replaced these with a single Get-ADGroupMember "GroupName".

The remaining code checks whether the server is online, filters out 2012 R2 servers (version number 6.3.9600), and makes a list of the servers along with the installed date of the hotfix I am interested in. If the hotfix is not installed, the date will be blank. Simple. 

Oh, and I wanted to get the output as and when it comes so I went with a Width=20 in the name field. I could have avoided that and gone for an -AutoSize but that would mean I’ll have to patiently wait for PowerShell to generate the entire output and then Format-Table to do an autosize. 

Update: While on the Win32_QuickFixEngineering WMI class, I wanted to point out these posts: [1], [2].

Worth keeping in mind that Win32_QuickFixEngineering (or QFE for short) only returns patches installed via the CBS (Component Based Servicing) – which is what Windows Updates do anyway. What this means, however, is that it does not return patches installed via an MSI/ MSP/ MSU. 

TIL: Control Plane & Data Plane (networking)

Reading a bit of networking stuff, which is new to me, as I am trying to understand and appreciate NSX (instead of already diving into it). Hence a few of these TIL posts like this one and the previous. 

One common term I read in the context of NSX or SDN (Software Defined Networking) in general is “control plane” and “data plane” (a.k.a “forwarding” plane). 

This forum post is a good intro. Basically, when it comes to networking, your network equipment does two sorts of things. One is the actual pushing of packets that come to it to others. The other is figuring out what packets need to go where. The latter is where various networking protocols like RIP and EIGRP come in. Control plane traffic is used to update a network device’s routing tables or configuration state, and its processing happens on the network device itself. Data plane traffic passes through the router. Control plane traffic determines what should be done with the data plane traffic. Another way of thinking about control and data planes is where the traffic originates from/ is destined to. Basically, control plane traffic is sent to/ from the network devices to control them (e.g. RIP, EIGRP); while data plane traffic is what passes through a network device.

(Control plane traffic doesn’t necessarily mean it’s traffic for controlling a network device. For example, SSH or Telnet could be used to connect to a network device and control it, but that’s not really in the control plane. These come more under a “management” plane – which may or may not be considered a separate plane.)

Once you think of network devices along these lines, you can see that a device’s actual work is in the data plane. How fast can it push packets through. Yes, it needs to know where to push packets through to, but the two aren’t tied together. It’s sort of like how one might think of a computer as being hardware (CPU) + software (OS) tied together. If we imagine the two as tied together, then we are limiting ourselves on how much each of these can be pushed. If improvements in the OS require improvements in the CPU then we limit ourselves – the two can only be improved in-step. But if the OS improvements can happen independent of the underlying CPU (yes, a newer CPU might help the OS take advantage of newer features or perform better, but it isn’t a requirement) then OS developers can keep innovating on the OS irrespective of CPU manufacturers. In fact, OS developers can use any CPU as long as there are clearly defined interfaces between the OS and the CPU. Similarly, CPU manufacturers can innovate independent of the OS. Ultimately if we think (very simply) of CPUs as having a function of quickly processing data, and OS as a platform that can make use of a CPU to do various processing tasks, we can see that the two are independent and all that’s required is a set of interfaces between them. This is how things already are with computers so what I mentioned just now doesn’t sound so grand or new, but this wasn’t always the case. 

With SDN we try to decouple the control and data planes. The data plane then is the physical layer, comprising network devices or servers. These are programmable and expose a set of interfaces. The control plane now can be a VM or something independent of the physical hardware of the data plane. It is no longer limited to what a single network device sees. The control plane is aware of the whole infrastructure and accordingly informs/ configures the data plane devices.

If you want a better explanation of what I was trying to convey above, this article might help. 

In the context of NSX, its data plane would be the VXLAN based Logical Switches and the ESXi hosts that make it up. And its control plane would be the NSX Controllers. It’s the NSX Controllers that take care of knowing what to do with the network traffic. They identify all these, inform the hosts that are part of the data plane accordingly, and let them do the needful. The NSX Controller VMs are deployed in odd numbers (preferably 3 or higher, though you could get away with 1 too) for HA and cluster quorum (that’s why odd numbers), but they are independent of the data plane. Even if all the NSX Controllers are down the data flow would not be affected.


I saw a video from Scott Shenker on the future of networking and the past of protocols. Here’s a link to the slides, and here’s a link to the video on YouTube. I think the video is a must watch. Here are some of the salient points from the video+slides though – mainly as a reminder to myself (note: since I am not a networking person I am vague in many places as I don’t understand it myself):

  • Layering is a useful thing. Layering is what made networking successful. The TCP/IP model, the OSI model. Basically you don’t try and think of the “networking problem” as a big composite thing, but you break it down into layers with each layer doing one task and the layer above it assuming that the layer below it has somehow solved that problem. It’s similar to Unix pipes and all that. Break the problem into discrete parts with interfaces, and each part does what it does best and assumes the part below it is taking care of what it needs to do. 
  • This layering was useful when it came to the data plane mentioned above. That’s what TCP/IP is all about anyways – getting stuff from one point to another. 
  • The control plane used to be simple. It was just about the L2 or L3 tables – where to send a frame to, or where to send a packet to. Then the control plane got complicated by way of ACLs and all that (I don’t know what all to be honest as I am not a networking person :)). There was no “academic” approach to solving this problem similar to how the data plane was tackled; so we just kept adding more and more protocols to the mix to simply solve each issue as it came along. This made things even more complicated, but that’s OK as the people who manage all these liked the complexity and it worked after all. 
  • A good quote (from Don Norman) – “The ability to master complexity is not the same as the ability to extract simplicity”. Well said! So simple and prescient. 
    • It’s OK if you are only good at mastering complexity. But be aware of that. Don’t be under a misconception that just because you are good at mastering the complexity you can also extract simplicity out of it. That’s the key thing. Don’t fool yourself. :)
  • In the context of the control plane, the thing is we have learnt to master its complexity but not learnt to extract simplicity from it. That’s the key problem. 
    • To give an analogy with programming, we no longer think of programming in terms of machine language or registers or memory spaces. All these are abstracted away. This abstraction means a programmer can focus on tackling the problem in a totally different way compared to how he/ she would have had to approach it if they had to take care of all the underlying issues and figure it out. Abstraction is a very useful tool. E.g. Object Oriented Programming, Garbage Collection. Extract simplicity! 
  • Another good quote (from Barbara Liskov) – “Modularity based on abstraction is the way things get done”.
    • Or put another way :) Abstractions -> Interfaces -> Modularity (you abstract away stuff; provide interfaces between them; and that leads to modularity). 
  • As mentioned earlier, the data plane has good abstraction, interfaces, and modularity (the layers). Each layer has well defined interfaces and the actual implementation of how a particular layer gets things done is down to the protocols used in that layer or its implementations. The layers above and below do not care. E.g. Layer 3 (IP) expects Layer 2 to somehow get its stuff done. The fact that it uses Ethernet and Frames etc. is of no concern to IP. 
  • So, what are the control plane problems in networking? 
    • We need to be able to compute the configuration state of each network device. As in, what ACLs is it supposed to be applying, what are its forwarding tables like …
    • We need to be able to do this while operating without communication guarantees. So we have to deal with communication delays or packet drops etc as changes are pushed out. 
    • We also need to be able to do this while operating within the limitations of the protocol we are using (e.g. IP). 
  • Anyone trying to master the control plane has to deal with all three. To give an analogy with programming, it is as though a programmer had to worry about where data is placed in RAM, take care of memory management and process communication etc. No one does that now. It is all magically taken care of by the underlying system (like the OS or the programming language itself). The programmer merely focuses on what they need to do. Something similar is required for the control plane. 
  • What is needed?
    • We need an abstraction for computing the configuration state of each device. [Specification Abstraction]
      • Instead of thinking of how to compute the configuration state of a device or how to change a configuration state, we just declare what we want and it is magically taken care of. You declare how things should be, and the underlying system takes care of making it so. 
      • We think in terms of specifications. If the intention is that Device A should not have access to Device B, we simply specify that in the language of our model without thinking of the how in terms of the underlying physical model. The shift in thinking here is that we view each thing as a layer and only focus on that. To implement a policy that Device A should not have access to Device B we do not need to think of the network structure or the devices in between – all that is just taken care of (by the Network Operating System, so to speak). 
      • This layer is Network Virtualization. We work with a simplified model of the network, specifying how it should be, and the Network Virtualization layer takes care of actually implementing it. 
    • We need an abstraction that captures the lack of communication guarantees – i.e. the distributed state of the system. [Distributed State Abstraction]
      • Instead of thinking how to deal with the distributed network we abstract it away and assume that it is magically taken care of. 
      • Each device has access to an annotated network graph that they can query for whatever info they want. A global network view, so to say. 
      • There is some layer that gathers an overall picture of the network from all the devices and presents this global view to the devices. (We can think of this layer as being a central source of information, but it can be decentralized too. Point is that’s an implementation problem for whoever designs that layer). This layer is the Network Operating System, so to speak. 
    • We need an abstraction of the underlying protocol so we don’t have to deal with it directly. [Forwarding Abstraction]
      • Network devices have a Management CPU and a Forwarding ASIC. We need an abstraction for both. 
      • The Management CPU abstraction can be anything. The ASIC abstraction is OpenFlow. 
      • This is the layer that is closest to the hardware. 
  • SDN abstracts these three things – distribution, forwarding, and configuration. 
    • You have a Control Program that configures an abstract network view based on the operator requirements (note: this doesn’t deal with the underlying hardware at all) ->
    • You have a Network Virtualization layer that takes this abstract network view and maps it to a global view based on the underlying physical hardware (the specification abstraction) ->
    • You have a Network OS that communicates this global network view to all the physical devices to make it happen (the distributed state abstraction (for disseminating the information) and the forwarding abstraction (for configuring the hardware)).
  • Very important: Each piece of the above architecture has a very limited job that doesn’t involve the overall picture. 

From this Whitepaper:

SDN has three layers: (1) an Application layer, (2) a Control layer (the Control Program mentioned above), and (3) an Infrastructure layer (the network devices). 

The Application layer is where business applications reside. These talk to the Control Program in the Control layer via APIs. This way applications can program their network requirements directly. 

OpenFlow (mentioned in Scott’s talk under the ASIC abstraction) is the interface between the control plane and the data/ forwarding plane. Rather than paraphrase, let me quote from that whitepaper for my own reference:

OpenFlow is the first standard communications interface defined between the control and forwarding layers of an SDN architecture. OpenFlow allows direct access to and manipulation of the forwarding plane of network devices such as switches and routers, both physical and virtual (hypervisor-based). It is the absence of an open interface to the forwarding plane that has led to the characterization of today’s networking devices as monolithic, closed, and mainframe-like. No other standard protocol does what OpenFlow does, and a protocol like OpenFlow is needed to move network control out of the networking switches to logically centralized control software.

OpenFlow can be compared to the instruction set of a CPU. The protocol specifies basic primitives that can be used by an external software application to program the forwarding plane of network devices, just like the instruction set of a CPU would program a computer system.

OpenFlow uses the concept of flows to identify network traffic based on pre-defined match rules that can be statically or dynamically programmed by the SDN control software. It also allows IT to define how traffic should flow through network devices based on parameters such as usage patterns, applications, and cloud resources. Since OpenFlow allows the network to be programmed on a per-flow basis, an OpenFlow-based SDN architecture provides extremely granular control, enabling the network to respond to real-time changes at the application, user, and session levels. Current IP-based routing does not provide this level of control, as all flows between two endpoints must follow the same path through the network, regardless of their different requirements.

I don’t think OpenFlow is used by NSX though. It is used by Open vSwitch and was used by NVP (Nicira Virtualization Platform – the predecessor of NSX).

Speaking of NVP and NSX: VMware acquired Nicira (a company founded by Martin Casado, Nick McKeown and Scott Shenker – the same Scott Shenker whose video I was watching above). The product was called NVP back then and primarily ran on the Xen hypervisor. VMware renamed it to NSX, and it now has two flavors. NSX-V is the version that runs on the VMware ESXi hypervisor, and is in active development. There’s also NSX-MH, which is a “multi-hypervisor” version that’s supposed to be able to run on Xen, KVM, etc., but I couldn’t find much information on it. There are some presentation slides in case anyone’s interested. 

Before I conclude here’s some more blog posts related to all this. They are in order of publishing so we get a feel of how things have progressed. I am starting to get a headache reading all this network stuff, most of which is going above my head, so I am going to take a break here and simply link to the articles (with minimal/ half info) and not go much into it. :)

  • This one talks about how the VXLAN specification doesn’t specify any control plane.
    • There is no way for hosts participating in a VXLAN network to know the MAC addresses of other hosts or VMs in the VXLAN so we need some way of achieving that. 
    • Nicira NVP uses OpenFlow as a control-plane protocol. 
  • This one talks about how OpenFlow is used by Nicira NVP. Some points of interest:
    • Each Open vSwitch (OVS) implementation has 1) a flow-based forwarding module loaded in the kernel; 2) an agent that communicates with the Controller; and 3) an OVS DB daemon that keeps track of the local configuration. 
    • NVP had clusters of 3 or 5 controllers. These used the OpenFlow protocol to download forwarding entries into the OVS, and OVSDB (a.k.a. ovsdb-daemon) to configure the OVS itself (creating/ deleting/ modifying bridges, interfaces, etc). 
    • Read that post on how the forwarding tables and tunnel interfaces are modified as new devices join the overlay network. 
    • Broadcast traffic, unknown Unicast traffic, and Multicast traffic (a.k.a. BUM traffic) can be handled in two ways – either by sending these to an extra server that replicates these to all devices in the overlay network; or the source hypervisor/ physical device can encapsulate the BUM frame and send it as unicast to all the other devices in that overlay. 
  • This one talks about how Nicira NVP seems to be moving away from OpenFlow or supplementing it with something (I am not entirely clear).
    • This is a good read though just that I was lost by this point coz I have been doing this reading for nearly 2 days and it’s starting to get tiring. 

One more post from the author of the three posts above. It’s a good read. Kind of obvious stuff, but good to see in pictures. That author has some informative posts – wish I was more brainy! :)

TIL: VXLAN is a standard

VXLAN == Virtual eXtensible LAN.

While reading about NSX I was under the impression VXLAN is something VMware cooked up and owns (possibly via Nicira, which is where NSX came from). But turns out that isn’t the case. It was originally created by VMware & Cisco (check out this Register article – a good read) and is actually covered under RFC 7348. The encapsulation mechanism is standardized, and so is the UDP port used for communication (port number 4789 by the way). A lot of vendors now support VXLAN, and similar to NSX being an implementation of VXLAN we also have Open vSwitch. Nice!

(Note to self: got to read more about Open vSwitch. It’s used in XenServer and is a part of Linux. The *BSDs too support it). 

VXLAN is meant to both virtualize Layer 2 and also replace VLANs. You can have up to 16 million VXLANs (the NSX Logical Switches I mentioned earlier). In contrast, you are limited to 4094 VLANs. I like the analogy that VXLAN is to IP addresses what cell phones are to telephone numbers. Prior to cell phones, when everyone had landline numbers, your phone number was tied to your location. If you shifted houses/ locations you got a new phone number. In contrast, with cell phone numbers it doesn’t matter where you are, as the number is linked to you, not your location. Similarly with VXLAN your VM IP address is linked to the VM, not its location. 

Update:

  • Found a good whitepaper by Arista on VXLANs. Something I hadn’t realized earlier was that the 24-bit VXLAN Network Identifier is called the VNI (this is what lets you have 16 million VXLAN segments/ NSX Logical Switches) and that a VM’s MAC is combined with its VNI – thus allowing multiple VMs with the same MAC address to exist across the network (as long as they are on separate VXLANs). 
  • Also, while I am noting acronyms I might as well also mention VTEPs. These stand for Virtual Tunnel End Points. This is the “thing” that encapsulates/ decapsulates packets for VXLAN. This can be virtual bridges in the hypervisor (ESXi or any other); or even VXLAN aware VM applications or VXLAN capable switching hardware (wasn’t aware of this until I read the Arista whitepaper). 
  • VTEP communicates over UDP. The port number is 4789 (NSX 6.2.3 and later) or 8472 (pre-NSX 6.2.3).
  • A post by Duncan Epping on VXLAN use cases. Probably dated in terms of the VXLAN issues it mentions (traffic tromboning) but I wanted to link it here as (a) it’s a good read and (b) it’s good to know such issues as that will help me better understand why things might be a certain way now (because they are designed to work around such issues). 

New Gadgets

I didn’t realize I had a Gadgets category on this blog. Funny I forgot about it, considering I had blogged just a few months back about my new Kindles. 

Anyhoo. Two more gadget updates in case anyone’s interested. 

I bought a new phone for myself. The OnePlus 3T. 128GB/ Gunmetal version. Lovely phone! 

And I bought a new pair of headphones. The noise-canceling Bluetooth sort of headphones. :) Got myself a Sennheiser PXC 550 – which sounds so formal and uncool, but it’s a good pair of headphones nevertheless. It would be in the same category as the Bose QC 35 or the Sony MDR-1000X. I haven’t used either of them, but I went with the Sennheiser as it had way more features than either of those, and I was able to get it at the same price point (well, slightly cheaper actually) as the Sony MDR-1000X (which is what I was eyeing, until I came across the PXC 550). 

Cool features of the Sennheiser PXC 550 in a nutshell:

  1. It comes with a wired cable with a mic, so even if you run out of battery you can use the headset with no compromises (most other Bluetooth headsets that come with a cable don’t include a mic).
  2. It has a great battery life (30 hours or something, I think; I dunno, I just charge it every weekend or so).
  3. I like the touch controls – lets me easily pause, rewind, forward via a touch on the right ear cup.
  4. The headphones have an inbuilt DSP for modes like speech (useful for podcasts & audio books), movies (useful for movies or listening to film scores), club (I never use this), or none.
  5. Using the companion Android & iOS app you can create custom equalizer settings (I don’t use this) and also enable a cool feature that automatically pauses the music when you take off the headphones (but I turned it off since I discovered that I use the headphones a lot when walking, and the sweat that accumulates seems to confuse these sensors and they randomly pause the music). 
  6. There’s no off/ on button. Simply take off headphones and fold them flat (which is what I always do) and it powers them off! So nice. Unfold and they power on. 
  7. It can connect to 2 devices at the same time. Sooo convenient! It can remember up to 8 (or is it 10) devices – but it only connects automatically to the last 2. 
  8. You can turn off the noise-cancelling or set a percentage for it (which you set via the app). So it doesn’t have to be full noise-cancelling always. Personally, I don’t find any difference between full and a percentage. Which makes me wonder if it’s doing proper noise-cancelling or not, but I know it blocks out most noise so it’s doing something. One thing I learnt about “noise cancelling” is that it doesn’t entirely cancel noise as the ads might have you believe. You can still hear train announcements and a bit of the background noise – so it’s not totally silent! 
  9. When there’s no music playing you can double tap the right ear cup to turn on a mode that pulls in the surrounding noise via the mics into your ears. So you can hear even more clearly what’s on your surroundings – say you want to talk to someone and don’t want to take off the headphones (and because they are around the ear they still block out noise even if noise cancelling is off). This mode’s useful for that. 

That’s all I think! 

vSphere Distributed Switches are Layer 2 devices (doh!)

This is a very basic post. Was trying to read up on NSX and before I could appreciate it I wanted to go down and explore how things are without NSX so I can better understand what NSX is trying to do. I wanted to put it down in writing as I spent some time on it, but there’s nothing new or grand here.

So. vSphere Distributed Switches (VDS). These are Layer 2 switches that exist on each ESX host and which contain port groups that you can connect VMs running on a host onto. In case it wasn’t obvious from the name “switch”, these are Layer 2. Which means that all the hosts connecting to a particular Distributed Switch must be on the same Layer 2. Once you create a Distributed Switch and add ESXi hosts and their physical NICs to it, you can create VMKernel ports for Management, vMotion, Fault Tolerance, etc but these VMKernel ports aren’t used by the port groups you create on the Distributed Switch. The port groups are just like Layer 2 switches – they communicate via broadcasting using the underlying physical NICs that are assigned to the Distributed Switch; but since there’s no IP address as such assigned to a port group there’s no routing involved. (This is an obvious point but I keep forgetting it).

For example, say you have two ESX hosts – HostA and HostB – and these are on two separate physical networks (i.e. separated by a router). You create a new Distributed Switch comprising a physical NIC from each host. Then you make a port group on this switch and put VM-A on HostA and VM-B on HostB. When creating the Distributed Switch and adding physical NICs to it, VMware doesn’t care if the physical NICs aren’t in the same Layer 2 domain. It will happily add the NICs, but when you try to send traffic from VM-A to VM-B it will fail. That’s because when VM-A tries to communicate with VM-B (let’s assume these two VMs know each other’s MAC addresses so there’s no need for ARP communication first), VM-A will send Ethernet frames to the Distributed Switch on HostA, which will then broadcast them to the Layer 2 network its physical NIC assigned to the Distributed Switch is connected to. Since these broadcast frames won’t reach the physical NIC of HostB, the VM-B there never sees them, and so the two VMs cannot communicate with each other. 

So – keep in mind that all physical NICs connecting to the same Distributed Switch must be on the same Layer 2 network. If the underlying physical NICs are on separate Layer 3 networks, it doesn’t matter that those networks can route to each other – the VMs in the port groups still won’t be able to communicate. The PowerCLI sketch below shows just how easily you can build this broken setup without any warnings.
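
A quick PowerCLI sketch of exactly that scenario – all the names (vCenter, datacenter, hosts, vmnic1) are hypothetical, and it assumes the VMware PowerCLI module is installed:

Connect-VIServer vcenter.example.com
$vds = New-VDSwitch -Name 'LabVDS' -Location (Get-Datacenter 'LabDC')
foreach ($name in 'HostA','HostB') {
    $vmhost = Get-VMHost $name
    Add-VDSwitchVMHost -VDSwitch $vds -VMHost $vmhost
    # Attach one physical NIC from each host as an uplink
    $nic = Get-VMHostNetworkAdapter -VMHost $vmhost -Physical -Name 'vmnic1'
    Add-VDSwitchPhysicalNetworkAdapter -DistributedSwitch $vds -VMHostPhysicalNic $nic -Confirm:$false
}
# The port group both VMs will connect to
New-VDPortgroup -VDSwitch $vds -Name 'VM-PortGroup'

Nothing in that sequence errors out – the Layer 2 mismatch only shows up when VM-A and VM-B try to talk.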

And this is where NSX comes in. Using the concept of VXLANs, NSX stretches a Layer 2 network across Layer 3. Basically it encapsulates the Layer 2 traffic within Layer 3 packets, giving the illusion that all the VMs are on the same Layer 2 network – and this illusion is what Network Virtualization is all about, right? :) VXLAN is an overlay network.

VXLAN encapsulates Layer 2 frames in UDP packets (the IANA-assigned port is 4789, though older NSX installs used 8472). The VXLAN is like a tunnel that all the hosts participating in that VXLAN hook into. On each host there’s something called a VXLAN Tunnel End Point (VTEP), which is the “thing” that actually hooks into the VXLAN. If a VXLAN is like a Distributed Switch made up of physical NICs from the hosts, the VTEPs are like the VMKernel ports of that Distributed Switch that do the actual communication (the way vMotion traffic between two hosts happens via the VMKernel ports you assign for vMotion). In fact, during an NSX install you install three VIBs on the ESXi hosts – one of these enhances the existing Distributed Switch with VXLAN capabilities (the encapsulation stuff I mentioned above). 
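
As an aside, once a host is NSX-prepared you can see the VIBs and the VXLAN plumbing from the host shell. A sketch from memory – verify the exact namespaces against your NSX version (‘LabVDS’ is a hypothetical VDS name):

esxcli software vib list | grep esx-                # the NSX VIBs show up here
esxcli network vswitch dvs vmware vxlan list        # VXLAN-enabled VDSes and their VTEP config
esxcli network vswitch dvs vmware vxlan network list --vds-name LabVDS   # the VNIs on that VDS
esxcli network ip interface list                    # the VTEP vmk ports appear alongside the regular vmk ports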

Once you have NSX you can create multiple Logical Switches. These are basically VXLAN switches that operate like Layer 2 switches but can actually stretch across multiple Layer 3 networks. Logical Switches are overlay switches. ;o) Each Logical Switch corresponds to one VXLAN. 

ps. VXLAN is one of the cool features of NSX. The other cool features are the Distributed Logical Router (DLR) and the Distributed Firewall (DFW). Remember I mentioned that an ESXi host has 3 VIBs installed as part of NSX, and that one of them provides the VXLAN functionality? Well, the other two are DLR and DFW (god, so many acronyms!). Prior to DLR, if an ESXi host had two VMs connected to different Distributed Switches, and these two VMs wanted to talk to each other, the traffic would go down from one of the VMs, to the host, to the underlying physical network router, and back to the host and up to the VM on the other Distributed Switch. But with DLR the ESXi hypervisor kernel can do Layer 3 routing too, so it will simply send traffic directly to the VM on the other Distributed Switch. 

Similarly, DFW just means each ESXi hypervisor can also apply firewall rules to packets, so you no longer need one centralized firewall. You simply create rules and push them out to the ESXi hosts, and they can do firewalling between VMs. Everything is virtual! :)
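
(If you want to see these two in action on a prepared host, there are host-shell tools for them too. Again a sketch from memory, so treat the exact syntax as an assumption to verify against the NSX docs:)

net-vdr --instance -l                 # list the DLR instances running on this host
vsipioctl getfilters                  # list the DFW filters attached to each vNIC
vsipioctl getrules -f <filter-name>   # show the firewall rules on a given filter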

pps. Some other jargon. East-West traffic means network traffic that’s within or between servers (ESXi hosts in our case). North-South traffic means any other network traffic – basically, traffic that leaves this layer of ESXi hosts. With NSX you try to have more traffic go East-West rather than North-South. 

TIL: Transparent Page Sharing (TPS) between VMs is disabled by default

(TIL is short for “Today I Learned” by the way).

I always thought an ESXi host did some page sharing of VM memory between the VMs running on it. The feature is called Transparent Page Sharing (TPS) and it’s something I remember from my VMware course and have also read about in blog posts such as this and this. The idea is that if you have (say) a bunch of Server 2012 R2 VMs running on a host, it’s quite likely these VMs have a lot of common stuff in RAM, so it makes sense for the ESXi host to share that common stuff between the VMs. So even if each of (say) two such VMs has 4 GB RAM assigned to it, and there’s about 2 GB worth of stuff common between them, the host only needs 2 GB of shared RAM + 2 x 2 GB of private RAM, for a total of 6 GB. 

Apart from this, as the host comes under increased memory pressure it resorts to techniques like ballooning and memory swapping to free up RAM. 

I even made a script today to list all the VMs in our environment that are powered on and have more than 8 GB RAM assigned, along with the amount of shared RAM (just for my own info) – something along the lines of the sketch below. 
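
A rough PowerCLI reconstruction of what such a script could look like (not the actual script; assumes you’re already connected via Connect-VIServer):

# Powered-on VMs with more than 8 GB RAM, plus their shared memory.
# SharedMemory in QuickStats is reported in MB, so divide by 1024 for GB.
Get-VM |
    Where-Object { $_.PowerState -eq 'PoweredOn' -and $_.MemoryGB -gt 8 } |
    Select-Object Name, MemoryGB,
        @{ Name = 'SharedGB'; Expression = { [math]::Round($_.ExtensionData.Summary.QuickStats.SharedMemory / 1KB, 1) } } |
    Sort-Object SharedGB -Descending |
    Format-Table -AutoSize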

Anyhow – around 2015 VMware stopped page sharing of VM memory between VMs by default. VMware calls this sort of RAM sharing inter-VM TPS. Apparently it’s a security risk, and VMware likes to ship their products as secure by default, so via some patches to the 5.x series (and as the default in the 6.x series) it turned off inter-VM TPS and introduced some controls that allow IT admins to turn it back on if they so wish. Intra-VM TPS is still enabled – i.e. the ESXi host will still do page sharing within each VM – but it no longer does page sharing between VMs by default. 

Using the newly introduced controls, however, it is possible to enable inter-VM TPS for all VMs, or selectively between some VMs. Quoting from this blog post:

You can set a virtual machine’s advanced parameter sched.mem.pshare.salt to control its ability to participate in transparent page sharing.  

TPS is only allowed within a virtual machine (intra-VM TPS) by default, because the ESXi host configuration option Mem.ShareForceSalting is set to 2, the sched.mem.pshare.salt is not present in the virtual machine configuration file, and thus the virtual machine salt value is set to a unique value. In this case, to allow TPS among a specific set of virtual machines, set the sched.mem.pshare.salt of each virtual machine in the set to an identical value.  

Alternatively, to enable TPS among all virtual machines (inter-VM TPS), you can set Mem.ShareForceSalting to 0, which causes sched.mem.pshare.salt to be ignored and to have no impact.

Or, to enable inter-VM TPS as the default, but yet allow the use of sched.mem.pshare.salt to control the effect of TPS per virtual machine, set the value of Mem.ShareForceSalting to 1. In this case, change the value of sched.mem.pshare.salt per virtual machine to prevent it from sharing with all virtual machines and restrict it to sharing with those that have an identical setting.

Nice! 
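
In PowerCLI terms that might look roughly as follows – a sketch, with hypothetical VM and host names:

# Allow inter-VM TPS between a specific set of VMs by giving them the same salt
# (Mem.ShareForceSalting stays at its default of 2).
foreach ($vmName in 'App01','App02') {
    New-AdvancedSetting -Entity (Get-VM $vmName) -Name 'sched.mem.pshare.salt' -Value 'tps-group-1' -Confirm:$false
}

# Or enable inter-VM TPS for all VMs on a host by setting Mem.ShareForceSalting to 0.
Get-AdvancedSetting -Entity (Get-VMHost 'esx01.example.com') -Name 'Mem.ShareForceSalting' |
    Set-AdvancedSetting -Value 0 -Confirm:$false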

I wonder if intra-VM TPS alone yields much memory savings. Looking at the output of my script for our estate, I see that many of our server VMs report about half their allocated RAM as shared, so it does make an impact. I guess it will also make a difference when moving to a container architecture, wherein a single VM might host many containers. 

I would also like to point to this blog post and another blog post I came across from it, on whether inter-VM TPS even makes much sense in today’s environments and on the kind of impact it can have during vMotion etc. Good stuff. I am still reading these but wanted to link to them for reference. Mainly: nowadays we have larger page sizes, so the probability of finding an identical page to share between two VMs is low; NUMA places memory pages close to the CPU that uses them, and TPS could disrupt that; and TPS is a process that runs periodically to compare pages, so there is an operational cost as it scans, finds a candidate match, and then does a full compare of the two pages to ensure they really are identical. 

Good to know. 

Brief notes on the Citrix STA

Wanted to point out this PDF from Citrix on the XenApp/ XenDesktop architecture – especially pages 21-22, which cover how authentication works. During my Citrix course the instructor talked about it, but like an idiot I didn’t take notes and now I can’t find much info on what he was explaining. 

The part which is of interest to me is the STA (Secure Ticket Authority). 

There are a couple of steps that happen when a user logs in to access a Citrix solution. First, the StoreFront authenticates the user against AD. Or, if the user is accessing remotely, the NetScaler Gateway authenticates the user and passes the details on to the StoreFront. Then the StoreFront passes this information to the Delivery Controller so the latter can produce a list of resources the user has access to (the Delivery Controllers in turn authenticate the user against AD). The Delivery Controller sends the list of resources back to the StoreFront, which sends it on to the user’s Citrix Receiver or browser. This is when users see what’s available to them and can select what they want.

When the user selects what they want, this information is passed to the StoreFront, which passes it on to the Delivery Controller – which then finds an appropriate host that can fulfil the request and sends that host’s details back to the StoreFront. 

The next step is where the STA comes in. 

If the user is accessing Citrix locally, the StoreFront can create an ICA file with the details of the host and send it over to the user’s Citrix Receiver or browser, which can then talk directly to the VDA installed on the host (note that the StoreFront and Delivery Controller have no further role to play). But what if the user is accessing remotely? We don’t want to send these sensitive details over the public Internet. So, as a workaround, Citrix creates a “ticket” (a randomly generated sequence of 32 uppercase alphabetic or numeric characters) and associates the ticket with the details of the host that the Citrix Receiver or browser needs to contact for the requested resource. This ticket is what is sent to the Citrix Receiver or browser in the ICA file; the client presents it to the NetScaler Gateway, which validates the ticket and initiates a connection with the VDA on the host on behalf of the user. 

So, as we can see, the STA only comes into play for remote access. The STA is part of the Citrix XML Service (once again linking to this excellent post!), which is installed as part of the Delivery Controller (so the STA is part of the Delivery Controller). It is written as an ISAPI extension (called CtxSta.dll) for the IIS web server and runs at the /Scripts/CtxSta.dll URL. The STA has an ID, the STA_ID, and this – along with the TICKET and an STA_VERSION field – is what goes into the ICA file. I am not sure whether the STA requires IIS or can run standalone (as I blogged previously, the Citrix XML Service can run standalone, so I would assume the STA can do the same). The Citrix STA FAQ says IIS is required, but that could be outdated.
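
For illustration, the relevant lines of a gateway-bound ICA file look roughly like this – values made up, and the Address line layout (version;STA ID;ticket) is from memory, so verify against your own ICA files:

[Notepad]
; No real host address here – just the STA version, STA ID, and the ticket
Address=;40;STA1234ABCD5678;F9A3C2D41B7E8F60A5D4C3B2A1908D7E
SSLEnable=On
SSLProxyHost=gateway.example.com:443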

The Citrix StoreFront is configured with the STA details in the NetScaler Gateway section (remember, you only need the STA for remote users, for which you would have configured a NetScaler Gateway). 

Similarly the NetScaler itself is configured with the STA details. 

It is important to keep in mind that there are thus TWO places where the STA details are entered, and the details in both places must be the same. The StoreFront uses its configured STA to generate a ticket and put it in the ICA file; the NetScaler uses its configured STA details to validate that ticket and identify which host it should connect to. If the two sets of details are not identical you will not be able to launch any resources! (I had this problem at work today, which is why I decided to refresh my knowledge of STAs and write this blog post. If the details are not identical you get a “Cannot start App:” error, because the ticket the client presents cannot be validated or used by the NetScaler.) 

Just as an aside to myself – the port used to talk to the VDA is 1494 (ICA) or 2598 (CGP, i.e. session reliability). This is the case whether the Citrix Receiver or browser contacts the VDA directly, or the NetScaler Gateway does so on their behalf. I like to remember port numbers. :o) 

Also – there is nothing that ties a particular STA-generated ticket to the device the request was made from. That is, in theory a remote user could make a request from Computer A, get the ICA file, and run it on Computer B – and NetScaler + STA will happily let the user access resources. A ticket only has a 100-second validity by default though, so they’d have to do this switch-over quickly. ;o) Also, a ticket can only be used once. (This and more info are from the very informative Citrix STA FAQ, by the way.)