Creative Commons Attribution 4.0 International License
© Rakhesh Sasidharan

Refresher to myself: StoreFront and Delivery Controller authentication

In a previous post I had written about the flow of communication between Citrix Storefront and Delivery Controllers during user authentication. Here’s some more based on a Citrix blog post I am reading. 

Here’s what I had written in my previous post:

There’s a couple of steps that happen when a user logs in to access a Citrix solution. First: the StoreFront authenticates the user against AD. Or if the user is accessing remotely, the NetScaler gateway authenticates the user and passes on details to the StoreFront. Then the StoreFront passes on this information to the Delivery Controller so the latter can give a list of resources the user has access to. The Delivery Controllers in turn authenticate the user against AD. The Delivery Controller then sends a list of resources the user has access to, to the StoreFront, which sends this on to the user’s Citrix Receiver or Browser. This is when the user sees what is available to them, and can select what they want.

When the user selects what they want, this information is passed on to the StoreFront, which then passes the info to the Delivery Controller – who then finds an appropriate host that can fulfill the requirement and sends this information to the StoreFront.

Emphasis mine. The Storefront communicates with the Delivery Controller using the XML Service. 

Here’s a list of authentication methods supported by the Storefront. 

When the Storefront communicates the user authentication information to the Delivery Controller, it may or may not include the password too (sent in clear-text) in this communication. If “User name and password” or “Pass-through from NetScaler” is selected, then the password is included. If “Domain pass-through” or “Smart card” is selected, then the password is not. The blog post doesn’t say anything about these, but I think “SAML Authentication” (used for ADFS) will not include the password, while “HTTP Basic” will. 

The StoreFront and Delivery Controller communicate twice (the two times I emphasized above). The first time is when the user authenticates and the StoreFront sends this information to the Delivery Controller to get a list of resources. The second time is when the user makes a selection and this information is passed on to the Delivery Controller so that an appropriate host can be selected. In both instances the password could be sent from the StoreFront to the Delivery Controller.
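As a quick reference, the behaviour described above can be written down as a lookup table. This is only a sketch of my notes, not anything StoreFront exposes; the function name is made up, and the SAML and HTTP Basic entries are my guesses rather than something the Citrix blog post confirms.

```python
# Whether StoreFront includes the user's password when it talks to the
# Delivery Controller, keyed by the authentication method selected.
PASSWORD_FORWARDED = {
    "User name and password": True,
    "Pass-through from NetScaler": True,
    "Domain pass-through": False,
    "Smart card": False,
    # The two entries below are guesses, not confirmed by the Citrix post.
    "SAML Authentication": False,
    "HTTP Basic": True,
}


def password_sent_to_controller(method: str) -> bool:
    """Return True if the password travels with the request (hypothetical helper)."""
    return PASSWORD_FORWARDED[method]
```

The same answer applies to both conversations (resource enumeration and launch), since the method does not change between them.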

A nice quote from Maigret “Night at the Crossroads”

On men. I liked it. Found it to be very insightful and true. 

I liked Michonnet. 

He didn’t want to protect me or kiss me, or own me. He just saw a scheme where he could make some money. 

Men aren’t usually that honest with themselves. Women are a fantasy, or a path to redemption, or a way they can escape their life. 

Maybe all men want to trap you in the end. And I was sick of that. 

Binge watching updates…

After a long time I spent the past two days (today & yesterday) doing nothing but binge watch. Family gone over to India for a few days, I am all to myself. Didn’t do any NSX or Citrix or study – simply plonked my feet up on the couch, hogged food, and watched TV.

Legion

First up was Legion, which I had high expectations from coz it’s by Noah Hawley (of Fargo TV series fame). It was good but I wasn’t too impressed mainly coz I had high hopes I guess. Think I expected something like Fargo, while this was different. It’s visually stunning – the way the scenes are taken, the music, the performances – but wasn’t entirely my cup of tea. I know it’s a “me” thing so please don’t take this as a review/ comment on the show itself. I can’t even imagine what sort of a creative mind someone must have to imagine and execute the stuff on that show. It’s simply mind blowing!

I didn’t realize the lead character Dan Stevens was the same whose voice I knew from audiobooks. I had listened to him in the Agatha Christie audiobook “And Then There Were None”, loved his voice in that and searched for more audiobooks, found he’s also done Mary Shelley’s “Frankenstein” (downloaded, not listened to yet) and the first James Bond book “Casino Royale” in the celebrity recordings (loved that performance!). Only when Dan Stevens began talking with the British accent did I feel that hmm this sounds familiar and realize that I had heard his voice in Audible. 

Anyways. Nice show. Very well taken. Wasn’t entirely my cup of tea. (Unlike, for instance, Stranger Things or The OA – which are of a similar mood and which I loved and associated with a lot more).

11.22.63

Honestly, I thought this must be some horror show considering it’s Stephen King. Didn’t realize it was about time traveling and preventing the JFK assassination. It was wonderful! I loved this show. And James Franco was awesome.

Interesting aside on James Franco – I am nowadays listening to his performance of “Slaughter House 5” by Kurt Vonnegut. He’s great in that. It’s a great book and James Franco has done an amazing job of it. Interesting how that book also has time traveling and talk of how everything just is and we are all in amber and questions of cause & effect & why are just human limitations etc. And then I see 11.22.63 which touches on similar stuff, especially with the past pushing back etc.

Also, the 60s set and simpler culture was a pleasure to watch. At the same time sad to see some of the stuff like treatment of women and blacks. Every age has its pluses and minuses. :-/

Anyhoo. 11.22.63. Nice show. And loved James Franco!

Maigret’s Dead Man

Came across this by mistake. Checked it out coz it’s got Rowan Atkinson in it. Enjoyed it. Didn’t realize it’s actually the second episode of a reboot show. Got to watch the other episodes now. 

This show too is set in an older time. Was fun to see that. A very well taken movie/ episode over all. 

Maigret: Night at the Crossroads

Managed to watch this later on. This is the first episode in the second season. The previous one I had seen was the second episode in the first season. I haven’t managed to get hold of the first episode of the first season; and I believe there’s one more episode in this second season. 

Anyways. It was a good watch. I thoroughly enjoyed it. It also reminded me a lot of “Foyle’s War” – which is a show I had similarly enjoyed. Both shows have similar pacing and music. Slow procedural mysteries with a main detective and his subordinates. 

Maigret sets a trap

Hurray, managed to watch this one too! I actually saw this and “Night at the Crossroads” after “Split” but thought I’d put them together with the first Maigret episode I watched. 

Am surprised “Maigret sets a trap” was the first episode of the reboot. It’s very different from the rest. Maigret is under pressure, his superiors want him off the case coz they believe he is not delivering, Maigret is moody himself due to this and clutching at straws, even his subordinates are a bit unsure if Maigret can pull this one off. The case itself is a very odd one. No clues, no connections, and we the viewers are left in suspense till the end as to whether Maigret caught the wrong man. It’s all kind of flimsy after all. But no – Maigret did catch the right man, and it’s all explained very well actually. A different but very nice episode. Fitting, in a way, for me to have ended my binge watching with this one. This is the kind of episode I’d have put across as a season finale. 

Looking forward to the next episode!

Split

Ok so this one wasn’t how I expected it to be. I was expecting some psychological thriller or more focus on the personalities themselves. Totally didn’t expect The Beast to actually appear in the end! It’s sort of like how I never expected aliens in Shyamalan’s “Signs” and boom! they make an appearance. Great performances by James McAvoy and a well taken movie over all. 

Oh. And the “Unbreakable” reference in the end? Totally didn’t expect that. Ooooh. “Unbreakable” is one of my favorite Shyamalan movies (THE favorite movie I’d say). 

I have to stop thinking of Shyamalan as a director with a twist in the end. It’s all coz of “The Sixth Sense” and “Unbreakable” and “Signs”. Got to keep in mind that one can expect monsters and aliens and all that stuff. He is more into the horror thrill genre now.

Miss Sloane

I started watching this movie thinking it would be action thriller like the Bourne movies or something. ;-) After I realized it was about lobbying and senate hearings and bill passing etc I had a good mind to stop watching … but for the character of Miss Sloane! Boy she was something. What a character. An odd, cold, personality … it was something! A great movie. More than that, a great character. And a good insight into the kind of stuff that happens as part of lobbying (most of which made no sense to me and was of no interest). 

That’s all for now!

Update:

The Dressmaker

Saw this the next day but thought I’d add it with the rest anyways. God, what a bore of a movie. The synopsis mentioned this being a revenge story or something, so I imagined something along the lines of “The Count of Monte Cristo”. There’s some revenge alright – towards the end – but it’s a drag until then with some nice moments interspersed here and there. The movie’s nearly 2 hours long. Think I could have done something way more useful with that time! Bleh.

Hugo Weaving’s character was quite good by the way. Very different to his other roles. The story is good; the movie is good too, am sure, for others – just wasn’t my cup of tea. This is a revenge story with a lot of drama. I want a revenge story with a lot more action and speed. 

[Aside] The Ultimate Guide To Being An Introvert – Altucher Confidential

I tweeted this link but then thought I should put it on my blog too mainly as a reference to myself. Sometimes I wander through my blog looking for wisdom and I hope to find this post then. A great read, especially if you are an introvert and view that/ have been told that it’s a bad thing. 

http://www.jamesaltucher.com/2017/04/ultimate-guide-being-introvert/

Read the full article (it is long); here’s an excerpt I liked. 

Being an introvert has nothing to do with being shy. Or being outgoing or not outgoing. Or being socially awkward.

All it means is that some people recharge when they are by themselves (introverts).

Other people recharge when they are interacting with many other people (extraverts) and most people are in the middle.

I lose energy very quickly when in a group of people. Getting invited to a party is horrible for me.

I say “no” to almost every social situation. Because I know they will take energy away from me doing the things I love.
If I’m giving a talk it’s no problem. Because I’m by myself on the stage. It’s one to many instead of me just one in a mess of people. I recharge on the stage.

Notes on MCS disks

Primer 1. Primer 2. MCS Prep overview (good post, I don’t refer to all its points below). 

  • MCS creates a snapshot of the master VM you specify, but if you specify a snapshot it will not create another one. 
  • This snapshot is used to create a full clone. A full snapshot, so to say. 
    • This way the image used by the catalog is independent of the master VM. 
    • During the preparation of this full snapshot an “instruction disk” is attached to the VM that is temporarily created using the full snapshot. This disk enables DHCP on all interfaces of the full snapshot; does some KMS related tasks; and runs vDisk inventory collection if required.
  • This full snapshot is stored on each storage repository that is used by Desktop Studio. 
    • This full snapshot is shared by all VMs on that storage repository. 
  • Each storage repository will also have an identity disk (16 MB) per VM.
  • Each storage repository will also have a delta/ difference disk per VM.
    • This is thin provisioned if the storage supports it.
    • Can increase up to the maximum size of the VM.

Remember my previous post on the types:

  • Random.
    • Delta disk is deleted during reboot. 
  • Static + Save changes.
    • Changes are saved to a vDisk. 
    • Delta disk not used?
  • Static + Dedicated VM.
    • Delta disk is not deleted during reboot. 
    • Important to keep in mind: if the master image in the catalog is updated, existing VMs do not automatically start using it upon next reboot. Only newly created dedicated VMs use the new image. 
    • The delta disk is deleted when the master image is updated and existing VMs are made to use the new image (basically, new VMs are created and the delta disk starts from scratch; user customizations are lost). 
    • Better to use desktop management tools (of the OS) to keep dedicated VMs up to date coz of the above issue. 
  • Static + Discard changes.
    • Delta disk is deleted during reboot. 

A post on sealing the vDisk after changes. Didn’t realize there’s so many steps to be done. 

[Aside] Windows Update tools

Wanted to link to these as I came across them while searching for something Windows Updates related.

  • ABC-Update – didn’t try it out but looks useful from a client side point of view. Free.
  • WuInstall – seems to be a client and server thing. Putting it here so I find it if ever needed in future. Paid.
  • Windows Update PowerShell Module – you had me at PowerShell! :0)
    • A blog post explaining this module. Just in case.

[Aside] Memory Resource Management in ESXi

Came across this PDF from VMware while reading on memory management. It’s dated, but a good read. Below are some notes I took while reading it. Wanted to link to the PDF and also put these somewhere; hence this post.

Some terminology:

  • Host physical memory <–[mapped to]– Guest physical memory (continuous virtual address space presented by Hypervisor to Guest OS) <–[mapped to]– Guest virtual memory (continuous virtual address space presented by Guest OS to its applications).
    • Guest virtual -> Guest physical mapping is in Guest OS page tables
    • Guest physical -> Host physical mapping is in pmap data structure
      • There’s also a shadow page table that the Hypervisor maintains for Guest virtual -> Guest physical
      • A VM does Guest virtual -> Guest physical mapping via hardware Translation Lookaside Buffers (TLBs). The hypervisor intercepts calls to these; and uses these to keep its shadow page tables up to date.
  • Guest physical memory -> Guest swap device (disk) == Guest level paging.
  • Guest physical memory -> Host swap device (disk) == Hypervisor swapping.
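To keep the two levels of mapping straight in my head, here is a toy model. It is a sketch only: the class names are made up, pages are just integers, and it ignores TLBs and shadow page tables entirely. The guest keeps its own page table (guest virtual -> guest physical), the hypervisor keeps a pmap (guest physical -> host physical), and host pages are only handed out the first time a guest page is touched.

```python
class Hypervisor:
    """Toy hypervisor: backs guest physical pages with host pages on first touch."""

    def __init__(self, host_pages):
        self.free_host_pages = list(range(host_pages))
        self.pmap = {}  # (vm name, guest physical page) -> host physical page

    def translate(self, vm_name, guest_physical_page):
        key = (vm_name, guest_physical_page)
        if key not in self.pmap:
            # Equivalent of intercepting the page fault and allocating memory.
            if not self.free_host_pages:
                raise MemoryError("no free host physical pages")
            self.pmap[key] = self.free_host_pages.pop(0)
        return self.pmap[key]


class Guest:
    """Toy guest OS: maps guest virtual pages to guest physical pages."""

    def __init__(self, name, hypervisor):
        self.name = name
        self.hypervisor = hypervisor
        self.page_table = {}  # guest virtual page -> guest physical page
        self.next_guest_physical = 0

    def access(self, guest_virtual_page):
        if guest_virtual_page not in self.page_table:
            self.page_table[guest_virtual_page] = self.next_guest_physical
            self.next_guest_physical += 1
        guest_physical = self.page_table[guest_virtual_page]
        return self.hypervisor.translate(self.name, guest_physical)
```

Two guests touching the same guest virtual page number end up on different host pages, because each level of mapping is private to its owner.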

Some interesting bits on the process:

  • Applications use OS provided interfaces to allocate & de-allocate memory.
  • OSes have different implementations on how memory is classified as free or allocated. For example: two lists.
  • A VM has no pre-allocated physical memory.
  • Hypervisor maintains its own data structures for free and allocated memory for a VM.
  • Allocating memory for a VM is easy. When the VM Guest OS makes a request to a certain location, it will generate a page fault. The hypervisor can capture that and allocate memory.
  • De-allocation is tricky because there’s no way for the hypervisor to know the memory is not in use. These lists are internal to the OS. So there’s no straight-forward way to take back memory from a VM.
  • The host physical memory assigned to a VM doesn’t keep growing indefinitely though as the guest OS will free and allocate within the range assigned to it, so it will stick within what it has. And side by side the hypervisor tries to take back memory anyways.
    • Only when the VM tries to access memory that is not actually mapped to host physical memory does a page fault happen. The hypervisor will intercept that and allocate memory.
  • For de-allocation, the hypervisor adds the VM assigned memory to a free list. Actual data in the physical memory may not be modified. Only when that physical memory is subsequently allocated to some other VM does it get zeroed out.
  • Ballooning is one way of reclaiming memory from the VM. This is a driver loaded in the Guest OS.
    • Hypervisor tells ballooning driver how much memory it needs back.
    • Driver will pin those memory pages using Guest OS APIs (so the Guest OS thinks those pages are in use and should not assign to anyone else).
    • Driver will inform Hypervisor it has done this. And Hypervisor will remove the physical backing of those pages from physical memory and assign it to other VMs.
    • Basically the balloon driver inflates the VM’s memory usage, giving it the impression a lot of memory is in use. Hence the term “balloon”.
  • Another way is Hypervisor swapping. In this the Hypervisor swaps to physical disk some of the physical memory it has assigned to the VM. So what the VM thinks is physical memory is actually on disk. This is basically swapping – just that it’s done by Hypervisor, instead of Guest OS.
    • This is not at all preferred coz it’s obviously going to affect VM performance.
    • Moreover, the Guest OS too could swap the same memory pages to its disk if it is under memory pressure. Hence double paging.
  • Ballooning is slow. Hypervisor swapping is fast. Ballooning is preferred though; Hypervisor swapping is only used when under lots of pressure.
  • Host (Hypervisor) has 4 memory states (view this via esxtop, press m).
    • High == All Good
    • Soft == Start ballooning. (Starts before the soft state is actually reached).
    • Hard == Hypervisor swapping too.
    • Low == Hypervisor swapping + block VMs that use more memory than their target allocations.
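The list of states above, plus the ballooning steps, can be sketched like this. It is my reading of the PDF, not ESXi internals: the names are made up, and I have left out the free-memory thresholds that trigger each state since I did not note them down.

```python
# Which reclamation techniques are in play in each host memory state.
RECLAMATION_BY_STATE = {
    "high": [],
    "soft": ["ballooning"],
    "hard": ["ballooning", "hypervisor swapping"],
    "low": ["hypervisor swapping", "block VMs above their target allocation"],
}


class BalloonedVM:
    """Toy VM whose balloon driver hands pages back to the hypervisor."""

    def __init__(self, backed_pages):
        self.backed_pages = backed_pages  # guest pages with host physical backing
        self.pinned_by_balloon = 0        # guest pages pinned by the balloon driver

    def inflate(self, pages_requested):
        """Pin pages inside the guest; the hypervisor then drops their backing.

        Returns the number of host pages the hypervisor gets back.
        """
        pages = min(pages_requested, self.backed_pages)
        self.pinned_by_balloon += pages
        self.backed_pages -= pages
        return pages
```

The guest still believes the pinned pages are in use, which is exactly why it will not hand them to anything else.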

 

TIL: vCenter inherited permissions are not cumulative

Say you are part of two groups. Group A has full rights on the vCenter. Group B has limited rights on a cluster.

You would imagine that since you are a member of Group A and that has full rights on vCenter itself, your rights on the cluster in question won’t be limited. But nope, you are wrong. Since you are a member of Group B and that has limited rights on the cluster, your rights too are restricted. Bummer if you are a member of multiple groups and some of these groups have limited rights on child objects! :o)

Workaround is to add yourself or Group A explicitly on that cluster, with full rights. Then the permissions become cumulative.
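Here is how I picture the rule, as a simplified sketch. The function, the object names and the privilege names are all made up for illustration; this is not the vSphere API. A permission set directly on an object wins over anything inherited, and only permissions defined at that same level are combined.

```python
def effective_privileges(user_groups, obj, parents, permissions):
    """Return the privileges a user ends up with on obj.

    parents maps an object to its parent (None at the top).
    permissions maps (object, group) to a set of privileges.
    The closest level with a matching permission wins; groups at that
    level are unioned, and anything higher up is ignored.
    """
    current = obj
    while current is not None:
        at_level = [
            privileges
            for (target, group), privileges in permissions.items()
            if target == current and group in user_groups
        ]
        if at_level:
            return set().union(*at_level)
        current = parents.get(current)
    return set()


parents = {"cluster": "vcenter", "vcenter": None}
full = {"read", "write", "admin"}
permissions = {
    ("vcenter", "Group A"): full,
    ("cluster", "Group B"): {"read"},
}
```

With the data above, a member of both groups gets only `{"read"}` on the cluster. Adding `("cluster", "Group A"): full` is the workaround: both groups now match at the cluster level and the result is the union.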

MCS choices (RAM cache & Disk cache)

Just a reminder to myself …

When creating a Desktop based Machine Catalog here are my choices:

If I choose Random then I get the option to allocate some of my RAM towards a cache, and also create a disk cache. RAM cache means all data is written to RAM first and then to disk as RAM fills up. And disk cache is like the Write Cache disk in PVS – you can specify a separate disk (maybe local to the host, or SSD storage) where data is written to.

Important to keep in mind here that the actual VM disk will not have any data written to it. All data writes go either to the RAM cache or the Disk cache. First RAM cache, then Disk cache. Both are optional; best to have both (or at least don’t do RAM cache only unless you have oodles of RAM!).

Read this post – it’s a good one. Also, check out the official post from Citrix introducing this feature in XenDesktop 7.9. MCS (Machine Creation Services) that makes use of RAM or Disk cache is known as MCSIO (Machine Creation Services Storage Optimization (beats me how that acronym works! :p)).

MCS VMs have two disks apart from the OS base disk – an identity disk and a delta disk. MCSIO VMs too have the identity disk and delta disk, but the delta disk is only used for maintenance tasks. Hence my comment above that when using either of these cache options, the size you allocate for these is your write cache/ delta disk. 

If I choose static I have three further options. 

If I go with static + save changes to a personal vDisk, I don’t get the option for cache disk etc. I can only choose my vDisk letter and size. 

If I go with static + create a dedicated VM, again I don’t get any option for cache disk; I can only choose the copy mode (i.e. a linked clone or a full clone). 

If I go with static + discard all changes, then I get the option for cache disk and RAM allocation towards cache. Basically, static + discard is similar to random. You are not storing any changes, so it makes sense to use cache (RAM and/ or write cache). 

In the case of Server OS, I don’t have any choices (it’s always random) and I get the option for cache disk and RAM allocation.

MCSIO is only for non-persistent experiences. 
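The choices above boil down to a small decision function. This is just my notes in code form, with made-up argument values; it is not a Studio or PowerShell SDK call.

```python
def cache_options_offered(os_type, allocation=None, changes=None):
    """Return True if the catalog wizard offers RAM cache / disk cache (MCSIO).

    os_type:    "server" or "desktop"
    allocation: "random" or "static" (desktop only)
    changes:    "personal vdisk", "dedicated" or "discard" (static only)
    """
    if os_type == "server":
        return True  # Server OS catalogs are always random
    if allocation == "random":
        return True
    if allocation == "static":
        return changes == "discard"
    raise ValueError("desktop catalogs need allocation set to random or static")
```

In every case where it returns True nothing is persisted across reboots, which matches MCSIO being for non-persistent experiences only.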

Notes to self on XenServer storage

Playing with XenServer in my testlab (basically as a VM in VMware Workstation hah!) I ran into trouble while creating a Machine Catalog via Citrix Desktop Studio. I forget the exact message but it was about lack of resources. I could see that in the create catalog process it was creating a snapshot and making a copy VM, powering it on and off successfully, and then it was failing. I kept an eye on my storage during this and saw that indeed it was exceeding the allocated space. I had thought it would do thin provisioning but in retrospect I realize XenServer never asked me about thick or thin when I added my iSCSI storage. Hmm.

Well turns out that for iSCSI XenServer has only thick provisioning. You get thin provisioning only if you are using the ext3 filesystem or NFS. Since iSCSI uses LVM, bummer! 

Here’s a forum post on how to identify if your SR is thick or not. 

Regarding thin provisioning – it is only for locally attached storage (which can use ext3) or NFS. Block attached storage is thick.
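A tiny sketch of the rule, so I don't forget it again. The SR type strings are the ones I believe XenServer uses ("ext", "nfs", "lvm", "lvmoiscsi", "lvmohba"); treat both the list and the helper name as assumptions and check against your own pool.

```python
# File-based SRs can be thin provisioned; LVM-based (block) SRs are thick.
THIN_SR_TYPES = {"ext", "nfs"}
THICK_SR_TYPES = {"lvm", "lvmoiscsi", "lvmohba"}


def sr_provisioning(sr_type):
    """Return 'thin' or 'thick' for an SR type string (hypothetical helper)."""
    if sr_type in THIN_SR_TYPES:
        return "thin"
    if sr_type in THICK_SR_TYPES:
        return "thick"
    raise ValueError(f"unknown SR type: {sr_type}")
```

My iSCSI SR falls in the second set, which is why the catalog creation ran out of space.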

Before I realized all this I had spent some time Googling on how to create a thin provisioned SR (Storage Repository). I felt that maybe it’s a GUI restriction and I can work around it by using the CLI. Turns out I was wrong. Here’s an article that explains SRs in XenServer anyway. It’s a good read. Here’s an article just on enabling thin provisioning for ext3 SRs via the CLI. 

While on the topic of storage, this is something I wanted to blog about earlier but never got around to. When using SMB/ CIFS shares, XenServer only supports NTLMv1. Here are instructions on using NTLMv2.

Also, smbclient is a good tool to test SMB connects from a XenServer. Example:

That seems to work, but I get a logon failure. This is because I didn’t put the username in quotes. 

That works!

I have no idea what the three commands below do, except that they are to do with mounting an SMB/ CIFS share on a XenServer permanently. I had noted these commands as part of my would be blog post, but it’s been a while now and I forget. Sometime when I get around to doing SMB3 or NTLMv2 with XenServer again I hope to refer to these again and better explain. I don’t want to spend too much time on XenServers now and get sidetracked …

After issuing the above commands I think the shared folder is mounted only on one host in the pool. But right clicking on it and doing a repair will get it mounted on all hosts in the pool.

XenServer 7.0 and above support SMB for VM disk storage too. Prior versions support SMB only for ISO storage. 

Add-DnsServerZoneDelegation with multiple nameservers

Only reason I am creating this post is coz I Googled for the above and didn’t find any relevant hits.

I know I can use the Add-DnsServerZoneDelegation cmdlet to create a new delegated zone (basically a sub-domain of a zone hosted by our DNS server, wherein some other DNS server hosts that sub-domain and we merely delegate any requests for that sub-domain to this other DNS server). But I wasn’t sure how I’d add multiple name servers. The switches give an option to give an array of IP addresses, but that’s just an array of IP addresses for a single name server. What I wanted was to have an array of name servers each with their own IP.

Anyways, turns out all I had to do was run the command for each name server. 

Above I create a delegation from my zone “parentzone.com” to 3 DNS servers “DNS0[1-3].somedomain” (also specified by their respective IPs) for the sub-domain “subzone.parentzone.com”.
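For the record, the "one command per name server" idea can be sketched as a small generator of the command lines. The cmdlet and its -Name, -ChildZoneName, -NameServer and -IPAddress parameters are real; the helper name and the 10.0.0.x addresses are placeholders of mine, not the values I actually used.

```python
def delegation_commands(parent_zone, child_zone, name_servers):
    """Build one Add-DnsServerZoneDelegation command per (name server, IP) pair."""
    return [
        f'Add-DnsServerZoneDelegation -Name "{parent_zone}" '
        f'-ChildZoneName "{child_zone}" -NameServer "{server}" -IPAddress {ip}'
        for server, ip in name_servers
    ]


commands = delegation_commands(
    "parentzone.com",
    "subzone",
    [
        ("DNS01.somedomain", "10.0.0.1"),  # placeholder IPs
        ("DNS02.somedomain", "10.0.0.2"),
        ("DNS03.somedomain", "10.0.0.3"),
    ],
)
```

Each entry in `commands` is one line to run on the DNS server hosting the parent zone.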

NSX Firewall not working on Layer3; OpenBSD VMware Tools; IP Discovery, etc.

I have two security groups. Network 1 VMs (a group that contains my VMs in the 192.168.1.0/24) and Network 2 VMs (similar, for 192.168.2.0/24 network). 

Both are dynamic groups. I select members based on whether the VM name contains -n1 or -n2. (The whole exercise is just for fun/ getting to know this stuff). 

I have two firewall rules making use of these groups. Layer 2 and Layer 3. 

The Layer 2 rule works but the Layer 3 one does not! Weird. 

I decided to troubleshoot this via the command line. Figured it would be a good opportunity.

To troubleshoot I have to check the rules on the hosts (because remember, that’s where the firewall is; it’s a kernel module in each host). For that I need to get the host-id. For which I need to get the cluster-id. Sadly there’s no command to list all hosts (or at least I don’t know of any). 

So now I have my host-ids.

Let’s also take a look at my VMs (thankfully it’s a short list! I wonder how admins do this in real life):

We can see the filters applying to each VM.  To summarize:

And are these filters applying on the hosts themselves?

Hmm, that too looks fine. 

Next I picked up one of the rule sets and explored it further:

The Layer 3 & Layer 2 rules are in separate rule sets. I have marked the ones which I am interested in. One works, the other doesn’t. So I checked the address sets used by both:

Tada! And there we have the problem. The address set for the Layer 3 rule is empty. 

I checked this for the other rules too – same situation. I modified my Layer 3 rule to specifically target the subnets:

And the address set for that rule is not empty:

And because of this the firewall rules do work as expected. Hmm.

I modified this rule to be a group with my OpenBSD VMs from each network explicitly added to it (i.e. not dynamic membership in case that was causing an issue). But nope, same result – empty address set!

But the address set is now empty. :o)

So now I have an idea of the problem. I am not too surprised by this because I vaguely remember reading something about VMware Tools and IP detection inside a VM (i.e. NSX makes use of VMware Tools to know the IP address of a VM) and also because I am aware OpenBSD does not use the official VMware Tools package (it has its own and that only provides a subset of functions).

Googling a bit on this topic I came across the IP address Discovery section in the NSX Admin guide – prior to NSX 6.2 if VMware Tools wasn’t installed (or was stopped) NSX won’t be able to detect the IP address of the VM. Post NSX 6.2 it can do DHCP & ARP snooping to work around a missing/ stopped VMware Tools. We configure the latter in the host installation page:

I am going to go ahead and enable both on all my clusters. 

That helped. But it needs time. Initially the address set was empty. I started pings from one VM to another and the source VM IP was discovered and put in the address set; but since the destination VM wasn’t in the list traffic was still being allowed. I stopped pings, started pings, waited a while … tried again … and by then the second VM IP too was discovered and put in the address set – effectively blocking communication between them. 
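Why a half-populated address set still lets traffic through is easier to see in a toy model. This is only a sketch of the matching logic as I understand it, with made-up VM IPs and a default-allow fallback; it is not how the NSX kernel module is actually written.

```python
def verdict(src_ip, dst_ip, source_set, destination_set):
    """A block rule only hits when both ends are in its address sets.

    Anything the rule does not match falls through to the default allow.
    """
    if src_ip in source_set and dst_ip in destination_set:
        return "block"
    return "allow"


vm1, vm2 = "192.168.1.10", "192.168.2.10"  # hypothetical VM addresses

# 1. Nothing discovered yet: both address sets are empty.
stage1 = verdict(vm1, vm2, set(), set())
# 2. Only the pinging VM has been discovered via snooping.
stage2 = verdict(vm1, vm2, {vm1}, set())
# 3. Both VMs discovered.
stage3 = verdict(vm1, vm2, {vm1}, {vm2})
```

Stages 1 and 2 come out as "allow" and only stage 3 is "block", which matches what I saw while waiting for discovery to catch up.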

Side by side I installed a Windows 8.1 VM with VMware Tools etc and tested to see if it was being automatically picked up (I did this before enabling the snooping above). It was. In fact its IPv6 address too was discovered via VMware Tools and added to the list:

Nice! Picked up something interesting today. 

Nested XenServer crashes when scrubbing memory

In case anyone else runs into this. I noticed that both XenServer 6.5 and 7.0 crash at the memory scrubbing stage during boot up when run as a VM within VMware Workstation (and possibly other virtualization products too – I didn’t try it with anything else). 

Am guessing the crash happens because the memory is not really available (this being a nested VM) and so the process crashes. Anyhoo, the workaround is to disable memory scrubbing. Check this blog post for instructions. 

In brief, the instructions are to add the option bootscrub=false to the boot options. This is via the file /boot/extlinux.conf in XenServer 6.5; or via /boot/grub/grub.cfg in XenServer 7.0.
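If you would rather script the edit, something like the sketch below would do it for a single boot line. Big assumptions here: that the Xen image on the line ends in xen.gz and that hypervisor options go right after it. The sample line is illustrative and not copied from a real XenServer install, so check it against your own file (and keep a backup).

```python
def add_bootscrub_false(line):
    """Insert bootscrub=false after the Xen image on a boot line, if missing."""
    tokens = line.split(" ")
    if any(token.startswith("bootscrub=") for token in tokens):
        return line  # already set, leave the line alone
    for index, token in enumerate(tokens):
        if token.endswith("xen.gz"):
            tokens.insert(index + 1, "bootscrub=false")
            return " ".join(tokens)
    return line  # not a Xen boot line


# Illustrative line only.
sample = "append /boot/xen.gz dom0_mem=752M --- /boot/vmlinuz-xen root=LABEL=root"
patched = add_bootscrub_false(sample)
```

Running the function twice is harmless since it leaves a line alone once the option is present.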

Notes to self while installing NSX 6.3 (part 4)

Reading through the VMware NSX 6.3 Install Guide after having installed the DLR and ESG in my home lab. Continuing from the DLR section.

As I had mentioned earlier NSX provides routing via DLR or ESG.  

  • DLR == Distributed Logical Router.
  • ESG == Edge Services Gateway

DLR consists of an appliance that provides the control plane functionality. This appliance does not do any routing itself. The actual routing is done by the VIBs on the ESXi hosts. The appliance uses the NSX Controller to push out updates to the ESXi host. (Note: Only DLR. ESG does not depend on the Controller to push out routes). Couple of points to keep in mind:

  • A DLR instance cannot connect to logical switches in different transport zones. 
  • A DLR cannot connect to a dvPortgroup with VLAN ID 0.
  • A DLR cannot connect to a dvPortgroup with VLAN ID if that DLR also connects to logical switches spanning more than one VDS. 
    • This confused me. Why would a logical switch span more than one VDS? I dunno. There are reasons probably, same way you could have multiple clusters in same data center having different VDSes instead of using the same one. 
  • If you have portgroups on different VDSes with the same VLAN ID, and these VDSes share some hosts, then DLR cannot connect these. 

I am not entirely clear on the above points. They seem to be there mostly to ensure that the transport zones and logical switches align correctly, but I haven’t entirely understood it so I am simply going to make note as above and move on …

In a DLR the firewall rules only apply to the uplink interface and are limited to traffic destined for the edge virtual appliance. In other words they don’t apply to traffic between the logical switches a DLR instance connects. (Note that this refers to the firewall settings found under the DLR section, not in the Firewall section of NSX). 

A DLR has many interfaces. The one exposed to VMs for routing is the Logical InterFace (LIF). Here’s a screenshot from the interfaces on my DLR. 

The ones of type ‘Internal’ are the LIFs. These are the interfaces that the DLR will route between. Each LIF connects to a separate network – in my case a logical switch each. The IP address assigned to this LIF will be the address you set as gateway for the devices in that network. So for example: one of the LIFs has an IP address 192.168.1.253 and connects to my 192.168.1.0/24 segment. All the VMs there will have 192.168.1.253 as their default gateway. Suppose we ignore the ‘Uplink’ interface for now (it’s optional, I created it for the external routing to work), and all our DLR had were the two ‘Internal’ LIFs, and VMs on each side had the respective IP address set as their default gateway, then our DLR will enable routing between these two networks. 
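The two-LIF case can be sketched with Python's ipaddress module. The 192.168.1.253 LIF is from my lab as described above; the 192.168.2.253 address for the second LIF and the function names are assumptions for the sake of the example.

```python
import ipaddress

# LIF IP (the default gateway VMs use) -> network that LIF connects to.
LIFS = {
    "192.168.1.253": ipaddress.ip_network("192.168.1.0/24"),
    "192.168.2.253": ipaddress.ip_network("192.168.2.0/24"),  # assumed address
}


def lif_for(ip):
    """Return the LIF whose network contains ip, or None if not connected."""
    address = ipaddress.ip_address(ip)
    for lif_ip, network in LIFS.items():
        if address in network:
            return lif_ip
    return None


def route(src_ip, dst_ip):
    """Describe how a packet gets from src_ip to dst_ip through the DLR."""
    src_lif, dst_lif = lif_for(src_ip), lif_for(dst_ip)
    if src_lif is None or dst_lif is None:
        return "no connected route"
    if src_lif == dst_lif:
        return "same segment, no routing needed"
    return f"in via {src_lif}, out via {dst_lif}"
```

Anything outside the two connected networks gets "no connected route" here, which is where the optional Uplink interface would come in.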

Unlike a physical router though, which exists outside the virtual network and which you can point to as “here’s my router”, there’s no such concept with DLRs. The DLR isn’t a VM which you can point to as your router. Nor is it a VM to which packets between these networks (logical switches) are sent to for routing. The DLR, as mentioned above, is simply your ESXi hosts. Each ESXi host that has logical switches which a DLR connects into has this LIF created in them with that LIF IP address assigned to it and a virtual MAC so VMs can send packets to it. The DLR is your ESXi host. (That is pretty cool, isn’t it! I shouldn’t be amazed because I had mentioned it earlier when reading about all this, but it is still cool to actually “see” it once I have implemented).

Above screenshot is from my two VMs on the same VXLAN but on different hosts. Note that the default gateway (192.168.1.253) MAC is the same for both. Each of their hosts will respond to this MAC entry. 

(Note to self: Need to explore the net-vdr command sometime. Came across it as I was Googling on how to find the MAC address table seen by the LIF on a host. Didn’t want to get side-tracked so didn’t explore too much. There’s something called a VDR (not encountered it yet in my readings).

  • net-vdr -I -l will list all the VDRs on a host.
  • net-vdr -L -l <vdrname> will list the LIFs.
  • net-vdr -N -l <vdrname> will list the MAC addresses (ARP info)

)

When creating a DLR it is possible to create it with or without the appliance. Remember that the appliance provides the control plane functionality. It is the appliance that learns of new routes etc and pushes them to the DLR modules in the ESXi hosts. Without an appliance the DLR modules will do static routing (which might be more than enough, especially in a test environment like my nested lab for instance) so it is ok to skip it if your requirements are such. Adding an appliance means you get to (a) select if it is deployed in HA config (i.e. two appliances), (b) their locations etc, (c) IP address and such for the appliance, as well as enabling SSH. The appliance is connected to a different interface for HA and SSH – this is independent of the LIFs or Uplink interfaces. That interface isn’t used for any routing. 

Apart from the control plane, the appliance also controls the firewall on the DLR. If there’s no appliance you can’t make any firewall changes to the DLR – makes sense coz there’s nothing to change. You won’t be connecting to the DLR for SSH or anything coz you do that to the appliance on the HA interface. 

According to the docs you can’t add an appliance once a DLR instance is deployed. Not sure about that as I do see an option to deploy an appliance on my non-appliance DLR instance. Maybe it will fail when I actually try and create the appliance – I didn’t bother trying. 

Discovered this blog post while Googling for something. I’ve encountered & linked to his posts previously too. He has a lot of screenshots and step by step instructions. So worth a check out if you want to see some screenshots and much better explanation than me. :) Came across some commands from his blog which can be run on the NSX Controller to see the DLRs it is aware of and their interfaces. Pasting the output from my lab here for now, I will have to explore this later …

I have two DLRs. One has an appliance, other doesn’t. I made these two, and a bunch of logical switches to hook these to, to see if there’s any difference in functionality or options.

One thing I realized as part of this exercise is that a particular logical switch can only connect to one DLR. Initially I had one DLR which connected to 192.168.1.0/24 and 192.168.2.0/24. Its uplink was on logical switch 192.168.0.0/24 which is where the ESG too hooked into. Later when I made one more DLR with its own internal links and tried to connect its uplink to the 192.168.0.0/24 network used by the previous DLR, I saw that it didn’t even appear in the list of options. That’s when I realized it’s better to use a smaller range logical switch for the uplinks – like say a /30 network. This way each DLR instance connects to an ESG on its own /30 network logical switch (as in the output above). 
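
A quick way to plan those per-DLR transit networks is to carve /30s out of one block. The sketch below uses Python's standard ipaddress module; the 192.168.0.0/24 pool is only an assumed example, so substitute whatever range you actually set aside.

```python
import ipaddress

# Hypothetical pool set aside for DLR-to-ESG transit links.
TRANSIT_POOL = ipaddress.ip_network("192.168.0.0/24")

def transit_links(pool, count):
    """Return the first `count` /30 networks in the pool with their two usable addresses."""
    links = []
    for net in list(pool.subnets(new_prefix=30))[:count]:
        first_host, second_host = list(net.hosts())
        links.append((str(net), str(first_host), str(second_host)))
    return links

# One /30 per DLR: one usable address for the DLR uplink, one for the ESG.
for net, a, b in transit_links(TRANSIT_POOL, 2):
    print(net, a, b)
```

A /30 has exactly two usable addresses, which is all a point-to-point DLR-to-ESG link needs, and a single /24 yields 64 of them.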

A DLR can have up to 8 uplink interfaces and 1000 internal interfaces.


Moving on to ESG. This is a virtual appliance. While a DLR provides East-West routing (i.e. within the virtual environment), an ESG provides North-South routing (i.e. out of the virtual environment). The ESG also provides services such as DHCP, NAT, VPN, and Load Balancing. (Note to self: DLR does not provide DHCP or Load Balancing as one might expect (at least I did! :p). DLR provides DHCP Relay though). 

The uplink of an ESG will be a VDS (Distributed Switch) as that’s what eventually connects an ESXi environment to the physical network. 

An ESG needs an appliance to be deployed. You can enable/ disable SSH into this appliance. If enabled you can SSH into the ESG appliance from the uplink address or from any of the internal link IP addresses. In contrast, you can only SSH into a DLR instance if it has an associated appliance. Even then, you cannot SSH into the appliance from the internal LIFs (coz these don’t really exist, remember … they are on each ESXi host). With a DLR we have to SSH into the interface used for HA (this can be used even if there’s only one appliance and hence no HA). 

When deploying an ESG appliance HA can be enabled. This deploys two appliances in an active/passive mode (and the two appliances will be on separate hosts). These two appliances will talk to each other to keep in sync via one of the internal interfaces (we can specify one, or NSX will just choose any). On this internal interface the appliances will have a link local IP address (a /30 subnet from 169.254.0.0/16) and communicate over that (doesn’t matter that there’s some other IP range actually used in that segment, as these are link local addresses and unlikely anyone’s going to actually use them). In contrast, if a DLR appliance is deployed with HA we need to specify a separate network from the networks that it will be routing between. This can be a logical switch or a DVS, and as with ESG the two appliances will have link local IP addresses (a /30 subnet from 169.254.0.0/16) for communication. Optionally, we can specify an IP address in this network via which we can SSH into the DLR appliance (this IP address will not be used for HA, however).

After setting up all this, I also created two NAT rules just for kicks. 

And with that my basic setup of NSX is complete! (I skipped OSPF as I don’t think I will be using it any time soon in my immediate line of work; and if I ever need to I can come back to it later). Next I need to explore firewalls (micro-segmentation) and possibly load balancing etc … and generally fiddle around with this stuff. I’ve also got to start figuring out the troubleshooting and command-line stuff. But the base is done – I hope!

Yay! (VXLAN) contd. + Notes to self while installing NSX 6.3 (part 3)

Finally continuing with my NSX adventures … some two weeks have passed since my last post. During this time I moved everything from VMware Workstation to ESXi. 

Initially I tried doing a lift and shift from Workstation to ESXi. Actually, initially I went with ESXi 6.5 and that kept crashing. Then I learnt it’s because I was using the HPE customized version of ESXi 6.5 and since the server model I was using isn’t supported by ESXi 6.5 it has a tendency to PSOD. But strangely the non-HPE customized version has no issues. But after trying the HPE version and failing a couple of times, I gave up and went to ESXi 5.5. Set it up, tried exporting from VMware Workstation to ESXi 5.5, and that failed as the VM hardware level on Workstation was newer than ESXi. 

Not an issue – I fired up VMware Converter and converted each VM from Workstation to ESXi. 

Then I thought hmm, maybe the MAC addresses will change and that will cause an issue, so I SSH’ed into the ESXi host and manually changed the MAC addresses of all my VMs to whatever it was in Workstation. Also changed the adapters to VMXNet3 wherever it wasn’t. Reloaded the VMs in ESXi, created all the networks (portgroups) etc, hooked up the VMs to these, and fired them up. That failed coz the MAC address ranges were of VMware Workstation and ESXi refuses to work with those! *grr* Not a problem – change the config files again to add a parameter asking ESXi to ignore this MAC address problem – and finally it all loaded. 

But all my Windows VMs had their adapters reset to a default state. Not sure why – maybe the drivers are different? I don’t know. I had to reconfigure all of them again. Then I turned to OpnSense – that too had reset all its network settings, so I had to configure those too – and finally to nested ESXi hosts. For whatever reason none of them were reachable; and worse, my vCenter VM was just a pain in the a$$. The web client kept throwing some errors and simply refused to open. 

That was the final straw. So in frustration I deleted it all and decided to give up.

But then …

I decided to start afresh. 

Installed ESXi 6.5 (the VMware version, non-HPE) on the host. Created a bunch of nested ESXi VMs in that from scratch. Added a Windows Server 2012R2 as the shared iSCSI storage and router. Created all the switches and port groups etc, hooked them up. Ran into some funny business with the Windows Firewall (I wanted to assign some interfaces as Private, others as Public, and enable the firewall only on the Public ones – but after each reboot Windows kept resetting this). So I added OpnSense into the mix as my DMZ firewall.

So essentially you have my ESXi host -> which hooks into an internal vSwitch portgroup that has the OpnSense VM -> which hooks into another vSwitch portgroup where my Server 2012R2 is connected to, and that in turn connects to another vSwitch portgroup (a couple of them actually) where my ESXi hosts are connected to (need a couple of portgroup as my ESXi hosts have to be in separate L3 networks so I can actually see a benefit of VXLANs). OpnSense provides NAT and firewalling so none of my VMs are exposed from the outside network, yet they can connect to the outside network if needed. (I really love OpnSense by the way! An amazing product). 

Then I got to the task of setting these all up. Create the clusters, shared storage, DVS networks, install my OpenBSD VMs inside these nested ESXi hosts. Then install NSX Manager, deploy controllers, configure the ESXi hosts for NSX, set up VXLANs, segment IDs, transport zones, and finally create the Logical Switches! :) I was pissed off initially at having to do all this again, but on the whole it was good as I am now comfortable setting these up. Practice makes perfect, and doing this all again was like revision. Ran into problems at each step – small niggles, but it was frustrating. Along the way I found that my (virtual) network still does not seem to support large MTU sizes – but then I realized it’s coz my Server 2012R2 VM (which is the router) wasn’t set up with the large MTU size. Changed that, and that took care of the MTU issue. Now both Web UI and CLI tests for VXLAN succeed. Finally!
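
For the record, here is why the MTU matters, as a small Python back-of-the-envelope sketch. It assumes IPv4 transport and no inner VLAN tag (a tag would add another 4 bytes). The header sizes are the standard ones; the function names are mine.

```python
# Bytes added around the guest's IP packet when it is carried over VXLAN
# (assumes IPv4 transport and no inner VLAN tag).
INNER_ETHERNET = 14  # the guest's own Ethernet header travels inside the tunnel
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20

VXLAN_OVERHEAD = INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4

def required_transport_mtu(guest_mtu=1500):
    """Smallest MTU the transport network needs to carry a full-size guest packet."""
    return guest_mtu + VXLAN_OVERHEAD

def fits(transport_mtu, guest_mtu=1500):
    """True if a full-size guest packet survives encapsulation without fragmenting."""
    return transport_mtu >= required_transport_mtu(guest_mtu)

print(VXLAN_OVERHEAD)            # 50
print(required_transport_mtu())  # 1550
print(fits(1500), fits(1600))    # False True
```

So a default 1500-byte MTU anywhere on the path between the hosts, including a router VM in the middle, is enough to break full-size VXLAN traffic, while the commonly recommended 1600 leaves some headroom.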

Third time lucky hopefully. Above are my two OpenBSD VMs on the same VXLAN, able to ping each other. They are actually on separate L3 ESXi hosts so without NSX they won’t be able to see each other. 

Not sure why there are duplicate packets being received. 

Next I went ahead and set up a DLR so there’s communication between VXLANs. 

Yeah baby! :o)

Finally I spent some time setting up an ESG and got these OpenBSD VMs talking to my external network (and vice versa). 

The two command prompt windows are my Server 2012R2 on the LAN. It is able to ping the OpenBSD VMs and vice versa. This took a bit more time – not on the NSX side – as I forgot to add the routing info on the ESG for my two internal networks (192.168.1.0/24 and 192.168.2.0/24) as well as on the Server 2012R2 (192.168.0.0/16). Once I did that routing worked as above. 
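
The reason a single 192.168.0.0/16 route is enough on the Server 2012R2 side, while the ESG gets one route per internal network, is just prefix containment. A minimal Python check (needs Python 3.7+ for subnet_of; the variable and function names are mine):

```python
import ipaddress

# The two internal networks behind the DLR, and the summary route on the LAN router.
INTERNAL_NETS = [
    ipaddress.ip_network("192.168.1.0/24"),
    ipaddress.ip_network("192.168.2.0/24"),
]
LAN_ROUTE = ipaddress.ip_network("192.168.0.0/16")

def covered_by(route, networks):
    """True if every network falls inside the given route's prefix."""
    return all(net.subnet_of(route) for net in networks)

print(covered_by(LAN_ROUTE, INTERNAL_NETS))
# A network outside 192.168.0.0/16 would need its own route:
print(covered_by(LAN_ROUTE, [ipaddress.ip_network("10.0.0.0/24")]))
```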

I am aware this is more of a screenshots plus talking post rather than any techie details, but I wanted to post this here as a record for myself. I finally got this working! Yay! Now to read the docs and see what I missed out and what I can customize. Time to break some stuff finally (intentionally). 

:o)