Creative Commons Attribution 4.0 International License
© Rakhesh Sasidharan


Internet not working in Chrome but works fine in IE

Today, Internet browsing via Chrome stopped working at my office. IE was not affected, only Chrome. The error was just that the site couldn’t be reached.

I fired up Chrome and went to “chrome://net-internals/“. In the page that opened I went to the “Proxy” section in the left side pane and saw that although the original proxy settings were “auto detect”, the effective proxy settings were “direct”. That didn’t make sense – Chrome was set to use the proxy settings of IE, and IE was working fine and detecting a proxy, yet Chrome wasn’t. A quick Google search showed me that if Chrome has trouble finding a proxy, it resorts to a direct Internet connection. Seems to be by design. So why was Chrome having trouble finding a proxy? IE was set with a WPAD file location, so I went to the “Events” section in the side pane of “chrome://net-internals/” to see if Chrome was having trouble finding the WPAD file. It wasn’t, but there were errors like these:

The line referred to was the last line of the WPAD file, so clearly Chrome was reading it and there was something wrong with the syntax of the file. I opened up the file in Notepad++, set the language to JavaScript (so I get syntax highlighting and brace matching etc.), and went through the various script blocks in the file. Sure enough, one section had a missing closing brace “}” and that was tripping up Chrome. Not sure why IE was able to move past this error, but there you go. I added the missing brace and Chrome began working. :)
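For reference, a WPAD file is just a PAC script – a JavaScript file with a `FindProxyForURL()` function that the browser calls for every request. Here's a minimal sketch (the hostnames and proxy address are made up for illustration); a single missing `}` anywhere in such a file is a syntax error, which is exactly the kind of thing a strict parser like Chrome's can choke on:

```javascript
// Minimal WPAD/PAC sketch. Hostnames and proxy address are hypothetical.
// The browser calls FindProxyForURL() for every request; one missing "}"
// below is a syntax error for the whole file.
function FindProxyForURL(url, host) {
  // Internal hosts bypass the proxy
  if (host === "localhost" || host.indexOf(".corp.example.com") !== -1) {
    return "DIRECT";
  }
  // Everything else goes via the proxy, falling back to direct
  return "PROXY proxy.example.com:8080; DIRECT";
}
```

(Real PAC files usually use helpers like `shExpMatch()` or `isInNet()` that the PAC runtime provides; plain string checks are used here so the sketch is self-contained.)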

[Aside] Various DNS stuff

No point to this post except as a reference for my future self. I wanted to mention some of the links here to a colleague of mine today but couldn’t remember them. Finally had to search through my browser history. Easier to just put them here for later reference. :)

Via this Pi-Hole page – OpenNIC and DNS.Watch. Both are for uncensored results etc., with the former having additional TLDs too. Sadly neither supports edns-client-subnet so I can’t really use them. :( If I query www.google.com via one of these I get results that are 150-220ms away. The same query via Google DNS or OpenDNS gives me results that are 8ms away!

I hope to implement DNSCrypt-proxy on my Asus router this weekend (time permitting). Seems to be straightforward to set up on Asus Merlin as there’s an installer, and it’s also available via AMTM. My colleague is currently using DNSCrypt.nl as the upstream resolver, but he also mentioned an alternative he hopes to try.

It’s funny there’s a lot more talk about DNS encryption these days. I happened to get on it coz I got Asus Merlin running at home again recently and also coz of the CloudFlare DNS announcement. I’ve generally been in a geeky mode since then, checking out things like Pi-Hole etc. Just the other day I read an Ars Technica article about DNS encryption, and it turns out my colleague implemented DNSCrypt at his home just this morning.

Something else I hope to try – dunno where though – is the Knot DNS Resolver.

Lastly, totally unrelated but as a reference to myself – I didn’t know there was an open source version of the Synology OS called XPEnology, and I didn’t know of these picoPSU power supplies. So cool! Also, Netgear R7800 seems to be a good router to keep in mind for the future.

Why multiple temporary IPv6 addresses when using SLAAC

Since enabling SLAAC as per my previous post I noticed that Android now has two IPv6 addresses (in addition to the link local one it already had) and Windows has the link-local one, a DHCPv6 one (marked as preferred), and two SLAAC IPv6 addresses (marked as “Temporary IPv6 Address”). Trying to find out why brought me to this superuser page that answered my question.

The long and short of it is that since SLAAC IPv6 addresses are not “centralized” (i.e. not from a DHCPv6 server), the client is at liberty to create multiple IPv6 addresses. This is mainly to protect your privacy, so servers on the Internet are not able to track you consistently (nor collect your IPv6 address and try to make contact with your client, I guess). Via the netsh interface ipv6 show addresses command on my Windows 10 machine I see that they have a duration of an hour, after which they are presumably regenerated.

The netsh interface ipv6 show privacy command shows whether temporary IPv6 addresses are enabled or not. Linux has something similar.
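For my own future reference, the relevant knobs on both platforms (the Windows commands are from above; the Linux sysctl is the usual privacy-extensions setting, assuming a typical distro):

```
# Windows: show / disable temporary (privacy) IPv6 addresses
netsh interface ipv6 show privacy
netsh interface ipv6 set privacy state=disabled

# Linux equivalent: use_tempaddr (0 = off, 1 = generate but don't
# prefer, 2 = generate and prefer temporary addresses)
sysctl net.ipv6.conf.all.use_tempaddr
sysctl -w net.ipv6.conf.all.use_tempaddr=2
```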

Sure enough when I now visit https://www.whatismyip.com/ on my browser it no longer shows the DHCP assigned IPv6 address but one of the temporary ones (and no, it does not even show the SLAAC generated IPv6 address based on the EUI-64 MAC address; it’s a temporary random address that appears in ipconfig or netsh interface ipv6 show addresses as temporary).

 

Brief note on IPv6 flags and Dnsmasq modes

Discovered that my Android phone only had a link-local IPv6 address and learnt that Android doesn’t support DHCPv6 (who’d have thought?!). So I want to enable SLAAC in addition to DHCPv6 in my network. Was checking out Dnsmasq options (as Asus uses that) and came across its various modes.

IPv6 Router Advertisement (RA) messages can contain the following flags:

  • M (“Managed address configuration”) – indicates that IPv6 addresses are available via DHCPv6. This is also referred to as Stateful DHCP.
  • O (“Other configuration”) – no IPv6 address, but other configuration information like DNS etc. are available via DHCPv6. This is also referred to as Stateless DHCP.
  • A (“Autonomous Address Configuration”) – indicates that the prefix present with the flag can be used for SLAAC (StateLess Address AutoConfiguration).

Note that if the M flag is present the O flag doesn’t matter – coz clients are getting information via DHCPv6 anyway.

Dnsmasq allows the following modes when defining an IPv6 range (from its man page):

For IPv6, the mode may be some combination of ra-only, slaac, ra-names, ra-stateless, ra-advrouter, off-link.

ra-only tells dnsmasq to offer Router Advertisement only on this subnet, and not DHCP.

slaac tells dnsmasq to offer Router Advertisement on this subnet and to set the A bit in the router advertisement, so that the client will use SLAAC addresses. When used with a DHCP range or static DHCP address this results in the client having both a DHCP-assigned and a SLAAC address.

ra-stateless sends router advertisements with the O and A bits set, and provides a stateless DHCP service. The client will use a SLAAC address, and use DHCP for other configuration information.

ra-names enables a mode which gives DNS names to dual-stack hosts which do SLAAC for IPv6. Dnsmasq uses the host’s IPv4 lease to derive the name, network segment and MAC address and assumes that the host will also have an IPv6 address calculated using the SLAAC algorithm, on the same network segment. The address is pinged, and if a reply is received, an AAAA record is added to the DNS for this IPv6 address. Note that this only happens for directly-connected networks (not ones doing DHCP via a relay), and it will not work if a host is using privacy extensions. ra-names can be combined with ra-stateless and slaac.

ra-advrouter enables a mode where router address(es) rather than prefix(es) are included in the advertisements. This is described in RFC-3775 section 7.2 and is used in mobile IPv6. In this mode the interval option is also included, as described in RFC-3775 section 7.3.

off-link tells dnsmasq to advertise the prefix without the on-link (aka L) bit set.

This is a bit confusing so I thought I should put it into a nice table. Note that this is my understanding; I could be wrong:

| Mode | RA flags | Behaviour |
|---|---|---|
| ra-only | only A; no M or O | Clients can use the RA to configure their SLAAC address. No DHCPv6 is offered. |
| slaac | M and A if a DHCPv6 range is specified, else only A. No O flag, but as I said above the O flag doesn’t matter anyway if the M flag is present. | Clients can use the RA to configure their SLAAC address. DHCPv6 too is offered if a range is configured, thus clients can have two IPv6 addresses – a SLAAC one and a DHCPv6 one. Without a DHCPv6 range, slaac is effectively the same as ra-only; the range is what sets slaac apart, so in practice you need one (i.e. M and A flags always). |
| ra-stateless | O and A; no M | Clients can use the RA to configure their SLAAC address and look to DHCPv6 for the DNS etc. information. |
| ra-names | only A; no M or O | Like ra-only, except Dnsmasq assumes the client’s SLAAC address is derived from its MAC address, pings that derived address, and if there’s a reply creates an AAAA record mapping the client’s name to it. (It’s meant for dual-stack clients, which isn’t my scenario.) |
| ra-names,slaac | M and A (assuming it is the same as the slaac mode) | Same as above, except clients will have a DHCPv6 address in addition to the SLAAC one, and Dnsmasq will create the AAAA DNS record. |
| ra-names,ra-stateless | O and A; no M | Same as above, except clients don’t get any DHCPv6 address but use DHCPv6 only for DNS etc. |
| ra-advrouter | – | Ignoring it for now – it’s to do with mobile IPv6 and didn’t make much sense to me :) |
| off-link | – | Ignoring for now; didn’t make much sense to me. |

So in my case it looks like I have to enable the slaac mode. This way all my clients will have both DHCPv6 and SLAAC addresses (with the exception of Android, which will get the SLAAC address only).
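Going by the dnsmasq man page quoted above, a sketch of what that could look like in dnsmasq.conf (the interface name br0 and the range are assumptions – Asus Merlin generates its own config, so on the router this would go into a custom config add-on rather than be hand-written):

```
# Offer DHCPv6 on the LAN bridge and also set the A bit in RAs (slaac
# mode), so clients get both a DHCPv6 and a SLAAC address.
# Addresses are relative to the prefix learnt on br0 via constructor:
dhcp-range=::1000,::2000,constructor:br0,slaac,12h
enable-ra
```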

IPv6 at home!

Whee! I enabled IPv6 at home today. :)

It’s pretty straight-forward so not really an accomplishment on my part actually. I didn’t really have to do anything except flip a switch, but I am glad I thought of doing it and actually did it, and pretty happy to see that it works. Nice!

Turns out Etisalat started rolling out IPv6 to home users in Dubai back in November 2016. I obviously didn’t know of it. Nice work Etisalat!

Also, my Asus router supports IPv6. Windows and iOS etc. support IPv6 too, so all the pieces are really in place.

All I had to do on the Asus router was go to the IPv6 section, set Connection Type as “Native”, Interface as “PPP”, enable “DHCP-PD”, and enable “Release prefix on exit”. DHCP-PD stands for “DHCP Prefix Delegation”. In IPv4 the ISP gives your home router a single public IP and everything behind the home router is NAT’d into that single public IP by the router. In IPv6 you are not limited to a single public IP. IPv6 has tons of addresses after all, so every device can have a public IP. Thus the ISP gives you not a single IPv6 address but a /64 publicly accessible prefix, and all your home devices can take addresses from that pool. Thus “DHCP-PD” means your router asks the ISP to give it a prefix, and “Release prefix on exit” means the router gives that prefix back to the ISP when disconnecting.

I also decided to use the Google DNS IPv6 servers.

Here’s a list of IPv6 only websites if you want to visit and feel good. :p

Check out this website to test IPv6. It also has a dual stack version that checks if your browser prefers IPv4 over IPv6 even though it may have IPv6 connectivity. Initially I was using this test site. The test succeeded there but I got the following error: “Your browser has real working IPv6 address – but is avoiding using it. We’re concerned about this.”. Turns out Chrome and Firefox start an internal counter when a site has an IPv6 and IPv4 address and if the IPv4 address responds faster then they prefer the IPv4 version. Crazy huh! In Firefox I found these two options in about:config and that seemed to fix this – network.http.fast-fallback-to-IPv4 (set this to false) and network.notify.IPv6 (set to true – I am not sure this setting matters for my scenario but I changed it anyways).

Here’s Comcast’s version of SpeedTest over IPv6.

Back to my router settings. I decided to go with “Stateful” auto configuration for the IPv6 LAN and set an appropriate range. With IPv6 you can have the router dole out IPv6 addresses to clients (in the prefix it has) or you can have clients auto configure their IPv6 address by asking the router for the prefix information but creating their own address based on that. The former is “Stateful”, the latter is “Stateless”. I decided to go with “Stateful” (though I did play around with “Stateless” too). Also, leave the “Router Advertisements” section Enabled.

That’s pretty much it.

In my case I ended up wasting about an hour after this as I noticed that my Windows 10 laptop would work on IPv6 for a while and then stop working. It wasn’t able to ping the router either. After a lot of trial and error I realized that it’s because a long time ago I had disabled a lot of firewall rules on my Windows 10 laptop and in the process disallowed the IPv6 rules that were enabled by default. Silly of me! I changed all those back to their default state and now the laptop works fine without an issue.

Before moving on – double check that the IPv6 firewall on your router is enabled. Now that every machine in your LAN (that has an IPv6 address) is publicly accessible one has to be careful.

Etisalat and 3rd party routers

I shifted houses recently and rather than shift my Internet connection (as that has a 4-day downtime) I decided to apply for a new connection at the new premises (there was an offer going on wherein the installation charge is zero) and then disconnect the existing connection once I had shifted. A downside of this – which I later realized – is that Etisalat seems to have stopped giving customers the Internet password.

Turns out Etisalat (like many other ISPs) now autoconfigures its routers. You simply plug one into the network and it contacts Etisalat’s servers and configures itself. This is done using a protocol called TR-069, which I don’t know much about, but it seems to have some security risks. I have an Asus RT-AC68U router anyway which I have set up the way I want, so I wanted to move over from the Etisalat D-Link router to this one. When I spoke to the chap who installed my new Internet connection he said that apparently Etisalat does not allow users to install their own routers. Found many Reddit posts too where people have complained of having to contact Etisalat and not being given this password, and also about having to set a VLAN etc. (e.g. this post). Seemed to be a lot of trouble.

Anyhow, I decided to try my luck. First I contacted them via email (care -at- etisalat.ae) asking to reset my password. A helpful agent called me up after a while and reset the password. It didn’t even affect my Internet connection coz the auto-configuring ensured that the Etisalat router picked up the new info. So far so good. I tried using these details with the Asus router to see if it would work straightaway, but it didn’t. So I sent them another email asking for the VLAN details. Next day another chap called me up and gave me the VLAN details. He also mentioned that I’d have to leave PnP on in my Asus router, or else he could raise a ticket to disable it. I said I’d like to have it disabled. About 4 hours later someone else called me up and said they were going to disable it now and would I like any assistance etc. I said nope, I’ll take care of it on my own.

Once they disabled PnP the Etisalat router stopped working. So I swapped it with the Asus one and set the VLAN to what the agent gave me (it’s under LAN > IPTV Settings, confusingly). I also changed the MAC of the Asus router to that of the Etisalat one – though I am not sure if that was really needed (I just did it beforehand, before unplugging the Etisalat router). This didn’t get things working though, which stumped me for a while, until on a whim I decided to remove the VLAN stuff and just try with the username and password like I had done yesterday. And yay, that worked! So it wasn’t too much of a hassle after all. The phone and TV (eLife) still seem to be working, so it looks like I didn’t break anything either.

So, to summarize: if you want to use your own router with Etisalat (new connections), send them an email asking for the password to be reset and for changes such as disabling Plug & Play so you can use your own router. Ask for the VLAN too, just in case. Once you get these details connect the new router and put in the username and password. If that doesn’t work, put in the VLAN info too. That’s all! I was pleased with the quick turnaround and support, and it didn’t turn out to be a hassle at all like I was expecting. Nice one! :)

Couple of DNS stuff

So CloudFlare announced the 1.1.1.1 DNS resolver service the other day. Funny, coz I had been looking into various DNS options for my home network recently. What I had noticed at home was that when I use the Google DNS or OpenDNS resolvers I get a different (and much closer!) result for google.com while with other DNS servers (e.g. Quad9, Yandex) I get a server that’s farther away.

I was aware that using 3rd party DNS resolvers like this could give me less than ideal results, because the name server of the service I am querying would see my queries coming from this 3rd party resolver and hence give me a result from the region of that resolver (e.g. if Google has servers in the UAE and US, and I am based in the UAE, Google’s name servers will see that the request for www.google.com is coming from a server in the US and hence give me a result from the US, thinking that’s where I am located). But that didn’t explain why Google DNS and OpenDNS were actually giving me good results.

Reading about that I came across this performance page from the Google DNS team and learnt about the edns-client-subnet (ECS) option (also see this FAQ entry). This is an option that name servers can support wherein the client sends over its IP/subnet along with the query and the name server looks at that and modifies its response accordingly. If the DNS resolver supports this, it can send along this info to the name servers being queried and thus get better results. Turns out only Google DNS and OpenDNS support this. Google actually queries the name servers it knows of with ECS queries and caches the results so as to keep track of which name servers support ECS; this way it knows which servers it can send the ECS option to. That’s pretty cool, and a good reason to stick with Google DNS! (I don’t think CloudFlare DNS currently does this, because I get non-ideal results with it too.)

From this “how it works” page:

Today, if you’re using OpenDNS or Google Public DNS and visiting a website or using a service provided by one of the participating networks or CDNs in the Global Internet Speedup then a truncated version of your IP address will be added into the DNS request. The Internet service or CDN will use this truncated IP address to make a more informed decision in how it responds so that you can be connected to the most optimal server. With this more intelligent routing, customers will have a better Internet experience with lower latency and faster speeds. Best of all, this integration is being done using an open standard that is available for any company to integrate into their own platform.
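The ECS behaviour can be poked at with dig’s +subnet option, which sends the given prefix along with the query (the example subnet below is a documentation range, purely illustrative):

```
# Ask Google DNS for www.google.com, pretending we're in 203.0.113.0/24
dig @8.8.8.8 www.google.com +subnet=203.0.113.0/24

# A resolver/name-server pair that honours ECS echoes a CLIENT-SUBNET
# option with a non-zero scope in the OPT pseudosection of the response
```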

While on DNS, I came across DNS Perf via the CloudFlare announcement. Didn’t know of such a service. Also useful, in case you didn’t know already, is this GRC tool.

Lastly, I came across Pi-Hole recently and that’s what I use at home nowadays. It’s an advertisement black hole. Got a good UI and all. It uses DNS (all clients point to the local Pi-Hole install for DNS) and is able to block advertisements and malware this way.

ADFS monitoring on NSX

Was looking at setting up monitoring of my ADFS servers on NSX.

I know what to monitor on the ADFS and WAP servers thanks to this article.

http://<Web Application Proxy name>/adfs/probe
http://<ADFS server name>/adfs/probe
http://<Web Application Proxy IP address>/adfs/probe
http://<ADFS IP address>/adfs/probe

Need to get an HTTP 200 response for these.
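A quick manual check of a probe endpoint (the server name is a placeholder) could look like this; anything other than a 200 means the monitor would mark that member down:

```
# Print just the HTTP status code of the ADFS probe endpoint
curl -s -o /dev/null -w "%{http_code}\n" http://adfs01.corp.example.com/adfs/probe
# A healthy server returns: 200
```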

So I created a service monitor in NSX along these lines:

And I associated it with my pool:

Bear in mind the monitor has to check port 80, even though my pool might be on port 443, so be sure to change the monitor port as above.

The “Show Pool Statistics” link on the “Pools” section quickly tells us whether the member servers are up or not:

The show service loadbalancer pool command can be used to see what the issue is in case the monitor appears down. Here’s an example when things aren’t working:

Here’s an example when all is well:

Thanks to this document for pointing me in the right troubleshooting direction. Quoting from that document, the list of error codes:

  • UNK: Unknown
  • INI: Initializing
  • SOCKERR: Socket error
  • L4OK: Check passed on layer 4, no upper layers testing enabled
  • L4TOUT: Layer 1-4 timeout
  • L4CON: Layer 1-4 connection problem. For example, “Connection refused” (tcp rst) or “No route to host” (icmp)
  • L6OK: Check passed on layer 6
  • L6TOUT: Layer 6 (SSL) timeout
  • L6RSP: Layer 6 invalid response – protocol error. May be caused when the backend server only supports “SSLv3” or “TLSv1.0”, the certificate of the backend server is invalid, the cipher negotiation failed, and so on
  • L7OK: Check passed on layer 7
  • L7OKC: Check conditionally passed on layer 7. For example, 404 with disable-on-404
  • L7TOUT: Layer 7 (HTTP/SMTP) timeout
  • L7RSP: Layer 7 invalid response – protocol error
  • L7STS: Layer 7 response error. For example, HTTP 5xx

Nice!

Quick note to self on NSX Load Balancing

Inline mode == Transparent mode (the latter is the terminology in the UI).

In this mode the load balancer is usually the default gateway for the servers it load balances. Traffic comes to the load balancer, which sends it to the appropriate server (after changing the destination IP of the packet – hence DNAT), and replies come back to it as it is the default gateway for the server. Note that as far as the destination server is concerned the source IP address is not the load balancer but the client who made the request. Thus the destination server knows who is making the request.

When the load balancer replies to the client who made the request it changes the source IP of the reply from the selected server to its own IP (hence SNAT when replying only).

One-Armed mode == Proxy mode

In this mode the load balancer is not the default gateway. The servers it load balances don’t require any changes to be made to them. The load balancer does a DNAT as before, but also changes the source IP to be itself rather than the client (hence SNAT). When the selected server replies this time, it thinks the source is the load balancer and so replies to it rather than the client. Thus there are no changes required on the server side. Because of this though, the server doesn’t know who made the request – all requests appear to come from the load balancer (unless you use some headers to capture the info).

As before, when the load balancer replies to the client who made the request it changes the source IP of the reply from the selected server to its own IP (hence SNAT when replying too).

You set the inline/ transparent vs. one-armed/ proxy mode per pool.

To have load balancing in NSX you need to deploy an ESG (Edge Services Gateway). I don’t know why, but I always associated an ESG with just external routing, so it took me by surprise (and still does) when I realize I need to deploy an ESG for load balancing, DHCP, and other edge sort of services (VPN, routing, etc.). I guess the point to remember is that it’s not just a gateway – it’s an edge services gateway. :)

Anyways, feel free to deploy as many ESGs as you feel like. You can have one huge ESG that takes care of all your load balancing needs, or you can have multiple small ones and hand over control to the responsible teams.

This is a good starting point doc from VMware.

You can have L4 and L7 load balancing. If you need only L4 (i.e. TCP, UDP, port number) the UI calls it acceleration. It’s a global configuration, on the ESG instance itself, so bear that in mind.

If you enable acceleration on an ESG, you have to also enable it per virtual server.

L4 load balancing is packet based (obviously, coz it doesn’t need to worry about the application as such). L7 load balancing is socket based. Quoting from this doc (highlight mine):

Packet-based load balancing is implemented on the TCP and UDP layer. Packet-based load balancing does not stop the connection or buffer the whole request, it sends the packet directly to the selected server after manipulating the packet. TCP and UDP sessions are maintained in the load balancer so that packets for a single session are directed to the same server. You can select Acceleration Enable in both the global configuration and relevant virtual server configuration to enable packet-based load balancing.

Socket-based load balancing is implemented on top of the socket interface. Two connections are established for a single request, a client-facing connection and a server-facing connection. The server-facing connection is established after server selection. For HTTP socket-based implementation, the whole request is received before sending to the selected server with optional L7 manipulation. For HTTPS socket-based implementation, authentication information is exchanged either on the client-facing connection or on the server-facing connection. Socket-based load balancing is the default mode for TCP, HTTP, and HTTPS virtual servers.

Also worth noting:

The L4 VIP (“acceleration enabled” in the VIP configuration and no L7 setting such as AppProfile with cookie persistence or SSL-Offload) is processed before the edge firewall, and no edge firewall rule is required to reach the VIP. However, if the VIP is using a pool in non-transparent mode, the edge firewall must be enabled (to allow the auto-created SNAT rule).

The L7 HTTP/HTTPS VIPs (“acceleration disabled” or L7 setting such as AppProfile with cookie persistence or SSL-Offload) are processed after the edge firewall, and require an edge firewall allow rule to reach the VIP.

Application Profiles define common application behaviours such as client SSL, server SSL, X-Forwarded-For, and persistence. These can be reused across virtual servers and are mandatory when defining a virtual server. This is also where you can do HTTP redirects.

NSX Firewall not working on Layer 3; OpenBSD VMware Tools; IP Discovery, etc.

I have two security groups: Network 1 VMs (a group that contains my VMs in the 192.168.1.0/24 network) and Network 2 VMs (similar, for the 192.168.2.0/24 network).

Both are dynamic groups. I select members based on whether the VM name contains -n1 or -n2. (The whole exercise is just for fun/ getting to know this stuff). 

I have two firewall rules making use of these groups – a Layer 2 rule and a Layer 3 rule.

The Layer 2 rule works but the Layer 3 one does not! Weird. 

I decided to troubleshoot this via the command line. Figured it would be a good opportunity.

To troubleshoot I have to check the rules on the hosts (because remember, that’s where the firewall is; it’s a kernel module in each host). For that I need to get the host-id. For which I need to get the cluster-id. Sadly there’s no command to list all hosts (or at least I don’t know of any). 

So now I have my host-ids.

Let’s also take a look at my VMs (thankfully it’s a short list! I wonder how admins do this in real life):

We can see the filters applying to each VM.  To summarize:

And are these filters applying on the hosts themselves?

Hmm, that too looks fine. 

Next I picked up one of the rule sets and explored it further:

The Layer 3 & Layer 2 rules are in separate rule sets. I have marked the ones which I am interested in. One works, the other doesn’t. So I checked the address sets used by both:

Tada! And there we have the problem. The address set for the Layer 3 rule is empty. 

I checked this for the other rules too – same situation. I modified my Layer 3 rule to specifically target the subnets:

And the address set for that rule is not empty:

And because of this the firewall rules do work as expected. Hmm.

I modified this rule to use a group with my OpenBSD VMs from each network explicitly added to it (i.e. not dynamic membership, in case that was causing the issue). But nope, same result – empty address set!


So now I have an idea of the problem. I am not too surprised by this because I vaguely remember reading something about VMware Tools and IP detection inside a VM (i.e. NSX makes use of VMware Tools to know the IP address of a VM) and also because I am aware OpenBSD does not use the official VMware Tools package (it has its own and that only provides a subset of functions).

Googling a bit on this topic I came across the IP address Discovery section in the NSX Admin guide – prior to NSX 6.2, if VMware Tools wasn’t installed (or was stopped) NSX wouldn’t be able to detect the IP address of the VM. Post NSX 6.2 it can do DHCP & ARP snooping to work around missing/stopped VMware Tools. We configure the latter in the host installation page:

I am going to go ahead and enable both on all my clusters. 

That helped. But it needs time. Initially the address set was empty. I started pings from one VM to another and the source VM’s IP was discovered and put in the address set; but since the destination VM wasn’t in the list, traffic was still being allowed. I stopped pings, started pings, waited a while … tried again … and by then the second VM’s IP too was discovered and put in the address set – effectively blocking communication between them.

Side by side I installed a Windows 8.1 VM with VMware Tools etc and tested to see if it was being automatically picked up (I did this before enabling the snooping above). It was. In fact its IPv6 address too was discovered via VMware Tools and added to the list:

Nice! Picked up something interesting today. 

Notes to self while installing NSX 6.3 (part 4)

Reading through the VMware NSX 6.3 Install Guide after having installed the DLR and ESG in my home lab. Continuing from the DLR section.

As I had mentioned earlier NSX provides routing via DLR or ESG.  

  • DLR == Distributed Logical Router.
  • ESG == Edge Services Gateway

DLR consists of an appliance that provides the control plane functionality. This appliance does not do any routing itself. The actual routing is done by the VIBs on the ESXi hosts. The appliance uses the NSX Controller to push out updates to the ESXi hosts. (Note: this applies only to the DLR; the ESG does not depend on the Controller to push out routes.) Couple of points to keep in mind:

  • A DLR instance cannot connect to logical switches in different transport zones. 
  • A DLR cannot connect to a dvPortgroup with VLAN ID 0.
  • A DLR cannot connect to a dvPortgroup with a VLAN ID if that DLR also connects to logical switches spanning more than one VDS. 
    • This confused me. Why would a logical switch span more than one VDS? I dunno. There are probably reasons, the same way you could have multiple clusters in the same data center with different VDSes instead of using the same one. 
  • If you have portgroups on different VDSes with the same VLAN ID, and these VDSes share some hosts, then the DLR cannot connect to these. 

I am not entirely clear on the above points. It’s more to enforce that transport zones and logical switches align correctly, but I haven’t entirely understood it so I am simply going to make note as above and move on …

In a DLR the firewall rules only apply to the uplink interface and are limited to traffic destined for the edge virtual appliance. In other words they don’t apply to traffic between the logical switches a DLR instance connects. (Note that this refers to the firewall settings found under the DLR section, not in the Firewall section of NSX.) 

A DLR has many interfaces. The one exposed to VMs for routing is the Logical InterFace (LIF). Here’s a screenshot from the interfaces on my DLR. 

The ones of type ‘Internal’ are the LIFs. These are the interfaces the DLR routes between. Each LIF connects to a separate network – in my case a logical switch each. The IP address assigned to a LIF is the address you set as the gateway for the devices in that network. For example: one of the LIFs has the IP address 192.168.1.253 and connects to my 192.168.1.0/24 segment, so all the VMs there have 192.168.1.253 as their default gateway. If we ignore the ‘Uplink’ interface for now (it’s optional; I created it for external routing to work) and all our DLR had were the two ‘Internal’ LIFs, with the VMs on each side using the respective LIF IP address as their default gateway, then our DLR would route between these two networks. 
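The LIF-to-segment relationship above can be sanity-checked with a few lines of Python. The 192.168.1.253 address is from my lab; the second segment’s gateway is an assumed example following the same pattern:

```python
from ipaddress import ip_address, ip_network

# LIF address per logical switch: 192.168.1.253 is from my lab,
# 192.168.2.253 is assumed for the second segment
segments = {
    "192.168.1.0/24": "192.168.1.253",
    "192.168.2.0/24": "192.168.2.253",
}

for net, lif in segments.items():
    # the LIF must sit inside the segment it serves; VMs there use it as gateway
    assert ip_address(lif) in ip_network(net)
    print(f"VMs on {net} use {lif} as their default gateway")
```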

Unlike a physical router though, which exists outside the virtual network and which you can point to as “here’s my router”, there’s no such concept with DLRs. The DLR isn’t a VM you can point to as your router, nor is it a VM to which packets between these networks (logical switches) are sent for routing. The DLR, as mentioned above, is simply your ESXi hosts. Each ESXi host with logical switches that a DLR connects to gets these LIFs created on it, with the LIF IP address and a virtual MAC assigned so VMs can send packets to it. The DLR is your ESXi host. (That is pretty cool, isn’t it! I shouldn’t be amazed because I had mentioned it earlier when reading about all this, but it is still cool to actually “see” it now that I have implemented it.)

The above screenshot is from my two VMs on the same VXLAN but on different hosts. Note that the default gateway (192.168.1.253) MAC is the same for both – each VM’s host responds to this MAC entry. 

(Note to self: need to explore the net-vdr command sometime. I came across it while Googling how to find the MAC address table seen by the LIF on a host. I didn’t want to get side-tracked so I didn’t explore too much. There’s something called a VDR, which I haven’t encountered yet in my readings.

  • net-vdr -I -l will list all the VDRs on a host.
  • net-vdr -L -l <vdrname> will list the LIFs.
  • net-vdr -N -l <vdrname> will list the MAC addresses (ARP info)

)

When creating a DLR it is possible to create it with or without the appliance. Remember that the appliance provides the control plane functionality: it is the appliance that learns of new routes etc. and pushes them to the DLR modules in the ESXi hosts. Without an appliance the DLR modules will do static routing (which might be more than enough, especially in a test environment like my nested lab), so it is OK to skip it if your requirements allow. Adding an appliance means you get to (a) select whether it is deployed in HA config (i.e. two appliances), (b) choose their locations etc., and (c) set the IP address and such for the appliance, as well as enable SSH. The appliance is connected to a separate interface for HA and SSH – this is independent of the LIFs or Uplink interfaces, and isn’t used for any routing. 

Apart from the control plane, the appliance also controls the firewall on the DLR. If there’s no appliance you can’t make any firewall changes to the DLR – makes sense coz there’s nothing to change. You won’t be connecting to the DLR for SSH or anything coz you do that to the appliance on the HA interface. 

According to the docs you can’t add an appliance once a DLR instance is deployed. Not sure about that as I do see an option to deploy an appliance on my non-appliance DLR instance. Maybe it will fail when I actually try and create the appliance – I didn’t bother trying. 

I discovered this blog post while Googling for something. I’ve encountered & linked to his posts previously too. He has a lot of screenshots and step-by-step instructions, so it’s worth checking out if you want screenshots and much better explanations than mine. :) I came across some commands on his blog which can be run on the NSX Controller to see the DLRs it is aware of and their interfaces. Pasting the output from my lab here for now; I will have to explore this later …

I have two DLRs. One has an appliance, the other doesn’t. I made these two, plus a bunch of logical switches to hook them to, to see if there’s any difference in functionality or options.

One thing I realized as part of this exercise is that a particular logical switch can only connect to one DLR. Initially I had one DLR which connected to 192.168.1.0/24 and 192.168.2.0/24. Its uplink was on the logical switch for 192.168.0.0/24, which is where the ESG too hooked into. Later, when I made one more DLR with its own internal links and tried to connect its uplink to the 192.168.0.0/24 network used by the previous DLR, that network didn’t even appear in the list of options. That’s when I realized it’s better to use a smaller-range logical switch for the uplinks – say a /30 network. That way each DLR instance connects to an ESG on its own /30 logical switch (as in the output above). 
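The /30-per-uplink idea is easy to sketch: carve a pool of transit networks out of one supernet and give each DLR–ESG pair its own /30. The supernet below is a made-up illustration, not from my lab:

```python
from ipaddress import ip_network

# hypothetical transit supernet to carve DLR-ESG uplink networks from
transit = ip_network("172.16.255.0/24")

# each /30 has exactly 2 usable addresses: one for the DLR uplink, one for the ESG
uplinks = list(transit.subnets(new_prefix=30))
print(len(uplinks))        # a /24 yields 64 /30 transit networks

dlr_ip, esg_ip = list(uplinks[0].hosts())
print(dlr_ip, esg_ip)      # 172.16.255.1 172.16.255.2
```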

A DLR can have up to 8 uplink interfaces and 1000 internal interfaces.


Moving on to the ESG. This is a virtual appliance. While a DLR provides East-West routing (i.e. within the virtual environment), an ESG provides North-South routing (i.e. in and out of the virtual environment). The ESG also provides services such as DHCP, NAT, VPN, and Load Balancing. (Note to self: the DLR does not provide DHCP or Load Balancing as one might expect – at least I expected it to! :p It does provide DHCP Relay though.) 

The uplink of an ESG will be a VDS (Distributed Switch) as that’s what eventually connects an ESXi environment to the physical network. 

An ESG needs an appliance to be deployed. You can enable/ disable SSH into this appliance. If enabled you can SSH into the ESG appliance from the uplink address or from any of the internal link IP addresses. In contrast, you can only SSH into a DLR instance if it has an associated appliance. Even then, you cannot SSH into the appliance from the internal LIFs (coz these don’t really exist, remember … they are on each ESXi host). With a DLR we have to SSH into the interface used for HA (this can be used even if there’s only one appliance and hence no HA). 

When deploying an ESG appliance HA can be enabled. This deploys two appliances in an active/passive mode (and the two appliances will be on separate hosts). These two appliances talk to each other to keep in sync via one of the internal interfaces (we can specify one, or NSX will just choose any). On this internal interface the appliances get link-local IP addresses (a /30 subnet from 169.254.0.0/16) and communicate over those (it doesn’t matter that some other IP range is actually in use on that segment, as these are link-local addresses and it’s unlikely anyone will actually use them). In contrast, if a DLR appliance is deployed with HA we need to specify a network separate from the networks it will be routing between. This can be a logical switch or a DVS, and as with the ESG the two appliances get link-local IP addresses (a /30 subnet from 169.254.0.0/16) for communication. Optionally, we can specify an IP address in this network via which we can SSH into the DLR appliance (this IP address will not be used for HA, however).
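The link-local /30 for the HA pair looks like this in practice. Which /30 NSX actually picks is internal to it; this sketch just shows the shape of the allocation and why it can coexist with whatever routed range is on the segment:

```python
from ipaddress import ip_network

link_local = ip_network("169.254.0.0/16")

# NSX carves some /30 out of this range for the two HA appliances;
# taking the first one here purely for illustration
ha_subnet = next(link_local.subnets(new_prefix=30))
active, standby = ha_subnet.hosts()
print(ha_subnet, active, standby)   # 169.254.0.0/30 169.254.0.1 169.254.0.2

# the HA addresses never collide with a routed segment such as 192.168.1.0/24
print(ha_subnet.overlaps(ip_network("192.168.1.0/24")))   # False
```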

After setting up all this, I also created two NAT rules just for kicks. 

And with that my basic setup of NSX is complete! (I skipped OSPF as I don’t think I will be using it any time soon in my immediate line of work; and if I ever need to I can come back to it later). Next I need to explore firewalls (micro-segmentation) and possibly load balancing etc … and generally fiddle around with this stuff. I’ve also got to start figuring out the troubleshooting and command-line stuff. But the base is done – I hope!

Yay! (VXLAN) contd. + Notes to self while installing NSX 6.3 (part 3)

Finally continuing with my NSX adventures … some two weeks have passed since my last post. During this time I moved everything from VMware Workstation to ESXi. 

Initially I tried doing a lift and shift from Workstation to ESXi. Actually, initially I went with ESXi 6.5 and that kept crashing. Then I learnt it was because I was using the HPE-customized version of ESXi 6.5, and since the server model I was using isn’t supported by ESXi 6.5 it has a tendency to PSOD. Strangely, the non-HPE-customized version has no issues. But after trying the HPE version and failing a couple of times, I gave up and went to ESXi 5.5. Set it up, tried exporting from VMware Workstation to ESXi 5.5, and that failed as the VM hardware level on Workstation was newer than what ESXi 5.5 supports. 

Not an issue – I fired up VMware Converter and converted each VM from Workstation to ESXi. 

Then I thought hmm, maybe the MAC addresses will change and that will cause an issue, so I SSH’ed into the ESXi host and manually changed the MAC addresses of all my VMs to whatever they were in Workstation. Also changed the adapters to VMXNET3 wherever they weren’t. Reloaded the VMs in ESXi, created all the networks (portgroups) etc., hooked up the VMs to these, and fired them up. That failed coz the MAC address ranges were VMware Workstation’s and ESXi refuses to work with those! *grr* Not a problem – change the config files again to add a parameter asking ESXi to ignore this MAC address problem – and finally it all loaded. 

But all my Windows VMs had their adapters reset to a default state. Not sure why – maybe the drivers are different? I don’t know. I had to reconfigure all of them again. Then I turned to OpnSense – that too had reset all its network settings, so I had to configure those too – and finally to nested ESXi hosts. For whatever reason none of them were reachable; and worse, my vCenter VM was just a pain in the a$$. The web client kept throwing some errors and simply refused to open. 

That was the final straw. So in frustration I deleted it all and decided to give up.

But then …

I decided to start afresh. 

Installed ESXi 6.5 (the VMware version, non-HPE) on the host. Created a bunch of nested ESXi VMs in that from scratch. Added a Windows Server 2012R2 as the shared iSCSI storage and router. Created all the switches and port groups etc, hooked them up. Ran into some funny business with the Windows Firewall (I wanted to assign some interface as Private, others as Public, and enable firewall only only the Public ones – but after each reboot Windows kept resetting this). So I added OpnSense into the mix as my DMZ firewall.

So essentially my ESXi host hooks into an internal vSwitch portgroup that has the OpnSense VM, which hooks into another vSwitch portgroup where my Server 2012R2 is connected, and that in turn connects to other vSwitch portgroups (a couple of them actually) where my nested ESXi hosts are connected (I need a couple of portgroups as my ESXi hosts have to be in separate L3 networks so I can actually see a benefit of VXLANs). OpnSense provides NAT and firewalling so none of my VMs are exposed to the outside network, yet they can connect to the outside network if needed. (I really love OpnSense by the way! An amazing product.) 

Then I got to the task of setting this all up. Create the clusters, shared storage, DVS networks; install my OpenBSD VMs inside the nested ESXi hosts. Then install NSX Manager, deploy controllers, configure the ESXi hosts for NSX, set up VXLANs, segment IDs, transport zones, and finally create the Logical Switches! :) I was pissed off initially at having to do all this again, but on the whole it was good as I am now comfortable setting these up. Practice makes perfect, and doing this all again was like revision. Ran into problems at each step – small niggles, but it was frustrating. Along the way I found that my (virtual) network still did not seem to support large MTU sizes – but then I realized it was coz my Server 2012R2 VM (which is the router) wasn’t set up with the large MTU size. Changed that, and that took care of the MTU issue. Now both the Web UI and CLI tests for VXLAN succeed. Finally!

Third time lucky hopefully. Above are my two OpenBSD VMs on the same VXLAN, able to ping each other. They are actually on separate L3 ESXi hosts so without NSX they won’t be able to see each other. 

Not sure why there are duplicate packets being received. 

Next I went ahead and set up a DLR so there’s communication between VXLANs. 

Yeah baby! :o)

Finally I spent some time setting up an ESG and got these OpenBSD VMs talking to my external network (and vice versa). 

The two command prompt windows are my Server 2012R2 on the LAN. It is able to ping the OpenBSD VMs and vice versa. This took a bit more time – not on the NSX side – as I forgot to add the routing info on the ESG for my two internal networks (192.168.1.0/24 and 192.168.2.0/24), as well as on the Server 2012R2 (192.168.0.0/16). Once I did that, routing worked as above. 

I am aware this is more of a screenshots plus talking post rather than any techie details, but I wanted to post this here as a record for myself. I finally got this working! Yay! Now to read the docs and see what I missed out and what I can customize. Time to break some stuff finally (intentionally). 

:o)

Yay! (VXLAN) contd. + Notes to self while installing NSX 6.3 (part 2)

In my previous post I said the following (in gray). Here I’d like to add on:

  • A VDS uses VMKernel ports (vmk ports) to carry out the actual traffic. These are virtual ports bound to the physical NICs on an ESXi host, and there can be multiple vmk ports per VDS for various tasks (vMotion, FT, etc). Similar to this we need to create a new vmk port for the host to connect into the VTEP used by the VXLAN. 
    • Unlike regular vmk ports though we don’t create and assign IP addresses manually. Instead we either use DHCP or create an IP pool when configuring the VXLAN for a cluster. (It is possible to specify a static IP either via DHCP reservation or as mentioned in the install guide).
      • The number of vmk ports (and hence IP addresses) corresponds to the number of uplinks. So a host with 2 uplinks will have two VTEP vmk ports, hence two IP addresses taken from the pool. Bear that in mind when creating the pool.
    • Each cluster uses one VDS for its VXLAN traffic. This can be a pre-existing VDS – there’s nothing special about it just that you point to it when enabling VXLAN on a cluster; and the vmk port is created on this VDS. NSX automatically creates another portgroup, which is where the vmk port is assigned to.
    • VXLANs are created on this VDS – they are basically portgroups in the VDS. Each VXLAN has an ID – the VXLAN Network Identifier (VNI) – which NSX refers to as segment IDs. 
      • Before creating VXLANs we have to allocate a pool of segment IDs (the VNIs), taking into account any VNIs that may already be in use in the environment.
      • The number of segment IDs is also limited by the fact that a single vCenter only supports a maximum of 10,000 portgroups.
      • The web UI only allows us to configure a single segment ID range, but multiple ranges can be configured via the NSX API.
  • Logical Switch == VXLAN -> which has an ID (called segment ID or VNI) == Portgroup. All of this is in a VDS. 
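Two of the sizing notes above lend themselves to quick arithmetic: the VTEP IP pool must cover hosts × uplinks addresses, and a segment ID range can’t realistically exceed what the ~10,000-portgroup vCenter ceiling leaves room for. A rough sketch with made-up numbers (the 5000 starting VNI is NSX’s usual lower bound for segment IDs):

```python
# hypothetical cluster: 8 hosts with 2 uplinks each -> 2 VTEP vmk ports per host
hosts, uplinks_per_host = 8, 2
vtep_ips_needed = hosts * uplinks_per_host
print(vtep_ips_needed)              # 16 addresses needed in the VTEP IP pool

# segment IDs (VNIs): each logical switch consumes a portgroup, and a single
# vCenter supports a maximum of about 10,000 portgroups
segment_start = 5000                # NSX segment IDs start at 5000
max_portgroups = 10_000
existing_portgroups = 500           # assumed number already in use
max_new_segments = max_portgroups - existing_portgroups
segment_end = segment_start + max_new_segments - 1
print(segment_start, segment_end)   # 5000 14499
```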

While installing NSX I came across “Transport Zones”.

Remember ESXi hosts are part of a VDS. VXLANs are created on a VDS, and each VXLAN is a portgroup on this VDS. However, not all hosts need be part of the same VXLANs; but since all hosts are part of the same VDS, and hence have visibility to all the VXLANs, we need some way of marking which hosts are part of a VXLAN. We also need some place to specify whether a VXLAN is in unicast, multicast, or hybrid mode. This is where Transport Zones come in.

If all your VXLANs are going to behave the same way (multicast etc) and have the same hosts, then you just need one transport zone. Else you would create separate zones based on your requirement. (That said, when you create a Logical Switch/ VXLAN you have an option to specify the control plane mode (multicast mode etc). Am guessing that overrides the zone setting, so you don’t need to create separate zones just to specify different modes). 

Note: I keep saying hosts above (last two paragraphs) but that’s not correct. It’s actually clusters. I keep forgetting, so thought I should note it separately here rather than correct my mistake above. 1) VXLANs are configured on clusters, not hosts. 2) All hosts within a cluster must be connected to a common VDS (at least one common VDS, for VXLAN purposes). 3) NSX Controllers are optional and can be skipped if you are using multicast replication? 4) Transport Zones are made up of clusters (i.e. all hosts in a cluster; you cannot pick & choose just some hosts – this makes sense when you remember that a cluster is for HA and DRS, so naturally you wouldn’t want to exclude some hosts from where a VM can vMotion to, as that would make things difficult). 

Worth keeping in mind: 1) A cluster can belong to multiple transport zones. 2) A logical switch can belong to only one transport zone. 3) A VM cannot be connected to logical switches in different transport zones. 4) A DLR (Distributed Logical Router) cannot connect to logical switches in multiple transport zones. Ditto for an ESG (Edge Services Gateway). 

After creating a transport zone, we can create a Logical Switch. This assigns a segment ID from the pool automatically and this (finally!!) is your VXLAN. Each logical switch creates yet another portgroup. Once you create a logical switch you can assign VMs to it – that basically changes their port group to the one created by the logical switch. Now your VMs will have connectivity to each other even if they are on hosts in separate L3 networks. 

Something I hadn’t realized: 1) Logical Switches are created on Transport Zones. 2) Transport Zones are made up of / can span clusters. 3) Within a cluster the logical switches (VXLANs) are created on the VDS that’s common to the cluster. 4) What I hadn’t realized was this: nowhere in the previous statements is it implied that transport zones are limited to a single VDS. So if a transport zone is made up of multiple clusters, each / some of which have their own common VDS, any logical switch I create will be created on all these VDSes. 

Sadly, I don’t feel like saying yay at this point, unlike before. I am too tired. :(

Which also brings me to the question of how I got this working with VMware Workstation. 

By default VMware Workstation emulates an e1000 NIC in the VMs, and this doesn’t support an MTU larger than 1500 bytes. We can edit the .VMX file of a VM and replace “e1000” with “vmxnet3” to swap the emulated Intel 82545EM Gigabit Ethernet NIC for a paravirtual VMXNET3 NIC. This NIC supports an MTU larger than 1500 bytes and VXLAN will begin working. One thing though: a quick way of testing if the VTEP VMkernel NICs are able to talk to each other with a larger MTU is via a command such as ping ++netstack=vxlan -I vmk3 -d -s 1600 xxx.xxx.xxx.xxx. If you do this once you add a VMXNET3 NIC though, it crashes the ESXi host. I don’t know why. It only crashes when using the VXLAN network stack; the same command with any other VMkernel NIC works fine (so I know the MTU part is OK). Also, when testing the Logical Switch connectivity via the Web UI (see example here) there’s no crash with a VXLAN standard test packet – maybe that doesn’t use the VXLAN network stack? I spent a fair bit of time chasing after the ping ++netstack command until I realized that even though it was crashing my host, the VXLAN was actually working!
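The .VMX edit can be scripted rather than done by hand. A minimal sketch on a scratch file (the key name ethernet0.virtualDev is the standard one for the first NIC; adjust per NIC, and only edit a VM’s real .vmx while it is powered off):

```shell
# demo on a scratch file standing in for the VM's .vmx
printf 'ethernet0.virtualDev = "e1000"\n' > /tmp/demo.vmx

# back up first, then swap the emulated e1000 for the paravirtual vmxnet3
cp /tmp/demo.vmx /tmp/demo.vmx.bak
sed -i 's/"e1000"/"vmxnet3"/' /tmp/demo.vmx

grep 'ethernet0.virtualDev' /tmp/demo.vmx   # ethernet0.virtualDev = "vmxnet3"
```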

Before I conclude a hat-tip to this post for the Web UI test method and also for generally posting how the author set up his NSX test lab. That’s an example of how to post something like this properly, instead of the stream of thoughts my few posts have been. :)

Update: Short-lived happiness. The next step was to create an Edge Services Gateway (ESG) and there I bumped into the MTU issues. And this time when I ran the test via the Web UI it failed and crashed the hosts. Disappointed, I decided it was time to move on from VMware Workstation. :-/

Update 2: Continued here … 

Yay! (VXLAN)

I decided to take a break from my NSX reading and just go ahead and set up a VXLAN in my test lab. Just go with a hunch of what I think the options should be based on what the menus ask me and what I have read so far. Take a leap! :)

*Ahem* The above is actually incorrect, and I am an idiot. A super huge idiot! Each VM is actually just pinging itself and not the other. Unbelievable! And to think that I got all excited thinking I managed to do something without reading the docs etc. The steps below are incomplete. I should just delete this post, but I wrote this much and had a moment of excitement that day … so am just leaving it as it is with this note. 

Above we have two OpenBSD VMs running in my nested ESXi hypervisors. 

  • obsd-01 is running on host 1, which is on network 10.10.3.0/24.
  • obsd-02 is running on host 2, which is on network 10.10.4.0/24. 
  • Note that each host is on a separate L3 network.
  • Each host is in a cluster of its own (doesn’t matter but just mentioning) and they connect to the same VDS.
  • In that VDS there’s a port group for VMs and that’s where obsd-01 and obsd-02 connect to. 
  • Without NSX, since the hosts are on separate networks, the two VMs wouldn’t be able to see each other. 
  • With NSX, I am able to create a VXLAN network on the VDS such that both VMs are now on the same network.
    • I put the VMs on a 192.168.0.0/24 network so that’s my overlay network. 
    • VXLANs are basically port groups within your NSX enhanced VDS. The same way you don’t specify IP/ network information on the VMware side when creating a regular portgroup, you don’t do anything when creating the VXLAN portgroup either. All that is within the VMs on the portgroup.
  • A VDS uses VMKernel ports (vmk ports) to carry out the actual traffic. These are virtual ports bound to the physical NICs on an ESXi host, and there can be multiple vmk ports per VDS for various tasks (vMotion, FT, etc). Similar to this we need to create a new vmk port for the host to connect into the VTEP used by the VXLAN. 
    • Unlike regular vmk ports though we don’t create and assign IP addresses manually. Instead we either use DHCP or create an IP pool when configuring the VXLAN for a cluster. (It is possible to specify a static IP either via DHCP reservation or as mentioned in the install guide). 
    • Each cluster uses one VDS for its VXLAN traffic. This can be a pre-existing VDS – there’s nothing special about it just that you point to it when enabling VXLAN on a cluster; and the vmk port is created on this VDS. NSX automatically creates another portgroup, which is where the vmk port is assigned to. 

And that’s where I am so far. After doing this I went through the chapter for configuring VXLAN in the install guide and I was pretty much on the right track. Take a look at that chapter for more screenshots and info. 

Yay, my first VXLAN! :o)

p.s. I went ahead with OpenBSD in my nested environment coz (a) I like OpenBSD (though I have never got to play around much with it); (b) it has a simple & fast install process and I am familiar with it; (c) the ISO file is small, so doesn’t take much space in my ISO library; (d) OpenBSD comes with VMware tools as part of the kernel, so nothing additional to install; (e) I so love that it still has a simple rc based system and none of that systemd stuff that newer Linux distributions have (not that there’s anything wrong with systemd just that I am unfamiliar with it and rc is way simpler for my needs); (f) the base install has manpages for all the commands unlike minimal Linux ISOs that usually seem to skip these; (g) take a look at this memory usage! :o)

p.p.s. Remember to disable the PF firewall via pfctl -d.

Yay again! :o)

Update: Short-lived excitement sadly. A while later the VMs stopped communicating. Turns out VMware Workstation doesn’t support an MTU larger than 1500 bytes, and VXLAN requires 1600 bytes. So the VTEP interfaces of the two ESXi hosts are unable to talk to each other. Bummer!
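The 1600-byte figure comes from VXLAN’s encapsulation overhead: for a standard 1500-byte inner payload the encapsulated outer IP packet comes to 1550 bytes, so the underlay MTU must be at least that, and 1600 is the usual recommendation with headroom. The arithmetic:

```python
inner_payload = 1500   # standard VM-facing MTU
inner_eth     = 14     # inner Ethernet header carried inside the tunnel
vxlan_hdr     = 8      # VXLAN header
udp_hdr       = 8      # outer UDP header
outer_ip      = 20     # outer IPv4 header

required_mtu = inner_payload + inner_eth + vxlan_hdr + udp_hdr + outer_ip
print(required_mtu)    # 1550 -> hence the common 1600-byte recommendation
assert required_mtu <= 1600
```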

Update 2: I finally got this working. Turns out I had missed some stuff; and also I had to make some changes to allow VMware Workstation to work with larger MTU sizes. I’ll blog this in a later post.

Notes to self while installing NSX 6.3 (part 1)

(No sense or order here. These are just notes I took when installing NSX 6.3 in my home lab, while reading this excellent NSX for Newbies series and the NSX 6.3 install guide from VMware (which I find to be quite informative). Splitting these into parts as I have been typing this for a few days).

You can install NSX Manager in VMware Workstation (rather than in the nested ESXi installation if you are doing it in a home lab). You won’t get a chance to configure the IP address, but you can figure it out from your DHCP server. Browse to that IP in a browser and log in as username “admin”, password “default” (no double quotes). 

If you want to add a certificate from your AD CA to NSX Manager, create the certificate as usual in Certificate Manager. Then export the generated certificate, your root CA, and any intermediate CA certificates as “Base-64 encoded X.509 (.CER)” files. Then concatenate all these certificates into a single file (basically, open up Notepad and make a new file that has all these certificates in it). Then you can import it into NSX Manager. (More details here.)
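The Notepad step is just joining the Base-64 files end to end, which a shell does in one line. File names here are hypothetical stand-ins, and the usual ordering is server certificate first, then intermediates, then the root:

```shell
# stand-in files; in reality these are the Base-64 .cer exports from the CA
echo 'SERVER CERT (Base-64 block here)'      > /tmp/server.cer
echo 'INTERMEDIATE CA (Base-64 block here)'  > /tmp/intermediate.cer
echo 'ROOT CA (Base-64 block here)'          > /tmp/root.cer

# concatenate into the single file that NSX Manager imports
cat /tmp/server.cer /tmp/intermediate.cer /tmp/root.cer > /tmp/nsx-chain.cer
```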

During the Host Preparation step on an ESXi 5.5 host it failed with the following error: 

“Could not install image profile: ([], “Error in running [‘/etc/init.d/vShield-Stateful-Firewall’, ‘start’, ‘install’]:\nReturn code: 1\nOutput: vShield-Stateful-Firewall is not running\nwatchdog-dfwpktlogs: PID file /var/run/vmware/watchdog-dfwpktlogs.PID does not exist\nwatchdog-dfwpktlogs: Unable to terminate watchdog: No running watchdog process for dfwpktlogs\nFailed to release memory reservation for vsfwd\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nResource pool ‘host/vim/vmvisor/vsfwd’ release failed. retrying..\nSet memory minlimit for vsfwd to 256MB\nFailed to set memory reservation for vsfwd to 256MB, trying for 256MB\nFailed to set memory reservation for vsfwd to failsafe value of 256MB\nMemory reservation released for vsfwd\nResource pool ‘host/vim/vmvisor/vsfwd’ released.\nResource pool creation failed. Not starting vShield-Stateful-Firewall\n\nIt is not safe to continue. Please reboot the host immediately to discard the unfinished update.”)” Error 3/16/2017 5:17:49 AM esx55-01.fqdn

Initially I thought maybe NSX 6.3 wasn’t compatible with ESXi 5.5 or that I was on an older version of ESXi 5.5 – so I Googled around on pre-requisites (ESXi 5.5 seems to be fine) and also updated ESXi 5.5 to the latest version. Then I took a closer look at the error message above and saw the bit about the 256MB memory reservation. My ESXi 5.5 host only had 3GB RAM (I had installed with 4GB and reduced it to 3GB) so I bumped it up to 4GB RAM and tried again. And voila! the install worked. So NSX 6.3 requires an ESXi 5.5 host with minimum 4GB RAM (well maybe 3.5 GB RAM works too – I was too lazy to try!) :o)

If you want, you can browse to “https://<NSX_MANAGER_IP>/bin/vdn/nwfabric.properties” to manually download the VIBs that get installed as part of Host Preparation. This is in case you want to do a manual install (the thought had crossed my mind as part of the troubleshooting above).

NSX Manager is your management layer. You install it first and it communicates with vCenter server. A single NSX Manager install is sufficient. There’s one NSX Manager per vCenter. 

The next step after installing NSX Manager is to install NSX Controllers. These are installed in odd numbers to maintain quorum. This is your control plane. Note: No data traffic flows through the controllers. The NSX Controllers perform many roles and each role has a master controller node (if this node fails another one takes its place via election). 

Remember that in NSX the VXLAN is your data plane. NSX supports three control plane modes – multicast, unicast, and hybrid – when it comes to BUM (Broadcast, Unknown unicast, and Multicast) traffic. BUM traffic is basically traffic that doesn’t have a single known unicast destination. (More info: [1], [2], [3] … there are many posts on the Internet, but these three are what I came across initially via Google searches.)

  • In unicast mode a host replicates all BUM traffic to all other hosts on the same VXLAN and also picks a host in every other VXLAN to do the same for the hosts in their VXLANs. Thus there’s no dependence on the underlying hardware. There could, however, be increased traffic as the number of VXLANs increases. Note that in the case of unknown unicast the host first checks with the NSX Controller for more info. (That’s the impression I get at least from the [2] post above – I am not entirely clear.) 
  • In multicast mode a host depends on the underlying networking hardware to replicate BUM traffic via multicast. All hosts on all VXLAN segments join multicast groups so any BUM traffic can be replicated by the network hardware to this multicast group. Obviously this mode requires hardware support. Note that multicast is used for both Layer 2 and Layer 3 here. 
  • In hybrid mode some of the BUM traffic replication is handed over to the first hop physical switch (so rather than a host sending unicast traffic to all other hosts connected to the same physical switch it relies on the switch to do this) while the rest of the replication is done by the host to hosts in other VXLANs. Note that multicast is used only for Layer 2 here. Also note that as in the unicast mode, in the case of unknown unicast traffic the Controller is consulted first. 

NSX Edge provides the routing. This is either via the Distributed Logical Router (DLR), which is installed on the hypervisor + a DLR virtual appliance; or via the Edge Services Gateway (ESG), which is a virtual appliance. 

  • A DLR can have up to 8 uplink interfaces and 1000 internal interfaces.
    • A DLR uplink typically connects to an ESG via a Layer 2 logical switch. 
    • DLR virtual appliance can be set up in HA mode – in an active/ standby configuration.
      • Created from NSX Manager?
    • The DLR virtual appliance is the control plane – it supports dynamic routing protocols and exchanges routing updates with Layer 3 devices (usually ESG).
      • Even if this virtual appliance is down, routing isn’t affected. New routes won’t be learnt, that’s all.
    • The ESXi hypervisors have DLR VIBs which contain the routing information etc. received from the Controllers (note: not from the DLR virtual appliance). This is the data layer. It performs ARP lookups, route lookups, etc. 
      • The VIBs also add a Logical InterFace (LIF) to the hypervisor. There’s one for each Logical Switch (VXLAN) the host connects to. Each LIF, of each host, is set to the default gateway IP of that Layer 2 segment. 
  • An ESG can have up to 10 uplink and internal interfaces. (With a trunk an ESG can have up to 200 sub-interfaces). 
    • There can be multiple ESG appliances in a datacenter. 
    • Here’s how new routes are learnt: the NSX Edge Services Gateway (ESG) learns a new route -> this is picked up by the DLR virtual appliance as they are connected -> the DLR virtual appliance passes this info to the NSX Controllers -> the NSX Controllers pass it to the ESXi hosts.
    • The ESG is what connects to the uplink. The DLR connects to ESG via a Logical Switch. 

Logical Switch – this is the switch for a VXLAN. 

NSX Edge provides Logical VPNs, Logical Firewall, and Logical Load Balancer.