Contact

Subscribe via Email

Subscribe via RSS/JSON

Categories

Recent Posts

Creative Commons Attribution 4.0 International License
© Rakhesh Sasidharan

Elsewhere

Using SolarWinds to highlight servers in a pending reboot status

Had a request to use SolarWinds to highlight servers in a pending reboot status. Here’s what I did.

Sorry, this is currently broken. After implementing this I realized I need to enable PowerShell remoting on all servers for it to work, else the script just returns the result from the SolarWinds server. Will update this post after I fix it at my workplace. If you come across this post before that, all you need to do is enable PowerShell remoting across all your servers and change the script execution to “Remote Host”.

SolarWinds has a built in application monitor called “Windows Update Monitoring”. It does a lot more than what I want so I disabled all the components I am not interested in. (I could have also just created a new application monitor, I know, just was lazy).

winupdatemon-1

The part I am interested in is the PowerShell Monitor component. By default it checks for the reboot required status by checking a registry key: HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update\RebootRequired. Here’s the default script –

Inspired by this blog post which monitors three more registry keys and also queries ConfigMgr, I replaced the default PowerShell script with the following –

Then I added the application monitor to all my Windows servers. The result is that I can see the following information on every node –

winupdatemon-2

Following this I created alerts to send me an email whenever the status of the above component (“Machine restart status …”) went down for any node. And I also created a SolarWinds report to capture all nodes for which the above component was down.

winupdatemon-3

Then I assigned this to a schedule to run once in a month after our patching window to email me a list of nodes that require reboots.

 

Solarwinds AppInsight for IIS – doing a manual install – and hopefully fixing invalid signature (error code: 16007)

AppInsight from Solarwinds is pretty cool. At least the one for Exchange is. Trying out the one for IIS now. Got it configured on a few of our servers easily but it failed on one. Got the following error –

appinsight-error

Bummer!

Manual install it is then. (Or maybe not! Read on and you’ll see a hopeful fix that worked for me).

First step in that is to install PowerShell (easy) and the IIS PowerShell snap-in. The latter can be downloaded from here. This downloads the Web Platform Installer (a.k.a. “webpi” for short) and that connects to the Internet to download the goods. In theory it should be easy, in practice the server doesn’t have connectivity to the Internet except via a proxy so I have to feed it that information first. Go to C:\Program Files\Microsoft\Web Platform Installer for that, find a file called WebPlatformInstaller.exe.config, open it in Notepad or similar, and add the following lines to it –

This should be within the <configuration> -- </configuration> block. Didn’t help though, same error.

webpi-error

Time to look at the logs. Go to %localappdata%\Microsoft\Web Platform Installer\logs\webpi for those.

From the logs it looked like the connection was going through –

But the problem was this –

If I go to the link – https://www.microsoft.com/web/webpi/5.0/webproductlist.xml – via IE on that server I get the following –

untrusted-cert

 

However, when I visit the same link on a different server there’s no error.

Interesting. I viewed the untrusted certificate from IE on the problem server and compared it with the certificate from the non-problem server.

Certificate on the problem server

Certificate on the problem server

Certificate on a non-problem server

Certificate on a non-problem server

Comparing the two I can see that the non-problem server has a VeriSign certificate in the root of the path, because of which there’s a chain of trust.

verisign - g5

If I open Certificate Manager on both servers (open mmc > Add/ Remove Snap-Ins > Certificates > Add > Computer account) and navigate to the “Trusted Root Certification Authorities” store) on both servers I can see that the problem server doesn’t have the VeriSign certificate in its store while the other server has.

cert manager - g5

So here’s what I did. :) I exported the certificate from the server that had it and imported it into the “Trusted Root Certification Authorities” store of the problem server. Then I closed and opened IE and went to the link again, and bingo! the website opens without any issues. Then I tried the Web Platform Installer again and this time it loads. Bam!

The problem though is that it can’t find the IIS PowerShell snap-in. Grr!

no snap-in

no snap-in 2

That sucks!

However, at this point I had an idea. The SolarWinds error message was about an invalid signature, and what do we know of that can cause an invalid signature? Certificate issues! So now that I have installed the required CA certificate for the Web Platform Installer, maybe it sorts out SolarWinds too? So I went back and clicked “Configure Server” again and bingo! it worked this time. :)

Hope this helps someone.

Solarwinds – “The WinRM client cannot process the request”

Added the Exchange 2010 Database Availability Group application monitor to couple of our Exchange 2010 servers and got the following error –

error1

Clicking “More” gives the following –

error2

This is because Solarwinds is trying to run a PowerShell script on the remote server and the script is unable to run due to authentication errors. That’s because Solarwinds is trying to connect to the server using its IP address, and so instead of using Kerberos authentication it resorts to Negotiate authentication (which is disabled). The error message too says the same but you can verify it for yourself from the Solarwinds server too. Try the following command

This is what’s happening behind the scenes and as you will see it fails. Now replace “Negotiate” with “Kerberos” and it succeeds –

So, how to fix this? Logon to the remote server and launch IIS Manager. It’s under “Administrative Tools” and may not be there by default (my server only had “Internet Information Services (IIS) 6.0 Manager”), in which case add it via Server Manager/ PowerShell –

Then open IIS Manager, go to Sites > PowerShell and double click “Authentication”.

iis-1

Select “Windows Authentication” and click “Enable”.

iis-2

Now Solarwinds will work.

Create Solarwinds account limitations based on Custom Properties

I wanted to create account limitations in Solarwinds based on Custom Properties but the web console by default doesn’t give an option to do that.

default limitationsThen I came across this helpful video.

The trick is to logon to your Solarwinds server and find the “Account Limitation Builder”. Then click “Add” and create a new limitation similar to this –

new limitation

Now the limitation you create will come in the list –

new limitation2

Select that, and in the next screen you can choose the value you want to limit to –

new limitation3

Removing a monitored resource from multiple nodes in Solarwinds

Had to remove a drive from being monitored from multiple servers in Solarwinds. Rather than go to each node, edit its properties and untick the drive, I figured that if you do a search for the nodes and expand each of them it’s possible to tick the ones you don’t need and click “Delete”.

multiple nodes

Nice!

Using Solarwinds to monitor Windows Services

This is similar to how I monitored performance counters with Solarwinds.

I want to monitor a bunch of AppSense services.

Similar to the performance counters where I created an application monitor template so I could define the threshold values, here I need to create an application monitor so that the service appears in the alert manager. This part is easy (and similar to what I did for the performance counters) so I’ll skip and just put a screenshot like below –

applimonitor

I created a separate application monitor template but you could very well add it to one of the standard templates (if it’s a service that’s to be monitored on all nodes for instance).

Now for the part where you create alerts.

Initially I thought this would be a case of creating triggers when the above application monitor goes down. Something like this – alert1

And create an alert message like this –

alert1a

With this I was hoping to get a one or more alert messages only for the services that actually went down. Instead, what happened is that whenever any one service went down I’d get an alert for the service that went down and also a message for the services that were up. Am guessing since I was triggering on the application monitor, Solarwinds helpfully sent the status for each of its components – up or down.

alert2

The solution is to modify your trigger such that you target each component.

alert3

Now I get alerts the way I want.

Hope this helps!

Mute Solarwinds alerts during reboots/ maintenance windows

I wanted to mute Solarwinds alerts during our patch weekends when all servers are rebooted because they have to be and our mailboxes get flooded with Solarwinds alerts. I decided to use custom properties for this purpose. Here’s what I did.

Login to the Solarwinds web console. Go to the “Settings” page, and then “Manage Custom Properties” under “Node *& Group Management”.

Click “Add Custom Property”, select the default of “Nodes” from the drop down, and create something along the following lines –

customproperties

Select the nodes you’d like to apply this custom property to. I chose to apply it on all my Windows and VMware nodes. Set the value to be “No”.

customvalues

Now login to Orion Alerts Manager and pick an alert you’d like to mute during patch weekends. Go to its “Alert Suppression” tab and add a condition on the custom property we created earlier.

alert custom properties

alert custom properties2

And that’s it, really!

Update: Not sure why, but the above didn’t seem to work for me. So I added the Mute_Alerts check as part of the trigger condition itself.

new trigger

Note: If you don’t get the custom property in Orion, close and restart it as an administrator (i.e. right click and do “Run as Administrator” even if you are already running it with an admin account). Not sure why, but until I did that the custom property didn’t get picked up. You only need to do it one time; after that you can launch Orion normally.

Next time your server estate is being rebooted/ undergoing maintenance, login to Solarwinds webconsole and change the “Mute_Alerts” custom property to “Y” for a node/ nodes that you want to mute alerts for. Below I show how I will mute the alerts for all my Windows nodes.

Go to “Manage Nodes”. Group by “Vendor” and select Windows. Then select all nodes. (The checkbox to select all nodes got blanked out in the screenshot below but it’s easy to find).

apply custom property

Then click on “Custom Property Editor” to get to the screen below.

apply custom property

Here too select all the nodes and click “Edit multiple values”.

From the drop down, change the value for “Mute_Alerts” to “true”. Then save changes and that’s it. :)

 

Using Solarwinds to monitor Windows Performance Monitor (perfmon) Counters

Had a request from our Exchange admin to setup Solarwinds alerts for some of our Exchange servers based on Performance Monitor counters.

MSExchangeTransport Queues(_total)\Active Remote Delivery Queue Length       (above 200)
MSExchangeTransport Queues(_total)\Largest Delivery Queue Length                 (above 200)
MSExchangeTransport Queues(_total)\Messages Queued For Delivery                (above 200)
MSExchangeTransport Queues(_total)\Retry Remote Delivery Queue Length        (above 20)

Before setting up alerts I need to add them to Solarwinds first. Here’s how you do that.

First, open up the Solarwinds web console, go to Applications, and then SAM Settings.

applicationssam settings

Then go to Component Monitor Wizard.

component monitor

 

Select Windows Performance Counter Monitor.

perfmon

Notice that it says the data is collected using RPC. This means (1) the server must be monitored by Solarwinds using WMI and not SNMP. In case of the latter, switch to monitoring via WMI. And (2) RPC ports must be open between the Solarwinds server and the target server. If not, monitoring will fail.

Enter the name of a server you wish to target. This server would be one that contains the perfmon counters you are interested in. You use this server to setup monitoring for the counters you are interested in. Change to 64bit if 32bit doesn’t work.

target

Change the “Choose Credential” drop down according to your environment. To select the server it’s better to click “Browse” and find the server you are interested in if Solarwinds complains that it cannot find the name you type in.

Note: The next step will fail if you have not opened the required RPC ports.

Select the counters you are interested in. First select the object you want to monitor (MSExchangeTransport Queues, in the screenshot below) and then the counters.

select counters

The next screen will list all the counters you selected and give you a chance to set warning and critical thresholds. Customize these.

 

properties

Select where you would like these counters added to – a new application monitor/ monitor template, or an existing application monitor/ monitor template. I am going with a new application monitor template. Easier to make changes to templates than individual application monitors.

whereadd

 

Choose more nodes you would like to assign this application monitor to. Am skipping this screenshot. This step is optional as you can assign the application monitor to nodes later too.

An optional step – I also went to Manage Application Templates screen after the above steps, selected the template I created, and assigned it some tags and set a custom view.

defineview

A custom view lets you define what details are shown when anyone clicks this application monitor template on a particular node in the Solarwinds web console. You can customize the view by going to Settings (of Solarwinds) and selecting Manage Views.

Next step is to create an alert. For that you have to logon to the Solarwinds server itself, go to Alert Manager, create a new alert (skipping screenshots for all these) and create a new alert whose condition is as follows:solarwinds trigger

Note that the type of property to monitor is “APM: Component”. This is important for the correct variables to be visible in the alert message. Also, note that I am triggering for each of the component (with an “any” condition) and not for the application monitor itself. This lets me get alerts for individual components; if I don’t do this, and instead trigger on the application monitor itself, I will get alert emails for each component including the ones that don’t have an issue.

Here’s the alert message:

solarwinds message

Power cycle/ Reset an HP blade server

Was getting the following error on one of our servers. It’s from ESXi. None of the NICs were working for the server (the NICs seemed to be working, just that the driver wasn’t loading). 

error

Power cycle required. 

I switched off and switched on the server but that didn’t help. Turns out that doesn’t actually power cycle the server (because the server still has power – doh!). What you need to do is do something called an e-fuse reset. This power cycles the blade. You have to do this by opening an SSH session to the Onboard Administrator, finding the bay number of the blade you want to power cycle, and typing the command reset server <bay number>

Good to know!

Note: The command does not appear when you type help, but it’s there:

Also, to get a list of your bays and servers use the show server list command. To do the same for interconnects use the show interconnect list command.

Unable to login to vSphere because the admin@system-domain password cannot be reset

vSphere 5.1 has admin@system-domain as the default admin account. vSphere 5.5 changes that to administrator@vsphere.local. However, if you upgrade from 5.1 to 5.5 the default admin account remains admin@system-domain. Which is fine and dandy until the password for this account expires. Then you are unable to reset or login! See below. :)

Trying to login as usual

1 - login

Password has expired, needs a reset

2 - reset

Reset fails though coz you can only reset for the vsphere.local domain

3 - reset fails

Missed out on taking a screenshot but if you were to try and login with administrator@vsphere.local instead you get an error that the credentials are invalid (because that account doesn’t exist!). So you are stuck!

What do you do?

Solution is to reset the admin password

When you do this vSphere automatically creates the administrator@vsphere.local account. Follow the steps in this KB article.

4 - reset password

Now you can login with administrator@vsphere.local and the generated password.

Notes on NLB, VMware, etc

Just some notes to myself so I am clear about it while reading about it. In the context of this VMware KB article – Microsoft NLB not working properly in Unicast mode.

Before I get to the article I better talk about a regular scenario. Say you have a switch and it’s got a couple of devices connected to it. A switch is a layer 2 device – meaning, it has no knowledge of IP addresses and networks etc. All devices connected to a switch are in the same network. The devices on a switch use MAC addresses to communicate with each other. Yes, the devices have IPv4 (or IPv6) addresses but how they communicate to each other is via MAC addresses.

Say Server A (IPv4 address 10.136.21.12) wants to communicate with Server B (IPv4 address 10.136.21.22). Both are connected to the same switch, hence on the same LAN. Communication between them happens in layer 2. Here the machines identify each other via MAC addresses, so first Server A checks whether it knows the MAC address of Server B. If it knows (usually coz Server A has communicated with Server B recently and the MAC address is cached in its ARP table) then there’s nothing to do; but if it does not, then Server A finds the MAC address via something called ARP (Address Resolution Protocol). The way this works is that Server A broadcasts to the whole network that it wants the MAC address of the machine with IPv4 address 10.136.21.22 (the address of Server B). This message goes to the switch, the switch sends it to all the devices connected to it, Server B replies with its MAC address and that is sent to Server A. The two now communicate – I’ll come to that in a moment.

When it’s communication from devices in a different network to Server A or Server B, the idea is similar except that you have a router connected to the switch. The router receives traffic for a device on this network – it knows the IPv4 address – so it finds the MAC address similar to above and passes it to that device. Simple.

Now, how does the switch know which port a particular device is connected to. Say the switch gets traffic addresses to MAC address 00:eb:24:b2:05:ac – how does the switch know which port that is on? Here’s how that happens –

  • First the switch checks if it already has this information cached. Switches have a table called the CAM (Content Addressable Memory) table which holds this cached info.
  • Assuming the CAM table doesn’t have this info the switch will send the frame (containing the packets for the destination device) to all ports. Note, this is not like ARP where a question is sent asking for the device to respond; instead the frame is simply sent to all ports. It is broadcast to the whole network.
  • When a switch receives frames from a port it notes the source MAC address and port and that’s how it keeps the CAM table up to date. Thus when Server A sends data to Server B, the MAC address and switch port of Server A are stored in the switch’s CAM table.  This entry is only stored for a brief period.

Now let’s talk about NLB (Network Load Balancing).

Consider two machines – 10.136.21.11 with MAC address 00:eb:24:b2:05:ac and 10.136.21.12 with MAC address 00:eb:24:b2:05:ad. NLB is a form of load balancing wherein you create a Virtual IP (VIP) such as 10.136.21.10 such that any traffic to 10.136.21.10 is sent to either of 10.136.21.11 or 10.136.21.12. Thus you have the traffic being load balanced between the two machines; and not only that if any one of the machines go down, nothing is affected because the other machine can continue handling the traffic.

But now we have a problem. If we want a VIP 10.136.21.10 that should send traffic to either host, how will this work when it comes to MAC addresses? That depends on the type of NLB. There’s two sorts – Unicast and Multicast.

In Unicast the NIC that is used for clustering on each server has its MAC address changed to a new Unicast MAC address that’s the same for all hosts. Thus for example, the NIC that holds the NLB IP address 10.136.21.10 in the scenario above will have its MAC address changed from 00:eb:24:b2:05:ac and 00:eb:24:b2:05:ad respectively to (say) 00:eb:24:b2:05:af. Note that the MAC address is a Unicast MAC (which basically means the MAC address looks like a regular MAC address, such as that assigned to a single machine). Since this is a Unicast MAC address, and by definition it can only be assigned to one machine/ switch port, the NLB driver on each machines cheats a bit and changes the source MAC address address to whatever the original NIC MAC address was. That is to say –

  • Server IP 10.136.21.11
    • Has MAC address 00:eb:24:b2:05:ac
    • Which is changed to a MAC address of 00:eb:24:b2:05:af as part of the Unicast IP/ enabling NLB
    • However when traffic is sent out from this machine the MAC address is changed back to 00:eb:24:b2:05:ac
  • Same for Server 10.136.21.12

Why does this happen? This is because –

  • When a device wants to send data to the VIP address, it will try find the MAC address using ARP. That is, it sends a broadcast over the network asking for the device with this IP address to respond. Since both servers now have the same MAC address for their NLB NIC either server will respond with this common MAC address.
  • Now the switch receives frames for this MAC address. The switch does not have this in its CAM table so it will broadcast the frame to all ports – reaching either of the servers.
  • But why does outgoing traffic from either server change the MAC address of outgoing traffic? That’s because if outgoing frames have the common MAC address, then the switch will associate this common MAC address with that port – resulting in all future traffic to the common MAC address only going to one of the servers. By changing the outgoing frame MAC address back to the server’s original MAC address, the switch never gets to store the common MAC address in its CAM table and all frames for the common MAC address are always broadcast.

In the context of VMware what this means is that (a) the port group to which the NLB NICs connect to must allow changes to the MAC address and allow forged transmits; and (b) when a VM is powered on the port group by default notifies the physical switch of the VMs MAC address, since we want to avoid this because this will expose the cluster MAC address to the switch this notification too must be disabled. Without these changes NLB will not work in Unicast mode with VMware.

(This is a good post to read more about NLB).

Apart from Unicast NLB there’s also Multicast NLB. In this form the NLB NIC’s MAC address is not changed. Instead, a new Multicast MAC address is assigned to the NLB NIC. This is in addition to the regular MAC address of the NIC. The advantage of this method is that since each host retains its existing MAC address the communication between hosts is unaffected. However, since the new MAC address is a Multicast MAC address – and switches by default are set to ignore such address – some changes need to be done on the switch side to get Multicast NLB working.

One thing to keep in mind is that it’s important to add a default gateway address to your NLB NIC. At work, for instance, the NLB IPv4 address was reachable within the network but from across networks it wasn’t. Turns out that’s coz Windows 2008 onwards have a strong host behavior – traffic coming in via one NIC does not go out via a different NIC, even if both are in the same subnet and the second NIC has a default gateway set. In our case I added the same default gateway to the NLB NIC too and it was then reachable across networks. 

HP DL360 Gen9 with HP FlexFabric 534 adapter and HP Ethernet 530 adapter and ESXi

That’s a very vague subject line, I know, but I couldn’t think of anything concise. Just wanted to put some keywords so that if anyone else comes across the same problem and types something similar into Google hopefully they stumble upon this post.

At work we got some HP DL360 Gen9s to use as ESXi hosts. To these servers we added additional network cards –

  • HP FlexFabric 10Gb 2-port 534FLR-SFP+ Adapter; and
  • HP Ethernet 10Gb 2-port 530SFP+ Adapter.

Each of these adapters have two NICs each. Here’s a picture of the adapters in the server and the vmnic numbers ESXi assigns to them.

serverIn this picture –

  • vmnic5 & vmnic4 are the HP FlexFabric 10Gb 2-port 534FLR-SFP+ Adapter;
  • vmnic6 & vmnic7 are the HP Ethernet 10Gb 2-port 530SFP+ Adapter; and
  • vmnic0 – vmnic3 are HP Ethernet 1Gb 4-port 331i Adapter (which come in-built into the server);
  • iLO is the iLO port (which I’ll ignore for now).

We didn’t want to use vmnic0 – vmnic3 as they are only 1Gb. So the idea was the use vmnic4 – vmnic7. Two NICs would be for Management+vMotion (connecting to two different switches); two NICs would be for iSCSI (again connecting to different switches).

We came across two issues. First was that the FlexFabric NICs didn’t seem to support iSCSI. ESXi showed two iSCSI adapters but the NICs mapped to them were the regular Ethernet 10Gb ones, not the FlexFabric 10Gb ones. Second issue was that we wanted to use vmnic4 and vmnic6 for Management+vMotion and vmnic5 and vmnic7 for iSCSI – basically a NIC from each adapter such that even if an adapter were to fail there’s a NIC from another adapter for resiliency. This didn’t work for some reason. The Ethernet 10Gb NICs weren’t “connecting” to the network switch for some reason. They would connect in the sense that the link status appears as connected and the LEDs on the switch and NICs blink, but something was missing. There was no real connectivity.

Here’s what we did to fix these.

But first, for both these fixes you have to reboot the server and go into the System Utilities menu.

f9 system utils

Change 1: Enable iSCSI on the FlexFabric adapter (vmnic4 and vmnic5)

Once in the System Utilities menu select “System Configuration”.

system configurationSelect the first FlexFabric NIC (port1).

select flexfabricThen select the Device Hardware Configuration menu.

select device hardwareYou will see that the storage personality is FCoE.

current flex personalityThat’s the problem. This is why the FlexFabric adapters don’t show up as iSCSI adapters. Select the FCoE entry and change it to iSCSI.

new flex personalityNow press Esc to go back to the previous menus (you will be prompted to save the changes – do so). Then repeat the above steps for the second FlexFabric NIC (port 2).

With this change the FlexFabric NICs will appear as iSCSI adapters. Now for the second change.

Change 2: Enable DCB for the Ethernet adapters

From the System Configuration menu now select the first Ethernet NIC (port 1).

select ethernetThen select its Device Hardware Configuration menu.

select device hardware (ethernet)Notice the entry for “DCB Protocol”. Most likely it is “Disabled” (which is why the NICs don’t work for you).

current DCBChange that to “Enabled” and now the NICs will work.

new DCBThat’s it. Once again press Esc (choosing to save the changes when prompted) and then reboot the system. Now all the NICs will work as expected and appear as iSCSI adapters too.

rebootI have no idea what DCB does. From what I can glean via Google it seems to be a set of extensions to Ethernet that provide “hardware-based bandwidth allocation to a specific type of traffic and enhances Ethernet transport reliability with the use of priority-based flow control” (via TechNet) (also check out this Cisco whitepaper for more info). I didn’t read much into it because I couldn’t find anything that mentioned why DCB mattered in this case – as in why were the NICs not working when DCB was disabled? The NICs are connected to an HP 5920AF switch but I couldn’t find anything that suggested the switch requires DCB enabled for the ports to work. This switch supports DCB but that doesn’t imply it requires DCB.

Anyhow, the FlexFabric adapters have DCB enabled by default which is probably why they worked. That’s how I got the idea to enable DCB on the Ethernet adapters to see if it makes a difference – and it did! The only thing I can think of is that DCB also seems to include a DCBX (Data Centre Bridging Exchange) protocol which is about discovering peers, discovering mismatched configuration etc – so maybe the fact that DCB was disabled on these adapters made the switch not “see” these NICs and soft-disable them somehow. That’s my guess at least.

Windows DNS server subnet prioritization and round-robin

Consider the following multiple A records for a DNS record proxy.mydomain.com:

  • proxy.mydomain.com IN A 192.168.10.5
  • proxy.mydomain.com IN A 10.136.53.5
  • proxy.mydomain.com IN A 10.136.52.5
  • proxy.mydomain.com IN A 10.136.33.5
  • proxy.mydomain.com IN A 192.168.15.5

These records are defined on a DNS server. When a client queries the DNS server for the address to proxy.mydomain.com, the DNS server returns all the addresses above. However, the order of answers returned keeps varying. The first client asking for answers could get them in the following order for instance:

  • proxy.mydomain.com IN A 192.168.10.5
  • proxy.mydomain.com IN A 10.136.53.5
  • proxy.mydomain.com IN A 10.136.52.5
  • proxy.mydomain.com IN A 10.136.33.5
  • proxy.mydomain.com IN A 192.168.15.5

The second client could get them in the following order:

  • proxy.mydomain.com IN A 10.136.53.5
  • proxy.mydomain.com IN A 10.136.52.5
  • proxy.mydomain.com IN A 10.136.33.5
  • proxy.mydomain.com IN A 192.168.15.5
  • proxy.mydomain.com IN A 192.168.10.5

The third client could get:

  • proxy.mydomain.com IN A 10.136.52.5
  • proxy.mydomain.com IN A 10.136.33.5
  • proxy.mydomain.com IN A 192.168.15.5
  • proxy.mydomain.com IN A 192.168.10.5
  • proxy.mydomain.com IN A 10.136.53.5

This is called round-robin. Basically it rotates between the various IP addresses. All IP addresses are offered as answers, but the order is rotated so that as long as clients choose the first answer in the list every client chooses a different IP address.

Notice I said clients choose the first answer in the list. This needn’t always be the case though. When I said clients above, I meant the client computer that is querying the DNS server for an answer. But that’s not really who’s querying the server. Instead, an application on the client (e.g. Chrome, Internet Explorer) or the client OS itself is the one looking for an answer. These ask the DNS resolver which is usually a part of the OS for an answer, and it’s the resolver that actually queries the server and gets the list of answers above.

The DNS resolver can then return the list as it is to the requesting application, or it can apply a re-ordering of its own. For instance, if the client is from the 192.168.10.0 network, the resolver may re-order the answers such that the 192.168.10.5 answer is always first. This is called Subnet prioritization. Basically, the resolver prioritizes answers that are from the same subnet as the client. The idea being that client applications would prefer reaching out to a server in their same subnet (it’s closer to them, no need to go over the WAN link for instance) than one on a different subnet.

Subnet prioritization can be disabled on the resolver side by adding a registry key PrioritizeRecordData (link) with value 0 (REG_DWORD) at HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DnsCache\Parameters. By default this key does not exist and has a default value of 1 (subnet prioritization enabled).

Subnet prioritization can also be set on the server side so it orders the responses based on the client network. This is controlled by the registry key LocalNetPriority (link) under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DNS\Parameters\ on the DNS server. By default this is 0, so the server doesn’t do any subnet prioritization. Change this to 1 and the server will order its responses according to the client subnet.

By default the server also does round-robin for the results it returns. This can be turned off via the DNS Management tool (under server properties > advanced tab). If round-robin is off the server returns records in the order they were added.

More on subnet prioritization at this link.

That’s is not the end though. :)

Consider a server who has round-robin and subnet prioritization enabled. Now consider the DNS records from above:

  • proxy.mydomain.com IN A 192.168.10.5
  • proxy.mydomain.com IN A 10.136.53.5
  • proxy.mydomain.com IN A 10.136.52.5
  • proxy.mydomain.com IN A 10.136.33.5
  • proxy.mydomain.com IN A 192.168.15.5

The first and last records are from class C networks. The other three are from Class A networks. In reality though thanks to CIDR these are all class C addresses.

Now say there’s a client with IP address 10.136.50.2/24 asking the server for answers. On the face of it the client network does not match any of the answer record networks so the server will simply return answers as per round-robin, without any re-ordering. But in reality though the client 10.136.50.2/24 is in the same network as 10.136.52.5/24 and both are part of a larger 10.136.48.0/20 network that’s simply been broken into multiple /24 networks (to denote clients, servers, etc). What can we do so the server correctly identifies the proxy record for this client?

This is where the LocalNetPriorityNetMask registry key under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DNS\Parameters\ on the DNS server comes into play. This key – which does not exist by default – tells the server what subnet mask to assume when it’s trying to subnet prioritize. By default the server assumes a /24 subnet, but by tweaking this key we can tell the server to use a different subnet in its calculations and thus correctly return an answer.

The LocalNetPriorityNetMask key takes a REG_DWORD value in a hex format. Check out this KB article for more info, but a quick run through:

A netmask can be written as xxx.xxx.xxx.xxx. 4 pairs of numbers. The LocalNetPriorityNetMask key is of format 0xaabbccdd – again, 4 pairs of hex numbers. This is a mask that’s applied on the mask of 255.255.255.255 so to calculate this number you subtract the mask you want from 255.255.255.255 and convert the resulting numbers into hex.

For example: you want a /8 netmask. That is 255.0.0.0. Subtracting this from 255.255.255.255 leaves you with 0.255.255.255.255. What’s that in hex? 00ffffff. So LocalNetPriorityNetMask will be 0x00ffffff. Easy?

So in the example above I want a /20 netmask. That is, I am telling the server to assume the clients and the record IPs it has to be in a /20 network, so subnet prioritize accordingly. A /20 netmask is 255.255.240.0. Subtract from 255.255.255.255 to get 0.0.15.255. Which in hex is 00000fff (15 decimal is F hex). So all I have to do is put this value as LocalNetPriorityNetMask on the DNS server, restart the service, and now the server will correctly return subnet prioritized answers for my /20 network.

Update: Some more links as I did some more reading on this topic later.

  • Ace Fekay’s post – a must read!
  • A subnet calculator (also gives you the wildcard, which you can use for calculating the LocalNetPriorityNetMask key)
  • I am not very clear on what happens if you disable RoundRobin but there are multiple entries from the same subnet. What order are they returned in? Here’s a link to the RoundRobin setting, doesn’t explain much but just linking it in case it helps in the future.
  • More as a note to myself (and any others if they wonder the same) – the subnet mask you specify is applied on the client. That is to say if you client has an IP address of say 10.136.20.10, by default the DNS server will assume a subnet mask of /24 (Class C is the default) and assume the client is in a 10.136.20.0/24 network. So any records from that range are prioritized. If you want to include other records, you specify a larger subnet mask. Thus, for example, if you specify a /20 then the client is assumed to have an IP address 10.136.20.10/20, so its network range is considered to be 10.136.16.1 – 10.136.31.254 (don’t wrack your brain – use the subnet calculator for this). So any record in this range is prioritized over records not in this range.
  • The Windows calculator can be used to find the LocalNetPriorityNetMask key value. Say you want a subnet mask of /19. The subnet calculator will tell you this has a wildcard of 0.0.31.255 – i.e. 00011111.11111111. Put this (13 1’s) into the Windows calculator to get the hex value 3FFF.  

vMotion is using the Management Network (and failing)

Was migrating one of our offices to a new IP scheme the other day and vMotion started failing. I had a good idea what the problem could be (coz I encountered something similar a few days ago in another context) so here’s a blog post detailing what I did.

For simplicity let’s say the hosts have two VMkernel NICs – vmk0 and vmk1. vmk0 is connected to the Management Network. vmk1 is for vMotion. Both are on separate VLANs.

When our Network admins gave out the new IPs they gave IPs from the same range for both functions. That is, for example, vmk0 had an IP 10.20.1.2/24 (and 10.20.1.3/24 and 10.20.4/24 on the other hosts) and vmk1 had an IP of 10.20.12/24 (and 10.20.1.13/24 and 10.20.1.14/24 on the other hosts).

Since both interfaces are on separate VLANs (basically separate LANs) the above setup won’t work. That’s because as far as the hosts are concerned both interfaces are on the same network yet physically they are on separate networks. Here’s the routing table on the hosts:

Notice that any traffic to the 10.20.1.0/24 network goes via vmk0. And that includes the vMotion traffic because that too is in the same network! And since the network that vmk0 is on is physically a separate network (because it is a VLAN) this traffic will never reach the vMotion interfaces of the other hosts because they don’t know of it.

So even though you have specific vmk1 as your vMotion traffic NIC, it never gets used because of the default routes.

If you could force the outgoing traffic to specifically use vmk1 it will work. Below are the results of vmkping using the default route vs explicitly using vmk1:

The solution here is to either remove the VLANs and continue with the existing IP scheme, or to keep using VLANs but assign a different IP network for the vMotion interfaces.

Update: Came across the following from this blog post while searching for something else:

If the management network (actually the first VMkernel NIC) and the vMotion network share the same subnet (same IP-range) vMotion sends traffic across the network attached to first VMkernel NIC. It does not matter if you create a vMotion network on a different standard switch or distributed switch or assign different NICs to it, vMotion will default to the first VMkernel NIC if same IP-range/subnet is detected.

Please be aware that this behavior is only applicable to traffic that is sent by the source host. The destination host receives incoming vMotion traffic on the vMotion network!

That answered another question I had but didn’t blog about in my post above. You see, my network admins had also set the iSCSI networks to be in the same subnet as the management network – but separate VLANs – yet the iSCSI traffic was correctly flowing over that VLAN instead of defaulting to the management VMkernel NIC. Now I understand why! It’s only vMotion that defaults to the first VMkernel NIC in the same IP range/ subnet as vMotion. 

 

How to fail-over the primary VC module to another

Probably obvious to most people but it was new to me. To failover a VC module from primary to backup login to the VC Manager, then go to “Tools > Reset Virtual Connect Manager”.

toolsClick “Yes”, but before that tick the box that says “Force Failover”.

force failoverThat’s all. Now you will be logged out and after a few minutes you can login to the module that is now primary. It takes a few minutes so be patient.

This has no impact on the network or FC connectivity.

Some more ways of doing this can be found at this HP link.