Creative Commons Attribution 4.0 International License
© Rakhesh Sasidharan


Generating certificates with SAN in NetScaler (to make it work with Chrome and other browsers)

I want to create a certificate for my NetScaler and get it working in Chrome. Creating a certificate is easy – there are Citrix docs etc for it – but Chrome keeps complaining about missing subjectAlternativeName. This is because Chrome 58 and upwards ignore the Common Name (CN) field in a certificate and only check the Subject Alternative Names (SAN) field. Other browsers too might ignore the CN field if the SAN field is present (they are supposed to at least); so as a best practice it’s a good idea to fill the SAN field in my NetScaler certificate and put all the names (including the CN) in this field. 
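A minimal Python sketch of the matching behaviour described above. It is an illustration only, not Chrome's actual code: the function name and the example hostnames are made up, and wildcard matching is left out.

```python
def san_only_match(hostname, common_name, san_dns_names):
    """Return True only if hostname appears in the SAN DNS names.

    common_name is accepted purely to show that it is never consulted,
    which mirrors the Chrome 58+ behaviour described in the text.
    """
    wanted = hostname.lower().rstrip(".")
    return any(wanted == name.lower().rstrip(".") for name in san_dns_names)


# Hypothetical certificate with a matching CN but an empty SAN field: rejected.
cn_only = san_only_match("ns.example.test", "ns.example.test", [])

# Same name listed in the SAN field: accepted, whatever the CN says.
with_san = san_only_match("ns.example.test", "something-else", ["ns.example.test"])
```

This is why the CN has to be repeated in the SAN field: a name that is only in the CN is never looked at.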

Problem is the NetScaler web UI doesn’t have an option for specifying the SAN field. Windows CA (which is what I use internally) supports SAN when making requests, but since the CSR is usually created on the NetScaler and that doesn’t have a way of mentioning SAN, I need an alternative approach. 

Here’s one approach from a Citrix blog post. Typically the CLI loving geek in me would have taken that route and stopped at that, but today I feel like exploring GUI options. :)

So I came across the DigiCert Certificate Utility and a guide on how to generate a CSR using that. I don’t need to use the guide entirely as my CA is internal, but the tool (download link) is useful. So I downloaded it and created a certificate request. 

A bit of background on the above. I have two NetScalers, each with its own DNS name and IP address, in an HA pair. For management purposes I have a SNIP (with its own DNS name) which I can connect to without bothering about which is the current primary. So I want to create a certificate that will be valid for all three DNS names and IP addresses. Hence in the Subject Alternative Names field I fill in all three names and IP addresses – note: all three names including the one I put in the common name, since Chrome ignores the CN field (and other browsers are supposed to ignore the CN if SAN is present).
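The list of SAN entries can be sketched in Python as below. The host names and addresses are placeholders (example.test names and the 192.0.2.0/24 documentation range), not the real ones, and the "DNS:" / "IP:" prefixes follow the OpenSSL-style notation rather than anything the DigiCert tool requires.

```python
import ipaddress


def build_san_entries(common_name, dns_names, ip_addresses):
    """Return SAN entries with the CN first and no duplicates."""
    entries = []
    for name in [common_name, *dns_names]:
        entry = f"DNS:{name.lower()}"
        if entry not in entries:
            entries.append(entry)
    for ip in ip_addresses:
        # ip_address() raises ValueError if the address is malformed.
        entry = f"IP:{ipaddress.ip_address(ip)}"
        if entry not in entries:
            entries.append(entry)
    return entries


san = build_san_entries(
    "ns-mgmt.example.test",                        # hypothetical SNIP name, used as the CN
    ["ns1.example.test", "ns2.example.test", "ns-mgmt.example.test"],
    ["192.0.2.10", "192.0.2.11", "192.0.2.12"],    # hypothetical addresses
)
```

The CN ends up in the list exactly once even though it is passed in twice.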

I click Generate and the tool generates a new CSR. I save this someplace. 

Now I need to use this CSR to generate a certificate. Typically I would have gone with the WebServer template in my internal CA, but the thing is that eventually I’ll have to import this CSR, the generated certificate, and the private key of that certificate to the NetScaler – and the default WebServer template does not allow key exporting. 

So I make a new template on my CA. This is just a copy of the default “Web Server” template, but I make a change to allow exporting of the private key (see checkbox below).

Then I create a certificate on my CA using this CSR. 

“WebServer_withKey” is the template name, not the template display name. The certreq command needs the template name rather than the display name. 

This will create the certificate and save it at a location I specify. 
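A sketch of the kind of certreq invocation this step uses, built as an argument list in Python. The `-submit` and `-attrib "CertificateTemplate:..."` parts are standard certreq usage; the file names are hypothetical, and the exact command run on the CA may have differed (for example by naming the CA with `-config`).

```python
def certreq_submit_command(template_name, csr_path, cert_path):
    """Build the argument list for submitting a CSR against a given template."""
    return [
        "certreq",
        "-submit",
        "-attrib", f"CertificateTemplate:{template_name}",
        csr_path,    # request file produced by the DigiCert tool
        cert_path,   # where the issued certificate is saved
    ]


# File names below are placeholders.
cmd = certreq_submit_command("WebServer_withKey", "netscaler.csr", "netscaler.cer")
print(" ".join(cmd))
```

On the CA (or a machine that can reach it) the list could be passed to `subprocess.run(cmd, check=True)`.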

At this point I have the CSR and the certificate. I can’t import these into the NetScaler as that also requires the private key. The DigiCert tool generates the private key automatically and keeps it with itself, so we need to import this certificate into the tool and export with key from there. This exports the certificate, along with key, into a PFX format. 

This Citrix article is a good reference on the various certificate formats. It also gives instructions on how to import a PFX certificate into NetScaler.

Before proceeding however, a quick summary of the certificate formats from the same article for my own reference:

  • PFX is a format for storing a server certificate or any intermediate certificate along with private key in one encrypted file. 
    • PFX == PKCS#12 (i.e. both terms can be used interchangeably). 
  • PEM is another format, and a very common one. It can contain certificates and keys together, or either one on its own. 
    • These are Base64 encoded ASCII files and have extensions such as .pem, .crt, .cer, or .key. 
  • DER is a binary form of the PEM format. (So while PEM formats can be opened in Notepad, for instance, as a text file, DER format cannot). 
    • These are binary files. Have extensions such as .cer and .der. (Note: .cer can be a PEM format too).
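Since the extension alone does not tell the formats apart (.cer can be either), a rough way to check is to look at the content. The Python sketch below is a heuristic, not a parser: PEM files start with a "-----BEGIN" line, while DER and PFX files are binary ASN.1 that normally starts with the byte 0x30. The sample bytes are a tiny made-up ASN.1 structure, not a real certificate.

```python
import ssl


def guess_encoding(data: bytes) -> str:
    """Guess a certificate file's encoding from its first bytes."""
    if data.lstrip().startswith(b"-----BEGIN "):
        return "PEM"
    if data[:1] == b"\x30":  # ASN.1 SEQUENCE tag
        return "binary (DER or PFX)"
    return "unknown"


# Placeholder bytes standing in for a DER certificate.
fake_der = b"\x30\x03\x02\x01\x01"

# PEM is just the Base64 form of the same bytes, wrapped in BEGIN/END lines.
pem_text = ssl.DER_cert_to_PEM_cert(fake_der)
round_trip = ssl.PEM_cert_to_DER_cert(pem_text)
```

The round trip through the `ssl` module helpers shows that the PEM and DER forms carry the same data.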

So I go ahead and import the PFX file.

And then I install a new certificate created from this imported PFX file. 

Note: After taking the screenshot I changed the first field (certificate-key pair name) to “ns105_rockylabs_zero_withKey” just to make it clear to my future self that this certificate includes the key with itself and that I won’t find a separate key file as is usually the case. The second field is the name of the PEM file that was previously created and is already on the appliance.

The certificate is successfully installed:

The next step is to replace the default NetScaler certificate with this one. This can be done via the GUI or the CLI, as in this Citrix article. The GUI is a bit of a chore here, so I went the CLI way. 

And that’s it! Now I can access my NetScalers over SSL using Chrome, with no issues. 

Clusters & Quorum

I spent yesterday refreshing my knowledge of clusters and quorum. Hadn’t worked on these since Windows Server 2003! So here’s a brief intro to this topic:

There are two types of clusters – Network Load Balancing (NLB) and Server Clusters.

Network Load Balancing is “share all” in that every server has a full copy of the app (and the data too if it can be done). Each server is active, and requests are sent to them randomly (or using some load distributing algorithm). You can easily add/remove servers. Examples where you’d use NLB are SMTP servers, Web servers, etc. Each server in this case is independent of the others as long as they are configured identically.

Server Clusters is “share nothing” in that only one server has a full copy of the app and is active. The other servers are in a standby mode, waiting to take over if the active one fails. A shared storage is used, which is how standby servers can take over if the active server fails.

The way clusters work is that clients see one “virtual” server (not to be confused with virtual servers of virtualization). Behind this virtual server are the physical servers (called “cluster nodes” actually) that make up the cluster. As far as clients are concerned there’s one end point – an IP address or MAC address – and what happens behind that is unknown to them. This virtual server is “created” when the cluster forms, it doesn’t exist before that. (It is important to remember this because even if the servers are in a cluster the virtual server may not be created – as we shall see later).

In the case of server clusters something called “quorum” comes into play.

Imagine you have 5 servers in a cluster. Say server 4 is the active server and it goes offline. Immediately, the other servers detect this and one of them becomes the new active server. But what if server 4 isn’t offline, it’s just disconnected from the rest of the group. Now we’ll have server 4 continuing to be active, but the other servers can’t see this, and so one of these too becomes the active server – resulting in two active servers! To prevent such scenarios you have the concept of quorum – a term you might have heard of in other contexts, such as decision making groups. A quorum is the minimum number of people required to make a decision. Say a group of 10 people are deciding something, one could stipulate that at least 6 members must be present during the decision making process else the group is not allowed to decide. A similar concept applies in the case of clusters.

In its simplest form you designate one resource (a server or a disk) as the quorum and whichever cluster contains that resource sets itself as the active cluster while the other clusters deactivate themselves. This resource also holds the cluster database, which is a database containing the state of the cluster and its nodes, and is accessed by all the nodes. In the example above, initially all 5 servers are connected and can see the quorum, so the cluster is active and one of the servers in it is active. When the split happens and server 4 (the currently active server) is separated, it can no longer see the quorum and so disables the cluster (which is just itself really) and stops being the active server. The other 4 servers can still see the quorum, so they continue being active and set a new server as the active one.

In this simple form the quorum is really like a token you hold. If your cluster holds the token it’s active; if it does not you disband (deactivate the cluster).
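The token idea can be sketched in a few lines of Python. This is a toy model with made-up names: each partition is a set of things that can see each other, and "seeing the quorum" is modelled as the quorum resource being in that set.

```python
def active_partition(partitions, quorum_resource):
    """Return the partition that can see the quorum resource, or None."""
    for members in partitions:
        if quorum_resource in members:
            return members
    return None


# Server 4 gets cut off from the other servers and from the quorum disk.
split = [
    {"server4"},
    {"server1", "server2", "server3", "server5", "quorum-disk"},
]
survivor = active_partition(split, "quorum-disk")

# If the quorum resource itself is offline, nobody holds the token.
nobody = active_partition([{"server4"}, {"server1", "server2"}], "quorum-disk")
```

The second call shows the weakness of this mode: when the quorum resource disappears, no partition can stay active.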

This simple form of quorum is usually fine, but does not scale to when you have clusters across sites. Moreover, the quorum resource is a single point of failure. Say that resource is the one that’s disconnected or offline – now no one has the quorum, and worse, the cluster database is lost. For this reason the simple form of quorum is not used nowadays. (This mode of quorum is called “Disk Only” by the way).

There are three alternatives to the Disk Only mode of quorum.

One mode is called “Node Majority” and as the name suggests it is based on majority. Here each node has a copy of the cluster database (so there’s no single point of failure) and whichever cluster has more than half the nodes of the cluster wins. So in the previous example, say the cluster of 5 servers splits into one of 3 and 2 servers each – since the first cluster has more than half of the nodes, that wins. (In practice a voting takes place to decide this. Each node has a vote. So cluster one has 1+1+1 = 3 votes; cluster two has 1+1 = 2 votes. Cluster one wins).
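The voting rule is simple enough to write down. A minimal Python sketch, assuming one vote per node as described above (the function name is made up):

```python
def has_quorum(votes_in_partition, total_votes):
    """A partition has quorum only with strictly more than half of all votes."""
    return 2 * votes_in_partition > total_votes


TOTAL_NODES = 5
cluster_one_wins = has_quorum(3, TOTAL_NODES)   # 1+1+1 = 3 votes
cluster_two_wins = has_quorum(2, TOTAL_NODES)   # 1+1 = 2 votes
```

Comparing `2 * votes` with the total avoids any rounding questions when the total is odd.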

Quorums based on majority have a disadvantage in that if the number of nodes is even then you can have a tie. That is, if the above example were a 6 node cluster and it split into two clusters of 3 nodes each, both clusters would deactivate as neither has more than half the nodes (i.e. neither has the quorum). You need a tie breaker!

It is worth noting that the effect of quorum extends to the number of servers that can fail in a cluster. Say we have a 3 node cluster. For the cluster to be valid, it must have at least 2 servers. If one server fails, the cluster still has 2 servers so it will function as usual. But if one more server fails – or these two servers are disconnected from each other – the remaining cluster does not have quorum (there’s only 1 node, which is not more than half the nodes) and so the cluster stops and the servers deactivate. This means that even though we have one server still running, and intuitively one would expect that server to continue servicing requests (as it would have in the case of an NLB cluster), it does not do so in the case of a server cluster due to quorum! This is important to remember.
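The same arithmetic gives the number of node failures a “Node Majority” cluster can survive. A small Python sketch, assuming one vote per node and no witness (function names are made up):

```python
def votes_needed(total_nodes):
    """Smallest number of nodes that is more than half of the cluster."""
    return total_nodes // 2 + 1


def failures_tolerated(total_nodes):
    """How many nodes can fail before the remainder loses quorum."""
    return total_nodes - votes_needed(total_nodes)


for n in (2, 3, 5, 6):
    print(n, "nodes: need", votes_needed(n), "- can lose", failures_tolerated(n))
```

Note that a 6 node cluster tolerates no more failures than a 5 node one, which is another way of seeing why even node counts need a tie breaker.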

Another mode is called “Node & Disk Majority” and as the name suggests it is a mix of the “Node Majority” and “Disk Only” modes. This mode is for clusters with an even number of nodes (where, as we know, “Node Majority” fails) and the way it works is that a resource (a disk, usually called a “disk witness”) is designated as quorum and counted alongside the nodes, so the cluster that has the majority once the disk witness is included is the active one. Essentially the disk witness acts as the tie breaker. (In practice the disk witness has a vote of its own. A cluster with 6 nodes therefore has 7 votes in total, and if it splits into two clusters of 3 nodes each, the one that holds the disk witness has 3+1 = 4 votes and hence wins quorum).

In “Node & Disk Majority” mode, unlike the “Disk Only” mode the cluster database is present with all the nodes and so it is not a single point of failure either.
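A rough Python sketch of the vote counting in this mode, assuming one vote per node plus one vote for the disk witness (names are made up for illustration):

```python
def wins_quorum(nodes_in_partition, holds_witness, total_nodes):
    """True if the partition holds more than half of all votes.

    Total votes = one per node plus one for the disk witness.
    """
    total_votes = total_nodes + 1
    partition_votes = nodes_in_partition + (1 if holds_witness else 0)
    return 2 * partition_votes > total_votes


# A 6 node cluster splits 3/3; only one half can reach the disk witness.
half_with_witness = wins_quorum(3, True, 6)      # 4 of 7 votes
half_without_witness = wins_quorum(3, False, 6)  # 3 of 7 votes
```

Because the witness makes the total number of votes odd, a 3/3 split can no longer end in a tie.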

The last mode is called “Node & File Share Majority” and this is a variant of the “Node Majority” mode. This mode too is for clusters with an even number of nodes, and it works similar to “Node & Disk Majority” except for the fact that instead of a resource a file share is used. A file share (called a “file witness” in this case) is selected on any server – not necessarily a part of the cluster – and one node in the cluster locks a file on this share, effectively telling others that it “owns” the file share. So instead of using a resource as a tie breaker, ownership of the file share is used as the tie breaker. (As in the “Node & Disk Majority” mode the node that owns the file share has an additional vote as it owns the file share). Using the previous examples, if a cluster with 6 nodes splits into 3 nodes each, whichever cluster has the node owning the file share will win quorum while the other cluster will deactivate. If the cluster with 6 nodes splits into clusters of 4 and 2 nodes each, and say the 2 node cluster owns the file share, it will still lose as more than half the nodes are in the 4 node cluster (in terms of votes the winning cluster has 4 votes, the losing cluster has 3 votes). When the cluster deactivates, the current owner will release the lock and a node in the new cluster will take ownership.

An advantage of the “Node & File Share Majority” mode is that the file share can be anywhere – even on another cluster (preferably). The file share can also be on a node in the cluster, but that’s not preferred as if the node fails you lose two votes (that of being the file share owner as well as the node itself).
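The “lose two votes” point is easiest to see with a small cluster. The Python sketch below is a simplified model with made-up names: two nodes plus a file share witness, where the witness vote counts for the survivors only if the machine hosting the share is still up.

```python
def keeps_quorum(nodes, witness_host, failed):
    """True if the cluster still has a majority after `failed` goes down."""
    total_votes = len(nodes) + 1                        # nodes plus the witness
    node_votes = sum(1 for n in nodes if n != failed)
    witness_vote = 0 if witness_host == failed else 1   # share dies with its host
    return 2 * (node_votes + witness_vote) > total_votes


nodes = ["node-a", "node-b"]

# File share hosted on a cluster node: losing that node costs two votes.
share_on_node = keeps_quorum(nodes, witness_host="node-a", failed="node-a")

# File share hosted elsewhere: losing node-a costs only one vote.
share_elsewhere = keeps_quorum(nodes, witness_host="fileserver", failed="node-a")
```

With the share on node-a the survivor has 1 of 3 votes and stops; with the share elsewhere it has 2 of 3 and carries on.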

At work we have two HP LeftHand boxes in a SAN cluster. The only quorum mode used by these is “Node Majority”, which as we know fails for an even number of nodes, and so HP supplies a virtual appliance called “Failover Manager” that is installed on the ESX hosts and is used as the tie breaker. If the Failover Manager is powered off and both LeftHands happen to come online at the same time, a cluster is not formed as neither has the quorum. To avoid such situations the Failover Manager has to be present when they power on, or we have to time the powering on such that one of the LeftHands is online before the other.