Creative Commons Attribution 4.0 International License
© Rakhesh Sasidharan

Notes on ADFS

I have been trying to read on ADFS nowadays. It’s my new area of interest! :) Wrote a document at work sort of explaining it to others, so here’s bits and pieces from that.

What does Active Directory Federation Services (ADFS) do?

Typically when you visit a website you’d need to login to that website with a username/ password stored on their servers, and then the website will give you access to whatever you are authorized to. The website does two things basically – one, it verifies your identity; and two, it grants you access to resources.

It makes sense for the website to control access, as these are resources with the website. But there’s no need for the website to control identity too. There’s really no need for everyone who needs access to a website to have user accounts and passwords stored on that website. The two steps – identity and access control – can be decoupled. That’s what ADFS lets us do.

With ADFS in place, a website trusts someone else to verify the identity of users. The website itself is only concerned with access control. Thus, for example, a website could have trusts with (say) Microsoft, Google, Contoso, etc. and if a user is able to successfully authenticate with any of these services and let the website know so, they are granted access. The website itself doesn’t receive the username or password. All it receives are “claims” from a user.

What are Claims?

A claim is a statement about “something”. Example: my username is ___, my email address is ___, my XYZ attribute is ___, my phone number is ____, etc.

When a website trusts our ADFS for federation, users authenticate against the ADFS server (which in turn uses AD or some other pool to authenticate users), and the ADFS server then passes a set of claims to the website. Thus the website has no info on the (internal) AD username, password, etc. All the website sees are the claims, using which it can decide what to do with the user.

Claims are per trust. Multiple applications can use the same trust, or you could have a trust per application (latter more likely).

All the claims pertaining to a user are packaged together into a security token.

What is a Security Token?

A security token is a signed package containing claims. It is what an ADFS server sends to a website – basically a list of claims, signed with the private key of the ADFS server’s token signing certificate. We would have sent the public key part of this certificate to the website while setting up the trust with them; thus the website can verify our signature and know the tokens came from us.
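To make the sign-and-verify idea concrete, here is a toy sketch in Python. It is not how ADFS builds its tokens (those are signed XML or JWT documents backed by a real certificate); it uses textbook RSA with a deliberately tiny, made-up key purely to show that the issuer signs with a private value while the website only ever needs the public one. All names and claim values are hypothetical.

```python
import hashlib
import json

# Toy RSA key built from two known Mersenne primes. Far too small for real use;
# a real token signing certificate carries a much larger key.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus (shared with the website)
E = 65537                          # public exponent (shared with the website)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (never leaves the issuer)


def _digest(claims: dict) -> int:
    """Hash the claims in a canonical form, reduced to fit the toy modulus."""
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def issue_token(claims: dict) -> dict:
    """Issuer side: package the claims and sign them with the private key."""
    return {"claims": claims, "signature": pow(_digest(claims), D, N)}


def verify_token(token: dict) -> bool:
    """Website side: check the signature using only the public key."""
    return pow(token["signature"], E, N) == _digest(token["claims"])


token = issue_token({"upn": "alice@contoso.com", "email": "alice@contoso.com"})

# Someone changes a claim in transit but cannot produce a matching signature.
tampered = {
    "claims": {**token["claims"], "upn": "mallory@contoso.com"},
    "signature": token["signature"],
}
```

Calling verify_token(token) succeeds, while verify_token(tampered) fails, which is the property the website relies on when it trusts the claims it receives.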

Relying Party (RP) / Service Provider (SP)

Refers to the website/ service who is relying on us. They trust us to verify the identity of our users and have allowed access for our users to their services.

I keep saying “website” above, but really I should have been more generic and said Relying Party. A Relying Party is not limited to a website, though that’s how we commonly encounter it.

Note: Relying Party is the Microsoft terminology.

ADFS cannot be used for the following:

  • Accessing file shares or print servers
  • Accessing Active Directory resources
  • Accessing Exchange (O365 excepted)
  • Connecting to servers using RDP
  • Authenticating to “older” web applications (the application needs to be claims aware)

A Relying Party can be another ADFS server too. Thus you could have a setup where a Relying Party trusts an ADFS service (which is the Claims Provider in this relationship), and the ADFS service in turn trusts a bunch of other ADFS servers depending on (say) the user’s location (so the trusting ADFS service is a Relying Party in this relationship).

Claims Provider (CP) / Identity Provider (IdP)

The service that actually validates users and then issues tokens. ADFS, basically.

Note: Claims Provider is the Microsoft terminology.

Security Token Service (STS)

The service within ADFS that accepts requests and creates and issues security tokens containing claims.

Relying Party Trust

Refers to the trust between a Relying Party and Identity Provider. Tokens from the Identity Provider will be signed with the Identity Provider’s token signing key – so the Relying Party knows it is authentic. Similarly requests from the Relying Party will be signed with their certificate (which we can import on our end when setting up the trust).

Web Application Proxy (WAP)

Access to an ADFS server over the Internet is via a Web Application Proxy. This is a role in Server 2012 and above – think of it as a reverse proxy for ADFS. The ADFS server is within the network; the WAP server is in the DMZ and exposed to the Internet (at least port 443). The WAP server doesn’t need to be domain joined. All it has is a reference to the ADFS server – either via DNS, or even just a hosts file entry. The WAP server too contains the public certificates of the ADFS server.

Miscellaneous

  • ADFS Federation Metadata – this is a cool link that is published by the ADFS server (unless we have disabled it). It is https://<your-adfs-fqdn>/FederationMetadata/2007-06/FederationMetadata.xml and contains all the info required by a Relying Party to add the ADFS server as a Claims Provider.
    • This also includes Base64 encoded versions of the token signing and token decrypting certificates.
  • SAML Entity ID – not sure of the significance of this yet, but this too can be found in the Federation Metadata file. It is usually of the form http://<your-adfs-fqdn>/adfs/services/trust and is required by the Relying Party to set up a trust to the ADFS server.
  • SAML endpoint URL – this is the URL where users are sent to for authentication. Usually of the form https://<your-adfs-fqdn>/adfs/ls. This information too can be found in the Federation Metadata file.
  • Link to my post on ADFS Certificates.
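As a rough illustration of what a Relying Party pulls out of the Federation Metadata file, here is a small Python sketch using only the standard library. The XML below is a heavily trimmed, made-up sample (the host name and the certificate text are placeholders); a real file is much larger, but the Entity ID, signing certificate and endpoint sit in the same kinds of elements.

```python
import xml.etree.ElementTree as ET

# Standard SAML metadata and XML signature namespaces.
NS = {
    "md": "urn:oasis:names:tc:SAML:2.0:metadata",
    "ds": "http://www.w3.org/2000/09/xmldsig#",
}

# Hypothetical, trimmed sample. The certificate value is a placeholder.
SAMPLE = """<EntityDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata"
    entityID="http://adfs.example.com/adfs/services/trust">
  <IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <KeyDescriptor use="signing">
      <KeyInfo xmlns="http://www.w3.org/2000/09/xmldsig#">
        <X509Data><X509Certificate>TUlJQy4uLg==</X509Certificate></X509Data>
      </KeyInfo>
    </KeyDescriptor>
    <SingleSignOnService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"
        Location="https://adfs.example.com/adfs/ls/"/>
  </IDPSSODescriptor>
</EntityDescriptor>"""


def summarize_metadata(xml_text: str) -> dict:
    """Return the Entity ID, signing certificates and sign-on endpoints."""
    root = ET.fromstring(xml_text)
    idp = root.find("md:IDPSSODescriptor", NS)
    cert_path = (
        "md:KeyDescriptor[@use='signing']"
        "/ds:KeyInfo/ds:X509Data/ds:X509Certificate"
    )
    return {
        "entity_id": root.get("entityID"),
        "signing_certs": [c.text.strip() for c in idp.findall(cert_path, NS)],
        "sso_endpoints": [
            s.get("Location") for s in idp.findall("md:SingleSignOnService", NS)
        ],
    }


info = summarize_metadata(SAMPLE)
```

Here info["entity_id"] holds the SAML Entity ID and info["sso_endpoints"] the endpoint URL mentioned above.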

Certificate stuff (as a note to myself)

Helping out a bit with the CA at work, so just putting these down here so I don’t forget later.

For managing user certificates: certmgr.msc.

For managing computer certificates: certlm.msc.

Using CA Web enrollment pages and SAN attributes requires EDITF_ATTRIBUTESUBJECTALTNAME2 to be enabled on your CA.

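Enable it thus, from an elevated command prompt on the CA: certutil -setreg policy\EditFlags +EDITF_ATTRIBUTESUBJECTALTNAME2, and then restart the Certificate Services service (net stop certsvc followed by net start certsvc) for the change to take effect.

When making a request, enter the following in the attributes field for the SANs: san:dns=corpdc1.fabrikam.com&dns=ldap.fabrikam.com.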
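If there are many names to add, typing that string by hand gets error-prone. A minimal Python sketch for building it from a list of names (the function name is my own, and it only handles dns= entries):

```python
def san_attribute(dns_names):
    """Build the 'san:dns=...&dns=...' string for the request attributes field."""
    names = [name.strip() for name in dns_names if name.strip()]
    if not names:
        raise ValueError("at least one DNS name is required")
    return "san:" + "&".join(f"dns={name}" for name in names)


attribute = san_attribute(["corpdc1.fabrikam.com", "ldap.fabrikam.com"])
```

The resulting string can be pasted straight into the attributes field.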

 

Find users connected to a NetScaler gateway

Wanted to find out if a certain end-user had connected to our NetScaler gateway. Couldn’t figure out how. (And initially I went the long route of looking at the /tmp/aaadebug.log file – not really needed here!)

It’s easy. Log in to the NetScaler device. Click on “NetScaler Gateway” in the left pane. On the right you will find “Active user sessions” and “ICA Connections”. The former shows users who have authenticated against the gateway, and the latter shows those who have an ICA connection open through the gateway. The lists could be different as a user might have timed out on the gateway but still have an ICA connection open.

Via CLI the former is show aaa session. The latter is show vpn icaConnection. The latter will show connections to the VDA (port 2598 usually).

Event ID 1046 – DHCP server says it is not authorized even though it is authorized!

This problem ate my head for the past 2 days and wasted a lot of time. For such a simple issue it drove me quite mad.

Built a bunch of DCs for our branch offices. One of them gave trouble with the DHCP server. I authorized it successfully, but the service kept complaining that it wasn’t authorized. Event ID 1046.

The DHCP/BINL service on the local machine, belonging to the Windows Administrative domain mydomain.dom, has determined that it is not authorized to start.  It has stopped servicing clients.  The following are some possible reasons for this: 

This machine is part of a directory service enterprise and is not authorized in the same domain.  (See help on the DHCP Service Management Tool for additional information). 

This machine cannot reach its directory service enterprise and it has encountered another DHCP service on the network belonging to a directory service enterprise on which the local machine is not authorized. 

Some unexpected network error occurred.

Did the obvious ones like reboot server :p and restart service :) and un-authorize and re-authorize the server (no errors either time). Also went ahead and removed the role itself and added back. Nothing helped!

Found a helpful post finally that pointed me in the right direction.

  1. I un-authorized the DHCP server.
  2. Opened up AD Sites and Services. 
  3. Browsed to the Services section (which can be enabled from the View menu if not already visible). 
  4. Browsed to the NetServices section within this. 
  5. On the right pane I had an entry for the IP address for the DHCP server I was trying to authorize. Not an entry by name, but by IP. Dunno why. (All other entries were by name, so I am guessing this is a leftover or a mistake by someone in the past). 
  6. I deleted this entry. 
  7. Waited a while, and then authorized the server. 
  8. No errors now!

Screenshot of the offending entry just for the heck of it (the blacked out part was an IP address):

Alternatively one can open ADSI Edit and go to CN=NetServices,CN=Services,CN=Configuration,DC=myDomain,DC=dom. Then delete the entry (as above) from there. 

What’s odd in my case is that the IP that I deleted was assigned to the DHCP server I wanted to authorize. The entry’s name also had CNF followed by a GUID appended to it; AD does this to an object that loses a naming conflict during replication, so it was most likely a leftover duplicate.
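If you have a dump of the distinguished names under NetServices (from ADSI Edit or an LDAP export), a small Python sketch like the one below can flag the suspicious ones: entries named by IP address rather than host name, and entries carrying a CNF marker. The sample DNs and the GUID are made up for illustration.

```python
import ipaddress
import re

# Conflict objects have "\0ACNF:<GUID>" appended to their name.
CNF_MARKER = re.compile(r"CNF:[0-9a-fA-F-]{36}")


def classify(dn: str) -> dict:
    """Inspect the first RDN of a DN and report anything odd about it."""
    first_rdn = re.split(r"(?<!\\),", dn, maxsplit=1)[0]  # split on an unescaped comma
    value = first_rdn.partition("=")[2]
    name = value.split("\\0ACNF:")[0]                     # name without the CNF suffix
    try:
        ipaddress.ip_address(name)
        named_by_ip = True
    except ValueError:
        named_by_ip = False
    return {
        "name": name,
        "conflict": bool(CNF_MARKER.search(value)),
        "named_by_ip": named_by_ip,
    }


# Hypothetical entries: one normal, one named by IP with a CNF suffix.
BASE = "CN=NetServices,CN=Services,CN=Configuration,DC=myDomain,DC=dom"
entries = [
    "CN=dhcp01.mydomain.dom," + BASE,
    "CN=10.10.1.25\\0ACNF:7e2a1c3b-9d4f-4a6b-8c21-5f0e9b7a3d12," + BASE,
]
results = [classify(dn) for dn in entries]
```

Anything that comes back with conflict or named_by_ip set to True is worth a closer look before deleting.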

[Aside] Various Citrix links

Busy with a lot of Citrix and NetScaler work recently. Want to put the various links I came across someplace. 

Was also looking into why Citrix Receiver wasn’t creating shortcuts for our published apps even though we had ticked the option in Citrix Studio. The trick was to mark the application as a Favorite in Receiver and the shortcut would be created. Here’s some links I had found while reading on this:

Script to run esxcli unmap on all datastores attached to an ESXi host

It’s a good idea to periodically run the UNMAP command on all your thin-provisioned LUNs. This allows the storage system to reclaim deleted blocks. (What is SCSI UNMAP?)

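The format of the command is esxcli storage vmfs unmap -l <datastore-label> (or -u <datastore-UUID> to target the datastore by UUID instead).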

I wanted to make a script to run this on all attached datastores so here’s what I came up with:

The esxcli storage filesystem list command outputs a list of datastores attached to the system. The second column (the volume name) is what I am interested in, so that’s what awk extracts for me. I don’t want to target any local datastores, so I use grep to keep only the ones I am interested in.
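The same filtering logic can be sketched in Python. This is not the awk/grep script itself, just the idea restated: take the second column of the listing, keep the volumes whose names match, and build one unmap command per volume. The sample listing, the sizes in it and the "SAN-" prefix are made up, and the sketch only builds the command strings rather than running them.

```python
# Hypothetical output of "esxcli storage filesystem list" (values are made up).
SAMPLE_LISTING = """\
Mount Point                                        Volume Name  UUID                                 Mounted  Type    Size  Free
-------------------------------------------------  -----------  -----------------------------------  -------  ------  ----  ----
/vmfs/volumes/5a0c1d2e-3f4a5b6c-7d8e-001122334455  SAN-DS01     5a0c1d2e-3f4a5b6c-7d8e-001122334455     true  VMFS-5   100    40
/vmfs/volumes/5a0c1d2f-1a2b3c4d-5e6f-001122334455  SAN-DS02     5a0c1d2f-1a2b3c4d-5e6f-001122334455     true  VMFS-5   200    90
/vmfs/volumes/5a0c1d30-9f8e7d6c-5b4a-001122334455  local-esx01  5a0c1d30-9f8e7d6c-5b4a-001122334455     true  VMFS-5    50    10
"""


def unmap_commands(listing: str, name_prefix: str) -> list:
    """Return one 'esxcli storage vmfs unmap' command per matching datastore."""
    commands = []
    for line in listing.splitlines()[2:]:   # skip the header and the dashes
        fields = line.split()
        if len(fields) < 2:
            continue
        volume = fields[1]                  # second column, like awk's $2
        if volume.startswith(name_prefix):  # plays the role of the grep filter
            commands.append(f"esxcli storage vmfs unmap -l {volume}")
    return commands


commands = unmap_commands(SAMPLE_LISTING, "SAN-")
```

Like the awk approach, this assumes datastore names contain no spaces.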

Next step would be to add this to a cron job. Got to follow the instructions here, it looks like. 

Windows CLI – find groups you are a member of

I knew of doing a gpresult /v and finding the group membership. An even better (and faster) way is whoami /groups.

Other useful whoami switches.

[Aside] Misc ADFS links

Update: To test ADFS as an end-user, go to https://<adfsfqdn>/adfs/ls/IdpInitiatedSignon.aspx. Should get a page where you can sign in and select what trusts are present.

Bored :)

Watching “Cosmos: A Space Time Odyssey” nowadays. 

Also completed “The Leftovers” Season 1 yesterday. Great show, especially the last few episodes where there’s a lot of talk about purpose and such – which is still in my head. 

Now sitting at a restaurant, bored, eating chicken kababs. Thinking: life is so easy for humans now. Ages ago we had to hunt for food. Each day was an unknown – whether we would get something or not. Life itself was unsure. Survival during hunting for instance. But now – here I am, a chubby spectacled nerd who probably wouldn’t have survived at all during the hunter days, sitting here eating a piece of chicken with a fork. How things have changed. 

I guess that’s why we have no sense of purpose now. Back then we had a purpose – survival; fight against nature. Now we have troubles but most of it isn’t of the survival kind. So there’s no sense of purpose – there’s an emptiness. Nothing to do. Work is what we try to find purpose in. But that’s not really purpose. Mostly it’s a means to earn money. And it’s filled with politics and whatnot. It’s not purely about “your” purpose – it’s about the company and the people in it etc. 

Ok time to go, cutting this short! :)

[Aside] AD Sites, Subnets, Trusts, etc.

  • How Domain Controllers are Located Across Trusts – this is a delightful article. I don’t know why, but I simply loved the way the author presented the information. Very logically written. Wish I could write blog posts with such clarity.
    • Praise aside, it is a good article on how subnet and site definitions are used to find a Domain Controller closest to you, and especially how it works across forest trusts.
  • Using Catch-All subnets in AD – Wanted to know how catch-all subnets in AD Sites will interact with specific ones. This one explained it. The specific one takes precedence. Which is exactly what you want. :)
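The "specific one takes precedence" rule is just a longest-prefix match. A minimal Python sketch of it, with made-up subnets and site names:

```python
import ipaddress

# Hypothetical subnet-to-site mapping: a catch-all /8 plus a specific /24.
SUBNETS = {
    "10.0.0.0/8": "HQ-CatchAll",
    "10.20.30.0/24": "Branch-A",
}


def site_for(ip: str, subnets: dict = SUBNETS):
    """Return the site of the most specific subnet containing ip, or None."""
    address = ipaddress.ip_address(ip)
    matches = [
        (ipaddress.ip_network(subnet), site)
        for subnet, site in subnets.items()
        if address in ipaddress.ip_network(subnet)
    ]
    if not matches:
        return None
    # The longest prefix (largest prefixlen) is the most specific match.
    network, site = max(matches, key=lambda pair: pair[0].prefixlen)
    return site
```

A client at 10.20.30.7 matches both subnets and lands in Branch-A, while 10.99.1.1 only matches the catch-all and lands in HQ-CatchAll.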

[Aside] NetScaler tracing, telnet, etc.

Doing a telnet from the NetScaler to a server is not a useful way to troubleshoot connectivity issues. The telnet may or may not succeed, but the result doesn’t mean anything: the telnet is initiated from the NSIP, whereas all NetScaler communication to its services happens from the SNIP. The only option in such cases is to create a service bound to that port & protocol, and monitor that.

At work, for instance, we had STA issues. So I created an HTTP service, bound to port 80, for each Delivery Controller. Then I created a new HTTP monitor that checks for /Scripts/CtxSta.dll and expects return code 406. This also lets me create an nstrace against this service itself to see what’s happening.

  • nstrace reference
    • Set the packet size to 0 and file size to 0.
    • Expression will be something like CONNECTION.SVCNAME.EQ("mySTA_svc_name")
  • There’s also nstcpdump.sh, which is a lighter version of nstrace. Less details, but quicker to get up and running. I prefer nstrace. :)
  • A blog post with examples of both.

PowerShell – Find all AD users with ACL inheritance disabled

Quick one-liner to find all AD user objects with ACL inheritance disabled:

Another one:

 

ADFS errors and WID

Spent a bit of time today tracking down an ADFS/ WID issue. Turned out to be a silly one in the end (silly on my part actually, should have spotted the cause right away!) but it was a good learning exercise in the end. 

The issue was that ADFS refused to launch after a server reboot. The console gave an error that it couldn’t connect to the configuration database. The ADFS service refused to start and the event logs were filled with errors such as these:

The Federation Service configuration could not be loaded correctly from the AD FS configuration database.

Additional Data
Error:
ADMIN0012: OperationFault

There was an error in enabling endpoints of Federation Service. Fix configuration errors using PowerShell cmdlets and restart the Federation Service.

Additional Data
Exception details:
System.ServiceModel.FaultException`1[Microsoft.IdentityServer.Protocols.PolicyStore.OperationFault]: ADMIN0012: OperationFault (Fault Detail is equal to Microsoft.IdentityServer.Protocols.PolicyStore.OperationFault).

A SQL operation in the AD FS configuration database with connection string Data Source=np:\\.\pipe\microsoft##wid\tsql\query;Initial Catalog=AdfsConfiguration;Integrated Security=True failed.

Additional Data

Exception details:
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server)

The last one repeated many times. 

I hadn’t installed the ADFS server in our firm so I had no clue how it was set up. Importantly, I didn’t know it used the Windows Internal Database (WID) which you can see in the error messages above. It is possible to have ADFS work with SQL for a larger setup but that wasn’t the case here. Following some blog posts on the Internet (this and this) I downloaded SQL Server Management Studio (SSMS) and tried connecting to the WID at the path given in the error (\\.\pipe\microsoft##wid\tsql\query). That didn’t work for me – it just gave me some errors that the SQL server was unreachable.

BTW, according to one of the blog posts it is better to launch SSMS as the user who has rights to connect to the WID database (the service account under which your ADFS service runs for instance). That didn’t help in my case (not saying the advice is incorrect, my issue was something else). Found a Microsoft blog post too that confirmed I was connecting to the correct server name – \\.\pipe\microsoft##wid\tsql\query for Windows 2012 and above; \\.\pipe\MSSQL$MICROSOFT##SSEE\sql\query for Windows 2003 & 2008 – but no go. 

That’s when I realized the WID has its own service. I had missed this initially. Trying to start that gave an error that it couldn’t start due to a login failure. This service runs under an account NT SERVICE\MSSQL$MICROSOFT##WID and looks like it didn’t have logon as service rights. It looks like someone had played around with our GPOs (or moved this server to a different OU) and this account had lost its rights. 

The fix is simple – just give this account rights via GPO (or exclude the server from whatever GPO is fiddling with logon as a service rights; or move this server to some other OU). Since the NT SERVICE\MSSQL$MICROSOFT##WID is not a regular account, you can’t add it to GPO from any server (because the account is local and will only exist if WID is installed). So I opened GPMC on my ADFS server and modified the GPO to give this account logon as a service rights. 

Generating certificates with SAN in NetScaler (to make it work with Chrome and other browsers)

I want to create a certificate for my NetScaler and get it working in Chrome. Creating a certificate is easy – there are Citrix docs etc for it – but Chrome keeps complaining about missing subjectAlternativeName. This is because Chrome 58 and upwards ignore the Common Name (CN) field in a certificate and only check the Subject Alternative Names (SAN) field. Other browsers too might ignore the CN field if the SAN field is present (they are supposed to at least); so as a best practice it’s a good idea to fill the SAN field in my NetScaler certificate and put all the names (including the CN) in this field. 

Problem is the NetScaler web UI doesn’t have an option for specifying the SAN field. Windows CA (which is what I use internally) supports SAN when making requests, but since the CSR is usually created on the NetScaler and that doesn’t have a way of mentioning SAN, I need an alternative approach. 

Here’s one approach from a Citrix blog post. Typically the CLI loving geek in me would have taken that route and stopped at that, but today I feel like exploring GUI options. :)

So I came across the DigiCert Certificate Utility and a guide on how to generate a CSR using that. I don’t need to use the guide entirely as my CA is internal, but the tool (download link) is useful. So I downloaded it and created a certificate request. 

A bit of background on the above. I have two NetScalers: ns105-01.rockylabs.zero (IP 10.10.1.150) and ns105-02.rockylabs.zero (IP 10.10.1.160) in an HA pair. For management purposes I have a SNIP 10.10.1.170 (DNS name ns105.rockylabs.zero) which I can connect to without bothering which is the current primary. So I want to create a certificate that will be valid for all three DNS names and IP addresses. Hence in the Subject Alternative Names field I fill in all three names and IP addresses – note: all three names including the one I put in the common name, since Chrome ignores the common name field (and other browsers are supposed to ignore the CN if SAN is present).

I click Generate and the tool generates a new CSR. I save this someplace. 

Now I need to use this CSR to generate a certificate. Typically I would have gone with the WebServer template in my internal CA, but thing is eventually I’ll have to import this CSR, the generated certificate, and the private key of that certificate to the NetScaler – and the default WebServer template does not allow key exporting. 

So I make a new template on my CA. This is just a copy of the default “Web Server” template, but I make a change to allow exporting of the private key (see checkbox below).

Then I create a certificate on my CA using this CSR. 

“WebServer_withKey” is the template name (as opposed to the template display name), and the template name is what certreq expects. With certreq -submit it is passed as -attrib "CertificateTemplate:WebServer_withKey".

This will create the certificate and save it at a location I specify. 

At this point I have the CSR and the certificate. I can’t import these into the NetScaler as that also requires the private key. The DigiCert tool generates the private key automatically and keeps it with itself, so we need to import this certificate into the tool and export with key from there. This exports the certificate, along with key, into a PFX format. 

This Citrix article is a good reference on the various certificate formats. It also gives instructions on how to import a PFX certificate into NetScaler.

Before proceeding however, a quick summary of the certificate formats from the same article for my own reference:

  • PFX is a format for storing a server certificate or any intermediate certificate along with private key in one encrypted file. 
    • PFX == PKCS#12 (i.e. both terms can be used interchangeably). 
  • PEM is another format. And a very common one actually. It can contain both certificates and keys, or only either separately. 
    • These are Base64 encoded ASCII files and have extensions such as .pem, .crt, .cer, or .key. 
  • DER is a binary form of the PEM format. (So while PEM formats can be opened in Notepad, for instance, as a text file, DER format cannot). 
    • These are binary files. Have extensions such as .cer and .der. (Note: .cer can be a PEM format too).
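Since PEM is essentially Base64-wrapped DER with a header and footer, converting between the two is simple. A minimal Python sketch of that relationship for a single certificate (the bytes used at the end are placeholder data, not a real certificate):

```python
import base64

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def looks_like_pem(data: bytes) -> bool:
    """PEM files are text and start with a BEGIN line; DER files are binary."""
    return data.lstrip().startswith(b"-----BEGIN ")


def der_to_pem(der: bytes) -> str:
    """Base64 encode the DER bytes, wrap at 64 characters, add header and footer."""
    body = base64.b64encode(der).decode("ascii")
    lines = [body[i:i + 64] for i in range(0, len(body), 64)]
    return "\n".join([PEM_HEADER, *lines, PEM_FOOTER]) + "\n"


def pem_to_der(pem: str) -> bytes:
    """Strip the header and footer and Base64 decode what is in between."""
    lines = [line.strip() for line in pem.strip().splitlines()]
    if lines[0] != PEM_HEADER or lines[-1] != PEM_FOOTER:
        raise ValueError("not a PEM encoded certificate")
    return base64.b64decode("".join(lines[1:-1]))


# Placeholder bytes standing in for a DER certificate.
fake_der = b"\x30\x82" + bytes(range(200))
pem_text = der_to_pem(fake_der)
```

This is also a quick way to tell what a .cer file really is: if looks_like_pem() returns True on its contents it is PEM, otherwise it is most likely DER.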

So I go ahead and import the PFX file.

And then I install a new certificate created from this imported PFX file. 

Note: After taking the screenshot I changed the first field (certificate-key pair name) to “ns105_rockylabs_zero_withKey” just to make it clear to my future self that this certificate includes the key with itself and that I won’t find a separate key file as is usually the case. The second field is the name of the PEM file that was previously created and is already on the appliance.

The certificate is successfully installed:

The next step is to go ahead and replace the default NetScaler certificate with this one. This can be done via GUI or CLI as in this Citrix article. The GUI is a bit of a chore here, so I went the CLI way.

And that’s it! Now I can access my NetScalers over SSL using Chrome, with no issues. 

[Aside] SPNs

Trying to get people at work to clean up duplicate SPNs, and came across some links while reading about this topic. 

From the official MSDN article: A service principal name (SPN) is a unique identifier of a service instance. SPNs are used by Kerberos authentication to associate a service instance with a service logon account. This allows a client application to request that the service authenticate an account even if the client does not have the account name.

Basically when a client application tries to authenticate with a service instance and the domain controller needs to issue it Kerberos tickets, the domain controller needs to know whose password to use for the service instance – is it that of the server where this instance runs, or that of a service account responsible for it. This mapping of service -> service account/ computer account is an SPN. It’s of the format service/host:port and is associated with the AD account of the service account or computer account (stored in the servicePrincipalName attribute actually).
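To show what "duplicate SPN" means in practice, here is a small Python sketch that splits an SPN into its parts and finds SPNs registered on more than one account. The account names and SPNs are made up; on a real domain the built-in setspn -X does this check for you.

```python
from collections import defaultdict


def parse_spn(spn: str) -> dict:
    """Split 'service/host:port' into its parts (the port is optional)."""
    service, _, rest = spn.partition("/")
    if not service or not rest:
        raise ValueError(f"not a valid SPN: {spn!r}")
    host, _, port = rest.partition(":")
    return {"service": service, "host": host, "port": port or None}


def duplicate_spns(accounts: dict) -> dict:
    """Map each SPN found on more than one account to the accounts holding it."""
    owners = defaultdict(set)
    for account, spns in accounts.items():
        for spn in spns:
            owners[spn.lower()].add(account)  # compare case-insensitively
    return {spn: sorted(names) for spn, names in owners.items() if len(names) > 1}


# Hypothetical servicePrincipalName values per account.
accounts = {
    "svc-sql": ["MSSQLSvc/db01.contoso.com:1433"],
    "DB01$": ["mssqlsvc/db01.contoso.com:1433", "HOST/db01.contoso.com"],
}
duplicates = duplicate_spns(accounts)
```

Here the SQL SPN shows up on both the service account and the computer account, so the domain controller cannot tell whose password to use for the ticket, which is exactly the problem with duplicates.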

That’s all!