ADFS errors and WID

Spent a bit of time today tracking down an ADFS/WID issue. Turned out to be a silly one (silly on my part actually, I should have spotted the cause right away!) but it was a good learning exercise in the end.

The issue was that ADFS refused to launch after a server reboot. The console gave an error that it couldn’t connect to the configuration database. The ADFS service refused to start and the event logs were filled with errors such as these:

The Federation Service configuration could not be loaded correctly from the AD FS configuration database.

Additional Data
Error:
ADMIN0012: OperationFault

There was an error in enabling endpoints of Federation Service. Fix configuration errors using PowerShell cmdlets and restart the Federation Service.

Additional Data
Exception details:
System.ServiceModel.FaultException`1[Microsoft.IdentityServer.Protocols.PolicyStore.OperationFault]: ADMIN0012: OperationFault (Fault Detail is equal to Microsoft.IdentityServer.Protocols.PolicyStore.OperationFault).

A SQL operation in the AD FS configuration database with connection string Data Source=np:\\.\pipe\microsoft##wid\tsql\query;Initial Catalog=AdfsConfiguration;Integrated Security=True failed.

Additional Data

Exception details:
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 – Could not open a connection to SQL Server)

The last one repeated many times. 

I hadn’t installed the ADFS server in our firm so I had no clue how it was set up. Importantly, I didn’t know it used the Windows Internal Database (WID), which you can see in the connection string in the error messages above. It is possible to have ADFS work with a full SQL Server for a larger setup but that wasn’t the case here. Following a couple of blog posts on the Internet, I downloaded SQL Server Management Studio (SSMS) and tried connecting to the WID at the path given in the error (\\.\pipe\microsoft##wid\tsql\query). That didn’t work for me; it just gave me errors saying the SQL server was unreachable.
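In hindsight, the connection string in the event log was the giveaway that this is a WID setup. As a rough sketch of that check (the function names here are my own, and it only looks for the two pipe names mentioned in this post), something like this in Python would flag it:

```python
def parse_connection_string(conn_str):
    """Split a 'Key=Value;Key=Value' connection string into a dict.

    Keys are lower-cased so lookups don't depend on casing.
    """
    parts = {}
    for item in conn_str.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            parts[key.strip().lower()] = value.strip()
    return parts


def uses_wid(conn_str):
    """Return True if the Data Source points at a WID named pipe.

    Assumption: only the two instance names from this post are checked
    (microsoft##wid and MICROSOFT##SSEE).
    """
    data_source = parse_connection_string(conn_str).get("data source", "").lower()
    return "microsoft##wid" in data_source or "microsoft##ssee" in data_source


# The connection string exactly as it appears in the event log above.
conn = (r"Data Source=np:\\.\pipe\microsoft##wid\tsql\query;"
        r"Initial Catalog=AdfsConfiguration;Integrated Security=True")

if __name__ == "__main__":
    print(uses_wid(conn))
```

If this reports a WID data source, the thing to look at is the local WID instance on the ADFS server rather than a remote SQL Server.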

BTW, according to one of the blog posts it is better to launch SSMS as a user who has rights to connect to the WID database (the service account under which your ADFS service runs, for instance). That didn’t help in my case (not saying the advice is incorrect; my issue was something else). I also found a Microsoft blog post that confirmed I was connecting to the correct server name (\\.\pipe\microsoft##wid\tsql\query for Windows 2012 and above; \\.\pipe\MSSQL$MICROSOFT##SSEE\sql\query for Windows 2003 & 2008), but still no go.
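To keep the two pipe names straight, here is a small lookup sketch in Python. The function name and the idea of keying on the Windows Server release year are my own simplification; the paths are the ones quoted above.

```python
# Named pipe paths for the WID instance, as quoted in this post.
WID_PIPE_2012_PLUS = r"\\.\pipe\microsoft##wid\tsql\query"
WID_PIPE_2003_2008 = r"\\.\pipe\MSSQL$MICROSOFT##SSEE\sql\query"


def wid_pipe_name(server_release_year):
    """Return the WID pipe path for a Windows Server release year.

    Hypothetical helper: covers only the versions named in the post
    (2003, 2008, and 2012 or later) and raises for anything else.
    """
    if server_release_year >= 2012:
        return WID_PIPE_2012_PLUS
    if server_release_year in (2003, 2008):
        return WID_PIPE_2003_2008
    raise ValueError(f"No known WID pipe for release year {server_release_year}")


if __name__ == "__main__":
    print(wid_pipe_name(2016))
    print(wid_pipe_name(2008))
```

Whichever path applies goes into the "Server name" box in SSMS. Of course, none of this helps if the WID service itself isn’t running, which was my actual problem.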

That’s when I realized the WID has its own service, which I had missed initially. Trying to start it gave an error that it couldn’t start due to a logon failure. The service runs under the account NT SERVICE\MSSQL$MICROSOFT##WID, and it turned out this account didn’t have the “Log on as a service” right. My guess is that someone had played around with our GPOs (or moved this server to a different OU) and the account lost the right along the way.
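Checking the WID service first would have saved me the SSMS detour. Below is a minimal Python sketch that shells out to sc.exe and pulls the state and logon account out of its "KEY : value" style output. A few assumptions: the service name is taken to be MSSQL$MICROSOFT##WID (matching the account name above), the helper names are mine, and the parsing is deliberately naive, so treat it as a starting point rather than a robust tool.

```python
import subprocess

# Assumed service name, inferred from the NT SERVICE\... account in the post.
WID_SERVICE = "MSSQL$MICROSOFT##WID"


def parse_sc_output(text):
    """Turn the 'KEY : value' lines printed by sc.exe into a dict."""
    fields = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return fields


def service_state(sc_query_text):
    """Return the state name (e.g. STOPPED) from 'sc query' output, or None."""
    parts = parse_sc_output(sc_query_text).get("STATE", "").split()
    # The STATE value looks like '<number>  <NAME>'; keep the name.
    return parts[1] if len(parts) > 1 else None


def logon_account(sc_qc_text):
    """Return the account from the SERVICE_START_NAME line of 'sc qc' output."""
    return parse_sc_output(sc_qc_text).get("SERVICE_START_NAME")


def run_sc(verb, service=WID_SERVICE):
    """Run 'sc.exe <verb> <service>' and return its stdout (Windows only)."""
    result = subprocess.run(["sc.exe", verb, service],
                            capture_output=True, text=True)
    return result.stdout
```

On the ADFS server, `service_state(run_sc("query"))` should show whether WID is stopped, and `logon_account(run_sc("qc"))` shows which account needs the “Log on as a service” right.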

The fix is simple: give this account the right via GPO (or exclude the server from whatever GPO is fiddling with the “Log on as a service” right, or move the server to some other OU). Since NT SERVICE\MSSQL$MICROSOFT##WID is not a regular domain account, you can’t add it to the GPO from just any machine: the account is local and only exists where WID is installed. So I opened GPMC on the ADFS server itself and modified the GPO there to give this account the “Log on as a service” right.