Bug in Server 2016 and DNS zone delegation with CNAME records

A colleague at work mentioned a Server 2016 DNS zone delegation bug he had found. I found just one post on the Internet when I searched for this. According to my colleague Microsoft has now confirmed this as a bug in the support call he raised. 

DNS being an area of interest I wanted to replicate the issue and post it – so here goes. Hopefully it makes sense. :)

Imagine a split-DNS scenario for a zone rockylabs.zero

  • This zone is hosted externally by a DNS server (doesn’t matter what OS/ software) called data01
  • This zone is hosted internally by two DNS servers: a Server 2012R2 (called DC2012-01), and a Server 2016 (called DC2016-01). 

Now say there’s a record rakhesh.rockylabs.zero that is the same both internally and externally. As in, we want both internal and external users to get the same (external) IP address for this record. 

What you would typically do is add this record to your external DNS server and create a delegation from your two internal DNS servers, for this record, to the external DNS server. Here’s some screenshots:

The zone on my external DNS server. Notice I have an A record for rakhesh.rockylabs.zero.  

Ignore the rakhesh2.rockylabs.zero record for now. That comes in later. :)

Here’s a look at the delegation from my Server 2012R2 internal DNS server to the external DNS server for the rakhesh.rockylabs.zero record. Basically I create a delegation within the rockylabs.zero zone on the internal server, for the rakhesh domain, and point it to the external DNS server. On the external DNS server rakhesh.rockylabs.zero is defined as an A record so that will be returned as an answer when this delegation chain is followed. 

In my case both the internal DNS servers are also DCs, and the rockylabs.zero zone is AD integrated, so a similar delegation is automatically created on the other DNS server too. 

As would be expected, I am able to resolve this A record correctly from both internal DNS servers.

Now for the fun part!

Notice the rakhesh2.rockylabs.zero record on my external DNS server? Unlike rakhesh.rockylabs.zero this one is a CNAME record. This too should be common for both internal and external users. Shouldn’t be a problem really as it should work similarly to the A record. Following the chain of delegation when I resolve rakhesh2.rockylabs.zero to a CNAME record called rakhesh.com, my DNS server should automatically resolve the A record for rakhesh.com and give me its address as the answer for rakhesh2.rockylabs.zero.  It works with the Server 2012R2 internal DNS server as expected – 

But breaks for the 2016 internal DNS server!

And that’s it! That’s the bug basically. 

Here’s the odd bit though. If I were to query rakhesh.com (the domain to which the CNAME record points to), and then try to resolve the delegated record, it works!

If I go ahead and clear the cache on that 2016 internal server and try the name resolution again, it’s broken as before.

So the issue is that the 2016 DNS Server is able to follow the delegation for rakhesh2.rockylabs.zero to the external DNS server and resolve it to rakhesh.com, but it is doesn’t then go ahead and lookup rakhesh.com to get its A record. But if the A record for rakhesh.com is already cached with it, it is sensible enough to return that address. 

I dug a bit more into this by enabling debug logging on the 2016 server. Here’s what I found.

The 2016 server receives my query:

It passes this on to the external server (10.10.1.11 is data01 – external DNS server where rakhesh2.rockylabs.zero is delegated to). FYI, I am truncating the output here:

It gets a reply with the CNAME record. So far so good. 

Now it queries the external DNS server (data01 – 10.10.1.11) asking for the A record of rakhesh.com! That’s two wrong things: 1) Why ask the external DNS server (who as far as the internal DNS server knows is only delegated the rakhesh2.rockylabs.zero zone and has nothing to do with rakhesh.com) and 2) why ask it for the A record instead of the NS record so it can find the name servers for rakhesh.com and ask those for the IP address of rakhesh.com

It’s pretty much downhill from there, coz as expected the external DNS server replies saying “doh I don’t know” and gives it a list of root servers to contact. FYI, I truncated the output:

The 2016 internal DNS server now replies to the client with a fail message. As expected. 

So now we know why the 2016 server is failing. 

Until this is fixed one workaround would be to create a CNAME record directly in the internal DNS server to whatever the external DNS server points to. That is, don’t delegate to external – just create the same record internally too. Only for CNAME records; A is fine. Here’s an example of it working with a record called rakhesh3.rockylabs.zero where I simply made a CNAME on the internal 2016 DNS server. 

That’s all for now!

Update: This bug seems to be fixed now. I didn’t encounter it in some of my recent Server 2016 installs. I went through the update history from March 2018 but didn’t find any mention of this getting fixed. The closest DNS fix was this one for large queries. (I chose March 2018 randomly as I noticed the issue was fixed in July 2018 so it’s likely the issue was fixed much before). Update you Server 2016 to the latest patches and you should be fine!