Citrix breaks after removing the root zone from your DNS server?

Two years ago I had removed the root zone on our DNS servers at work. Coz who needs root zones if your DC is only answering internal queries, i.e. for zone sit has. Right?

Well, that change broke our Citrix environment. :) Users could connect to our NetScaler gateway but couldn’t launch any resource after that. 

Our Citrix chaps logged a call with our vendor etc and they gave some bull about the DNS server not responding to TCP queries etc. Yours truly wasn’t looking after Citrix or NetScalers back then, so the change was quietly rolled back as no one had any clue why it broke Citrix. 

Fast forward to yesterday, I had to do the change again coz now we want our DNS servers to resolve external names too – i.e. use root hints and all that goodness! I did the change, and Citrix broke! Damn. 

But luckily now Citrix has been rolled into our team and I know way more about how Citrix works behind the scenes. Plus I keep dabbling with NetScalers, so I am not totally clueless (or so I’d like to think!). 

I went into the DNS section of the NetScaler to see what’s up. Turns out the DNS virtual server was marked as down. Odd, coz I could SSH into the NetScaler and do name lookups against that DNS virtual server (which pointed to my internal DC basically). And yes, I could do dig +notcp to force it to do UDP queries only and nothing was broken. So why was the virtual server marked as down?!

I took a look at the monitor on the DNS service and it had the following:

Ok, so what exactly does this monitor do? Click “Edit Monitor” – nothing odd there – click on “Special Parameters” and what do I find? 

Yup, it was set to query for the root zone. Doh! No wonder it broke. 

I have no idea why the DNS monitor was assigned to this service. By default DNS-UDP has the ping-default monitor assigned to it while DNS-TCP has the tcp-default monitor assigned to it.  Am guessing that since our firewall block ICMP from the NetScalers to the DCs, someone decided to use the DNS monitor instead and left it at the default values of monitoring for the root zone. When I removed the root zone that monitor failed, the DNS virtual server was marked as down, and the NetScaler could no longer resolve DNS names for the resources users were trying to connect to. Hence the STA error above. Nice, huh!

Fix is simple. Change the query in the DNS monitor to a zone your DNS servers. Preferably the zone your resources are in. Easy peasy. Made that change, and Citrix began working. 

As might be noticeable from the tone of the post, I am quite pleased at having figured this out. Yes, I know it’s not a biggie … but just, it makes me happy at having figured it coz I went down a logical path instead of just throwing up my hands and saying I have no idea why the DNS service is down or why the monitor is red etc. So I am pleased at that! :)