DNS zone expired before it could be updated

Earlier this week Outlook stopped working for all our users. The Exchange server itself was fine, but Outlook showed as disconnected.

Checked the server. The name exchange.mydomain could not be resolved but the IP address itself was ping-able.

As a quick fix I wrote a PowerShell script that added a hosts file entry mapping exchange.mydomain to the server's IP address on all machines. This got all computers working while I investigated further. It is very easy to do this via PowerShell. All you need is the Remote Server Administration Tools (RSAT), which let you install the Active Directory module for PowerShell. This gives you cmdlets such as Get-ADComputer, through which you can get a list of all computers in the OU. Pipe this through ForEach-Object and put in a Test-Connection to target only computers that are online. I was lazy and made a fresh hosts file on my computer with this mapping, and copied that to all the computers in our network. I could do this because I know all machines have the same hosts file, but it’s always possible to just insert the mapping into each computer’s file rather than copy a fresh one.
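The "insert the mapping rather than copy a fresh file" alternative can be sketched roughly as below. My actual fix was a PowerShell script that copied a file; this is Python purely to show the logic. The function name `add_hosts_entry`, the address 192.0.2.10 and the sample file contents are all made up for illustration.

```python
def add_hosts_entry(hosts_text: str, ip: str, name: str) -> str:
    """Return hosts_text with an `ip name` line appended, unless name is already mapped."""
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into address and host names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and name.lower() in (f.lower() for f in fields[1:]):
            return hosts_text  # already mapped; leave the file untouched
    # Make sure the new entry starts on its own line.
    if hosts_text and not hosts_text.endswith("\n"):
        hosts_text += "\n"
    return hosts_text + f"{ip}\t{name}\n"


# Hypothetical example: 192.0.2.10 stands in for the Exchange server's address.
existing = "127.0.0.1\tlocalhost\n"
updated = add_hosts_entry(existing, "192.0.2.10", "exchange.mydomain")
print(updated)
```

On each machine you would read the existing hosts file (on Windows normally `C:\Windows\System32\drivers\etc\hosts`), pass its contents through a function like this, and write the result back. Because it does nothing when the name is already mapped, running it twice is harmless.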

Anyhow, after that I checked the mydomain DNS server and noticed that the zone had unloaded. This was a secondary zone that refreshed itself from two masters in our US office. Tried pinging the servers – both were unreachable. Oops! Then I remembered that our firewall does not permit ICMP packets to these servers. So I tried telnetting to port 53 of the servers. First one worked, second did not. Ah ha! So one of the servers is definitely down. Still – the zone should have refreshed from the first server, so why didn’t it?
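The telnet test above is just a TCP connection attempt to port 53, which is also the port zone transfers use, so it is a better reachability check than ping when ICMP is blocked. A minimal sketch of the same check, in Python rather than telnet; the function name `tcp_port_open` and the 3 second default timeout are my own choices, not anything from the DNS server.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling `tcp_port_open(master, 53)` for each master server would have given me the same "first one worked, second did not" answer in one go. Note that a `True` result only shows the port accepts connections, not that the server will actually hand over the zone.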

Next I checked the event logs of my DNS server. Found an entry stating that the zone had expired before either master server could be contacted, and so the zone had been disabled. Interesting. So I right-clicked the zone and did a manual reload, and sure enough it worked (the first server is reachable after all).

It’s odd the zone failed in the first place though! I checked its settings and noticed the expiry period is set to one day. A secondary zone only expires when no refresh succeeds for the whole expiry period, so it looks like both servers must have been unreachable for that entire day. Contacted our US team and sure enough they had some maintenance work going on, so it’s likely both servers were unreachable from our offices. Cool!
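To make the timing concrete: a secondary zone stops answering once its expire interval has passed without a successful refresh from any master. Here is a small sketch of that rule in Python. The timestamps are made up, and `zone_expired` is a hypothetical helper, not something the DNS server exposes.

```python
from datetime import datetime, timedelta


def zone_expired(last_refresh: datetime, now: datetime, expire: timedelta) -> bool:
    """True once `expire` has elapsed since the last successful refresh."""
    return now - last_refresh >= expire


# One-day expiry as configured on the zone; the dates below are invented.
expire = timedelta(days=1)
last_ok = datetime(2024, 1, 1, 9, 0)  # last successful refresh

print(zone_expired(last_ok, datetime(2024, 1, 2, 8, 0), expire))   # 23 hours later -> False
print(zone_expired(last_ok, datetime(2024, 1, 2, 10, 0), expire))  # 25 hours later -> True
```

With an expiry of only one day, a maintenance window on the masters that lasts about that long is enough to take the zone down, which fits what happened here.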