For no apparent reason my home testlab went wonky today! Not entirely surprising. The DCs in there are not always on/ connected; and I keep hibernating the entire lab as it runs off my laptop so there’s bound to be errors lurking behind the scenes.
Anyways, after a reboot my main DC was acting weird. For one it took a long time to start up – indicating DNS issues, but that shouldn’t be the case as I had another DC/ DNS server running – and after boot up DNS refused to work. Gave the above error message. The Event Logs were filled with two errors:
- Event ID 4000: The DNS server was unable to open Active Directory. This DNS server is configured to obtain and use information from the directory for this zone and is unable to load the zone without it. Check that the Active Directory is functioning properly and reload the zone. The event data is the error code.
- Event id 4007: The DNS server was unable to open zone <zone> in the Active Directory from the application directory partition <partition name>. This DNS server is configured to obtain and use information from the directory for this zone and is unable to load the zone without it. Check that the Active Directory is functioning properly and reload the zone. The event data is the error code.
A quick Google search brought up this Microsoft KB. Looks like the DC has either lost its secure channel with the PDC, or it holds all the FSMO roles and is pointing to itself as a DNS server. Either of these could be the culprit in my case as this DC indeed had all the FSMO roles (and hence was also the PDC), and so maybe it lost trust with itself? Pretty bad state to be in, having no trust in oneself … ;-)
The KB article is worth reading for possible resolutions. In my case since I suspected DNS issues in the first place, and the slow loading usually indicates the server is looking to itself for DNS, I checked that out and sure enough it was pointing to itself as the first nameserver. So I changed the order, gave the DC a reboot, and all was well!
In case the DC had lost trust with itself the solution (according to the KB article) was to reset the DC password. Not sure how that would reset trust, but apparently it does. This involves using the
netdom command which is installed on Server 2008 and up (as well as on Windows 8 or if RSAT is installed and can be downloaded for 2003 from the Support Tools package). The command has to be run on the computer whose password you want to reset (so you must login with an account whose initials are cached, or use a local account). Then run the command thus:
netdom resetpwd /server:<PDC.domain.com> /userd:<Domain\domain_admin> /passwordd:*
Of course the computer must have access to the PDC. And if you are running it on a DC the KDC service must be stopped first.
I have used
netdom in the past to reset my testlab computer passwords. Since a lot of the machines are usually offline for many days, and after a while AD changes the computer account password but the machine still has the old password, when I later boot up the machine it usually gives are error like this: “The trust relationship between this workstation and the primary domain failed.”
A common suggestion for such messages is to dis-join the machine from the domain and re-join it, effectively getting it a new password. That’s a PITA though – I just use
netdom and reset the password as above. :)