GPO errors due to SYSVOL replication issues

So the other day at my test lab I noticed many of my GPOs were giving errors. Ran gpupdate /force and it gave the following message:

User policy could not be updated successfully. The following errors were encountered:

The processing of Group Policy failed. Windows attempted to read the file \\rakhesh.local\SysVol\rakhesh.local\Policies\{F28486EC-7C9D-40D6-A243-F1F733979D5C}\gpt.ini from a domain controller and was not successful. Group Policy settings may not be applied until this event is resolved. This issue may be transient and could be caused by one or more of the following:
a) Name Resolution/Network Connectivity to the current domain controller.
b) File Replication Service Latency (a file created on another domain controller has not replicated to the current domain controller).
c) The Distributed File System (DFS) client has been disabled.
Computer Policy update has completed successfully.

To diagnose the failure, review the event log or run GPRESULT /H GPReport.html from the command line to access information about Group Policy results.

Looked like a temporary thing so I didn’t give it much thought first. But when policies were still not being read after half an hour or so I figured something is broken.

Running gpresult /h didn’t have much to add. I noticed the there were errors similar to the one above. I also noticed that another policy had a version mismatch to the one on the server (this wasn’t highlighted anywhere, I just happened to notice).

I went to the path in the error message on each of my DCs (replace rakhesh.local with your DC name) to find out which DC had a missing file. As the message indicated, it was a replication issue. I had updated the GPO on one of the DCs but looks like it didn’t replicate to some of the other DCs; and my clients were trying to find the policy files on this DC where it didn’t replicate to, thus throwing errors.

Event Logs on the clients didn’t have much to add except for error messages similar to the above.

Event Logs on the DCs were more useful. “Applications and Services Logs” > “DFS Replication” is the place to look here. Interestingly, the DC where I couldn’t find the GPO above reported no errors. It had an entry like this:

The DFS Replication service successfully established an inbound connection with partner WIN-DC01 for replication group Domain System Volume.

Additional Information:
Connection Address Used: WIN-DC01.rakhesh.local
Connection ID: 11C8A7A0-F8A5-4867-B277-78DDC66E3ED3
Replication Group ID: 7C0BF99B-677B-4EDA-9B47-944D532DF7CB

This gives you the impression that all is fine. The other DC though had many errors. One of them was event ID 4012:

The DFS Replication service stopped replication on the folder with the following local path: C:\Windows\SYSVOL\domain. This server has been disconnected from other partners for 118 days, which is longer than the time allowed by the MaxOfflineTimeInDays parameter (60). DFS Replication considers the data in this folder to be stale, and this server will not replicate the folder until this error is corrected.

To resume replication of this folder, use the DFS Management snap-in to remove this server from the replication group, and then add it back to the group. This causes the server to perform an initial synchronization task, which replaces the stale data with fresh data from other members of the replication group.

Additional Information:
Error: 9061 (The replicated folder has been offline for too long.)
Replicated Folder Name: SYSVOL Share
Replicated Folder ID: 0546D0D8-E779-4384-87CA-3D4ABCF1FA56
Replication Group Name: Domain System Volume
Replication Group ID: 7C0BF99B-677B-4EDA-9B47-944D532DF7CB
Member ID: 93D960C2-DE50-443F-B8A1-20B04C714E16

Good! So that explains it. Being a test lab, all my DCs aren’t usually online. In this case, WIN-DC01 is what I always have online but WIN-DC02 is only powered on when I want to test anything specific. Since the two weren’t in touch for more than 60 days, WIN-DC01 had stopped replicating with WIN-DC02. Makes sense – you wouldn’t want stale data from WIN-DC02 to overwrite fresh data on WIN-DC01 after all.

The steps in the Event Log entry didn’t work though. Being a SYSVOL folder there’s no option to remove a server from the replication group in DFS Management.

I searched online for any suggestions related to event ID 4012 and SYSVOL, adn found a Microsoft KB article. This article suggested running two commands which I’d like to post here as a reminder to myself as they are generally useful to quickly pinpoint the source of an error. Being a test lab I have just 2-3 DCs so I can go through them manually; but if I had many DCs it is useful to know how to find the faulty ones fast.

First command simply checks whether the SYSVOL folder is shared on each DC:

Next command uses WMIC to get the replication state of this folder on each DC.

A state of 5 indicates the group is in error. Interestingly, the “Access denied” message is one I had noticed earlier too when running dcdiag on the faulty DC.

I didn’t think much of it that time as there were no errors when running dcdiag from the working DC and also because I ran this command while I thought the error was a temporary thing. I still don’t think this is a permissions issue. My guess is that it is to do with the content freshness error I found in the Event Logs earlier. Possibly due to that the second DC is being denied access to replicate and that’s why I get the “access denied” error above.

From the KB article above I came across another one that has steps on how to force a of SYSVOL from one server to all the others. (In the linked article, the section ‘How to perform an authoritative synchronization of DFSR-replicated SYSVOL (like “D4” for FRS)’ is what I am talking about). Here’s what I did:

  1. I launched adsiedit.msc and connected with the default settings
  2. Then I opened CN=SYSVOL Subscription,CN=Domain System Volume,CN=DFSR-LocalSettings,CN=WIN-DC01,OU=Domain Controllers,DC=rakhesh,DC=local (replace the DC name and domain name with yours; in my case WIN-DC01 is the DC I want to set as authoritative) and …
  3. … I set msDFSR-Enabled=FALSE and msDFSR-options=1

    adsiedit

  4. Next, I opened CN=SYSVOL Subscription,CN=Domain System Volume,CN=DFSR-LocalSettings,CN=WIN-DC02,OU=Domain Controllers,DC=rakhesh,DC=local – WIN-DC02 being the DC that is out of sync – and set msDFSR-Enabled=FALSE.
  5. For AD replication through out the domain:

  6. The result of the above steps will be that SYSVOL replication is disabled through out the domain. I confirmed that by going to the “DFS Replication” logs and noting an event with ID 4114:

    The replicated folder at local path C:\Windows\SYSVOL\domain has been disabled. The replicated folder will not participate in replication until it is enabled. All data in the replicated folder will be treated as pre-existing data when this replicated folder is enabled.

    Additional Information:
    Replicated Folder Name: SYSVOL Share
    Replicated Folder ID: 0546D0D8-E779-4384-87CA-3D4ABCF1FA56
    Replication Group Name: Domain System Volume
    Replication Group ID: 7C0BF99B-677B-4EDA-9B47-944D532DF7CB
    Member ID: 09E14860-47F6-49EE-820E-43F7ED015FEB

  7. Then I went back to CN=SYSVOL Subscription,CN=Domain System Volume,CN=DFSR-LocalSettings,CN=WIN-DC01,OU=Domain Controllers,DC=rakhesh,DC=local but this time I set msDFSR-Enabled=TRUE.
  8. Again, I forced AD replication through out the domain.
  9. Next, I ran dfsrdiag pollad on WIN-DC01 (the DC with the correct copy) to perform an initial sync with AD. (Note: My DC was a Server 2012 Core install. This didn’t have dfsrdiag by default. I installed it via Install-WindowsFeature RSAT-DFS-Mgmt-Con in PowerShell).
  10. I checked the Event Logs of WIN-DC01 to confirm SYSVOL is now replicating (event ID 4602 should be present):

    The DFS Replication service successfully initialized the SYSVOL replicated folder at local path C:\Windows\SYSVOL\domain. This member is the designated primary member for this replicated folder. No user action is required. To check for the presence of the SYSVOL share, open a command prompt window and then type “net share”.

    Additional Information:
    Replicated Folder Name: SYSVOL Share
    Replicated Folder ID: 0546D0D8-E779-4384-87CA-3D4ABCF1FA56
    Replication Group Name: Domain System Volume
    Replication Group ID: 7C0BF99B-677B-4EDA-9B47-944D532DF7CB
    Member ID: 93D960C2-DE50-443F-B8A1-20B04C714E16
    Read-Only: 0

  11. Finally, I went back to CN=SYSVOL Subscription,CN=Domain System Volume,CN=DFSR-LocalSettings,CN=WIN-DC01,OU=Domain Controllers,DC=rakhesh,DC=local and set msDFSR-Enabled=TRUE. You can see what’s happening, don’t you? First we disabled SYSVOL replication on all the DCs. Then we enabled it on the authoritative DC and got it to sync to AD. And now we are re-enabling it on all the other DCs so they get a fresh copy from the authoritative DC.
  12. Lastly, I did a dfsrdiag pollad on WIN-DC02 … and while it should have worked without any issues, I got the following error:

    C:\Users>dfsrdiag pollad
    [ERROR] Access is denied when connecting to WMI services on computer: WIN-DC02.r
    akhesh.local

    Operation Failed

However, Event Logs on WIN-DC02 showed that SYSVOL was now replicating successfully and clients are now able to download GPOs successfully. So it looks like the “access denied” error is something else altogether (as I suspected initially, yaay!). I’ll have to look into that later – I know in the past I’ve fiddled with the WMI settings on this machine so maybe I made some change that I forgot to reset.

Good stuff!