Notes on AD Replication (contd.)

This post is a continuation of my previous one.

How the AD Replication Model Works

Conflict Resolution

Previously I mentioned that conflict resolution in AD does not depend on timestamps. What is used instead is the “volatility” of changes. Here’s how it works in practice.

Remember the replication metadata stored on each object? The ones you can view using repadmin /showobjmeta. I mentioned five pieces of metadata there – the Local USN, the Originating DSA, the Originating USN, the Originating Timestamp, and the Version. Three of these are used as a conflict resolution stamp for every attribute:

Version, which as we know is updated each time the attribute is updated
Originating Timestamp, which is the timestamp from the DC where the update originated from
Originating DSA, which is not the DSA name or GUID as you’d expect from the repadmin output, but the invocationID of the DSA where the update originated from.

How is this stamp used? If there’s a conflict on an attribute – i.e. a change is made to the same attribute on two separate DCs – first the Version is considered. Whichever update has the higher Version wins. Notice how the timestamp of the change doesn’t matter. Say WIN-DC01 had a change to an attribute twice (thus incrementing the Version twice) while WIN-DC02 had a change to the same attribute once, but at a later time, and both these changes reached WIN-DC03 together – the change from WIN-DC01 will win over the later change from WIN-DC02 because the attribute was changed more times there.

If two conflicting changes have the same Version then the timestamp is considered. This has a one-second resolution, so unless the conflicting changes happened within the exact same second this is usually enough to resolve the conflict.

However, if both Version and timestamp are unable to resolve the conflict, then the invocationID is considered. This is guaranteed to be different for each DC, and is random, so whichever change is from a DC with higher invocationID wins.
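
To make the ordering concrete, here is a minimal sketch of the three-step comparison in Python. The Stamp class, the winner function and the invocationID values are all made up for illustration, and comparing invocationIDs as plain UUID values is my assumption; I don't know the exact GUID ordering AD uses internally.

```python
from dataclasses import dataclass
from datetime import datetime
from uuid import UUID


@dataclass(frozen=True)
class Stamp:
    """Hypothetical model of the per-attribute conflict resolution stamp."""
    version: int
    originating_time: datetime
    invocation_id: UUID

    def sort_key(self):
        # Version first, then the timestamp truncated to one-second
        # resolution, then the invocationID as the final tie-breaker.
        # (UUID ordering here is an assumption, not AD's documented rule.)
        return (
            self.version,
            self.originating_time.replace(microsecond=0),
            self.invocation_id,
        )


def winner(a: Stamp, b: Stamp) -> Stamp:
    """Return the stamp whose update survives the conflict."""
    return a if a.sort_key() >= b.sort_key() else b


# WIN-DC01 changed the attribute twice, WIN-DC02 once but later in time.
dc01 = Stamp(3, datetime(2015, 1, 12, 10, 0, 0), UUID(int=1))
dc02 = Stamp(2, datetime(2015, 1, 12, 10, 5, 0), UUID(int=2))
result = winner(dc01, dc02)  # dc01: the higher Version beats the later time
```

The tuple comparison does all the work: Python compares tuples element by element, which is the same "only look at the next field on a tie" behaviour described above.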

Replication Metadata

The Knowledge Consistency Checker (KCC) (to be discussed in a later post) is the component responsible for maintaining the replication topology. It maintains connection objects with the replication partners and stores this information, for each domain partition, in a multivalued attribute called repsFrom in the root of that domain partition.

For example, here are the replication partners for WIN-DC02. Although not shown here, WIN-DC04 & WIN-DC05 belong to a child domain.

[Image: win-dc02-replications]

Now consider the repsFrom attribute of the domain partition on WIN-DC02:

repsFrom: dwVersion: 2 v1.cb: 500 v1.cConsecutive Failures: 0 v1.timeLastSuccess: 13065528686 v1.timeLastAttempt: 13065528686 v1.ulResultLastAttempt: 0 v1.cbOtherDraOffset: 216v1.cbOtherDra: 284v1.ulReplicaFlags: 805306448 v1.rtSchedule: <skipped> v1.usnvec.usnHighObjUpdate: 65780 v1.usnvec.usnHighPropUpdate: 65780 v1.pszUuidDsaObj: 33398129-7632-4014-a3b4-eabb2b74de8b v1.pszUuidInvocId: 56c05622-3023-450d-8807-5f51401be512 v1.pszUuidTransportObj: 00000000-0000-0000-0000-000000000000 v1.cbPASDataOffset: 0 v1~PasData: (none) v2~pdsa_rpc_inst v2.pszDSIServer 33398129-7632-4014-a3b4-eabb2b74de8b._msdcs.rakhesh.local v2.pszDSIAnnotation (null) v2.pszDSIInstance 33398129-7632-4014-a3b4-eabb2b74de8b._msdcs.rakhesh.local v2.pguidDSIInstance (null);

And here’s the repsFrom from the Configuration partition:

repsFrom (3): dwVersion: 2 v1.cb: 500 v1.cConsecutive Failures: 0 v1.timeLastSuccess: 13065529586 v1.timeLastAttempt: 13065529586 v1.ulResultLastAttempt: 0 v1.cbOtherDraOffset: 216v1.cbOtherDra: 284v1.ulReplicaFlags: 805306448 v1.rtSchedule: <skipped> v1.usnvec.usnHighObjUpdate: 65843 v1.usnvec.usnHighPropUpdate: 65843 v1.pszUuidDsaObj: 33398129-7632-4014-a3b4-eabb2b74de8b v1.pszUuidInvocId: 56c05622-3023-450d-8807-5f51401be512 v1.pszUuidTransportObj: 00000000-0000-0000-0000-000000000000 v1.cbPASDataOffset: 0 v1~PasData: (none) v2~pdsa_rpc_inst v2.pszDSIServer 33398129-7632-4014-a3b4-eabb2b74de8b._msdcs.rakhesh.local v2.pszDSIAnnotation (null) v2.pszDSIInstance 33398129-7632-4014-a3b4-eabb2b74de8b._msdcs.rakhesh.local v2.pguidDSIInstance (null); dwVersion: 2 v1.cb: 500 v1.cConsecutive Failures: 0 v1.timeLastSuccess: 13065529586 v1.timeLastAttempt: 13065529586 v1.ulResultLastAttempt: 0 v1.cbOtherDraOffset: 216v1.cbOtherDra: 284v1.ulReplicaFlags: 112 v1.rtSchedule: <skipped> v1.usnvec.usnHighObjUpdate: 26332 v1.usnvec.usnHighPropUpdate: 26332 v1.pszUuidDsaObj: 3e82a06d-ec61-48a9-ac83-f68623fdfe85 v1.pszUuidInvocId: bc68cc5f-9baf-443b-85d5-ecb056b917fc v1.pszUuidTransportObj: 00000000-0000-0000-0000-000000000000 v1.cbPASDataOffset: 0 v1~PasData: (none) v2~pdsa_rpc_inst v2.pszDSIServer 3e82a06d-ec61-48a9-ac83-f68623fdfe85._msdcs.rakhesh.local v2.pszDSIAnnotation (null) v2.pszDSIInstance 3e82a06d-ec61-48a9-ac83-f68623fdfe85._msdcs.rakhesh.local v2.pguidDSIInstance (null); dwVersion: 2 v1.cb: 500 v1.cConsecutive Failures: 0 v1.timeLastSuccess: 13065529586 v1.timeLastAttempt: 13065529586 v1.ulResultLastAttempt: 0 v1.cbOtherDraOffset: 216v1.cbOtherDra: 284v1.ulReplicaFlags: 112 v1.rtSchedule: <skipped> v1.usnvec.usnHighObjUpdate: 16771 v1.usnvec.usnHighPropUpdate: 16771 v1.pszUuidDsaObj: 1e8f2e00-76c6-4e7c-86da-63a398ee2095 v1.pszUuidInvocId: 682509ae-5766-4690-9226-b969c23612b4 v1.pszUuidTransportObj: 00000000-0000-0000-0000-000000000000 v1.cbPASDataOffset: 0 v1~PasData: 
(none) v2~pdsa_rpc_inst v2.pszDSIServer 1e8f2e00-...

Each entry starts from dwVersion and contains information like the number of failures, time of last successful sync, the DSA GUID, the database GUID, USNs, etc. Since only one DC is replicating with WIN-DC02 for the domain partition there’s only one value for that partition; while there are three DCs replicating for the Configuration partition and so there are three values for that partition.
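
If you want to pick these dumps apart programmatically, something like the following works as a rough sketch. The SAMPLE string is a trimmed copy of two of the values shown above, and the function names are my own. I am also assuming that the time fields are seconds counted from 1601-01-01 UTC; treat that as an assumption to verify rather than a documented fact.

```python
import re
from datetime import datetime, timedelta, timezone

# Trimmed copy of two repsFrom values from the Configuration partition.
SAMPLE = (
    "dwVersion: 2 v1.cb: 500 v1.timeLastSuccess: 13065529586 "
    "v1.usnvec.usnHighPropUpdate: 65843 "
    "v1.pszUuidDsaObj: 33398129-7632-4014-a3b4-eabb2b74de8b "
    "v1.pszUuidInvocId: 56c05622-3023-450d-8807-5f51401be512; "
    "dwVersion: 2 v1.cb: 500 v1.timeLastSuccess: 13065529586 "
    "v1.usnvec.usnHighPropUpdate: 26332 "
    "v1.pszUuidDsaObj: 3e82a06d-ec61-48a9-ac83-f68623fdfe85 "
    "v1.pszUuidInvocId: bc68cc5f-9baf-443b-85d5-ecb056b917fc;"
)

# Matches "v1.someField: value" pairs inside one entry.
FIELD_RE = re.compile(r"(v1\.[\w.]+): (\S+)")


def parse_reps_from(text):
    """Split a dump into entries (each starts at dwVersion) and read the fields."""
    entries = []
    for chunk in text.split("dwVersion:")[1:]:
        fields = {key: value.rstrip(";") for key, value in FIELD_RE.findall(chunk)}
        entries.append(fields)
    return entries


def dstime_to_datetime(seconds):
    """Convert a time field to UTC, assuming seconds since 1601-01-01."""
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=int(seconds))


entries = parse_reps_from(SAMPLE)
for entry in entries:
    print(
        entry["v1.pszUuidDsaObj"],
        entry["v1.usnvec.usnHighPropUpdate"],
        dstime_to_datetime(entry["v1.timeLastSuccess"]),
    )
```

Each dictionary then corresponds to one replication partner, which makes it easy to compare the high-watermark USNs or spot a partner whose last success is old.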

Each DC polls the DSAs (DCs) in this attribute for changes (that’s for scheduled replication, not for the case where the source DC notifies all its partners of an update and they then pull the changes). If a DC is demoted – i.e. its NTDS settings object is deleted and the DSA is no longer valid – the KCC removes this DSA from the attribute. This prevents replication attempts to demoted DCs. (There is one exception, controlled by an attribute called replTopologyStayOfExecution. It has a default of 14 days and a maximum of half the tombstone lifetime (the period for which deleted objects are retained). This attribute existed by default in Windows Server 2003 and prior, and can be set if required in later versions. When it is present and the KCC detects an invalid DSA, instead of removing it from the repsFrom attribute the KCC lets it remain until the DSA has been deleted for longer than replTopologyStayOfExecution.)
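
The removal rule described above can be sketched as a small decision function. This is only my reading of the behaviour, not the KCC's actual logic; the function names, the dates and the 180-day tombstone lifetime in the example are all hypothetical.

```python
from datetime import datetime, timedelta


def effective_stay(stay_days, tombstone_lifetime_days):
    """Cap the stay of execution at half the tombstone lifetime."""
    return timedelta(days=min(stay_days, tombstone_lifetime_days / 2))


def should_drop_from_reps_from(deleted_at, now, stay=None):
    """Decide whether an invalid DSA is removed from repsFrom.

    stay=None models replTopologyStayOfExecution not being set:
    the DSA is removed as soon as it is found to be invalid.
    """
    if stay is None:
        return True
    return now - deleted_at > stay


deleted_at = datetime(2015, 1, 1)           # hypothetical demotion date
stay = effective_stay(14, 180)              # 14-day default, 180 is assumed
after_9_days = should_drop_from_reps_from(deleted_at, datetime(2015, 1, 10), stay)
after_15_days = should_drop_from_reps_from(deleted_at, datetime(2015, 1, 16), stay)
```

With the 14-day value in place the demoted DSA stays in repsFrom on day 9 and is dropped by day 15; without the attribute it is dropped right away.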

Atomicity

Atomicity is a term encountered in databases and operating systems (I first encountered it during my CS classes, specifically the OS course I think). An atomic operation can be thought of as an indivisible operation – meaning all events that take place during an atomic operation either take place together, or they don’t take place at all. (It comes from the idea that an atom was thought to be indivisible.) With respect to databases, this is a guarantee that if a bunch of values are written in an atomic operation, either all the values are written or none of them are. There’s no confusion that perhaps only a few values got committed while the rest were missed out. If even one value fails to be written, none of the others are written either. That’s the guarantee!
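
Here is the all-or-nothing idea shown with a tiny in-memory SQLite database in Python. To be clear, this is only an analogy: AD does not use SQLite, and the table and attribute values are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attrs (name TEXT PRIMARY KEY, value TEXT NOT NULL)")


def write_atomically(conn, updates):
    """Write all (name, value) pairs in one transaction, or none of them."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.executemany(
                "INSERT INTO attrs (name, value) VALUES (?, ?)", updates
            )
        return True
    except sqlite3.IntegrityError:
        return False


# Both values are valid, so both are committed together.
ok = write_atomically(conn, [("givenName", "Ada"), ("sn", "Lovelace")])

# The second value violates NOT NULL, so the first one is rolled back too.
failed = write_atomically(conn, [("title", "Engineer"), ("mail", None)])

rows = [row[0] for row in conn.execute("SELECT name FROM attrs ORDER BY name")]
```

After the second call, "title" is not in the table even though it was a perfectly valid value on its own. That is exactly the guarantee described above.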

In the context of AD, updates are written to the AD database. And all attribute updates to a single object will be written atomically in a single transaction. However:

If the attributes are linked attributes (remember from the previous post that some attributes have forward and back links, e.g. member and memberOf) the updates won’t be atomic. That is not too surprising: they belong to different objects after all, and the back link is usually generated by the system rather than sent as an update.
Remember: the maximum number of values that can be written in a single transaction is 5000.
To ensure that (nonlinked) attributes of an object are written atomically, updates to nonlinked attributes are prioritized over updates to linked attributes. This happens when the source DC packages the updates into replication packets. When it comes to writing the updates to the destination DC’s database though:
- For linked attributes, because of parent-child relationships the objects might be written in a different order from the one in which the updates were received. This ensures that objects are created before any links are applied to them.
- When an object already exists on the destination DC, even though nonlinked attributes are replicated first, they are not guaranteed to be written to the database first. Generally they are applied first, but it’s not guaranteed. (Note to self: I am not very clear about this point).
Remember: the number of values in a replication packet is approximately 100. If there are more than 100 values, the source DC again tries to fit the nonlinked attributes into one packet, while the linked attributes can span multiple packets. In such cases, when they are written to the destination DC’s database, all the updates to a single object can require multiple transactions. (They are still guaranteed to be written in the same replication cycle.)
Note: Only originating updates must be applied in the same database transaction. Replicated updates can be applied in more than one database transaction.
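
The packaging behaviour on the source DC can be pictured with a short sketch. This is a simplification based only on the description above, not the real packet format: the build_packets function, the tuple layout and the fixed limit of 100 are assumptions for illustration.

```python
PACKET_VALUES = 100  # approximate number of values per packet, per the text


def build_packets(updates, limit=PACKET_VALUES):
    """Group (attribute, value, is_linked) updates into packets.

    Nonlinked values are placed first so they land together in the
    earliest packet; linked values fill the rest and may span packets.
    """
    # sorted() is stable and False sorts before True, so nonlinked
    # values come first and keep their original relative order.
    ordered = sorted(updates, key=lambda update: update[2])
    return [ordered[i:i + limit] for i in range(0, len(ordered), limit)]


# A group object with 3 nonlinked changes and 250 new member links.
updates = [("member", f"CN=user{i}", True) for i in range(250)]
updates += [
    ("description", "Sales team", False),
    ("mail", "sales@example.com", False),
    ("info", "updated", False),
]

packets = build_packets(updates)
sizes = [len(packet) for packet in packets]
first_three = [attr for attr, _, _ in packets[0][:3]]
```

Here the three nonlinked values travel in the first packet while the 250 member values spill over into the second and third, which is why the destination DC may need more than one transaction for the same object.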