PowerShell: Add a line to multiple files

Trivial stuff, but I don’t get to use PowerShell as much as I would like to so I end up forgetting elementary things that should just be second nature to me. Case in point: I wanted to add a line to a bunch of text files. Here’s what I came up with – just posting it here as a reference to my future self.
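A one-liner along these lines does the job (a minimal sketch; the folder path and file filter are placeholders for wherever your config files live):

    # Append the auth-user-pass line to every OpenVPN config file in the folder
    Get-ChildItem -Path 'C:\OpenVPN\config' -Filter *.ovpn |
        ForEach-Object { Add-Content -Path $_.FullName -Value 'auth-user-pass pia.txt' }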

For the background behind this, I use Private Internet Access for my VPN needs and since last month or so my ISP’s been blocking traffic to it. Private Internet Access offers a client that lets me connect to their servers via UDP or TCP. The UDP option began failing but the TCP option surprisingly worked. Of course I don’t want to use TCP as that’s slow, and so I went to Private Internet Access’s website where they give a bunch of OpenVPN config files we can use. These are cool in the sense that some of them connect via IP (instead of server name) while others connect to well-known ports or use a well-known local port, so there’s less chance of being blocked by the ISP. In my case it turned out just connecting via IP was more than enough, so it looks like the ISP isn’t blocking OpenVPN UDP ports, it’s just blocking UDP traffic to these server names.

Anyhow, the next step was to stop the client from prompting for a username/ password. OpenVPN has an option auth-user-pass which lets you specify a file-name where the username and password are on separate lines. So all I had to do was create this file and add a line such as auth-user-pass pia.txt to all the configuration files. That’s what the code snippet above does.

Active Directory: Troubleshooting with DcDiag (part 1)

This post originally began as notes on troubleshooting Domain Controller critical services. But when I started talking about DcDiag I went off on a tangent explaining each of its tests. That took much longer than I expected – both in terms of effort and post length – so I decided to split it into a post of its own. My notes aren’t complete yet; what follows below is only the first part.

While writing this post I discovered a similar one from the Directory Services Team. It’s by Ned Pyle, who’s just great when it comes to writing cool posts that explain things, so you should definitely check it out.

DcDiag is your best friend when it comes to troubleshooting AD issues. It is a command-line tool that can identify issues with AD. By default DcDiag will run a series of “default” tests on the DC it is invoked on, but it can be asked to run more tests and also test multiple DCs in the site (the /a switch) or across all sites (the /e switch). A quick glance at the DcDiag output is usually enough to tell me where to look further.

For instance, while typing this post I ran DcDiag to check all DCs in one of my sites:
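The command itself is just DcDiag with the /a switch, run from one of the DCs in the site:

    dcdiag /a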

I ran the above from WIN-DC01 and you can see I was straight-away alerted that WIN-DC03 could be having issues. I say “could be” because the errors only say that DcDiag cannot contact the RPC server on WIN-DC03 to check for those particular tests – this doesn’t necessarily mean WIN-DC03 is failing those tests, just that maybe there’s a firewall blocking communication or perhaps the RPC service is down. To confirm this I ran the same tests on WIN-DC03 and they succeeded, indicating that WIN-DC03 itself is fine and there’s a communication problem between DcDiag on WIN-DC01 and WIN-DC03. Moreover, DcDiag from WIN-DC03 can query WIN-DC01, so the issue is on WIN-DC03’s side. (In this particular case it was the firewall on WIN-DC03.)

Here’s a list of the tests DcDiag can perform:

Advertising

  • Checks whether the Directory System Agent (DSA) is advertising itself. The DSA is a set of services and processes running on every DC. The DSA is what allows clients to access the Active Directory data store. Clients talk to the DSA using LDAP (used by Windows XP and above), SAM (used by Windows NT), MAPI RPC (used by Exchange server and other MAPI clients), or RPC (used by DCs/DSAs to talk to each other and replicate AD information). More info on the DSA can be found in this Microsoft document.
  • You can think of the DSA as the kernel of the DC – the DSA is what lets a DC behave like a DC, the DSA is what we are really talking about when referring to DCs.
  • Although DNS is used by domain members (and other DCs) to locate DCs in the domain, for a DC to be actually used by others the DSA must be advertising the roles it provides. The nltest command can be used to view what roles a DSA is advertising. For example:
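    For example, something along these lines (the /server and /dsgetdc switches query a particular DC for the rakhesh.local domain):

        nltest /server:WIN-DC01 /dsgetdc:rakhesh.local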

    Notice the flags section. Among other things the DSA advertises that this DC holds the PDC FSMO role (PDC), is a Global Catalog (GC), and that it is a reliable time source (GTIMESERV). Compare the above output with another DC:
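    The same query against another DC would be something like this (assuming WIN-DC02 is that other DC):

        nltest /server:WIN-DC02 /dsgetdc:rakhesh.local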

    The PDC, GC, and GTIMESERV flags advertised by WIN-DC01 are missing here because this DC does not provide any of those roles. Being a DC it can act as a time source for domain members, hence the TIMESERV flag is present.

  • When DCs replicate they refer to each other via the DSA name rather than the DC name (further reinforcing my point from before that the DSA can be thought of as the kernel of the DC – it is what really matters).

    That is why even though a DC in my domain may have the DNS name WIN-DC01.rakhesh.local, in the _msdcs sub-domain that AD uses (which I’ll come to later) there’s an entry such as bdb02ab9-5103-4254-9403-a7687ba91488._msdcs.rakhesh.local which is a CNAME to the regular name. These CNAME entries are created by the Netlogon service and are of the format DsaGuid._msdcs.DNSForestName – the CNAME hostname is actually the GUID of the DSA.

  • If you open Active Directory Sites and Services, drill down to a site, then Servers, then expand a particular server – you’ll see the “NTDS Settings” object. This is the DSA. If you right click this object, go to Properties, and select the “Attribute Editor” tab, you will find an attribute called objectGUID. That is the GUID of the DSA – the same GUID that’s there in the CNAME entry.
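    If you prefer PowerShell, a sketch of the same lookup using the ActiveDirectory module (the DN is from my lab):

        # Read the objectGUID of the NTDS Settings (DSA) object of WIN-DC01
        Get-ADObject -Identity 'CN=NTDS Settings,CN=WIN-DC01,CN=Servers,CN=COCHIN,CN=Sites,CN=Configuration,DC=rakhesh,DC=local' -Properties objectGUID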

CheckSDRefDom

Before talking about CheckSDRefDom it’s worth talking about directory partitions (also called Naming Contexts (NCs)).

An AD domain is part of a forest. A forest can contain many domains. All these domains share the same schema and configuration, but different domain data. Every DC in the forest thus has some data that’s particular to the domain it belongs to and is replicated with other DCs in the domain; and other data that’s common to the forest and replicated with all DCs in the forest. These are what’s referred to as directory partitions / naming contexts.

Every DC holds four kinds of directory partitions. These can be viewed using the ADSI Edit (adsiedit.msc) tool.

  • “Default naming context” (also known as “Domain”) which contains the domain specific data;
  • “Configuration” (CN=Configuration,DC=forestRootDomain) which contains the configuration objects for the entire forest; and
  • “Schema” (CN=Schema,CN=Configuration,DC=forestRootDomain) which contains class and attribute definitions for all existing and possible objects of the forest. Even though the Schema partition hierarchically looks like it is under the Configuration partition, it is a separate partition.
  • “Application” (DC=...,DC=forestRootDomain – there can be many such partitions) which were introduced in Server 2003 and are user/ application defined partitions that can contain any object except security principals. The replication of these partitions is not bound by domain boundaries – they can be replicated to selected DCs in the forest even if they are in different domains.
    • Common examples of Application partitions are DC=ForestDnsZones,DC=forestRootDomain and DC=DomainDnsZones,DC=forestRootDomain which hold DNS zones replicated to all DNS servers in the forest and domain respectively (note that they are not replicated to all DCs in the forest and domain respectively, only a subset of the DCs – the ones that are also DNS servers).

 

If you open ADSI Edit and connect to the RootDSE “context”, then right click the RootDSE container and check its namingContexts attribute you’ll find a list of all directory partitions, including the multiple Application partitions.


Here you’ll also find other attributes such as:

  • defaultNamingContext (DN of the Domain directory partition the DC you are connected to is authoritative for),
  • configurationNamingContext (DN of the Configuration directory partition),
  • schemaNamingContext (DN of the Schema directory partition), and
  • rootDomainNamingContext (DN of the Domain directory partition for the Forest Root domain)
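A quick way to see all of these without opening ADSI Edit is the Get-ADRootDSE cmdlet (a sketch; needs the ActiveDirectory module):

    # List every directory partition plus the well-known naming context attributes
    Get-ADRootDSE | Select-Object namingContexts, defaultNamingContext, configurationNamingContext, schemaNamingContext, rootDomainNamingContext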

The Configuration partition has a container called Partitions (CN=Partitions,CN=Configuration,DC=forestRootDomain) which contains cross-references to every directory partition in the forest – i.e. Application, Schema, and Configuration directory partitions, as well as all Domain directory partitions. The beauty of cross-references is that they are present in the Configuration partition and hence replicated to all DCs in the forest. Thus even if a DC doesn’t hold a particular NC it can check these cross-references and identify which DC/ domain might hold more information. This makes it possible to refer clients asking for more info to other domains.

The cross-references are actually objects of a class called crossRef.

  • What the CheckSDRefDom test does is that it checks whether the cross-references have an attribute called msDS-SDReferenceDomain set.
  • What does this mean?
    • An Application NC, by definition, isn’t tied to a particular domain. That makes it tricky from a security point of view because if its ACL has a security descriptor referring to groups/ users that could belong to any domain (e.g. “Domain Admins”, “Administrator”) there’s no way to identify which domain must be used as the reference.
    • To avoid such situations, cross references to Application directory partitions contain an msDS-SDReferenceDomain attribute which specifies the reference domain.
  • So what the CheckSDRefDom test really does is that it verifies all the Application directory partitions have a reference domain set.
    • In case a reference domain isn’t set, you can always set it using ADSI Edit or other techniques. You can also delegate this.
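You can eyeball the reference domains yourself too. A sketch using the ActiveDirectory module (the forest DN is from my lab):

    # List every cross-reference in the Partitions container along with its reference domain (if any)
    Get-ADObject -SearchBase 'CN=Partitions,CN=Configuration,DC=rakhesh,DC=local' -LDAPFilter '(objectClass=crossRef)' -Properties nCName, 'msDS-SDReferenceDomain' |
        Select-Object Name, nCName, 'msDS-SDReferenceDomain'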

CheckSecurityError

  • Checks for any security related errors on the DC that might be causing replication issues.
  • Some of the tests done are:
    1. Verify that a KDC is working (not necessarily on the target DC; the test only checks that a KDC server is reachable anywhere in the domain, preferably in the same site – even if the target DC’s KDC service is down but some other KDC server is reachable, the test passes)
    2. Verify that the DC’s computer object exists and is within the “Domain Controllers” OU and replicated to other DCs
    3. Check for any KDC packet fragmentation that could cause issues
    4. Check KDC time skew (remember I mentioned the 5 minute tolerance previously)
    5. Check Service Principal Name (SPN) registration (I’ll talk about SPNs in a later post; check this link for a quick look at what they are and the errors they can cause)
  • This test is not run by default. It must be explicitly specified.
  • Can specify an optional parameter /replsource:... to perform similar tests on that DC and also check the ability to create a replication link between that DC and the DC we are testing against.
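An invocation would look something like this (server names are from my lab):

    dcdiag /s:WIN-DC03 /test:CheckSecurityError /replsource:WIN-DC01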

Connectivity

  • This is the only DcDiag test that you cannot skip. It runs by default, and is also run even if you perform a specific test.
  • It tests whether the DSAs are registered in DNS, whether they are ping-able, and have LDAP/ RPC connectivity.

CrossRefValidation

Before talking about CrossRefValidation it’s worth revisiting cross-references and application NCs.

Application NCs are actually objects of a class domainDNS with an instanceType attribute value of 5 (DS_INSTANCETYPE_IS_NC_HEAD | DS_INSTANCETYPE_NC_IS_WRITEABLE).

You can create an application NC, for instance, by opening up ADSI Edit and going to the Domain NC, right click, new object, of type domainDNS, enter a Domain Component (DC) value you want, click Next, then click “More Attributes”, select to view Mandatory/ Both types of properties, find instanceType from the property drop list, and enter a value of 5.
The above can be done anywhere in the domain NC. It is also possible to nest application NCs within other application NCs.
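If you’d rather not click through ADSI Edit, ntdsutil can create an application NC too. A sketch (the partition DN and hosting DC are from my lab; treat the exact quoting as an assumption):

    ntdsutil "partition management" "create nc DC=SomeApp2,DC=rakhesh,DC=local WIN-DC01.rakhesh.local" quit quit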

Here’s what happens behind the scenes when you make an application NC as above:

  • The application NC isn’t created straight away.
  • First, the DSA will check the cross-references in CN=Partitions,CN=Configuration,DC=forestRootDomain to see if one already exists to an Application NC with the same name as you specified.
    • If a cross-reference is found and the NC it points to actually exists then an error will be thrown.
    • If a cross-reference is found but the NC it points to doesn’t exist, then that cross-reference will be used for the new Application NC.
    • If a cross-reference cannot be found, a new one will be created.
  • Cross references (objects of class crossRef) have some important attributes:
    1. CN – the CN of this cross-reference (could be a name such as “CN=SomeApp” or a random GUID “CN=a97a34e3-f751-489d-b1d7-1041366c2b32”)
    2. nCName – the DN of the application NC (e.g. DC=SomeApp,DC=rakhesh,DC=local)
    3. dnsRoot – the DNS domain name where servers that contain this NC can be found (e.g. SomeApp.rakhesh.local).

      (Note this as it’s quite brilliant!) When a new application NC is created, the DSA also creates a corresponding zone in DNS. This zone contains all the servers that carry the partition. In DNS, for instance, you’d see zones such as DomainDnsZones, ForestDnsZones, and SomeApp2 (which corresponds to an application NC I created). Note that by querying for all SRV records of name _ldap in _tcp.SomeApp2.rakhesh.local one can easily find the DCs carrying this partition (a query sketch follows after this list). For the example above, dnsRoot would be “SomeApp2.rakhesh.local” as that’s the DNS domain name.

    4. msDS-NC-Replica-Locations – a list of Distinguished Names (DNs) of DSAs where this application NC is replicated to (e.g. CN=NTDS Settings,CN=WIN-DC01,CN=Servers,CN=COCHIN,CN=Sites,CN=Configuration,DC=rakhesh,DC=local, CN=NTDS Settings,CN=WIN-DC03,CN=Servers,CN=COCHIN,CN=Sites,CN=Configuration,DC=rakhesh,DC=local). Initially this attribute has only one entry – the DC where the NC was first created. Other entries can be added later.
    5. Enabled – usually not set, but if it’s set to FALSE it indicates the cross-reference is not in use
  • Once a cross-reference is identified (an existing or a new one) the Configuration NC is replicated through the forest. Then the Application NC is actually created (an object of class domainDNS, as mentioned earlier, with an instanceType attribute value of 5 (DS_INSTANCETYPE_IS_NC_HEAD | DS_INSTANCETYPE_NC_IS_WRITEABLE)).
  • Lastly, all DCs that hold a copy of this Application NC have their ms-DS-Has-Master-NCs attribute in the DSA object modified to include a DN of this NC.
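As mentioned under dnsRoot above, a quick way to find the DCs carrying an application NC is an SRV lookup against its DNS zone. A sketch for my SomeApp2 partition:

    nslookup -type=SRV _ldap._tcp.SomeApp2.rakhesh.local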

Back to the CrossRefValidation test, it validates the cross-references and the NCs they point to. For instance:

  • Ensure dnsRoot is valid (see here and here for some error messages)
  • Ensure nCName and other attributes are valid
  • Ensure the DN (and CN) are not mangled (in case of conflicts AD can “mangle” the names to reflect that there’s a conflict) (see here for an example of mangled entries)

CutoffServers

If you open AD Sites and Services, expand down to each site, the servers within them, and the NTDS Settings object under each server (which is basically the DSA), you can see the replication partners of each server. For instance here are the partners for two of my servers in one site:

(screenshots: the replication partners of WIN-DC01 and WIN-DC03)

The reason WIN-DC01 has links to both WIN-DC03 (in the same site as it) and WIN-DC02 (in a different site), while WIN-DC03 only has links to WIN-DC01 (and not WIN-DC02 which is in a different site), is that WIN-DC01 is acting as the bridgehead server. The bridgehead server is the server that’s automatically chosen by AD to replicate changes between sites. Each site has a bridgehead server and these servers talk to each other for replication across the site link. All other DCs in the site only get inter-site changes via the bridgehead server of that site. More on it later when I talk about bridgehead servers some other day … for now this is a good post to give an intro on bridgehead servers.

(screenshot: the replication partners of WIN-DC02)

WIN-DC02, which is my DC in the other site, similarly has only one replication partner: WIN-DC01. So WIN-DC01 is kind of the link between WIN-DC02 and WIN-DC03. If WIN-DC01 were to be offline then WIN-DC02 and WIN-DC03 would be cut off from each other (for a period until the mechanism that creates the topology between sites kicks in and makes WIN-DC03 the bridgehead server between the sites; or even forever if I “pin” WIN-DC01 as my preferred bridgehead server, in which case when it goes down no one else can take over). Or if the link that connects the two sites to each other were to fail, again they’d be cut off from each other.

  • So what the CutoffServers test does is that it tells you if any servers are cut-off from each other in the domain.
  • This test is not run by default. It must be explicitly specified.
  • This test is best run with the /e switch – which tells DcDiag to test all servers in the enterprise, across sites. In my experience if it’s run against a specific server it usually passes the test even if replication is down.
  • Also in my experience, if a server is up and running but only LDAP is down (maybe the AD DS service is stopped, for instance) – and so it can’t replicate with partners and they are cut off – the test doesn’t identify the servers as being cut off. If the server/ link is down then the other servers are highlighted as cut-off.
  • For example I set WIN-DC01 as the preferred bridgehead in my scenario above. Then I disconnect it from the network, leaving WIN-DC02 and WIN-DC03 cut-off.

    If I test WIN-DC03 only it passes the test:

    That’s clearly misleading because replication isn’t happening:

    However if I run CutoffServers for the enterprise both WIN-DC02 and WIN-DC03 are correctly flagged:

    Not only is WIN-DC01 flagged in the Connectivity tests but the CutoffServers test also fails WIN-DC02 and WIN-DC03.

  • The /v switch (verbose) is also useful with this test. It will also show which NCs are failing due to the server being cut-off.
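For reference, the commands behind the scenario above would be along these lines (server names from my lab; repadmin is just to eyeball replication and isn’t part of DcDiag):

    REM CutoffServers against a single DC tends to pass even when replication is broken
    dcdiag /s:WIN-DC03 /test:CutoffServers

    REM quick replication sanity check
    repadmin /replsummary

    REM CutoffServers across the enterprise, verbose - this is the useful form
    dcdiag /e /v /test:CutoffServers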

DcPromo

  • Checks whether, from a DNS point of view, the target server can be made a Domain Controller. If the test fails, suggestions are given.
  • The test has some mandatory switches:
    • /dnsdomain:...
    • /NewForest (a new forest) or /NewTree (a new domain in the forest you specify via /ForestRoot:...) or /ChildDomain (a new child domain) or /ReplicaDC (another DC in the same domain)
  • Needless to say this test isn’t run by default.
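An invocation would look something like this (domain name from my lab, checking whether the target can be added as another replica DC):

    dcdiag /test:DcPromo /dnsdomain:rakhesh.local /ReplicaDC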

DNS

  • Checks the DNS health of the whole enterprise. It has many sub-tests. By default all sub-tests except one are run, but you can do specific sub-tests too.
  • This TechNet page is a comprehensive source of info on what the DNS test does. Tests include checking for zones, network connectivity, client configuration, delegations, dynamic updates, name resolution, and so on.
  • This test is not run by default.
  • Since it is an enterprise-wide test DcDiag requires Enterprise Admin credentials to run tests.
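A typical invocation (run with Enterprise Admin credentials):

    dcdiag /test:DNS /e /v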

FrsEvent

  • Checks for any errors with the File Replication System (FRS).
  • It doesn’t seem to do an active test. It only checks the FRS Event Logs for any messages in the last 24 hours. If FRS is not used in the domain the test is silently skipped. (Specifying the /v switch will show that it’s being skipped).
  • Take the results with a pinch of salt. Chances are you had some errors that are now resolved, but since the last 24 hours’ worth of logs are checked the test will still flag those previous error messages. Also, FRS may be in use for non-SYSVOL replication and those replica sets might have errors, but that doesn’t really matter as far as the DCs are concerned.
  • There may also be spurious errors if a server’s Event Log is not accessible remotely, in which case the test fails.

DFSREvent

  • Checks for any errors with the Distributed File System Replication (DFSR).
  • Similar to the FrsEvent test. Same caveats apply.

SysVolCheck

  • Checks whether the SYSVOL share is ready.
  • In my experience this test doesn’t seem to actually check whether the SYSVOL share is accessible. For example, consider the following:

    Notice SYSVOL exists. Now I delete it.

    But SysVolCheck will happily clear the DC as passing the test:

    So take these test results with a pinch of salt!

  • As an aside, in the case above the Netlogons test will flag the share as missing:
  • There is a registry key HKLM\System\CurrentControlSet\Services\Netlogon\Parameters\SysvolReady which has a value of 1 when SYSVOL is ready and a value of 0 when SYSVOL is not ready. Even if I set this value to 0 – thus disabling SYSVOL; the SYSVOL and NETLOGON shares stop being shared – the SysVolCheck test still passes. The NetLogons test flags an error though.
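A quick way to eyeball both the registry value and the shares (a sketch, via PowerShell):

    # Check the SysvolReady flag
    Get-ItemProperty -Path 'HKLM:\System\CurrentControlSet\Services\Netlogon\Parameters' -Name SysvolReady

    # Check whether the SYSVOL and NETLOGON shares are actually present
    net share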

The rest of the tests will be covered in a later post. Stay tuned!

BCD Boot menu policy

As part of searching for something BCD related today I learnt of the BCD Boot menu policy setting.


Introduced in Windows 8, this setting controls whether you get a traditional text based boot menu (Windows 7 and prior) or the new touch friendly GUI based menu (Windows 8 and later).

The bootmenupolicy setting can take either of two values: legacy (the traditional text menu) or standard (the GUI menu).
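A sketch of the commands, assuming you are changing the {default} boot entry from an elevated command prompt:

    REM traditional text-based menu
    bcdedit /set {default} bootmenupolicy legacy

    REM touch-friendly GUI menu
    bcdedit /set {default} bootmenupolicy standard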

Notes on Volume Shadow Copy in Windows (or How to backup open PST files via Robocopy)

Since XP and Server 2003 Windows has had this cool feature called Volume Shadow Copy (also known as Shadow Copy or Volume Snapshot Service (VSS)). It’s a cool feature in that it lets you take read-only snapshots of your file-system so you can then trawl through it or take backups and such. When Windows creates system restore points or does backups this is what it uses. Without VSS Windows wouldn’t be able to back up files that are in use by the system/ applications as these files are locked; but with VSS it can take a “snapshot” of everything as they are at that point in time and the backup program or system restore can access the files in this snapshot.

A good overview of the Volume Shadow Copy service can be found in these two (1 & 2) TechNet articles. What follows is a tl;dr version but I suggest you read the original articles. 

  • Volume Shadow Copy consists of 4 components:
    1. the VSS requester, which is the software that actually requests a shadow copy to be made – e.g. Windows Backup;
    2. the VSS provider, which is a component that actually creates and maintains shadow copies – e.g. Windows includes a VSS provider that allows for Copy-on-Write (COW) snapshots, SAN solutions usually have their own VSS providers;
      1. there are three types of providers – hardware providers, software providers, and the system provider which is a part of Windows (see the section on Shadow Copy Providers in this link)
    3. the VSS writer, which is a component of Windows or installed software whose role is to ensure that any data belonging to Windows (e.g. Registry) or such software (e.g. Active Directory, SQL or Exchange databases) is in a consistent state for the shadow copy to be made; 
      1. Windows includes many writers (see the section on In-Box VSS writers in this link)
      2. Databases such as Active Directory and Exchange use transaction logs. Which is to say changes are not written to the database directly, rather they are written to memory first and a note made in a special file (the “transaction log”). During periods of non-peak activity changes in the transaction log are committed to the actual database. This way even if the database were to crash during a transaction, when it comes back up the transaction log can be “replayed” and any uncommitted transactions can be added to the database. Here are some links which explain this concept for Active Directory (1 & 2) and Exchange (1).
      3. What the VSS writer component of Active Directory or Exchange does is that when a snapshot is taken of the database it will be in a consistent state wherein any pending transactions are written to it or are waiting to be written, never in a state where the transactions are being committed to the database.  
    4. the VSS service, which is a coordinator for all the components above. 
  • Here’s how it all falls into place:
    1. The requester tells the service that it wants a shadow copy and so it needs a list of the writers on the system with their metadata. 
    2. The service asks all the writers to get back with the information. Each writer creates an XML description of its components and data stores. Each writer also defines a restore method. 
    3. The service passes the above information to the requester. The requester selects the components it wants to shadow copy.
    4. The service now notifies the writers of the selected components to prepare their data (complete any open transactions, flush caches, roll transaction logs, etc). 
    5. Each writer does so and notifies the service. 
    6. The service tells each writer to temporarily freeze all application I/O write requests (read I/O requests are still allowed).
      1. This freeze is not allowed to take longer than 60 seconds. 
      2. The service also flushes the file system cache.
    7. The service then notifies the provider. The provider makes a shadow copy – this is not allowed to last for longer than 10 seconds, during which all I/O requests to that file system are frozen.
    8. Once copy is done, the service releases the pending I/O requests. 
    9. The service also notifies the writers that they are free to resume writing data as before. 
    10. The service informs the requester on the location of the shadow copy. 
  • There are three methods a provider can use to shadow copy:
    1. Complete copy – makes a read-only clone of the original volume – this is only done by hardware providers
    2. Redirect on Write – leaves the original volume untouched, but creates a new volume and any changes are now redirected to this new volume
    3. Copy on Write (COW) – does not make a copy of the original volume, but any changes made after the shadow copy are copied to a “differences area” and when the shadow copy needs to be accessed it can be logically constructed based on the original volume and the changes in the “differences area” – this can be done by both software and hardware providers.
  • Paging and other temporary files are automatically excluded from snapshots. The FilesNotToSnapshot registry key can be used to exclude additional files from a shadow copy (it is meant to be used by applications, not end users).
  • For the system provider the shadow copy storage area (also called the “differences area” above) must be an NTFS volume. The volume to be shadow copied need not be an NTFS volume. 
    1. The “differences area” need not necessarily be on the same volume as the one being shadow copied. If the volume already has a “differences area”, that is used. Otherwise a “differences area” can be manually associated with a volume. Else a volume is automatically selected based on available free space (with preference being given to a space on the volume that’s being shadow copied). 
    2. Although the volume being shadow copied can be non-NTFS, if you are creating persistent shadow copies then it must be NTFS. (Persistent shadow copies exist even after the shadow copy operation is done. Non-persistent shadow copies exist only for the duration of the operation – such as a backup – and are deleted afterwards). 
  • Maximum number of shadow copied volumes is 64. Maximum number of shadow copies created by the system provider for a volume is 512. 
  • There are three tools (VSS requesters) you can use to manually make and manage shadow copies. 
    1. DiskShadow which is present on Windows Server 2008/ Windows Vista and upwards (only on the server versions I think, as I couldn’t find it on my Windows 7 or Windows 8 install)
    2. VssAdmin which is present on Windows Server 2003/ Windows XP and upwards (the client versions are different from the server versions I think, the version on my Windows 7 and Windows 8 machines didn’t have an option to create shadows but only list and delete shadows)
    3. VShadow which is a part of an SDK package (see this thread for more info) and is better than VssAdmin. Good thing about VShadow is that someone has kindly extracted it from the SDK and made available for download. So if you are on a client version of Windows that’s probably what you will go with. 
  • The FAQ section on this link has more bits and pieces of use (no point in me regurgitating it here! :)). 

Practical Example

I have a USB stick with PST files and such. I can’t take a backup of this USB stick while Outlook is running as the PST files are locked by it. So what I need to do is use VSS to manually make a shadow copy, then expose said shadow copy as a drive/ path, and use my regular backup tool (robocopy in this case) to back up the files in this shadow copy. Easy peasy? It is!

First, I downloaded the VShadow tools.

To create a shadow here’s the command to use:
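Assuming the USB stick is drive E:, it is along these lines:

    vshadow E: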

 This, however, creates a non-persistent shadow copy. I want a persistent shadow copy though as I want to access it even after VShadow exits. So I add a switch to create persistent copies. 
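That switch is -p (again assuming the USB stick is E:):

    vshadow -p E: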

 Here’s a snippet of the output:

Notice how the VSS service determines which components are to be excluded from the shadow and accordingly excludes those Writers? Then it contacts them (not shown) and proceeds to create a shadow. The end result of this operation will be output similar to this:

A shadow copy is a snapshot of a volume at one point in time. Each shadow copy has a GUID. A shadow copy set is a collection of shadow copies of multiple volumes all taken at the same point in time. Each shadow copy set has a GUID too. These details are shown in the output above, along with the original volume name and others. The shadow copy gets a device name of its own which you can mount via mklink (be sure to add a trailing slash to the device name, i.e. \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy16 becomes \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy16\):
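A sketch of the two variants (the mount-point folder C:\Shadow is my choice of name):

    REM directory symbolic link
    mklink /d C:\Shadow \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy16\

    REM junction
    mklink /j C:\Shadow \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopy16\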

The former creates a directory symbolic link, the latter a junction. It doesn’t make a difference in this case. The directory must not exist beforehand.

You can mount persistent shadow copies via their GUID too as below:
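For instance (the GUID is the shadow copy ID from earlier; the drive letter Z: is my choice):

    vshadow -el={dbc75a3a-0a6c-4540-8ba2-c2d3665b3cc1},Z: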

This mounts the shadow copy with GUID {dbc75a3a-0a6c-4540-8ba2-c2d3665b3cc1} at the specified drive letter. You don’t specifically need a drive letter – any empty folder can be used as a mount point too. Even better, you can expose a shadow copy as a shared folder. The command below will expose the previous shadow copy as a shared folder of the specified name.
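Something like this (the share name is my choice):

    vshadow -er={dbc75a3a-0a6c-4540-8ba2-c2d3665b3cc1},ShadowShare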

It’s also possible to expose only a sub-folder of the shadow copy as the shared folder. 

Wouldn’t it be cool though if there were a way for the shadow-creating command to pass on the GUID and other details so the exposing/ mounting command can easily do that? That would make it very easy in batch files and such, and indeed VShadow has such an option. The command below, which is a modification of the original command, will create a new CMD file which when run populates a set of environment variables that contain the shadow GUID and such.
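That option is the -script switch; a sketch:

    vshadow -p -script=vss-setvar.cmd E: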

 The contents of the vss-setvar.cmd file will be similar to this (this output is from a different shadow copy hence the GUIDs vary from the output above):

As you can see all it does is set some environment variables containing the snapshot ID and snapshot set ID. What this means is that you could have a batch file with the following commands:
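A sketch of such a batch file (the variable name SHADOW_ID_1 is what the generated vss-setvar.cmd is assumed to set; the backup destination is a placeholder):

    REM create a persistent shadow of E: and write its details to vss-setvar.cmd
    vshadow -p -script=vss-setvar.cmd E:

    REM pull in the SHADOW_ID_1 (and related) variables
    call vss-setvar.cmd

    REM expose the shadow as Z:, back it up, then delete the shadow
    vshadow -el=%SHADOW_ID_1%,Z:
    robocopy Z:\ D:\Backups\USB /MIR
    vshadow -ds=%SHADOW_ID_1%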

Nice, isn’t it? On the second line when you call vss-setvar.cmd it will set the environment variables for you and then you can use the other commands to expose the shadow copy and finally delete it once it’s backed up. 

It’s also possible to run a command as part of the shadow copy process. As in, this command will run when the shadow copy is created but before VShadow exits. This is useful when you are dealing with non-persistent shadow copies as the copy will still be present when the command runs. You can run any command or a CMD file, but no parameters can be passed. Here’s how (use the -exec switch):
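Something along these lines; I combine -exec with -script so the CMD file can pick up the shadow details (treat that combination as an assumption):

    vshadow -script=vss-setvar.cmd -exec=vss-robocopy.cmd E: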

 For instance, here’s what the vss-robocopy.cmd could contain:
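A sketch (variable names assumed from the generated vss-setvar.cmd; the backup destination is a placeholder):

    call vss-setvar.cmd

    REM mount the shadow device, back it up, then remove the mount point
    mklink /d C:\Shadow %SHADOW_DEVICE_1%\
    robocopy C:\Shadow D:\Backups\USB /MIR
    rmdir C:\Shadow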

Now the robocopy backup happens as soon as the shadow copy finishes but before VShadow exits. This means I don’t have to use persistent shadow copies like before because when the backup happens the shadow copy is temporarily exposed anyway. I find this technique better – feels neater to me. 

While reading for this post I came across a page with more VShadow examples. Also, here’s someone using PowerShell to mount all shadow copies – good stuff. Maybe it will be of use to me later!

Lastly, if you are trying to make a persistent shadow copy and get the following error:

It’s quite likely that the volume you are snapshotting isn’t NTFS formatted. Remember, persistent shadow copies require the volume to be NTFS.

Active Directory: Domain Controller critical services

The first of my (hopefully!) many posts on Active Directory, based on the WorkshopPLUS sessions I attended last month. Progress is slow as I don’t have much time, plus I am going through the slides and my notes and adding more information from the Internet and such. 

This one’s on the services that are critical for Domain Controllers to function properly. 

DHCP Client

  • In Server 2003 and before the DHCP Client service registers A, AAAA, and PTR records for the DC with DNS
  • In Server 2008 and above this is done by the DNS Client
  • Note that only the A and PTR records are registered. Other records are registered by the Netlogon service.

File Replication Services (FRS)

  • Replicates SYSVOL amongst DCs.
  • Starting with Server 2008 it is now in maintenance mode. DFSR replaces it.
    • To check whether your domain is still using FRS for SYSVOL replication, open the DFS Management console and see whether the “Domain System Volume” entry is present under “Replication” (if it is not, see whether it is available for adding to the display). If it is present then your domain is using DFSR for SYSVOL replication.
    • Alternatively, type the following command on your DC. If the output says “Eliminated” as below, your domain is using DFSR for SYSVOL. (Note this only works with domain functional level 2008 and above).
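The command in question being dfsrmig:

    dfsrmig /getglobalstate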
  • Stopping FRS for long periods can result in Group Policy distribution errors as SYSVOL isn’t replicated. Event ID 13568 in FRS log.

Distributed File System Replication (DFSR)

  • Replicates SYSVOL amongst DCs. Replaced functionality previously provided by FRS. 
  • DFSR was introduced with Server 2003 R2.
  • If the domain was born functional level 2008 – meaning all DCs are Server 2008 or higher – DFSR is used for SYSVOL replication.
    •  Once all pre-Server 2008 DCs are removed FRS can be migrated to DFSR. 
    • Apart from the dfsrmig command mentioned in the FRS section, the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\DFSR\Parameters\SysVols\Migrating Sysvols\LocalState registry key can also be checked to see if DFSR is in use (a value of 3 means it is in use). 
  • If a DC is offline/ disconnected from its peers for a long time and Content Freshness Protection is turned on, when the DC is online/ reconnected DFSR might block SYSVOL replication to & from this DC – resulting in Group Policy distribution errors.
    • Content Freshness Protection is off by default. It needs to be manually turned on for each server.
    • Run the following command on each server to turn it on:
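      A sketch via the DFSR WMI provider (the class and property names are my assumption):

          wmic /namespace:\\root\microsoftdfs path DfsrMachineConfig set MaxOfflineTimeInDays=60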

      Replace 60 with the maximum number of days it is acceptable for a DC or DFSR member to be offline. The recommended value is 60 days. And to turn off:
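      The corresponding sketch, with a value of 0 disabling it:

          wmic /namespace:\\root\microsoftdfs path DfsrMachineConfig set MaxOfflineTimeInDays=0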

      To view the current setting:
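      Again a sketch, with the same caveat on names:

          wmic /namespace:\\root\microsoftdfs path DfsrMachineConfig get MaxOfflineTimeInDays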

    • Content Freshness Protection exists because of the way deletions work.
      • DFSR is multi-master, like AD, which means changes can be made on any server.
      • When you delete an item on one server, it can’t simply be deleted everywhere, because then the item wouldn’t exist any more and there would be no way for other servers to know whether that’s because the item was deleted or because it wasn’t replicated to them in the first place.
      • So what happens is that a deleted item is “tombstoned”. The item is removed from disk but a record for it remains in the DFSR database for 60 days (this period is called the “tombstone lifetime”) indicating this item as being deleted.
      • During these 60 days other DFSR servers can learn that the item is marked as deleted and thus act upon their copy of the item. After 60 days the record is removed from the database too.
      • In such a context, say we have a DC that is offline for more than 60 days, and say we have other DCs where files were removed from SYSVOL (replicated via DFSR). All the other DCs no longer have a copy of the file nor a record that it is deleted, as 60 days have passed and the file is removed for good.
      • When the previously offline DC replicates, it still has a copy of the file and it will pass this on to the other DCs. The other DCs don’t remember that this file was deleted (because they don’t have a record of its deletion any more, as 60 days have passed) and so will happily replicate this file to their folders – resulting in a deleted file reappearing and causing corruption.
      • It is to avoid such situations that Content Freshness Protection was invented and is recommended to be turned on.
    • Here’s a good blog post from the Directory Services team explaining Content Freshness Protection.

DNS Client

  • For Server 2008 and above, the DNS Client service registers the A, AAAA, and PTR records for the DC with DNS (notice that when you change the DC IP address you do not have to update DNS manually – it is updated automatically. This is because of the DNS Client service).
  • Note that only the A, AAAA, and PTR records are registered. Other records are registered by the Netlogon service.

DNS Server

  •  The glue for Active Directory. DNS is what domain controllers use to locate each other. DNS is what client computers use to find domain controllers. If this service is down both these functions fail.  

Kerberos Key Distribution Center (KDC)

  • Required for Kerberos 5.0 authentication. AD domains use Kerberos for authentication. If the KDC service is stopped Kerberos authentication fails. 
  • NTLM is not affected by this service. 

Netlogon

  • Maintains the secure channel between DCs and domain members (including other DCs). This secure channel is used for authentication (NTLM and Kerberos) and DC replication.
  • Writes the SRV and other records to DNS. These records are what domain members use to find DCs.
    • The records are also written to a file %systemroot%\system32\config\Netlogon.DNS. If the DNS server doesn’t support dynamic updates then the records in this text file must be manually created on the DNS server. 

Windows Time

  • Acts as an NTP client and server to keep time in sync across the domain. If this service is down and time is not in sync then Kerberos authentication and AD replication will fail (see resolution #5 in this link).
    • Kerberos authentication may not necessarily break for newer versions of Windows. But AD replication is still sensitive to time.  
  • The PDC of the forest root domain is the master time keeper of the forest. All other DCs in the forest will sync time from it.
    • The Windows Time service on every domain member looks to the DC that authenticates them for time updates.
    • DCs in the domain look to the domain PDC for time updates. 
    • Domain PDCs look to the domain PDC of the domain above/ sibling to them – except the forest root domain PDC, which gets time from an external source (hardware source, Internet, etc).
  • From this link: there are two registry keys HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config\MaxPosPhaseCorrection and HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W32Time\Config\MaxNegPhaseCorrection that restrict the time updates accepted by the Windows Time service to the number of seconds defined by these values (the maximum and minimum range). This can be set directly in the registry or via a GPO. The recommended value is 172800 (i.e. 48 hours).

w32tm

The w32tm command can be used to manage time. For instance (command sketches for each of these follow the list):

  • To get an idea of the time situation in the domain (who is the master time keeper, what is the offset of each of the DCs from this time keeper):
  • To ask the Windows Time service to resync as soon as possible (the command can target a remote computer too via the /computer: switch)

    • Same as above but before resyncing redetect any network configuration changes and rediscover the sources:
  • To get the status of the local computer (use the /computer: switch to target a different computer)
  • To show what time sources are being used:
  • To show who the peers are:
  • To show the current time zone:

    • You can’t change the time zone using this command; you have to do:
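Sketches of the commands for each of the above (the time zone change itself is done with tzutil; the zone name is just an example):

    REM time situation in the domain - who is the master time keeper, offsets of each DC
    w32tm /monitor

    REM resync as soon as possible (add /computer:<name> to target a remote machine)
    w32tm /resync

    REM resync, but redetect network configuration and rediscover sources first
    w32tm /resync /rediscover

    REM status of the local computer (again, /computer:<name> targets another machine)
    w32tm /query /status

    REM the time source currently in use
    w32tm /query /source

    REM the configured peers
    w32tm /query /peers

    REM the current time zone
    w32tm /tz

    REM changing the time zone is done separately, e.g.
    tzutil /s "Arabian Standard Time"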

On the PDC in the forest root domain you would typically run a command like this if you want it to get time from an NTP pool on the Internet:
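Something along these lines (the NTP Pool host names are just examples):

    w32tm /config /manualpeerlist:"0.pool.ntp.org 1.pool.ntp.org 2.pool.ntp.org 3.pool.ntp.org" /syncfromflags:MANUAL /reliable:YES /update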

Here’s what the command does:

  • specify a list of peers to sync time from (in this example the NTP Pool servers on the Internet);
  • the /update switch tells w32tm to update the Windows Time service with this configuration change;
  • the /syncfromflags:MANUAL tells the Windows Time service that it must only sync from these sources (other options such as “DOMHIER” tells it to sync from the domain peers only, “NO” tells it to sync from none, “ALL” tells it to sync from both the domain peers and this manual list);
  • the /reliable:YES switch marks this machine as special in that it is a reliable source of time for the domain (read this link on what gets set when you set a machine as RELIABLE).

Note: You must manually configure the time source on the PDC in the forest root domain and mark it as reliable. If that server were to fail and you transfer the role to another DC, be sure to repeat the step. 

On other machines in the domain you would run a command like this:
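That command being something like:

    w32tm /config /syncfromflags:DOMHIER /reliable:NO /update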

This tells those DCs to follow the domain hierarchy (and only the domain hierarchy) and that they are not reliable time sources (this switch is not really needed if these other DCs are not PDCs).

Active Directory Domain Services (AD DS)

  • Provides the DC services. If this service is stopped the DC stops acting as a DC. 
  • Pre-Server 2008 this service could not be stopped while the OS was online. But since Server 2008 it can be stopped and started. 

Active Directory Web Services (AD WS)

  • Introduced in Windows Server 2008 R2 to provide a web service interface to Active Directory Domain Services (AD DS), Active Directory Lightweight Directory Services (AD LDS), and Active Directory Database Mounting Tool instances running on the DC.
    • The Active Directory Database Mounting Tool was new to me so here’s a link to what it does. It’s a pretty cool tool. Starting from Server 2008 you can take AD DS and AD LDS snapshots via the Volume Snapshot Service (VSS) (I am writing a post on VSS side by side so expect to see one soon). This makes use of the NTDS VSS writer which ensures that consistent snapshots of the AD databases can be taken. The AD snapshots can be taken manually via the ntdsutil snapshot command or via backup software or even via images of the whole system. Either way, once you have such snapshots you can mount the one(s) you want via ntdsutil and point the Active Directory Database Mounting Tool to it. As the tool name says, it “mounts” the AD database in the snapshot and exposes it as an LDAP server. You can then use tools such as ldp.exe or AD Users and Computers to go through this instance of the AD database. More info on this tool can be found at this and this link.
  • AD WS is what the PowerShell Active Directory module connects to. 
  • It is also what the new Active Directory Administrative Center (which in turn uses PowerShell) too connects to.
  • AD WS is installed automatically when the AD DS or AD LDS roles are installed. It is only activated once the server is promoted to a DC or if an AD LDS instance is created on it. 

Notes on TLS/SSL, RSA, DSA, EDH, ECDHE, and so on …

The CloudFlare guys make excellent technical posts. Recently they introduced Keyless SSL (which is a way of conducting the SSL protocol wherein the server you are talking to does not necessarily need to have the private key) and as part of the post going into its technical details they talk about the SSL protocol in general. Below are my notes on this and a few other posts. Crypto intrigues me as I like encryption and privacy so this is an area of interest. 

Note: This is not a summary of the CloudFlare blog post. It was inspired by that post but I talk about a lot more basic stuff below. 

The TLS/SSL protocol

First things first – what we refer to as Secure Sockets Layer (SSL) protocol is not really SSL but Transport Layer Security (TLS).

  • Versions 1.0 to 3.0 of SSL were called, well … SSL 1.0 to SSL 3.0. 
  • TLS 1.0 was the upgrade from SSL 3.0. It is so similar to SSL 3.0 that TLS 1.0 is often referred to as SSL 3.1. 
  • Although the differences between TLS 1.0 and SSL 3.0 are not huge, the two cannot talk to each other. TLS 1.0, however, includes a mode wherein it can talk to SSL 3.0 but this decreases security. 

The world still refers to TLS as SSL but keep in mind it’s really TLS. TLS has three versions so far – TLS 1.0, TLS 1.1, and TLS 1.2. A fourth version, TLS 1.3, is currently in draft form. I would be lying if I said I know the differences between these versions (or even the differences between SSL 3.0 and TLS 1.0) so it’s best to check the RFCs for more info!

TLS/SSL goals

The TLS/SSL protocol has two goals:

  1. Authenticate the two parties that are talking with each other (authentication of a server is the more common scenario – such as when you visit your bank’s website for instance – but authentication of the user/ client too is supported and used in some scenarios). 
  2. Protect the conversation between the two parties.

Both goals are achieved via encryption.

Encryption is a way of “locking” data such that only a person who has a “key” to unlock it can read it. Encryption mechanisms are essentially algorithms: you take a message, follow the steps of the algorithm, and end up with an encrypted gobbledygook. All encryption algorithms make use of keys to lock and unlock the message – either a single key (which both encrypts and decrypts the message) or two keys (either can encrypt and decrypt).

Shared key encryption/ Symmetric encryption

A very simple encryption algorithm is the Caesar Cipher where all you do is take some text and replace the letters with letters that are a specified number away from it (for example you could replace “A” with “B”, “B” with “C”, and so on or “A” with “C”, “B” with “D”, and so on … in the first case you replace with a letter one away from it, in the second case you replace with a letter two away from it). For the Caesar Cipher the key is simply the “number of letters” away that you choose. Thus for instance, if both parties decide to use a key of 5, the encrypting algorithm will replace each letter with one that’s 5 letters away, while the decrypting algorithm will replace each letter with one that’s 5 letters before. The key in this case is a shared key – both parties need to know it beforehand. 
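Just to make that concrete, here’s a throwaway PowerShell sketch of a Caesar cipher with a key of 5 (uppercase letters only, wrapping around at Z):

    # Shift each uppercase letter forward by $Key positions, wrapping around at 'Z'
    function Invoke-CaesarCipher ([string]$Text, [int]$Key) {
        -join ($Text.ToUpper().ToCharArray() | ForEach-Object {
            if ($_ -cmatch '[A-Z]') { [char](65 + (([int]$_ - 65 + $Key) % 26)) } else { $_ }
        })
    }

    Invoke-CaesarCipher -Text 'HELLO' -Key 5     # MJQQT
    Invoke-CaesarCipher -Text 'MJQQT' -Key 21    # HELLO (decrypting = shifting by 26 - 5)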

Encryption algorithms where a single shared key is used are known as symmetric encryption algorithms. The operations are symmetrical – you do something to encrypt, and you undo that something to decrypt. TLS/SSL uses symmetric encryption to protect the conversation between two parties. Examples of the symmetric encryption algorithms TLS/SSL uses are AES (preferred), Triple DES, and RC4 (not preferred). Examples of popular symmetric algorithms can be found on this Wikipedia page.

Symmetric encryption has the obvious disadvantage that you need a way of securely sharing the key beforehand, which may not always be practical (and if you can securely share the key then why not securely share your message too?).

Public key encryption/ Asymmetric encryption

Encryption algorithms that use two keys are known as asymmetric encryption or public key encryption. The name is because such algorithms make use of two keys – one of which is secret/ private and the other is public/ known to all. Here the operations aren’t symmetrical – you do something to encrypt, and you do something else to decrypt. The two keys are special in that they are mathematically linked and anything that’s encrypted by one of the keys can be decrypted only by the second key. A popular public key encryption algorithm is RSA. This CloudFlare blog post on Elliptic Curve Cryptography (ECC), which is itself an example of a public key encryption algorithm, has a good explanation of how RSA works. 

Public key encryption can be used for encryption as well as authentication. Say there are two parties: each party will keep its private key to itself and publish the public key. If the first party wants to encrypt a message for the second party, it can encrypt it using the public key of the second party. Only the second party will be able to decrypt the message as only it holds the private key. Thus the second party is authenticated, as only the second party holds the private key corresponding to the public key.

Public key cryptography algorithms can also be used to generate digital signatures. If the first party wants to send a message to the second party and sign it, such that the second party can be sure it came from the first party, all the first party needs to do is send the message as usual but this time take a hash of the message and encrypt that with its private key. When the second party receives this message it can decrypt the hash via the public key of the first party, make a hash itself of the message, and compare the two. If the hashes match it proves that the message wasn’t tampered with in transit and also that the first party has indeed signed it, as only a message locked with its private key can be unlocked by its public key. Very cool stuff actually!

Not all public key cryptography algorithms are good at encryption and signing, nor are they required to be so. RSA, for instance, is good at encryption & decryption. Another algorithm, DSA (Digital Signature Algorithm), is good at signing & validation. RSA can do signing & validation too but that’s due to the nature of its algorithm.

A question of trust

While public key encryption can be used for authentication there is a problem. What happens if a third party publishes its public key on the network but claims that it is the second party? Obviously it’s a fraud but how is the first party to know that? Two ways really: one way is the first party can perhaps call or through some other means verify with the second party as to what its public key is and thus choose the correct one – this is tricky because it has to verify the identity somehow and be sure about it – or, the second way, there can be some trusted authority that verifies this for everyone – such a trusted authority will confirm that such and such public key really belongs to the second party. 

These two ways of finding if you can trust someone are called the Web of Trust (WoT) and Public Key Infrastructure (PKI) respectively. Both achieve the same thing, the difference being the former is decentralized while the latter is centralized. Both of these make use of something called certificates – which is basically a digital document that contains the public key as well as some information on the owner of the public key (details such as the email address, web address, how long the certificate is valid for, etc). 

Web of Trust (WoT)

In a WoT model everyone uploads their certificates to certain public servers. When a first party searches for certificates belonging to a second party it can find them – both legitimate ones (i.e. actually belonging to the second party) as well as illegitimate ones (i.e. falsely uploaded by other parties claiming to be the second party). By default though the first party doesn’t trust these certificates. It only trusts a certificate once it verifies through some other means which is the legitimate one – maybe it calls up the second party or meets the party in person or gets details from its website. The first party can also trust a certificate if someone else it already trusts has marked a certificate as trusted. Thus each party in effect builds up a web of certificates – it trusts a few and it trusts whatever is trusted by the few that it trusts. 

To add to the pool once a party trusts a certificate it can indicate so for others to see. This is done by signing the certificate (which is similar to the signing process I mentioned earlier). So if the first party has somehow verified that a certificate it found really belongs to the second party, it can sign it and upload that information to the public servers. Anyone else then searching for certificates of the second party will come across the true certificate too and find the signature of the first party. 

WoT is what you use with programs such as PGP and GnuPG. 

Public Key Infrastructure (PKI)

In a PKI model there are designated Certificate Authorities (CA) which are trusted by everyone. Each party sends their certificates to the CA to get it signed. The CA verifies that the party is who it claims to be and then signs it. There are many classes of validation – domain validation (the CA verifies that the requester can manage the domain the certificate is for), organization validation (the CA also verifies that the requester actually exists), and extended validation (a much more comprehensive validation than the other two). 

Certificate Authorities have a tree structure. There are certain CAs – called root CAs – which are implicitly trusted by everyone. Their certificates are self-signed or unsigned but trusted by everyone. These root CAs sign certificates of other CAs, who in turn might sign certificates of other CAs or of a requester. Thus there’s a chain of trust – a root CA trusts an intermediary CA, who trusts another intermediary CA, who trusts (signs) the certificate of a party. Because of this chain of trust, anyone who trusts the root CA will trust the certificate signed by one of its intermediaries. 

 It’s probably worth pointing out that you don’t really need to get your certificate signed by a CA. For instance, say I want to encrypt all traffic between my computer and this blog and so I create a certificate for the blog. I will be the only person using this – all my regular visitors will visit the blog unencrypted. In such a case I don’t have to bother with them not trusting my certificate. I trust my certificate as I know what its public key and details look like, so I can install the certificate and use an https link when browsing the blog, everyone else can use the regular http link. I don’t need to get it signed by a CA for my single person use. It’s only if I want the general public to trust the certificate that I must involve a CA. 

PKI is what you use with Internet Browsers such as Firefox, Chrome, etc. PKI is also what you use with email programs such as Outlook, Thunderbird, etc to encrypt communication with a server (these emails program may also use WoT to encrypt communication between a sender & recipient). 

TLS/SSL (contd)

From here on I’ll use the words “client” and “server” interchangeably with “first party” and “second party”. The intent is the same, just that it’s easier to think of one party as the client and the other as a server. 

TLS/SSL uses both asymmetric and symmetric encryption. TLS/SSL clients use asymmetric encryption to authenticate the server (and vice-versa too if required) and as part of that authentication they also share with each other a symmetric encryption key which they’ll use to encrypt the rest of their conversation. TLS/SSL uses both types of encryption algorithms because asymmetric encryption is computationally expensive (by design) and so it is not practical to encrypt the entire conversation using asymmetric encryption (see this StackExchange answer for more reasons). Better to use asymmetric encryption to authenticate and bootstrap symmetric encryption. 

When a TLS/SSL client contacts a TLS/SSL server, the server sends the client its certificate. The client validates it using the PKI. Assuming the validation succeeds, client and server perform a "handshake" (a series of steps) the end result of which is (1) authentication and (2) the establishment of a "session key" – the symmetric key used for encrypting the rest of the conversation. 

CloudFlare’s blog post on Keyless SSL goes into more details of the handshake. There are two types of handshakes possible: RSA handshakes and Diffie-Hellman handshakes. The two types of handshakes differ in terms of what algorithms are used for authentication and what algorithms are used for session key generation. 

RSA handshake

RSA handshakes are based on the RSA algorithm which I mentioned earlier under public key encryption. 

An RSA handshake uses RSA certificates for authentication (RSA certificates contain a public RSA key). Once authentication is successful, the client creates a random session key and encrypts it with the public key of the server (this encryption uses the RSA algorithm). The server can decrypt this session key with its private key, and going forward both client & server use this session key to encrypt further traffic (this encryption does not use the RSA algorithm, it uses one of the symmetric key algorithms).
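To make the idea concrete, here's a rough sketch of that exchange using the .NET RSA classes from PowerShell – purely an illustration of the concept, not what a real TLS library does:

$serverRsa  = New-Object System.Security.Cryptography.RSACryptoServiceProvider 2048  # stands in for the server's key pair; the public half would be in its certificate
$sessionKey = New-Object byte[] 32                                                   # the client picks a random 256-bit session key
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($sessionKey)
$wrapped   = $serverRsa.Encrypt($sessionKey, $true)    # client encrypts the session key with the server's public key (OAEP padding)
$unwrapped = $serverRsa.Decrypt($wrapped, $true)       # only the holder of the private key can recover it
[System.BitConverter]::ToString($unwrapped) -eq [System.BitConverter]::ToString($sessionKey)   # True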

The RSA handshake has a drawback: if someone were to capture and store past encrypted traffic, and the server's private key were to somehow leak, then such a person could easily decrypt the session key and thus decrypt the past traffic. The server's private key plays a very crucial role here as it not only authenticates the server, it also protects the session key. 

Diffie-Hellman handshake

Diffie-Hellman handshakes are based on the Diffie-Hellman key exchange algorithm. 

The Diffie-Hellman key exchange is an interesting algorithm. It doesn't do any encryption or authentication by itself. Instead, it offers a way for two parties to generate a shared key in public (i.e. anyone can snoop in on the conversation that takes place to generate the secret key) yet the shared key is secret and only the two parties know it (i.e. a third party snooping in on the conversation can't deduce the shared key). A good explanation of how Diffie-Hellman does this can be found in this blog post. Essentially: (1) the two parties agree upon a large prime number and a smaller number in public, (2) each party then picks a secret number (the private key) for itself and calculates another number (the public key) based on this secret number, the prime number, and the smaller number, (3) the public keys are shared with each other, and using the other party's public key, the prime number, and the smaller number, each party can calculate the (same) shared key. The beauty of the math involved in this algorithm is that even though a snooper knows the prime number, the smaller number, and the two public keys, it still cannot deduce the private keys or the shared key!
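A toy run of those three steps in PowerShell, with deliberately tiny numbers (real implementations use enormous primes), might help:

$p = 23; $g = 5                      # step 1: the prime number and the smaller number, agreed in public
$a = 6; $b = 15                      # step 2: each party's secret number, never shared
$A = [bigint]::ModPow($g, $a, $p)    # party 1's public key (8)
$B = [bigint]::ModPow($g, $b, $p)    # party 2's public key (19)
[bigint]::ModPow($B, $a, $p)         # step 3: party 1 computes the shared key -> 2
[bigint]::ModPow($A, $b, $p)         # party 2 arrives at the same shared key  -> 2

A snooper sees 23, 5, 8, and 19 go over the wire but still can't work out 6, 15, or the shared key (well, with numbers this tiny they obviously can – hence the enormous primes).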

There are two versions of the Diffie-Hellman algorithm:

  • a fixed/ static version, where both parties use the same public/ private keys (and hence same shared key) across all their conversations; and
  • an ephemeral version, where one party keeps changing its public/ private key (and hence the shared key)

Since the Diffie-Hellman algorithm does not do authentication it needs some other mechanism to authenticate the client and server. It can use an RSA certificate (a certificate containing an RSA public key) or a non-RSA certificate – for example a DSA certificate (containing a DSA public key) or an ECDSA (Elliptic Curve Digital Signature Algorithm) certificate. ECDSA is a variant of DSA that uses Elliptic Curve Cryptography (ECC), so ECDSA certificates contain an ECC public key. ECC keys are better than RSA & DSA keys in that the ECC algorithm is harder to break. So not only are ECC keys more future proof, you can also use smaller length keys (for instance a 256-bit ECC key is as secure as a 3248-bit RSA key) and hence the certificates are of a smaller size. 

The fixed/ static version of Diffie-Hellman requires a Diffie-Hellman certificate for authentication (see here and here). Along with the public key of the server, this certificate also contains the prime number and smaller number required by the Diffie-Hellman algorithm. Since these numbers are part of the certificate itself and cannot change, Diffie-Hellman certificates only work with the fixed/ static Diffie-Hellman algorithm and vice-versa. A Diffie-Hellman handshake that uses the fixed/ static Diffie-Hellman algorithm has the same drawback as an RSA handshake: if the server's private key is leaked, past traffic can be decrypted.  

The ephemeral version of Diffie-Hellman (often referred to as EDH (Ephemeral Diffie-Hellman) or DHE (Diffie-Hellman Ephemeral)) works with RSA certificates, DSA certificates, and ECDSA certificates. EDH/ DHE is computationally expensive as it is not easy to keep generating a new prime number and smaller number for every connection. A variant of EDH/ DHE that uses elliptic curves – known as Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) – doesn't have the performance hit of EDH/ DHE and is preferred. 

A Diffie-Hellman handshake that uses EDH/ DHE or ECDHE doesn't have the drawback of an RSA handshake. The server's private key is only used to authenticate it, not for generating/ protecting the shared session key. This feature of EDH/ DHE and ECDHE – wherein the shared keys are generated afresh for each connection and the shared keys themselves are random and independent of each other – is known as Perfect Forward Secrecy (PFS). (Perfect Forward Secrecy is a stronger form of Forward Secrecy. The latter does not require that the shared keys themselves be independent of each other, only that the shared keys not be related to the server's private/ public key). 

To be continued …

Writing this post took longer than I expected so I’ll conclude here. I wanted to explore TLS/SSL in the context of Windows and Active Directory, but I got side-tracked talking about handshakes and RSA, ECDHE, etc. Am glad I went down that route though. I was aware of elliptic curves and ECDHE, ECDSA etc. but had never really explored them in detail until now nor written down a cumulative understanding of it all. 

ReAgentC: Operation failed: 3

The other day I mentioned that whenever I run ReAgentC I get the following error – 

I posted to the Microsoft Community forums hoping to get a solution. The suggested solutions didn’t seem to help but oddly ReAgentC is now working – not sure why. 

One thing I learnt is that the error code 3 means a path is not found. My system didn't have any corruption (both sfc /scannow and dism /Online /Cleanup-image /Scanhealth gave it a clean chit) and I also did a System Restore to a point back in time when I knew ReAgentC was working well, but that didn't help either. Windows RE itself was working fine as I was able to boot into it. 

In the end I shut down the laptop and left it for a couple of days as I had other things to do. And when I looked at it today ReAgentC was surprisingly working!

I am not sure why it is now working. One theory is that a few updates were applied automatically as I was shutting down the laptop so maybe they fixed some corruption. Or maybe when I booted into Windows RE and booted back that fixed something? (I don’t remember whether I tried ReAgentC after booting back from Windows RE. I think I did but I am not sure). 

Here’s a little PowerShell to find all the updates installed in the last 3 days. Thought I’d post it because I am pleased I came up with it and also maybe it will help someone else. 
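# One way of doing it is via the DISM PowerShell module (which is also why the Windows 8/ WMF 4.0 note below applies); run from an elevated prompt
Get-WindowsPackage -Online |
    Where-Object { $_.InstallTime -gt (Get-Date).AddDays(-3) } |
    Sort-Object InstallTime |
    Format-Table PackageName, InstallTime -AutoSize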

This will only work in Windows 8 and above (I haven’t tried but I think installing the Windows Management Framework 4.0 on Windows 7 SP1 and/ or Windows Server 2008 R2 SP1 will get it working on those OSes too). 

Update: And it stopped working again the next day! The laptop was on overnight. The next day I rebooted as it had some pending updates. After the reboot we are back to square one. Of course I removed those two updates and rebooted to see if that helps. It doesn’t. 

 Fun! :)

 

Notes of SFC and Windows Servicing (Component Based Servicing)

SFC

SFC (it used to be called "System File Checker" but is now called Windows Resource Checker) is an in-built tool for verifying the integrity of all Windows system files and also replacing them with good versions in case of any corruption. It has been around for a while – I first used it on Windows 2000 I think – though am sure there are many differences between that version and the latest ones. For instance, XP and prior used to store a copy of the system protected files in a folder called %WINDIR%\System32\DLLCache (but it would remove some of the protected files as disk space became scarce, resulting in SFC prompting for the install media when repairing) while Vista and upwards use the %WINDIR%\WinSxS folder (and its sub-folder %WINDIR%\WinSxS\Backup).

The WinSxS folder

Here is a good article on the WinSxS folder. I recommend everyone read it.

Basically, the WinSxS folder contains all the protected files (under a "WinSxS\Backup" folder) as well as multiple versions of DLLs and library components needed by various applications. In a way this folder is where all the DLLs of the system are actually present. The DLLs that you see at other locations are actually NTFS hard links to the ones in this location. So even though the folder might appear HUGE when viewed through File Explorer, don't be too alarmed, as many files that you think might be taking up space elsewhere are not actually taking up space because they are hard links to the files here. But yes, the WinSxS folder is also huge because it has all those old DLLs and components, and you cannot delete the files in here because you never know what application depends on them. Moreover, you can't even move the folder to a different location as it has to be in this known location for the hard links to work.
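You can see these hard links for yourself – pick a system file and ask fsutil to list them (newer versions of Windows have a hardlink list sub-command); one of the paths it prints should be under WinSxS:

fsutil hardlink list C:\Windows\System32\notepad.exe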

In contrast to the WinSxS folder, the DLLcache folder can be moved anywhere via a registry hack. Also, the DLLcache folder doesn’t keep all the older libraries and such.

The latest versions of SFC can also work against an offline install of Windows.

Here’s SFC on my computer complaining that it was unable to fix some errors:

It is best to view the log file using a tool like trace32 or cmtrace. Here’s a Microsoft KB article on how to use the log file. And here’s a KB article that explains how to manually repair broken files.

Tip

Rather than open the CBS.log in trace32 it is better to filter the SFC bits first as suggested in the above KB articles. Open an elevated command prompt and type this:
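findstr /c:"[SR]" %windir%\Logs\CBS\CBS.log >"%userprofile%\Desktop\sfcdetails.txt"

(That's essentially the command from the KB article – it pulls just the SFC-related [SR] lines out of CBS.log into a file on your Desktop.)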

Open this file (saved on your Desktop) in trace32 and look for errors.

Servicing

Servicing is the act of enabling/ disabling a role/ feature in Windows, and installing/ uninstalling updates and service packs. You can service both currently running and offline installations of Windows (yes, that means you can have an offline copy of Windows on a hard disk or a VHD file and you can enable/ disable features and roles on it as well as install hotfixes and updates (but not service packs) – cool huh!). If you have used DISM (Deployment Image Servicing and Management) in Windows 7 and upwards (or pkgmgr.exe & Co. in Vista) then you have dealt with servicing.
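For example, enabling the same feature on the running OS versus an offline image (the D:\Mount path is just an illustration of wherever you've mounted the image; feature names vary by edition):

dism /Online /Enable-Feature /FeatureName:TelnetClient
dism /Image:D:\Mount /Enable-Feature /FeatureName:TelnetClient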

File Based Servicing

Windows XP and before used to have File Based Servicing. The update or hotfix package usually had an installer (update.exe or hotfix.exe) that updated the files and libraries on the system. If these were system files they were installed/ updated at %WINDIR%\System32 and a copy kept at the file protection cache %WINDIR%\System32\DLLcache (remember from above?). If the system files were in use, a restart would copy the files from %WINDIR%\System32\DLLcache to %WINDIR%\System32. Just in case you needed to rollback an update, a backup of the files that were changed was kept at C:\Windows\$Uninstall$KBnnnnnn (replace "nnnnnn" with the KB number). Life was simple!

Component Based Servicing

Windows Vista introduced Component Based Servicing (CBS). Whereas with File Based Servicing everything was mashed together, now there’s a notion of things being in “components”. So you could have various features of the OS be turned off or on as required (i.e. features and roles). The component itself could be installed to the OS but not active (for example: the files for a DNS server are already installed in a Server install but not activated; when you enable that role, Windows does stuff behind the scenes to activate it). This extends to updates and hotfixes too. For instance, when you install the Remote Server Admin Tools (RSAT) on Windows 7, it installs all the admin tool components but none of these are active by default. All the installer does is just add these components to your system. Later, you go to “Programs and Features” (or use DISM) to enable the components you want. CBS is the future, so that’s what I’ll be focussing on here.

Components

From this blog post:

A component in Windows is one or more binaries, a catalog file, and an XML file that describes everything about how the files should be installed. From associated registry keys and services to what kind security permissions the files should have. Components are grouped into logical units, and these units are used to build the different Windows editions. Each component has a unique name that includes the version, language, and processor architecture that it was built for.

Component Store

Remember the %WINDIR%\WinSxS folder above? That's where all these components are stored. That folder is the Component Store. (As an aside: "SxS" stands for "Side by Side". It is a complete (actually, more than complete) install of Windows that lives side by side with the running installation of Windows). When you install a component in Windows Vista and above, the files are actually stored in this Component Store. Then, if the component feature/ role/ update is activated, hard links are created from locations in the file system to the files here. So, for instance, when you install a server its Component Store will already contain the files for the DNS role; later, when you enable the role, hard links are created from %WINDIR%\System32 and elsewhere to the files in %WINDIR%\WinSxS.

Payloads

Microsoft refers to the files (binaries such as libraries etc) in the WinSxS folder as payloads. So components consist of payloads that are stored in the WinSxS folder and manifests (not sure what they are) that are stored in the WinSxS\manifests folder.

Component Stack

Here's a post from the Microsoft Servicing Guy on CBS. Like we had update.exe on XP and before, now we have trustedinstaller.exe, which is the interface between the servicing stack and user-facing programs such as "Programs and Features", DISM, MSI, and Windows Update. The latter pass on packages (and download them if necessary) to trustedinstaller.exe, which invokes other components of the CBS stack to do the actual work (components such as CSI (Component Servicing Infrastructure), which you can read about in that link).

It is worth pointing out that dependency resolution (i.e. feature Microsoft-Hyper-V-Management-PowerShell requires feature Microsoft-Hyper-V-Management-Clients for instance) is done by the CBS stack. Similarly, the CBS stack is what identifies whether any files required for a feature/ role are already present on the system or need to be downloaded. All this info is passed on to the user-facing programs that interact with the user for further action.

Related folders and locations

Apart from the Component Store here are some other folders and files used by the CBS:

  • %windir%\WinSXS\Manifests – Sub-folder of the Component Store, contains manifests
  • %windir%\Servicing\Packages – A folder that contains the packages of a component. Packages are like components, they contain binaries (the payloads) and manifests (an XML file with the extension .MUM defining the payload as well as the state of the package (installed and enabled, only installed, not installed)). When you run Windows Update, for instance, you download packages that in turn update the components.

    A component might contain many packages. For instance, the Telnet-Client feature has just one package Microsoft-Windows-Telnet-Server-Package~31bf3856ad364e35~amd64~en-US~6.3.9600.16384.mum on my machine, but the HyperV-Client role has more than a dozen packages – Microsoft-Hyper-V-ClientEdition-Package~31bf3856ad364e35~amd64~en-US~6.3.9600.16384.mum being the package when the OS was installed, followed by packages such as Package_1033_for_KB2919355~31bf3856ad364e35~amd64~~6.3.1.14.mum and Package_13_for_KB2903939~31bf3856ad364e35~amd64~~6.3.1.2.mum, etc for various updates that were applied to it. (Remember: In XP and before updates targeted files. Now updates target components. So updates apply to components).

    An update that you install – say KBxxxxxxxx – might have multiple packages with each of these packages targeting different components of the system. The payload in a package is copied to the Component Store; only the .MUM defining the package is left in the %windir%\Servicing\Packages folder. Moreover, each component is updated with details of the package which affects it – this is what we see happening when an update is applied to the OS and Windows takes a long time configuring things. (Remember components are self-sufficient. So it also knows of the updates to it).

    You can get a list of packages installed on your system using the /Get-Packages switch to DISM:
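    dism /Online /Get-Packages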

    To get the same info as a table rather than list (the default):
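    dism /Online /Get-Packages /Format:Table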

  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing – A registry key tree holding a lot of the servicing information.

    For instance, HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\PackageDetect\Microsoft-Hyper-V-ClientEdition-Package~31bf3856ad364e35~amd64~~0.0.0.0 on my machine tells me which packages are a part of that component.

    Note that the above component name doesn’t have a language. It is common to all languages. There are separate keys – such as HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\PackageDetect\Microsoft-Hyper-V-ClientEdition-Package~31bf3856ad364e35~amd64~en-US~0.0.0.0 – which contain packages for that specific language variant of the component.

  • %windir%\WinSXS\Pending.xml – An XML file containing a list of commands and actions to be performed after a reboot for pending updates (i.e. files that couldn’t be modified as they are in use)
  • %windir%\Logs\CBS\CBS.log – The CBS log file.

Here’s a blog post from The Servicing Guy talking about the above locations. Unfortunately, it’s only part 1 as he didn’t get around to writing the follow-ups.

Summary

Here’s a high-level summary of many of the points I touched upon above:

  • Windows Vista and upwards use Component Based Servicing. A component is a self-sufficient unit. It includes binaries (files and libraries) as well as some metadata (XML files) on where the files should be installed, security rights, registry & service changes, etc. In Windows Vista and upwards you think of servicing in terms of components.
  • The files of a component are stored in the Component Store (the WinSxS folder). Everything else you see on the system are actually hard-links to the files in this Component Store.
  • When a component is updated the older files are not removed. They stay where they are; newer versions of any changed files are installed side by side with them, and references to these files from elsewhere are re-pointed (hard-linked) to the newer version. This way any other components or applications that require the older versions can still use them.
  • Components can be thought of as being made up of packages. When you download an update it contains packages. Packages target components. The component metadata is updated so it is aware that such and such package is a part of it. This way even if a component is not currently enabled on the system, it can have update packages added to it, and if the component is ever enabled later it will already have the up-to-date version.
  • Remember you must think of everything in terms of components now. And components are self-sufficient. They know their state and what they do.

Just so you know …

I haven't worked much with CBS except for troubleshooting when I have had problems, or adding/ removing packages and such when I am doing some basic servicing tasks on my machine/ virtual labs. Most of what I explain above is my understanding of things from the registry keys and the folders, supplemented with information I found in blog posts and articles. Take what you read here with a pinch of salt.

Service store corruptions

The Component Store can get corrupted, resulting in errors when installing updates and/ or enabling features. Note: this is not corruption of the system files – which can be fixed via tools such as SFC – but corruption of the Component Store itself.

Windows 8 and later

Windows 8 and upwards can detect and fix such corruptions using DISM /Cleanup-Image (don’t forget to specify /Online for online servicing or /Image:\path\to\install for offline servicing):

  • DISM /Cleanup-Image /ScanHealth will scan the Component Store for errors. It does not fix the error, only scans and updates a marker indicating the Store has errors. Any errors are also logged to the CBS.Log file.
  • DISM /Cleanup-Image /RestoreHealth will scan as above and also fix the error (so it’s better to run this than scan first and then scan & repair).
  • DISM /Cleanup-Image /CheckHealth will check whether there’s any marker indicating the system has errors. It does not do a scan by itself (so there’s no real point to running this, except to quickly check whether any tool has previously set the marker).

If PowerShell is your weapon of choice (yaay to you!), you can use Repair-WindowsImage -ScanHealth | -RestoreHealth | -CheckHealth instead.

If corruptions are determined and you have asked for a repair then Windows Update/ WSUS are contacted for good versions of the components. The /LimitAccess switch can be used to disable this; the /Source switch can be used to specify a source of your own (you must point to the WinSxS folder of a different Windows installation; see this TechNet article for some examples). (Update: WSUS is not a good source so it’s better to use gpedit.msc or GPOs to temporarily specify a Windows Update server, or use the /LimitAccess switch to not contact WU/ WSUS at all and specify a WinSxS folder to use).

Example:
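Dism /Online /Cleanup-Image /RestoreHealth /Source:E:\Mount\Windows\WinSxS /LimitAccess

(The E:\Mount path is only an example – point /Source at the WinSxS folder of a known-good Windows installation you have mounted; /LimitAccess stops DISM from also going to Windows Update/ WSUS.)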

Windows 7 and prior

Windows 7 and below use a Microsoft tool called CheckSUR (Check System Update Readiness).

Here’s a blog post on using CheckSUR to repair corruption. Note that CheckSUR can only repair manifests while DISM /Cleanup-Image can do both manifests and payloads.

Managing the component store size

The Component Store will keep growing as more updates are installed to a system. One way to reduce the size is to tell Windows to remove all payloads from older Service Packs. For instance, say the machine began life as Windows 7, then had updates applied to it, then a Service Pack. You know you will never uninstall this Service Pack so you are happy with removing all the older payloads from WinSxS and essentially tell the OS that from now on it must consider itself as Windows 7 Service Pack 1 for good – there’s no going back.

Here’s how you can do that for the various versions of Windows:

  • Windows Vista Service Pack 1 uses a tool called VSP1CLN.EXE to do this.
  • Windows Vista Service Pack 2 and Windows Server 2008 SP2 use a tool called Compcln.exe
  • Windows 7 Service Pack 1, Windows Server 2008 R2 Service Pack 1, and above use DISM /online /Cleanup-Image /SpSuperseded (for Windows 7 Service Pack 1 with update KB2852386 you can also use the Disk Cleanup Wizard (cleanmgr.exe)).

Automatic scavenging

Windows 7 and above also automatically do scavenging to remove unused components from the Component Store. Windows Vista and prior do scavenging on a removal event, so you can add and remove a feature to force a scavenging.

Windows 8 has a scheduled task StartComponentCleanup that automatically cleans up unused components. It waits 30 days after a component has been updated before removing previous versions (so you have 30 days to rollback to a previous version of the update). You can run this task manually too:
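schtasks.exe /Run /TN "\Microsoft\Windows\Servicing\StartComponentCleanup"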

Check this blog post for some screenshots.

Windows 8.1 and Server 2012 R2 extras

Windows 8.1 and Windows Server 2012 R2 include a new DISM switch to analyze the Component Store and output whether a clean up can be made.
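That switch is /AnalyzeComponentStore:

Dism /Online /Cleanup-Image /AnalyzeComponentStore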

The clean up can be done automatically, or run manually via the Task Scheduler entry mentioned previously. DISM too has a new switch which does the same (but doesn't follow the 30 day rule like the scheduled task, so it is more aggressive).
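That switch is /StartComponentCleanup:

Dism /Online /Cleanup-Image /StartComponentCleanup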

The above switch is available in Windows 8 and Windows Server 2012 too.

Note that the scavenging options above (Windows 7 and up) only remove previous versions of the components. They are not as aggressive as the options to re-base the OS to a particular service pack. Once the previous versions of the components are removed you cannot rollback to those older versions but you can still uninstall the update you are on.

Windows 8.1 and Server 2012 R2 introduce a new switch that lets you re-base to whatever state you are in now. This is the most aggressive option of all as it makes the state of the system permanent. You cannot uninstall any of the existing updates after you rebase (any newer updates made hence can be uninstalled) – the state your system is in now will be frozen and become its new state. This new switch is called /ResetBase:
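Dism /Online /Cleanup-Image /StartComponentCleanup /ResetBase

(You tack it on to the /StartComponentCleanup switch from above.)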

Windows Server 2012 and Server 2012 R2 extras

Windows Server 2012 introduces the concept of “features on demand”.

Remember I had said that by default all the payloads for all the features in Windows are present in the WinSxS folder. When you enable/ disable features you are merely creating hard links to the files in WinSxS. What this means is that even though your Server 2012 install may not use a feature, its files are still present and taking up space in the WinSxS folder. Starting with Server 2012 you can now completely remove a feature (rather than just disable it) so that its files are deleted from the WinSxS folder.

Of course, once you remove the files, if you want to enable the feature later you must also specify a source from where they can be obtained. Check this blog post for more info.
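A rough sketch of both directions using the Server Manager cmdlets (the feature name and source path here are only examples):

Uninstall-WindowsFeature -Name Telnet-Client -Remove                 # deletes the payload from WinSxS rather than just disabling the feature
Install-WindowsFeature -Name Telnet-Client -Source D:\Sources\SxS    # bring it back later by pointing at a source (install media, a share, etc)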

Update: A Feb 2015 post from the PFE team that goes into pretty much the same stuff I mention above in terms of reducing the WinSxS folder size.

Year Three: rakhesh.com

Today marks 2 years since I booked the domain (port25.io, no longer active) where this blog began life. I began posting 10 days later, on 21st November 2012. But that was just an introductory post I think, as the current oldest post on this blog is from 2nd December 2012. When I changed blog URLs I moved that introductory post to the Changelog section. Coincidentally, this post you are reading now also marks the 200th post. :)

This blog has moved on from its original goal of blogging about Exchange to now blogging about movies, thoughts, and whatever techie thing I am currently working on. It began as an outlet I could (hopefully) use to explain things to others. But it has moved to being a personal notebook and bookmarks store – most of my posts are like notes to my future self, posts I can refer to to refresh myself on something I may have forgotten or to just look up some command or code snippet that I used to solve a particular task. Added to that, most of my posts have links to other blogs and articles – links that do a much better job of explaining the concepts – so I can refer to these links too rather than search through my bookmarks. In that sense both the topics and the style/ purpose of this blog have evolved from its beginnings. Not that I am complaining – I like where it's heading!

Anyways, just thought I must put up a post marking this day. And write a paragraph or two in case it helps anyone else who is on the fence regarding starting a blog. My suggestion would be to just get something started. It’s a good way for the world and yourself to know what you have been up to. Sure there’s tons of excellent blogs out there so it might seem like you have nothing new to add to the pool – and while you may be correct in thinking that, I’d say it’s still a good idea to put your thoughts too out there. Maybe your way of explaining will make better sense to people. Maybe in the process of blogging about what you are learning/ doing you will get a better understanding yourself. Who knows! Give it a shot, and then back off if you have to. This blog too for instance has many weeks when I barely post anything – because I am not doing anything or I am not in the mood to write – and then I think of shutting it down for good. But usually I hold off, and that works out well because when I am back to doing something or I am in the mood to write I have a place to put it down. And then on a day like today when I look back at the posts I made over the past two years I get a kick out of it – wow I have actually worked on and done a lot of things! Who knew!

I guess I Blog, therefore I Am and that’s one good reason to keep blogging. For yourself.

Notes on Windows RE and BCD

Windows RE

Windows RE (Recovery Environment) is a recovery environment that you boot into when your Windows installation is broken. When the Windows boot loader realizes your Windows installation is broken it will automatically boot into Windows RE. (During boot up the Windows boot loader sets a flag indicating the boot process has started. When the OS loads it clears this flag. If the OS doesn't load and the computer reboots, the boot loader sees the already set flag and knows there's a problem. A side effect of this is a scenario where the OS starts to load but the machine loses power and so the flag isn't cleared; later, when power returns and the machine is turned on, the boot loader notices the flag and loads Windows RE as it thinks the OS is broken).

Screenshots

You can manually boot into Windows RE by pressing F8 and selecting “Repair your computer” from the options menu.

winre-1

The Windows RE menu.

winre-2

Apart from continuing the boot process into the installed OS, you can also power off the computer, boot into a USB drive or network connection, or do further troubleshooting. The above screenshot is from a Windows Server 2012 install. Windows 8 has a similar UI, but Windows 7 (and Windows Server 2008 and Windows Vista) have a different UI (with similar functionality).

Selecting “Troubleshoot” shows the following “Advanced options”:

winre-5

The startup settings can be changed here, or a command prompt window launched for further troubleshooting.

winre-3

It is also possible to re-image the computer from a recovery image. The recovery image can be on a DVD, an external hard drive, or a Recovery Image partition. It is also possible to store your own recovery image to this partition.

winre-4

Location of Windows RE

Windows RE itself is based on Windows PE and is stored as a WIM file. This means you can customize Windows RE by adding additional languages, tools, and drivers. You can even add one custom tool to the “Troubleshoot” menu. On BIOS systems the Windows RE WIM file is stored in the (hidden) system partition. On UEFI systems it is stored in the Windows RE tools partition.

The system partition/ Windows RE tools partition has a folder \Recovery\WindowsRE that contains the WIM file winre.wim and a configuration file ReAgent.xml. On the installed system the \Windows\System32\Recovery\ folder has a ReAgent.xml which is a copy of the file in the system partition/ Windows RE tools partition. This copy must be present and have correct entries. Also, for BIOS systems, the system partition must be set as active (and it has an MBR ID of 27 which marks it as a system partition).

Notice the “WinreBCD” ID number in the XML file. Its significance will be made clear later (in the section on BCD).

Managing Windows RE

Windows RE can be managed using the \Windows\System32\ReAgentC.exe tool. This tool can manage the RE of the currently running OS and, for some options, even that of an offline OS. More information on the ReAgentC.exe command can be found at this TechNet article. Here are some of the things ReAgentC can do:

  • ReAgentC /enable enables Windows RE. ReAgentC /disable disables Windows RE.

    Both these switches work only against the currently running OS – i.e. you cannot make changes to an offline image. You can, however, boot into Windows PE and enable Windows RE for the OS installed on that computer. For this you'll need the BCD GUID of the OS (get this via bcdedit /enum /v or bcdedit /store R:\Boot\BCD /enum /v where R:\Boot\BCD is the path to the BCD store – this is usually the system partition for BIOS or the EFI System Partition (ESP) for UEFI (it doesn't have a drive letter so you have to mount it manually)). Once you have that, run the command as: ReAgentC /enable /osguid {603c0be6-5c91-11e3-8c88-8f43aa31e915}

    The /enable option requires \Windows\System32\Recovery\ (on the OS partition) to be present and have correct entries.

  • ReAgentC /BootToRE tells the boot loader to boot into Windows RE the next time this computer reboots. This too only works against the currently running OS – you cannot make changes to an offline image.
  • ReAgentC /info gives the status of Windows RE for the currently running OS. Add the switch /target E:\Windows to get info for the OS installed on the E: drive (which could be a partition on the disk or something you've mounted manually).
  • ReAgentc.exe /SetREimage /path R:\Recovery\WindowsRE\ tells the currently running OS that its Windows RE is at the specified path. In the example, R:\Recovery\WindowsRE would be the system partition or Windows RE tools partition that you'll have mounted manually, and this path contains the winre.wim file. As before, add the switch /target E:\Windows to set the recovery image for the OS installed on the E: drive.

Operation failed: 3

On my system ReAgentC was working fine until a few days ago but is now giving the following error:

I suspect I must have borked it somehow while making changes for my previous post on Hyper-V but I can't find anything to indicate a problem. Assuming I manage to fix it some time, I'll post about it later.

BCD

I think it’s a good idea to talk about BCD when talking about Windows RE. The BCD is how the boot loader knows where to find Windows RE, and if the BCD entries for Windows RE are messed up it won’t work as expected.

BCD stands for Boot Configuration Data and it’s the Vista and upwards equivalent of boot.ini which we used to have in the XP and prior days.

Boot process difference between Windows XP (and prior) vs Windows Vista (and later)

Windows XP, Windows Server 2003, and Windows 2000 had three files that were related to the boot process:

  • NTLDR (NT Loader) – which was the boot manager and boot loader, usually installed to the MBR (or to the PBR and chainloaded if you had GRUB and such in the MBR)
  • NTdetect.com – which was responsible for detecting the hardware and passing this info to NTLDR
  • BOOT.INI – a text file which contained the boot configuration (which partitions had which OS, how long to wait before booting, any kernel switches to pass on, etc) and was usually present along with NTLDR

From Windows Vista and up these are replaced with a new set of files:

  • BootMgr (Windows Boot Manager) – a boot manager that is responsible for showing the boot options to the user and loading the available OSes. Under XP and prior this functionality was provided by NTLDR (which also loaded the OS) but now it's a separate program of its own. While NTLDR used to read its options from the BOOT.INI file, BootMgr reads its options from the BCD store.
  • BCD (Boot Configuration Data) – a binary file which replaces BOOT.INI and now contains the boot configuration data. This file has the same format as the Windows registry, and in fact once the OS is up and running the BCD is loaded under HKEY_LOCAL_MACHINE\BCD00000000.

    The BCD is a binary file that's stored in the EFI System Partition (ESP) on UEFI systems or in the system partition on BIOS systems, under the \Boot folder (it's a hidden system file so not visible by default). Being a binary file (unlike BOOT.INI which is a text file), the entries in it can't be managed via notepad or any text editor. One has to use the BCDEdit.exe tool that's part of Windows or a third-party tool such as EasyBCD.

  • winload.exe – I mentioned earlier that the boot manager functionality of NTLDR is now taken up by BootMgr. What remains is the boot loader functionality – the task of actually loading the kernel and drivers from disk – and that is now taken care of by winload.exe. In addition, winload.exe also does the hardware detection stuff that was previously done by NTdetect.com.

Vista: the misunderstood Windows

I think this is a good place to mention that while Windows Vista may have been a derided release from a consumer point of view, it was actually a very important release in terms of laying the foundations for future versions of Windows.

Once upon a time we had MS-DOS and Windows 3.x and Windows 95, 98, ME. These had a common set of technologies. Then there was Windows NT, which was different from these.

Windows 2000 "married" Windows NT and Windows ME. It laid a new foundation upon which later OSes such as Windows XP and Windows Server 2003 were based. All of these are based on Windows NT and have a common set of technologies. If you know one of these, you can work your way around the others through a bit of trial and error. Some features may be added or missing, but more or less you can figure things out.

Then came Windows Vista and Server 2008. While these are still similar to Windows XP and Windows Server 2003, they are very different too in a lot of ways. Windows Vista and Server 2008 laid the foundations for changes that were further refined in Windows 7, Windows 8, Server 2008 R2, and so on. For instance changes such as WIM files, the boot process, UAC, deployment tools, CBS (Component Based Servicing), and so on. If the only thing you have worked on is Windows XP sure you can get around a bit with Windows Vista or 7, but as you start going deeper into things you’ll realize a lot of things are way different.

Back during the BOOT.INI days you specified disks and partitions in terms of numbers. The BIOS assigned numbers to disks and the BOOT.INI file had entries such as multi(0)disk(0)rdisk(0)partition(1)\WINDOWS which specified the Windows folder on a partition (in this case the 1st partition of the 1st disk) that was to be booted. This was simple and mostly did the trick, except for when you moved disks around or added/ deleted partitions. Then the entry would be out of date and the boot process would fail.

BCD does away with all this.

BCD uses the disk’s GPT identifier or MBR signature to identify the disk (so changing the order of disks won’t affect the boot process any more). Further, each boot entry is an object in the BCD file and these objects have unique GUIDs. (These are the objects I showed through the bcdedit.exe /enum all command above). The object contains the disk signature as well as the partition offset (the sector from where the partition starts on that disk) where it’s supposed to boot from. Thus to boot any entry all BootMgr needs to do is scan the connected disks for the one with the matching signature and then find the partition specified by the offset. This makes BCD independent of the disk numbers assigned by BIOS and it is unaffected by changes made to the order of disks.

A downside of BCD is that while with BOOT.INI one could move the OS to a different disk with the same partitioning and hope for it to boot, that won’t do with BCD as the disk signatures won’t match. BootMgr will scan for the disk signature in the BCD object, not find it, and complain that it cannot find the boot device and/ or winload.exe. (This is not a big deal because BCDEdit can be used to fix the record but it’s something to keep in mind).

Here’s the output from BCDEdit on my machine. There’s two sets of output here – one with a /v switch, the other without.
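The commands themselves are simply:

bcdedit /enum
bcdedit /enum /v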

Couple of things to note here.

First, notice what I meant about each entry being an “object”. As you can see each entry has properties and values – unlike in BOOT.INI days where everything was on a single line with spaces between options.

Second, the /enum switch shows all the active entries in BCD but by default skips the GUID for objects that are universal or known. For instance, the GUID for the boot manager is always {9dea862c-5cdd-4e70-acc1-f32b344d4795} so it replaces that with {bootmgr} in the output. Similarly it replaces the GUID for the currently loaded OS – which isn't universal, but is known as it's the currently loaded one – with {current}. BCDEdit does this to make it easier for the end user to read the output and/ or to refer to these objects when making changes. If you don't want such "friendly" output use the /v switch like I did in the second case above.

The registry stores the objects as GUIDs. So if I were to take the GUID of the currently running system from the output above and look at the registry I’ll see similar details:

Going back to the BCDEdit output, if we compare the device entries for the {bootmgr} and {current} entries we can see it's represented as partition=\Device\HarddiskVolume1 for the {bootmgr} entry and the friendlier drive letter version partition=C: for the {current} entry (because the partition has a drive letter). BCD starts counting volumes from 1, so \Device\HarddiskVolume1 refers to the first volume across all the disks on the computer. This is worth emphasising. The \Device\HarddiskVolumeNN representation is not how BCD stores the data internally. Internally BCD uses the disk signature and offset as mentioned earlier, but when displaying to the end-user it uses a friendlier format like \Device\HarddiskVolume1 or a drive letter.

If we compare the registry output above to the corresponding BCD output we can see the partition+disk information represented differently.

Another thing worth noting with the BCDEdit output is that it classifies the output. The first entry is BOOTMGR so it puts it under the "Windows Boot Manager" section. Subsequent entries are boot loaders so they are put under "Windows Boot Loader". There's only one active entry on my system but if I had more entries they too would appear here.

Note that I said there’s only one active entry in my system. There are actually many more entries but these are not active. For instance, there’s an entry to boot into Windows RE but that’s not shown by default. To see all these other entries the /enum switch takes various parameters. For example: /enum osloader shows all OS loading entries, /enum bootmgr shows BOOTMGR, /enum resume shows hibernation resume entries, and so on. To show every entry in the BCD use the switch /enum all (and to see what other options are present do /enum /? to get help).

Notice the Windows RE entry above. And notice that its GUID matches that in the ReAgent.xml file of Windows RE.

On my machine I had one more entry initially:

This is an incorrect entry because the GUID of this entry doesn’t match the Windows RE GUID in the ReAgent.xml file so I deleted it:

Speaking of Windows RE, one of the things we can do from Windows RE (and only from Windows RE!) is repair the MBR, boot sector, and BCD with a tool called Bootrec. To fix just the boot sector/ MBR there's also a tool called bootsect, which is available in Windows 8 and above (or Windows PE in the case of Windows 7). This tool can rewrite the boot sector (and, with the /mbr switch, the MBR) with BOOTMGR or NTLDR compatible code and is often useful for fixing unbootable systems.

Another useful tool to be aware of is BCDBoot. This tool is used to create a new BCD store and/ or install the boot loader and related files. I used this tool in previous posts to install the UEFI bootloader and the BIOS bootloader.

Before I conclude I'd like to link to three posts by Mark Minasi on BCD. They cover similar material to what I did above but I feel they are better presented (they talk about the various switches, for instance, whereas I just mention them in passing):

Finally, BCDEdit too supports options like you could set in BOOT.INI (for example: use a standard VGA driver, disable/ enable PAE, disable/ enable DEP). You set these options via the bcdedit /set {GUID} ... switch, wherein {GUID} is the ID of the boot entry you want to make the settings on and ... is replaced with the options you want to change. See this MSDN article for more information on the options and how to set them. Common BOOT.INI settings and their new equivalents can be found at this MSDN article.

That’s all for now!

Down the rabbit hole

Ever had this feeling that when you want to do one particular thing, a whole lot of other things keep coming into the picture leading you to other distracting paths?

For about a week now I’ve been meaning to write some posts about my Active Directory workshop. In a typical me fashion, I thought I’d set up some VMs and stuff on my laptop. This being a different laptop to my usual one, I thought of using Hyper-V. And then I thought why not use differencing VHDs to save space. And then I thought why not use a Gen 2 VM. Which doesn’t work so I went on a tangent reading about UEFI’s boot process and writing a blog post on that. Then I went into making an answer file to use while installing, went into refreshing myself on the PowerShell cmdlets I can use to do the initial configuring of Server Core 2012, made a little script to take care of that for multiple servers, and so on …

Finally I got around to installing a member server yesterday. Thought this would be easy – I know all the steps from before, just that I have to use a Server 2012 GUI WIM instead of a Core WIM. But nope! Now the ReAgentC.exe command on my computer doesn't work! It worked till about 3 days ago but has now suddenly stopped working – so irritating! Of course, I could skip the WinRE partition – not that I use it anyways! – or just use a Gen 1 VM, but that just isn't me. I don't like to give up or backtrack from a problem. Every one of these is a learning opportunity, because now I am reading about Component Based Servicing, the Windows Recovery Environment, and learning about new DISM cleanup options that I wasn't even aware of. But the problem is one of balance. I can't afford to lose myself too much in learning new things because I'll soon lose sight of the original goal of making Active Directory related posts.

It’s exciting though! And this is what I like and dislike about embarking on a project like this (writing Active Directory related posts). I like stumbling upon new issues and learning new things and working through them; but I dislike having to be on guard so I don’t go too deep down the hole and lose sight of what I had set out to do.

Here’s a snapshot of where I am now:

workflowy

It’s from WorkFlowy, a tool that I use to keep track of such stuff. I could write a blog post raving about it but I’ll just point you to this excellent review by Farhad Manjoo instead.

Downloading Trace32 and CMTrace for easy log file reading

I was working with some log files recently (C:\Windows\Logs\cbs\CBS.log to be precise, to troubleshoot an issue I am having on my laptop, which I hope to sort soon and write a blog post about). Initially I was opening the file in Notepad but that isn't a great way of going through log files. Then I remembered that at work I use Trace32 from the SCCM 2007 Toolkit. So I downloaded it from Microsoft. Then I learnt Trace32's been replaced with a tool called CMTrace in SCCM 2012 R2.

Here’s links to both the toolkits:

For the 2007 toolkit when installing choose the option to only install the Common Tools and skip the rest. That will install only Trace32 at C:\Program Files (x86)\ConfigMgr 2007 Toolkit V2 (add this to your PATH variable for ease of access).

2007-toolkit

For the 2012 R2 toolkit choose the option to install only the Client Tools and skip the rest. That will install CMTrace and a few other tools at C:\Program Files (x86)\ConfigMgr 2012 Toolkit R2\ClientTools (add this too to your PATH variable).

2012-toolkit

That’s all! Happy troubleshooting!

Tip: View hidden files and folders in PowerShell

Just as a reference to my future self …

To view hidden files & folders in a directory via PowerShell use the -Force switch with Get-ChildItem:
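Get-ChildItem -Force    # e.g. Get-ChildItem C:\ -Force; works via the dir/ ls aliases too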

Hyper-V differencing disks with an answer file

If you follow the differencing VHD approach from my previous posts (this & this) you’ll notice the boot process starts off by getting the devices ready, does a reboot or two, and then you are taken to a prompt to set the Administrator password.

Do that and you are set. There’s no other prompting in terms of selecting the image, partitioning etc (because we have bypassed all these stages of the install process).

Would be good if I could somehow specify the admin password and the server name automatically – say via an answer file. That'll take care of the two basic things I always do anyway. My admin password is common for all the machines, and the server name is the same as the VM name, so these can be figured out automatically and used with an answer file.

The proper way to create an answer file is too much work and not really needed here. So I Googled for answer files, found one, removed most of it as it was unnecessary, and the result is something like this:

If you replace the text marked with –REPLACE– with the computer name, and save this to the c:\ of a newly created VM, the password and computer name will be automatically set for you!

So here’s what I do in addition to the steps before.

Create the differencing VHD as usual

Save the XML file above as "Unattend.xml". Doesn't matter where you save it; I'll assume it's in my current directory. If it is saved anyplace else replace the path accordingly in the second cmdlet below.

Mount the VHD, copy the answer file over replacing the machine name with what you want, dismount the VHD. Finito!
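In PowerShell that step looks something like this – a minimal sketch that assumes the VHD has a single partition, and the path, VM name, and placeholder text are made up:

$drive = (Mount-VHD -Path D:\VMs\WIN-DC04.vhdx -Passthru | Get-Disk | Get-Partition | Get-Volume).DriveLetter
(Get-Content .\Unattend.xml) -replace '--REPLACE--', 'WIN-DC04' | Set-Content "${drive}:\Unattend.xml"
Dismount-VHD -Path D:\VMs\WIN-DC04.vhdx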

That’s it really.

A different way to manipulate the XML file

I used the -replace operator above to make changes to the XML file. But I can do things differently too as PowerShell understands XML natively.
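Something along these lines (a sketch that assumes a minimal answer file where the node path below is unambiguous – adjust it to match your file):

[xml]$xml = Get-Content .\Unattend.xml
$xml.unattend.settings.component.ComputerName = 'WIN-DC04'    # hypothetical node path – navigate to wherever ComputerName lives in your file
$xml.Save("${drive}:\Unattend.xml")                           # $drive being the mounted VHD's drive letter from earlier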

Three cmdlets instead of one, but this might feel “neater”.

Notes of UEFI, GPT, UEFI boot process, disk partitions, and Hyper-V differencing disks with a Generation 2 VM

In my previous post I had talked about creating differencing VHDs for use with Hyper-V. While making that post I realized that what I was doing doesn't work with Generation 2 VMs. Investigating a bit into that brought me to my old friend UEFI. I say "old friend" because UEFI is something I have been reading about off and on for the past few months – mainly due to my interest in encryption. For instance, my laptop with Self Encrypting SSDs can only be managed by BitLocker if I install Windows 8 in UEFI mode. By default it had installed in BIOS mode (and was continuing to when I re-installed) so a few months ago I read about UEFI and figured out how to install Windows 8 on that laptop in UEFI mode.

Then at work we started getting UEFI computers and so I spent some time going through the firmware on those computers just to get a hang of UEFI.

And then last month I bought a Notion Ink Cain tablet, and to get encryption working on it I had to enable Secure Boot (which is a part of UEFI) so once again I found myself reading about UEFI. That was a fun exercise (and something I am yet to post about) so I have been meaning to write about UEFI for a while just that I never got around to it. Since I stumbled upon UEFI again today, might as well do so now.

So what is UEFI? Simply put UEFI is a firmware specification that’s meant to replace BIOS. Most modern laptops and desktops come with UEFI but it looks and behaves like BIOS so you might not notice the difference until you delve in. In this post I’ll focus on the boot process of BIOS and UEFI as that’s what I am interested in.

BIOS boot process

With BIOS you have an MBR (Master Boot Record). In BIOS you specify the boot order of disks, and each of these disks is searched for the MBR by BIOS. The MBR is the first sector of a disk and it contains information on the partitions in the disk as well as a special program (called a “boot loader”) which can load OSes from these partitions. Since the MBR is at a standard location the BIOS can pass control to the boot loader located there. The BIOS doesn’t need to know anything about the OSes or their file systems – things are dumb & simple.

BIOS has limitations in terms of the size of disks it can work with, the limited space available to the boot loader (because of which you have to use quirks like “chain loaders” and such), and so on. BIOS is good, but its time has come … its replacement is UEFI.

What is UEFI?

BIOS stands for “Basic Input/ Output System”. UEFI stands for “Unified Extensible Firmware Interface”. UEFI began as EFI, and was developed by Intel but is now managed by the UEFI Forum. Both BIOS and UEFI aren’t a specific piece of software. Rather, they are specifications that define the interface between the firmware and OS. The UEFI specification is more managed. There are many versions of the specification, with each version adding more capabilities. For instance, version 2.2 added the Secure Boot protocol stuff. Version 2.1 added cryptography stuff. As of this writing UEFI is at version 2.4.

In contrast, BIOS doesn’t have a specification as such. Various BIOS implementations have their own feature set and there’s no standard.

For backward compatibility UEFI can behave like BIOS. The UEFI specification defines a Compatibility Support Module (CSM) which can emulate BIOS. Bear in mind, it is still UEFI firmware, just that it behaves like BIOS firmware without any of the additional UEFI features or advantages. You can’t have both UEFI and BIOS on a computer – only one of them is present, after all they are both firmware!

UEFI classes

The UEFI forum defines four classes for computers:

  1. Class 0 – The computer has no UEFI, only BIOS.
  2. Class 1 – The computer has UEFI with CSM only. So it has UEFI but behaves in a BIOS compatible mode.
  3. Class 2 – The computer has UEFI and CSM. So it can behave as BIOS compatible mode if need be.
  4. Class 3 – The computer has UEFI only, no CSM.

It’s important to be aware of what class your computer is. Hyper-V Generation 2 VMs, for instance, behave as Class 3 computers. They have no CSM. (Moreover Hyper-V Generation 2 does not have a 32-bit implementation of UEFI so only 64-bit guest OSes are supported).

UEFI and GPT

UEFI has a different boot process to BIOS. For starters, it doesn’t use the MBR. UEFI uses a newer partitioning scheme called GPT (GUID Partition Table) that doesn’t have many of MBRs limitations.

If your disk partitioning is MBR and your system has UEFI firmware, it will boot but in CSM mode. So be sure to choose GPT partitioning if you want to use UEFI without CSM.

Also, even though your machine has UEFI, when trying to install Windows it might boot the Windows installer in CSM mode. When you press F9 or whatever key to select the boot media, usually there’s an option which lets you boot in UEFI mode or BIOS/ CSM mode. Sometimes the option isn’t explicit and if the boot media has both UEFI and BIOS boot files, the wrong one may be chosen and UEFI will behave in CSM mode. It is possible to detect which mode Windows PE (which runs during Windows install) is running in. It is also possible to force the install media to boot in UEFI or CSM mode by deleting the boot files of the mode you don’t want.

My laptop, for instance, is UEFI. But each time I'd install Windows 8 onto it, it would pick up the BIOS boot loader files and boot in CSM mode. Since I wanted to use UEFI for some of its features, I used Rufus to create a bootable USB of the media (be sure to select "GPT partitioning for UEFI computers") and when I booted from it Windows installed in UEFI mode. The trick isn't the GPT partitioning. The trick is that by telling Rufus we want to boot on a UEFI computer, it omits the BIOS-specific boot loader files from the USB. It is not necessary to use Rufus – the process can be done manually too.

UEFI and GPT work with both 32-bit and 64-bit Windows. The catch is that booting from GPT is only supported for 64-bit Windows running on UEFI. So while you can have 32-bit Windows running on UEFI, it will need an MBR partition to boot from. What this means is that such a system will be running as UEFI Class 2, as that's the only class which supports UEFI with MBR partitions (essentially the system has UEFI but behaves in BIOS-compatible mode).

UEFI classes and MBR/GPT partitioning

With Windows you can use MBR or GPT partitions on your computer depending on its class. From this Microsoft page:

  • UEFI Class 0 – Uses MBR partitions.
  • UEFI Class 1 – Uses GPT partitions.
  • UEFI Class 2 – Uses GPT partitions. This class of UEFI support includes CSM so if MBR partitions are present UEFI will run in compatibility mode.
  • UEFI Class 3 – Uses GPT partitions.

I am not clear why Class 1 only uses GPT partitions. Considering Class 1 is UEFI with CSM only and CSM supports MBR, I would have thought Class 1 supports only MBR partitions.

UEFI boot process

The UEFI boot process is more involved than the BIOS one. That doesn't mean it's difficult to understand or unnecessarily complicated; what I mean is that it isn't as simple as having an MBR with a boot loader, as in the case of BIOS. You can't pass along a VHD file created with BIOS in mind to a machine that has only UEFI and expect it to work (as was my case). You need to tweak things so the boot process works with UEFI.

An excellent blog post on the UEFI boot process is this. If you have the time and inclination, go read it! You will be glad you did. What follows are my notes from that post and some others.

  • The UEFI specification defines a type of executable (think .exe files) that all UEFI firmware must support. Each OS that wants the UEFI firmware to be able to boot it provides a boot loader of this type. That's it. The OS provides such a boot loader, UEFI loads it.
  • In BIOS the boot loader was present in the MBR. Where does it live in UEFI? In order not to be limited by space the way BIOS was, UEFI defines a special partition where boot loaders can be stored. The partition doesn't have to be of a specific size or at a specific location. The spec requires that all UEFI firmware be able to read a variant of the FAT file system defined in the spec (UEFI firmware can read other file system types too if it so wishes, but support for this FAT variant is a must). So UEFI boot loaders are stored in a special partition formatted with that FAT variant, and to mark it as the special partition it has a different partition type – i.e. it doesn't say FAT32 or NTFS or EXT2FS etc., it says ESP (EFI System Partition). Simple! (Oh, and there can be multiple ESP partitions too if you so wish!)

The above design makes UEFI much more reliable than BIOS. Whereas with the latter you could only store a very limited boot loader at a specific space on the disk – and that boot loader usually chain loaded the OSes – with UEFI you can store boot loaders (in the EFI executable format) of each OS in the ESP partition that’s of file system type FAT (the variant defined by UEFI). Already you have a lot more flexibility compared to BIOS.

To tie all these together UEFI has a boot manager. The boot manager is what looks at all the boot loader entries and creates a menu for booting them. The menu isn’t a static one – the firmware can create a menu on the fly based on boot loaders present across multiple disks attached to the computer. And this boot manager can be managed by tools in the installed OS too. (Sure you could do similar things with Linux boot loaders such as GRUB, but the neat thing here is that the functionality is provided by the firmware – independent of the OS – which is really where it should be! It’s because BIOS was so limited that we had fancy boot loaders like GRUB that worked around it).

If you go down to the section entitled “The UEFI boot manager” in the post I linked to earlier you’ll see an example of a boot manager output. No point me paraphrasing what the author has said, so best to go and check there. I’ll mention one interesting point though:

  • Remember I said there are ESP partitions and they contain the OS boot loaders? So, for instance, you could have an UEFI boot manager entry like HD(1,800,61800,6d98f360-cb3e-4727-8fed-5ce0c040365d)File(\EFI\fedora\grubx64.efi) which points to the partition called HD(1,800,61800,6d98f360-cb3e-4727-8fed-5ce0c040365d) (the naming convention follows the EFI_DEVICE_PATH_PROTOCOL specification) and specifically the \EFI\fedora\grubx64.efi file as the boot loader.
  • What you can also have, however, is a generic entry such as HD(2,0,00). Note there's no boot loader specified here, and probably no specific ESP partition either. What happens in such cases is that the boot manager will go through each ESP partition on that disk, check for a file \EFI\BOOT\BOOT{machine type short-name}.EFI, and try loading that. This way the UEFI spec allows one to boot from a hard disk without specifying the OS or the path to the boot loader, as long as the disk contains a "default" boot loader following the naming convention above. This is what happens, for instance, when you boot a Windows 8 DVD. If you put such a DVD in your computer and check, you'll see the root folder has a folder called EFI, which contains a sub-folder called BOOT, which in turn contains a file called bootx64.efi.

Another example and screenshot of the UEFI boot manager can be found at this link.
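
If you're curious what the ESP looks like on a machine you already have, Windows can mount it for you. From an elevated PowerShell prompt on a UEFI-booted machine, something like this works (S: is just a free drive letter I picked):

mountvol S: /S                    # mount the EFI System Partition as S:
Get-ChildItem S:\EFI -Recurse     # browse the boot loaders stored on it
mountvol S: /D                    # remove the drive letter again when done

On a typical Windows install you'll find the Microsoft boot manager in there at \EFI\Microsoft\Boot\bootmgfw.efi.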

Tying this in with my WIM to VHD case

If you have read this far, it's obvious what's wrong with my VHD file. When the Gen 2 VM boots up – and it uses UEFI as it's a Gen 2 VM – it will look for an ESP partition with the UEFI boot loader but won't find any (as my VHD has only one partition, and that too of type NTFS). So what I need to do is create an ESP partition and copy the boot loaders to it as required. Also, I am using MBR-style partitioning and a Gen 2 VM's firmware is Class 3, so I must switch to GPT.

In fact, while I am at it, why don't I partition everything properly? When I install Windows manually (server or desktop) it creates several partitions, so this looks like a good opportunity to read up on the Windows partitioning scheme and create any other required partitions on my base disk.

Understanding (GPT/UEFI) disk partitions for Windows

There are three Microsoft pages I referred to for this – read those for more details than what I post below.

The following partitions are required:

  • System partition: This is the EFI System Partition (ESP). Minimum size of the partition is 100 MB, FAT32 formatted. For Windows, the ESP contains the NTLDR, HAL, and other files and drivers required to boot the system. The partition GUID for ESP is DEFINE_GUID (PARTITION_SYSTEM_GUID, 0xC12A7328L, 0xF81F, 0x11D2, 0xBA, 0x4B, 0x00, 0xA0, 0xC9, 0x3E, 0xC9, 0x3B) (on an MBR partition the ID is 0xEF; but remember, Windows doesn’t support booting into UEFI mode from MBR partitions). The type of this partition is c12a7328-f81f-11d2-ba4b-00a0c93ec93b. Windows does not support having two ESPs on a single disk.
  • Microsoft Reserved Partition (MSR): Whereas with BIOS/ MBR one could have hidden sectors, UEFI does away with all that. So Microsoft recommends a reserved partition be set aside instead of such hidden sectors. The size of this partition is 128 MB (for drives larger than 16GB; else the size is 32 MB). It does not have any data – think of the MSR as free space set aside for future use by Windows – it is used when any disk operations require extra space and/ or partitions and they can’t use the existing space and/ or partitions. The partition GUID for MSR is DEFINE_GUID (PARTITION_MSFT_RESERVED_GUID, 0xE3C9E316L, 0x0B5C, 0x4DB8, 0x81, 0x7D, 0xF9, 0x2D, 0xF0, 0x02, 0x15, 0xAE). The type of this partition is e3c9e316-0b5c-4db8-817d-f92df00215ae.

The order of these partitions is: ESP, followed by any OEM partitions, followed by MSR, followed by the OS & data partitions. (See this link for a nice picture).

Apart from the two above, Microsoft recommends two other partitions (note: these are recommended, not required):

  • Windows Recovery Environment (Windows RE) tools partition: This must be at least 300 MB, preferably 500 MB or larger, and contains the Windows RE tools image (winre.wim) which is about 300 MB in size. It is preferred that these tools are on a separate partition in case the main partition is BitLocker encrypted, and even otherwise to ensure the files in this partition are preserved in case the main partition is wiped out. The type of this partition is de94bba4-06d1-4d40-a16a-bfd50179d6ac.
  • Recovery image partition: This must be at least 2 GB, preferably 3 GB, and contains the Windows recovery image (install.wim) which is about 2 GB in size. This partition must be placed after all other partitions so its space can be reclaimed later if need be. The type of this partition is de94bba4-06d1-4d40-a16a-bfd50179d6ac.

Finally, the disk has basic data partitions which are the usual partitions containing the OS and data. These partitions have the GUID DEFINE_GUID (PARTITION_BASIC_DATA_GUID, 0xEBD0A0A2L, 0xB9E5, 0x4433, 0x87, 0xC0, 0x68, 0xB6, 0xB7, 0x26, 0x99, 0xC7). The minimum size for the partition containing the OS is 20 GB for 64-bit Windows and 16 GB for 32-bit. The OS partition must be formatted as NTFS. The type of these partitions is ebd0a0a2-b9e5-4433-87c0-68b6b72699c7.

The order of all these partitions is: Windows RE tools, followed by ESP, followed by any OEM partitions, followed by MSR, followed by the data partitions, and finally the Recovery image partition.

It is worth pointing out that when you install Windows with an answer file it is possible to have setup create all the above partitions for you. But in my scenario I am applying a WIM image to a VHD manually, so I need to create all the partitions myself.

Let’s make some partitions!

Now back to my VHDs. To recap, previously I had shown how I apply an OS image from a WIM file to a (base) VHD and then make differencing VHDs off that base VHD for my Hyper-V VMs. The VHD thus created works well for Generation 1 VMs but fails for Generation 2 VMs. As we have learnt in the current post, that's because (a) I was using MBR partitions instead of GPT and (b) I hadn't created any ESP partition for the UEFI firmware to pick a boot loader from. Hyper-V Generation 2 VMs have a Class 3 UEFI firmware, so they don't do any of the CSM/ BIOS compatibility stuff.

As before, create a new VHD and initialize it. Two changes from before: I am now using a size of 25 GB instead of 20 GB, and I initialize the disk as GPT.
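
Something along these lines does it – the path and size are just my example, and I use the VHDX format since that's what Generation 2 VMs want:

# Create a 25 GB dynamic VHDX, mount it, and initialize it as GPT.
# Get-Disk tells you which disk number the mounted VHDX gets; I'm assuming 1 below.
New-VHD -Path D:\VHDs\base.vhdx -SizeBytes 25GB -Dynamic
Mount-VHD -Path D:\VHDs\base.vhdx
Initialize-Disk -Number 1 -PartitionStyle GPT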

Confirm that the disk is visible and note its number:
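
For example:

# The mounted VHDX shows up like any other disk; note its Number and that PartitionStyle says GPT.
Get-Disk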

By default the newly created disk has a 128 MB MSR partition. Since the ESP has to come before this partition, let's remove it.
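
A sketch, still assuming the VHDX came up as disk 1:

# Initialize-Disk puts a 128 MB Reserved (MSR) partition on new GPT disks by default; remove it.
Get-Partition -DiskNumber 1
Remove-Partition -DiskNumber 1 -PartitionNumber 1 -Confirm:$false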

Then create new partitions:
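
Here's the layout I'm aiming for, using the partition type GUIDs from the previous section (the sizes are my choice within the minimums mentioned above; disk 1 assumed):

# 1: Windows RE tools, 2: ESP, 3: MSR, 4: Windows (OS), 5: recovery image.
New-Partition -DiskNumber 1 -Size 300MB -GptType '{de94bba4-06d1-4d40-a16a-bfd50179d6ac}'
New-Partition -DiskNumber 1 -Size 100MB -GptType '{c12a7328-f81f-11d2-ba4b-00a0c93ec93b}'
New-Partition -DiskNumber 1 -Size 128MB -GptType '{e3c9e316-0b5c-4db8-817d-f92df00215ae}'
New-Partition -DiskNumber 1 -Size 20GB -GptType '{ebd0a0a2-b9e5-4433-87c0-68b6b72699c7}'
New-Partition -DiskNumber 1 -UseMaximumSize -GptType '{de94bba4-06d1-4d40-a16a-bfd50179d6ac}'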

Just double-checking:
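
Something like this, again assuming disk 1:

# The GptType column should show the five type GUIDs in the order they were created.
Get-Partition -DiskNumber 1 | Format-Table PartitionNumber, GptType, Size -AutoSize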

Wunderbar!

Next I apply the image I want as before:
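
A sketch of that step using the DISM PowerShell cmdlets – the install.wim path and index are examples, and E: is the letter I give the OS partition (partition 4):

# Format the OS partition as NTFS, give it a drive letter, then apply the image onto it.
Get-Partition -DiskNumber 1 -PartitionNumber 4 | Format-Volume -FileSystem NTFS
Set-Partition -DiskNumber 1 -PartitionNumber 4 -NewDriveLetter E
Expand-WindowsImage -ImagePath D:\sources\install.wim -Index 2 -ApplyPath E:\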

That takes care of the data partition. Let’s look at the other ones now.

WinRE tools partition

This is the first partition on the disk. I will (1) format it as FAT32, (2) mount it to a temporary drive letter, (3) copy the WinRE.WIM file from E:\Windows\System32\Recovery (change E: to whatever letter is assigned to the OS partition), (4) register the Windows RE image with the OS, and (5) dismount it.
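
Roughly like this (disk 1, partition 1; T: is just a temporary letter I picked):

Get-Partition -DiskNumber 1 -PartitionNumber 1 | Format-Volume -FileSystem FAT32   # (1) format
Set-Partition -DiskNumber 1 -PartitionNumber 1 -NewDriveLetter T                   # (2) mount as T:
New-Item -ItemType Directory -Force -Path T:\Recovery\WindowsRE
Copy-Item E:\Windows\System32\Recovery\winre.wim T:\Recovery\WindowsRE\            # (3) copy the image
reagentc /setreimage /path T:\Recovery\WindowsRE /target E:\Windows                # (4) register it
Remove-PartitionAccessPath -DiskNumber 1 -PartitionNumber 1 -AccessPath T:\        # (5) dismount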

Thanks to this TechNet article on how to register the Windows RE image. The WinRE.WIM image can be customized too with drivers and other tools if required but I won’t be doing any of that here.

Thanks to one of my readers (Exotic Hadron) for pointing out that the winre.wim file is only present in %SYSTEMROOT%\System32\Recovery if Windows was installed by expanding install.wim (like in the above case). On a typical system where Windows is installed via the setup program the file won’t be present here.

Just double-checking that Windows RE is registered correctly:
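
E: again being the offline Windows installation on the VHDX:

# The output should list the Windows RE location we just registered.
reagentc /info /target E:\Windows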

EFI System Partition

This is the second partition on the disk. As before I will format this as FAT32 and mount to a temporary drive letter. (Note: I use a different cmdlet to assign drive letter here, but you can use the previous cmdlet too).
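
Add-PartitionAccessPath is one such option (a sketch – the ESP is partition 2, and T: is free again after the previous step):

# Give the ESP a temporary drive letter so we can format it and copy files to it.
Add-PartitionAccessPath -DiskNumber 1 -PartitionNumber 2 -AccessPath T:\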

Format the partition as FAT32. The Format-Volume cmdlet doesn’t work here (am guessing it can’t work with “System” partitions) so I use the usual format command instead:
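
For example (it will warn that all data on the drive will be lost – there's nothing on it yet, so answer Y):

# Plain old format.com handles the ESP fine; /Q does a quick format, /V sets the label.
format T: /FS:FAT32 /Q /V:ESP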

Get the boot loaders over to this drive, confirm they are copied, and remove the drive letter:
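
The tool for this is bcdboot – a sketch, with E: being the applied OS and T: the ESP:

bcdboot E:\Windows /s T: /f UEFI              # copy the UEFI boot files and BCD store to the ESP
Get-ChildItem T:\EFI -Recurse                 # confirm \EFI\Microsoft\Boot etc. are there
Remove-PartitionAccessPath -DiskNumber 1 -PartitionNumber 2 -AccessPath T:\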

Phew!

Recovery Image partition

Last thing, let’s sort out the recovery image partition.

I am going to skip this (even though I created the partition as an example) because there's no point wasting the space in my use case. All one has to do is mount the recovery image partition like we did the WinRE tools partition, copy the install.wim file over to it, and then use the ReAgentc.exe command to register that image with that installation of Windows. (See steps 5-7 of this link.)
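
For completeness, a sketch of what that would look like (partition 5, a temporary letter R:, and the install.wim path and index are all just examples):

Get-Partition -DiskNumber 1 -PartitionNumber 5 | Format-Volume -FileSystem NTFS
Set-Partition -DiskNumber 1 -PartitionNumber 5 -NewDriveLetter R
New-Item -ItemType Directory -Force -Path R:\RecoveryImage
Copy-Item D:\sources\install.wim R:\RecoveryImage\
reagentc /setosimage /path R:\RecoveryImage /index 2 /target E:\Windows   # register the recovery image
Remove-PartitionAccessPath -DiskNumber 1 -PartitionNumber 5 -AccessPath R:\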

That’s it!

Now dismount the VHD, make a differencing VHD as before, create a VM with this VHD and fire away!
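
In PowerShell terms, something like this (the names and paths are mine):

# Dismount the base VHDX, make a differencing child off it, and hook it up to a Gen 2 VM.
Dismount-VHD -Path D:\VHDs\base.vhdx
New-VHD -Path D:\VHDs\testvm.vhdx -ParentPath D:\VHDs\base.vhdx -Differencing
New-VM -Name TESTVM -Generation 2 -MemoryStartupBytes 1GB -VHDPath D:\VHDs\testvm.vhdx
Start-VM -Name TESTVM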

And voila! It is booting!

(Screenshot: boot-success)

 

Update: I came across this interesting post by Mark Russinovich on disk signature collisions and how that affects the BCD. Thought I should link it here as it makes for a good read in the context of what I talk about above.