Contact

Notes on TLS/SSL, RSA, DSA, EDH, ECDHE, and so on …

The CloudFlare guys write excellent technical posts. Recently they introduced Keyless SSL (a way of conducting the SSL protocol wherein the server you are talking to does not necessarily need to hold the private key) and, as part of the post going into its technical details, they talk about the SSL protocol in general. Below are my notes on this and a few other posts. Crypto – encryption and privacy – intrigues me, so this is an area of interest. 

Note: This is not a summary of the CloudFlare blog post. It was inspired by that post but I talk about a lot more basic stuff below. 

The TLS/SSL protocol

First things first – what we refer to as Secure Sockets Layer (SSL) protocol is not really SSL but Transport Layer Security (TLS).

  • Versions 1.0 to 3.0 of SSL were called, well … SSL 1.0 to SSL 3.0. 
  • TLS 1.0 was the upgrade from SSL 3.0. It is so similar to SSL 3.0 that TLS 1.0 is often referred to as SSL 3.1. 
  • Although the differences between TLS 1.0 and SSL 3.0 are not huge, the two cannot talk to each other. TLS 1.0 does, however, include a mode wherein it can fall back to SSL 3.0, though this decreases security. 

The world still refers to TLS as SSL but keep in mind it’s really TLS. TLS has three versions so far – TLS 1.0, TLS 1.1, and TLS 1.2. A fourth version, TLS 1.3, is currently in draft form. I would be lying if I said I know the differences between these versions (or even the differences between SSL 3.0 and TLS 1.0) so it’s best to check the RFCs for more info!

TLS/SSL goals

The TLS/SSL protocol has two goals:

  1. Authenticate the two parties that are talking with each other (authentication of a server is the more common scenario – such as when you visit your bank’s website for instance – but authentication of the user/ client too is supported and used in some scenarios). 
  2. Protect the conversation between the two parties.

Both goals are achieved via encryption.

Encryption is a way of “locking” data such that only a person who has a “key” to unlock it can read it. Encryption mechanisms are essentially algorithms: you take a message, follow the steps of the algorithm, and end up with encrypted gobbledygook. All encryption algorithms make use of keys to lock and unlock the message – either a single key (which both encrypts and decrypts the message) or a pair of keys (what one key encrypts, only the other can decrypt).

Shared key encryption/ Symmetric encryption

A very simple encryption algorithm is the Caesar Cipher, where all you do is take some text and replace each letter with the letter a specified number of places away from it (for example, you could replace “A” with “B”, “B” with “C”, and so on – a shift of one – or “A” with “C”, “B” with “D”, and so on – a shift of two). For the Caesar Cipher the key is simply the “number of letters away” that you choose. If both parties decide to use a key of 5, for instance, the encrypting algorithm replaces each letter with the one 5 letters after it, while the decrypting algorithm replaces each letter with the one 5 letters before it. The key in this case is a shared key – both parties need to know it beforehand. 
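Just for fun, here’s a toy PowerShell take on this (a sketch – it handles upper-case letters only and passes everything else through):

    # Toy Caesar Cipher: shift each upper-case letter of $Message by $Key places
    function Invoke-CaesarCipher {
        param([string]$Message, [int]$Key)
        -join ($Message.ToCharArray() | ForEach-Object {
            if ($_ -cmatch '[A-Z]') {
                # wrap around the 26-letter alphabet (works for negative keys too)
                [char]((([int]$_ - 65 + $Key) % 26 + 26) % 26 + 65)
            } else { $_ }
        })
    }

    Invoke-CaesarCipher -Message 'HELLO' -Key 5    # encrypts to MJQQT
    Invoke-CaesarCipher -Message 'MJQQT' -Key -5   # decrypting is just shifting back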

Encryption algorithms where a single shared key is used are known as symmetric encryption algorithms. The operations are symmetrical – you do something to encrypt, and you undo that something to decrypt. TLS/SSL uses symmetric encryption to protect the conversation between the two parties. Examples of the symmetric encryption algorithms TLS/SSL uses are AES (preferred), Triple DES, and RC4 (not preferred). More examples of popular symmetric algorithms can be found on this Wikipedia page.

Symmetric encryption has the obvious disadvantage that you need a way of securely sharing the key beforehand, which may not always be practical (and if you can securely share the key then why not securely share your message too?).

Public key encryption/ Asymmetric encryption

Encryption algorithms that use two keys are known as asymmetric encryption or public key encryption algorithms. The name comes from the fact that such algorithms make use of two keys – one of which is secret/ private while the other is public/ known to all. Here the operations aren’t symmetrical – you do something to encrypt, and you do something else to decrypt. The two keys are special in that they are mathematically linked, and anything that’s encrypted by one of the keys can be decrypted only by the other. A popular public key encryption algorithm is RSA. This CloudFlare blog post on Elliptic Curve Cryptography (ECC), which is itself an example of public key encryption, has a good explanation of how RSA works. 

Public key encryption can be used for encryption as well as authentication. Say there are two parties: each keeps its private key to itself and publishes its public key. If the first party wants to encrypt a message for the second party, it encrypts the message using the public key of the second party. Only the second party can decrypt the message, as only it holds the private key. This also authenticates the second party – being able to decrypt proves it holds the private key corresponding to the public key.

Public key cryptography algorithms can also be used to generate digital signatures. If the first party wants to send a message to the second party and sign it, such that the second party can be sure it came from the first party, all the first party needs to do is send the message as usual, but this time also take a hash of the message and encrypt that hash with its private key. When the second party receives the message it can decrypt the hash via the public key of the first party, make a hash of the message itself, and compare the two. If the hashes match, it proves that the message wasn’t tampered with in transit and also that the first party did indeed sign it – only a message locked with the first party’s private key can be unlocked by its public key. Very cool stuff actually!
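As a rough sketch of that flow, using .NET’s RSA class from PowerShell (assuming .NET 4.6 or later; SignData does the hash-then-encrypt bit internally):

    $rsa = [System.Security.Cryptography.RSA]::Create()    # the first party's key pair
    $msg = [System.Text.Encoding]::UTF8.GetBytes('Hello world')
    $alg = [System.Security.Cryptography.HashAlgorithmName]::SHA256
    $pad = [System.Security.Cryptography.RSASignaturePadding]::Pkcs1

    # First party: hash the message and "lock" the hash with its private key
    $sig = $rsa.SignData($msg, $alg, $pad)

    # Second party: unlock with the public key, hash the message itself, compare
    $rsa.VerifyData($msg, $sig, $alg, $pad)    # True
    $rsa.VerifyData([System.Text.Encoding]::UTF8.GetBytes('Hello w0rld'), $sig, $alg, $pad)    # False - tampered

(In real life the second party would of course only have the public half of the key, typically from a certificate.)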

Not all public key cryptography algorithms are good at both encryption and signing, nor are they required to be. RSA, for instance, is good at encryption & decryption. Another algorithm, DSA (Digital Signature Algorithm), is good at signing & validation. RSA can do signing & validation too, but that’s a happy consequence of the nature of its algorithm.

A question of trust

While public key encryption can be used for authentication, there is a problem. What happens if a third party publishes a public key on the network but claims to be the second party? Obviously it’s a fraud, but how is the first party to know? Two ways really: one way is for the first party to call, or verify through some other means, what the second party’s public key is and thus choose the correct one – tricky, because it has to verify the identity somehow and be sure about it; the second way is for some trusted authority to verify this for everyone – such a trusted authority will confirm that such and such public key really belongs to the second party. 

These two ways of figuring out whether you can trust someone are called the Web of Trust (WoT) and Public Key Infrastructure (PKI) respectively. Both achieve the same thing, the difference being that the former is decentralized while the latter is centralized. Both make use of something called certificates – a certificate is basically a digital document that contains the public key as well as some information on the owner of the public key (details such as the email address, web address, how long the certificate is valid for, etc). 

Web of Trust (WoT)

In a WoT model everyone uploads their certificates to certain public servers. When a first party searches for certificates belonging to a second party it can find them – both legitimate ones (i.e. actually belonging to the second party) as well as illegitimate ones (i.e. falsely uploaded by other parties claiming to be the second party). By default, though, the first party doesn’t trust these certificates. It only trusts a certificate once it verifies through some other means which one is legitimate – maybe it calls up the second party, or meets the party in person, or gets details from its website. The first party can also trust a certificate if someone else it already trusts has marked that certificate as trusted. Thus each party in effect builds up a web of certificates – it trusts a few, and it trusts whatever is trusted by the few that it trusts. 

To add to the pool, once a party trusts a certificate it can indicate so for others to see. This is done by signing the certificate (similar to the signing process I mentioned earlier). So if the first party has somehow verified that a certificate it found really belongs to the second party, it can sign it and upload that information to the public servers. Anyone else then searching for certificates of the second party will come across the true certificate too, and find the signature of the first party on it. 

WoT is what you use with programs such as PGP and GnuPG. 

Public Key Infrastructure (PKI)

In a PKI model there are designated Certificate Authorities (CAs) which are trusted by everyone. Each party sends its certificate to a CA to get it signed. The CA verifies that the party is who it claims to be and then signs the certificate. There are several classes of validation – domain validation (the CA verifies that the requester can manage the domain the certificate is for), organization validation (the CA also verifies that the requester actually exists), and extended validation (a much more comprehensive validation than the other two). 

Certificate Authorities have a tree structure. There are certain CAs – called root CAs – whose certificates are self-signed or unsigned but implicitly trusted by everyone. These root CAs sign certificates of other CAs, who in turn might sign certificates of other CAs or of a requester. Thus there’s a chain of trust – a root CA trusts an intermediary CA, who trusts another intermediary CA, who trusts (signs) the certificate of a party. Because of this chain of trust, anyone who trusts the root CA will trust a certificate signed by one of its intermediaries. 

It’s probably worth pointing out that you don’t really need to get your certificate signed by a CA. For instance, say I want to encrypt all traffic between my computer and this blog, so I create a certificate for the blog. I will be the only person using it – all my regular visitors will visit the blog unencrypted. In such a case I don’t have to bother about them not trusting my certificate. I trust my certificate as I know what its public key and details look like, so I can install the certificate and use an https link when browsing the blog; everyone else can use the regular http link. I don’t need to get it signed by a CA for my single-person use. It’s only if I want the general public to trust the certificate that I must involve a CA. 

PKI is what you use with Internet browsers such as Firefox, Chrome, etc. PKI is also what you use with email programs such as Outlook, Thunderbird, etc to encrypt communication with a server (these email programs may also use WoT to encrypt communication between a sender & recipient). 

TLS/SSL (contd)

From here on I’ll use the words “client” and “server” interchangeably with “first party” and “second party”. The intent is the same, just that it’s easier to think of one party as the client and the other as the server. 

TLS/SSL uses both asymmetric and symmetric encryption. TLS/SSL clients use asymmetric encryption to authenticate the server (and vice-versa too, if required) and as part of that authentication they also share with each other a symmetric encryption key which they’ll use to encrypt the rest of their conversation. TLS/SSL uses both types of encryption algorithms because asymmetric encryption is computationally expensive (by design) and so it is not practical to encrypt the entire conversation asymmetrically (see this StackExchange answer for more reasons). Better to use asymmetric encryption to authenticate and bootstrap symmetric encryption. 

When a TLS/SSL client contacts a TLS/SSL server, the server sends the client its certificate. The client validates it using the PKI. Assuming the validation succeeds, client and server perform a “handshake” (a series of steps), the end result of which is (1) authentication and (2) the establishment of a “session key” – the symmetric key used for encrypting the rest of the conversation. 

CloudFlare’s blog post on Keyless SSL goes into more details of the handshake. There are two types of handshakes possible: RSA handshakes and Diffie-Hellman handshakes. The two types of handshakes differ in terms of what algorithms are used for authentication and what algorithms are used for session key generation. 

RSA handshake

RSA handshakes are based on the RSA algorithm which I mentioned earlier under public key encryption. 

An RSA handshake uses RSA certificates for authentication (RSA certificates contain a public RSA key). Once authentication is successful, the client creates a random session key and encrypts it with the public key of the server (this encryption uses the RSA algorithm). The server decrypts the session key with its private key, and going forward both client & server use this session key to encrypt further traffic (this encryption does not use the RSA algorithm; it uses one of the symmetric key algorithms).
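That key-transport step, sketched with .NET from PowerShell (an illustration under the same .NET 4.6+ assumption, not what a TLS library actually does on the wire):

    # Stand-in for the server's key pair; the public half would come from its certificate
    $serverRsa = [System.Security.Cryptography.RSA]::Create()

    # Client: generate a random 256-bit session key, encrypt it with the server's public key
    $sessionKey = [byte[]]::new(32)
    $rng = [System.Security.Cryptography.RandomNumberGenerator]::Create()
    $rng.GetBytes($sessionKey)
    $blob = $serverRsa.Encrypt($sessionKey, [System.Security.Cryptography.RSAEncryptionPadding]::OaepSHA1)

    # Server: recover the session key with its private key
    $recovered = $serverRsa.Decrypt($blob, [System.Security.Cryptography.RSAEncryptionPadding]::OaepSHA1)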

The RSA handshake has a drawback: if someone were to capture and store past encrypted traffic, and the server’s private key were to somehow leak, that person could easily decrypt the session key and thus the past traffic. The server’s private key plays a crucial role here – it not only authenticates the server, it also protects the session key. 

Diffie-Hellman handshake

Diffie-Hellman handshakes are based on the Diffie-Hellman key exchange algorithm. 

The Diffie-Hellman key exchange is an interesting algorithm. It doesn’t do any encryption or authentication by itself. Instead, it offers a way for two parties to generate a shared key in public (i.e. anyone can snoop in on the conversation that takes place to generate the secret key) yet the shared key stays secret and only the two parties know it (i.e. a third party snooping on the conversation can’t deduce the shared key). A good explanation of how Diffie-Hellman does this can be found in this blog post. Essentially: (1) the two parties agree upon a large prime number and a smaller number in public; (2) each party then picks a secret number (the private key) for itself and calculates another number (the public key) based on this secret number, the prime number, and the smaller number; (3) the parties share their public keys with each other, and using the other’s public key, the prime number, and the smaller number, each party can calculate the (same) shared key. The beauty of the math involved is that even though a snooper knows the prime number, the smaller number, and the two public keys, it still cannot deduce the private keys or the shared key!
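Here’s a toy run of those three steps in PowerShell, with deliberately tiny numbers (real implementations use primes that are thousands of bits long):

    $p = 23; $g = 5                     # step 1: the prime and the smaller number, agreed in public

    $a = 6; $b = 15                     # step 2: each party's secret number (private key)
    $A = [bigint]::ModPow($g, $a, $p)   # first party's public key:  8
    $B = [bigint]::ModPow($g, $b, $p)   # second party's public key: 19

    # step 3: each party combines the other's public key with its own secret
    [bigint]::ModPow($B, $a, $p)        # 2
    [bigint]::ModPow($A, $b, $p)        # 2 - the same shared key, never sent over the wire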

There are two versions of the Diffie-Hellman algorithm:

  • a fixed/ static version, where both parties use the same public/ private keys (and hence same shared key) across all their conversations; and
  • an ephemeral version, where one party keeps changing its public/ private key (and hence the shared key)

Since the Diffie-Hellman algorithm does not do authentication it needs some other mechanism to authenticate the client and server. It can use RSA certificates (certificates containing an RSA public key) or non-RSA certificates – for example DSA certificates (containing a DSA public key) or ECDSA (Elliptic Curve Digital Signature Algorithm) certificates. ECDSA is a variant of DSA that uses Elliptic Curve Cryptography (ECC), so ECDSA certificates contain an ECC public key. ECC keys are better than RSA & DSA keys in that the ECC algorithm is harder to break. So not only are ECC keys more future proof, you can also use smaller key lengths (for instance a 256-bit ECC key is as secure as a 3248-bit RSA key) and hence the certificates are of a smaller size. 

The fixed/ static version of Diffie-Hellman requires a Diffie-Hellman certificate for authentication (see here and here). Along with the public key of the server, this certificate also contains the prime number and smaller number required by the Diffie-Hellman algorithm. Since these numbers are part of the certificate itself and cannot change, Diffie-Hellman certificates only work with the fixed/ static Diffie-Hellman algorithm and vice-versa. A Diffie-Hellman handshake that uses the fixed/ static algorithm has the same drawback as an RSA handshake: if the server’s private key is leaked, past traffic can be decrypted.  

The ephemeral version of Diffie-Hellman (often referred to as EDH (Ephemeral Diffie-Hellman) or DHE (Diffie-Hellman Ephemeral)) works with RSA certificates, DSA certificates, and ECDSA certificates. EDH/ DHE is computationally expensive as it is not easy to keep generating a new prime number and smaller number for every connection. A variant of EDH/ DHE that uses elliptic curves – known as Elliptic Curve Diffie-Hellman Ephemeral (ECDHE) – doesn’t have the performance hit of EDH/ DHE and is preferred. 

A Diffie-Hellman handshake that uses EDH/ DHE or ECDHE doesn’t have the drawback of an RSA handshake. The server’s private key is only used to authenticate it, not for generating/ protecting the shared session key. This feature of EDH/ DHE and ECDHE – wherein the shared keys are generated afresh for each connection, and the shared keys themselves are random and independent of each other – is known as Perfect Forward Secrecy (PFS). (Perfect Forward Secrecy is a stronger form of Forward Secrecy. The latter does not require the shared keys themselves to be independent of each other, only that the shared keys not be related to the server’s private/ public key.) 

To be continued …

Writing this post took longer than I expected so I’ll conclude here. I wanted to explore TLS/SSL in the context of Windows and Active Directory, but I got side-tracked talking about handshakes and RSA, ECDHE, etc. Am glad I went down that route though. I was aware of elliptic curves and ECDHE, ECDSA etc. but had never really explored them in detail until now nor written down a cumulative understanding of it all. 

ReAgentC: Operation failed: 3

The other day I mentioned that whenever I run ReAgentC I get an “Operation failed: 3” error. 

I posted to the Microsoft Community forums hoping to get a solution. The suggested solutions didn’t seem to help but oddly ReAgentC is now working – not sure why. 

One thing I learnt is that error code 3 means a path is not found. My system didn’t have any corruption (both sfc /scannow and dism /Online /Cleanup-image /Scanhealth gave it a clean chit) and I also did a System Restore to a point back in time when I knew ReAgentC was working, but that didn’t help either. Windows RE itself was working fine as I was able to boot into it. 

In the end I shut down the laptop and left it for a couple of days as I had other things to do. And when I looked at it today ReAgentC was surprisingly working!

I am not sure why it is now working. One theory is that a few updates were applied automatically as I was shutting down the laptop so maybe they fixed some corruption. Or maybe when I booted into Windows RE and booted back that fixed something? (I don’t remember whether I tried ReAgentC after booting back from Windows RE. I think I did but I am not sure). 

Here’s a little PowerShell to find all the updates installed in the last 3 days. Thought I’d post it because I am pleased I came up with it and also maybe it will help someone else. 
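Something along these lines, using the Get-WindowsPackage cmdlet from the DISM PowerShell module:

    # Servicing packages installed in the last 3 days, oldest first
    Get-WindowsPackage -Online |
        Where-Object { $_.InstallTime -gt (Get-Date).AddDays(-3) } |
        Sort-Object InstallTime |
        Format-Table PackageName, InstallTime -AutoSize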

This will only work in Windows 8 and above (I haven’t tried but I think installing the Windows Management Framework 4.0 on Windows 7 SP1 and/ or Windows Server 2008 R2 SP1 will get it working on those OSes too). 

Update: And it stopped working again the next day! The laptop was on overnight. The next day I rebooted as it had some pending updates. After the reboot we are back to square one. Of course I removed those two updates and rebooted to see if that helps. It doesn’t. 

 Fun! :)

 

Notes on SFC and Windows Servicing (Component Based Servicing)

SFC

SFC (it used to stand for “System File Checker” but is now called Windows Resource Checker) is an in-built tool for verifying the integrity of all Windows system files and replacing them with good versions in case of any corruption. It has been around for a while – I first used it on Windows 2000, I think – though I am sure there are many differences between that version and the latest ones. For instance, XP and prior used to store a copy of the protected system files in a folder called %WINDIR%\System32\DLLCache (but would remove some of the protected files as disk space became scarce, resulting in SFC prompting for the install media when repairing) while Vista and upwards use the %WINDIR%\WinSxS folder (and its sub-folder %WINDIR%\WinSxS\Backup).

The WinSxS folder

Here is a good article on the WinSxS folder. I recommend everyone read it.

Basically, the WinSxS folder contains all the protected files (under a “WinSxS\Backup” folder) as well as multiple versions of DLLs and library components needed by various applications. In a way this folder is where all the DLLs of the system actually live. The DLLs you see at other locations are actually NTFS hard links to the ones here. So even though the folder might appear HUGE when viewed through File Explorer, don’t be too alarmed – many files that you’d think are taking up space elsewhere are not actually taking up extra space, because they are hard links to the files here. But yes, the WinSxS folder is also huge because it holds all those old DLLs and components, and you cannot delete the files in here because you never know what application depends on them. Moreover, you can’t even move the folder to a different location, as it has to be in this known location for the hard links to work.

In contrast to the WinSxS folder, the DLLcache folder can be moved anywhere via a registry hack. Also, the DLLcache folder doesn’t keep all the older libraries and such.

The latest versions of SFC can also work against an offline install of Windows.

Here’s SFC on my computer complaining that it was unable to fix some errors:

It is best to view the log file using a tool like trace32 or cmtrace. Here’s a Microsoft KB article on how to use the log file. And here’s a KB article that explains how to manually repair broken files.

Tip

Rather than open the CBS.log in trace32, it is better to filter out the SFC bits first as suggested in the above KB articles. Open an elevated command prompt and type this:
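    findstr /c:"[SR]" %windir%\Logs\CBS\CBS.log > "%userprofile%\Desktop\sfcdetails.txt"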

Open this file (saved on your Desktop) in trace32 and look for errors.

Servicing

Servicing is the act of enabling/ disabling a role/ feature in Windows, and of installing/ uninstalling updates and service packs. You can service both currently running and offline installations of Windows (yes, that means you can have an offline copy of Windows on a hard disk or in a VHD file and you can enable/ disable features and roles on it, as well as install hotfixes and updates (but not service packs) – cool huh!). If you have used DISM (Deployment Image Servicing and Management) in Windows 7 and upwards (or pkgmgr.exe & Co. in Vista) then you have dealt with servicing.
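For example, the same feature can be enabled on the running OS or on an offline image mounted at some path (the feature name and mount path below are illustrative – they vary by SKU and setup):

    DISM /Online /Enable-Feature /FeatureName:TelnetClient
    DISM /Image:C:\Mount /Enable-Feature /FeatureName:TelnetClient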

File Based Servicing

Windows XP and before used to have File Based Servicing. The update or hotfix package usually had an installer (update.exe or hotfix.exe) that updated the files and libraries on the system. If these were system files they were installed/ updated at %WINDIR%\System32 and a copy kept at the file protection cache %WINDIR%\System32\DLLcache (remember from above?). If the system files were in use, a restart would copy the files from %WINDIR%\System32\DLLcache to %WINDIR%\System32. Just in case you needed to roll back an update, a backup of the files that were changed was kept at C:\Windows\$Uninstall$KBnnnnnn (replace “nnnnnn” with the KB number). Life was simple!

Component Based Servicing

Windows Vista introduced Component Based Servicing (CBS). Whereas with File Based Servicing everything was mashed together, there is now a notion of things being in “components”. So you can have various features of the OS turned off or on as required (i.e. features and roles). A component can be installed on the OS but not active (for example: the files for a DNS server are already present in a Server install but not activated; when you enable that role, Windows does stuff behind the scenes to activate it). This extends to updates and hotfixes too. For instance, when you install the Remote Server Administration Tools (RSAT) on Windows 7, it installs all the admin tool components but none of them are active by default. All the installer does is add these components to your system. Later, you go to “Programs and Features” (or use DISM) to enable the components you want. CBS is the future, so that’s what I’ll be focussing on here.

Components

From this blog post:

A component in Windows is one or more binaries, a catalog file, and an XML file that describes everything about how the files should be installed – from associated registry keys and services to what kind of security permissions the files should have. Components are grouped into logical units, and these units are used to build the different Windows editions. Each component has a unique name that includes the version, language, and processor architecture that it was built for.

Component Store

Remember the %WINDIR%\WinSxS folder above? That’s where all these components are stored. That folder is the Component Store. (As an aside: “SxS” stands for “Side by Side”. It is a complete (actually, more than complete) install of Windows that lives side by side with the running installation of Windows.) When you install a component in Windows Vista and above, the files are actually stored in this component folder. Then, if the component feature/ role/ update is activated, hard links are created from locations in the file system to the files here. So, for instance, when you install a Server its Component Store will already contain the files for the DNS role; later, when you enable the role, hard links are created from %WINDIR%\System32 and elsewhere to the files in %WINDIR%\WinSxS.
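You can see the hard-linking for yourself with fsutil (Windows 7 and above) – a system file such as notepad.exe will typically show a twin inside the Component Store:

    fsutil hardlink list C:\Windows\System32\notepad.exe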

Payloads

Microsoft refers to the files (binaries such as libraries etc) in the WinSxS folder as payloads. So components consist of payloads that are stored in the WinSxS folder and manifests (not sure what they are) that are stored in the WinSxS\manifests folder.

Component Stack

Here’s a post from the Microsoft Servicing Guy on CBS. Like we had update.exe on XP and before, we now have trustedinstaller.exe, which is the interface between the servicing stack and user-facing programs such as “Programs and Features”, DISM, MSI, and Windows Update. The latter pass packages (downloading them first if necessary) to trustedinstaller.exe, which invokes other components of the CBS stack to do the actual work (components such as CSI (Component Servicing Infrastructure), which you can read about in that link).

It is worth pointing out that dependency resolution (i.e. feature Microsoft-Hyper-V-Management-PowerShell requires feature Microsoft-Hyper-V-Management-Clients for instance) is done by the CBS stack. Similarly, the CBS stack is what identifies whether any files required for a feature/ role are already present on the system or need to be downloaded. All this info is passed on to the user-facing programs that interact with the user for further action.

Related folders and locations

Apart from the Component Store here are some other folders and files used by the CBS:

  • %windir%\WinSXS\Manifests – Sub-folder of the Component Store, contains manifests
  • %windir%\Servicing\Packages – A folder that contains the packages of a component. Packages are like components: they contain binaries (the payloads) and manifests (an XML file with the extension .MUM defining the payload as well as the state of the package – installed and enabled, only installed, or not installed). When you run Windows Update, for instance, you download packages that in turn update the components.

    A component might contain many packages. For instance, the Telnet-Client feature has just one package Microsoft-Windows-Telnet-Server-Package~31bf3856ad364e35~amd64~en-US~6.3.9600.16384.mum on my machine, but the HyperV-Client role has more than a dozen packages – Microsoft-Hyper-V-ClientEdition-Package~31bf3856ad364e35~amd64~en-US~6.3.9600.16384.mum being the package when the OS was installed, followed by packages such as Package_1033_for_KB2919355~31bf3856ad364e35~amd64~~6.3.1.14.mum and Package_13_for_KB2903939~31bf3856ad364e35~amd64~~6.3.1.2.mum, etc for various updates that were applied to it. (Remember: In XP and before updates targeted files. Now updates target components. So updates apply to components).

    An update that you install – say KBxxxxxxxx – might have multiple packages, with each of these packages targeting different components of the system. The payload in a package is copied to the Component Store; only the .MUM defining the package is left in the %windir%\Servicing\Packages folder. Moreover, each component is updated with details of the packages which affect it – this is what we see happening when an update is applied to the OS and Windows takes a long time configuring things. (Remember, components are self-sufficient – so a component also knows of the updates to it.)

    You can get a list of packages installed on your system using the /Get-Packages switch to DISM:
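        DISM /Online /Get-Packages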

    To get the same info as a table rather than list (the default):
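        DISM /Online /Get-Packages /Format:Table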

  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing – A registry key tree holding a lot of the servicing information.

    For instance, HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\PackageDetect\Microsoft-Hyper-V-ClientEdition-Package~31bf3856ad364e35~amd64~~0.0.0.0 on my machine tells me which packages are a part of that component.

    Note that the above component name doesn’t have a language. It is common to all languages. There are separate keys – such as HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\PackageDetect\Microsoft-Hyper-V-ClientEdition-Package~31bf3856ad364e35~amd64~en-US~0.0.0.0 – which contain packages for that specific language variant of the component.

  • %windir%\WinSXS\Pending.xml – An XML file containing a list of commands and actions to be performed after a reboot for pending updates (i.e. files that couldn’t be modified as they are in use)
  • %windir%\Logs\CBS\CBS.log – The CBS log file.

Here’s a blog post from The Servicing Guy talking about the above locations. Unfortunately, it’s only part 1 as he didn’t get around to writing the follow-ups.

Summary

Here’s a high-level summary of many of the points I touched upon above:

  • Windows Vista and upwards use Component Based Servicing. A component is a self-sufficient unit. It includes binaries (files and libraries) as well as some metadata (XML files) on where the files should be installed, security rights, registry & service changes, etc. In Windows Vista and upwards you think of servicing in terms of components.
  • The files of a component are stored in the Component Store (the WinSxS folder). Everything else you see on the system are actually hard-links to the files in this Component Store.
  • When a component is updated the older files are not removed. They stay where they are, with newer versions of any changed files installed side by side to them, and references to these files from elsewhere are now hard-linked to the newer versions. This way any other components or applications that require the older versions can still use them.
  • Components can be thought of as being made up of packages. When you download an update it contains packages. Packages target components. The component metadata is updated so it is aware that such and such package is a part of it. This way even if a component is not currently enabled on the system, it can have update packages added to it, and if the component is ever enabled later it will already have the up-to-date version.
  • Remember you must think of everything in terms of components now. And components are self-sufficient. They know their state and what they do.

Just so you know …

I haven’t worked much with CBS except to troubleshoot when I have had problems, or to add/ remove packages and such when doing basic servicing tasks on my machine/ virtual labs. Most of what I explain above is my understanding of things from the registry keys and the folders, supplemented with information I found in blog posts and articles. Take what you read here with a pinch of salt.

Component Store corruptions

The Component Store can get corrupted, resulting in errors when installing updates and/ or enabling features. Note: this is not corruption of the system files – which can be fixed via tools such as SFC – but corruption of the Component Store itself.

Windows 8 and later

Windows 8 and upwards can detect and fix such corruptions using DISM /Cleanup-Image (don’t forget to specify /Online for online servicing or /Image:\path\to\install for offline servicing):

  • DISM /Cleanup-Image /ScanHealth will scan the Component Store for errors. It does not fix the error, only scans and updates a marker indicating the Store has errors. Any errors are also logged to the CBS.Log file.
  • DISM /Cleanup-Image /RestoreHealth will scan as above and also fix the error (so it’s better to run this than scan first and then scan & repair).
  • DISM /Cleanup-Image /CheckHealth will check whether there’s any marker indicating the system has errors. It does not do a scan by itself (so there’s no real point to running this, except to quickly check whether any tool has previously set the marker).

If PowerShell is your weapon of choice (yaay to you!), you can use Repair-WindowsImage -ScanHealth | -RestoreHealth | -CheckHealth instead.

If corruption is found and you have asked for a repair, Windows Update/ WSUS are contacted for good versions of the components. The /LimitAccess switch can be used to disable this; the /Source switch can be used to specify a source of your own (you must point to the WinSxS folder of a different Windows installation; see this TechNet article for some examples). (Update: WSUS is not a good source, so it’s better to use gpedit.msc or GPOs to temporarily specify a Windows Update server, or use the /LimitAccess switch to not contact WU/ WSUS at all and specify a WinSxS folder to use.)

Example:
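    DISM /Online /Cleanup-Image /RestoreHealth

Or the PowerShell way:

    Repair-WindowsImage -Online -RestoreHealth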

Windows 7 and prior

Windows 7 and below use a Microsoft tool called CheckSUR (Check System Update Readiness).

Here’s a blog post on using CheckSUR to repair corruption. Note that CheckSUR can only repair manifests while DISM /Cleanup-Image can do both manifests and payloads.

Managing the component store size

The Component Store will keep growing as more updates are installed to a system. One way to reduce the size is to tell Windows to remove all payloads from older Service Packs. For instance, say the machine began life as Windows 7, then had updates applied to it, then a Service Pack. You know you will never uninstall this Service Pack so you are happy with removing all the older payloads from WinSxS and essentially tell the OS that from now on it must consider itself as Windows 7 Service Pack 1 for good – there’s no going back.

Here’s how you can do that for the various versions of Windows:

  • Windows Vista Service Pack 1 uses a tool called VSP1CLN.EXE to do this.
  • Windows Vista Service Pack 2 and Windows Server 2008 SP2 use a tool called Compcln.exe
  • Windows 7 Service Pack 1, Windows Server 2008 R2 Service Pack 1, and above use DISM /online /Cleanup-Image /SpSuperseded (for Windows 7 Service Pack 1 with update KB2852386 you can also use the Disk Cleanup Wizard (cleanmgr.exe)).

Automatic scavenging

Windows 7 and above also automatically do scavenging to remove unused components from the Component Store. Windows Vista only does scavenging on a removal event, so you can add and remove a feature to force a scavenging.

Windows 8 has a scheduled task, StartComponentCleanup, that automatically cleans up unused components. It waits 30 days after a component has been updated before removing previous versions (so you have 30 days to roll back to a previous version of the update). You can run this task manually too:
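    schtasks.exe /Run /TN "\Microsoft\Windows\Servicing\StartComponentCleanup"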

Check this blog post for some screenshots.

Windows 8.1 and Server 2012 R2 extras

Windows 8.1 and Windows Server 2012 R2 include a new DISM switch to analyze the Component Store and output whether a clean up can be made.

The clean up can happen automatically via the scheduled task mentioned previously, or be run manually. DISM too has a new switch which does the same (but doesn’t follow the 30 days rule of the scheduled task, so it is more aggressive).
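These switches are /AnalyzeComponentStore and /StartComponentCleanup respectively:

    DISM /Online /Cleanup-Image /AnalyzeComponentStore
    DISM /Online /Cleanup-Image /StartComponentCleanup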

Note that the scavenging options above (Windows 7 and up) only remove previous versions of the components. They are not as aggressive as the options to re-base the OS to a particular service pack. Once the previous versions of the components are removed you cannot rollback to those older versions but you can still uninstall the update you are on.

Windows 8.1 and Server 2012 R2 introduce a new switch that lets you re-base to whatever state you are in now. This is the most aggressive option of all as it makes the current state of the system permanent. You cannot uninstall any of the existing updates after you re-base (any updates installed later can still be uninstalled) – the state your system is in now is frozen and becomes its new base. This new switch is called /ResetBase:
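    DISM /Online /Cleanup-Image /StartComponentCleanup /ResetBase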

Windows Server 2012 and Server 2012 R2 extras

Windows Server 2012 introduces the concept of “features on demand”.

Remember I had said that by default all the payloads for all the features in Windows are present in the WinSxS folder, and when you enable/ disable features you are merely creating hard links to the files in WinSxS. What this means is that even though your Server 2012 install may not use a feature, its files are still present and taking up space in the WinSxS folder. Starting with Server 2012 you can now remove a feature entirely (rather than merely disable it) so that its files are deleted from the WinSxS folder.

Of course, once you remove the files, if you want to enable the feature later you must also specify a source from where they can be obtained, as in the sketch below. Check this blog post for more info.
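A rough sketch with the Server Manager cmdlets (the feature name and source path are just examples):

    # Remove the feature's payload from disk entirely (Server 2012 and up)
    Uninstall-WindowsFeature -Name Telnet-Client -Remove

    # Enable it again later, pointing at a source such as the WinSxS folder of install media
    Install-WindowsFeature -Name Telnet-Client -Source E:\Sources\SxS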

Year Three: rakhesh.com

Today marks 2 years since I booked the domain (port25.io, no longer active) where this blog began life. I began posting 10 days later, on 21st November 2012. But that was just an introductory post I think, as the current oldest post on this blog is from 2nd December 2012. When I changed blog URLs I moved that introductory post to the Changelog section. Coincidentally, this post you are reading now also marks the 200th post. :)

This blog has moved on from its original goal of blogging about Exchange to now blogging about movies, thoughts, and whatever techie thing I am currently working on. It began as an outlet I could (hopefully) use to explain things to others. But it has moved to being a personal notebook and bookmarks store – most of my posts are like notes to my future self, posts I can refer to to refresh myself on something I may have forgotten, or to just look up some command or code snippet that I used to solve a particular task. Added to that, most of my posts have links to other blogs and articles – links that do a much better job of explaining the concepts – so I can refer to these links too rather than search through my bookmarks. In that sense both the topics and the style/ purpose of this blog have evolved from its beginnings. Not that I am complaining – I like where it’s heading!

Anyways, just thought I must put up a post marking this day. And write a paragraph or two in case it helps anyone else who is on the fence regarding starting a blog. My suggestion would be to just get something started. It’s a good way for the world and yourself to know what you have been up to. Sure there’s tons of excellent blogs out there so it might seem like you have nothing new to add to the pool – and while you may be correct in thinking that, I’d say it’s still a good idea to put your thoughts too out there. Maybe your way of explaining will make better sense to people. Maybe in the process of blogging about what you are learning/ doing you will get a better understanding yourself. Who knows! Give it a shot, and then back off if you have to. This blog too for instance has many weeks when I barely post anything – because I am not doing anything or I am not in the mood to write – and then I think of shutting it down for good. But usually I hold off, and that works out well because when I am back to doing something or I am in the mood to write I have a place to put it down. And then on a day like today when I look back at the posts I made over the past two years I get a kick out of it – wow I have actually worked on and done a lot of things! Who knew!

I guess I Blog, therefore I Am and that’s one good reason to keep blogging. For yourself.

Notes on Windows RE and BCD

Windows RE

Windows RE (Recovery Environment) is a recovery environment that you boot into when your Windows installation is broken. When the Windows boot loader realizes your Windows installation is broken it will automatically boot into Windows RE. (During boot up the Windows boot loader sets a flag indicating the boot process has started. When the OS loads it clears this flag. If the OS doesn’t load and the computer reboots, the boot loader sees the already-set flag and knows there’s a problem. A side effect of this is the scenario where the OS starts to load but the machine loses power, so the flag isn’t cleared; later, when power returns and the machine is turned on, the boot loader notices the flag and loads Windows RE as it thinks the OS is broken.)

Screenshots

You can manually boot into Windows RE by pressing F8 and selecting “Repair your computer” from the options menu.

winre-1

The Windows RE menu.

winre-2

Apart from continuing the boot process into the installed OS, you can also power off the computer, boot from a USB drive or network connection, or do further troubleshooting. The above screenshot is from a Windows Server 2012 install. Windows 8 has a similar UI, but Windows 7 (and Windows Server 2008 and Windows Vista) have a different UI (with similar functionality).

Selecting “Troubleshoot” shows the following “Advanced options”:

winre-5

The startup settings can be changed here, or a command prompt window launched for further troubleshooting.

winre-3

It is also possible to re-image the computer from a recovery image. The recovery image can be on a DVD, an external hard drive, or a Recovery Image partition. It is also possible to store your own recovery image to this partition.

winre-4

Location of Windows RE

Windows RE itself is based on Windows PE and is stored as a WIM file. This means you can customize Windows RE by adding additional languages, tools, and drivers. You can even add one custom tool to the “Troubleshoot” menu. On BIOS systems the Windows RE WIM file is stored in the (hidden) system partition. On UEFI systems it is stored in the Windows RE tools partition.

The system partition/ Windows RE tools partition has a folder \Recovery\WindowsRE that contains the WIM file winre.wim and a configuration file ReAgent.xml. On the installed system, the \Windows\System32\Recovery folder has a ReAgent.xml which is a copy of the file in the system partition/ Windows RE tools partition. The former must be present and have correct entries. Also, for BIOS systems, the system partition must be set as active (and it has an MBR partition type ID of 27, which marks it as a system partition).

Notice the “WinreBCD” ID number in the XML file. Its significance will be made clear later (in the section on BCD).

Managing Windows RE

Windows RE can be managed using the \Windows\System32\ReAgentC.exe tool. This tool can manage the RE of the currently running OS and, for some options, even that of an offline OS. More information on the ReAgentC.exe command can be found at this TechNet article. Here are some of the things ReAgentC can do:

  • ReAgentC /enable enables Windows RE. ReAgentC /disable disables Windows RE.

    Both these switches work only against the currently running OS – i.e. you cannot make changes to an offline image. You can, however, boot into Windows PE and enable Windows RE for the OS installed on that computer. For this you’ll need the BCD GUID of the OS (get this via bcdedit /enum /v or bcdedit /store R:\Boot\BCD /enum /v, where R:\Boot\BCD is the path to the BCD store – this is usually on the system partition for BIOS or the EFI system partition for UEFI (it doesn’t have a drive letter so you have to mount it manually)). Once you have that, run the command as: ReAgentC /enable /osguid {603c0be6-5c91-11e3-8c88-8f43aa31e915}

    The /enable option requires \Windows\System32\Recovery\ (on the OS partition) to be present and have correct entries.

  • ReAgentC /BootToRE tells the boot loader to boot into Windows RE the next time this computer reboots. This too only works against the currently running OS – you cannot make changes to an offline image.
  • ReAgentC /info gives the status of Windows RE for the currently running OS. Add the switch /target E:\Windows to get info for the OS installed on the E: drive (which could be a partition on the disk or something you’ve mounted manually).
  • ReAgentc.exe /SetREimage /path R:\Recovery\WindowsRE\ tells the currently running OS that its Windows RE is at the specified path. In the example, R:\Recovery\WindowsRE would be the system partition or Windows RE tools partition that you’ll have mounted manually, and this path contains the winre.wim file. As before, add the switch /target E:\Windows to set the recovery image for the OS installed on the E: drive.

Operation failed: 3

On my system ReAgentC was working fine until a few days ago but is now failing with the “Operation failed: 3” error.

I suspect I must have borked it somehow while making changes for my previous post on Hyper-V, but I can’t find anything to indicate a problem. Assuming I manage to fix it some time, I’ll post about it later.

BCD

I think it’s a good idea to talk about BCD when talking about Windows RE. The BCD is how the boot loader knows where to find Windows RE, and if the BCD entries for Windows RE are messed up it won’t work as expected.

BCD stands for Boot Configuration Data and it’s the Vista and upwards equivalent of boot.ini which we used to have in the XP and prior days.

Boot process difference between Windows XP (and prior) vs Windows Vista (and later)

Windows XP, Windows Server 2003, and Windows 2000 had three files that were related to the boot process:

  • NTLDR (NT Loader) – which was the boot manager and boot loader, usually installed to the MBR (or to the PBR and chainloaded if you had GRUB and such in the MBR)
  • NTdetect.com – which was responsible for detecting the hardware and passing this info to NTLDR
  • BOOT.INI – a text file which contained the boot configuration (which partitions had which OS, how long to wait before booting, any kernel switches to pass on, etc) and was usually present along with NTLDR

From Windows Vista and up these are replaced with a new set of files:

  • BootMgr (Windows Boot Manager) – a boot manager that is responsible for showing the boot options to the user and loading the available OSes. Under XP and prior this functionality was provided by NTLDR (which also loaded the OS) but now it’s a separate program of its own. While NTLDR used to read its options from the BOOT.INI file, BootMgr reads its options from the BCD store.
  • BCD (Boot Configuration Data) – a binary file which replaces BOOT.INI and now contains the boot configuration data. This file has the same format as the Windows registry, and in fact once the OS is up and running the BCD is loaded under HKEY_LOCAL_MACHINE\BCD00000000.

    The BCD is stored in the EFI system partition on UEFI systems, or in the system partition on BIOS systems, under the \Boot folder (it’s a hidden system file so not visible by default). Since it is a binary file (unlike BOOT.INI, which is a text file) the entries in it can’t be managed via Notepad or any other text editor. One has to use the BCDEdit.exe tool that’s part of Windows, or third-party tools such as EasyBCD.

  • winload.exe – I mentioned earlier that the boot manager functionality of NTLDR is now taken up by BootMgr. What remains is the boot loader functionality – the task of actually loading the kernel and drivers from disk – and that is now taken care of by winload.exe. In addition, winload.exe also does the hardware detection stuff that was previously done by NTdetect.com.

Vista: the misunderstood Windows

I think this is a good place to mention that while Windows Vista may have been a derided release from a consumer point of view, it was actually a very important release in terms of laying the foundations for future versions of Windows.

Once upon a time we had MS-DOS, Windows 3.x, and Windows 95, 98, ME. These had a common set of technologies. Then there was Windows NT, which was different from these.

Windows 2000 “married” the Windows NT and Windows 9x/ ME lines. It laid a new foundation upon which later OSes such as Windows XP and Windows Server 2003 were based. All of these are based on Windows NT and have a common set of technologies. If you know one of them, you can work your way around the others through a bit of trial and error. Some features may be added or missing, but more or less you can figure things out.

Then came Windows Vista and Server 2008. While these are still similar to Windows XP and Windows Server 2003, they are also very different in a lot of ways. Windows Vista and Server 2008 laid the foundations for changes that were further refined in Windows 7, Windows 8, Server 2008 R2, and so on – changes such as WIM files, the new boot process, UAC, deployment tools, CBS (Component Based Servicing), and so on. If the only thing you have worked on is Windows XP, sure, you can get around a bit with Windows Vista or 7, but as you start going deeper into things you’ll realize a lot of things are way different.

Back in the BOOT.INI days you specified disks and partitions in terms of numbers. The BIOS assigned numbers to disks, and the BOOT.INI file had entries such as multi(0)disk(0)rdisk(0)partition(1)\WINDOWS which specified the Windows folder on a partition (in this case the 1st partition of the 1st disk) that was to be booted. This was simple and mostly did the trick, except when you moved disks around or added/ deleted partitions. Then the entry would be out of date and the boot process would fail.

BCD does away with all this.

BCD uses the disk’s GPT identifier or MBR signature to identify the disk (so changing the order of disks won’t affect the boot process any more). Further, each boot entry is an object in the BCD file and these objects have unique GUIDs. (These are the objects I showed through the bcdedit.exe /enum all command above). The object contains the disk signature as well as the partition offset (the sector from where the partition starts on that disk) where it’s supposed to boot from. Thus to boot any entry all BootMgr needs to do is scan the connected disks for the one with the matching signature and then find the partition specified by the offset. This makes BCD independent of the disk numbers assigned by BIOS and it is unaffected by changes made to the order of disks.

A downside of BCD is that while with BOOT.INI one could move the OS to a different disk with the same partitioning and hope for it to boot, that won’t do with BCD as the disk signatures won’t match. BootMgr will scan for the disk signature in the BCD object, not find it, and complain that it cannot find the boot device and/ or winload.exe. (This is not a big deal because BCDEdit can be used to fix the record but it’s something to keep in mind).

Here’s the output from BCDEdit on my machine. There’s two sets of output here – one with a /v switch, the other without.
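The two commands being:

    bcdedit /enum
    bcdedit /enum /v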

Couple of things to note here.

First, notice what I meant about each entry being an “object”. As you can see each entry has properties and values – unlike in BOOT.INI days where everything was on a single line with spaces between options.

Second, the /enum switch shows all the active entries in BCD but by default skips the GUID for objects that are universal or known. For instance, the GUID for the boot manager is always {9dea862c-5cdd-4e70-acc1-f32b344d4795}, so BCDEdit replaces it with {bootmgr} in the output. Similarly it replaces the GUID for the currently loaded OS – which isn’t universal, but is known as it’s the currently loaded one – with {current}. BCDEdit does this to make it easier for the end user to read the output and/ or to refer to these objects when making changes. If you don’t want such “friendly” output use the /v switch like I did in the second case above.

The registry stores the objects as GUIDs. So if I were to take the GUID of the currently running system from the output above and look at the registry I’ll see similar details:

Going back to the BCDEdit output, if we compare the device entries for the {bootmgr} and {current} entries, we see the former represented as partition=\Device\HarddiskVolume1 and the latter as the friendlier drive-letter version partition=C: (because that partition has a drive letter). BCD numbers volumes from 1, so \Device\HarddiskVolume1 refers to the first volume across all the disks on the computer. This is worth emphasising: the \Device\HarddiskVolumeNN representation is not how BCD stores the data internally. Internally BCD uses the disk signature and offset as mentioned earlier, but when displaying to the end user it uses a friendlier format like \Device\HarddiskVolume1 or a drive letter.

If we compare the registry output above to the corresponding BCD output we can see the partition+disk information represented differently.

Another thing worth noting about the BCDEdit output is that it classifies the entries. The first entry is BOOTMGR so it is put under the section “Windows Boot Manager”. Subsequent entries are boot loaders so they are put under “Windows Boot Loader”. There’s only one active entry on my system, but if I had more they too would appear here.

Note that I said there’s only one active entry in my system. There are actually many more entries but these are not active. For instance, there’s an entry to boot into Windows RE but that’s not shown by default. To see all these other entries the /enum switch takes various parameters. For example: /enum osloader shows all OS loading entries, /enum bootmgr shows BOOTMGR, /enum resume shows hibernation resume entries, and so on. To show every entry in the BCD use the switch /enum all (and to see what other options are present do /enum /? to get help).

Notice the Windows RE entry above. And notice that its GUID matches that in the ReAgent.xml file of Windows RE.

On my machine I had one more entry initially:

This is an incorrect entry because the GUID of this entry doesn’t match the Windows RE GUID in the ReAgent.xml file so I deleted it:

Speaking of Windows RE, one of the things we can do from Windows RE (and only from Windows RE!) is repair the MBR, boot sector, and BCD with a tool called Bootrec. To fix only the MBR there’s a tool called bootsect, which is available in Windows 8 and above (or from Windows PE in the case of Windows 7). This tool can replace the MBR with BOOTMGR- or NTLDR-compatible code and is often useful for fixing unbootable systems.

Another useful tool to be aware of is BCDBoot. This tool is used to create a new BCD store and/ or install the boot loader and related files. I used this tool in previous posts to install the UEFI bootloader and the BIOS bootloader.

Before I conclude I’d like to link to three posts by Mark Minasi on BCD. They go into similar material as what I did above but I feel are better presented (they talk about the various switches for instance, whereas I just mention them in passing):

Finally, BCDEdit also supports options like the ones you could set in BOOT.INI (for example: use a standard VGA driver, disable/ enable PAE, disable/ enable DEP). You set these options via bcdedit /set {GUID} ..., wherein {GUID} is the ID of the boot entry you want to change settings on and ... is replaced with the options you want to change. See this MSDN article for more information on the options and how to set them. Common BOOT.INI settings and their new equivalents can be found at this MSDN article.
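For instance, DEP can be toggled via the nx option on a boot entry (here the {current} entry):

    bcdedit /set {current} nx AlwaysOff

(and bcdedit /set {current} nx OptIn puts it back to the default).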

That’s all for now!

Down the rabbit hole

Ever had this feeling that when you want to do one particular thing, a whole lot of other things keep coming into the picture leading you to other distracting paths?

For about a week now I’ve been meaning to write some posts about my Active Directory workshop. In a typical me fashion, I thought I’d set up some VMs and stuff on my laptop. This being a different laptop to my usual one, I thought of using Hyper-V. And then I thought why not use differencing VHDs to save space. And then I thought why not use a Gen 2 VM. Which doesn’t work so I went on a tangent reading about UEFI’s boot process and writing a blog post on that. Then I went into making an answer file to use while installing, went into refreshing myself on the PowerShell cmdlets I can use to do the initial configuring of Server Core 2012, made a little script to take care of that for multiple servers, and so on …

Finally I got around to installing a member server yesterday. Thought this would be easy – I know all the steps from before, just that I have to use a Server 2012 GUI WIM instead of a Core WIM. But nope! Now the ReAgentC.exe command on my computer doesn't work! It worked till about 3 days ago but has now suddenly stopped working – so irritating! Of course, I could skip the WinRE partition – not that I use it anyways! – or just use a Gen 1 VM, but that just isn't me. I don't like to give up or backtrack from a problem. Every one of these is a learning opportunity, because now I am reading about Component Based Servicing, the Windows Recovery Environment, and learning about new DISM cleanup options that I wasn't even aware of. But the problem is one of balance. I can't afford to lose myself too much in learning new things because I'll soon lose sight of the original goal of making Active Directory related posts.

It’s exciting though! And this is what I like and dislike about embarking on a project like this (writing Active Directory related posts). I like stumbling upon new issues and learning new things and working through them; but I dislike having to be on guard so I don’t go too deep down the hole and lose sight of what I had set out to do.

Here’s a snapshot of where I am now:

[screenshot: my WorkFlowy outline]

It’s from WorkFlowy, a tool that I use to keep track of such stuff. I could write a blog post raving about it but I’ll just point you to this excellent review by Farhad Manjoo instead.

Downloading Trace32 and CMTrace for easy log file reading

I was working with a log file recently (C:\Windows\Logs\cbs\CBS.log to be precise, to troubleshoot an issue I am having on my laptop, which I hope to sort soon and write a blog post about). Initially I was opening the file in Notepad but that isn't a great way of going through log files. Then I remembered that at work I use Trace32 from the SCCM 2007 Toolkit. So I downloaded it from Microsoft. Then I learnt Trace32's been replaced by a tool called CMTrace in SCCM 2012 R2.

Here’s links to both the toolkits:

For the 2007 toolkit when installing choose the option to only install the Common Tools and skip the rest. That will install only Trace32 at C:\Program Files (x86)\ConfigMgr 2007 Toolkit V2 (add this to your PATH variable for ease of access).

[screenshot: ConfigMgr 2007 Toolkit setup, Common Tools only]

For the 2012 R2 toolkit choose the option to install only the Client Tools and skip the rest. That will install CMTrace and a few other tools at C:\Program Files (x86)\ConfigMgr 2012 Toolkit R2\ClientTools (add this too to your PATH variable).

[screenshot: ConfigMgr 2012 R2 Toolkit setup, Client Tools only]

That’s all! Happy troubleshooting!

Tip: View hidden files and folders in PowerShell

Just as a reference to my future self …

To view hidden files & folders in a directory via PowerShell use the -Force switch with Get-ChildItem:
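For example:

    # -Force makes Get-ChildItem list hidden and system items too
    Get-ChildItem -Force
    # works with a path as well, e.g. the root of C:
    Get-ChildItem -Force C:\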

Hyper-V differencing disks with an answer file

If you follow the differencing VHD approach from my previous posts (this & this) you’ll notice the boot process starts off by getting the devices ready, does a reboot or two, and then you are taken to a prompt to set the Administrator password.

Do that and you are set. There’s no other prompting in terms of selecting the image, partitioning etc (because we have bypassed all these stages of the install process).

It would be good if I could somehow specify the admin password and the server name automatically – say via an answer file. That would take care of the two basic things I always do anyway. My admin password is common for all the machines, and the server name is the same as the VM name, so these can be figured out automatically and used with an answer file.

The proper way to create an answer file is too much work and not really needed here. So I Googled for answer files, found one, removed most of it as it was unnecessary, and the result is something like this:
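Something along these lines (a minimal sketch rather than the exact file; the password value is a stand-in, and the marker is written --REPLACE--):

    <?xml version="1.0" encoding="utf-8"?>
    <unattend xmlns="urn:schemas-microsoft-com:unattend">
        <settings pass="specialize">
            <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64"
                       publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
                <ComputerName>--REPLACE--</ComputerName>
            </component>
        </settings>
        <settings pass="oobeSystem">
            <component name="Microsoft-Windows-Shell-Setup" processorArchitecture="amd64"
                       publicKeyToken="31bf3856ad364e35" language="neutral" versionScope="nonSxS">
                <UserAccounts>
                    <AdministratorPassword>
                        <Value>MyCommonPassword</Value>
                        <PlainText>true</PlainText>
                    </AdministratorPassword>
                </UserAccounts>
            </component>
        </settings>
    </unattend>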

If you replace the text marked with –REPLACE– with the computer name, and save this to the c:\ of a newly created VM, the password and computer name will be automatically set for you!

So here’s what I do in addition to the steps before.

Create the differencing VHD as usual

Save the XML file above as "Unattend.xml". It doesn't matter where you save it as long as you remember the path; I'll assume it's in my current directory. If it is saved anyplace else replace the path accordingly in the second cmdlet below.

Mount the VHD, copy the answer file over replacing the machine name with what you want, dismount the VHD. Finito!
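A sketch of those three steps, assuming the VHD is .\server1.vhdx, the VM is to be named server1, and the OS volume is the only one with a drive letter:

    # mount the differencing VHD and find the drive letter it was assigned
    $drive = (Mount-VHD -Path .\server1.vhdx -Passthru | Get-Disk |
        Get-Partition | Get-Volume).DriveLetter
    # copy the answer file over, substituting the computer name
    (Get-Content .\Unattend.xml) -replace '--REPLACE--', 'server1' |
        Set-Content "${drive}:\Unattend.xml"
    Dismount-VHD -Path .\server1.vhdx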

That’s it really.

A different way to manipulate the XML file

I used the -replace operator above to make changes to the XML file. But I can do things differently too as PowerShell understands XML natively.
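A sketch of the same edit using the [xml] type accelerator (element names assume the answer file above):

    $xml = [xml](Get-Content .\Unattend.xml)
    # drill into the specialize pass and set the computer name directly
    $xml.unattend.settings | Where-Object { $_.pass -eq 'specialize' } |
        ForEach-Object { $_.component.ComputerName = 'server1' }
    $xml.Save("$PWD\Unattend.xml")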

Three cmdlets instead of one, but this might feel “neater”.

Notes on UEFI, GPT, UEFI boot process, disk partitions, and Hyper-V differencing disks with a Generation 2 VM

In my previous post I had talked about creating differencing VHDs for use with Hyper-V. While making that post I realized that what I was doing doesn't work with Generation 2 VMs. Investigating a bit into that brought me to my old friend UEFI. I say "old friend" because UEFI is something I have been reading about off and on for the past few months – mainly due to my interest in encryption. For instance, my laptop with Self Encrypting SSDs can only be managed by BitLocker if I install Windows 8 in UEFI mode. By default it had installed in BIOS mode (and continued to do so when I re-installed) so a few months ago I read up on UEFI and figured out how to install Windows 8 on that laptop in UEFI mode.

Then at work we started getting UEFI computers and so I spent some time going through the firmware on those computers just to get a hang of UEFI.

And then last month I bought a Notion Ink Cain tablet, and to get encryption working on it I had to enable Secure Boot (which is a part of UEFI) so once again I found myself reading about UEFI. That was a fun exercise (and something I am yet to post about) so I have been meaning to write about UEFI for a while just that I never got around to it. Since I stumbled upon UEFI again today, might as well do so now.

So what is UEFI? Simply put UEFI is a firmware specification that’s meant to replace BIOS. Most modern laptops and desktops come with UEFI but it looks and behaves like BIOS so you might not notice the difference until you delve in. In this post I’ll focus on the boot process of BIOS and UEFI as that’s what I am interested in.

BIOS boot process

With BIOS you have an MBR (Master Boot Record). In BIOS you specify the boot order of disks, and each of these disks is searched for the MBR by BIOS. The MBR is the first sector of a disk and it contains information on the partitions in the disk as well as a special program (called a “boot loader”) which can load OSes from these partitions. Since the MBR is at a standard location the BIOS can pass control to the boot loader located there. The BIOS doesn’t need to know anything about the OSes or their file systems – things are dumb & simple.

BIOS has limitations in terms of the size of disks it can work with, the limited space available to the boot loader (because of which you have to use quirks like “chain loaders” and such), and so on. BIOS is good, but its time has come … its replacement is UEFI.

What is UEFI?

BIOS stands for "Basic Input/ Output System". UEFI stands for "Unified Extensible Firmware Interface". UEFI began as EFI, which was developed by Intel; it is now managed by the UEFI Forum. Neither BIOS nor UEFI is a specific piece of software. Rather, they are specifications that define the interface between the firmware and the OS. The UEFI specification is actively managed, and there are many versions of it, with each version adding more capabilities. For instance, version 2.2 added the Secure Boot protocol; version 2.1 added cryptography features. As of this writing UEFI is at version 2.4.

In contrast, BIOS doesn’t have a specification as such. Various BIOS implementations have their own feature set and there’s no standard.

For backward compatibility UEFI can behave like BIOS. The UEFI specification defines a Compatibility Support Module (CSM) which can emulate BIOS. Bear in mind, it is still UEFI firmware, just that it behaves like BIOS firmware without any of the additional UEFI features or advantages. You can’t have both UEFI and BIOS on a computer – only one of them is present, after all they are both firmware!

UEFI classes

The UEFI forum defines four classes for computers:

  1. Class 0 – The computer has no UEFI, only BIOS.
  2. Class 1 – The computer has UEFI with CSM only. So it has UEFI but behaves in a BIOS compatible mode.
  3. Class 2 – The computer has UEFI and CSM. So it can behave in BIOS-compatible mode if need be.
  4. Class 3 – The computer has UEFI only, no CSM.

It’s important to be aware of what class your computer is. Hyper-V Generation 2 VMs, for instance, behave as Class 3 computers. They have no CSM. (Moreover Hyper-V Generation 2 does not have a 32-bit implementation of UEFI so only 64-bit guest OSes are supported).

UEFI and GPT

UEFI has a different boot process from BIOS. For starters, it doesn't use the MBR. UEFI uses a newer partitioning scheme called GPT (GUID Partition Table) that doesn't have many of MBR's limitations.

If your disk partitioning is MBR and your system has UEFI firmware, it will boot but in CSM mode. So be sure to choose GPT partitioning if you want to use UEFI without CSM.

Also, even though your machine has UEFI, when trying to install Windows it might boot the Windows installer in CSM mode. When you press F9 or whatever key to select the boot media, usually there’s an option which lets you boot in UEFI mode or BIOS/ CSM mode. Sometimes the option isn’t explicit and if the boot media has both UEFI and BIOS boot files, the wrong one may be chosen and UEFI will behave in CSM mode. It is possible to detect which mode Windows PE (which runs during Windows install) is running in. It is also possible to force the install media to boot in UEFI or CSM mode by deleting the boot files of the mode you don’t want.
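For instance, from within Windows PE you can read the firmware mode off the registry; this is the method Microsoft documents (a PEFirmwareType value of 0x1 means BIOS/ CSM, 0x2 means UEFI):

    wpeutil UpdateBootInfo
    reg query HKLM\System\CurrentControlSet\Control /v PEFirmwareType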

My laptop, for instance, is UEFI. But each time I'd install Windows 8 onto it, it would pick up the BIOS boot loader files and boot in CSM mode. Since I wanted to use UEFI for some of its features, I used Rufus to create a bootable USB of the media (be sure to select "GPT partitioning for UEFI computers") and when I booted from it Windows installed in UEFI mode. The trick isn't the GPT partitioning. The trick is that by telling Rufus we want to boot on a UEFI computer, it omits the BIOS specific boot loader files from the USB. It is not necessary to use Rufus – the process can be done manually too.

UEFI and GPT work with both 32-bit and 64-bit Windows. The catch is that booting from GPT is only supported for 64-bit Windows running in UEFI. So while you can have 32-bit Windows running in UEFI, it will need an MBR partition to boot from. What this means is that such a system will be running as UEFI Class 2, as that's the only class which supports UEFI with MBR partitions (essentially the system has UEFI but behaves in BIOS-compatible mode).

UEFI classes and MBR/GPT partitioning

With Windows you can use MBR or GPT partitions on your computer depending on its class. From this Microsoft page:

  • UEFI Class 0 – Uses MBR partitions.
  • UEFI Class 1 – Uses GPT partitions.
  • UEFI Class 2 – Uses GPT partitions. This class of UEFI support includes CSM so if MBR partitions are present UEFI will run in compatibility mode.
  • UEFI Class 3 – Uses GPT partitions.

I am not clear why Class 1 only uses GPT partitions. Considering Class 1 is UEFI with CSM only and CSM supports MBR, I would have thought Class 1 supports only MBR partitions.

UEFI boot process

The UEFI boot process is more complicated than BIOS. That doesn't mean it's difficult to understand or unnecessarily complicated. What I mean is that it isn't as simple as having an MBR with a boot loader, as in the case of BIOS. You can't pass along a VHD file created with BIOS in mind to a machine having only UEFI and expect it to work (as was my case). You need to tweak things so the boot process works with UEFI.

An excellent blog post on the UEFI boot process is this. If you have the time and inclination, go read it! You will be glad you did. What follows are my notes from that post and some others.

  • The UEFI specifications define a type of executable (think .exe files) that all UEFI firmware must support. Each OS that wishes the UEFI firmware to be able to boot it will provide a boot loader of this type. That’s it. OS provides such a boot loader, UEFI loads it.
  • In BIOS the boot loader was present in the MBR. Where is it present in UEFI? In order to be not limited by space like BIOS was, UEFI defines a special partition where boot loaders can be stored. The partition doesn’t have to be of a specific size or at a specific location. The spec requires that all UEFI firmware must be able to read a variant of the FAT file system that’s defined in the spec. (UEFI firmware can read other file system types too if they so wish, but support for this variant of FAT is a must). So UEFI boot loaders are stored in a special partition that’s of file system type FAT (the variant defined by UEFI). And to denote this partition as the special partition it has a different type (i.e. it doesn’t say FAT32 or NTFS or EXT2FS etc, it says ESP (EFI System Partition)). Simple! (Oh, and there can be multiple ESP partitions too if you so wish!)

The above design makes UEFI much more reliable than BIOS. Whereas with the latter you could only store a very limited boot loader at a specific space on the disk – and that boot loader usually chain loaded the OSes – with UEFI you can store boot loaders (in the EFI executable format) of each OS in the ESP partition that’s of file system type FAT (the variant defined by UEFI). Already you have a lot more flexibility compared to BIOS.

To tie all these together UEFI has a boot manager. The boot manager is what looks at all the boot loader entries and creates a menu for booting them. The menu isn’t a static one – the firmware can create a menu on the fly based on boot loaders present across multiple disks attached to the computer. And this boot manager can be managed by tools in the installed OS too. (Sure you could do similar things with Linux boot loaders such as GRUB, but the neat thing here is that the functionality is provided by the firmware – independent of the OS – which is really where it should be! It’s because BIOS was so limited that we had fancy boot loaders like GRUB that worked around it).

If you go down to the section entitled “The UEFI boot manager” in the post I linked to earlier you’ll see an example of a boot manager output. No point me paraphrasing what the author has said, so best to go and check there. I’ll mention one interesting point though:

  • Remember I said there are ESP partitions and they contain the OS boot loaders? So, for instance, you could have a UEFI boot manager entry like HD(1,800,61800,6d98f360-cb3e-4727-8fed-5ce0c040365d)File(\EFI\fedora\grubx64.efi) which points to the partition called HD(1,800,61800,6d98f360-cb3e-4727-8fed-5ce0c040365d) (the naming convention follows the EFI_DEVICE_PATH_PROTOCOL specification) and specifically the \EFI\fedora\grubx64.efi file as the boot loader.
  • What you can also have, however, is a generic entry such as HD(2,0,00). Note there's no boot loader specified here, and probably no specific ESP partition either. What happens in such cases is that the boot manager goes through each ESP partition on that disk, checks for a file \EFI\BOOT\BOOT{machine type short-name}.EFI, and tries loading that. This way the UEFI spec allows one to boot from a hard disk without specifying the OS or path to the boot loader, as long as the disk contains a "default" boot loader as per the naming convention above. This is what happens, for instance, when you boot a Windows 8 DVD. If you put such a DVD in your computer and check, you'll see the root folder has a folder called EFI that contains a sub folder called BOOT which contains a file called bootx64.efi.

Another example and screenshot of the UEFI boot manager can be found at this link.

Tying this in with my WIM to VHD case

If you have read this far, it's obvious what's wrong with my VHD file. When the Gen 2 VM boots up – and it uses UEFI as it's a Gen 2 VM – it will look for an ESP partition with the UEFI boot loader but won't find any (as my VHD has only one partition, and that of type NTFS). So what I need to do is create an ESP partition and copy the boot loaders to it as required. Also, I am using MBR style partitioning and a Gen 2 VM's firmware is Class 3, so I must switch to GPT.

In fact, while I am at it why don’t I partition everything properly. When I install Windows manually (server or desktop) it creates many partitions so this looks like a good opportunity to read up on the Windows partitioning scheme and create any other required partitions on my base disk.

Understanding (GPT/UEFI) disk partitions for Windows

There are three Microsoft pages I referred to:

Read those for more details than what I post below.

The following partitions are required:

  • System partition: This is the EFI System Partition (ESP). Minimum size of the partition is 100 MB, FAT32 formatted. For Windows, the ESP holds the boot manager files required to boot the system. The partition GUID for ESP is DEFINE_GUID (PARTITION_SYSTEM_GUID, 0xC12A7328L, 0xF81F, 0x11D2, 0xBA, 0x4B, 0x00, 0xA0, 0xC9, 0x3E, 0xC9, 0x3B) (on an MBR disk the ID is 0xEF; but remember, Windows doesn't support booting into UEFI mode from MBR partitions). The type of this partition is c12a7328-f81f-11d2-ba4b-00a0c93ec93b. Windows does not support having two ESPs on a single disk.
  • Microsoft Reserved Partition (MSR): Whereas with BIOS/ MBR one could have hidden sectors, UEFI does away with all that. So Microsoft recommends a reserved partition be set aside instead of such hidden sectors. The size of this partition is 128 MB (for drives larger than 16GB; else the size is 32 MB). It does not have any data – think of the MSR as free space set aside for future use by Windows – it is used when any disk operations require extra space and/ or partitions and they can’t use the existing space and/ or partitions. The partition GUID for MSR is DEFINE_GUID (PARTITION_MSFT_RESERVED_GUID, 0xE3C9E316L, 0x0B5C, 0x4DB8, 0x81, 0x7D, 0xF9, 0x2D, 0xF0, 0x02, 0x15, 0xAE). The type of this partition is e3c9e316-0b5c-4db8-817d-f92df00215ae.

The order of these partitions is: ESP, followed by any OEM partitions, followed by MSR, followed by the OS & data partitions. (See this link for a nice picture).

Apart from the two above, Microsoft recommends two other partitions (note: these are recommended, not required):

  • Windows Recovery Environment (Windows RE) tools partition: This must be at least 300 MB, preferably 500 MB or larger, and contains the Windows RE tools image (winre.wim) which is about 300 MB in size. It is preferred that these tools are on a separate partition in case the main partition is BitLocker encrypted, and even otherwise to ensure the files in this partition are preserved in case the main partition is wiped out. The type of this partition is de94bba4-06d1-4d40-a16a-bfd50179d6ac.
  • Recovery image partition: This must be at least 2 GB, preferably 3 GB, and contains the Windows recovery image (install.wim) which is about 2 GB in size. This partition must be placed after all other partitions so its space can be reclaimed later if need be. The type of this partition is de94bba4-06d1-4d40-a16a-bfd50179d6ac.

Finally, the disk has basic data partitions which are the usual partitions containing the OS and data. These partitions have the GUID DEFINE_GUID (PARTITION_BASIC_DATA_GUID, 0xEBD0A0A2L, 0xB9E5, 0x4433, 0x87, 0xC0, 0x68, 0xB6, 0xB7, 0x26, 0x99, 0xC7). The minimum size for the partition containing the OS is 20 GB for 64-bit and 16 GB for 32-bit. The OS partition must be formatted as NTFS. The type of these partitions is ebd0a0a2-b9e5-4433-87c0-68b6b72699c7.

The order of all these partitions is: Windows RE tools, followed by ESP, followed by any OEM partitions, followed by MSR, followed by the data partitions, and finally the Recovery image partition.

It is worth pointing out that when installing Windows via Setup it is possible to create all the above partitions automatically via an answer file. But in my scenario I am applying a WIM image to a VHD partition and creating all the partitions myself, so I need a way to do this manually.

Let’s make some partitions!

Now back to my VHDs. To recap, previously I had shown how I apply an OS image from a WIM file to a (base) VHD and then make differencing VHDs off that base VHD for my Hyper-V VMs. The VHD thus created works well for Generation 1 VMs but fails for Generation 2 VMs. As we have learnt from the current post that’s because (a) I was using MBR partitions instead of GPT and (b) I hadn’t created any ESP partitions for the UEFI firmware to pick a boot loader from. Hyper-V Generation 2 VMs have a Class 3 UEFI firmware, so they don’t do any of the CSM/ BIOS compatibility stuff.

As before, create a new VHD and initialize it. Two changes from before: I am now using a size of 25 GB instead of 20 GB, and I initialize the disk as GPT.
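Something like this (the path and size are examples):

    # create a dynamic VHDX, attach it, and initialize it as GPT
    New-VHD -Path C:\VHDs\base.vhdx -SizeBytes 25GB -Dynamic |
        Mount-VHD -Passthru |
        Initialize-Disk -PartitionStyle GPT -Passthru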

Confirm that the disk is visible and note its number:
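The VHD shows up as a Msft Virtual Disk; note its Number, which I'll assume is 1 in the snippets below:

    Get-Disk | Format-Table Number, FriendlyName, PartitionStyle, Size -AutoSize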

By default the newly created disk has a 128 MB MSR partition. Since the ESP has to be before this partition let’s remove that.
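Like so (disk number 1 as noted above):

    # the MSR is the only partition on the freshly initialized disk; remove it
    Get-Partition -DiskNumber 1 | Remove-Partition -Confirm:$false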

Then create new partitions:
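A sketch using the partition type GUIDs from earlier (sizes follow the recommendations above; whatever remains goes to the recovery image partition):

    # (1) Windows RE tools partition - 300 MB
    New-Partition -DiskNumber 1 -Size 300MB -GptType '{de94bba4-06d1-4d40-a16a-bfd50179d6ac}'
    # (2) EFI System Partition - 100 MB
    New-Partition -DiskNumber 1 -Size 100MB -GptType '{c12a7328-f81f-11d2-ba4b-00a0c93ec93b}'
    # (3) Microsoft Reserved Partition - 128 MB
    New-Partition -DiskNumber 1 -Size 128MB -GptType '{e3c9e316-0b5c-4db8-817d-f92df00215ae}'
    # (4) OS partition - gets a drive letter so the image can be applied to it
    New-Partition -DiskNumber 1 -Size 21GB -GptType '{ebd0a0a2-b9e5-4433-87c0-68b6b72699c7}' -AssignDriveLetter |
        Format-Volume -FileSystem NTFS -NewFileSystemLabel 'OS'
    # (5) Recovery image partition - whatever remains
    New-Partition -DiskNumber 1 -UseMaximumSize -GptType '{de94bba4-06d1-4d40-a16a-bfd50179d6ac}'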

Just double-checking:
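(again assuming disk number 1):

    Get-Partition -DiskNumber 1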

Wunderbar!

Next I apply the image I want as before:
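In my case that was along these lines (the WIM path, image index, and the OS partition's drive letter E: are assumptions):

    # apply the image from the install WIM to the OS partition
    Expand-WindowsImage -ImagePath D:\sources\install.wim -Index 2 -ApplyPath E:\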

That takes care of the data partition. Let’s look at the other ones now.

WinRE tools partition

This is the first partition on the disk. I will (1) format it as FAT32, (2) mount it to a temporary drive letter, (3) copy the WinRE.WIM file from E:\Windows\System32\Recovery (change E: to whatever letter is assigned to the OS partition), (4) register the Windows RE image with the OS, and (5) dismount it.
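A sketch of those five steps (disk number 1, temporary letter T:, and OS letter E: are assumptions):

    # (1) format the WinRE tools partition
    Get-Partition -DiskNumber 1 -PartitionNumber 1 | Format-Volume -FileSystem FAT32
    # (2) mount it to a temporary drive letter
    Set-Partition -DiskNumber 1 -PartitionNumber 1 -NewDriveLetter T
    # (3) copy the WinRE image over
    New-Item -ItemType Directory -Path T:\Recovery\WindowsRE | Out-Null
    Copy-Item E:\Windows\System32\Recovery\WinRE.wim T:\Recovery\WindowsRE\
    # (4) register the Windows RE image with the (offline) OS
    reagentc /setreimage /path T:\Recovery\WindowsRE /target E:\Windows
    # (5) remove the temporary drive letter
    Remove-PartitionAccessPath -DiskNumber 1 -PartitionNumber 1 -AccessPath 'T:\'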

Thanks to this TechNet article on how to register the Windows RE image. The WinRE.WIM image can be customized too with drivers and other tools if required but I won’t be doing any of that here.

Thanks to one of my readers (Exotic Hadron) for pointing out that the winre.wim file is only present in %SYSTEMROOT%\System32\Recovery if Windows was installed by expanding install.wim (like in the above case). On a typical system where Windows is installed via the setup program the file won’t be present here.

Just double-checking that Windows RE is registered correctly:
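(E: again being the OS partition):

    reagentc /info /target E:\Windows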

EFI System Partition

This is the second partition on the disk. As before I will format this as FAT32 and mount to a temporary drive letter. (Note: I use a different cmdlet to assign drive letter here, but you can use the previous cmdlet too).
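A sketch (S: is an assumption):

    # assign a temporary drive letter, with a different cmdlet this time
    Add-PartitionAccessPath -DiskNumber 1 -PartitionNumber 2 -AccessPath 'S:\'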

Format the partition as FAT32. The Format-Volume cmdlet doesn't work here (I am guessing it can't work with "System" partitions) so I use the usual format command instead:
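(answer the confirmation prompt when it appears):

    format S: /FS:FAT32 /Q /V:ESP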

Get the boot loaders over to this drive, confirm they are copied, and remove the drive letter:
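BCDBoot, which came up earlier, does the heavy lifting here:

    # copy the UEFI boot files from the applied image to the ESP
    bcdboot E:\Windows /s S: /f UEFI
    # confirm the files made it across
    Get-ChildItem -Recurse S:\EFI
    # remove the temporary drive letter
    Remove-PartitionAccessPath -DiskNumber 1 -PartitionNumber 2 -AccessPath 'S:\'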

Phew!

Recovery Image partition

Last thing, let’s sort out the recovery image partition.

I am going to skip this (even though I made the partition as an example) because there’s no point wasting the space in my use case. All one has to do to sort out the recovery image partition is to mount it like with the WinRE tools partition and copy over the install.wim file to it. Then use the ReAgentc.exe command to register that image with that installation of Windows. (See steps 5-7 of this link).
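For completeness, the registering step would look something like this (the paths, index, and R: drive letter are assumptions):

    # with install.wim copied to the mounted recovery image partition:
    reagentc /setosimage /path R:\RecoveryImage /index 2 /target E:\Windows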

That’s it!

Now dismount the VHD, make a differencing VHD as before, create a VM with this VHD and fire away!
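Roughly (names and sizes are examples):

    Dismount-VHD -Path C:\VHDs\base.vhdx
    # differencing disk for the new VM
    New-VHD -Path C:\VHDs\server1.vhdx -ParentPath C:\VHDs\base.vhdx -Differencing
    # a Generation 2 VM pointed at the differencing disk
    New-VM -Name server1 -Generation 2 -MemoryStartupBytes 1GB -VHDPath C:\VHDs\server1.vhdx
    Start-VM -Name server1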

And voila! It is booting!

[screenshot: the Generation 2 VM booting successfully]