Useful offline Windows troubleshooting/ fixing tricks

Had a Windows Server 2008 R2 server that started giving a blank screen since the recent Windows update reboot. This was a VM and it was the same result via VMware console or RDP. Safe Mode didn’t help either. Bummer!

Since this is a VM I mounted its disk on another 2008 R2 VM and tried to fix the problem offline. Most of my attempts didn’t help but I thought of posting them here for reference. 

Note: In the following examples the broken VM’s disk is mounted to F: drive. 

Recent updates

I used dism to list recent updates and remove them. To list updates from this month (March 2017):

To remove an update:

I did this for each of the updates I had. That didn’t help though. And oddly I found that one of the updates kept re-appearing with a slightly different name (a different number suffixed to it actually) each time I’d remove it. Not sure why that was the case but I saw that F:\Windows\SxS had a file called pending.xml and figured this must be doing something to stop the update from being removed. I couldn’t delete the file in-spite of taking ownership and full control, so I opened it in Notepad and cleared all the contents. :o) After that the updates didn’t return but the machine was still broken. 

SFC

I used sfc to check the integrity of all the system files:

No luck with that either!

Event Logs

Maybe the Event Logs have something? These can be found at F:\Windows\System32\Winevt\Logs. Double click the ones of interest to view. 

In my case the Event Logs had nothing! No record at all of the VM starting up or what was causing it to hang. Tough luck!

Bonus info: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Eventlog contains locations of the files backing the Event Logs. Just mentioning it here as I came across this.

Drivers

Could drivers cause any issue? Unlikely. You can’t use dism to query drivers as above but you can check via registry. See this post. Honestly, I didn’t read it much. I didn’t suspect drivers and it seemed too much work fiddling through registry keys and folders. 

Last Known Good Configuration

Whenever I’d boot up the VM I never got the Last Known Good (LKG) Configuration option. I tried pressing F8 a couple of times but it had no effect. So I wondered if I could tweak this via the registry. Turns out I can. And turns out I already knew this just that I had forgotten!

Your current configuration is HKLM\System\CurrentControlSet. This is actually a link to HKLM\System\CurrentControlSet01 or HKLM\System\CurrentControlSet02 or HKLM\System\CurrentControlSet03 or … (you get the point). Each of the CurrentControlSetXXX key is one of your previous configurations. The one that’s actually used can be found via HKLM\System\Select. The entry Current points to the number of the CurrentControlSetXXX key in use. The entry LastKnownGood points to the Last Known Good Configuration. Now we know what to do. 

  1. Mount the HKLM\SYSTEM hive of the broken VM. All registry hives can be found under %windir%\System32\Config. In my case that translates to the file F:\Windows\System32\Config\SYSTEM.
  2. To mount this file open Registry Editor, select the HKLM hive, and go to File > Load Hive. (This is a good post with screenshots etc).  
  3. Go to the Select key above. Change Current to whatever LastKnownGood was. 
  4. That’s all. Now unload the hive and you are done.

This helped in my case! I was finally able to move past the blank screen and get a login prompt. Upon login I was also able to download and install all the patches and confirm that the VM is now working fine (took a snapshot of course, just in case!). I have no idea what went wrong, but at least I have the pleasure of being able to fix it. From the post I link to below, I’d say it looks like a registry hive corruption. 

Since I successfully logged in, my machine’s Last Known Good Configuration will be automatically updated by Windows with the current one. Here’s a blog post that explains this in more detail. 

That’s all! Hope this helps someone.