Creative Commons Attribution 4.0 International License
© Rakhesh Sasidharan

IE 11 update fails due to prerequisite updates (KB2729094)

IE 11 update requires the following prerequisite updates – link.

Even after installing those (most of which were already present), the IE 11 install will complain and fail. The log file is C:\Windows\IE_.main.log.

In my case I was getting the following error (seems to be the same for others too):

The thing is, I already had this hotfix installed, so there was nothing more to do. I found a useful support post where someone suggested running the hotfix install and launching the IE install side by side. You might need to do it 2-3 times, but that seems to make a difference. I tried that and sure enough it helped.

That post is worth a read for some other tricks, especially if you are sequencing this via SCCM. I found this article from Symantec too which seems helpful. Some day when I am in charge of SCCM too I can try such stuff out! :)

P2V a SQL cluster by breaking the cluster

Need to P2V a SQL cluster at work. Here are screenshots of what I did in a test environment to see if an idea of mine would work.

We have a SQL cluster with 2 physical nodes. The requirement was to convert this into a single virtual machine.

P2V-ing a single server is easy: use VMware Converter. But P2V-ing a cluster like this is tricky. You could P2V each node and end up with a cluster of 2 virtual nodes, but that wasn’t what we wanted. We didn’t want to deal with RDMs and such for the cluster, so we wanted to get rid of the cluster itself. VMware can provide HA if anything happens to the single node.

My idea was to break the cluster and get one of the nodes of the cluster to assume the identity of the cluster. Have SQL running off that. Virtualize this single node. And since there’s no change as far as the outside world is concerned no one’s the wiser.

Found a blog post that pretty much does what I had in mind. Found one more which was useful but didn’t really pertain to my situation. Have a look at the latter post if your DTC is on the Quorum drive (wasn’t so in my case).

So here we go.

1) Make the node that I want to retain the active node of the cluster (so it owns all the disks and databases). Then shut down SQL Server.

sqlshutdown

2) Shutdown the cluster.

clustershutdown

3) Remove the node we want to retain from the cluster.

We can’t remove/ evict the node via the GUI as the cluster is offline. Nor can we remove the Failover Cluster feature from the node as it is still part of a cluster (even though the cluster is shut down). So we need to do a bit of “surgery”. :)

Open PowerShell and do the following:
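The snippet that was here didn’t survive, but going by the description below it was presumably the Clear-ClusterNode cmdlet – a sketch:

```powershell
# Run this on the node being removed. It clears all cluster
# configuration from the node; intended for use on evicted nodes.
Import-Module FailoverClusters
Clear-ClusterNode -Force   # -Force skips the confirmation prompt
```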

This simply clears any cluster-related configuration from the node. It is meant to be used on evicted nodes.

Once that’s done remove the Failover Cluster feature and reboot the node. If you want to do this via PowerShell:
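A sketch of the PowerShell route (on Server 2008 R2 the cmdlet is Remove-WindowsFeature; the feature name is the standard one, but verify against Get-WindowsFeature on your server):

```powershell
Import-Module ServerManager
Remove-WindowsFeature Failover-Clustering
Restart-Computer
```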

4) Bring online the previously shared disks.

Once the node is up and running, open Disk Management and mark as online the shared disks that were previously part of the cluster.

disksonline

5) Change the IP and name of this node to that of the cluster.

Straightforward. Add CNAME entries in DNS if required. Also, you will have to remove the cluster computer object from AD before renaming this node to that name.

6) Make some registry changes.

The SQL Server is still not running as it expects to be on a cluster. So make some registry changes.

First go to HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\Setup and open the entry called SQLCluster and change its value from 1 to 0.

Then take a backup (just in case; we don’t really need it) of the key called HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\Cluster and delete it.

Note that MSSQL10_50.MSSQLSERVER will vary if you have a different version of SQL Server than I do.
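The registry steps above could equally be scripted. A hedged PowerShell sketch (registry paths as per my SQL 2008 R2 default instance; the backup location C:\Temp is my choice):

```powershell
$base = 'HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER'

# Tell SQL Server it is no longer clustered
Set-ItemProperty -Path "$base\Setup" -Name SQLCluster -Value 0

# Back up the Cluster key (just in case), then delete it
reg export "HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\Cluster" C:\Temp\Cluster-backup.reg
Remove-Item -Path "$base\Cluster" -Recurse
```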

7) Start the SQL services and change their startup type to Automatic.

I had 3 services.

Now your SQL server should be working.

8) Restart the server – not needed, but I did so anyways.

Test?

If you are doing this in a test environment (like I was) and don’t have any SQL applications to test with, do the following.

Right click the desktop on any computer (or the SQL server computer itself) and create a new text file. Then rename that to blah.udl. The name doesn’t matter as long as the extension is .udl. Double click on that to get a window like this:

udl

Now you can fill in the SQL server name and test it.

One thing to keep in mind (if you are not a SQL person – I am not): “Windows NT Integrated security” is what you need to use if you want to authenticate against the server with an AD account. It is tempting to select the “Use a specific user name …” option and put an AD username/ password there, but that won’t work. That option is for SQL authentication.

If you want to use a different AD account you will have to launch the tool via a run as with that account.

Also, on a fresh install of SQL server SQL authentication is disabled by default. You can create SQL accounts but authentication will fail. To enable SQL authentication right click on the server in SQL Server Management Studio and go to Properties, then go to Security and enable SQL authentication.

sqlauth

That’s all!

Now one can P2V this node.

Installing a new license key in KMS

KMS is something you log in to once in a blue moon, and then you wonder how the heck you are supposed to install a license key and verify that it got added correctly. So, as a reminder to myself.

To install a license key:

Then activate it:

If you want to check that it was added correctly:
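The snippets are gone from this post, but these would be the standard slmgr commands – install, activate, then view detailed license info (the key below is a placeholder):

```batch
slmgr /ipk XXXXX-XXXXX-XXXXX-XXXXX-XXXXX
slmgr /ato
cscript %windir%\system32\slmgr.vbs /dlv
```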

I use cscript so that the output comes in the command prompt itself and I can scroll up and down (or put into a text file) as opposed to a GUI window which I can’t navigate.

Very Brief Notes on Windows Memory etc

I have no time nowadays to update this blog but I wanted to dump these notes I made for myself today. Just so the blog has some update and I know I can find these notes here again when needed.

This is on the various types of memory etc in Windows (and other OSes in general). I was reading up on this in the context of Memory Mapped Files.

Virtual Memory

  • Amount of memory the OS can address. Not related to the physical memory; it depends only on whether the processor and OS are 32-bit or 64-bit.
  • 32-bit means 2^32 = 4GB; 64-bit means 2^64 = a lot! :)
  • On a default install of 32-bit Windows the kernel reserves 2GB for itself and applications can use the remaining 2GB. Each application gets its own 2GB – this is possible coz virtual memory doesn’t really exist and is not limited by the physical memory in the machine. The OS just lies to each application that it has 2GB of virtual memory to itself.

Physical Memory

  • This is physical. But not limited to the RAM modules. It is RAM modules plus paging file/ disk.

Committed Memory

  • When a virtual memory page is touched (read/ written/ committed) it becomes “real” – i.e. put into Physical Memory. This is Committed Memory. It is a mix of RAM modules and disk.

Commit Limit

  • The total amount of Committed Memory is obviously limited by your Physical Memory – i.e. the RAM modules plus disk space. This is the Commit Limit.

Working Set

  • Set of virtual memory pages that are committed and fully belong to that process.
  • These are memory pages that exist. They are backed by Physical Memory (RAM plus paging files). They are real, not virtual.
  • So a working set can be thought of as the subset of a process’s Virtual Memory space that is valid; i.e. can be referenced without a page fault.
    • A page fault happens when the process references a virtual page that is not in its working set. If the page can be fixed up from Physical Memory without touching disk (e.g. it is still on the standby list) that is a “soft” page fault. If it has to be read in from disk – the page file or the file it maps to – that is a “hard” page fault; the OS puts the process on hold and does the read behind the scenes. This has a performance impact, so you want to avoid page faults, hard ones especially.
    • Hard faults are bad and tied to insufficient RAM.

Life cycle of a Page

  • Pages in Working Set -> Modified Pages -> Standby Pages -> Free Pages -> Zeroed pages.
  • All of these are still in Physical RAM, just different lists on the way out.

From http://stackoverflow.com/a/22174816:

Memory can be reserved, committed, first accessed, and be part of the working set. When memory is reserved, a portion of address space is set aside, nothing else happens.

When memory is committed, the operating system guarantees that the corresponding pages could in principle exist either in physical RAM or on the page file. In other words, it counts toward its hard limit of total available pages on the system, and it formally creates pages. That is, it creates pages and pretends that they exist (when in reality they don’t exist yet).

When memory is accessed for the first time, the pages that formally exist are created so they truly exist. Either a zero page is supplied to the process, or data is read into a page from a mapping. The page is moved into the working set of the process (but will not necessarily remain in there forever).

Memory Mapped Files

Windows (and other OSes) have a feature called memory mapped files.

Typically your files are on a physical disk and there’s an I/O cost involved in using them. To improve performance what Windows can do is map a part of the virtual memory allocated to a process to the file(s) on disk.

This doesn’t copy the entire file(s) into RAM, but a part of the virtual memory address range allocated to the process is set aside as mapping to these files on disk. When the process tries to read/ write these files, the parts that are read/ written get copied into the virtual memory. The changes happen in virtual memory, and the process continues to access the data via virtual memory (for better performance) and behind the scenes the data is read/ written to disk if needed. This is what is known as memory mapped files.

My understanding is that even though I say “virtual memory” above, it is actually restricted to the Physical RAM and does not include page files (coz obviously there’s no advantage to using page files instead of the location where the file already is). So memory mapped files are mapped to Physical RAM. Memory mapped files are commonly used by Windows with binary images (EXE & DLL files).

In Task Manager the “Memory (Private Working Set)” column does not show memory mapped files. For this look to the “Commit Size” column.

Also, use tools like RAMMap (from SysInternals) or Performance Monitor.

More info

Solarwinds not seeing correct disk size; “Connection timeout. Job canceled by scheduler.” errors

Had this issue at work today. Notice the disk usage data below in Solarwinds –

Disk Usage

The ‘Logical Volumes’ section shows the correct info but the ‘Disk Volumes’ section shows 0 for everything.

Added to that all the Application Monitors had errors –

Timeout

I searched Google for the error message “Connection timeout. Job canceled by Scheduler.” and found this Solarwinds KB article. Corrupt performance counters seemed to be a suspect. That KB article was a bit confusing to me in that it gives three resolutions and I wasn’t sure if I was to do all three or just pick and choose. :)

Event Logs on the target server did show corrupt performance counters.

Initial Errors

I tried to get the counters via PowerShell to double check and got an error as expected –

Broken Get-Counter

Ok, so a performance counter issue indeed. Since the Solarwinds KB article didn’t make much sense to me I searched for the Event ID 3001 as in the screenshot and came across a TechNet article. The solution seemed simple – open a command prompt as an admin and run the command lodctr /R. This command rebuilds the performance counters from scratch based on current registry settings and backup INI files (that’s what the help message says). The command completed without issue too.

lodctr - 1

With this the performance counters started working via PowerShell.

Working Get-Counter

Event Logs still had some error but those were to do with the performance counters of ASP.Net and Oracle etc.

More Errors

The fix for this seemed to be a bit more involved and requires rebooting the server. I decided to skip it for now as I don’t think these additional counters have much to do with Solarwinds. So I let those messages be and tried to see if Solarwinds was picking up the correct info. Initially I took the patient approach of waiting and making it poll again; then I got impatient and did things like removing the node from monitoring and adding it back (and then waiting again for Solarwinds to poll it), but eventually it began working. Solarwinds now sees the disk space correctly and all the Application Monitors work without any errors too.

Here’s what I am guessing happened (based on that Solarwinds KB article I linked to above). The performance counters of the server got corrupt. Solarwinds uses counters to get the disk info etc. Due to this corruption the poller spent more time than usual when fetching info from the server. This resulted in the Application Monitor components not getting a chance to run as the poller had run out of time to poll the server. Thus the Application Monitors gave the timeout errors above. In reality the timeout was not from those components, it was from the corrupt performance counters.

Exchange DAG fails. Information Store service fails with error 2147221213.

Had an interesting issue at work today. When our Exchange servers (which are in a 2 node DAG) rebooted after patch weekend one of them had trouble starting the Information Store service. The System log had entries such as these (event ID 7024) –

The Microsoft Exchange Information Store service terminated with service-specific error %%-2147221213.

The Application log had entries such as these (event ID 5003) –

Unable to initialize the Information Store service because the clocks on the client and server are skewed. This may be caused by a time change either on the client or on the server, and may require a restart of that computer. Verify that your domain is correctly configured and  is currently online.

So it looked like time synchronization was an issue. Which is odd coz all our servers should be correctly syncing time from the Domain Controllers.

Our Exchange team fixed the issue by forcing a time sync from the DC –
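The exact snippet didn’t survive, but forcing a resync from the domain hierarchy would have been along these lines (w32tm syntax):

```batch
w32tm /config /syncfromflags:domhier /update
net stop w32time && net start w32time
w32tm /resync
```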

I was curious as to why, so I went through the System logs in detail. What I saw was a sequence of entries such as these –

Notice how the time suddenly jumps ahead from 13:21 (when the OS starts) to 13:27, then jumps back to 13:22 when the Windows Time service starts and begins syncing time from my DC. It looked like this jump of 6 mins was confusing the Exchange services (understandably so). But why was this happening?

I checked the time configuration of the server –
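The output that was here is gone; presumably it came from w32tm:

```batch
w32tm /query /configuration
w32tm /query /status
```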

Seems to be normal. It was set to pick time from the site DC via NTP (the first entry under TimeProviders) as well as from the ESXi host the VM is running on (the second entry – VM IC Time Provider). I didn’t think much of the second entry because I know all our VMs have the VMware Tools option to sync time from the host to VM unchecked (and I double checked it anyways).

Only one of the mailbox servers was having this jump though. The other mailbox server had a slight jump but not enough to cause any issues. While the problem server had a jump of 6 mins, the ok server had a jump of a few seconds.

I thought to check the ESXi hosts of both VMs anyways. Yes, they are not set to sync time from the host, but let’s double check the host times anyways. And bingo! turns out the ESXi hosts have NTP turned off and hence varying times. The host with the problem server was about 6 mins ahead in terms of time from the DC, while the host with the ok server was about a minute or less ahead – too coincidental to match the time jumps of the VMs!

So it looked like the Exchange servers were syncing time from the ESXi hosts even though I thought they were not supposed to. I read a bit more about this and realized my understanding of host-VM time sync was wrong (at least with VMware). When you tick/ untick the option to synchronize VM time with the ESX host, all you are controlling is a periodic synchronization from host to VM. This does not control other scenarios where a VM could synchronize time with the host – such as when it moves to a different host via vMotion, has a snapshot taken, is restored from a snapshot, has its disk shrunk, or (tada!) when the VMware Tools service is restarted (as happens when the VM is rebooted, which was the case here). Interesting.

So that explains what was happening here. When the problem server was rebooted it synced time with the ESXi host, which was 6 mins ahead of the domain time. This was before the Windows Time service kicked in. Once the Windows Time service started, it noticed the incorrect time and set it right. This time jump confused Exchange – I am thinking it didn’t confuse Exchange directly, rather one of the AD services running on the server, and due to this the Information Store was unable to start.

The fix for this is to either disable VMs from synchronizing time from the ESXi host or setup NTP on all the ESXi hosts so they have the correct time going forward. I decided to go ahead with the latter.

Update: Found this and this blog post. They have more screenshots and a better explanation, so worth checking out. :)

Using SolarWinds to highlight servers in a pending reboot status

Had a request to use SolarWinds to highlight servers in a pending reboot status. Here’s what I did.

Sorry, this is currently broken. After implementing this I realized I need to enable PowerShell remoting on all servers for it to work, else the script just returns the result from the SolarWinds server. Will update this post after I fix it at my workplace. If you come across this post before that, all you need to do is enable PowerShell remoting across all your servers and change the script execution to “Remote Host”.

SolarWinds has a built in application monitor called “Windows Update Monitoring”. It does a lot more than what I want so I disabled all the components I am not interested in. (I could have also just created a new application monitor, I know, just was lazy).

winupdatemon-1

The part I am interested in is the PowerShell Monitor component. By default it checks for the reboot required status by checking a registry key: HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update\RebootRequired. Here’s the default script –

Inspired by this blog post which monitors three more registry keys and also queries ConfigMgr, I replaced the default PowerShell script with the following –
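The script itself didn’t survive the post. A sketch along the lines of that blog post would look like this – the Statistic/Message lines are the output format SolarWinds SAM PowerShell monitors expect, and the particular registry keys and the ConfigMgr WMI check are my reconstruction:

```powershell
$pending = $false

# Component Based Servicing
if (Test-Path 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\RebootPending') { $pending = $true }

# Windows Update
if (Test-Path 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update\RebootRequired') { $pending = $true }

# Pending file rename operations
if (Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\Session Manager' -Name PendingFileRenameOperations -ErrorAction SilentlyContinue) { $pending = $true }

# ConfigMgr client, if present
try {
    $ccm = Invoke-WmiMethod -Namespace 'ROOT\ccm\ClientSDK' -Class CCM_ClientUtilities -Name DetermineIfRebootPending -ErrorAction Stop
    if ($ccm.RebootPending -or $ccm.IsHardRebootPending) { $pending = $true }
} catch { }

# Output in the format the SolarWinds PowerShell monitor expects
if ($pending) {
    Write-Host 'Message: Reboot required'
    Write-Host 'Statistic: 1'
} else {
    Write-Host 'Message: No reboot required'
    Write-Host 'Statistic: 0'
}
```

With this the component can alert on the statistic value (or be marked down) whenever a reboot is pending.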

Then I added the application monitor to all my Windows servers. The result is that I can see the following information on every node –

winupdatemon-2

Following this I created alerts to send me an email whenever the status of the above component (“Machine restart status …”) went down for any node. And I also created a SolarWinds report to capture all nodes for which the above component was down.

winupdatemon-3

Then I assigned this to a schedule to run once in a month after our patching window to email me a list of nodes that require reboots.

 

Solarwinds AppInsight for IIS – doing a manual install – and hopefully fixing invalid signature (error code: 16007)

AppInsight from Solarwinds is pretty cool. At least the one for Exchange is. Trying out the one for IIS now. Got it configured on a few of our servers easily but it failed on one. Got the following error –

appinsight-error

Bummer!

Manual install it is then. (Or maybe not! Read on and you’ll see a hopeful fix that worked for me).

First step in that is to install PowerShell (easy) and the IIS PowerShell snap-in. The latter can be downloaded from here. This downloads the Web Platform Installer (a.k.a. “webpi” for short) and that connects to the Internet to download the goods. In theory it should be easy; in practice the server doesn’t have connectivity to the Internet except via a proxy, so I have to feed it that information first. Go to C:\Program Files\Microsoft\Web Platform Installer for that, find a file called WebPlatformInstaller.exe.config, open it in Notepad or similar, and add the following lines to it –
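The lines in question follow the standard .NET proxy configuration schema – something like this (the proxy address is a placeholder for your own):

```xml
<system.net>
  <defaultProxy>
    <proxy usesystemdefault="false"
           proxyaddress="http://myproxy.mydomain.local:8080"
           bypassonlocal="true" />
  </defaultProxy>
</system.net>
```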

This should be within the <configuration> -- </configuration> block. Didn’t help though, same error.

webpi-error

Time to look at the logs. Go to %localappdata%\Microsoft\Web Platform Installer\logs\webpi for those.

From the logs it looked like the connection was going through –

But the problem was this –

If I go to the link – https://www.microsoft.com/web/webpi/5.0/webproductlist.xml – via IE on that server I get the following –

untrusted-cert

 

However, when I visit the same link on a different server there’s no error.

Interesting. I viewed the untrusted certificate from IE on the problem server and compared it with the certificate from the non-problem server.

Certificate on the problem server

Certificate on a non-problem server

Comparing the two I can see that the non-problem server has a VeriSign certificate in the root of the path, because of which there’s a chain of trust.

verisign - g5

If I open Certificate Manager on both servers (open mmc > Add/ Remove Snap-Ins > Certificates > Add > Computer account) and navigate to the “Trusted Root Certification Authorities” store, I can see that the problem server doesn’t have the VeriSign certificate in its store while the other server does.

cert manager - g5

So here’s what I did. :) I exported the certificate from the server that had it and imported it into the “Trusted Root Certification Authorities” store of the problem server. Then I closed and opened IE and went to the link again, and bingo! the website opens without any issues. Then I tried the Web Platform Installer again and this time it loads. Bam!

The problem though is that it can’t find the IIS PowerShell snap-in. Grr!

no snap-in

no snap-in 2

That sucks!

However, at this point I had an idea. The SolarWinds error message was about an invalid signature, and what do we know of that can cause an invalid signature? Certificate issues! So now that I have installed the required CA certificate for the Web Platform Installer, maybe it sorts out SolarWinds too? So I went back and clicked “Configure Server” again and bingo! it worked this time. :)

Hope this helps someone.

Solarwinds – “The WinRM client cannot process the request”

Added the Exchange 2010 Database Availability Group application monitor to a couple of our Exchange 2010 servers and got the following error –

error1

Clicking “More” gives the following –

error2

This is because Solarwinds is trying to run a PowerShell script on the remote server and the script is unable to run due to authentication errors. That’s because Solarwinds is trying to connect to the server using its IP address, and so instead of using Kerberos authentication it resorts to Negotiate authentication (which is disabled). The error message says as much, but you can verify it for yourself from the Solarwinds server. Try the following command –

This is what’s happening behind the scenes and as you will see it fails. Now replace “Negotiate” with “Kerberos” and it succeeds –
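The commands that were here would have been something like the following (server name/ IP are placeholders):

```powershell
# Fails – Negotiate authentication is disabled on the target
Test-WSMan -ComputerName 10.50.0.10 -Authentication Negotiate -Credential (Get-Credential)

# Succeeds – Kerberos, which requires the server's name rather than its IP
Test-WSMan -ComputerName exchange01.mydomain.local -Authentication Kerberos -Credential (Get-Credential)
```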

So, how to fix this? Logon to the remote server and launch IIS Manager. It’s under “Administrative Tools” and may not be there by default (my server only had “Internet Information Services (IIS) 6.0 Manager”), in which case add it via Server Manager/ PowerShell –
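A sketch of the PowerShell route (feature name as on Server 2008 R2/ 2012):

```powershell
Import-Module ServerManager
Add-WindowsFeature Web-Mgmt-Console
```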

Then open IIS Manager, go to Sites > PowerShell and double click “Authentication”.

iis-1

Select “Windows Authentication” and click “Enable”.

iis-2

Now Solarwinds will work.

Using Solarwinds to monitor Windows Services

This is similar to how I monitored performance counters with Solarwinds.

I want to monitor a bunch of AppSense services.

Similar to the performance counters where I created an application monitor template so I could define the threshold values, here I need to create an application monitor so that the service appears in the alert manager. This part is easy (and similar to what I did for the performance counters) so I’ll skip and just put a screenshot like below –

applimonitor

I created a separate application monitor template but you could very well add it to one of the standard templates (if it’s a service that’s to be monitored on all nodes for instance).

Now for the part where you create alerts.

Initially I thought this would be a case of creating triggers when the above application monitor goes down. Something like this –

alert1

And create an alert message like this –

alert1a

With this I was hoping to get alert messages only for the services that actually went down. Instead, whenever any one service went down I’d get an alert for that service and also a message for the services that were up. I am guessing that since I was triggering on the application monitor, Solarwinds helpfully sent the status of each of its components – up or down.

alert2

The solution is to modify your trigger such that you target each component.

alert3

Now I get alerts the way I want.

Hope this helps!

Fixing a Windows Server that was stuck on “Preparing to configure Windows”

This is something that I fixed a few months ago at work but didn’t get a chance to blog about then. Because of the gap I might not post as verbosely about it as I usually do.

The situation was that we had a Windows Server 2012 R2 that was stuck on a “Preparing to configure Windows” loop. I didn’t take a screenshot of it but you can find an example of it for Windows 7 here. All the usual troubleshooting steps like automatic repairs and last known good configuration etc didn’t make a dent. The problem began after the server was rebooted following Windows Updates so I focused on that. Here’s what I did to fix the server:

  1. I rebooted the server and pressed F8 before the Windows logo screen.
  2. This got me to the recovery options screen, where I selected the option to get a command prompt.
  3. Next I found the drive letter that corresponds to the C: drive of the server. This was through trial and error: type each drive letter and find the one with the Windows folder.
    1. In my case the drive letter was E: so keep that in mind while viewing the screenshots. Replace it with the drive letter you find.
  4. I entered the following command to get a list of all Windows updates, both installed and pending.
  5. The output was as below.
  6. I noted the names of the updates that were in a “Staged” state and uninstalled them one by one.
  7. Then I exited the command prompt. This rebooted the server and it was stuck at the following screen for a while.
  8. I stayed on this screen for about 15 mins. The server rebooted itself and stayed on the screen again, for about 10 mins. After this it proceeded to the login screen and I could login as usual. :)
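The commands in those screenshots would have been DISM offline-servicing ones. A sketch – the image drive and the package name are placeholders; use the drive letter you found and the “Staged” package names DISM lists:

```batch
rem List all packages in the offline image, with their state
dism /image:E:\ /get-packages

rem Remove a staged package (repeat for each staged update)
dism /image:E:\ /remove-package /packagename:Package_for_KB1234567~31bf3856ad364e35~amd64~~6.3.1.0
```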

Find out which DCs in your domain have the DHCP service enabled

Use PowerShell –

Result is a table of DC names and the status of the “DHCP Server” service. If the service isn’t installed (i.e. the feature isn’t enabled) you get a blank.
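The snippet is missing here; a sketch of how it could be done (requires the ActiveDirectory module; the DHCP Server service name is DHCPServer, and -ErrorAction SilentlyContinue gives the blank when the service isn’t installed):

```powershell
Import-Module ActiveDirectory
Get-ADDomainController -Filter * |
    Select-Object Name,
        @{ n = 'DHCP Server'; e = { (Get-Service -ComputerName $_.HostName -Name DHCPServer -ErrorAction SilentlyContinue).Status } } |
    Format-Table -AutoSize
```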

Quickly get the last boot up time of a remote Windows machine

PowerShell:

Command Prompt/ WMI:

Double quotes are important for the WMI method.
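The snippets themselves are missing; they would have been along these lines (the computer name is a placeholder):

```powershell
# PowerShell method
Get-WmiObject Win32_OperatingSystem -ComputerName server01 |
    Select-Object CSName, @{ n = 'LastBootUpTime'; e = { $_.ConvertToDateTime($_.LastBootUpTime) } }
```

And the command prompt/ WMI method would be `wmic /node:"server01" os get lastbootuptime` – note the double quotes around the node name.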

WMI Access Denied for remote machine etc

This isn’t going to be a coherent post really (unlike my usual posts which are more coherent, I hope!). I came across a bunch of new stuff as I was troubleshooting this WMI issue and thought I should put them all somewhere.

The issue is that we are trying to get Solarwinds to monitor one of our DMZ servers via WMI but it keeps failing.

solarwinds

Other servers in the DMZ work, it’s just this one that fails. WMI ports aren’t fixed like I had mentioned earlier but I don’t think that matters coz they aren’t fixed for the other servers either. Firewall configuration is the same between both servers.

I thought of running Microsoft Network Monitor but realized that it’s been replaced with Microsoft Message Analyzer. That looks nice and seems to do a lot more than just network monitoring – I must explore it sometime. For now I ran it on the DMZ server and applied a filter to see traffic from our Solarwinds server to it. The results showed that access was being denied, so it may not be a port issue after all.

analyzer message

Reading up more on this pointed me to a couple of suggestions. None of them helped but I’d like to mention them here for future reference.

First up is the command wbemtest. Run this from the local machine (the DMZ server in my case), click “Connect”, and then “Connect” again.

wbemtest1

If all is well with WMI you should get no error.

wbemtest2

Now try the same from the Solarwinds server, but this time try connecting to the DMZ server and enter credentials if any.

wbemtest3

That worked with the local administrator account but failed with the account I was using from Solarwinds to monitor the server. Error 0x80070005.

wbemtest4

So now I know the issue is one of permissions and not ports.

Another tool that can be used is WmiMgmt. Just type wmimgmt.msc somewhere to launch it. I tried this on the DMZ machine to confirm WMI works fine. I also tried it from SolarWinds machine to the DMZ machine. (Right click to connect to a remote computer).

wmimgmt

The problem with WmiMgmt is that, unlike wbemtest, you can’t specify a different account to use. If the two machines are domain joined or have the same local accounts, then it’s fine – you can run as from one machine and connect to the other – but otherwise there’s nothing much you can do. WmiMgmt is good to double check the permissions though. Click “Properties” in the screenshot above, go to the “Security” tab, select the “Root” namespace and click the “Security” button. The resulting window should show the permissions.

wmimgmt2

In my case the Administrators group members had full permissions as expected. The account I was using from the Solarwinds server was a member of this group too yet had access denied.

Another place to look at is Component Services > Computers > My Computer > DCOM Config > “Windows Management and Instrumentation” – right click and “Properties”.

componentservices

Make sure “Authentication Level” is “Default”. Then go to the “Security” tab and make sure the account/ group you want has permissions.

componentservices2

 

Also right click on “My Computer” and go to “Properties”.

componentservices3

Under the “COM Security” tab go to the “Edit Limits” of both access & launch and activation permissions and ensure the permissions are correct. My understanding is that the limits you specify here over-ride everything else.

componentservices4

In my case none of the above helped as they were all identical between both servers and at the correct setting. What finally helped was this Serverfault post.

Being DMZ servers these were on a workgroup and the Solarwinds server was connecting via a local account. Turns out that:

In a workgroup, the account connecting to the remote computer is a local user on that computer. Even if the account is in the Administrators group, UAC filtering means that a script runs as a standard user.

That is a good link to refer to. It is about Remote UAC and WMI. The solution is to go to the following registry key – HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System – and create a new name called LocalAccountTokenFilterPolicy (of type DWORD_32) with a value of 1. (Check the Serverfault post for one more solution if you are using a Server 2012).
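As a one-liner, that registry change is:

```batch
reg add HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System /v LocalAccountTokenFilterPolicy /t REG_DWORD /d 1 /f
```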

Remote UAC. Ah! Am surprised I didn’t think of that in the beginning itself. If LocalAccountTokenFilterPolicy has a value of 0 (the default) then whenever a member of the Administrators group connects remotely, the security tokens of that account are filtered to remove admin access. This is only for local admin accounts, not domain admin accounts – mind you. If LocalAccountTokenFilterPolicy has a value of 1, then no filtering happens and the account connects with full admin rights. I have encountered Remote UAC in the past when accessing admin shares (e.g. \\computer\drive$) in my home workgroup; never thought it would be playing a role here with WMI!

Hope this helps someone. :)

Notes on WMI ports & monitoring

Trying to set up monitoring for some of our Windows DMZ servers via SolarWinds and came across a few interesting links. At the same time I noticed that my carefully organized bookmarks folders seem to be corrupt. Many folders are empty. This happened a few days ago too, but that time it was just one folder (well, one folder that I knew of – could be more, who knows) and so I was able to view an older copy of my bookmarks via Xmarks and add the missing entries back.

But this time it’s a whole bunch of folders, and the only option Xmarks has is to either export the older copy or overwrite your current copy with this older set. I don’t want the latter as that would mean losing all my newer bookmarks. Wish there was some way of merging the current and older copies! Anyhow, what’s happened has happened; I think I’ll stick to using this blog for bookmarks. I keep referring to this blog over my bookmarks anyway, so this is a sign to stop with the unnecessary filing.

To start off, this is a must read on WMI ports and how to allow firewall exceptions for WMI. Gist of the matter is that WMI uses dynamic ports via the RPC Portmapper. When the Solarwinds server (for example) wants to talk to WMI on a target server, it contacts the RPC Portmapper service on the target server on port 135 (which is the standard port for the Portmapper service) and gets a dynamic port to use for WMI. This port can be anywhere between 1024 – 65535.

The fix for this is to give the Portmapper service a specific set of ports to use. One method is to use the registry (see the previous link or this KB article). Add a key called Internet under HKEY_LOCAL_MACHINE\Software\Microsoft\Rpc. To this add values Ports (MULTI_SZ), PortsInternetAvailable (REG_SZ), and UseInternetPorts (REG_SZ). Set a value of Y for the latter two, and a range like 5000-5100 for the former. Restart the server after this.
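The same registry changes as reg commands (the 5000-5100 range is just the example range; pick your own):

```batch
reg add HKLM\Software\Microsoft\Rpc\Internet /v Ports /t REG_MULTI_SZ /d 5000-5100 /f
reg add HKLM\Software\Microsoft\Rpc\Internet /v PortsInternetAvailable /t REG_SZ /d Y /f
reg add HKLM\Software\Microsoft\Rpc\Internet /v UseInternetPorts /t REG_SZ /d Y /f
```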

Although I haven’t tried it, I think a similar effect to the above can be achieved via Component Services (type dcomcnfg.exe in a command prompt). Expand the “Computers” folder here, right click on “My Computer”, go to “Default Protocols”, click “Properties” for “Connection-oriented TCP/IP”, and add a port range.

dcomcnfg

Another method is to use Group Policies.

Yet another method seems to be to get WMI to not use the RPC Portmapper for dynamic ports. By default WMI runs as a shared service, which is why it uses the RPC Portmapper. It is possible to make it run as a standalone service so it doesn’t use the Portmapper and instead defaults to port 24158. (This port number too can be changed via dcomcnfg.exe but I am not sure how).


These two links didn’t make much sense to me, but I know they are of use so I am linking them here as a reference for myself later: