Contact

Subscribe via Email

Subscribe via RSS/JSON

Categories

Creative Commons Attribution 4.0 International License
© Rakhesh Sasidharan

Elsewhere

Enable & Disable SSH on ESXi host via PowerCLI

I alluded to this in another post but couldn’t find it when I was searching my posts for the cmdlet. So here’s a separate post.

The Get-VMHostService is your friend when dealing with services on ESXi hosts. You can use it to view the services thus:

To start and stop services we use the Start-VMHostService and Stop-VMHostService but these take (an array of) HostService objects.  HostService objects are what we get from the Get-VMHostService cmdlet above. Here’s how you stop the SSH & ESXi Shell services for instance:

Since the cmdlet takes an array, you can give it HostService objects of multiple hosts. Here’s how I start SSH & ESXi Shell for all hosts:

As an aside here’s a nice post on six different ways to enable SSH on a host. Good one!

DFS Namespaces missing on some servers / sites

This is something I had sorted out for one of our offices earlier. Came across it for another office today and thought I should make a blog post of it. I don’t remember if I made a a blog post the last time I fixed it (and although it’s far easier to just search my blog and then decide whether to make a blog post or not, I am lazy that way :)).

Here’s the problem. We use AppV to stream some applications to our users. One of our offices started complaining that AppV was no longer working for them. Here’s the error they got:

cderror

I logged on to the user machine to find out where the streaming server is:

Checked the Event Logs on that server and there were errors from the “Application Virtualization” source along these lines:

So I checked the DFS path and it was empty. In fact, there was no folder called “Content” under “\\domain.com\dfs” – which was odd!

I switched DFS servers to that of another location and all the folders appeared. So the problem definitely was with the DFS server of this site.

From the DFS Management tool I found the namespace server for this site (it was the DC) and the folder target (it was one of the data servers). Checked the folder target share and that was accessible, so no issues there. It was looking to be a DFS Namespace issue.

Hopped on to the DFS and checked “C:\DFSRoots\DFS”. It didn’t have the “Content” folder above – so definitely a DFS Namespace issue!

I ran dfsdiag and it gave no errors:

So I checked the Event Logs on the DC. No errors there. Next I restarted the “DFS Namespace” service and voila! all the namespaces appeared correctly.

Ok, so that fixed the problem but why did it happen in the first place? And what’s there to ensure it doesn’t happen again? The site was restarted yesterday as part of some upgrade work so did that cause the namespaces to fail?

I checked the timestamps of the DFS Namespace entries (these are from source “DfsSvc” in “Custom Views” > “Server Roles” > “File Server”). Once the namespaces were ready there was an entry along the following lines at 08:06:43 (when the DC came back up from the reboot):

No errors there. But where does the DC get its name spaces from? This was a domain DFS so it would be getting it from Active Directory. So let’s look at the AD logs under “Applications and Services Logs” > “Directory Service”. When the domain services are up and running there’s an entry from the “ActiveDirectory_DomainService” source along these lines:

On this server the entry was made at 08:06:47. Notice the timestamp – it occurs after the “DFS Namespace” service has finished building namespace! And therein lies the problem. Because the Directory Services took a while to be ready, when DFS was being initialized it went ahead with stale data. When I restarted the service later, it picked up the correct data and sorted the namespaces.

I rebooted the server a couple more times but couldn’t replicate the problem as the Directory Service always initialized before DFS Namespace. To be on the safe side though I set the startup type of “DFS Namespace” to be “Automatic (Delayed)” so it always starts after the “Directory Service”.

 

DNS zone and domain

Once upon a time I used to play with DNS zones & domains for breakfast, but it’s been a while and I find myself to be a bit rusty.

Anyways, something I realized / remembered today – a DNS domain is not equal to a DNS zone. When creating a DNS domain under Windows, using the GUI, it is easy to equate the domain to the zone; but if you come from a *nix background then you know a zone is the zone file whereas domains are different from that.

For example here’s a domain called “domain.com” and its sub-domain “sub.domain.com”.

domainYou would think there wouldn’t be much difference between the two but the fact is that “domain.com” is also the zone here and “sub.domain.com” is a part of that zone. The domain “sub.domain.com” is not independent of the main domain “domain.com”. It can’t have its own name servers. And when it comes to zone transfers “sub.domain.com” follows whatever is set for “domain.com”. You can’t, for instance, have “domain.com” be denied zone transfers while allowing zone transfers for “sub.domain.com” – it’s simply not possible, and if you think about it that makes sense too because after all “sub.domain.com” doesn’t have its own name servers.

In this case the zone “domain.com” consists of both the domain “domain.com” and its sub-domain “sub.domain.com”.

In contrast below is an example where there are two zones, one for “domain.com” and another for “sub.domain.com”. Both domain and sub-domain have their own zones (and hence name servers) in this case.

subdomainWhen creating a new domain / zone the GUI makes this clear too but it’s easy to miss out the distinction.

New domain

New domain

New Zone

New Zone

Stub zones

We use stub zones at work and initially I had a domain “sub.domain.com” which I wanted to create a a stub zone on another server. That failed with an error that the zone transfer failed.

transferInitially I took this to mean the stub zone was failing because the zone wasn’t getting transferred from the main server. That was correct – sort of.  Because “sub.domain.com” isn’t a zone of its own, it doesn’t have any name servers. And the way stub zones work is that the stub server contacts the name servers of “sub.domain.com” to get a list of name servers for the stub zone but that fails because “sub.domain.com” doesn’t have any name servers! It is not a zone, and only zones have name servers, not (sub-)domains.

So the error message was misleading. Yes, the zone transfer failed, but that’s not because the transfer failed but because there were no servers with the “sub.domain.com” zone. What I have to do is convert “sub.domain.com” to a zone of its own. (Create a zone called “sub.domain.com”, create new records in that zone, then delete the “sub.domain.com” domain).

Worth noting: Stub zones don’t need zone transfers allowed. Stub zones work via the stub server contacting the name servers of the stub zone and asking for a list of NS, A, and SOA records. These are available without any zone transfer required.

In our case we wanted to create a stub host record. We had an A record “host.sub.domain.com” and wanted to create a stub to that from another server. The solution is very simple – create a new zone called “host.sub.domain.com”, create a blank A record in that with the IP address you want (same IP that was in the “host.sub.domain.com” A record), then delete the previous “host.sub.domain.com” A record. 

Now create a stub zone for that record:stubzoneAnd that’s it.

Just to recap: zones contain domains. A domain can be spread (as sub-domains) among multiple zones. For zone transfers and stub zones you need the domain in question to be in a zone of its own. 

Run HP USB Key Utility from Windows 7

The HP USB Key Utility can be used to write an SPP ISO to a bootable USB drive. If you run it on a Windows Desktop though it refuses to run because it requires a Windows Server OS.

To work around this extract the files instead of installing it, and run the hpusbkey.exe file from the extracted files. That will work.

Also worth keeping in mind is the fact that the latest version of this utility – version 2.0.0.0 – is 64 bit only. You only need to use this version if you want to write the latest (2015.04 as of this post) SPP to a USB. That’s because the ISO for this version is greater than 4GB and older versions of the utility don’t support it. If you are on 32 bit Windows 7, be sure to use an older version such as 1.8.0.0 or 1.7.0.0.

Hope this helps someone!

Misc ESXI/ vSphere stuff

Just some notes to myself so I can refer to this later.

  • You can only have a maximum of 256 VMFS datastores per ESXI host. (This is one reason why you wouldn’t want to create a LUN/ datastore per VM. Wouldn’t work if you have a lot of VMs!)
    • Other maximums (for vSphere 5.5) can be found at this link.
  • When you create distributed switch port group there are 3 port binding options:
    • Static Binding (the default): VM NICs are connected to the port group at VM creation and remain so until the VM is removed from the port group. Power off a VM or disconnecting the NIC from the port group does not remove it from the port group – the port is still kept aside for the VM. What this means is that once you connect a VM to a port it stays with that forever.
      • Since the ports are assigned at VM creation, even if vCenter is down when the VM later powers on/ connects to the port group, it will continue to have network connectivity. (Note the emphasis on “later”. If the VM were already running and vCenter were to go down network traffic isn’t affected in either of the binding options).
    • Dynamic Binding (deprecated): VM NICs are connected to the port group only when the VM is powered on and its NIC connected to the port group. Power off the VM or disconnect the NIC and it is not longer connected to the same port when it comes back on or is reconnected.
      • Since the port binding happens only when the VM is powered on or connected, and the port group resides with vCenter, what this means is that you can only power on / off such VMs via vCenter. If vCenter is off / unreachable when the VM powers on / connects, it will not have network connectivity as it won’t have a port in the port group. (As above, note that this doesn’t affect VMs that are already running).
      • Dynamic Binding is deprecated but is useful when the number of VMs is larger than the number of ports in the port group and not all VMs will be on / connected at the same time.
    • Ephemeral Binding: Similar to Distributed Binding, VM NICs are connected to the port group only when the VM is powered on and its NIC connected to the port group. Powering off the VM or disconnecting it results in the port being removed from the port group. 
      • Although Dynamic and Ephemeral Bindings seem similar, they don’t have similar limitations. Thus while VMs with Dynamic Binding port groups won’t have network connectivity if they are powered on / connected when vCenter is off / unreachable, VMs with Ephemeral Binding have no such limitation. They don’t get a proper port number from the port group, but get a temporary one like h-1 which changes to a proper port number whenever connectivity with vCenter is restored.
      • Below screenshot shows the port numbers of three VMs, each connected to a port group of different binding (Ephemeral, Dynamic, Standard from top to bottom) and powered on when the vCenter was unreachable. Bindings
      • Although the NIC is unable to get a port – like Dynamic Binding – with an Ephemeral Binding port group the host creates a fake port and connects the VM anyway. 
      • I don’t understand why Dynamic Binding even exists as an option – unless it’s for backward compatibility? Ephemeral Binding seems to have the advantage of Dynamic Binding – ports are created at VM connection / powering on and so you can oversubscribe to a port group – but doesn’t have the disadvantage of lost connectivity when vCenter is off / unreachable. (I assume Ephemeral port groups too can be used for over subscribing, though the official KB articles don’t say anything like this so I could be wrong).
      • Dynamically creating / removing ports from the port group is an expensive operation so Dynamic and Ephemeral Binding port groups have a performance overhead. Static Binding is the preferred one.
      • Also, Ephemeral Binding port groups lose their history and security controls across host reboots. Apparently Dynamic Binding port groups don’t do this as I don’t see any mention of this as a Dynamic Binding limitation anywhere.

That’s all for now!

 

Get ESXi host network info using PowerShell/ PowerCLI

Not an exhaustive post, I am still exploring this stuff.

To get a list of network adapters on a host:

To get a list of virtual switches on a host, with the NICs assigned to these:

To get a list of port groups on a host:

To get a list of port groups , the virtual switches they are mapped to, and the NICs that make up these switches:

This essentially combines the first and third cmdlets above.

More later!

ESXi 4.0 “unsupported” mode

At work we still use some ESXi 4.0 hosts so this one’s a reminder to myself as ESXi Shell access works slightly different with that one.

On ESXi 4.0 once we are on the DCUI screen, pressing Alt+F1 gives access to a different (hidden) console. Whatever you type here seems to have no effect, but if you type the word unsupported and press Enter, you will be prompted to enter the root password and enter the Tech Support Mode (TSM). For screenshots and such of this check out this blog post.

On ESXi 4.1 and above you can enable this via the DCUI. See this KB article for the deets.

On that note here’s a good blog post detailing various ways of enabling SSH access on an ESXi host. Informative.

Number of IPv4 routes did not match

Was creating / migrating some ESXi hosts during the week and came across the above error “Number of IPv4 routes did not match” when checking for host profile compliance of one of the hosts. All network settings of this host appeared to be same as the rest so I was stumped as to what could be wrong. Via a VMware KB article I came across the esxcfg-route command that helped identify the problem. To run this command SSH into the host:

By default the command only outputs the default gateway but you can pass it the -l switch to list all routes:

In my case the above output was from one of the hosts, while the following was from the non-compliant host:

Notice the vmk2 interface has the wrong network. Not sure how that happened. Oddly the GUI didn’t show this incorrect network but obviously something was corrupt somewhere.

To fix that I thought I’ll remove the vmk2 interface and re-add it. Big mistake! Possibly because its network was same as that of the management network (10.50.0.0/24) removing this interface caused the host to lose connectivity from vCenter. I could ping it but couldn’t connect to it via SSH, vSphere Client, or vCenter. Finally I had to reset the network via the DCUI – it’s under “Network Restore Options”. I tried “Restore vDS” first, didn’t help, so did a “Restore Standard Switch”. This is a very useful – it creates a new standard switch and moves the Management Network onto that so you get connectivity to the host. This way I was able to reconnect to the host, but now I stumbled upon a new problem.

The host didn’t have the vmk2 interface any more but when I tried to recreate it I got an error that the interface already exists. But no, it does not – the GUI has no trace of it! Some forum posts suggested restarting the vCenter service as that clears its cache and puts it in sync with the hosts but that didn’t help either. Then I came across this post which showed me that it is possible for the host to still have the VMkernel port but vCenter to not know of it. For this the esxcli command is your friend. To list all VMkernel ports on a host do the following:

After that, removing the VMkernel interface can be done by a variant of same command:

Now I could add the re-add the interface via vSphere and get the hosts into compliance.

Before I conclude this post though, a few notes on the commands above.

If you have PowerCLI installed you can run all the esxcli commands via the Get-EsxCli cmdlet. For example:

If I wanted to remove the interface via PowerCLI the command would be slightly different:

I would have written more on the esxcli command itself but this excellent blog post covers it all. It’s an all powerful command that can be used to manage many aspects of the ESXi host, even set it in maintenance mode!

Heck you can even use esxcli to upgrade from one ESXi version to another. It is also possible to run the esxcli command from a remote computer (Windows or Linux) by installing the vSphere CLI tools on that computer. Additionally, there’s also the vSphere Management Assistant (VMA) which is a virtual appliance that offers command line tools.

The esxcli is also useful if you want to kill a VM. For instance the following lists all running VMs on a host:

If that VM were stuck for some reason and cannot be stopped or restarted via vSphere it’s very useful to know the esxcli command can be used to kill the VM (has happened a couple of times to me in the past):

Regarding the type of killing you can do:

There are three types of VM kills that can be attempted: [soft, hard, force]. Users should always attempt ‘soft’ kills first, which will give the VMX process a chance to shutdown cleanly (like kill or kill -SIGTERM). If that does not work move to ‘hard’ kills which will shutdown the process immediately (like kill -9 or kill -SIGKILL). ‘force’ should be used as a last resort attempt to kill the VM. If all three fail then a reboot is required. (required)

Another command line option is vim-cmd which I stumbled upon from one of the links above. I haven’t used it much so as a reference to myself here’s a blog post explaining it in detail.

Lastly there’s also a bunch of esxcfg-* commands, one of whom we came across above.

I haven’t used these much. They seem to be present for compatibility reasons with ESXi 3.x and prior. Back then you had commands with a vicfg- prefix, now you have the same but with a esxcfg- prefix. For instance, esxcfg-vmknic is now replaced with esxcli network interface as we saw above.

That’s all for now!

Update: Thought I’d use this post to keep track of other useful commands.

To get IPv4 addresses details:

Replace with ipv6 if that’s what you want.

To set an IPv4 address:

To ping an address from the host:

Change keyboard layout:

Get current keyboard layout:

List available layouts:

Set a new layout:

Remotely enable SSH

The esxcli commands are cool but you need to enable SSH each time you want to connect to the host and run these (unless you install the CLI tools on your machine). If you have PowerCLI though you can enable SSH remotely.

To list the services:

To enable SSH and the ESXi shell: