I have yet to read it, but in case you didn’t know, there’s a book by HP on Virtual Connect. I haven’t used Virtual Connect at all, apart from briefly seeing it for the first time when my colleagues showed it to me last month. I now have to update the Virtual Connect firmware for our enclosures, so I am looking into how to do that. Here are some more documents I have yet to read; I am linking them here as bookmarks for myself:
- A PDF giving an overview of Virtual Connect
- A page with all the documentation HP has on Virtual Connect and related
- A page with many whitepapers and manuals on how Virtual Connect works
Virtual Connect firmware can be updated via HP SUM/ SPP. It can also be updated independently via the Virtual Connect Support Utility (VCSU).
- This PDF (which can be found via the second bullet point above) is very useful. It is a document outlining the steps involved in upgrading the Virtual Connect firmware. It’s from 2013 but I couldn’t find anything newer on HP’s website.
- The above PDF is also linked to from this excellent blog post that talks about how to upgrade the firmware without any downtime.
- VCSU can be downloaded from this page.
- Here’s a page with some of its more useful commands.
- Finally, this page has the latest version of the firmware. You can download the Windows version and extract the binary image of the firmware from it.
Upgrading the Virtual Connect firmware seems very straightforward. As I said you can do it via the SUM/ SPP too. Recommended order is to first upgrade the OA (easily done via SUM/ SPP – no reboot required); then the ROM, iLO, and any other firmware for the blades (again easily done via SUM/ SPP – ROM & iLO don’t require any reboot); and finally the VC. For me the big question was whether I can do the VC upgrade without any network impact.
The PDF I mentioned above (this one) is a must-read; the upgrade process is covered from page 10 onwards.
One thing to note is that before the upgrade, VCSU (which I suppose is what SUM/ SPP also use behind the scenes) takes a backup of the configuration and runs health checks. If the VCs don’t pass the health checks, the upgrade doesn’t happen. Each Ethernet module of the VC takes about 20 minutes to upgrade; each FC module takes about 5 minutes. An overview of the upgrade process can be found on page 11; in short, here’s what happens:
- Via SFTP the new firmware is copied in parallel to all modules.
- Firmware is upgraded on all modules in parallel. This can be thought of as the update phase.
- Then the firmware is activated. The default order for VC-Enet modules is odd-even, in which modules on the odd side of the enclosure are activated first, then those on the even side.
- It is also possible to do serial activation (one after the other), or parallel (everything at the same time), or manual.
- Post activation the module is rebooted.
- I am not very clear here but it seems the modules on the backup VC side of things (including the backup VC) get rebooted first.
- Then the modules on the primary VC side of things (except the primary VC) get rebooted.
- The VC Manager (VCM) fails over to the backup VC module, and then the primary VC module is rebooted.
- Post-reboot the VCM fails over back to the primary VC module (this is only for the Ethernet modules I think, not FC).
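To make the activation orders concrete, here is a minimal Python sketch of how the bays would be batched under each order. It is only an illustration of the description above, not VCSU's actual logic; the function name `activation_groups` is made up for this example, and it assumes that the "odd side" of the enclosure means the odd-numbered interconnect bays.

```python
from typing import List


def activation_groups(bays: List[int], order: str) -> List[List[int]]:
    """Return the batches of bays that would be activated together.

    Illustration only: models the orders as described in the post,
    assuming "odd side" means odd-numbered interconnect bays.
    """
    bays = sorted(bays)
    if order == "odd-even":
        odd = [b for b in bays if b % 2 == 1]
        even = [b for b in bays if b % 2 == 0]
        # Odd bays go first, then even bays; skip a side with no modules.
        return [group for group in (odd, even) if group]
    if order == "serial":
        # One module at a time.
        return [[b] for b in bays]
    if order == "parallel":
        # Everything at once.
        return [bays] if bays else []
    if order == "manual":
        # Nothing is activated automatically; the admin resets modules later.
        return []
    raise ValueError(f"unknown activation order: {order!r}")


for order in ("odd-even", "serial", "parallel", "manual"):
    print(order, activation_groups([1, 2, 3, 4], order))
```

With manual there are no automatic batches at all, which is why that order matters for the workaround discussed further down.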
Notice the bit about the reboots above? That’s when network connectivity can be lost. On page 12 the document talks about how network outages can be avoided via redundant configuration and NIC bonding, but on page 13 it clarifies that, because the reboot is a graceful one, there could be a 20 second network outage: the blade hardware (and the OS running on it) might not be notified that the VC module is down. You see, SmartLink and the DCC protocol are responsible for informing the blades that a VC module is down, and therefore that the NICs mapped to it are down, so that they fail over to another NIC using the backup VC. But because the firmware is being upgraded, SmartLink and DCC are unavailable and no one informs the blades. So it is only when the OS on a blade realizes that it has lost network connectivity and must take corrective action that it fails over to the backup NIC, leading to a potential 20 second outage.
(What I said above is also what this blog post mentions. To give credit, I came across the blog post first and found the guide through it.)
The workaround to the above outage is to set the activation order to manual and then reset the VC modules through the OA. Since that’s a reset, as opposed to a graceful reboot, the blades get a notification immediately that the module is down.
Here’s how I updated the VC firmware on my servers without any network outage. First I used VCSU (in update mode) to update the VC modules. Note that I selected “manual” as the activation order in two places below.

    C:\Program Files (x86)\Hewlett-Packard Company\Virtual Connect Support Utility>vcsu.exe
    -------------------------------------------------------------------------------
    HP BladeSystem c-Class Virtual Connect Support Utility
    Version 1.11.0 (Build 16)
    Build Date: Nov 18 2014 09:37:15
    Copyright (C) 2006-2014 Hewlett-Packard Development Company, L.P.
    All Rights Reserved
    -------------------------------------------------------------------------------
    Please enter action ("help" for list): update
    Please enter Onboard Administrator IP Address: 10.134.201.100
    Please enter Onboard Administrator Username: Administrator
    Please enter Onboard Administrator Password: ********
    Please enter firmware package location: vcfwall441.bin
    Please enter Configuration backup password (Optional):
    Please enter Force Update options if any (eg: version,health):
    Please enter VC-Enet module activation order if any (eg: parallel or odd-even or serial or manual. Default: odd-even): manual
    Please enter VC-FC module activation order if any (eg: parallel or odd-even or serial or manual. Default: serial): manual
    Please enter the time (in minutes) to wait between activating or rebooting VC-Enet modules (max 60 mins. Default: 0 mins): 5
    Please enter the time (in minutes) to wait between activating or rebooting VC-FC modules (max 60 mins. Default: 0 mins): 5

    The target configuration is integrated into a Virtual Connect Domain. Please enter the Virtual Connect Domain administrative user credentials to continue.
    User Name: Administrator
    Password: ********

I set a time of 5 mins to wait between activation of each VC module. That’s generally recommended.
After that I got the screens below – the whole process took about 40 minutes:

    (takes about 5 mins here)
    Gathering module and package information [Step 2 of 2]
    (takes another 5 mins here)

    The following modules will be updated:
    ===============================================================================
    Enclosure   Bay  Module           Current Version        New Version
    ===============================================================================
    GB8007KF5D  1    HP VC Flex-10    3.70                   4.41
                     Enet Module      2012-07-03T20:37:31Z   2015-03-05T20:13:31Z
    -------------------------------------------------------------------------------
    GB8007KF5D  2    HP VC Flex-10    3.70                   4.41
                     Enet Module      2012-07-03T20:37:31Z   2015-03-05T20:13:31Z
    -------------------------------------------------------------------------------

    The following modules will NOT be updated:
    ===========================================================================
    Enclosure   Bay  Module           Current Version   Status
    ===========================================================================
    GB8007KF5D  3    HP 8/24c SAN     Unknown           Application does
                     Switch Pwr Pk+                     not support the
                     BladeSystem                        hardware. Part #:
                     c-Class                            AJ822A
    ---------------------------------------------------------------------------
    GB8007KF5D  4    HP 8/24c SAN     Unknown           Application does
                     Switch Pwr Pk+                     not support the
                     BladeSystem                        hardware. Part #:
                     c-Class                            AJ822A
    ---------------------------------------------------------------------------

    During the update process, modules being updated will be temporarily
    unavailable. In addition, the update process should NOT be interrupted by
    removing or resetting modules, or by closing the application. Interrupting
    the update or the modules being updated may cause the modules to not be
    updated properly.

    Please verify the above report before continuing.
    Would you like to continue with this update? [YES/NO]: YES

    Starting firmware update process
    (about 30 mins of updating ...)

    The following modules were updated successfully:
    =======================================================================
    Enclosure   Bay  Module                      New Version
    =======================================================================
    GB8007KF5D  1    HP VC Flex-10 Enet Module   3.70 2012-07-03T20:37:31Z
    GB8007KF5D  2    HP VC Flex-10 Enet Module   3.70 2012-07-03T20:37:31Z

    Total execution time: 00:38:48
    VCSU log file is available at the following location:
    C:\Program Files (x86)\Hewlett-Packard Company\Virtual Connect Support Utility\vcsu-17844.log
    Press Return/Enter to exit...

That completes the update phase, but the new firmware isn’t active yet because I chose manual activation, so none of the modules has been rebooted. Because of that there’s no network downtime so far.
After that I logged into the OA, went over to the Interconnect Bays section > selected the first VC module > Virtual Buttons tab > and clicked Reset.
This resets the VC module. Again no network outage (I was continually pinging some of the hosts and the VMs; one of the VMs had 3 packet drops and that’s it, while the hosts I pinged had no drops). After the reset (which is instant on the UI) I waited some 5 mins, then checked the Information tab to see the firmware level. It was showing the new firmware.
After that I did the same (reset) for the second VC module. I waited 5-6 mins and then ran VCSU again (in healthcheck mode) to confirm the state of the modules. (To make the output smaller I used input switches to VCSU; I could have done the same above too.)
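The continuous pinging mentioned above can be scripted so that drops are counted per host while a module resets. What follows is only a rough sketch, not what I actually ran: the function names (`ping_once`, `count_drops`) are made up, it shells out to the operating system's own `ping`, and the timeout flags assume Windows or Linux semantics (other platforms interpret `-W` differently).

```python
import platform
import subprocess
import time
from typing import Callable, Dict, Iterable


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one echo request with the system ping; True if it exits with 0.

    Caveat: the exit code is only a rough signal (Windows ping can return 0
    for some "unreachable" replies), so treat the counts as indicative.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def count_drops(
    hosts: Iterable[str],
    rounds: int,
    probe: Callable[[str], bool] = ping_once,
    pause_s: float = 1.0,
) -> Dict[str, int]:
    """Probe every host once per round and count the failed probes."""
    drops = {host: 0 for host in hosts}
    for _ in range(rounds):
        for host in drops:
            if not probe(host):
                drops[host] += 1
        time.sleep(pause_s)
    return drops
```

Calling something like `count_drops(["esx-host-1", "test-vm-1"], rounds=300)` (hypothetical host names) just before clicking Reset would cover roughly the 5 minute wait. The `probe` parameter is there so the counting logic can be tried out without touching the network.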

    C:\Program Files (x86)\Hewlett-Packard Company\Virtual Connect Support Utility>vcsu -a healthcheck -i 10.134.201.100 -u Administrator -p * -vcu Administrator -vcp *
    -------------------------------------------------------------------------------
    HP BladeSystem c-Class Virtual Connect Support Utility
    Version 1.11.0 (Build 16)
    Build Date: Nov 18 2014 09:37:15
    Copyright (C) 2006-2014 Hewlett-Packard Development Company, L.P.
    All Rights Reserved
    -------------------------------------------------------------------------------
    Please enter Onboard Administrator Password: ********

    The target configuration is integrated into a Virtual Connect Domain. Please enter the Virtual Connect Domain administrative user credentials to continue.
    Password: ********

    -------------------------------------------------------------------------------
    Onboard Administrator Information
    -------------------------------------------------------------------------------
    Version : 4.40
    VLAN Tagging : Disabled

    -------------------------------------------------------------------------------
    Virtual Connect Domain Information
    -------------------------------------------------------------------------------
    Stacking Connection : OK
    Stacking Redundancy : OK

    -------------------------------------------------------------------------------
    Bay 1 : HP VC Flex-10 Enet Module
    -------------------------------------------------------------------------------
    Power : On
    Health : Ok
    IP Address : 10.134.201.108
    IP Connectivity : Passed
    Version : 4.41
    Mode : Primary
    Domain Configuration : In Sync
    Module Configuration : In Sync

    -------------------------------------------------------------------------------
    Bay 2 : HP VC Flex-10 Enet Module
    -------------------------------------------------------------------------------
    Power : On
    Health : Ok
    IP Address : 10.134.201.109
    IP Connectivity : Passed
    Version : 4.41
    Mode : Backup
    Domain Configuration : In Sync
    Module Configuration : In Sync

    -------------------------------------------------------------------------------
    Bay 3 : HP 8/24c SAN Switch Pwr Pk+ BladeSystem c-Class
    -------------------------------------------------------------------------------
    Power : On
    Health : Ok
    IP Connectivity : Unknown

    -------------------------------------------------------------------------------
    Bay 4 : HP 8/24c SAN Switch Pwr Pk+ BladeSystem c-Class
    -------------------------------------------------------------------------------
    Power : On
    Health : Ok
    IP Connectivity : Unknown

    Total execution time: 00:01:02
    VCSU log file is available at the following location:
    C:\Program Files (x86)\Hewlett-Packard Company\Virtual Connect Support Utility\vcsu-14520.log

As can be seen, the modules are in sync and both are on the latest firmware version. All done without any network outage! :)
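If you have several enclosures, eyeballing the healthcheck output gets tedious. Here is a small Python sketch that parses text shaped like the output above and checks that every module reporting a version is on the same version and in sync. The helper names are made up, the sample is trimmed from my own output, and I am assuming the `Key : Value` layout stays the same across VCSU versions, which I have not verified.

```python
import re
from typing import Dict

BAY_RE = re.compile(r"^Bay (\d+)\s*:\s*(.+)$")
FIELD_RE = re.compile(r"^([A-Za-z ]+?)\s*:\s*(.+)$")
# Only these per-bay fields are recorded; anything else is ignored.
KNOWN_FIELDS = {
    "Power", "Health", "IP Address", "IP Connectivity", "Version",
    "Mode", "Domain Configuration", "Module Configuration",
}


def parse_healthcheck(text: str) -> Dict[int, Dict[str, str]]:
    """Collect the per-bay fields from healthcheck-style text."""
    bays: Dict[int, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        bay = BAY_RE.match(line)
        if bay:
            current = {"Module": bay.group(2).strip()}
            bays[int(bay.group(1))] = current
            continue
        field = FIELD_RE.match(line)
        # Fields seen before the first "Bay N" header (e.g. the OA version)
        # are skipped because no bay is open yet.
        if field and current is not None:
            key = field.group(1).strip()
            if key in KNOWN_FIELDS:
                current[key] = field.group(2).strip()
    return bays


def vc_modules_consistent(bays: Dict[int, Dict[str, str]]) -> bool:
    """True if all bays that report a version share it and are in sync."""
    vc = [b for b in bays.values() if "Version" in b]
    versions = {b["Version"] for b in vc}
    in_sync = all(
        b.get("Domain Configuration") == "In Sync"
        and b.get("Module Configuration") == "In Sync"
        for b in vc
    )
    return bool(vc) and len(versions) == 1 and in_sync


SAMPLE = """\
Onboard Administrator Information
Version : 4.40
Bay 1 : HP VC Flex-10 Enet Module
Power : On
Version : 4.41
Mode : Primary
Domain Configuration : In Sync
Module Configuration : In Sync
Bay 2 : HP VC Flex-10 Enet Module
Power : On
Version : 4.41
Mode : Backup
Domain Configuration : In Sync
Module Configuration : In Sync
Bay 3 : HP 8/24c SAN Switch Pwr Pk+ BladeSystem c-Class
Power : On
IP Connectivity : Unknown
"""

print(vc_modules_consistent(parse_healthcheck(SAMPLE)))
```

Bays that report no version (the SAN switches in my case) are left out of the comparison rather than treated as failures.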
Update Jan 2016: Chris Lynch (from HPE) wrote to me three months ago clarifying some misinformation in my post above. Turns out you no longer have the 20 second outage and all that I wrote above is more or less incorrect. :) Rather than copy paste his email here I’ve printed it to a PDF and you can read it here – Chris Lynch update. Thanks Chris!