Category: vSphere 6.7

VMware vSphere 6.7 blog posts.

HPE ProLiant DL20 Gen9 Home Lab

This blog post is about replacing my current 24×7 Lab with a new set of two HPE ProLiant DL20 Gen9 servers. In this blog post, I am going to tell you about the configuration of the machines and how they are running on VMware ESXi. Also, I am going to compare them to my other lab hardware and my past home lab equipment.

Hardware

So let’s kick off with the hardware! The HPE DL20 Gen 9 servers I bought were both new in the box from eBay and I changed the hardware components to my own liking.

A couple of interesting points I learned so far nearly all servers that you will find for sale are provided with an Intel Xeon E3-12XX v5 processor. One item you need to take into account: yes you can swap the CPU from a v5 to a v6 like I did but you need to replace the memory modules also! The memory modules are compatible with a v5 or v6 processor but not both ways. The Intel Xeon E3-12XX v5 CPUs are using 2133 MHz memory and the Intel Xeon E3-12XX v6 CPUs are using 2400 MHz memory. So keep that in mind when swapping the processor and/or buying memory.

In the end, after some swapping of components, I ended up with the following configuration. Both ProLiant servers have an equal configuration (like it should be in a vSphere cluster):

ComponentItem
Vendor:HPE
Model:DL20 Gen9
CPU:Intel® Xeon® Processor E3-1230 v6
Memory:64GB DDR4 ECC (4 x 16GB UDIMM @2400MHz)
Storage:32GB SD card on the motherboard
Storage controller:All disabled
Network card(s):HPE Ethernet 1Gb 2-port 332i Network Adapter
Expansion card(s):HPE 361T Dual-Port 2x Gigabit-LAN PCIe x4
Rackmount kit:HPE 1U Short Friction Rail Kit

Power usage

So far I have measured the power usage of the machines individually with the listed configuration in the hardware section. When measuring the power usage the machine was running VMware ESXi and on top of about seven virtual machines that were using about 30% of the total compacity. I was quite amazed by the low power consumption of 31.7 watts per host but I have to take into account that this is only the compute part! The hosts are not responsible for storage. Here is a photo of my power meter when performing the test:

Screenshot(s)

Here are some screenshot(s) of the servers running in my home lab environment and running some virtual machine workload:

  • Screenshot 01: Is displaying one of the hosts running VMware ESXi 6.7 (screenshot from HPE iLO).
  • Screenshot 02: Is displaying one of the hosts connected to VMware vCenter and running virtual machines.
  • Screenshot 03: Is displaying one of the hosts HPE iLO web page.

Positives & Negatives

To sum up, my experience I have created a list of positives and negatives to give you some insight into the HPE ProLiant DL20 Gen9 as a home lab server.

Positives:

  • A lot of CPU power compared to my previous ESXi hosts, link to the previous setup.
  • Rack-mounted servers (half-size deep with sliding rails).
  • Out of band management by default (HPE iLO).
  • Power usage is good for the amount of compute power delivered.
  • No additional drivers are required for VMware ESXi to run.
  • The HPE DL20 Gen9 has been on the VMware HCL, link.

Negatives:

  • Noisy compared to my previous setup (HPE ProLiant ML10 Gen8). For comparison, the HPE ProLiant DL360 Gen8 is in most cases “quiet” compared to the HPE ProLiant DL20 Gen9.
  • Would be nice if there was support for more memory because you can never have enough of that in a virtualization environment ;).

Photos

Here are some photos of the physical hardware and the internals, I did not take any pictures of the hardware when the components were all installed. I am sorry :(.

  • Screenshot 01 – Is displaying both machines running and installed in the 19″ server rack.
  • Screenshot 02 – Is displaying the internals of the DL20 Gen9. Keep in mind this one is empty. As you can see in that picture the chassis is just half-size!

Wrap-up

So that concludes my blog post. If you got additional questions or remarks please respond in the comment section below. Thanks for reading my blog post and see you next time.

VMware vCenter SNMP Configuration

VMware vSphere 6.7 Logo

In this last blog of the year, we are going to set up the SNMP agent on VMware vCenter Server. This blog will cover the vCenter SNMP configuration and I will show some debugging examples to verify the working of the SNMP Agent. In my case, I am using Zabbix Server as the monitoring program to verify the status of my VMware vCenter Server in my lab environments. This reduces the amount of manual troubleshooting and ensures that services are running as expected.

The reason why I did this write-up was because of the lack of documentation from the vendor’s website. As you can see in the source pages below there is a limited set of commands and nearly no examples. To set up my environment I needed some additional commands to get everything working correctly.

Environment

The environment where I configured the SNMP agent was on a VMware vCenter Server 6.7 update 3 (VCSA /appliance). I am monitoring the VMware vCenter Server with a Zabbix Server that is running on CentOS 8. I am currently using SNMP v2 in this example because it is used by most people.

Keep in mind: SNMP v1 and v3 are also supported by both products. My recommendation is to use SNMP v3 of course because of the security improvements like authentication & encryption :).

SNMP

A quick explanation about SNMP (thanks Wikipedia): Simple Network Management Protocol (SNMP) is an Internet Standard protocol for collecting and organizing information about managed devices on IP networks and for modifying that information to change device behavior. Devices that typically support SNMP include cable modems, routers, switches, servers, workstations, printers, and more.

SNMP is widely used in network management for network monitoring. SNMP exposes management data in the form of variables on the managed systems organized in a management information base (MIB) which describes the system status and configuration. These variables can then be remotely queried (and, in some circumstances, manipulated) by managing applications.

Commands

Here are the commands I have used for the vCenter SNMP configuration. Note: make sure you have access to the root account to perform the login.

# Step 1: Start an SSH connection with the vCenter Server (use Putty or something equivalent).

# Step 2: Login as the root user

# Step 3: After a successful login you should be in the appliance Shell.

# Step 4: View the current configuration for SNMP
snmp.get

# Step 5: Configure the SNMP Community (in this example I use MySnmpCommunity)
snmp.set --communities MySnmpCommunity

# Step 6: Allow a device to access the SNMP agent (192.168.10.10 = monitoring server)
snmp.set --targets 192.168.10.10@161/MySnmpCommunity,172.0.0.1@161/MySnmpCommunity,localhost@161/MySnmpCommunity

# Step 7: Enable the SNMP Agent
snmp.enable

# Step 7: Verify the SNMP Settings configured
snmp.get

# Step 8: Test the working (in my case it never works... not sure why? Has something to do with my access restrictions?)
snmp.test

# Step 9: Perform a test from the monitoring server (in my case a Linux machine with snmpwalk)
snmpwalk -v2c -c MySnmpCommunity %hostname-vcenter%

Screenshots

Here are some screenshots related to the SNMP configuration:

Wrap-up

So that is it! Hopefully, this blog post was useful and this wraps-up 2020. See you next year and if you have any comments please respond below.

Sources

Here are some sources I used when configuring SNMP on VMware vCenter Server:

ProLiant ML10 v2 CPU Swap

In this blog post, I am talking about the HPE ProLiant ML10 v2 home lab servers that I have been using for the last three years. I had some performance issues related to the processor with the number of virtual machines and containers running on the little ML10 v2 servers. So it was time for a CPU Swap!

On the internet, there are a lot of speculations on which CPUs are supported in the HPE ProLiant ML10 v2. So that is why I did this blog post.

The servers were originally bought with Intel® Pentium® Processor G3240 CPUs. This was the smallest CPU available at the time. At first, I was looking at the Intel Xeon E3-1220 v3 CPUs but I decided to buy the Intel® Core™ i3-4170 Processor on Ebay.com for a couple of bucks. The choice was related to the pricing difference and the amount of power usage.

I can confirm that both HPE ProLiant ML10 v2 servers detected the i3-4170 CPUs without any issues. The systems are running 24×7 and the CPU temperature is around fourth to fifty degrees with the fans running on their lowest operating mode.

Comparison

As you already figured out the G3240 is a slow CPU compared to the i3-4170. So it was a well worth invested upgrade it for about 40 euro’s for both CPUs in total.

The hypervisor (VMware ESXi) and workload performance improved drastically. Because of the additional instruction sets like AES-NI and clock speed. So it was a good investment at least in my opinion.

Here is a comparison provided by the Intel ARK website. Click here for the link.



Screenshot(s)

Here are some screenshots of one of the HPE ML10 v2 server that was upgraded with the new CPU. As you can see the screenshots are from the HPE Integrated Lights-out or in short (iLO). The first screenshot is of the new CPU that is detected, the second one is the memory configuration and the third screenshot is the operating temperatures after running a couple of days with the workload.

Result

As you can see the Intel i3-4170 CPU is working without any issues in the ML10 v2 server. Currently, they have been running for about 100 days without any reboot. So I can confirm they are stable and do not overheat! The CPU swap is successful!

Notes:

  • I use stock cooling.
  • I do not use a modified BIOS.

Thanks for reading and see you next time!

VCSA 6.7 Out of Space (SEAT)

Today I was greeted by the following error message when logging into the VMware vCenter Server also known as VCSA: “Could not connect to one or more vCenter Server systems: https://%fqdn%:443/sdk“. So it was time for a quick write-up on how to resolve this issue.

The issues were already present a couple of hours earlier based on monitoring and logging. For example, Veeam Backup & Replication tried to perform a backup but failed because there were no vSphere Tags available. Veeam Backup & Replication generated the following message “Tag Backup SLA – Bronze is unavailable, VMs residing on it will be skipped from processing.“.

I’m running a VMware vCenter Server as in a VCSA 6.7 appliance and it has an embedded Platform Services Controller. The exact version of the appliance was at the moment of the issue “6.7.0.32000 – Build 14070457“.

Could not connect to one or more vCenter Server systems:

At first glance everything looks fine, the web-interfaces are online, authentication is working but after login, the following message appears “Could not connect to one or more vCenter Server systems: https://%fqdn%:443/sdk“. None of the pages are displaying any content. Here is a screenshot:



After performing a simple reboot nothing had happened, the result was the same. So it was time to dig deeper. Luckily the reboot did trigger a new event in the Appliance Management Page (5480). It appeared that the /storage/seat disk had filled up. The alert that popped-up was “File system /storage/seat is low on database storage space. Increase the size of disk /storage/seat or decrease the data retention.” Here is a screenshot:



Increasing Disk Space

After finding the error message it appeared to be an easy fix. Here is an overview of the commands I used. The commands are also usable for expanding one of the other VCSA virtual disks.

Keep in mind: before increasing disk capacity make sure you have a backup or snapshot available.

It this case we are going to expand this /storage/seat volume. The seat volume is responsible for Stats, Events, Alarms, and Tasks (SEAT) for VMware Postgres (Database).

# Step 01: Connect with the vCenter Server with an SSH Session (use for example Putty).

# Step 02: Login with the root account (root/your-password).

# Step 03: Enable the shell
shell

# Step 04: Run the command to verify the current disk space:
df -h

# Step 05: Increase disk capacity with the Host Client because the vCenter Web-interface is not working ;) (see screenshots)

# Step 06: Run the disk expansion command, the expected output should be: VC_CFG_RESULT=0
vpxd_servicecfg storage lvm autogrow

# Step 07: Verify the disk again, the disk should be bigger!
df -h

# Step 08: Reboot the VCSA
reboot

# Step 09: Verify the working of the VCSA Appliance after reboot.

Here is a collection of screenshots of me performing the procedure.

Conclusion

VMware made it easy for the system administrators to identify the issue and quickly expand the virtual disk from the vCenter Appliance. This is a huge improvement compared to the past. The only thing you need to watch out for is the number of virtual disks connected to the VCSA. If you do not watch out you could expand the wrong disk.

The reason that the disk filled up was caused by two things in my case. 1) I created and destroyed lots of virtual machines in the days before the incident. 2) The VCSA is configured as a tiny footprint so that is why the disks are relatively small.

So this was the write-up! If you got any comments or questions please respond in the section below.

VMware ESXi Start a Virtual Machine from Shell

In this blog post, we are going to talk about VMware ESXi and the ability to start & stop a virtual machine from the ESXi shell.

Recently, I ran into a situation where vCenter Server was powered-off manually and the ESXi host that was responsible was not able to open the VMware Host Client to start VMware vCenter with a single mouse click. So it was time to figure out how to boot a virtual machine from the VMware ESXi shell.

I was lucky because SSH was enabled on the ESXi Host so I was able to connect and log in with the root account, but then I ran into the issue… Which command do I need to power on a virtual machine? I knew for sure that it was possible but it took me some time to find the right commands.

So based on that experience, it was time for a quick write-up to show you how to boot a virtual machine from the shell with VMware ESXi. To complete the article I also added the commands for powering off.

Note: The environment was running on vSphere 6.7 Update 2. So all commands are valid for vSphere 6.7 and probably older versions of VMware vSphere.



Start a Virtual Machine from Shell

Here is a step-by-step procedure for booting a virtual machine from the VMware ESXi shell.

# Step 01: Connect with SSH (for example Putty).

# Step 02: Login as a user with root privileges

# Step 03a: View ESXi host virtual machine inventory
vim-cmd vmsvc/getallvms

# Step 03b: View ESXi host virtual machine inventory with filter
vim-cmd vmsvc/getallvms | grep %VMname%

# Step 04: Write-down the VMid, in my case:
183

# Step 05: Verify the current power status
vim-cmd vmsvc/power.getstate %VMid%

# Step 06: Power-on virtual machine
vim-cmd vmsvc/power.on %VMid%

# Step 07: Command has been executed the virtual machine will be power-on. To verify you can use:
vim-cmd vmsvc/power.getstate %VMid%

Screenshots

Here are the screenshots of performing a VMware vCenter virtual machine startup from the VMware ESXi shell.


Stop a Virtual Machine from Shell

Here is the procedure for stopping a virtual machine from the VMware ESXi Shell.

# Step 01: Connect with SSH (for example Putty).

# Step 02: Login as a user with root privileges

# Step 03a: View ESXi host virtual machine inventory
vim-cmd vmsvc/getallvms

# Step 03b: View ESXi host virtual machine inventory with filter
vim-cmd vmsvc/getallvms | grep %VMname%

# Step 04: Write-down the VMid, in my case:
183

# Step 05: Verify the current power status
vim-cmd vmsvc/power.getstate %VMid%

# Step 06: Power-on virtual machine
vim-cmd vmsvc/power.off %VMid%

# Step 07: Command has been executed the virtual machine will be power-off. To verify you can use:
vim-cmd vmsvc/power.getstate %VMid%


Conclusion

You can easily perform this procedure if you know the right commands. There is not a lot of new information available about the vim-cmd command but I added the source(s) below.

VMware ESXi SATADOM Boot Device

In the blog post, I am showing you how to deal with SATADOM boot devices in your ESXi Hosts. Recently I replaced my SD cards with SATADOMs in all my ESXi Hosts in my HomeLab. This blog post is about my experience and configuration that was required for HPE ProLiant servers.

SD Card

In the past, I always used SD cards in my VMware ESXi servers as a boot media but overtime SD cards would wear out of fail. This is of course not ideal but the costs of replacing an SD card are quite low compared to 2.5-inch drives for example. So a nice alternative is a SATADOM, a fast, cheap and more reliable solution.

Here are some screenshots of my Home Lab environment with a failed SD card. The ESXi Host is still fully operational but has lost its boot device. In most cases, you can reboot the ESXi Host and it will work for about three days and the issue is back.



SATADOM

So after a couple of failures over the years, it was time to replace the SD cards with a SATADOM. The installation is quite simple but you need to verify some stuff… some SATADOMs use external power and some receive their power from the SATA connector (please verify this before buying).

In my case, I bought a SATADOM with an external power source. This because my ProLiant servers do not have SATA Ports with a power feature. I ended up buying a Delock SATA 6 Gb/s Flash Module 16 GB vertical (part nr 54655) in a webshop in Holland.

The “biggest” issue I encountered was configuring the BIOS in a way that the device was correctly detected. Here are the screenshots related to the BIOS settings and SATA port used on the motherboard. It appeared that the ML10v2 expected the SATADOM to be connected to port 5, on other ports, it was not working or it was not detected by VMware ESXi.

Here is a recording of the HP ProLiant ML10 v2 booting from SATADOM after a successful ESXi installation. Compared to the SD card the boot time has been reduced with 50%. Speed is of course always nice to have but how many times do you boot an ESXi host in a production environment? On the other hand… it could be very useful for a Lab Environment that is not running 24×7 and you boot your ESXi Hosts on a daily basis.

HP ProLiant ML10 v2 – Booting from SATADOM


VMware vSAN Requirements

So let’s look at the official requirements for VMware vSAN when using a SATADOM as boot media. Note: based on the amount of physical memory installed in your ESXi Host the requirements change!

  • When you boot a vSAN host from a SATADOM device, you must use single-level cell (SLC) device. The size of the boot device must be at least 16 GB.
  • If the memory of the ESXi host has 512 GB of memory or less, you can boot the host from a USB, SD, or SATADOM device.
  • If the memory of the ESXi host has more than 512 GB, consider the following guidelines.
    • You can boot the host from a SATADOM or disk device with a size of at least 16 GB. When you use a SATADOM device, use a single-level cell (SLC) device.
    • If you are using vSAN 6.5 or later, you must resize the coredump partition on ESXi hosts to boot from USB/SD devices. For more information, see the VMware knowledge base article at http://kb.vmware.com/kb/2147881.

Sources

Here is a list of sources I used for writing this article.

vSphere 6.7 Convergence Tool: Failed to get vecs users and permissions

Last week I was converting a vSphere 6.7 Update 1 environment from external PSC to embedded PSC. After a couple of seconds running the conversion, it ended in an error message (Failed to get vecs users and permissions).

The customer was using the latest available vCenter 6.7 update 1 release available at this point vCenter Appliance 6.7 U1b (11727113). The environment consists of one Platform Services Controller (PSC) and one vCenter Server (VC) and a couple of VMware ESXi 6.7 Update 1 hosts.



Error Message

The error message in my PowerShell window displayed the following error message. Not really the best message (possible resolution is []) but it pointed me in the right direction.

### PowerShell output from vcsa-util.exe
2019-05-07 11:07:58,538 [loggable.py:102]: ================ [FAILED] Task: MonitorPSCDeployTask: Running MonitorPSCDeployTask execution failed at 11:07:58 ================
2019-05-07 11:07:58,553 [loggable.py:102]: Task 'MonitorPSCDeployTask: Running MonitorPSCDeployTask' execution failed because [ERROR: Converge Process Failed!], possible resolution is []
2019-05-07 11:07:58,553 [loggable.py:102]: ================================================================================
2019-05-07 11:07:58,631 [taskflow.py:943]: <MonitorPSCDeployTask - com.vmware.vcsa.installer.converge.monitor_psc_deploy(FAILED)> in <ConvergeTaskFlow - converge(FAILED)> status changed to: FAILED
2019-05-07 11:07:58,694 [taskflow.py:641]: Execution attempt 1 for Task <MonitorPSCDeployTask - com.vmware.vcsa.installer.converge.monitor_psc_deploy(FAILED)> FAILED with exception: ERROR: Converge Process Failed!
2019-05-07 11:07:58,694 [taskflow.py:672]: Finished executing <MonitorPSCDeployTask - com.vmware.vcsa.installer.converge.monitor_psc_deploy(FAILED)> and its status is FAILED
2019-05-07 11:07:58,694 [taskflow.py:675]: <ConvergeTaskFlow - converge(FAILED)> overall status is now FAILED

Inside the “converge_mgmt.log” logfile the following error was displayed see output below. The log file can be found on the following location on your local system: “C:\Users\User\AppData\Local\Temp\vcsaCliInstaller-2019-05-07-11-25-6pn5b67r\workflow_1557228307149\converge\converge_mgmt.log“. Keep in mind, the file path is dynamic and I was using Microsoft Windows.

2019-05-07T11:07:46.688Z ERROR converge Failed to get vecs users and permissions. Error: {
    "componentKey": null,
    "problemId": null,
    "detail": [
        {
            "id": "install.ciscommon.command.errinvoke",
            "localized": "An error occurred while invoking external command : 'Command: ['/usr/lib/vmware-vmafd/bin/vecs-cli', 'entry', 'getcert', '--store', 'APPLMGMT_PASSWORD', '--alias', 'location_password_default', '--output', '/root/velma/old_certs/APPLMGMT_PASSWORD.crt']\nStderr: Error: No certificates were found for entry [location_password_default] of type [Secret Key].\nvecs-cli failed. Error 87: Operation failed with error ERROR_INVALID_PARAMETER (87) \n'",
            "translatable": "An error occurred while invoking external command : '%(0)s'",
            "args": [
                "Command: ['/usr/lib/vmware-vmafd/bin/vecs-cli', 'entry', 'getcert', '--store', 'APPLMGMT_PASSWORD', '--alias', 'location_password_default', '--output', '/root/velma/old_certs/APPLMGMT_PASSWORD.crt']\nStderr: Error: No certificates were found for entry [location_password_default] of type [Secret Key].\nvecs-cli failed. Error 87: Operation failed with error ERROR_INVALID_PARAMETER (87) \n"
            ]
        }
    ],
    "resolution": null
}
2019-05-07T11:07:46.706Z INFO converge Cleanup successful with partial flag = True.


Solving the issue

After searching on Google on the string “ERROR converge Failed to get vecs users and permissions“. I got a hit on a VMware KB article. The VMware article can be found below and explained what was going wrong.

The solution is very simple… remove the vCenter Backup Schedule in the VAMI (VMware Appliance Management Interface):

Procedure:

  1. Log into the vCenter Server Appliance Management Interface (https://%vcenter-fqdn%:5480)
  2. Login with the root account.
  3. Navigate to the Backup view
  4. Next to Backup Schedule, click the Delete button to delete the current backup schedule
  5. Attempt the convergence process again!
  6. Once the convergence is complete, re-create the backup schedule. See Schedule a File-Based Backup for more information on creating a backup schedule.

Community Feedback

I got the following feedback on this article after publishing:

  • Update 08-04-2019: David Stamen reached out to me on Twitter with the response: This was fixed in #vSphere67U2.

Sources

The following websites were very usefull for troubleshooting this issue:

vRealize Automation – Failed to convert external resource %VM-Name%.

I ran into an error message today with vRealize Automation (vRA). The error message that came up was: Failed to convert external resource Prod-Fin-00012. The issue occurred in vRA version 7.3.1.

Inside the vRealize Automation portal, I tried to upgrade virtual machine hardware but it failed directly when issuing the request. Strange thing was it was working a couple of day ago. After some investigating the error also came back on other day-2 tasks. So it was time to dive deeper into the issue.

Here is a screenshot of the issue:

vRealize Automation - Error Message - Failed to convert external resource
vRealize Automation – Error Message – Failed to convert external resource

The Cause

So let us think about what vRealize Automation is performing, it is executing a task on a virtual machine. To perform this it needs to talk to vCenter Server and to talk to vCenter Server it uses vRealize Orchestrator.

Here is a simple overview of the communication that happens in this case. vRealize Automation is communicating to vRealize Orchestrator and vRealize Orchestrator is communicating to vCenter Server.

VMware vRealize Automation - vSphere Endpoint Communication
VMware vRealize Automation – vSphere Endpoint Communication

Error messages

The following error messages were found on the following systems:

vRealize Automation error message:

Error message: Failed to convert external resource Prod-Fin-00012.
Script action com.vmware.vcac.asd.mappings/mapToVCVM failed.

vRealize Orchestrator error message:

https://LAB-VC-A.Lab.local:443/sdk (unusable: java.lang.ClassCastException: com.vmware.vcac.authentication.http.spring.oauth2.OAuthToken cannot be cast to com.vmware.vim.sso.client.SamlToken)

As you can see here vRealize Orchestrator has communication issues with VMware vCenter Server. This issue needs to be addressed for vRealize Automation.

Screenshots:



The Solution

After finding the vRealize Orchestrator vSphere endpoints in an error state it was clear that this was the issue. vRealize Orchestrator is not successfully communicating with vCenter Server so this needs to be addressed.

Procedure:

  1. Open the vRealize Orchestrator Client (https://%vro-node-fqdn%).
  2. Login with administrative credentials (example: administrator@vsphere.local).
  3. Navigate to the following location “Library > vCenter > Configuration“.
  4. Run the following workflow “Remove a vCenter Server instance” (screenshot 01 & screenshot 02).
  5. Run the following workflow “Add a vCenter Server instance” (screenshot 03 & screenshot 04).
  6. Validate the vRealize Orchestrator Endpoint Status (screenshots 05).
  7. Run the item in vRealize Automation again.
  8. Everything should be working again.

Screenshots:

HPE Storage Controller Management (ssacli)

This time I decided to do a blog post about the HPE Smart Array RAID controllers with their wonderful ssacli tool. The tooling of HPE is very powerful because you can online manage a VMware ESXi host and migrate for example from a RAID 1 volume to a RAID 10 without downtime or change the read and write cache ratio.

So far as I know I haven’t seen an identical tool yet from the other server hardware vendors like Cisco, Dell EMC, IBM, and Supermicro. The main difference has always been that the HPE tool can perform the operation live without downtime. 

So far as I can remember it has been there for ages. It was already available for VMware ESX 4.0 and is still available in VMware ESXi 6.7. So thumbs-up for HPE :).

Let’s talk about controller support. The tool supports the most HPE SmartArray controllers over the last 10 to 15 years, for example, the Smart Array P400 was released in 2005 and is still working fine today.

Here is an overview of supported controllers:

  • HPE Smart Array P2XX
  • HPE Smart Array P4XX
  • HPE Smart Array P7XX
  • HPE Smart Array P8XX

HPE SSACLI – Location

In case you are using the HPE VMware ESXi custom images. The tool is already pre-installed when installing ESXi. The tool is installed as a VIB (vSphere Installable Bundle). This means it can also be updated with vSphere Update Manager.

Over the years the name of the HPE Storage Controller Tool has been changed and so has the location. Here is a list of locations that have been used for the last ten years for VMware ESXi:

# Location VMware ESXi 4.0/4.1/5.0
/opt/hp/hpacucli/bin/hpacucli

# Location VMware ESXi 5.1/5.5/6.0
/opt/hp/hpssacli/bin/hpssacli

# Location VMware ESXi 6.5/6.7
/opt/smartstorageadmin/ssacli/bin/ssacli


HPE SSACLI – Examples

I have collected some screenshots over the years. Screenshots were taken by doing maintenance on VMware ESXi servers. The give you an idea what valuable information can be shown.


HPE SSACLI – Abréviation

All commands have a short name to reduce the length of the total input provided to the ssacli tool:

### Shortnames:
- chassisname = ch
- controller = ctrl 
- logicaldrive = ld
- physicaldrive = pd 
- drivewritecache = dwc
- licensekey = lk

### Specify drives:
- A range of drives (one to three): 1E:1:1-1E:1:3
- Drives that are unassigned: allunassigned

HPE SSACLI – Status

To view the status of the controller, disks or volumes you can run all sorts of commands to get information about what is going on in your VMware ESXi server. The extensive detail is very useful for troubleshooting and gathering information about the system.

# Show - Controller Slot 1 Controller configuration basic
./ssacli ctrl slot=1 show config

# Show - Controller Slot 1 Controller configuration detailed
./ssacli ctrl slot=1 show detail

# Show - Controller Slot 1 full configuration
./ssacli ctrl slot=1 show config detail

# Show - Controller Slot 1 Status
./ssacli ctrl slot=1 show status

# Show - All Controllers Configuration
./ssacli ctrl all show config

# Show - Controller slot 1 logical drive 1 status
./ssacli ctrl slot=1 ld 1 show status

# Show - Physical Disks status basic
./ssacli ctrl slot=1 pd all show status

# Show - Physical Disk status detailed
./ssacli ctrl slot=1 pd all show status

# Show - Logical Disk status basic
./ssacli ctrl slot=1 ld all show status

# Show - Logical Disk status detailed
./ssacli ctrl slot=1 ld all show detail

HPE SSACLI – Creating

Creating a new logical drive can be done online with the HPE Smart Array controllers. I have displayed some basic examples.

# Create - New single disk volume
./ssacli ctrl slot=1 create type=ld drives=2I:0:8 raid=0 forced

# Create - New spare disk (two defined)
./ssacli ctrl slot=1 array all add spares=2I:1:6,2I:1:7

# Create - New RAID 1 volume
./ssacli ctrl slot=1 create type=ld drives=1I:0:1,1I:0:2 raid=1 forced

# Create - New RAID 5 volume
./ssacli ctrl slot=1 create type=ld drives=1I:0:1,1I:0:2,1I:0:3 raid=5 forced

HPE SSACLI – Adding drives to logical drive

Adding drives to an already created logical drive is possible with the following commands. You need to perform two actions: adding the drive(s) and expanding the logical drive. Keep in mind: make a backup before performing the procedure.

# Add - All unassigned drives to logical drive 1
./ssacli ctrl slot=1 ld 1 add drives=allunassigned

# Modify - Extend logical drive 2 size to maximum (must be run with the "forced" flag)
./ssacli ctrl slot=1 ld 2 modify size=max forced

HPE SSACLI – Rescan controller

To issue a controller rescan, you can run the following command. This can be interesting for when you add new drives in hot swap bays.

### Rescan all controllers
./ssacli rescan

HPE SSACLI – Drive Led Status

The LED status of the drives can also be controlled by the ssacli utility. An example is displayed below how to enable and disable a LED.

# Led - Activate LEDs on logical drive 2 disks
./ssacli ctrl slot=1 ld 2 modify led=on

# Led - Deactivate LEDs on logical drive 2 disks
./ssacli ctrl slot=1 ld 2 modify led=off

# Led - Activate LED on physical drive
./ssacli ctrl slot=0 pd 1I:0:1 modify led=on

# Led - Deactivate LED on physical drive
./ssacli ctrl slot=0 pd 1I:0:1 modify led=off


HPE SSACLI – Modify Cache Ratio

Modify the cache ratio on a running system can be interesting for troubleshooting and performance beanchmarking.

# Show - Cache Ratio Status
./ssacli ctrl slot=1 modify cacheratio=?

# Modify - Cache Ratio read: 25% / write: 75%
./ssacli ctrl slot=1 modify cacheratio=25/75

# Modify - Cache Ratio read: 50% / write: 50%
./ssacli ctrl slot=1 modify cacheratio=50/50

# Modify - Cache Ratio read: 0% / Write: 100%
./ssacli ctrl slot=1 modify cacheratio=0/100


HPE SSACLI – Modify Write Cache

Changing the write cache settings on the storage controller can be done with the following commands:

# Show - Write Cache Status
./ssacli ctrl slot=1 modify dwc=?

# Modify - Enable Write Cache on controller
./ssacli ctrl slot=1 modify dwc=enable forced

# Modify - Disable Write Cache on controller
./ssacli ctrl slot=1 modify dwc=disable forced

# Show - Write Cache Logicaldrive Status
./ssacli ctrl slot=1 logicaldrive 1 modify aa=?

# Modify - Enable Write Cache on Logicaldrive 1
./ssacli ctrl slot=1 logicaldrive 1 modify aa=enable

# Modify - Disable Write Cache on Logicaldrive 1
./ssacli ctrl slot=1 logicaldrive 1 modify aa=disable

HPE SSACLI – Modify Rebuild Priority

Viewing or changing the rebuild priority can be done on the fly. Even when the rebuild is already active. Used it myself a couple of times to lower the impact on production.

# Show - Rebuild Priority Status
./ssacli ctrl slot=1 modify rp=?

# Modify - Set rebuildpriority to Low
./ssacli ctrl slot=1 modify rebuildpriority=low

# Modify - Set rebuildpriority to Medium
./ssacli ctrl slot=1 modify rebuildpriority=medium

# Modify - Set rebuildpriority to High
./ssacli ctrl slot=1 modify rebuildpriority=high

HPE SSACLI – Modify SSD Smart Path

You can modify the HPE SDD Smart Path feature by disabling or enabling. To make clear what the HPE SDD Smart Path includes, here is a official statement by HPE: 

“HP SmartCache feature is a controller-based read and write caching solution that caches the most frequently accessed data (“hot” data) onto lower latency SSDs to dynamically accelerate application workloads. This can be implemented on direct-attached storage and SAN storage.”

For example, when running VMware vSAN SSD Smart Path must be disabled for better performance. In some cases worse the entire vSAN disk group fails.

# Note: This command requires the array naming type like A/B/C/D/E

# Modify - Enable SSD Smart Path
./ssacli ctrl slot=1 array a modify ssdsmartpath=enable

# Modify - Disable SSD Smart Path
./ssacli ctrl slot=1 array a modify ssdsmartpath=disable

HPE SSACLI – Delete Logical Drive

Deleting a logical drive on the HPE Smart Array controller can be done with the following commands.

# Delete - Logical Drive 1
./ssacli ctrl slot=1 ld 1 delete

# Delete - Logical Drive 2
./ssacli ctrl slot=1 ld 2 delete

HPE SSACLI – Erasing Physical Drives

In some cases, you need to erase a physical drive. This can be performed with multiple erasing options. Also, you can stop the process.

Erase patterns available:

  • Default
  • Zero
  • Random_zero
  • Random_random_zero
# Erase physical drive with default erasepattern
./ssacli ctrl slot=1 pd 2I:1:1 modify erase

# Erase physical drive with zero erasepattern
./ssacli ctrl slot=1 pd 2I:1:1 modify erase erasepattern=zero

# Erase physical drive with random zero erasepattern
./ssacli ctrl slot=1 pd 1E:1:1-1E:1:3 modify erase erasepattern=random_zero

# Erase physical drive with random random zero erasepattern
./ssacli ctrl slot=1 pd 1E:1:1-1E:1:3 modify erase erasepattern=random_random_zero

# Stop the erasing process on phsyical drive 1E:1:1
./ssacli ctrl slot=1 pd 1E:1:1 modify stoperase

HPE SSACLI – License key

In some cases a licence key needs to be installed on the SmartArray storage controller to enable the advanced features. This can be done with the following command:

# License key installation
./ssacli ctrl slot=1 licensekey XXXXX-XXXXX-XXXXX-XXXXX-XXXXX

# License key removal
./ssacli ctrl slot=5 lk XXXXXXXXXXXXXXXXXXXXXXXXX delete 

Related sources

A couple of interesting links related to the HPE Storage Controller tool for VMware ESXi:

Changing VMware Storage Controller to Paravirtual for CentOS 7

In this post, we are going to change the Virtual Storage Controller from LSI Logic Parallel to VMware Paravirtual for a CentOS 7 based Virtual Machine that is running on VMware vSphere. This blog post will contain step by step guidance for performing the operation.

In my case the virtual machine was build in VMware Workstation and after some time migrated to VMware ESXi. The VMware Paravirtual Storage Controller is not supported in VMware Workstation. That is why the virtual machine came over with the “wrong” storage controller.

My 24×7 Lab environment is running shared iSCSI based storage and all virtual machines are thin provisioned. The Virtual Machine that came over from VMware Workstation is installed with CentOS 7.

Why VMware Paravirtual?

Why should you want to migrate from an LSI Logic Parallel to a VMware Paravirtual SCSI Controller? Two simple reasons and they are two good ones:

  • Lower CPU utilization
  • Higher Throughput

Personally, I have a third reason to add… compliance. All my virtual machines should be compliant with the VMware Best Practice and my personal Home Lab standard. In my Lab environment, this means using the VMware Paravirtual where ever possible/supported.

VMware Official Statement 1:

PVSCSI adapters are high-performance storage adapters that can result in greater throughput and lower CPU utilization. PVSCSI adapters are best for environments, especially SAN environments, where hardware or applications drive a very high amount of I/O throughput. The VMware PVSCSI adapter driver is also compatible with the Windows Storport storage driver. PVSCSI adapters are not suitable for DAS environments.VMware Paravirtual SCSI adapters are high-performance storage adapters that can result in greater throughput and lower CPU utilization.

VMware Official Statement 2:

The PVSCSI adapter offers a significant reduction in CPU utilization as well as potentially increased throughput compared to the default virtual storage adapters, and is thus the best choice for environments with very I/O-intensive guest applications.



Procedure

The most important step in the process is to make sure you have a valid backup! After that, it is just following the steps described below:

  • Create a virtual machine snapshot or backup before you begin.
  • Power-off the virtual machine.
  • Add the VMware Paravirtual Controller to the Virtual Machine. Do not change the disk controller assignment yet, only add the storage controller to the VM (screenshot 01).
  • Power-on the virtual machine.
  • Login with an account on the virtual machine (account must be able to obtain root access).
  • Start rebuilding the initial ramdisk image (screenshot 02):
    mkinitrd -f -v /boot/initramfs-$(uname -r).img $(uname -r)
  • Power-off the virtual machine.
  • Assign disks to the new storage controller and remove the old storage controller (screenshot 03).
  • Power-on the virtual machine.
  • Validate that everything is working and disks are mounted (screenshot 04).
  • Remove the virtual machine snapshot or backup after you are done.

Screenshots

Here are some screenshots from the procedure:

Conclusion

At this point, I have swapped out three virtual machines from the LSI controller to the VMware Paravirtual SCSI Controller. The machines have been running now for about two weeks without any problems. So everything is compliant again ;).

If you encounter any problems or have any questions about this subject please feel free to contact me on Twitter or the Reply option below.



Source

Here are some interesting related articles that I used for creating this blog post: