Tag: ITQ

VCSA 6.7 Out of Space (SEAT)

Today I was greeted by the following error message when logging into the VMware vCenter Server also known as VCSA: “Could not connect to one or more vCenter Server systems: https://%fqdn%:443/sdk“. So it was time for a quick write-up on how to resolve this issue.

The issues were already present a couple of hours earlier based on monitoring and logging. For example, Veeam Backup & Replication tried to perform a backup but failed because there were no vSphere Tags available. Veeam Backup & Replication generated the following message “Tag Backup SLA – Bronze is unavailable, VMs residing on it will be skipped from processing.“.

I’m running a VMware vCenter Server as in a VCSA 6.7 appliance and it has an embedded Platform Services Controller. The exact version of the appliance was at the moment of the issue “6.7.0.32000 – Build 14070457“.

Could not connect to one or more vCenter Server systems:

At first glance everything looks fine, the web-interfaces are online, authentication is working but after login, the following message appears “Could not connect to one or more vCenter Server systems: https://%fqdn%:443/sdk“. None of the pages are displaying any content. Here is a screenshot:



After performing a simple reboot nothing had happened, the result was the same. So it was time to dig deeper. Luckily the reboot did trigger a new event in the Appliance Management Page (5480). It appeared that the /storage/seat disk had filled up. The alert that popped-up was “File system /storage/seat is low on database storage space. Increase the size of disk /storage/seat or decrease the data retention.” Here is a screenshot:



Increasing Disk Space

After finding the error message it appeared to be an easy fix. Here is an overview of the commands I used. The commands are also usable for expanding one of the other VCSA virtual disks.

Keep in mind: before increasing disk capacity make sure you have a backup or snapshot available.

It this case we are going to expand this /storage/seat volume. The seat volume is responsible for Stats, Events, Alarms, and Tasks (SEAT) for VMware Postgres (Database).

# Step 01: Connect with the vCenter Server with an SSH Session (use for example Putty).

# Step 02: Login with the root account (root/your-password).

# Step 03: Enable the shell
shell

# Step 04: Run the command to verify the current disk space:
df -h

# Step 05: Increase disk capacity with the Host Client because the vCenter Web-interface is not working ;) (see screenshots)

# Step 06: Run the disk expansion command, the expected output should be: VC_CFG_RESULT=0
vpxd_servicecfg storage lvm autogrow

# Step 07: Verify the disk again, the disk should be bigger!
df -h

# Step 08: Reboot the VCSA
reboot

# Step 09: Verify the working of the VCSA Appliance after reboot.

Here is a collection of screenshots of me performing the procedure.

Conclusion

VMware made it easy for the system administrators to identify the issue and quickly expand the virtual disk from the vCenter Appliance. This is a huge improvement compared to the past. The only thing you need to watch out for is the number of virtual disks connected to the VCSA. If you do not watch out you could expand the wrong disk.

The reason that the disk filled up was caused by two things in my case. 1) I created and destroyed lots of virtual machines in the days before the incident. 2) The VCSA is configured as a tiny footprint so that is why the disks are relatively small.

So this was the write-up! If you got any comments or questions please respond in the section below.

VMware ESXi Start a Virtual Machine from Shell

In this blog post, we are going to talk about VMware ESXi and the ability to start & stop a virtual machine from the ESXi shell.

Recently, I ran into a situation where the vCenter Server was powered-off manually and the ESXi host that was responsible was not able to open the VMware Host Client to start VMware vCenter with a single mouse click. So it was time to figure out how to boot a virtual machine from the VMware ESXi shell.

I was lucky because SSH was enabled on the ESXi Host so I was able to connect and log in with the root account, but then I ran into the issue… Which command do I need to power on a virtual machine? I knew for sure that it was possible but it took me some time to find the right commands.

So based on that experience, it was time for a quick write-up to show you how to boot a virtual machine from the shell with VMware ESXi. To complete the article I also added the commands for powering off.

Note: The environment was running on vSphere 6.7 Update 2. So all commands are valid for vSphere 6.7 and probably older versions of VMware vSphere.

Start a Virtual Machine from Shell

Here is a step-by-step procedure for booting a virtual machine from the VMware ESXi shell.

# Step 01: Connect with SSH (for example Putty).

# Step 02: Login as a user with root privileges

# Step 03a: View ESXi host virtual machine inventory
vim-cmd vmsvc/getallvms

# Step 03b: View ESXi host virtual machine inventory with filter
vim-cmd vmsvc/getallvms | grep %VMname%

# Step 04: Write-down the VMid, in my case:
183

# Step 05: Verify the current power status
vim-cmd vmsvc/power.getstate %VMid%

# Step 06: Power-on virtual machine
vim-cmd vmsvc/power.on %VMid%

# Step 07: Command has been executed the virtual machine will be power-on. To verify you can use:
vim-cmd vmsvc/power.getstate %VMid%

Screenshots

Here are the screenshots of performing a VMware vCenter virtual machine startup from the VMware ESXi shell.

Stop a Virtual Machine from Shell

Here is the procedure for stopping a virtual machine from the VMware ESXi Shell.

# Step 01: Connect with SSH (for example Putty).

# Step 02: Login as a user with root privileges

# Step 03a: View ESXi host virtual machine inventory
vim-cmd vmsvc/getallvms

# Step 03b: View ESXi host virtual machine inventory with filter
vim-cmd vmsvc/getallvms | grep %VMname%

# Step 04: Write-down the VMid, in my case:
183

# Step 05: Verify the current power status
vim-cmd vmsvc/power.getstate %VMid%

# Step 06: Power-on virtual machine
vim-cmd vmsvc/power.off %VMid%

# Step 07: Command has been executed the virtual machine will be power-off. To verify you can use:
vim-cmd vmsvc/power.getstate %VMid%

Conclusion

You can easily perform this procedure if you know the right commands. There is not a lot of new information available about the vim-cmd command but I added the source(s) below.

ITQ Transform! 2019

On the twenty-sixth of September 2019, it was time for our yearly event! For the third time in a row under the name ITQ Transform! This year the theme of the event was “Empower your Business”.

The event was located in Fort Voordorp near Utrecht in Holland. A central spot in the country with enough parking space to accommodate all attendees.

For the attendees, there was choice enough to select an appropriate session. The sessions were grouped into three main categories:

  • Accelerate your cloud journey
  • Empower the Digital Workspace
  • Modernizing your Apps

There were also multiple guest speakers how we’re responsible for the opening and closing keynotes:

  • Spencer Pitts – VMware
  • Wouter Sliedrecht – Pivotal
  • Robert Doornbos – F1 driver


TeraSky & ITQ

The biggest announcement that was made on ITQ Transform was about TeraSky & ITQ. TeraSky is a premier VMware Partner that is located in Israel and ITQ is an enterprise VMware partner that is located in the Benelux.

Combining there effort they join forces to help customers around the world. The company will be called SKY-TQ. Here is the announcement video that was shown on ITQ Transform:

ITQ Transform Session

This year I did a customer case session with a VMware Cloud Provider called ViaData. It was a combined effort with Jitse Hijlkema about the first Cloud Provider Pod that was deployed in the Benelux. Because of our experience with the product, it was time to talk about it in the form of a knowledge sharing session.

For those who are interested here is the presentation of our slide deck. The drawback is that the entire event was in Dutch… so we are sorry if you are an English viewer.

View Slide Deck:

SDDC Based on Cloud Provider Pod
Transform! 2019 – Jitse & Mischa – SDDC based on Cloud Provider Pod


ITQ Transform! Showcase

Here are some photos of the ITQ Transform! 2019 event. You can click on the image for the full-color version.

Note: Not all pictures are very clear… somehow my Pixel 3 phone didn’t like the lighting setup.

For those who are interested in the past, I also wrote a blog post about ITQ Transform! 2017.

vCloud Director Appliance Keystore Password

Recently I was deploying vCloud Director (vCD) at a customer and I was experiencing some issues surrounding the Java Keystore password that is configured out of the box (OOTB) in the vCD Cell.

Before digging into the topic, let’s talk about the latest vCloud Director releases. Since the release of vCloud Director 9.5, the software is available as a virtual appliance. In the releases before vCloud Director was released as an installable package for on a Linux Distribution. So a lot has changed related to the vCloud Director software package.

Personally, I really like the vCloud Director appliance since it reduces the complexity and the deployment time at the customer. So that is a good thing!



Why Keystore Access

To install an SSL certificate on a vCloud Director Cell you need access to Java Keystore for replacing the default certificate. The default certificate installed is a self-signed certificate, so not usable for a production environment with a cloud management portal available in the outside world.

So to install a new SSL Certificate that has been issued by a third-party certificate authority we need access to the Java Keystore from the vCloud Director Cell.

The problem is there is no password listed in any of the manuals or online documentation. So that is not really ideal…

After a lot of searching and trying we came up with the following solution for vCloud Director 9.5 and vCloud Director 9.7.

vCloud Director 9.5 Keystore Password

To verify the default Java Keystore Password on your vCloud Director 9.5 cell you need to run the following commands:

  1. Open an SSH Session with your vCloud Director Cell with for example Putty.
  2. Login with your root account.
  3. Run the following command:
cat /opt/vmware/etc/isv/firstboot | grep 'keystore-password'

Here are two screenshots of the full code and the output from the command above.

In all the cases I have tested so far the standard password is “akimbi“. Not really the most save solution in my opinion. You can also create a new Java Keystore to work around the problem and to set a selfdefined password.



vCloud Director 9.7 Keystore Password

I also tested the same operation on a vCloud Director 9.7 Cell and the default password has been removed from the firstboot script. If you look closely at the firstboot code the entire script is overhauled and the script now responsible for certificate generation is “generate-certificates.sh“.

The root password configured at deployment is now the default password for the Java Keystore on the appliance. Keep in mind: When changing the root password the old password is still valid for the Java Keystore but not as login. Also, this is not a very ideal scenario in my opinion.

Here are two screenshots of the code:

Final word

This concludes my blog post about the vCloud Director Java Keystore Password. I wasted more time on this subject than expected… so it was time for a proper write-up for the vCommunity.

VMware ESXi SATADOM Boot Device

In the blog post, I am showing you how to deal with SATADOM boot devices in your ESXi Hosts. Recently I replaced my SD cards with SATADOMs in all my ESXi Hosts in my HomeLab. This blog post is about my experience and configuration that was required for HPE ProLiant servers.

SD Card

In the past, I always used SD cards in my VMware ESXi servers as a boot media but overtime SD cards would wear out of fail. This is of course not ideal but the costs of replacing an SD card are quite low compared to 2.5-inch drives for example. So a nice alternative is a SATADOM, a fast, cheap and more reliable solution.

Here are some screenshots of my Home Lab environment with a failed SD card. The ESXi Host is still fully operational but has lost its boot device. In most cases, you can reboot the ESXi Host and it will work for about three days and the issue is back.



SATADOM

So after a couple of failures over the years, it was time to replace the SD cards with a SATADOM. The installation is quite simple but you need to verify some stuff… some SATADOMs use external power and some receive their power from the SATA connector (please verify this before buying).

In my case, I bought a SATADOM with an external power source. This because my ProLiant servers do not have SATA Ports with a power feature. I ended up buying a Delock SATA 6 Gb/s Flash Module 16 GB vertical (part nr 54655) in a webshop in Holland.

The “biggest” issue I encountered was configuring the BIOS in a way that the device was correctly detected. Here are the screenshots related to the BIOS settings and SATA port used on the motherboard. It appeared that the ML10v2 expected the SATADOM to be connected to port 5, on other ports, it was not working or it was not detected by VMware ESXi.

Here is a recording of the HP ProLiant ML10 v2 booting from SATADOM after a successful ESXi installation. Compared to the SD card the boot time has been reduced with 50%. Speed is of course always nice to have but how many times do you boot an ESXi host in a production environment? On the other hand… it could be very useful for a Lab Environment that is not running 24×7 and you boot your ESXi Hosts on a daily basis.

HP ProLiant ML10 v2 – Booting from SATADOM


VMware vSAN Requirements

So let’s look at the official requirements for VMware vSAN when using a SATADOM as boot media. Note: based on the amount of physical memory installed in your ESXi Host the requirements change!

  • When you boot a vSAN host from a SATADOM device, you must use single-level cell (SLC) device. The size of the boot device must be at least 16 GB.
  • If the memory of the ESXi host has 512 GB of memory or less, you can boot the host from a USB, SD, or SATADOM device.
  • If the memory of the ESXi host has more than 512 GB, consider the following guidelines.
    • You can boot the host from a SATADOM or disk device with a size of at least 16 GB. When you use a SATADOM device, use a single-level cell (SLC) device.
    • If you are using vSAN 6.5 or later, you must resize the coredump partition on ESXi hosts to boot from USB/SD devices. For more information, see the VMware knowledge base article at http://kb.vmware.com/kb/2147881.

Sources

Here is a list of sources I used for writing this article.

vSphere 6.7 Convergence Tool: Failed to get vecs users and permissions

Last week I was converting a vSphere 6.7 Update 1 environment from external PSC to embedded PSC. After a couple of seconds running the conversion, it ended in an error message (Failed to get vecs users and permissions).

The customer was using the latest available vCenter 6.7 update 1 release available at this point vCenter Appliance 6.7 U1b (11727113). The environment consists of one Platform Services Controller (PSC) and one vCenter Server (VC) and a couple of VMware ESXi 6.7 Update 1 hosts.



Error Message

The error message in my PowerShell window displayed the following error message. Not really the best message (possible resolution is []) but it pointed me in the right direction.

### PowerShell output from vcsa-util.exe
2019-05-07 11:07:58,538 [loggable.py:102]: ================ [FAILED] Task: MonitorPSCDeployTask: Running MonitorPSCDeployTask execution failed at 11:07:58 ================
2019-05-07 11:07:58,553 [loggable.py:102]: Task 'MonitorPSCDeployTask: Running MonitorPSCDeployTask' execution failed because [ERROR: Converge Process Failed!], possible resolution is []
2019-05-07 11:07:58,553 [loggable.py:102]: ================================================================================
2019-05-07 11:07:58,631 [taskflow.py:943]: <MonitorPSCDeployTask - com.vmware.vcsa.installer.converge.monitor_psc_deploy(FAILED)> in <ConvergeTaskFlow - converge(FAILED)> status changed to: FAILED
2019-05-07 11:07:58,694 [taskflow.py:641]: Execution attempt 1 for Task <MonitorPSCDeployTask - com.vmware.vcsa.installer.converge.monitor_psc_deploy(FAILED)> FAILED with exception: ERROR: Converge Process Failed!
2019-05-07 11:07:58,694 [taskflow.py:672]: Finished executing <MonitorPSCDeployTask - com.vmware.vcsa.installer.converge.monitor_psc_deploy(FAILED)> and its status is FAILED
2019-05-07 11:07:58,694 [taskflow.py:675]: <ConvergeTaskFlow - converge(FAILED)> overall status is now FAILED

Inside the “converge_mgmt.log” logfile the following error was displayed see output below. The log file can be found on the following location on your local system: “C:\Users\User\AppData\Local\Temp\vcsaCliInstaller-2019-05-07-11-25-6pn5b67r\workflow_1557228307149\converge\converge_mgmt.log“. Keep in mind, the file path is dynamic and I was using Microsoft Windows.

2019-05-07T11:07:46.688Z ERROR converge Failed to get vecs users and permissions. Error: {
    "componentKey": null,
    "problemId": null,
    "detail": [
        {
            "id": "install.ciscommon.command.errinvoke",
            "localized": "An error occurred while invoking external command : 'Command: ['/usr/lib/vmware-vmafd/bin/vecs-cli', 'entry', 'getcert', '--store', 'APPLMGMT_PASSWORD', '--alias', 'location_password_default', '--output', '/root/velma/old_certs/APPLMGMT_PASSWORD.crt']\nStderr: Error: No certificates were found for entry [location_password_default] of type [Secret Key].\nvecs-cli failed. Error 87: Operation failed with error ERROR_INVALID_PARAMETER (87) \n'",
            "translatable": "An error occurred while invoking external command : '%(0)s'",
            "args": [
                "Command: ['/usr/lib/vmware-vmafd/bin/vecs-cli', 'entry', 'getcert', '--store', 'APPLMGMT_PASSWORD', '--alias', 'location_password_default', '--output', '/root/velma/old_certs/APPLMGMT_PASSWORD.crt']\nStderr: Error: No certificates were found for entry [location_password_default] of type [Secret Key].\nvecs-cli failed. Error 87: Operation failed with error ERROR_INVALID_PARAMETER (87) \n"
            ]
        }
    ],
    "resolution": null
}
2019-05-07T11:07:46.706Z INFO converge Cleanup successful with partial flag = True.


Solving the issue

After searching on Google on the string “ERROR converge Failed to get vecs users and permissions“. I got a hit on a VMware KB article. The VMware article can be found below and explained what was going wrong.

The solution is very simple… remove the vCenter Backup Schedule in the VAMI (VMware Appliance Management Interface):

Procedure:

  1. Log into the vCenter Server Appliance Management Interface (https://%vcenter-fqdn%:5480)
  2. Login with the root account.
  3. Navigate to the Backup view
  4. Next to Backup Schedule, click the Delete button to delete the current backup schedule
  5. Attempt the convergence process again!
  6. Once the convergence is complete, re-create the backup schedule. See Schedule a File-Based Backup for more information on creating a backup schedule.

Community Feedback

I got the following feedback on this article after publishing:

  • Update 08-04-2019: David Stamen reached out to me on Twitter with the response: This was fixed in #vSphere67U2.

Sources

The following websites were very usefull for troubleshooting this issue:

No Workflow Output in vRealize Orchestrator (vRO) 7.6

About a year ago I wrote a blog about the following issue: No Workflow Output in vRealize Orchestrator (vRO) 7.4. Apparently, after the vRealize Orchestrator (vRO) 7.6 upgrade the issue has returned.

I will not go into full detail in this blog post. Just a basic instruction on how to resolve the issue in vRealize Orchestrator 7.6. More background information can be found over here with additional screenshots.

Note: vRealize Orchestrator 7.6 was released on 11-04-2019 and can be downloaded over here. The vRealize Orchestrator 7.6 release notes can be found over here.



No Workflow Output

Let’s start with a small introduction to the issue. After an upgrade from vRealize Orchestrator 7.5 to vRealize Orchestrator 7.6 the (legacy client) is not able to show any workflow logs after executing a workflow run.

To make it perfectly clear… the workflow is executed and is working fine. The logs are just not displayed in the vRealize Orchestrator Client.

Here is an example, the workflow has been executed and it should output information but the logs tab is empty. Keep in mind: this image is from a vRO 7.4 instance but looks identical to vRO 7.6.

Restore the workflow output

Here are the commands for resolving the issue in vRealize Orchestrator ( vRO) 7.6 . The fix can be applied in under 10 minutes by a system administrator.

Before removing the files it is good practice to make sure you have a backup or virtual machine snapshot of your vRO appliance.

### Step 01: Start an SSH session with the vRealize Orchestrator Appliance (use for example Putty).

### Step 02: Login with root credentials

### Step 03: When you run the following command multiple files will be shown:
ls -l /var/log/vco/app-server/scripting.log_lucene*

### Step 04: Stop the Orchestrator service
service vco-server stop

### Step 05: Remove the log files
rm -rf /var/log/vco/app-server/scripting.log_lucene* 

### Step 06: Start the Orchestrator service
service vco-server start

### Step 07: Open the vRealize Orchestrator Client

### Step 08: Execute a workflow and logging should be working again.

I have tested it so far on two vRealize Orchestrator 7.6 appliances that were upgraded from vRealize Orchestrator 7.5. In both cases the upgrade was successful but the workflow output was not working.

There might be some vRO 7.5 to vRO 7.6 upgrades that will work without issues… like an Orchestrator that is just sitting idle or maybe a clean install that is directly upgraded from 7.5 to 7.6?

If you got any comments or tips please respond below!

PowervRA: Invoke-RestMethod The underlying connection was closed.

At a customer, I encountered the following issue when trying to connect with PowervRA to vRealize Automation. The error message that appeared was: Invoke-RestMethod : The underlying connection was closed: An unexpected error occurred on a send.

Let go one step back: So what is PowervRA you might ask? PowervRA is a PowerShell Toolkit to manage VMware vRealize Automation (vRA). With PowervRA you can configure and manage your vRealize Automation environment, for example, create a new tenant, assigning permissions or viewing the user’s requests.

The problem

The problem started by connecting with PowervRA to vRealize Automation (vRA). There was no way to get a successful connection. I tried using the IP addresses, hostname and FQDN also different credentials didn’t make any difference. The error that returned in all cases was identical.

The customer was using the latest version of PowervRA. At this moment it was PowervRA 3.5.0. The vRealize Automation version they were using was 7.4.0.

Here is the screenshot of the error message:

PowervRA - Invoke-RestMethod : The underlying connection was closed
PowervRA – Invoke-RestMethod : The underlying connection was closed – Issue

Here is the full error message in plain text from the PowerShell Console:

Error message:
Invoke-RestMethod : The underlying connection was closed: An unexpected error occurred on a send.
At C:\Program Files\WindowsPowerShell\Modules\PowervRA\3.5.0\PowervRA.psm1:510 char:21
+         $Response = Invoke-RestMethod @Params
+                     ~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-RestMethod], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeRestMethodCommand

To validate the issue further I tried the same thing in my Lab environment. The strange thing was that everything was working fine with the identical versions.



The solution

Until this moment I am not really sure why it is working in one environment and not in the other… I suspect it has something to do with Windows Updates or Domain Security Policies? To address the issue there is only one way: force PowerShell/PowervRA to use TLS 1.2 when connecting with vRealize Automation (vRA).

Procedure:

  1. Open the PowerShell command-prompt as administrator.
  2. Run the following command before connecting to vRealize Automation. The command is listed below. No output is expected after running this command.
  3. Run the Connect-vRAServer PowerShell command to start a session with vRealize Automation. Everything should be working and authentication should be possible.

PowerShell code

Copy and paste the code into your PowerShell console before connecting to vRealize Automation:

[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

Screenshot

Here is a screenshot after the TLS 1.2 has been forced:

PowervRA - Invoke-RestMethod : The underlying connection was closed
PowervRA – Invoke-RestMethod : The underlying connection was closed – Solved

Source

Here is the official GitHub page related to PowervRA:

vRealize Automation – Failed to convert external resource %VM-Name%.

I ran into an error message today with vRealize Automation (vRA). The error message that came up was: Failed to convert external resource Prod-Fin-00012. The issue occurred in vRA version 7.3.1.

Inside the vRealize Automation portal, I tried to upgrade virtual machine hardware but it failed directly when issuing the request. Strange thing was it was working a couple of day ago. After some investigating the error also came back on other day-2 tasks. So it was time to dive deeper into the issue.

Here is a screenshot of the issue:

vRealize Automation - Error Message - Failed to convert external resource
vRealize Automation – Error Message – Failed to convert external resource

The Cause

So let us think about what vRealize Automation is performing, it is executing a task on a virtual machine. To perform this it needs to talk to vCenter Server and to talk to vCenter Server it uses vRealize Orchestrator.

Here is a simple overview of the communication that happens in this case. vRealize Automation is communicating to vRealize Orchestrator and vRealize Orchestrator is communicating to vCenter Server.

VMware vRealize Automation - vSphere Endpoint Communication
VMware vRealize Automation – vSphere Endpoint Communication

Error messages

The following error messages were found on the following systems:

vRealize Automation error message:

Error message: Failed to convert external resource Prod-Fin-00012.
Script action com.vmware.vcac.asd.mappings/mapToVCVM failed.

vRealize Orchestrator error message:

https://LAB-VC-A.Lab.local:443/sdk (unusable: java.lang.ClassCastException: com.vmware.vcac.authentication.http.spring.oauth2.OAuthToken cannot be cast to com.vmware.vim.sso.client.SamlToken)

As you can see here vRealize Orchestrator has communication issues with VMware vCenter Server. This issue needs to be addressed for vRealize Automation.

Screenshots:



The Solution

After finding the vRealize Orchestrator vSphere endpoints in an error state it was clear that this was the issue. vRealize Orchestrator is not successfully communicating with vCenter Server so this needs to be addressed.

Procedure:

  1. Open the vRealize Orchestrator Client (https://%vro-node-fqdn%).
  2. Login with administrative credentials (example: administrator@vsphere.local).
  3. Navigate to the following location “Library > vCenter > Configuration“.
  4. Run the following workflow “Remove a vCenter Server instance” (screenshot 01 & screenshot 02).
  5. Run the following workflow “Add a vCenter Server instance” (screenshot 03 & screenshot 04).
  6. Validate the vRealize Orchestrator Endpoint Status (screenshots 05).
  7. Run the item in vRealize Automation again.
  8. Everything should be working again.

Screenshots:

vSAN: PBM error occurred during PreCloneCheckCallback

Lately, I encountered some issues related to VMware vSAN in my Lab environment. The error message that was popping up all the time was “PBM error occurred during PreCloneCheckCallback“.

So how did the problem occur? First, we start with some background information. My Lab environment is powered-on when needed and powered-off when not needed. This is, of course, a little bit different than a production 24×7 environment that you have in your datacenters worldwide.

The environment was booted successfully at first glance. We are talking about Domain Controllers, vCenter Server, VMware NSX-V, nested ESXi Hosts, and vRealize Automation. When I started deploying virtual machines with a vRealize Automation (vRA) based on blueprints with vSphere Templates issues started to occur.

vRealize Automation was failing on the provisioning task and was cleaning up the deployment because of the failed state (default behavior). So it was time to dig into the underlying infrastructure.

Environment

When the issue occurred the following software versions were used in my lab environment:

  • VMware vCenter 6.5 Update 2B
  • VMware vRealize Automation 7.3.1
  • VMware ESXi 6.5 Update 2
  • VMware vSAN 6.6

Error message(s)

Here is all the information that can be found in various locations surrounding the issue. Lets start with the screenshots. The first one is from VMware vCenter and the second one is from vRealize Automation. As you can see there is clearly a problem.

And here is an overview of the error message(s). Here is the vRealize Automation log entry related to the VMware vSAN issue:

Error in Execute DynamicOps.Common.Client.HtmlResponseException: Service Unavailable (503)

Here is the VMware vCenter log entry related to the VMware vSAN issue:

A general system error occurred - PBM error occurred during PreCloneCheckCallback (2118557)

Solution

The solution is quick but is more like a quick fix because it comes back every time I re-start my lab environment (cold boot).

Procedure:

  • Open a web browser.
  • Navigate to your vCenter Server URL (https://%vc%/vsphere-client).
  • Login with a user that has administrator credentials (administrator@vsphere.local).
  • Navigate to Hosts & Clusters > Select the vCenter Object.
  • Click on the Configure tab.
  • Click on the Storage Providers.
  • Click on the following two buttons:
    • Synchronizes all Storage Providers with the current state of the environment.
    • Rescan the storage provider for new storage systems and storage capabilities.
  • After pressing the buttons, you don’t see any tasks running on the vCenter Server (expected behavior). After 5 seconds everything should be working and provisioning should be possible.

Wrap-up

Thanks for reading this blog post. If you have any comments, please respond in the comment section below!