So how did the problem occur? First, some background information. My lab environment is powered on when needed and powered off when not needed. This is, of course, a little different from a production 24×7 environment like the ones you have in your datacenters worldwide.
At first glance, the environment booted successfully. We are talking about Domain Controllers, vCenter Server, VMware NSX-V, nested ESXi hosts and vRealize Automation. Issues started to occur when I began deploying virtual machines with vRealize Automation (vRA), using blueprints based on vSphere templates.
vRealize Automation was failing on the provisioning task and was cleaning up the deployment because of the failed state (default behavior). So it was time to dig into the underlying infrastructure.
When the issue occurred, the following software versions were in use in my lab environment:
- VMware vCenter 6.5 Update 2B
- VMware vRealize Automation 7.3.1
- VMware ESXi 6.5 Update 2
- VMware vSAN 6.6
Here is the information about the issue that I found in various locations.
Error message: Screenshots
Here are the screenshots: the first one is from VMware vCenter and the second one is from vRealize Automation. As you can see, there is clearly a problem.
Error message: vRealize Automation
Here is the vRealize Automation log entry related to the VMware vSAN issue:
Error in Execute DynamicOps.Common.Client.HtmlResponseException: Service Unavailable (503)
Error message: vCenter Server
Here is the VMware vCenter log entry related to the VMware vSAN issue:
A general system error occurred - PBM error occurred during PreCloneCheckCallback (2118557)
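If you want to check collected log files for these two symptoms, the following is a minimal Python sketch. It only searches text for the two messages quoted above. The function name `match_signatures` and the labels `vra_503` and `vcenter_pbm_preclone` are my own illustrative choices, not anything defined by VMware, and the sketch assumes you have already exported the relevant log text yourself.

```python
import re

# Signatures copied from the two log entries quoted in this post.
# The dictionary keys are hypothetical labels chosen for this example.
SIGNATURES = {
    "vra_503": re.compile(
        r"HtmlResponseException: Service Unavailable \(503\)"
    ),
    "vcenter_pbm_preclone": re.compile(
        r"PBM error occurred during PreCloneCheckCallback"
    ),
}


def match_signatures(log_text):
    """Return the sorted labels of all known signatures found in log_text."""
    return sorted(
        name for name, pattern in SIGNATURES.items() if pattern.search(log_text)
    )
```

Calling `match_signatures` on the text of an exported log returns an empty list when neither message is present, which makes it easy to use as a quick check after starting the lab.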
The solution is quick, but it is more of a workaround than a real fix, because the issue comes back every time I start up my lab environment.
- Open a web browser.
- Navigate to your vCenter Server URL (https://%vc%/vsphere-client).
- Log in with a user that has administrator privileges (email@example.com).
- Navigate to Hosts & Clusters > Select the vCenter Object.
- Click on the Configure tab.
- Click on Storage Providers.
- Click on the following two buttons:
  - Synchronizes all Storage Providers with the current state of the environment.
  - Rescan the storage provider for new storage systems and storage capabilities.
- After pressing the buttons, you won't see any tasks running on the vCenter Server (expected behavior). After about 5 seconds, everything should be working and provisioning should be possible again.
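
Because there is no visible task to watch after pressing the buttons, a scripted lab start-up may want to wait and re-check before it retries provisioning. Below is a minimal Python sketch of such a polling helper. It does not call any VMware API: `check` is a hypothetical callable that you would supply yourself (for example, a test deployment that returns `True` on success), and the default timeout and interval are assumptions, not values from VMware.

```python
import time


def wait_until(check, timeout=30.0, interval=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or `timeout` seconds have elapsed.

    `check` is a caller-supplied (hypothetical) readiness test.
    Returns True if check() succeeded, False if the timeout was reached.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

For example, `wait_until(my_check, timeout=30, interval=5)` would run the hypothetical `my_check` roughly every five seconds until it succeeds or the timeout passes.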