Tag: ITQ

VyOS Configuring Management VRF

In the latest release of VyOS, a new feature has been added to the product called VRF. VRF or Virtual Routing and Forwarding is a technology that makes it possible to create multiple routing tables on a single router. In this blog post, we are going to set up a VyOS management VRF for out-of-band management traffic.

VRF is a well-known technology for a lot of people in network land and is leveraged in companies all over the world. The only limitation was that VyOS was not capable of running a VRF before. So after the release of the VRF feature, it was time to figure out if it works as I would expect.

So what is a VRF?

I already talked a little bit about Virtual Routing and Forwarding, but here is the definition from Wikipedia:

“Virtual routing and forwarding (VRF) is a technology that allows multiple instances of a routing table to co-exist within the same router at the same time. One or more logical or physical interfaces may have a VRF and these VRFs do not share routes therefore the packets are only forwarded between interfaces on the same VRF. VRFs are the TCP/IP layer 3 equivalent of a VLAN. Because the routing instances are independent, the same or overlapping IP addresses can be used without conflicting with each other. Network functionality is improved because network paths can be segmented without requiring multiple routers.”

Goal

The goal for me was to create an out-of-band management interface on my virtual VyOS router that is running on VMware vSphere. This can only be achieved with the new VRF feature because you get an extra routing table that is used by the VRF only. The main reason for me was to split the SSH and SNMP traffic from the rest of the traffic. A dedicated interface improves security and makes creating firewall rules easier because all of the out-of-band interfaces are in one dedicated network.

Here is an overview of the vSphere VM running VyOS with two virtual network cards connected. As you can see one NIC is connected to a portgroup that allows multiple VLANs and the other is connected to a dedicated network for out-of-band management.

VRF Configuration

Now it is time to start configuring VyOS to leverage the VRF. The IP addresses that I have used as an example in this blog post are listed at the end of this section.

The first step is setting up an interface that will be leveraged by the VRF in the next part of the configuration.

### Create a new interface
set interfaces ethernet eth1 address 192.168.200.1/24

### Set interface description (optional)
set interfaces ethernet eth1 description 'Dedicated Out-of-Band Management Interface'

Now it is time to set up the VRF configuration and link it to the newly created interface. After that point, the VyOS Management VRF should be reachable in the network.

### Create a VRF called OOB-Management with a new routing table
set vrf name OOB-Management table 100

### Add a description
set vrf name OOB-Management description Out-Of-Band_Management

### Assign the physical interface to the VRF
set interfaces ethernet eth1 vrf OOB-Management

### Add a static route for the VRF to get access to a gateway
set protocols vrf OOB-Management static route 0.0.0.0/0 next-hop 192.168.200.254
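
If you need to repeat this on several routers, the same commands can be generated from a handful of parameters. Below is a minimal Python sketch of that idea. The function name `build_mgmt_vrf_commands` is made up for this example, and it only reproduces the command syntax shown in this post, which may differ in other VyOS releases.

```python
def build_mgmt_vrf_commands(vrf, table, interface, address, gateway):
    """Return the VyOS 'set' commands from this post for one management VRF.

    Hypothetical helper: it only formats strings, it does not talk to a router.
    """
    return [
        f"set interfaces ethernet {interface} address {address}",
        f"set vrf name {vrf} table {table}",
        f"set interfaces ethernet {interface} vrf {vrf}",
        f"set protocols vrf {vrf} static route 0.0.0.0/0 next-hop {gateway}",
    ]


# Values taken from the example configuration in this post.
for command in build_mgmt_vrf_commands(
    "OOB-Management", 100, "eth1", "192.168.200.1/24", "192.168.200.254"
):
    print(command)
```

The printed lines can be pasted into a VyOS configuration session, followed by the usual commit and save.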

Here are some troubleshooting commands that I used when configuring the VRF on VyOS.

### Routing table VRF
show ip route vrf OOB-Management

### Ping
ping 192.168.200.254 vrf OOB-Management

Now that it is up and running, it is time to set up the out-of-band management services. In my case, this will be SSH & SNMP. SSH is used for access to the command line of the VyOS router and SNMP is used for monitoring.

### SSH - Activate the service on the VRF
set service ssh vrf OOB-Management

### SSH - Activate the listening address for SSH on the Out-of-Band network
set service ssh listen-address 192.168.200.1

### SNMP - Activate the service on the VRF
set service snmp vrf OOB-Management

### SNMP - Add permissions
set service snmp community routers authorization ro
set service snmp community Public
set service snmp community routers client 192.168.200.20

### SNMP - Set the location and contact
set service snmp location "Be-Virtual.net - Datacenter"
set service snmp contact "admin@be-virtual.net"

### SNMP - Activate the listening address
set service snmp listen-address 192.168.200.1 port 161


Here is some information about my IP numbers:

  • VyOS IP Address for Out-of-Band Management = 192.168.200.1
  • Gateway of the Out-of-Band Management network = 192.168.200.254
  • Monitoring server that monitors with SNMP = 192.168.200.100
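
A static default route only works when the gateway sits in the same subnet as the VRF interface, so it can be worth checking the addressing plan before committing it. Here is a small sketch using Python's standard `ipaddress` module. The helper name `hosts_outside_network` is my own and not part of any VyOS tooling.

```python
import ipaddress


def hosts_outside_network(interface_cidr, hosts):
    """Return the hosts that are NOT inside the subnet of the given interface."""
    network = ipaddress.ip_interface(interface_cidr).network
    return [host for host in hosts if ipaddress.ip_address(host) not in network]


# Addresses from the list above: gateway and monitoring server.
oob_interface = "192.168.200.1/24"
peers = ["192.168.200.254", "192.168.200.100"]

# An empty list means every peer is directly reachable on the OOB subnet.
print(hosts_outside_network(oob_interface, peers))
```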

Wrap-up

The VRF feature that is added to VyOS is really great! It is a great addition to an already great product. There are a lot of use cases: think about multiple routers with different routing protocols running on a single VyOS box, each with their own routing table.

For me, the out-of-band management setup was an easy first step to test the VRF feature. This is just the first test of the VRF. The next step will be to connect it with my lab environment and leverage BGP. Currently, I am running multiple boxes for multi-site just to test VMware NSX-T in my lab environment. This can be simplified with VRFs!

Thanks for reading this blog post and see you next time. If you have any comments, please respond below! 🙂

vRealize Automation 7 – Creating Business Groups Automatically

In this blog post we are going to automatically create Business Groups in vRealize Automation 7.X. This can be handy when a customer has a lot of Business Groups and adds additional Business Groups over time. So it was time to write a little bit of code that makes my life easier.

I wrote it in the first place for using it in my lab environment to set up vRealize Automation 7.X quickly for testing deployments and validating use cases.

Advantages of orchestrating this task:

  • Quicker
  • Consistent
  • History and settings are recorded in vRealize Orchestrator (vRO)

Environment

My environment where I am testing this vRO workflow is my Home Lab. At home, I have a Lab environment for testing and developing stuff. The only products you need for this workflow are:

  • vRealize Automation 7.6 (vRA for short).
  • vRealize Orchestrator 7.6 (vRO for short).

Note: The vRealize Automation endpoint must be registered to make it work.

vRealize Orchestrator Code

Here is all the information you need for creating the vRealize Orchestrator workflow:

  • Workflow Name: vRA 7.X – Create Business Group
  • Version: 1.0
  • Description: Creating a vRealize Automation 7.X Business Group in an automated way.
  • Inputs:
    • host (vCACCAFE:VCACHost)
    • name (string)
    • adname (string)
  • Outputs:
    • None
  • Presentation:
    • See the screenshots below.

Here is the vRealize Orchestrator code in the Scriptable Task:

// Variables
var domain = "company.local";
var mailDomain = "company.com";

// Input validation
if (!domain) {
	throw "Defined variable 'domain' cannot be null";
}
if (!mailDomain) {
	throw "Defined variable 'mailDomain' cannot be null";
}
if (!host) {
	throw "Input variable 'host' cannot be null";
}
if (!name) {
	throw "Input variable 'name' cannot be null";
}
if (!adname) {
	throw "Input variable 'adname' cannot be null";
}

// Construct Group Object
var group = new vCACCAFEBusinessGroup();
	group.setName("BG-" + name);
	group.setDescription("vRA Business Group: BG-" + name);
	group.setActiveDirectoryContainer("");
	group.setAdministratorEmail("vra-admin" + "@" + mailDomain);
	group.setAdministrators(["vra-admin@vsphere.local", "vra_" + adname + "@" + domain]);
	group.setSupport(["vra-admin@vsphere.local", "vra_" + adname + "@" + domain]);
	group.setUsers(["vra_" + adname + "@" + domain]);

// Create the group; return the ID of the group.
var service = host.createInfrastructureClient().getInfrastructureBusinessGroupsService();
var id = service.create(group);

// Get the SubTenant entity from vRA
group = vCACCAFEEntitiesFinder.findSubtenants(host , "BG-" + name)[0];

// Add custom property to Business Group
vCACCAFESubtenantHelper.addCustomProperty(group, "Company.BusinessGroup", name, false, false);

// Create update client and save the local entity to the vRA entity
var service = host.createAuthenticationClient().getAuthenticationSubtenantService();
	service.updateSubtenant(group.getTenant(), group);
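
To make the naming convention in the Scriptable Task easier to review outside of vRO, here is a minimal Python sketch that derives the same values from the two string inputs. It is only a plain-data mirror of the script above: the function name `business_group_fields` and the dictionary keys are my own labels, and nothing here calls the vRA API.

```python
def business_group_fields(name, adname, domain="company.local", mail_domain="company.com"):
    """Derive the Business Group values used in the vRO script (illustration only)."""
    for label, value in (("name", name), ("adname", adname),
                         ("domain", domain), ("mail_domain", mail_domain)):
        if not value:
            raise ValueError(f"'{label}' cannot be empty")

    group_name = "BG-" + name                  # same prefix as in the workflow
    member = f"vra_{adname}@{domain}"          # AD principal for this group
    admins = ["vra-admin@vsphere.local", member]

    return {
        "name": group_name,
        "description": "vRA Business Group: " + group_name,
        "administratorEmail": "vra-admin@" + mail_domain,
        "administrators": admins,
        "support": list(admins),
        "users": [member],
    }


# "Finance" / "finance" are hypothetical input values.
print(business_group_fields("Finance", "finance"))
```

Checking the derived names this way before running the workflow helps catch a wrong prefix or domain early.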

Screenshots

Here are some screenshots of the workflow configuration that help you set up the workflow as I have done!

Wrap-up

This is a vRealize Orchestrator workflow example that I use in my home lab. It creates vRealize Automation Business Groups to improve consistency and speed.

Keep in mind: Every lab and customer is different. In this workflow I use, for example, the prefix BG- for Business Groups. What I am trying to say is: modify it in the way that is best suited for your environment.

Thanks for reading and if you have comments please respond below.

Synology DS1618+ Homelab Review

This blog post is about replacing my Synology DS1515+ with a Synology DS1618+. I was forced to replace my Synology DS1515+ because it fell victim to the Intel Atom bug twice. The Synology is used for my primary storage in my VMware Home Lab.

This blog post is a bit later than expected to be honest… I already swapped out the Synology NAS about eleven months ago! So this is going to be a review based on my eleven months of experience, with some information about why I bought the DS1618+ as a replacement.

Synology DS1515+ Atom Bug

In about six months, two Synology DS1515+ units passed away in my Home Lab because of a hardware issue. One day they are working as they should and the next day you come home and they are dead. No lights, no sound, nothing is working: “bricked”.

The Synology DS1515+ is not a bad device… but it is using the Intel Atom C2000 CPU that is notorious for failing because it has an internal fault.

To be clear, it is not the fault of Synology… A lot of other vendors are also dealing with the Intel Atom C2000 fallout, like Asrock, Cisco, HP, Netgear, Supermicro, and the list goes on. Here is an article from The Register with some more information surrounding this topic.

That is enough about the old let’s move on to the new!



Synology DS1618+ Setup

Here is an overview of the current Synology DS1618+ setup in my Home Lab environment. I have created two LACP bonds to load balance iSCSI traffic from VMware ESXi on two dedicated VLANs.

  • Synology DS1618+ (default 4 GB memory/upgraded to 32 GB)
  • Storage pool 1: 2x Samsung EVO 850 500 GB – RAID 1
  • Storage pool 2: 2x Samsung EVO 860 500 GB – RAID 1
  • Storage pool 3: 2x Samsung EVO 860 500 GB – RAID 1
  • Network: 2x 1 Gbit LACP and 2x 1 Gbit LACP

All three storage pools represent a VMware Datastore and are made available with iSCSI to the VMware Hosts.

Here is an image that illustrates the current storage setup of my Home Lab environment. Nothing too fancy, all ports in the illustration are 1 Gbit.

Performance

Let’s start by looking at the Synology DS1618+ performance! This is an important aspect in my environment: it is not the size that matters but the speed!

Network

I have moved my SSD drives from the Synology DS1515+ to the Synology DS1618+ and the performance is identical… Say what? This is because they are limited by the same issue! Both devices are running into the network bandwidth limitation.

Both devices are delivered out of the box with 4x 1 Gbit network interfaces, which the three SSD storage pools that I have installed can easily saturate.

Luckily the DS1618+ has an expansion slot, which is something the DS1515+ does not have! You can install a 10 Gbit network card which will improve the bandwidth drastically!
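
To put the network limitation in numbers, here is a small Python sketch that converts link speed into raw throughput. It assumes decimal units and ignores protocol overhead (Ethernet, TCP/IP and iSCSI all take their share), so real-world figures will be lower. The function name `link_mb_per_s` is just for this example.

```python
def link_mb_per_s(gbit, links=1):
    """Raw line rate in megabytes per second (decimal units, no protocol overhead)."""
    return gbit * links * 1000 / 8


# All four onboard 1 Gbit ports combined.
print(link_mb_per_s(1, links=4))

# A single 10 Gbit port on an expansion card.
print(link_mb_per_s(10))
```

Even with all four onboard ports combined, the raw line rate stays well below what a single 10 Gbit port offers.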

Memory

I already covered the memory issues/limitations in another blog post. Here is a reference to that blog post on my website.



Power Usage

As with all my Home Lab devices, I like to know the power usage of each device. Synology indicates the following power consumption values on their website:

Factory measurements | Wattage
Power Consumption – HDD Hibernation | 25.76 Watt
Power Consumption – Access | 56.86 Watt

I have tested this with my power meter. In my case, the system was booted up and was supplying two ESXi Hosts with storage and a total of fourteen active virtual machines. The room temperature was 20 degrees Celsius. I personally think 21.1 watts is not bad at all 🙂 especially compared to the DS1515+ that was using 25.3 watts with two fewer drives!
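
For a device that runs 24×7, it helps to translate watts into energy per year. Here is a quick Python sketch using the two measured values above. It assumes a constant draw all year round, which is a simplification.

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh(watts):
    """Energy use in kWh per year for a constant power draw in watts."""
    return watts * HOURS_PER_YEAR / 1000


ds1618 = annual_kwh(21.1)  # measured DS1618+ draw
ds1515 = annual_kwh(25.3)  # measured DS1515+ draw

print(f"DS1618+: {ds1618:.1f} kWh/year")
print(f"DS1515+: {ds1515:.1f} kWh/year")
print(f"Difference: {ds1515 - ds1618:.1f} kWh/year")
```

Multiply the result by your own price per kWh to get the yearly cost.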

Tips

Here are some tips I have learned so far about the Synology DS1618+ unit:

  • If you are in need of performance, install a 10 Gbit expansion card in the expansion slot of the DS1618+, especially when using all-flash storage! This will easily outperform the out-of-the-box network cards (4x 1 Gbit).
  • Install as much memory as you can in the device. This will reduce the disk swapping of the Synology OS and increase the performance and stability of the running virtual machines. Here is my blog post about this issue.
  • I have performed some tests with an SSD cache drive in front of a storage pool that was also SSD-based. This did not improve performance much (a maximum of about 5% in total, which is quite low if you ask me). If you are interested in a cache drive, look at the NVMe expansion card, but beware: you only have one slot, so you go with either an NVMe expansion card or a 10 Gbit NIC. Choose wisely depending on your requirements.

If you have some additional tips for people who are interested in a DS1618+, please respond below!

vRealize Automation 8 Changing Product License

After a recent deployment in my Lab environment with a new vRealize Automation 8 installation I figured out that my NFR product license was about to expire within a week. So it was time to change the product key on my running environment. Here is a write-up to change the license in vRealize Automation 8 with a standard installation (standalone-node) that is running with an Enterprise license.

Keep in mind: as explained in the vRealize Automation 8 release notes you cannot change the version of the license “After configuring vRealize Automation with the Enterprise license, the system can not be re-configured to use the Advanced License.“.

Connecting with vRA8

Start a connection with the vRealize Automation 8 appliance to get shell access to the system. I like to use Putty but you can use any terminal emulator you prefer that supports SSH.

Procedure:

  1. Start a terminal emulator like Putty on your desktop.
  2. Connect with the FQDN/hostname of the vRealize Automation 8 Appliance.
  3. Login with the root account.


Viewing product license

To validate the currently installed license key on the vRealize Automation 8 appliance you need to enter the following command “vracli license current“. Here you can find a screenshot of the output in my lab environment (keep in mind multiple lines are hidden):

Installing product license

To install a new license in vRA8 you need to perform some steps on the command line.

In this example we are changing the product license from one license key to the other:

  • New license key: AAAAA-AAAAA-AAAAA-AAAAA-AAAAA
  • Old license key: ZZZZZ-ZZZZZ-ZZZZZ-ZZZZZ-ZZZZZ

### List current license installed
vracli license current

### Install new license
vracli license add AAAAA-AAAAA-AAAAA-AAAAA-AAAAA

### Remove old license
vracli license remove ZZZZZ-ZZZZZ-ZZZZZ-ZZZZZ-ZZZZZ

### Reboot the appliance to apply the license change
reboot
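
If you prefer to script the swap, the order of the commands matters: add the new key before removing the old one. Below is a minimal Python sketch that only builds the command list in that order. The function name `license_swap_commands` is hypothetical, the extra `vracli license current` at the end is my own addition to verify the result, and the reboot is deliberately left out so you can run it when it suits you.

```python
def license_swap_commands(new_key, old_key):
    """Return the vracli commands from this post in a safe order (add, then remove)."""
    if new_key == old_key:
        raise ValueError("new and old license keys are identical")
    return [
        ["vracli", "license", "current"],
        ["vracli", "license", "add", new_key],
        ["vracli", "license", "remove", old_key],
        ["vracli", "license", "current"],  # assumption: re-check after the change
    ]


# Placeholder keys from the example above.
for command in license_swap_commands(
    "AAAAA-AAAAA-AAAAA-AAAAA-AAAAA", "ZZZZZ-ZZZZZ-ZZZZZ-ZZZZZ-ZZZZZ"
):
    print(" ".join(command))
```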

Wrap-up

I wrote this small blog post about changing the vRealize Automation 8 product license on a running system because there was no procedure available in the official documentation. I have not tested this procedure yet on a clustered deployment with three vRealize Automation 8 appliances. This might behave differently.

Be aware: I have tested this procedure on vRealize Automation 8.0.1 Hot Fix 1. The result may differ on another hotfix or version because of the ongoing product evolution.

Thanks for reading this blog and see you next time!

VMware vRealize Log Insight content pack for Cisco ASA

In this blog, we are going to set up the VMware vRealize Log Insight content pack for a Cisco ASA device for capturing syslog information. By setting up this pack we are able to provide a central location for storing the logging information and a way to retain the data for longer periods of time.

Almost a year ago I moved from pfSense to a physical Cisco ASA firewall and it was time to improve the visibility into the firewall rules that were blocking and allowing traffic in my network. This was a nice opportunity to configure VMware vRealize Log Insight with an additional content pack.

Environment

When I was writing this blog post I was using the following software releases:

In essence, the procedure is the same for older and newer versions of Log Insight and a Cisco ASA.

Log Insight Content Pack

Let’s start by installing the content pack on vRealize Log Insight. Make sure you install the Cisco ASA content pack for vRealize Log Insight. This can be found in the VMware marketplace that is available in the central VMware vRealize Log Insight interface.

Here is a screenshot with the location where you can find the content pack:



Cisco ASA Configuration

Log in to your Cisco ASA firewall with a console or SSH session and configure the syslog settings as displayed below. Keep in mind this is an example configuration, so change the config based on your needs!

Here is a basic configuration example:

config t
  logging enable
  logging timestamp
  logging trap debugging
  logging host %interface% %ip-address_syslog_facility%
exit
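
The two placeholders in the example need to be replaced with your own interface name and Log Insight address. As an illustration, here is a Python sketch that fills them in and validates the IP address first. The function name `asa_syslog_config` and the values `inside` and `192.168.1.50` are made up for this example, and the sketch only renders the plain interface/IP form of the `logging host` line.

```python
import ipaddress


def asa_syslog_config(interface, syslog_ip, level="debugging"):
    """Render the example ASA syslog configuration from this post as text."""
    ipaddress.ip_address(syslog_ip)  # raises ValueError for an invalid address
    return "\n".join([
        "config t",
        "  logging enable",
        "  logging timestamp",
        f"  logging trap {level}",
        f"  logging host {interface} {syslog_ip}",
        "exit",
    ])


# Hypothetical interface name and Log Insight IP address.
print(asa_syslog_config("inside", "192.168.1.50"))
```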

To verify the status of the configuration run the following commands:


### Show configuration and logging forwarding status
show logging

### View configuration
show run | grep logging

Here is an example output of my Cisco ASA:



Viewing information

After everything has been set up the dashboards will be populated with information received from the Cisco ASA.

Here are some screenshots from my environment:

Here are some useful examples of what kind of information you can expect from the Cisco ASA Content Pack for vRealize Log Insight. I personally think it is one of the best free content packs because the dashboards are really good at providing a lot of information with good solid diagrams.

Synology DS1618+ Memory Expansion

After replacing the Synology DS1515+ with a Synology DS1618+ last year it was time for another investment in the Synology DS1618+. Overall it is a great device that is running my VMware iSCSI storage for my ESXi Hosts but based on some metrics the memory was experiencing some issues, so it was time for a memory expansion!

The reason why I am expanding my memory is physical memory swap usage. Based on my monitoring tooling the system is swapping to disk, and when that happens the storage latency increases considerably on the iSCSI volumes (latency spikes of 2500 ms / 2.5 seconds). The hypervisor and virtual machines survive but they don’t experience it as a good thing ;).

After a good session with a Synology Engineer at VMworld 2019 Europe, he explained that the storage latency I am experiencing multiple times a day must be caused by the swapping to disk and refreshing the read-cache in the physical memory. Synology is using physical memory as a read cache to boost performance by default.

Synology Statement

Here is the official statement from Synology surrounding performance and memory: “Memory usage remains high because the system stores frequently accessed data in the cache, so the data can be quickly obtained without accessing the hard disk. Cache memory will be released when the overall memory is insufficient. High swap space usage indicates insufficient system memory, and will also affect the system performance. You can view the rate of swap in and swap out by choosing Swap from the drop-down menu on the top.”

To clarify, my Synology DS1618+ is only running iSCSI storage with two volumes with both SSD drives in RAID1. The only services that are enabled are SSH, SNMP and of course iSCSI. The machine has no other purpose!



Memory Swap

The metric here shows the free swap space: 100% means the swap is completely empty, so no swap usage. 0% means that all swap is allocated/completely full.

As you can see in the graph, the value hovered around 92% over the last months, so there was always some swap in use. On 03-09-2020 / 03-10-2020 I installed the 32 GB DDR4 memory in the system and since then it is a steady 100% (so no swap in use).
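
For anyone who wants to reproduce a "swap free" percentage like this on a Linux-based system, here is a minimal Python sketch that calculates it from `/proc/meminfo`-style text. This is an assumption about how such a metric can be derived, not a description of how my monitoring tooling does it, and the sample values are made up to land around the 92% mentioned above.

```python
# Made-up sample in /proc/meminfo format (values in kB), for illustration only.
SAMPLE_MEMINFO = """\
MemTotal:       32768000 kB
SwapTotal:       4194304 kB
SwapFree:        3858760 kB
"""


def swap_free_percent(meminfo_text):
    """Return free swap as a percentage: 100 means no swap is in use."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    total = values["SwapTotal"]
    if total == 0:
        return 100.0  # no swap configured, so nothing can be in use
    return 100.0 * values["SwapFree"] / total


print(f"{swap_free_percent(SAMPLE_MEMINFO):.1f}% swap free")
```

On a real system you would pass in the contents of `/proc/meminfo` instead of the sample text.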

Memory Expansion

I bought the following kit from Crucial: 32GB Kit (2 x 16GB) DDR4-3200 SODIMM with part number “CT2K16G4SFD832A“. This is suitable memory for the DS1618+, as you can verify on the Crucial website. The memory configurator tool can be found over here: Crucial Advisor Tool.

Luckily expanding the memory in a Synology DS1618+ is quite easy! I created a brief write-up and some photos are located below.

Procedure:

  1. Power off the VMware workload.
  2. Power off the Synology NAS.
  3. Remove the DS1618+ from its rack/shelf.
  4. Flip the device, the memory hatch is located on the bottom.
  5. Remove the two screws.
  6. Open the hatch.
  7. Remove the original memory.
  8. Install the new memory.
  9. Close the hatch.
  10. Install the two screws.
  11. Install in the rack.
  12. Power on the system (the first time booting will be longer than normal. The DS1618+ is performing a memory check, in my case, it took about 15 minutes).
  13. Power on workload.

ProLiant ML10 v2 CPU Swap

In this blog post, I am talking about the HPE ProLiant ML10 v2 home lab servers that I have been using for the last three years. I had some performance issues related to the processor with the number of virtual machines and containers running on the little ML10 v2 servers. So it was time for a CPU Swap!

On the internet, there is a lot of speculation about which CPUs are supported in the HPE ProLiant ML10 v2. That is why I wrote this blog post.

The servers were originally bought with Intel® Pentium® Processor G3240 CPUs. This was the smallest CPU available at the time. At first, I was looking at the Intel Xeon E3-1220 v3 CPUs but I decided to buy the Intel® Core™ i3-4170 Processor on Ebay.com for a couple of bucks. The choice was related to the pricing difference and the amount of power usage.

I can confirm that both HPE ProLiant ML10 v2 servers detected the i3-4170 CPUs without any issues. The systems are running 24×7 and the CPU temperature is around forty to fifty degrees with the fans running on their lowest operating mode.

Comparison

As you already figured out, the G3240 is a slow CPU compared to the i3-4170. So the upgrade was well worth the investment at about 40 euros for both CPUs in total.

The hypervisor (VMware ESXi) and workload performance improved drastically because of the additional instruction sets like AES-NI and the higher clock speed. So it was a good investment, at least in my opinion.

Here is a comparison provided by the Intel ARK website. Click here for the link.



Screenshot(s)

Here are some screenshots of one of the HPE ML10 v2 servers that was upgraded with the new CPU. As you can see the screenshots are from the HPE Integrated Lights-Out (iLO). The first screenshot is of the new CPU that is detected, the second one is the memory configuration and the third screenshot is the operating temperatures after running a couple of days with the workload.

Result

As you can see the Intel i3-4170 CPU is working without any issues in the ML10 v2 server. Currently, they have been running for about 100 days without any reboot. So I can confirm they are stable and do not overheat! The CPU swap is successful!

Notes:

  • I use stock cooling.
  • I do not use a modified BIOS.

Thanks for reading and see you next time!

Dell EMC VxRail NSX-T Considerations

Currently, I am involved in a Dell EMC VxRail design & deployment with VMware Cloud Foundation on Dell EMC VxRail. There are some notable items that you need to consider when using the Dell EMC VxRail as your hardware layer in combination with VMware NSX-T as a network overlay. So it was time to write down the items that I have learned so far surrounding the VxRail NSX-T considerations.

This blog post is focused on the NSX design considerations that are related to the physical level when using the Dell EMC VxRail hardware.

First, I am going to talk about VMware NSX-V because a lot of customers are already running Dell EMC VxRail in combination with NSX-V and need to move to NSX-T at some point.

VMware NSX-V

If you are already using Dell EMC VxRail with VMware NSX-V, your physical NIC configuration would in most cases look like one of the following:

  • Scenario 01: Dual port physical NIC – 10 Gbit
  • Scenario 02: Dual port physical NIC – 25 Gbit

The default configuration that I see in the field at this moment is based on a single dual-port card with either 10 Gbit or 25 Gbit. This is fine for VMware NSX-V but not for its replacement…



VMware NSX-T

When using Dell EMC VxRail with VMware NSX-T you are required to use four physical NICs! This is because of the limitation surrounding the Dell EMC VxRail software that makes a “PowerEdge server” a “VxRail server”.

The first official Dell EMC statement from their VMware Cloud Foundation on VxRail Architecture Guide: “NSX-T based VI WLD will require additional uplinks, whatever uplinks were used to deploy the VxRail vDS cannot be used or the NSX-T N-VDS“.

The second official Dell EMC statement from their VMware Cloud Foundation on VxRail Architecture Guide: “Note: NSX-T will use the next two available vmnics that are both the same speed for every node in the cluster“.

So this leaves us with three scenarios provided by Dell EMC for the VxRail nodes:

  • Scenario 01: Quad-port physical NIC
  • Scenario 02: Quad-port physical NIC (two ports used) with dual-port physical NIC
  • Scenario 03: Dual-port physical NIC with dual-port physical NIC.


Advice

Dell EMC VxRail is the only hardware platform currently on the market that requires four physical NICs to operate with NSX-T. This means you have to make sure your hardware and datacenter are capable of supporting this requirement. You need to make some choices surrounding the physical network cards, network capacity and datacenter rack space.

So let’s start with my list of VxRail NSX-T considerations!

Physical Network Card

When you are at the point of buying the Dell EMC VxRail solution, buy at least a quad-port NIC configuration. Personally, I prefer the double dual-port NIC setup, as shown below:

I prefer this hardware setup because of the hardware redundancy created by two cards with their separate chips and PCIe slots. This reduces the chance of losing all your network connections when a physical NIC dies.

Another recommendation is to buy physical NICs that support 25 Gbit. It is a minimal price difference and will make the setup more future proof.

Top of Rack (TOR)

As discussed in the last paragraph: when you move to VMware NSX-T you are forced to use four physical NICs in each VxRail node. After installing the card you need to make sure you have enough physical ports in your Top of Rack switches/Leaf switches.

At the customer where I am currently working, they are forced to increase their Top of Rack switch capacity from two ports per server with NSX-V to four ports per server with NSX-T. This meant a full redesign of their datacenter rack topology and network topology. The spine switches were also not able to connect to that number of leaf switches.

Keep in mind: This is only an issue of course when you are running a decent number of servers per rack. In this customer's case, they are running 32 VxRail nodes per rack. This means they require at least 128 physical switch ports per rack, not counting uplink ports.
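
The port count is simple arithmetic, but it is worth running for your own rack density before ordering hardware. Here is a short Python sketch. The function name `switch_ports_per_rack` is my own, and uplink ports are left out just like in the numbers above.

```python
def switch_ports_per_rack(nodes, nics_per_node):
    """Server-facing Top of Rack ports needed for one rack (uplinks not included)."""
    return nodes * nics_per_node


nodes_per_rack = 32  # the customer case described above

print("NSX-V:", switch_ports_per_rack(nodes_per_rack, 2))
print("NSX-T:", switch_ports_per_rack(nodes_per_rack, 4))
```

Going from two to four NICs per node doubles the server-facing port requirement, which is what forced the rack redesign in this case.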

Here is an overview of the scenarios as just described, the first is the NSX-V scenario and the second the NSX-T scenario.

Near future

I know that VMware & Dell EMC are currently working on a solution for the VxRail hardware but time will tell. At this point keep your eyes open when moving from NSX-V to NSX-T with Dell EMC VxRail. Customers who are deploying greenfield also need to be aware that they need additional network capacity.

So that wraps up my VxRail NSX-T Considerations blog post. Thanks for reading my blog post and see you next time!

VCSA 6.7 Out of Space (SEAT)

Today I was greeted by the following error message when logging into the VMware vCenter Server also known as VCSA: “Could not connect to one or more vCenter Server systems: https://%fqdn%:443/sdk“. So it was time for a quick write-up on how to resolve this issue.

The issues were already present a couple of hours earlier based on monitoring and logging. For example, Veeam Backup & Replication tried to perform a backup but failed because there were no vSphere Tags available. Veeam Backup & Replication generated the following message “Tag Backup SLA – Bronze is unavailable, VMs residing on it will be skipped from processing.“.

I’m running VMware vCenter Server as a VCSA 6.7 appliance with an embedded Platform Services Controller. The exact version of the appliance at the moment of the issue was “6.7.0.32000 – Build 14070457“.

Could not connect to one or more vCenter Server systems:

At first glance everything looks fine, the web-interfaces are online, authentication is working but after login, the following message appears “Could not connect to one or more vCenter Server systems: https://%fqdn%:443/sdk“. None of the pages are displaying any content. Here is a screenshot:



After performing a simple reboot nothing had happened, the result was the same. So it was time to dig deeper. Luckily the reboot did trigger a new event in the Appliance Management Page (5480). It appeared that the /storage/seat disk had filled up. The alert that popped-up was “File system /storage/seat is low on database storage space. Increase the size of disk /storage/seat or decrease the data retention.” Here is a screenshot:



Increasing Disk Space

After finding the error message it appeared to be an easy fix. Here is an overview of the commands I used. The commands are also usable for expanding one of the other VCSA virtual disks.

Keep in mind: before increasing disk capacity make sure you have a backup or snapshot available.

In this case we are going to expand the /storage/seat volume. The seat volume is responsible for Stats, Events, Alarms, and Tasks (SEAT) for VMware Postgres (Database).

# Step 01: Connect with the vCenter Server with an SSH Session (use for example Putty).

# Step 02: Login with the root account (root/your-password).

# Step 03: Enable the shell
shell

# Step 04: Run the command to verify the current disk space:
df -h

# Step 05: Increase disk capacity with the Host Client because the vCenter Web-interface is not working ;) (see screenshots)

# Step 06: Run the disk expansion command, the expected output should be: VC_CFG_RESULT=0
vpxd_servicecfg storage lvm autogrow

# Step 07: Verify the disk again, the disk should be bigger!
df -h

# Step 08: Reboot the VCSA
reboot

# Step 09: Verify the working of the VCSA Appliance after reboot.
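
To catch a filling disk before the web interface breaks, you could check the usage from a script. Below is a minimal Python sketch using the standard `shutil.disk_usage` call. The function names and the 90% threshold are assumptions for this example, and the percentage can differ slightly from what `df -h` reports because `df` accounts for reserved blocks differently.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem containing 'path' is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path, threshold=90.0):
    """True when the usage is at or above the (assumed) warning threshold."""
    return usage_percent(path) >= threshold


# "/" is used here so the sketch runs anywhere; on the VCSA you would
# point it at /storage/seat instead.
print(f"{usage_percent('/'):.1f}% used, nearly full: {is_nearly_full('/')}")
```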

Here is a collection of screenshots of me performing the procedure.

Conclusion

VMware made it easy for the system administrators to identify the issue and quickly expand the virtual disk from the vCenter Appliance. This is a huge improvement compared to the past. The only thing you need to watch out for is the number of virtual disks connected to the VCSA. If you do not watch out you could expand the wrong disk.

The disk filled up for two reasons in my case. 1) I created and destroyed lots of virtual machines in the days before the incident. 2) The VCSA is configured as a tiny footprint so that is why the disks are relatively small.

So this was the write-up! If you have any comments or questions, please respond in the section below.