Failed updates/configuration persistence with vSphere 7.0u1 on USB install

Tim Borland
Nov 29, 2020


Do you or a loved one suffer from failed vSphere Lifecycle Manager patches, or Groundhog Day on your hosts? If so, you may be entitled to unending frustration!

But in all seriousness, don’t go through what I did… please… I beg you.

After upgrading my homelab from vSphere 6.7 to 7.0 U1 (via fresh install), I was unable to apply any subsequent patches or persist configuration changes on my hosts. I put this on the back burner for a bit, but with the recent release of vSphere 7.0 U1b… it was time to solve it.

So how painful was it?

Well, other than having to re-configure hosts every time they rebooted…

I originally built my cluster with the new image management feature, and was unable to apply any new image profiles. After a few attempts, I decided to create a new cluster and migrate all my hosts to it. Then I found the next surprise… updates failed with standard baselines!

By this point I assumed it was related to the 4GB flash drive I had installed to, so I purchased some 64GB flash drives and waited. The day arrived: I backed up my host config and reinstalled ESXi 7.0 U1… and no luck.
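If you want to take the same kind of config backup before a reinstall, ESXi's built-in vim-cmd can do it from a shell. A minimal sketch (the restore bundle path below is illustrative; yours will differ):

    # Flush recent config changes to disk, then generate the backup bundle;
    # backup_config prints a URL you can download the bundle from
    vim-cmd hostsvc/firmware/sync_config
    vim-cmd hostsvc/firmware/backup_config

    # After reinstalling: enter maintenance mode, copy the bundle back to the
    # host, and restore (the host reboots itself once the restore completes)
    vim-cmd hostsvc/firmware/restore_config /tmp/configBundle.tgz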

Maybe it was an issue in the restored config? So I reinstalled yet again and configured the host manually. Time to update… and denied!

Then I rebooted the host, and noticed that when it came back online it was no longer in maintenance mode! Why would a reboot drop the host out of maintenance mode?
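As a quick sanity check, you can ask the host directly what it thinks its maintenance mode state is; a minimal sketch using esxcli from a shell:

    # Report the current maintenance mode state (Enabled/Disabled)
    esxcli system maintenanceMode get

    # Put the host back into maintenance mode if it dropped out
    esxcli system maintenanceMode set --enable true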

So what happened?

If you see an error like the one below when trying to update your host, read on!

So something useful will be in the logs, right?

“esximage.Errors.StatelessError: The transaction is not supported:”

Wrong. This isn’t a stateless install, so why is it failing?

A look at the filesystem shows /bootbank pointing to a temporary location, and no /altbootbank!
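If you want to check your own host over SSH, this class of update error typically lands in /var/log/esxupdate.log, and the bootbank symlinks live at the root of the filesystem. A rough sketch (the output in the comments illustrates a broken host, not verbatim text):

    # Where esximage errors like the StatelessError above usually end up
    grep -i statelesserror /var/log/esxupdate.log

    # Inspect where the bootbank symlinks point
    ls -l / | grep bootbank
    # On a broken host: bootbank -> a temporary path under /tmp,
    # and altbootbank is missing entirely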

Well, it isn’t supposed to be stateless…

So how do we fix it?

It appears that during boot, the VMFS partition isn't found and mounted before the init process moves on. That leaves /bootbank mapped to a temporary location (/tmp/_random_bootbankid), and no /altbootbank at all!

I remembered seeing something similar a while back on SAN boot disks… and figured why not try it here?

The solution is to give the host more time to identify and mount the VMFS partition:

devListStabilityCount=20
  • Enable SSH on the host
  • Modify /vmfs/volumes/BOOTBANK1/boot.cfg and /vmfs/volumes/BOOTBANK2/boot.cfg, appending “devListStabilityCount=20” to the kernelopt line (a sketch of the full edit follows this list). Or see the link below to find better timings for your environment!
  • Reboot & profit!
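For completeness, here's a rough sketch of the whole procedure from a shell session. Your existing kernelopt contents will differ, so append to the line rather than replacing it (the before/after values below are illustrative):

    # Enable SSH if it isn't already (also doable from the Host Client UI)
    vim-cmd hostsvc/enable_ssh

    # Append the option to the kernelopt line in both bootbank copies, e.g.:
    #   before: kernelopt=autoPartition=FALSE
    #   after:  kernelopt=autoPartition=FALSE devListStabilityCount=20
    vi /vmfs/volumes/BOOTBANK1/boot.cfg
    vi /vmfs/volumes/BOOTBANK2/boot.cfg

    # Reboot for the new boot option to take effect
    reboot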
Houston, we have bootbanks!

Don’t want to take my word for it?

After discovering the solution and searching the web for it… I found this.

Pardon me while I go enjoy a victory coffee as my cluster remediates!
