Couldn’t format VMFS on nested ESXi (VSAN datastore)

I was working on a blog post and needed a nested lab for it, so I deployed one on my newly created 2-node VSAN cluster. I used the templates from the Content Library that William Lam provides, because in the near future I would like to test nested VSAN, so this is a very good starting point. I only needed two hosts, so I used the 3-node template and installed just these two hosts. When I tried to deploy something on these hosts I saw that I needed some storage first, so I decided to use the 8 GB disk, which is normally used as the magnetic disk (MD) in a nested VSAN deployment, and format it with VMFS. Unfortunately I received the following error:
VI Client:

Call “HostDatastoreSystem.CreateVmfsDatastore” for object “datastoreSystem-9” on vCenter Server “172.16.70.100” failed.
Operation failed, diagnostics report: Unable to create Filesystem, please see VMkernel log for more details: ATS on device /dev/disks/naa.6000c293ef85cb0f741b3b643b8e9765:1: not supported.

WebClient:

The “Create VMFS datastore” operation failed for the entity with the following error message.
An error occurred during host configuration.
Operation failed, diagnostics report: Unable to create Filesystem, please see VMkernel log for more details: ATS on device /dev/disks/naa.6000c293ef85cb0f741b3b643b8e9765:1: not supported.
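
Since both errors complain about ATS not being supported, a quick sanity check from the nested host's shell is to look at the VAAI status the host reports for the device. This is just a diagnostic sketch using the device ID from the error above; it is not required for the fix described later:

esxcli storage core device vaai status get -d naa.6000c293ef85cb0f741b3b643b8e9765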

I also got some errors in the vmkernel.log:

NMP: nmp_ThrottleLogForDevice:3298: Cmd 0x1a (0x439d80256e80, 0) to dev “mpx.vmhba32:C0:T0:L0” on path “vmhba32:C0:T0:L0” Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
FSS: 5334: No FS driver claimed device ‘naa.6000c29f45e078d5420c002b734f4b7c:1’: No filesystem on the device
FSS: 5334: No FS driver claimed device ‘mpx.vmhba32:C0:T0:L0’: No filesystem on the device
VC: 3551: Device rescan time 5 msec (total number of devices 5)
VC: 3554: Filesystem probe time 12 msec (devices probed 5 of 5)
VC: 3556: Refresh open volume time 0 msec
FSS: 5334: No FS driver claimed device ‘naa.6000c293ef85cb0f741b3b643b8e9765:1’: No filesystem on the device
FSS: 5334: No FS driver claimed device ‘naa.6000c29f45e078d5420c002b734f4b7c:1’: No filesystem on the device
ScsiDeviceIO: 2651: Cmd(0x439d80255e00) 0x16, CmdSN 0xcf7 from world 0 to dev “naa.6000c293ef85cb0f741b3b643b8e9765” failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
LVM: 9294: LVMProbeDevice failed on (3303172960, naa.6000c293ef85cb0f741b3b643b8e9765:1): Device does not contain a logical volume

This problem was really weird, so I played around with different controllers and other disks, but the problem persisted. I also did a long search through Google and VMTN, and then I found a hint on the #vBrownbag site. Cody Bunch had also run into the problem that he couldn't format a disk in a nested ESXi host. In his case the disk was located on an NFS datastore. I don't have an explanation for why it was a problem back then, but the post is already 6 years old and a lot has changed since. Still, I used this information and moved the nested ESXi host to local storage. After that it was no problem to format the nested disk with VMFS. I also moved the nested host to NFS, and there again it was no problem.
So the problem had to be the VSAN datastore. But how and why? I thought it was time to ask the question on Twitter, and I got feedback really fast.
William Lam wrote a post in response to Tim Smith's blog post about the problem that he couldn't install a nested ESXi host on top of a VSAN datastore. I assume the problem there was that both used an install target large enough that the installer wanted to create a VMFS datastore on it. In my case the template from William only uses a 2 GB disk for installing ESXi, so the problem didn't show up during installation. That is also the reason why I didn't find the post and the solution earlier.
Here is a quote from William's post explaining why you can't format a nested disk on a VSAN datastore:
The problem is with a SCSI-2 reservation being generated as part of creating a default VMFS datastore. Even though VMFS-5 no longer uses SCSI-2 reservations, the underlying LVM (Logical Volume Manager) driver for VMFS still requires it. Since VSAN does not make use of SCSI-2 reservations, it did not make sense to support it and hence the issue.
I took two screenshots, one before the VMFS format and one after the failed format.
[Screenshot vmfs_vsan_01: disk info before the VMFS format]
[Screenshot vmfs_vsan_02: disk info after the failed VMFS format]
The first screenshot shows the disk info before the VMFS format took place. The second shows that after the failed VMFS format the disk already has a VMFS file system. The vmkernel.log also shows this error:
LVM: 9294: LVMProbeDevice failed on (3303172960, naa.6000c293ef85cb0f741b3b643b8e9765:1): Device does not contain a logical volume
All of this is evidence that the VMFS file system is created, but LVM can't complete the process of mounting it on the ESXi host.
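
If you want to verify this on the nested host yourself, two commands worth trying (a sketch, using the same device ID as above) are partedUtil, which should show the freshly created VMFS partition in the partition table, and the filesystem list, which should not contain the volume because the mount never completed:

partedUtil getptbl /vmfs/devices/disks/naa.6000c293ef85cb0f741b3b643b8e9765
esxcli storage filesystem list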
To fix this behaviour, run the following esxcli command on the console or shell of the physical ESXi host where the nested host is running.

esxcli system settings advanced set -o /VSAN/FakeSCSIReservations -i 1

You don't need to reboot the host; the setting takes effect instantly. The only drawback is that even after running the command you don't see it under Advanced Settings, nor when running esxcli system settings advanced list | grep VSAN.
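
If you only needed the workaround to format the nested disks, you should be able to switch it off again afterwards with the same command and a value of 0 (I assume 0 is the default, since the option is not enabled out of the box):

esxcli system settings advanced set -o /VSAN/FakeSCSIReservations -i 0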
