If the vSAN detects a failure to write to a drive, it may remove that drive from the vSAN even if the hardware sensors have not yet seen a condition that would cause them to mark the disk as faulted. If that disk is a cache drive, or if deduplication and compression are in use, the vSAN has to take the entire diskgroup offline. While this can lead to the above-mentioned conditions, it is not the underlying cause. The underlying cause is corrupted metadata, or disks that still carry partitions from their former configuration and so are never recovered to a state where they can be added back to the vSAN. This can also occur when something inadvertently overwrites disk metadata. The data is usually intact but no longer accessible, and the vSAN has to restore storage policy compliance with a resync.
A drive with this type of partition may believe it is still part of a diskgroup and show a cache drive where there should not be one. This cache drive lacks normal information such as the capacity or name (the naa identifier is missing). You cannot remove it, however, because the host thinks there is a drive present that is not mounted. You also cannot correct this by rescanning the storage controllers (this can cause a host crash) or by rebooting the host.
***Note: If the below steps do not correct the issue, the quickest and best resolution is usually to factory reset the host. If assistance is needed with performing the steps, or they have been tried but there are still problems, a Service Request is needed. Contact Dell Technical Support or your Authorized Service Representative, and quote this Knowledgebase Article ID.
Fix: Any "Not mounted..." drives must have their partitions removed or hidden and any ghost disks must be removed from the environment. If partitions are masked, this should still allow them to show up as 'Eligible for use by VSAN' again. Adding them to a diskgroup should wipe anything that was on them during the process. After fixing that, and removing any ghost disks, you may still need to reboot the host. This is done after everything is showing up properly on the host. In vCenter's Cluster > Configure > Disk Management area, you can create a diskgroup as normal.
Steps:
*It is best to place the host into Maintenance Mode (Ensure Accessibility) first, if possible, to protect data on the host from any mistakes or unexpected issues. Make sure that the rest of the VSAN is healthy as well. If there is a VSAN resync going on, this needs to complete before any disks or diskgroups with data on them can be removed from the VSAN.
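The precaution above can also be taken from the host's command line. The following is a minimal sketch, not a definitive procedure: the helper names run, enter_maintenance_mode and show_resync_status are hypothetical, and the resync check assumes a vSAN release that provides the "esxcli vsan debug" namespace. With DRY_RUN=1 the commands are only printed, which allows them to be reviewed before anything is changed.

```shell
# Sketch only: print each command, and run it unless DRY_RUN=1 is set.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Enter Maintenance Mode using the "Ensure Accessibility" vSAN option,
# then display the resulting maintenance mode state.
enter_maintenance_mode() {
    run esxcli system maintenanceMode set --enable true --vsanmode ensureObjectAccessibility
    run esxcli system maintenanceMode get
}

# Show whether a resync is still in progress (assumes the vsan debug
# namespace is available on this host).
show_resync_status() {
    run esxcli vsan debug resync summary get
}

# Example (review first, then run for real):
#   DRY_RUN=1 enter_maintenance_mode
#   show_resync_status
#   enter_maintenance_mode
```

If the resync summary still reports objects or bytes left to resync, wait for it to finish before removing any disks or diskgroups that hold data.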
1. Run the below command on the host (for example, in a PuTTY SSH session) and copy the output to a document. PuTTY is not required, but being able to copy and paste is helpful.
vdq -qH
2. Identify drives that are "Ineligible for Use by VSAN" AND either show "Reason: Not mounted on this host" or have nothing in the Name field (no naa).
3. Correct drives showing "Not mounted..." first:
a. Get the NAA of the ineligible disk from the output of "vdq -qH" on the host, then run this command to mask the partitions on the disk:
partedUtil mklabel /dev/disks/<naa.#'s> gpt
b. Run the below command again and ensure the drive now shows "Eligible for use by VSAN".
vdq -qH
*If not, reboot the host and then repeat the previous step. Attempt to remove ghost disks before rebooting to avoid a long boot process while the host initializes disks and vSAN services attempt to start.
4. Remove ghost disks. You can usually do this in the same Disk Management area. If not, use the command line on the host. Note the UUID of each disk without an naa name from your output in step 1, then run:
esxcli vsan storage remove -u <UUID>
5. Check that everything looks as expected. Refresh vCenter and check Disk Management again, and run "vdq -qH" on the host to ensure all expected drives appear and now show "Eligible for use by VSAN". If not, reboot the host, as some drives may not have been initialized yet, and check again.
6. Create the disk group or add disks to existing disk groups as normal (if using deduplication and/or compression, full disk group re-creation is needed).
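
When a host has many drives, the identification in step 2 can be done with a small filter over the saved "vdq -qH" output rather than by eye. The following is a rough sketch under stated assumptions: it assumes each disk is reported as a "DiskResult[n]:" block containing "Name:", "VSANUUID:", "State:" and "Reason:" lines, so compare it against your own output before relying on it. The function name find_problem_disks and the file path in the usage example are hypothetical.

```shell
# Sketch only: list disks that are ineligible AND either "Not mounted" or
# missing a name, from a file containing saved "vdq -qH" output.
# Output columns (tab-separated): name, VSAN UUID, reason.
find_problem_disks() {
    awk '
        # Return the text after the first colon, trimmed of whitespace.
        function value(line) {
            sub(/^[^:]*:[ \t]*/, "", line)
            sub(/[ \t\r]+$/, "", line)
            return line
        }
        # Print the current disk if it matches the step 2 criteria.
        function report() {
            if (state ~ /Ineligible/ && (reason ~ /Not mounted/ || name == ""))
                printf "%s\t%s\t%s\n", (name == "" ? "<no naa>" : name), uuid, reason
        }
        # A new disk block starts: flush the previous one and reset fields.
        /DiskResult\[/ {
            if (seen) report()
            seen = 1
            name = uuid = state = reason = ""
            next
        }
        /^[ \t]*Name:/     { name = value($0) }
        /^[ \t]*VSANUUID:/ { uuid = value($0) }
        /^[ \t]*State:/    { state = value($0) }
        /^[ \t]*Reason:/   { reason = value($0) }
        END { if (seen) report() }
    ' "$1"
}

# Example:
#   vdq -qH > /tmp/vdq.txt
#   find_problem_disks /tmp/vdq.txt
```

Disks listed with a name are candidates for step 3, and disks listed as "<no naa>" are ghost disks whose UUID is used in step 4. The filter only reads the saved file and changes nothing on the host.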