Thin-provisioned disks and sparse files on VMFS datastores
Many ESXi servers use thin provisioning, where storage is not allocated all at once but is provided to virtual machines on request until the quota or the end of the disk is reached.
A sparse file is an implementation of the thin-provisioning idea for individual files rather than disk volumes. Note that sparse files in VMFS can be used for snapshots only; base VMDK files cannot be set up as sparse files. Also, the implementation of sparse files in VMFS differs from those used on NTFS or Linux file systems.
Let’s examine more closely how these features may affect data recovery on VMFS disks.
First of all, you need to know that thin provisioning and sparse files are implemented by the VMFS file system and are normally completely transparent to third-party software.
For example, imagine that you have a 1 TB VMFS partition and have set up four thin-provisioned virtual machines of 1 TB each on this disk. Backing up all four VMs will result in 4 TB of uncompressed data, regardless of how much data each VM has physically allocated on the VMFS partition. The unallocated space will be filled with zeroes.
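The arithmetic in this example can be checked with a short sketch. Only the 1 TB partition and the four 1 TB thin-provisioned VMs come from the example above; the per-VM allocated sizes and the function names are hypothetical values chosen for illustration.

```python
GB = 1024 ** 3
TB = 1024 ** 4  # binary terabyte, an assumption made for this sketch

partition_size = 1 * TB
provisioned = [1 * TB] * 4  # four thin-provisioned VMs, 1 TB each

# Hypothetical physically allocated bytes per VM (not taken from the article).
allocated = [120 * GB, 80 * GB, 200 * GB, 50 * GB]


def backup_size(provisioned_sizes):
    """Size of an uncompressed backup that reads every logical block."""
    return sum(provisioned_sizes)


def overcommit_ratio(provisioned_sizes, partition_bytes):
    """How many times the provisioned space exceeds the physical partition."""
    return sum(provisioned_sizes) / partition_bytes


print("logical backup size, TB:", backup_size(provisioned) // TB)
print("physically allocated, GB:", sum(allocated) // GB)
print("fits on the partition:", sum(allocated) <= partition_size)
print("overcommit ratio:", overcommit_ratio(provisioned, partition_size))
```

The backup size depends only on the provisioned sizes, which is why it does not shrink when the VMs hold little real data.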
According to VMware, sparse files are implemented by the hypervisor on top of the VMFS layer. Unlike the NTFS or Linux implementations, where a sparse file allocates disk space flexibly, based on the minimum space required at any given time, VMFS sparse files are used for snapshots only and therefore store only the changes made to the base VMDK since the snapshot was created. There are two such snapshot formats:
VMFSsparse -- used on the VMFS 5 file system and limited to 2 TB.
SEsparse -- the default format for delta snapshots on VMFS 6. It supports disks larger than 2 TB and has been available since ESXi Server v5.5. SEsparse has a minor speed advantage over VMFSsparse on most loads and block sizes and can automatically reclaim dead space from thin-provisioned disks.
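The idea that a snapshot stores only the changes to the base VMDK can be illustrated with a toy copy-on-write overlay. This is a conceptual sketch only: the class, its methods, and the tiny block size are hypothetical, and it does not reproduce the on-disk layout of VMFSsparse or SEsparse.

```python
BLOCK_SIZE = 4  # tiny block size, chosen only to keep the example readable


class DeltaSnapshot:
    """Toy overlay that stores only blocks changed after the snapshot."""

    def __init__(self, base: bytes):
        self.base = base    # read-only contents of the base disk
        self.changed = {}   # block index -> new block contents

    def write_block(self, index: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.changed[index] = data  # the base is never modified

    def read_block(self, index: int) -> bytes:
        # Prefer the delta; fall back to the base for untouched blocks.
        if index in self.changed:
            return self.changed[index]
        start = index * BLOCK_SIZE
        return self.base[start:start + BLOCK_SIZE]

    def delta_bytes(self) -> int:
        """Bytes actually stored by the snapshot."""
        return len(self.changed) * BLOCK_SIZE


base = b"AAAABBBBCCCCDDDD"
snap = DeltaSnapshot(base)
snap.write_block(2, b"cccc")

current = b"".join(snap.read_block(i) for i in range(len(base) // BLOCK_SIZE))
print(current)             # current view of the disk
print(snap.delta_bytes())  # size of the stored changes
```

Reading the current state of the disk therefore requires both the base and the delta, which is why a snapshot file alone is not a complete disk image.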
This deep emulation of a traditional thick-provisioned disk increases compatibility with third-party software and legacy applications. However, it also significantly increases the size of the recovered data, which is especially critical over slow SSH connections.
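To see why zero-filled space inflates a transfer, consider a sketch that records only the blocks containing real data and rebuilds the full image afterwards. The function names and the 4096-byte block size are assumptions made for illustration; this is not the di_sparse format or the method used by VMFS Recovery™.

```python
def nonzero_extents(image: bytes, block_size: int):
    """Return (offset, data) pairs for blocks with at least one non-zero byte."""
    extents = []
    for offset in range(0, len(image), block_size):
        block = image[offset:offset + block_size]
        if any(block):  # all-zero blocks are skipped
            extents.append((offset, block))
    return extents


def rebuild(extents, total_size: int) -> bytes:
    """Recreate the full image; skipped regions come back as zeroes."""
    out = bytearray(total_size)
    for offset, block in extents:
        out[offset:offset + len(block)] = block
    return bytes(out)


# A 16 KiB "disk" in which only the second 4 KiB block holds data.
image = b"\x00" * 4096 + b"data" + b"\x00" * 4092 + b"\x00" * 8192
extents = nonzero_extents(image, 4096)
payload = sum(len(block) for _, block in extents)

print("full image size:", len(image))
print("bytes to transfer:", payload)
print("round trip ok:", rebuild(extents, len(image)) == image)
```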
There are two methods to reduce the size of recovered data:
Saving thin-provisioned data
You can save thin-provisioned VMDK or sparse files in a bespoke sparse format, which can be converted to a Windows sparse file after the transfer is complete. During upload and/or deployment, you may need vmkfstools to additionally convert the uploaded VMDK files into a compatible format.
To save thin-provisioned data using VMFS Recovery™, first enable Developer mode: click "View" -> "Preferences" and press CTRL+E. Switch to the hidden Developer tab and check "Developer Mode". Click "Ok" to save the settings.
Now right-click on the VMDK file and select “Expert -> Save as di_sparse”. The VMDK will be saved in di_sparse format. It should be converted to a Windows sparse file before you transfer it to the new datastore or mount it in VMFS Recovery™ (Drives -> Mount image) to gain access to the data inside the VMDK.
Saving guest OS data
Mount the VMDK file in VMFS Recovery™ (right-click on the VMDK file and select "Expert" -> "Mount as disk"), open it in Reader mode, and save the required data from inside this VMDK. This data can be transferred, deployed, or uploaded for further use in a new VMDK file. In this case, the files will be transferred as standard thick-provisioned data and require no conversion.
VMFS sparse to NTFS sparse
To convert di_sparse to Windows sparse format, open a command line in the VMFS Recovery™ directory and use the following command:
vmfs_recovery.exe -sparse %path to di_sparse% %path to unpack file%
For example:
"C:\Program Files\DiskInternals\VmfsRecovery\vmfs_recovery64.exe" -sparse C:\tmp\test.di_sparse C:\tmp\test.vmdk
Replace “%path to di_sparse%” with the full path to the di_sparse file and “%path to unpack file%” with the full path where the new NTFS sparse file should be created.