Thin provisioned disks and sparse files on VMFS datastores
Many ESXi servers use thin provisioning, where storage is not allocated all at once but is provided to virtual machines on demand until the quota or the end of the disk is reached.
A sparse file applies the thin provisioning idea to individual files rather than to disk volumes. Note that sparse files on VMFS can be used in snapshots only; a base VMDK file cannot be set up as a sparse file. The VMFS implementation of sparse files also differs from the ones used on NTFS or Linux file systems.
Let’s examine more closely how these features may affect data recovery on VMFS disks.
First of all, thin provisioning and sparse files are implemented by the VMFS file system itself and are normally completely transparent to third-party software.
For example, imagine a 1 TB VMFS partition that hosts 4 thin-provisioned virtual machines of 1 TB each. Backing up all 4 VMs will produce 4 TB of uncompressed data, regardless of how much space each VM has physically allocated on the VMFS partition. The unallocated space is filled with zeroes.
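The arithmetic behind this example can be sketched in a few lines of Python. The function names and the use of binary terabytes are assumptions made for illustration; they are not part of VMFS Recovery or any VMware tool.

```python
TB = 1024 ** 4  # bytes in one terabyte (binary units, an assumption of this sketch)

def backup_size(provisioned_sizes):
    """Uncompressed backup size when every thin disk is read as a full-size disk."""
    return sum(provisioned_sizes)

def overcommit_ratio(provisioned_sizes, datastore_size):
    """How many times the provisioned capacity exceeds the physical datastore."""
    return sum(provisioned_sizes) / datastore_size

# The scenario from the text: four 1 TB thin-provisioned VMs on a 1 TB partition.
vm_disks = [1 * TB] * 4
datastore = 1 * TB

print(backup_size(vm_disks) // TB)            # 4
print(overcommit_ratio(vm_disks, datastore))  # 4.0
```

The backup size depends only on the provisioned sizes, not on how much each VM has actually written, which is why the result can exceed the capacity of the datastore itself.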
Sparse files, as VMware describes them, are implemented by the hypervisor on top of the VMFS layer. NTFS and Linux file systems use sparse files to allocate disk space flexibly, based on the minimum space required at any given time. VMFS sparse files, by contrast, are used in snapshots only and therefore store only the changes made to the base VMDK since the snapshot was created. There are two such snapshot formats:
VMFSsparse -- used on the VMFS 5 file system and limited to 2 TB.
SEsparse -- the default format for delta snapshots on VMFS 6. It supports disks larger than 2 TB and has been available since ESXi Server v5.5. SEsparse has a minor speed advantage over VMFSsparse on most workloads and block sizes, and it can automatically reclaim dead space from thin-provisioned disks.
This deep emulation of a traditional thick-provisioned disk increases compatibility with third-party software and legacy applications. However, it also significantly increases the size of the recovered data, which is especially critical over slow SSH connections.
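The idea that a snapshot stores only the changes made to the base disk can be illustrated with a small conceptual model. This is a sketch of the general copy-on-write principle only; it does not reproduce the on-disk layout of VMFSsparse or SEsparse, and the class name, method names and tiny block size are hypothetical.

```python
BLOCK_SIZE = 4  # deliberately tiny, hypothetical block size for the demo

class DeltaDisk:
    """Conceptual snapshot: a read-only base image plus a map of changed blocks."""

    def __init__(self, base: bytes):
        self.base = base   # contents of the base disk at snapshot time
        self.delta = {}    # block index -> data written after the snapshot

    def write_block(self, index: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("data must be exactly one block long")
        self.delta[index] = data  # the base image is never modified

    def read_block(self, index: int) -> bytes:
        # Changed blocks come from the delta, everything else from the base.
        if index in self.delta:
            return self.delta[index]
        start = index * BLOCK_SIZE
        return self.base[start:start + BLOCK_SIZE]

    def delta_bytes(self) -> int:
        """Space used by the snapshot itself, ignoring metadata."""
        return len(self.delta) * BLOCK_SIZE

disk = DeltaDisk(b"AAAABBBBCCCC")
disk.write_block(1, b"XXXX")
print(disk.read_block(0), disk.read_block(1), disk.delta_bytes())
```

In this model the snapshot grows only with the number of blocks written after it was created, while reads still see a complete disk.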
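To get a feeling for how much of a recovered image is only zero-filled padding, the zero blocks in it can be counted. The following Python sketch is an illustration, not a feature of VMFS Recovery; the function name and the 1 MB default block size are assumptions.

```python
import io

def zero_block_stats(stream, block_size=1024 * 1024):
    """Return (zero_blocks, total_blocks) for a binary stream read block by block."""
    zero = total = 0
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            break
        total += 1
        # A block counts as empty when every byte in it is zero.
        if chunk.count(0) == len(chunk):
            zero += 1
    return zero, total

# Tiny in-memory "image": two empty blocks, one data block, one empty block.
image = io.BytesIO(b"\x00" * 8 + b"data" + b"\x00" * 4)
print(zero_block_stats(image, block_size=4))  # (3, 4)
```

For a real image the same function can be applied to a file opened with open(path, "rb"). A high share of zero blocks suggests that a sparse format would save a lot of transfer time.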
There are two methods to reduce the size of recovered data:
Saving thin-provisioned data
You can save a thin-provisioned VMDK or sparse file in the DiskInternals sparse format (di_sparse), which can be converted to a Windows sparse file after the transfer is complete. When uploading or deploying the result, you may need vmkfstools to convert the uploaded VMDK files into a compatible format.
To save thin-provisioned data using VMFS Recovery, first enable Developer mode: click "View" -> "Preferences" and press CTRL+E. Switch to the hidden Developer tab, check "Developer Mode", and click "Ok" to save the settings.
Now right-click the VMDK file and select “Expert-> Save as di_sparce”. The VMDK is saved in di_sparse format. Convert it to a Windows sparse file before transferring it to the new datastore, or mount it in VMFS Recovery (Drives -> Mount image) to access the data inside the VMDK.
Saving guest OS data
Mount the VMDK file in VMFS Recovery (right-click the VMDK -> Expert -> Mount as disk), open it in Reader mode, and save the required data from inside the VMDK. This data can then be transferred, deployed, or uploaded for further use in a new VMDK file. In this case the files are transferred as standard thick-provisioned data and require no conversion.
VMFS sparse to NTFS sparse
To convert a di_sparse file to Windows sparse format, open a command line in the VMFS Recovery directory and use the following command:
vmfs_recovery.exe -sparse %path to di_sparse% %path to unpack file%
"C:\Program Files\DiskInternals\VmfsRecovery\vmfs_recovery64.exe" -sparse C:\tmp\test.di_sparse C:\tmp\test.vmdk
Replace the “%path to di_sparse%” and “%path to unpack file%” parameters with the full path to the di_sparse file and the full path where the new Windows sparse file should be created.
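If several di_sparse files have to be converted, the command can be wrapped in a small script. The Python sketch below only assembles and runs the command line shown in this article; the function names are hypothetical, and the executable path is the one from the example and may differ on your installation.

```python
import subprocess

# Path taken from the example in this article; adjust it to your installation.
DEFAULT_EXE = r"C:\Program Files\DiskInternals\VmfsRecovery\vmfs_recovery64.exe"

def build_sparse_command(exe, di_sparse_path, output_path):
    """Build the argument list: <exe> -sparse <di_sparse file> <output file>."""
    return [exe, "-sparse", di_sparse_path, output_path]

def convert_di_sparse(di_sparse_path, output_path, exe=DEFAULT_EXE):
    """Run the conversion; raises CalledProcessError on a non-zero exit code."""
    command = build_sparse_command(exe, di_sparse_path, output_path)
    subprocess.run(command, check=True)

# Show the command for the example paths without executing it.
print(build_sparse_command(DEFAULT_EXE, r"C:\tmp\test.di_sparse", r"C:\tmp\test.vmdk"))
```

Passing the arguments as a list avoids quoting problems with the space in "Program Files". To perform the actual conversion, call convert_di_sparse with full paths on the machine where VMFS Recovery is installed.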