How VMFS Works: Inside VMware’s File System Architecture
Understand VMFS structure, metadata layout, snapshots, and file allocation for improved VMFS recoverability.
VMware’s VMFS is a specialized cluster file system tailored for storing virtual machine disk files. In this deep dive, we’ll explore the VMFS structure, metadata, snapshot mechanics, and file allocation—plus quick sysadmin tips based on real-world experience to help you prevent data loss and improve VMFS recoverability.
VMFS Structure Overview
1. Partition Layer
VMFS typically lives on top of a raw LUN or physical disk. The partition table can be MBR or GPT, but GPT is required for anything larger than 2 TB and is what VMFS-5 and later use for newly created datastores.
Sysadmin Tip (1-min check):
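If you're working with older storage, check whether the partition table is still MBR. MBR silently limits you to 2 TB, and you may not notice until it's too late. From a Linux rescue environment:
parted /dev/sdX print
On the ESXi shell itself, parted is not available; the equivalent check is:
partedUtil getptbl /vmfs/devices/disks/<device>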
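The 2 TB ceiling follows directly from MBR's 32-bit sector fields. Here is a minimal sketch of that arithmetic, assuming 512-byte logical sectors; the function names are illustrative and not part of any VMware tool.

```python
# MBR partition entries store start and length as 32-bit sector counts.
MBR_SECTOR_FIELD_BITS = 32
SECTOR_BYTES = 512  # assumption: 512-byte logical sectors


def mbr_max_bytes(sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest size a single MBR partition entry can describe."""
    return (2 ** MBR_SECTOR_FIELD_BITS) * sector_bytes


def needs_gpt(disk_bytes: int, sector_bytes: int = SECTOR_BYTES) -> bool:
    """True when the disk is larger than MBR can address."""
    return disk_bytes > mbr_max_bytes(sector_bytes)
```

With the default sector size, `mbr_max_bytes()` is 2^41 bytes (2 TiB), so `needs_gpt(3 * 2**40)` is true for a 3 TiB LUN.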
2. Volume Header
This region contains the UUID, VMFS version, and metadata pointer map. It's the first thing checked when mounting.
Sysadmin Tip (2-min disaster prep):
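Keep a recent copy of your datastore inventory, or capture it weekly as a cron job. If the header gets damaged, you'll need the volume UUID or label to recover or mount it manually. Note that esxcfg-volume -l only lists volumes detected as snapshots or replicas, so capture esxcli storage filesystem list as well to record the UUID and label of every mounted datastore. Copy the file off the host too, since a record stored only on the affected system may be lost with it.
esxcli storage filesystem list > /var/log/vmfs-volumes.txt
esxcfg-volume -l >> /var/log/vmfs-volumes.txt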
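A single output file gets overwritten on every run, which is unhelpful if the last run happened after the damage. The sketch below keeps timestamped copies instead. It is only an illustration: `save_listing` and the file naming are hypothetical, and the command is passed in by the caller, so nothing here depends on a particular ESXi tool.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def save_listing(command, dest_dir, now=None):
    """Run `command` and store its stdout in a timestamped text file.

    `command` is an argument list such as
    ["esxcli", "storage", "filesystem", "list"].
    Returns the path of the file that was written.
    """
    now = now or datetime.now(timezone.utc)
    # check=True raises if the command fails, so no empty file is recorded.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    out_file = dest / f"vmfs-volumes-{now:%Y%m%d-%H%M%S}.txt"
    out_file.write_text(result.stdout)
    return out_file
```

Scheduled weekly and pointed at a directory outside the datastore being recorded, this leaves a short history of UUIDs and labels to fall back on.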
3. Metadata Blocks
Metadata tracks everything: file locations, locks, snapshots, directories.
Sysadmin Tip (5-min audit habit):
If you ever have slow datastore performance, suspect metadata congestion. Running too many VMs with snapshot chains on a single datastore can cripple metadata response times. I once cleared 20GB worth of stale snapshot deltas and cut latency in half.
Bonus: Avoid running backup solutions that create too many concurrent snapshots without consolidating.
4. File Blocks / Data Extents
VMFS splits files into extents. These aren't necessarily contiguous.
Sysadmin Tip (3-min review):
Before increasing VM disk size, check if the file is fragmented across extents. VMs spread over many extents may be harder to recover if volume corruption occurs.
vmkfstools -D /vmfs/volumes/datastore1/VM/VM.vmdk
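To see why non-contiguous extents matter for recovery, it helps to model a file as a list of runs, each mapping a range of file offsets to a location on the volume. The sketch below is a simplified model, not the VMFS on-disk format; `Extent`, `locate`, and `count_fragments` are hypothetical names.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Extent(NamedTuple):
    file_offset: int  # where this run starts inside the file
    length: int       # number of bytes covered by this run
    disk_offset: int  # where the run is stored on the volume


def locate(extents: List[Extent], offset: int) -> int:
    """Translate a file offset to a volume offset.

    `extents` must be sorted by file_offset.
    """
    starts = [e.file_offset for e in extents]
    i = bisect_right(starts, offset) - 1
    if i < 0:
        raise ValueError("offset is not mapped")
    e = extents[i]
    if offset >= e.file_offset + e.length:
        raise ValueError("offset is not mapped")
    return e.disk_offset + (offset - e.file_offset)


def count_fragments(extents: List[Extent]) -> int:
    """Count the gaps where consecutive runs are not adjacent on disk."""
    ordered = sorted(extents)
    return sum(
        1
        for a, b in zip(ordered, ordered[1:])
        if a.disk_offset + a.length != b.disk_offset
    )
```

Every gap counted by `count_fragments` is a place where a recovery tool can no longer assume that the next bytes on disk belong to the same file and has to rely on intact metadata instead.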
5. Sub-Block Allocation
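Sub-blocks hold small files (like .vmx, .log, and .nvram) so that each one doesn't consume a full 1 MB file block. The sub-block size is 8 KB on VMFS-5 and 64 KB on VMFS-6; a file that outgrows its sub-block moves to regular file blocks.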
Sysadmin Tip (1-min cleanup habit):
Clear old .log files and unused snapshots periodically. The small leftover files occupy sub-blocks and clutter metadata. I automate this monthly with a short PowerCLI script.
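The space saving from sub-blocks is easy to quantify. The following sketch assumes a 1 MB file block and an 8 KB sub-block (the VMFS-5 figures; other versions differ) and uses a deliberately simplified rule: a file that fits in one sub-block uses it, anything larger is rounded up to whole file blocks. The names are illustrative only.

```python
KIB = 1024
BLOCK = 1024 * KIB   # assumption: 1 MB file block
SUB_BLOCK = 8 * KIB  # assumption: 8 KB sub-block (varies by VMFS version)


def allocated_bytes(size: int, block: int = BLOCK, sub_block: int = SUB_BLOCK) -> int:
    """Space consumed by a file of `size` bytes under the simplified rule."""
    if size <= 0:
        return 0
    if size <= sub_block:
        return sub_block
    # Integer ceiling division: round up to whole file blocks.
    return -(-size // block) * block


def slack_bytes(size: int, block: int = BLOCK, sub_block: int = SUB_BLOCK) -> int:
    """Allocated space that holds no file data."""
    return allocated_bytes(size, block, sub_block) - size
```

Under these assumptions a 3 KB .vmx file occupies 8 KB instead of a full 1 MB block; passing `sub_block=0` shows the cost without sub-block allocation.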
How Metadata Works in VMFS
Metadata is structured in B+ trees and journaling zones.
Sysadmin Tip (5-min read & react):
Don't ignore vmkernel.log entries that report delayed journaling or metadata locks. They can mean your metadata is reaching a tipping point. I once lost a host cluster because ignored warnings ended in full metadata journal corruption.
Snapshots in VMFS
Each snapshot is a delta disk. Chains can go several layers deep.
Sysadmin Tip (2-min visibility check):
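Use vim-cmd vmsvc/snapshot.get <vmid> on the host, or Get-Snapshot in PowerCLI, to view snapshot trees. (vmkfstools -t0 is an undocumented option that dumps a disk's block mapping, so it won't show you the tree.) The deeper the chain, the higher the risk of corruption.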
Rule of thumb: Never go beyond 2–3 snapshot levels unless you're absolutely forced to. I had a client with 9 chained deltas—when corruption hit, it was unrecoverable beyond the 3rd level.
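The reason depth matters becomes clearer with a toy model of a delta chain: each delta stores only the blocks written since it was created, and a read falls through to the parent until some layer has the block. This is a conceptual sketch, not the VMDK delta format; the `Disk` class and the depth limit of 3 (taken from the rule of thumb above) are illustrative.

```python
from typing import Dict, Optional

MAX_RECOMMENDED_DEPTH = 3  # from the rule of thumb above


class Disk:
    """A base disk, or a delta holding only blocks changed since its parent."""

    def __init__(self, name: str, parent: Optional["Disk"] = None):
        self.name = name
        self.parent = parent
        self.blocks: Dict[int, bytes] = {}

    def write(self, block_no: int, data: bytes) -> None:
        # Writes always land in this layer; parents are never modified.
        self.blocks[block_no] = data

    def read(self, block_no: int) -> Optional[bytes]:
        # Walk down the chain until some layer holds the block.
        disk: Optional[Disk] = self
        while disk is not None:
            if block_no in disk.blocks:
                return disk.blocks[block_no]
            disk = disk.parent
        return None  # block was never written

    def depth(self) -> int:
        """Number of delta layers stacked on top of the base disk."""
        count, disk = 0, self.parent
        while disk is not None:
            count += 1
            disk = disk.parent
        return count

    def too_deep(self) -> bool:
        return self.depth() > MAX_RECOMMENDED_DEPTH
```

Because every read may depend on every layer beneath it, losing one link makes all the layers above it unreadable for the blocks they did not overwrite themselves.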
File Allocation in VMFS
VMFS uses large block sizes and dynamic extents for file allocation.
Sysadmin Tip (3-min visual proof):
Use VMware’s datastore browser to spot “split” or thin VMDKs with unusual names or sizes. If they grow inconsistently, verify if something else (like backup snapshots) is interfering.
Sysadmin Tip (2-min alert setup):
Regularly check available space on your VMFS volumes. Running out of free space can lead to serious corruption issues, especially during snapshot commits or VM migrations. I’ve seen a full VMFS volume take out a cluster’s performance for hours.
Recommendation: Set a threshold alert in vCenter (e.g. 80–90% usage), or use a scheduled script to monitor usage and notify you before it's too late.
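For the scheduled-script route, the check itself is small. The sketch below uses the 80% and 90% thresholds mentioned above and reads usage for a mounted path with Python's `shutil.disk_usage`; the function names are hypothetical, and how you send the notification (mail, syslog, webhook) is left out.

```python
import shutil


def usage_percent(total: int, free: int) -> float:
    """Percentage of the volume that is in use."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (total - free) / total


def classify(percent: float, warn: float = 80.0, crit: float = 90.0) -> str:
    """Map a usage percentage to 'ok', 'warning', or 'critical'."""
    if percent >= crit:
        return "critical"
    if percent >= warn:
        return "warning"
    return "ok"


def check_path(path: str, warn: float = 80.0, crit: float = 90.0):
    """Return (percent_used, status) for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    percent = usage_percent(usage.total, usage.free)
    return percent, classify(percent, warn, crit)
```

Calling `check_path("/vmfs/volumes/datastore1")` from a scheduled job and alerting on anything other than `"ok"` gives you the early warning; the datastore path here is just the example name used earlier in this article.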
Bonus Tip: Always consolidate snapshots after large operations like patching or database dumps. It re-merges extents and reduces future recovery headaches.
VMFS Versions: Compatibility & Recovery Impact
VMFS Version | Introduced With | Max Volume Size | Notable Changes |
---|---|---|---|
VMFS-3 | ESX 3.x | 2 TB per extent (64 TB with 32 extents) | Widespread but limited; MBR partitioning |
VMFS-5 | ESXi 5.x | 64 TB | Unified 1 MB block size; GPT partitioning |
VMFS-6 | ESXi 6.5+ | 64 TB | Async UNMAP, better metadata performance |
Sysadmin Tip (2-min compatibility check):
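Use VMFS-6 only if all your backup tools and disaster recovery scripts fully support it. I had to roll back a datastore upgrade after discovering our backup tool couldn't parse VMFS-6 snapshots properly. Keep in mind that there is no in-place upgrade from VMFS-5 to VMFS-6: you create a new datastore and migrate the VMs, so plan the way back before you start.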
VMFS Recoverability: What You Can (and Can't) Recover
Knowing what VMFS stores and how it manages data helps determine what’s recoverable.
Sysadmin Tip (5-min post-crash drill):
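After a crash, resist the urge to reformat or run disk checks. Instead:
1. Clone the disk and work on the clone.
2. Check esxcfg-volume -l to see whether the volume is detected as a snapshot or replica.
3. Try mounting it as a snapshot: esxcfg-volume -m <UUID|label> for a one-time mount, or esxcfg-volume -M <UUID|label> to make the mount persistent.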
Many recoveries fail not because of corruption, but due to user panic and overwrites.
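The first step of the drill, cloning before touching anything, can be sketched as a chunked copy that also hashes what it read, so you can confirm the clone matches before experimenting on it. This is a minimal illustration with hypothetical function names. It stops at the first read error, so for a physically failing disk an error-tolerant imaging tool such as ddrescue is the better choice.

```python
import hashlib

CHUNK = 4 * 1024 * 1024  # copy in 4 MiB pieces


def clone_and_hash(src_path: str, dst_path: str, chunk: int = CHUNK) -> str:
    """Copy src to dst and return the SHA-256 of the bytes that were read.

    The source is opened read-only and never modified.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            buf = src.read(chunk)
            if not buf:
                break
            dst.write(buf)
            digest.update(buf)
    return digest.hexdigest()


def file_hash(path: str, chunk: int = CHUNK) -> str:
    """SHA-256 of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for buf in iter(lambda: f.read(chunk), b""):
            digest.update(buf)
    return digest.hexdigest()
```

If `file_hash(dst_path)` equals the value returned by `clone_and_hash`, the clone was written faithfully and every later recovery attempt can run against the copy instead of the original.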
Conclusion
VMFS isn't just a storage format—it’s a layered, intelligent system with complex metadata, snapshots, and block structures. Understanding how VMFS allocates files and maintains snapshots gives you the upper hand when disaster strikes. And with some proactive maintenance and sysadmin discipline, you can greatly increase your VMFS recoverability success rate.