What’s the difference between fixed-size and dynamic VHD / VHDX files in Hyper-V? How bad is the performance impact?
Understanding the difference between the two requires a deeper understanding of how virtual disks and real hard drives work.
After reviewing the pros and cons, many people choose against using dynamic disks. It’s true that dynamic disks are very convenient for a development, demo, or test server; however, on a production system they can quickly kill the awesome RAID performance you were expecting to reach.
Why Dynamic Disks Are Inefficient On Production Systems
Both real hard drives and Hyper-V VHD / VHDX files appear to store data as one huge sequence of bytes.
In reality, however, data is stored in sectors, which are chunks of data. Hard drives are not byte-oriented; they are sector-oriented. Similarly, the file system on top of the raw hard drive area is also chunk-based. In NTFS the chunk is called a cluster. For best performance you want a cluster to be one or more full hard drive sectors. You wouldn’t want a 513-byte cluster in NTFS on a hard drive with 512-byte sectors, because then it would be quite tricky to figure out which hard drive sectors need to be read and written each time a file system cluster is updated.
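To make the alignment argument concrete, here is a minimal sketch that lists which sectors a single cluster overlaps. The function name `sectors_touched` and its parameters are made up for illustration; it only models the byte arithmetic and is not how NTFS actually performs the lookup.

```python
def sectors_touched(cluster_index, cluster_size, sector_size=512):
    """Return the sector numbers that one cluster overlaps (illustrative model)."""
    first_byte = cluster_index * cluster_size
    last_byte = first_byte + cluster_size - 1
    first_sector = first_byte // sector_size
    last_sector = last_byte // sector_size
    return list(range(first_sector, last_sector + 1))


# A 4096-byte cluster covers exactly eight whole 512-byte sectors.
aligned = sectors_touched(1, 4096)

# A 513-byte cluster straddles sector boundaries, so updating it
# means reading and rewriting parts of two sectors.
misaligned = sectors_touched(1, 513)
```

With an aligned cluster size, every cluster maps to a whole number of sectors. With the 513-byte cluster, neighboring clusters share sectors, which is what makes updates awkward.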
Hyper-V virtual disks are a similar abstraction. VHD files use 512-byte sectors. VHDX files can also use 4KB sectors, which matches the internal sector size of modern 3TB+ hard drives.
A fixed VHD or VHDX in Hyper-V, once allocated, is very predictable and matches 1:1 the underlying structure. If you need cluster N you simply multiply N by the cluster size and add the start offset of the VHD file and the system immediately knows where to write the data. This is a linear process and it’s almost as fast as writing directly to NTFS from a physical OS.
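The linear lookup described above can be sketched in a few lines. This is a simplified model, not the actual Hyper-V implementation: `fixed_disk_offset` and `start_offset` are illustrative names, and real VHD / VHDX files carry additional header or footer metadata that is ignored here.

```python
def fixed_disk_offset(cluster_index, cluster_size, start_offset):
    """Byte position of a cluster inside a fixed-size virtual disk (simplified)."""
    if cluster_index < 0:
        raise ValueError("cluster_index must be non-negative")
    # One multiplication and one addition: no lookup table is needed.
    return start_offset + cluster_index * cluster_size


# Example: cluster 10 with 4096-byte clusters, data area assumed to start at 1 MiB.
offset = fixed_disk_offset(10, 4096, 1024 * 1024)
```

Because the result depends only on the cluster number, consecutive clusters always land next to each other in the file.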
A dynamic VHD or VHDX, on the other hand, is a performance disaster. The reason for that is in plain physics. The way dynamically growing disks work, they are guaranteed to result in heavy file and disk fragmentation. Even without any fragmentation occurring (as when the VHD is the only file on the disk) the disk heads will end up moving like crazy after long periods of operating system activity.
Why is that so?
A dynamic VHDX needs to extend the file when there is no more room to write to. However, Hyper-V first needs to figure out whether the file really needs to grow or whether there is room elsewhere in it. This check alone puts a burden on the CPU. Then, if the file needs to grow, a certain number of additional blocks is added. However, these newly added blocks are likely to end up holding data from non-contiguous areas of the virtual disk.
For example, on a fixed disk, sectors 1, 2, and 3 are consecutive, meaning they are stored in sequence. Hard drives are optimized very well for sequential data. On a basic 4TB drive today you can get sequential read throughput of roughly 50 to 150MB/sec thanks to read-ahead caching and other techniques used in the operating system as well as in the hard disk hardware. On dynamic disks, sectors 1, 2, and 3 aren’t necessarily stored one after the other because space is allocated on demand, in whatever order the guest happens to write. This results in a random pattern of sectors.
For example, a typical dynamic disk may have these sectors written in sequence: 1, 17, 3484. If the system wants to read sectors 1, 2, and 3, it may need to “jump” twice in order to read or write all of them. For each sector it needs to look up the real location of the block. As you can imagine, that’s not good for performance because each time a hard drive needs to move its heads, it can take around 5-10 msec. In addition, the cache is optimized for sequential reads, but when the head jumps to a new area on disk, the cache won’t help much at all because it can’t be filled using read-ahead strategies.
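The effect can be illustrated with a toy model. The sketch below assumes that a dynamic disk hands out space in the file in the order in which virtual sectors are first written, and keeps a lookup table from virtual to physical position. The class `DynamicDiskModel` and the function `count_jumps` are hypothetical names for this illustration; the real VHD / VHDX formats allocate larger blocks and are more involved.

```python
class DynamicDiskModel:
    """Toy model: space in the file is allocated in first-write order."""

    def __init__(self):
        self.table = {}  # virtual sector -> physical slot in the file

    def write(self, virtual_sector):
        # Allocate the next free slot only on the first write to this sector.
        if virtual_sector not in self.table:
            self.table[virtual_sector] = len(self.table)
        return self.table[virtual_sector]

    def locate(self, virtual_sector):
        # Every read needs this extra lookup before touching the disk.
        return self.table[virtual_sector]


def count_jumps(physical_slots):
    """Count how often the next slot is not directly after the previous one."""
    pairs = zip(physical_slots, physical_slots[1:])
    return sum(1 for prev, cur in pairs if cur != prev + 1)


disk = DynamicDiskModel()
for sector in [1, 17, 3484, 3, 2]:  # assumed order in which the guest writes
    disk.write(sector)

slots = [disk.locate(s) for s in [1, 2, 3]]  # reading virtual sectors 1, 2, 3
jumps = count_jumps(slots)
```

In this model, virtual sectors 1, 2, and 3 end up in slots 0, 4, and 3, so a read that would be one sequential pass on a fixed disk requires two jumps here.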
Imagine Hyper-V wants to read 100 sectors (51,200 bytes = 50KB) in one go, but the heads need to jump 100 times because the blocks are stored in different locations on disk. That would take somewhere between 500 and 1000 msec of seek time without even reading and processing the blocks! And that example is for just 50KB. An email these days can be megabytes long! Hence, it should be clear now that dynamic disks are a definite performance killer and should not be used on production machines.
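The arithmetic behind this estimate is easy to reproduce. The sketch below uses the 5-10 msec seek range quoted above and, as an assumption for comparison, a sequential rate of 100MB/sec (the middle of the 50 to 150MB/sec range mentioned earlier). The function names are illustrative only.

```python
def total_seek_ms(jumps, seek_ms_low=5.0, seek_ms_high=10.0):
    """Best-case and worst-case total seek time for a number of head jumps."""
    return jumps * seek_ms_low, jumps * seek_ms_high


def sequential_transfer_ms(num_bytes, mb_per_sec=100.0):
    """Time to stream num_bytes sequentially, assuming 1 MB = 1,000,000 bytes."""
    return num_bytes / (mb_per_sec * 1_000_000) * 1000.0


sectors = 100
sector_size = 512
num_bytes = sectors * sector_size           # 51,200 bytes = 50KB

low_ms, high_ms = total_seek_ms(sectors)    # fragmented: one jump per sector
stream_ms = sequential_transfer_ms(num_bytes)  # contiguous: one streaming read
```

Under these assumptions the fragmented read spends 500 to 1000 msec on seeks alone, while streaming the same 50KB sequentially takes about half a millisecond.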
Our experiments show that you could likely add twice as many VMs to the same server if you stick to fixed-size VHDs instead.
For the above reasons, dynamic VHD and VHDX files severely impact performance over time. The main reason is that blocks end up in almost random order, so read and write operations can’t make proper use of caching technology. Each block access comes with an additional penalty when the hard drive heads need to be moved to a new raw sector, back and forth. The farther these sectors are apart, the longer it takes to access them.
If you wonder how a great super fast server can be brought down to unacceptable performance levels, simply add a couple of virtual machines and use dynamic VHD disks on them. Over time, the hard drive won’t be doing anything else but jumping back and forth even for minuscule pieces of information.
As always, we recommend backing up your fixed and dynamic VHD and VHDX virtual machines using BackupChain!