
Use raw ZFS volume for VirtualBox guest

Michal Nowak on August 04, 2019

If you’ve ever run virtual guests on platforms like KVM, Xen, Hyper-V, VMware, or VirtualBox, you probably think of the disk attached to the guest as a...
 
Mike Gerdts (mgerdts)

Things to explore as to why it could be slower on ZFS:

  • Depending on how vbox is doing the writes, every write happening in the guest could become a sync write on the host. It would be useful to measure the number of IOPS seen by the host in each of your tests (see the first sketch after this list).
  • If sync writes are happening such that they don't align with the dataset's block size (recordsize for a filesystem, volblocksize for a volume), you could be seeing write inflation. On the host, look for a lot of reads during a write workload, and check whether the amount of data written by the host is significantly more than what the guest wrote (the second sketch shows how to check the block sizes).
  • If you are doing writes that are at least 32k (zfs_immediate_write_sz) but vbox is chopping them up into smaller writes, the data may be written to the ZIL (the ZIL exists even if log devices don't) and then again to its permanent home (the third sketch shows one knob that affects this).
  • I believe that the Samsung 840 Pro advertises a physical sector size of 512 B, which is almost certainly not true; it is likely 4k or 8k. The zpool will have been created with ashift=9, so all allocations fall on a 512-byte boundary. If that boundary also happens to coincide with the drive's real sector boundary (4k or 8k), the drive avoids a read-modify-write cycle when writing multiples of its actual physical sector size. If the real sector size is 4k, a 512-byte allocation has only a 12.5% chance of landing on the right boundary. Repeat your benchmarks many times, looking for wild variations (the second sketch also shows how to read the pool's ashift).
  • ZFS is simply doing more work than a filesystem without its protections would. Copy-on-write, checksumming, etc. are quite cheap compared to the cost of operations on hard disks; the cost becomes more noticeable on faster disks. Normally I think of these costs being noticeable on NVMe and not so much on an earlier-generation SATA SSD.
  • If the tests are with encryption enabled, I'd be interested to see what it's like without encryption (the second sketch includes a check of the property).
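
For the first point, a minimal sketch of host-side measurement; `tank` is a placeholder pool name:

```
# Watch per-second operations (IOPS) and bandwidth per vdev on the host
# while the benchmark runs inside the guest; `tank` is a placeholder.
zpool iostat -v tank 1

# Extended per-device statistics (iostat -xn on illumos, -x on Linux)
# let you compare host-side bytes written against what the guest reports.
iostat -x 1
```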
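For the block-size, ashift, and encryption points, the relevant properties can be read on the host; `tank/vboxvol` is a placeholder for the zvol backing the guest disk:

```
# Dataset block sizes: recordsize applies to filesystems,
# volblocksize to zvols such as the one backing the guest disk.
zfs get recordsize tank
zfs get volblocksize tank/vboxvol

# ashift is fixed per vdev at creation time; zdb prints it from the
# cached pool configuration.
zdb | grep ashift

# Whether encryption is in play for this dataset.
zfs get encryption tank/vboxvol
```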
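For the ZIL double-write point, one knob worth experimenting with is logbias: setting it to throughput steers large writes straight to their permanent location instead of through the ZIL first, trading sync-write latency for less write inflation. A sketch, with the same placeholder names:

```
# logbias=throughput bypasses the ZIL/slog for large writes, avoiding
# the double write at the cost of higher sync-write latency
# (the default is latency).
zfs set logbias=throughput tank/vboxvol
zfs get logbias tank/vboxvol
```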
 
Michal Nowak (mnohime)

Thanks Mike! Very much appreciate your comment. I will go through those things and see what difference they make. I just started with ashift, which was indeed set to 9 (512 B) but should be 13 (8 KiB); a sketch of the fix follows below.
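Since ashift is baked into each vdev at creation time, changing it means recreating the pool after backing the data up elsewhere. A minimal sketch, assuming an OpenZFS host where zpool create honors -o ashift (on illumos the sector-size override traditionally goes through sd.conf instead); `tank`, `c2t0d0`, and `tank/vboxvol` are placeholders:

```
# ashift cannot be changed in place: back up, destroy, and recreate
# the pool with 8 KiB sectors (2^13 = 8192).
zpool destroy tank
zpool create -o ashift=13 tank c2t0d0

# Recreate the guest's zvol with a volblocksize matching the new
# sector size, so guest writes align with physical sectors.
zfs create -V 20G -o volblocksize=8k tank/vboxvol
```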

 
Jon Forrest (nobozo)

Minor typo:

"control it's resources, and more." ->
"control its resources, and more."

 
Michal Nowak (mnohime)

Thanks, Jon. Typo fixed.

 
Clive Da (osde8info)

A brilliant use of virtualisation! Using it to experiment with new file systems WITHOUT messing up your host or native env.