In this post I’m talking a bit about virtualisation and the disk caching options, how I interpret what they’re doing, and why I’m choosing the settings that I am.
This discussion is really only applicable in a world where your storage is directly attached to your host, usually as SATA disks. If you’re using network attached storage (NAS) or a full-fat SAN environment, then you’d typically attach the storage directly to the virtual machine, and therefore the host is unlikely to be providing any caching for you.
Think first about what’s going on here. Every time the virtual machine makes an I/O request, it goes through the Linux kernel on that virtual machine. That Linux kernel has an I/O scheduler and a caching layer. The virtual then passes the I/O off to the physical host, which is also running Linux, and has its own I/O scheduler and caching layer.
Part of my thinking here is that you generally want to avoid having two caches and having two schedulers.
Let’s consider the cache first. If you use the cache on the physical machine then you’ll probably have access to a larger cache, and that cache will be shared across all your virtuals. If your various virtual machines share a lot of file systems, or if your usage pattern is that one machine is under load now and a different machine later, then having this large shared cache would make sense. You’d keep a small amount of memory allocated to your virtual machines (which constrains the cache on each machine), and leave a large pool of memory available in your host.
Conversely, if your usage pattern is that each virtual is reasonably independent, and you prefer that the behaviour of one machine not impact the others, then pushing the cache largely into the virtuals may make more sense. In this case you’d set the cache option on the virtual machine to none (telling the host not to cache your I/O), and give a reasonable amount of memory to your virtuals so they can do their own caching. This also has the advantage that the cache is closer to the end user – you don’t have to run through the virtual machine’s full I/O stack to get to the cache.
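To check what each virtual is actually set to, here is a minimal sketch that reads the cache mode from a guest definition. It assumes the guests are managed by libvirt (the post doesn't depend on that), where the setting lives in the `cache` attribute of each disk's `<driver>` element. The guest name, volume paths and the `disk_cache_modes` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical guest definition, in the shape `virsh dumpxml <guest>` prints.
# The name, volume paths and target devices are invented for this example.
DOMAIN_XML = """
<domain type='kvm'>
  <name>mythtv</name>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source dev='/dev/vg0/mythtv-root'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/vg0/mythtv-db'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def disk_cache_modes(domain_xml):
    """Map each disk's target device to its cache attribute (None if unset)."""
    root = ET.fromstring(domain_xml)
    modes = {}
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        driver = disk.find("driver")
        if target is None:
            continue  # no target device to report against
        cache = driver.get("cache") if driver is not None else None
        modes[target.get("dev")] = cache
    return modes


if __name__ == "__main__":
    for dev, mode in disk_cache_modes(DOMAIN_XML).items():
        # An unset attribute means the hypervisor's default applies.
        print(dev, mode or "(hypervisor default)")
```

A disk with no `cache` attribute falls back to whatever the hypervisor defaults to, so it's worth setting the mode explicitly if you care which side does the caching.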
For my machines I tend to choose the latter. So, for example, for my mythtv media server I needed a reasonable amount of memory anyway because I occasionally run an X server on it. I may as well use that RAM as cache when it’s not otherwise used. I also network attach the core media file systems (recorded TV, music and the like); they are NFS attached, so don’t cache. Therefore this machine is largely caching the mysql database, which gives a much snappier UI. Since it’s cached in the virtual, other activity on my other servers doesn’t evict the database from the filesystem cache – when I next use it everything is still in cache and fast.
Another option on the cache settings in the virtual is writeback or writethrough. Writeback means that your virtual machine will consider the write to be complete when it has been handed off to the host. If you lose power before the host writes to disk then the write is lost, but it’s potentially faster. Writethrough means that reads are cached, but writes aren’t considered complete until they’re all the way on disk. I’m thinking that during the install process it may make sense to set the cache to writeback, which should make the install much faster (it’s IO intensive, and if you crash during the install you’re going to have to restart it anyway, so nothing is lost). After the install is complete I would then set the cache to none.
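The install-then-switch idea can be sketched as a small helper that rewrites the cache mode in a guest definition. As before, this assumes a libvirt-style domain XML; `set_disk_cache`, the sample XML and the device names are hypothetical, and only the three modes discussed here are accepted.

```python
import xml.etree.ElementTree as ET

# Only the modes discussed in this post; a hypervisor may offer others.
CACHE_MODES = {"none", "writeback", "writethrough"}

# Hypothetical single-disk guest definition used for the example.
GUEST_XML = """
<domain type='kvm'>
  <name>newguest</name>
  <devices>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='writethrough'/>
      <source dev='/dev/vg0/newguest-root'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""


def set_disk_cache(domain_xml, target_dev, mode):
    """Return a copy of domain_xml with target_dev's cache mode set to mode."""
    if mode not in CACHE_MODES:
        raise ValueError("unsupported cache mode: " + mode)
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is None or target.get("dev") != target_dev:
            continue
        driver = disk.find("driver")
        if driver is None:
            raise ValueError("disk has no <driver> element: " + target_dev)
        driver.set("cache", mode)
        return ET.tostring(root, encoding="unicode")
    raise KeyError(target_dev)


if __name__ == "__main__":
    during_install = set_disk_cache(GUEST_XML, "vda", "writeback")
    after_install = set_disk_cache(during_install, "vda", "none")
    print(after_install)
```

The helper only produces the new XML text. Applying it to a real guest (for example by redefining the guest from the edited XML) is left out, and the new mode would normally take effect the next time the guest starts.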
The other option is the IO scheduler. What the IO scheduler does is decide when it’s going to actually perform a read or a write that a running process has requested. The aim is to chunk up reads and writes so that the disk heads don’t need to jump around so much. My thought is that it makes no sense to have one IO scheduler layered on top of another, since no doubt they’ll conflict and end up waiting too long for a particular action to be performed. Further still, the virtual has little insight into the physical layout of the disk – in my case it’s running against a logical volume that is really a slice of a RAID device underneath. The IO scheduler in the virtual is unlikely to make good decisions about when to perform an action, whereas the IO scheduler in the host actually knows something about the underlying hardware. My recommendation is to turn off the IO scheduler in your virtual; you do this by altering your grub command line, updating /etc/default/grub.
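As a rough sketch of that change, the two helpers below find the scheduler a guest is currently using and add a kernel parameter to the grub command line. The parameter itself is an assumption: on kernels with the older single-queue block layer the usual choice is `elevator=noop`. As far as I know, newer multi-queue kernels call the equivalent scheduler `none` and ignore the `elevator=` boot parameter, so check what your kernel offers. The function names and sample strings are made up for illustration.

```python
def active_scheduler(sysfs_text):
    """Return the scheduler shown in [brackets], or None if none is marked.

    sysfs_text is the content of /sys/block/<dev>/queue/scheduler,
    for example "noop deadline [cfq]".
    """
    for word in sysfs_text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


def add_kernel_param(grub_line, param):
    """Append param to a GRUB_CMDLINE_LINUX="..." line unless already present."""
    name, _, value = grub_line.strip().partition("=")
    params = value.strip('"').split()
    if param not in params:
        params.append(param)
    return '{}="{}"'.format(name, " ".join(params))


if __name__ == "__main__":
    # Sample strings only; on a real guest you would read the sysfs file
    # and the matching line from /etc/default/grub.
    print(active_scheduler("noop deadline [cfq]"))
    # Assumption: elevator=noop is the parameter wanted on an older kernel.
    print(add_kernel_param('GRUB_CMDLINE_LINUX="quiet splash"', "elevator=noop"))
```

After editing the file you still need to regenerate the grub configuration (`update-grub` on Debian and Ubuntu) and reboot the virtual before the change takes effect.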