Damn, networked file systems are slow

Published October 19, 2020

I’m in the process of rewriting my package build script at the moment, because aurutils just wasn’t working well enough for packages built from version control. While doing this, I’m constantly rebuilding my whole repository, which takes a fair amount of time. I’ve rewritten my script to build packages in parallel, so there’s a lot of load on the system’s CPU and, more importantly, its file system. I had put all the packages’ build files and the repository on the host’s file system so that the VM doesn’t run out of space. This left me with the problem of how to connect the remote file system to the VM.
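
For context, the parallel building boils down to something like this (a minimal sketch, not my actual script; the package names, paths, and job count are placeholders):

```sh
#!/bin/bash
# Sketch of parallel package builds. The package list and build
# directory are hypothetical examples.
packages=(foo-git bar-git baz-git)

build_one() {
    cd "/mnt/build/$1" || return 1
    # --syncdeps installs missing dependencies before building
    makepkg --syncdeps --noconfirm
}
export -f build_one

# Run up to four builds at once; each one hammers the CPU and,
# more importantly, the file system holding the build directories.
printf '%s\n' "${packages[@]}" | xargs -P4 -I{} bash -c 'build_one "$1"' _ {}
```

Every build untars sources, compiles, and packages in its own directory, so four of them at once means a lot of small, concurrent I/O.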

After looking around online, I settled on using NFS. Most resources suggested that its performance was decent and that it was easy to set up. This was generally working well for me, that is, until I started building packages in parallel. My initial impression was that the load alone was causing the whole system to slow down. However, upon poking around, it was really just operations involving the remote file system that were causing trouble. For reference, running ls could take over 5 seconds.
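
If you want to see this for yourself, fio makes the pattern obvious (the mount point, export path, and options below are illustrative, not my exact setup):

```sh
# Hypothetical NFS mount; adjust the server, export, and options.
mount -t nfs -o vers=4.2,noatime host:/srv/build /mnt/build

# Small random reads/writes are where NFS hurts; sequential
# throughput usually looks fine.
fio --name=randrw --directory=/mnt/build --rw=randrw --bs=4k \
    --size=256M --numjobs=4 --runtime=30 --time_based --group_reporting

# Metadata operations crawl too once the server is under load:
time ls /mnt/build
```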

Obviously this wasn’t good enough. My first move was to try to find some tweaks to make, but nothing seemed to work. It seemed like random I/O operations were the culprit, and no tweak I made did anything, so I settled on creating a new virtual disk and trying to build on that. After a bit of stuffing around getting the PKGBUILDs to work in a new directory, I was ready for a test. The difference was immediately noticeable. A build that previously took over an hour now finished in 5 minutes. Astounded, I ran a rebuild of all the packages. Less than half an hour! I am absolutely amazed at the difference it made. I didn’t pass through a disk; I just created a qcow2 volume and gave it to the VM, and this is the performance difference I’m seeing. And to top it off, the system stays responsive while building.
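
Setting that up only takes a couple of commands (the image path, size, and VM name here are made up; adjust to taste):

```sh
# On the host: create a sparse 100 GiB qcow2 volume; it only
# consumes space as the guest writes to it.
qemu-img create -f qcow2 /var/lib/libvirt/images/build.qcow2 100G

# Attach it to the (hypothetical) VM "buildvm" as a virtio disk.
virsh attach-disk buildvm /var/lib/libvirt/images/build.qcow2 vdb \
    --driver qemu --subdriver qcow2 --targetbus virtio --persistent

# Inside the guest: format it and mount it for builds.
mkfs.ext4 /dev/vdb
mount /dev/vdb /mnt/build
```

Since the qcow2 file sits on the host’s disk anyway, the guest gets block-level access through virtio instead of going through a network file system protocol for every operation.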

Now, hey, I may have done something wrong, but there shouldn’t be this large a difference. I suspect the networking is to blame, but this is a VM accessing a disk on the host; the network overhead shouldn’t be that large. More likely it has to do with other features, like the consistency and locking guarantees, that make NFS more useful as a general remote file system. Nonetheless, there should be some solution for connecting VMs to remote file systems with minimal overhead. I’ll have to look into it, and if nothing exists it may be worth writing one if I can get it to work. Something with performance at all costs.
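
For what it’s worth, QEMU already has host-to-guest sharing that skips the network stack entirely; this is a sketch of a 9p/virtio share, which I haven’t tested, so treat the share tag, paths, and options as assumptions:

```sh
# On the host: expose a directory to the guest over virtio
# (libvirt can do the same via a <filesystem> element in the
# domain XML). The mount tag "buildshare" is hypothetical.
qemu-system-x86_64 ... \
    -virtfs local,path=/srv/build,mount_tag=buildshare,security_model=mapped-xattr

# In the guest: mount the share over virtio; no TCP/IP involved.
mount -t 9p -o trans=virtio,version=9p2000.L buildshare /mnt/build
```

There’s also the newer virtio-fs, which is aimed at exactly this low-overhead case; whether either of these actually beats a plain virtual disk for heavy random I/O is something I’d have to benchmark.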

Still, that much of a penalty for using a networked file system. Damn!