NFS server based on CentOS 7
A how-to guide for configuring an NFS server to share binaries for a HPC cluster. Optimized for performance for a read-only scenario.
NFS
Network File System(NFS) is a widely used distributed filesystem that works reasonably well in most scenarios where you must share a filesytem across several computers. It can use UDP or TCP underneath for transport, and is ofcourse very much configurable to get the best out of it for specific usecases.
At our research cluster, we use NFS for making user's home directories and research datasets available on all compute nodes. For such a serious usecase, the NFS server needs to be sophisticated enough with caches at several layers to stand up for the performance and durability needs of the workload. So, we rely on an enterprise grade Network Attached Storage system - Isilon for an optimized NFS server.
Context
What I'm dealing with today, is to share a filesystem with software binaries across all compute nodes in the cluster in a read-only manner. Basically, I will install a software package on one server, and copy the binaries to a location, and voila! it is ready for use by all compute nodes instantly.
My goal is to setup in such a way that clients mount this filesystem in read-only mode, and cache it heavily for local use. I, as an administrator will have a way to purge this client-side cache in an automated way when I update the binaries on the NFS server.
Server-side setup
As a good first step, update the server and grab the latest NFS utilities.
Add our domain name to idmapd. I want to export /opt on the NFS server to be available for all compute nodes in the cluster. To learn more about export options, check the man page.. Let's tell systemd to start our NFS server and start it at boot everytime. Add permissions through the firewall so that our clients can talk to this server. Let's add a test file to later verify it on the client.Performance notes:
Let's talk threads. If you're going for maximum utilisation of the server resources for this NFS server, you should tell NFS to start atleast as many daemon threads as many CPU cores available in the system. In my case, my server has 8 cores. The default # of threads with NFS on Centos 7 is 8. However, I expect this NFS server to concurrently serve about 100 compute nodes when they use software binaries from /opt. So, I could use the extra bit of concurrency.
It can be checked at
I modified it to 16 to have 16 NFS threads servicing clients. I will increase this number in future if real workloads show a need for it.There are other knobs to turn on both server and client to adapt to the kind of workload that is expected, but those are out of scope of this post. If you're running a rw workload over NFS and have a lot of time to spare to understand NFS caching and squeezing every last bit of performance out of your setup, the following references would help..
- https://helpful.knobs-dials.com/index.php/NFS_notes
- https://docstore.mik.ua/orelly/networking_2ndEd/nfs/ch07_04.htm
- https://www.avidandrew.com/understanding-nfs-caching.html
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/ch-nfs
- http://www.admin-magazine.com/HPC/Articles/Useful-NFS-Options-for-Tuning-and-Management
In my case, since my only focus is on servicing reads. I optimized my NFS server with bigger system read buffers so that the NFS daemon threads can service reads faster from memory. The following tweaks the configuration in a way that buffer sizes start from 256 KiB and can be increased to 512 KiB upon need.
echo "net.core.rmem_default = 262144" >> /etc/sysctl.conf
echo "net.core.rmem_max = 524288" >> /etc/sysctl.conf
sysctl -p
Client-side setup
As a good first step, update the system and grab the latest NFS utilities.
Add our domain name to idmapd. Enable rpcbind to let the NFS client to talk to the NFS server, Mount the filesystem for testing by Now, let's see if we are able to read the test file we created on the server. Great, things work as we expected. Let's make it permanent and happen at boot time. In simple environments, you can add an entry to /etc/fstab. However, I make use of autofs in my environment. In my specific case, the file /etc/autofs.centos tells the kernel where the mount locations are. The following does it for me.echo "/opt -noatime,nodiratime,ro opt.ghpc.au.dk:/opt" >> /etc/autofs.centos
systemctl reload autofs.service
> cat /proc/mounts|grep opt
/etc/auto.centos.ghpc /opt autofs rw,relatime,fd=19,pgrp=26422,timeout=300,minproto=5,maxproto=5,direct,pipe_ino=1272693 0 0
opt.ghpc.au.dk:/opt /opt nfs4 ro,noatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.16.2.30,local_lock=none,addr=172.16.2.37 0 0
However, it is not always true that NFSv4.1 is faster than NFSv3. Some enterprise vendors' implementation of the NFSv4.1 protocol are still immature, and they would recommend you to stay with NFSv3. Obey their recommendations unless you want a huge performance hit.
Conclusion
Without any surprises, we were able to setup a simple NFS client & server. The client setup can be replicated in many other compute nodes to share the file system.