Synopsis | |
---|---|
Begin | 2025-09-12 08:25:00 |
Affected | cluster nodes, lilos using /vol and /home |
After we moved the gateways of our networks from the old location to the new firewalls, we received complaints about NFS filesystems being slow, showing longer delays, or being unavailable.
In general, a cluster node job should not depend on NFS if you can avoid it, because NFS I/O is always much slower than local I/O on /scratch.
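One common way to keep a job off NFS is to copy the input to /scratch once, do all intermediate I/O there, and copy only the final results back. The following is a minimal sketch of that pattern, not a prescribed workflow: the function name `run_with_local_scratch`, the `work` callback and the `results` subdirectory are hypothetical names chosen for illustration, and it assumes /scratch is writable for your job.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_local_scratch(input_path, output_dir, work, scratch_root="/scratch"):
    """Stage input to local scratch, run `work` there, copy results back.

    `work` is a hypothetical callback taking (local_input, local_output_dir).
    """
    input_path = Path(input_path)
    output_dir = Path(output_dir)

    # Private working directory on the local disk of the node.
    workdir = Path(tempfile.mkdtemp(prefix="job-", dir=scratch_root))
    try:
        # Stage in: a single sequential read from NFS.
        local_input = workdir / input_path.name
        shutil.copy2(input_path, local_input)

        # All intermediate reads and writes stay on local disk.
        local_output = workdir / "results"
        local_output.mkdir()
        work(local_input, local_output)

        # Stage out: copy the result files back to NFS in one pass.
        output_dir.mkdir(parents=True, exist_ok=True)
        for item in local_output.iterdir():
            if item.is_file():
                shutil.copy2(item, output_dir / item.name)
    finally:
        # Always clean up, so /scratch does not fill up after failed jobs.
        shutil.rmtree(workdir, ignore_errors=True)
```

A job would then call something like `run_with_local_scratch("/home/alice/input.dat", "/vol/project/results", my_analysis)`, where the paths and `my_analysis` are placeholders for your own data and processing step.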
We are investigating how to optimise the network to resolve this issue, but we have not yet been able to determine the exact cause. Our working assumption is that performance degrades because more traffic now passes through our firewalls.
Note that the performance of NFS shares always varies, because the network is shared. We currently have no QoS in place for NFS/Samba, meaning that any one user can use all the bandwidth and unintentionally make life worse for all other users.