I have to "fudge" a change in /etc/exports in order to get mount -a to execute, what's the issue?

Corepuncher asked:

Whenever I have to reboot server1, the only way I can get NFS mounts back up on server2 is to change one of the “fsid” integers in the /etc/exports file on server1. Otherwise, the mount -a command just hangs on server2.

Typical Scenario:

Server1 is rebooted. On server1, I have two lines in /etc/exports:

/mnt/ramdisk/dir1 *(fsid=0,rw,no_root_squash,no_subtree_check,async)
/mnt/ramdisk/dir2 *(fsid=1,rw,no_root_squash,no_subtree_check,async)

I issue this command:

"exportfs -r".

On server2, I have this in /etc/fstab:

xxx.xxx.x.x:/server1_dir1/ /dir1_server2 nfs async,noatime 0 0
xxx.xxx.x.x:/server2_dir2  /dir2_server2 nfs async,noatime 0 0

I first “umount” the old dirs that now have stale NFS handles. Then,

"mount -a"

The command hangs. After I kill it, df shows dir1 mounted, but not dir2.

The only way to get both dirs to mount is to change the fsid integer to something else. For example, on server1 we now have:

/mnt/ramdisk/dir1 *(fsid=0,rw,no_root_squash,no_subtree_check,async)
/mnt/ramdisk/dir2 *(fsid=2,rw,no_root_squash,no_subtree_check,async)

I changed fsid=1 to =2. I again issue the exportfs -r command, and voila, the mount -a command works on server2.

Perhaps I do not understand what fsid really does, but obviously there must be a better way to “remount” NFS than having to randomly edit the fsid number every time?

EDIT: If I do not have fsid included in my exports file on server1, it gives me

"Warning: /mnt/ramdisk/dir1 requires fsid= for NFS export" 

And if I set fsid=0 for both lines (dir1 and dir2), then both mount points end up being the same: all my files were being copied to the dir1 location! So it seems the only way for this to work is to constantly switch fsid integers, somewhat randomly.

EDIT2: I removed fsid=0 since it is “special”, and changed them to fsid=1 and fsid=2 in /etc/exports on server1. This of course worked (since the file was changed). But today I had to reboot forcibly, and after (slowly) unmounting the stale drives from server2, mount -a failed, as before. So, as before, I edited the exports file on server1, this time to fsid=2 and fsid=3, ran exportfs -r, and voila, mount -a works again on server2. Back to square one.

EDIT3 (critical info): If I take everything down in a controlled manner (i.e., server1 does not “crash”), and first unmount the dirs on server2, then reboot server1, THEN mount -a on server2, it works great. It’s only when the mounts on server2 are abruptly cut off that this problem occurs. So I’m guessing something needs to be reset on server2? I know it takes a long time to unmount the stale handles on server2 after server1 crashes.

My answer:

My guess would be that the problem is caused by the use of fsid=0 in one of your exports.

Remember that the fsid is meant to uniquely identify devices when the underlying filesystem driver doesn’t provide its own unique IDs. And in particular, fsid=0 has a special meaning:

For NFSv4, there is a distinguished filesystem which is the root of all exported filesystem. This is specified with fsid=root or fsid=0 both of which mean exactly the same thing.

Since this clearly isn’t what you want, always use an fsid other than 0.
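
As a rough way to spot this in an exports file, the sketch below flags any export that uses fsid=0 or fsid=root, and any fsid value that appears on more than one export. The function name check_fsids is made up for illustration, and it only does plain text matching on the file; it does not ask the NFS server what it is actually exporting.

```shell
#!/bin/sh
# check_fsids FILE (hypothetical helper, not part of nfs-utils):
# report exports that use the special fsid=0/fsid=root, and fsid
# values that are used by more than one export line.
check_fsids() {
    exports_file="${1:-/etc/exports}"
    awk '
        /^[ \t]*#/ { next }                 # skip comment lines
        {
            line = $0
            # walk over every fsid=VALUE option found on this line
            while (match(line, /fsid=[^,) \t]+/)) {
                # "fsid=" is 5 characters long, so the value starts after it
                value = substr(line, RSTART + 5, RLENGTH - 5)
                if (value == "0" || value == "root")
                    printf "special fsid=%s: %s\n", value, $1
                else if (value in seen)
                    printf "duplicate fsid=%s: %s and %s\n", value, seen[value], $1
                else
                    seen[value] = $1        # remember which export used it
                line = substr(line, RSTART + RLENGTH)
            }
        }
    ' "$exports_file"
}
```

Running check_fsids /etc/exports on server1 prints nothing when every export has its own nonzero fsid; any line it does print names the export path worth a second look.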

View the full question and answer on Server Fault.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.