Discussion:
[libvirt-users] Breaking a virtlockd lock?
Steve Gaarder
2018-07-03 14:20:29 UTC
I have several Qemu/kvm servers running VMs hosted on an NFS share, and am
using virtlockd. (lock_manager = "lockd" in qemu.conf) After a power
failure, one of the VMs will not start, claiming that it is locked. How do
I get out of this?

thanks,

Steve Gaarder
System Administrator, Dept of Mathematics
Cornell University, Ithaca, NY, USA
***@math.cornell.edu
Daniel P. Berrangé
2018-07-03 20:09:01 UTC
Post by Steve Gaarder
I have several Qemu/kvm servers running VMs hosted on an NFS share, and am
using virtlockd. (lock_manager = "lockd" in qemu.conf) After a power
failure, one of the VMs will not start, claiming that it is locked. How do I
get out of this?
Libvirt uses fcntl() for locking disk images. In NFS v2 and v3, locking is
a side band protocol and when an NFS client host dies while holding locks,
the server will not release them. When the host comes back online it tells
the server to flush all locks it previously held. The problems obviously
arise if your dead host doesn't come back online, as nothing will release
the locks and so other hosts won't be able to lock the VM.
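
To make the fcntl() behaviour concrete, here is a minimal Python sketch of how a lock held by one process blocks another. It runs on a local temporary file, not on NFS, and it is not how virtlockd itself is implemented. The helper names (try_exclusive_lock, child_can_lock, demo) are made up for illustration.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(fd):
    """Try to take a non-blocking exclusive fcntl()-style lock on an open fd."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except OSError:
        # Another process already holds a conflicting lock.
        return False


def child_can_lock(path):
    """Fork a child that tries to lock `path`; True if the child got the lock.

    fcntl() locks belong to a process, so the conflict only shows up when a
    *different* process makes the attempt.
    """
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            fd = os.open(path, os.O_RDWR)
            status = 0 if try_exclusive_lock(fd) else 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 0


def demo():
    """Return (child result while parent holds the lock, result after unlock)."""
    with tempfile.NamedTemporaryFile() as f:
        assert try_exclusive_lock(f.fileno())
        while_held = child_can_lock(f.name)
        fcntl.lockf(f.fileno(), fcntl.LOCK_UN)
        after_release = child_can_lock(f.name)
    return while_held, after_release
```

On a local filesystem the kernel drops the lock as soon as the holder exits or closes the file. The NFS v2/v3 problem described above is that the server has no such signal when the whole client host disappears.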

In NFS v4 the situation is much improved, as locking is part of the main
protocol implemented as continually renewed leases. Thus when a client host
dies, it is possible for the server to timeout any locks it held without
waiting for the host to come back online.

My best recommendation would thus be to use NFS v4. Note that there's still
a 60 second timeout IIRC by default before the server releases the dead
client's locks.
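
If you are unsure which NFS version a host has negotiated for the share, one rough way to check is to read the mount options from /proc/mounts. The Python sketch below assumes the Linux format, where the version shows up as a vers= (or nfsvers=) option. The function names and the sample mount line in the usage note are hypothetical.

```python
def nfs_mount_versions(mounts_text):
    """Map each NFS mount point to its 'vers=' option value (or None).

    `mounts_text` is expected to look like the contents of /proc/mounts:
    device, mount point, fs type, comma-separated options, then two numbers.
    """
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        mountpoint = fields[1]
        version = None
        for opt in fields[3].split(","):
            key, _, value = opt.partition("=")
            if key in ("vers", "nfsvers"):
                version = value
        versions[mountpoint] = version
    return versions


def uses_lease_based_locking(version):
    """True when the version string is NFS v4.x, where locks are leases."""
    return version is not None and version.split(".")[0] == "4"
```

Usage would be something like nfs_mount_versions(open("/proc/mounts").read()), then passing the version for the mount point holding your disk images to uses_lease_based_locking().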

Take a read of "man 5 nfs" if you want to learn more - see the section
headings

"Using file locks with NFS"

and

"NFS version 4 Leases"


Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|