Discussion:
[libvirt-users] timeout on VM actions prone to hang
Nikola Ciprich
2018-11-07 12:46:29 UTC
Hi fellow libvirt users,

I'd like to ask whether anybody has dealt with a problem similar
to the one we're hitting. Some libvirt VM operations (e.g.
fs freeze) are prone to hanging for a long time when the guest
agent is in a bad state. My question is whether it's possible
to set a timeout for such operations, or whether we have to handle
it ourselves, e.g. with a separate thread and some timers. We're using
the Python libvirt bindings.

I'll appreciate any advice

BR

nik
--
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ***@linuxbox.cz
-------------------------------------
Michal Privoznik
2018-11-07 13:59:59 UTC
We explicitly chose not to have any timeouts because no one can know how
big the timeout should be: neither libvirt nor the management application.
What I am saying is that even if you set a timeout of X seconds, fs freeze
might still time out. And if Murphy's law holds, the freeze will
finish right after the timeout is reported. The problem is that the domain
is then in a different state than libvirt thinks.
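
To illustrate the point about state divergence, here is a minimal sketch of the "separate thread and some timers" approach from the question. It is not a libvirt feature: call_with_deadline and slow_freeze are hypothetical names, and slow_freeze is only a stand-in for a blocking call such as the freeze. Note that the timeout stops the caller from waiting, but the operation itself keeps running and may still complete afterwards.

```python
import concurrent.futures
import time


def call_with_deadline(fn, deadline_s, executor):
    """Run fn() on a worker thread and wait at most deadline_s seconds.

    Returns ("done", result) if fn finished in time, otherwise
    ("still-running", future). The worker is NOT cancelled on timeout.
    """
    future = executor.submit(fn)
    try:
        return "done", future.result(timeout=deadline_s)
    except concurrent.futures.TimeoutError:
        # The caller gives up waiting, but fn is still executing.
        return "still-running", future


def slow_freeze():
    """Hypothetical stand-in for a blocking call such as an fs freeze."""
    time.sleep(0.2)
    return 1  # pretend one filesystem was frozen


if __name__ == "__main__":
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        status, pending = call_with_deadline(slow_freeze, 0.05, pool)
        print(status)
        if status == "still-running":
            # The "timed out" operation completes anyway a moment later.
            print(pending.result())
```

A caller that treats "still-running" as a failure would believe the guest is not frozen, while the freeze may in fact succeed shortly afterwards.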

But specifically for qemu guest agent related issues, there is
virDomainQemuAgentCommand(), through which you can send 'guest-ping' to
check that the agent is responsive. If the ping fails, don't call the fs
freeze API; if it succeeds, go ahead.
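
With the Python bindings, this check could look roughly like the sketch below. It assumes the libvirt_qemu module that ships with libvirt-python, whose qemuAgentCommand(domain, cmd, timeout, flags) wraps virDomainQemuAgentCommand(). The helper names, the 5 second default and the injectable agent_command parameter (there only so the logic can be exercised without a running guest) are illustrative assumptions.

```python
import json

try:
    import libvirt_qemu  # part of the libvirt-python package
except ImportError:  # allows reading/testing the sketch without libvirt
    libvirt_qemu = None


def agent_is_responsive(dom, timeout_s=5, agent_command=None):
    """Return True if the guest agent answers 'guest-ping' in time."""
    if agent_command is None:
        agent_command = libvirt_qemu.qemuAgentCommand
    cmd = json.dumps({"execute": "guest-ping"})
    try:
        # The third argument is the agent timeout in seconds.
        reply = agent_command(dom, cmd, timeout_s, 0)
    except Exception:
        # With real bindings this would be libvirt.libvirtError.
        return False
    return "return" in json.loads(reply)


def freeze_if_agent_alive(dom, timeout_s=5, agent_command=None):
    """Call dom.fsFreeze() only if the agent answered the ping.

    Returns the fsFreeze() result, or None if the agent did not respond.
    The freeze itself remains a synchronous call without a timeout.
    """
    if not agent_is_responsive(dom, timeout_s, agent_command):
        return None
    return dom.fsFreeze()
```

With a real connection, dom would be a virDomain object, e.g. one obtained from conn.lookupByName(). A successful ping only shows that the agent was responsive at that moment; it does not guarantee that the following freeze will return quickly.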

Michal
Peter Krempa
2018-11-07 16:38:05 UTC
Well, internally libvirt already pings the guest agent before issuing
such an API call, but once the call has actually been issued, it is
synchronous.

If they weren't synchronous it would be impossible to figure out what
the actual state is without an elaborate event based infrastructure.

Libvirt's APIs are specifically designed to be synchronous (except those
that are not ... obviously - mostly device hotplug and block jobs).