Discussion:
[libvirt-users] live migration via unix socket
Daniel P. Berrangé
2018-08-29 08:55:10 UTC
Hey,
Over in KubeVirt we're investigating a use case where we'd like to perform
a live migration within a network namespace that does not provide libvirtd
with network access. In this scenario we would like to perform a live
migration by proxying the migration through a unix socket to a process in
another network namespace that does have network access. That external
process would live on every node in the cluster and know how to correctly
route connections between libvirtds.
virsh example of an attempted migration via unix socket.
virsh migrate --copy-storage-all --p2p --live --xml domain.xml my-vm
qemu+unix:///system?socket=destination-host-proxy-sock
In this example, the src libvirtd is able to establish a connection to the
destination libvirtd via the unix socket proxy. However, the migration-uri
appears to require either tcp or rdma network connection. If I force the
migration-uri to be a unix socket, I receive an error [1] indicating that
qemu+unix is not a valid transport.
qemu+unix is a syntax for libvirt's URI format. The URI scheme for
migration is not the same, so you can't simply plug in qemu+unix here.
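To illustrate the difference (the addresses below are purely illustrative): the
destination argument of virsh migrate is a libvirt connection URI, while the
migration URI (--migrateuri) uses QEMU's own transport schemes such as tcp://
or rdma://, e.g.

virsh migrate --live --p2p my-vm qemu+unix:///system?socket=destination-host-proxy-sock \
  --migrateuri tcp://192.0.2.10:49152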
Technically with qemu+kvm I believe what we're attempting should be
possible (even though it is inefficient). Please correct me if I'm wrong.
Is there a way to achieve this migration-via-unix-socket functionality using
Libvirt? Also, is there a reason why the migration uri is limited to
tcp/rdma?
Internally libvirt does exactly this when using its TUNNELLED live migration
mode. In this mode QEMU is passed an anonymous UNIX socket and the data is all
copied over the libvirtd <-> libvirtd connection and then copied again back
to QEMU on another UNIX socket. This was done because QEMU has long had no
ability to encrypt live migration, so tunnelling over libvirtd's own
TLS-secured connection was the only secure mechanism.
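For reference, that mode is what virsh requests with the --tunnelled flag
(which requires --p2p); the destination URI here is just an illustrative
libvirtd <-> libvirtd connection:

virsh migrate --live --p2p --tunnelled my-vm qemu+ssh://dest-host/system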

We've done work in QEMU to natively support TLS now so that we can get rid
of this tunnelling, as this architecture decreased performance and consumed
precious CPU memory bandwidth, which is particularly bad when libvirtd and
QEMU were on different NUMA nodes. It is already a challenge to get live
migration to successfully complete even with a direct network connection.
Although QEMU can do it at the low level, we've never exposed anything
other than direct network transports at the API level.
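(For completeness: the QEMU-native TLS support mentioned above is exposed as
the VIR_MIGRATE_TLS flag, i.e. virsh migrate --tls, assuming the migration TLS
certificates are configured for QEMU on both hosts. A rough example:

virsh migrate --live --p2p --tls my-vm qemu+ssh://dest-host/system )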

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
David Vossel
2018-09-10 18:38:48 UTC
Hey,
Over in KubeVirt we're investigating a use case where we'd like to perform
a live migration within a network namespace that does not provide libvirtd
with network access. In this scenario we would like to perform a live
migration by proxying the migration through a unix socket to a process in
another network namespace that does have network access. That external
process would live on every node in the cluster and know how to correctly
route connections between libvirtds.
virsh example of an attempted migration via unix socket.
virsh migrate --copy-storage-all --p2p --live --xml domain.xml my-vm
qemu+unix:///system?socket=destination-host-proxy-sock
In this example, the src libvirtd is able to establish a connection to the
destination libvirtd via the unix socket proxy. However, the migration-uri
appears to require either tcp or rdma network connection. If I force the
migration-uri to be a unix socket, I receive an error [1] indicating that
qemu+unix is not a valid transport.
qemu+unix is a syntax for libvirt's URI format. The URI scheme for
migration is not the same, so you can't simply plug in qemu+unix here.
Technically with qemu+kvm I believe what we're attempting should be
possible (even though it is inefficient). Please correct me if I'm wrong.
Is there a way to achieve this migration-via-unix-socket functionality using
Libvirt? Also, is there a reason why the migration uri is limited to
tcp/rdma?
Internally libvirt does exactly this when using its TUNNELLED live migration
mode. In this mode QEMU is passed an anonymous UNIX socket and the data is all
copied over the libvirtd <-> libvirtd connection and then copied again back
Sorry for the delayed response here, I've only just picked this task back
up again recently.

With the TUNNELLED and PEER2PEER migration flags set, Libvirt won't allow
the libvirtd <-> libvirtd connection over a unix socket.

Libvirt returns this error "Attempt to migrate guest to the same host".
The virDomainMigrateCheckNotLocal() function ensures that a peer2peer
migration won't occur when the destination is a unix socket.

Is there any way around this? We'd like to tunnel the destination connection
through a unix socket. The other side of the unix socket is a network proxy
in a different network namespace which properly performs the remote
connection.
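For reference, this is roughly the virsh equivalent of that flag combination
(same proxy socket as in the original example), and it is what fails with the
error above:

virsh migrate --live --p2p --tunnelled --xml domain.xml my-vm \
  qemu+unix:///system?socket=destination-host-proxy-sock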
to QEMU on another UNIX socket. This was done because QEMU has long had no
ability to encrypt live migration, so tunnelling over libvirtd's own
TLS-secured connection was the only secure mechanism.
We've done work in QEMU to natively support TLS now so that we can get rid
of this tunnelling, as this architecture decreased performance and consumed
precious CPU memory bandwidth, which is particularly bad when libvirtd and
QEMU were on different NUMA nodes. It is already a challenge to get live
migration to successfully complete even with a direct network connection.
Although QEMU can do it at the low level, we've never exposed anything
other than direct network transports at the API level.
Regards,
Daniel
Martin Kletzander
2018-09-12 10:59:05 UTC
Post by David Vossel
Hey,
Over in KubeVirt we're investigating a use case where we'd like to perform
a live migration within a network namespace that does not provide libvirtd
with network access. In this scenario we would like to perform a live
migration by proxying the migration through a unix socket to a process in
another network namespace that does have network access. That external
process would live on every node in the cluster and know how to correctly
route connections between libvirtds.
virsh example of an attempted migration via unix socket.
virsh migrate --copy-storage-all --p2p --live --xml domain.xml my-vm
qemu+unix:///system?socket=destination-host-proxy-sock
In this example, the src libvirtd is able to establish a connection to the
destination libvirtd via the unix socket proxy. However, the migration-uri
appears to require either tcp or rdma network connection. If I force the
migration-uri to be a unix socket, I receive an error [1] indicating that
qemu+unix is not a valid transport.
qemu+unix is a syntax for libvirt's URI format. The URI scheme for
migration is not the same, so you can't simply plug in qemu+unix here.
Technically with qemu+kvm I believe what we're attempting should be
possible (even though it is inefficient). Please correct me if I'm wrong.
Is there a way to achieve this migration-via-unix-socket functionality using
Libvirt? Also, is there a reason why the migration uri is limited to
tcp/rdma?
Internally libvirt does exactly this when using its TUNNELLED live migration
mode. In this mode QEMU is passed an anonymous UNIX socket and the data is all
copied over the libvirtd <-> libvirtd connection and then copied again back
Sorry for the delayed response here, I've only just picked this task back
up again recently.
With the TUNNELLED and PEER2PEER migration flags set, Libvirt won't allow
the libvirtd <-> libvirtd connection over a unix socket.
Libvirt returns this error "Attempt to migrate guest to the same host".
The virDomainMigrateCheckNotLocal() function ensures that a peer2peer
migration won't occur when the destination is a unix socket.
Is there any way around this? We'd like to tunnel the destination connection
through a unix socket. The other side of the unix socket is a network proxy
in a different network namespace which properly performs the remote
connection.
IMHO that is there just for additional safety, since a check which serves the
same purpose is done again in a more sensible manner later on (checking that
the hostnames and UUIDs are different). Actually it's just an older check from
before the UUID and hostname were sent in the migration cookie, and it's been
there for quite some time.

IMHO that check can go. In the worst case we can skip that check
(!tempuri->server) if you ask for unsafe migration.

Also, just to try it out, you *might* be able to work around that check by using
something like unix://localhost.localdomain/path/to/unix.socket (basically
adding any hostname different than localhost there), but I might be wrong there.
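E.g. something like the following, completely untested, and the exact URI form
is just my guess based on the example earlier in the thread:

virsh migrate --live --p2p --xml domain.xml my-vm \
  "qemu+unix://localhost.localdomain/system?socket=destination-host-proxy-sock"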
Post by David Vossel
to QEMU on another UNIX socket. This was done because QEMU has long had no
ability to encrypt live migration, so tunnelling over libvirtd's own
TLS-secured connection was the only secure mechanism.
We've done work in QEMU to natively support TLS now so that we can get rid
of this tunnelling, as this architecture decreased performance and consumed
precious CPU memory bandwidth, which is particularly bad when libvirtd and
QEMU were on different NUMA nodes. It is already a challenge to get live
migration to successfully complete even with a direct network connection.
Although QEMU can do it at the low level, we've never exposed anything
other than direct network transports at the API level.
Regards,
Daniel
David Vossel
2018-09-14 16:55:20 UTC
Post by Martin Kletzander
Post by David Vossel
Hey,
Over in KubeVirt we're investigating a use case where we'd like to perform
a live migration within a network namespace that does not provide libvirtd
with network access. In this scenario we would like to perform a live
migration by proxying the migration through a unix socket to a process in
another network namespace that does have network access. That external
process would live on every node in the cluster and know how to correctly
route connections between libvirtds.
virsh example of an attempted migration via unix socket.
virsh migrate --copy-storage-all --p2p --live --xml domain.xml my-vm
qemu+unix:///system?socket=destination-host-proxy-sock
In this example, the src libvirtd is able to establish a connection to the
destination libvirtd via the unix socket proxy. However, the migration-uri
appears to require either tcp or rdma network connection. If I force the
migration-uri to be a unix socket, I receive an error [1] indicating that
qemu+unix is not a valid transport.
qemu+unix is a syntax for libvirt's URI format. The URI scheme for
migration is not the same, so you can't simply plug in qemu+unix here.
Technically with qemu+kvm I believe what we're attempting should be
possible (even though it is inefficient). Please correct me if I'm wrong.
Is there a way to achieve this migration-via-unix-socket functionality using
Libvirt? Also, is there a reason why the migration uri is limited to
tcp/rdma?
Internally libvirt does exactly this when using its TUNNELLED live migration
mode. In this mode QEMU is passed an anonymous UNIX socket and the data is all
copied over the libvirtd <-> libvirtd connection and then copied again back
Sorry for the delayed response here, I've only just picked this task back
up again recently.
With the TUNNELLED and PEER2PEER migration flags set, Libvirt won't allow
the libvirtd <-> libvirtd connection over a unix socket.
Libvirt returns this error "Attempt to migrate guest to the same host".
The virDomainMigrateCheckNotLocal() function ensures that a peer2peer
migration won't occur when the destination is a unix socket.
Is there any way around this? We'd like to tunnel the destination connection
through a unix socket. The other side of the unix socket is a network proxy
in a different network namespace which properly performs the remote
connection.
IMHO that is there just for additional safety, since a check which serves the
same purpose is done again in a more sensible manner later on (checking that
the hostnames and UUIDs are different). Actually it's just an older check from
before the UUID and hostname were sent in the migration cookie, and it's been
there for quite some time.
IMHO that check can go. In the worst case we can skip that check
(!tempuri->server) if you ask for unsafe migration.
Also, just to try it out, you *might* be able to work around that check by using
something like unix://localhost.localdomain/path/to/unix.socket (basically
adding any hostname different than localhost there), but I might be wrong there.
I tried a few variations of this and none of them worked :(

Any chance we can get the safety check removed for the next Libvirt
release? Does there need to be an issue opened to track this?
Post by Martin Kletzander
Post by David Vossel
to QEMU on another UNIX socket. This was done because QEMU has long had no
ability to encrypt live migration, so tunnelling over libvirtd's own
TLS-secured connection was the only secure mechanism.
We've done work in QEMU to natively support TLS now so that we can get rid
of this tunnelling, as this architecture decreased performance and consumed
precious CPU memory bandwidth, which is particularly bad when libvirtd and
QEMU were on different NUMA nodes. It is already a challenge to get live
migration to successfully complete even with a direct network connection.
Although QEMU can do it at the low level, we've never exposed anything
other than direct network transports at the API level.
Regards,
Daniel
Fabian Deutsch
2018-09-17 12:17:39 UTC
Post by David Vossel
Post by Martin Kletzander
Post by David Vossel
Hey,
Over in KubeVirt we're investigating a use case where we'd like to perform
a live migration within a network namespace that does not provide libvirtd
with network access. In this scenario we would like to perform a live
migration by proxying the migration through a unix socket to a process in
another network namespace that does have network access. That external
process would live on every node in the cluster and know how to correctly
route connections between libvirtds.
virsh example of an attempted migration via unix socket.
virsh migrate --copy-storage-all --p2p --live --xml domain.xml my-vm
qemu+unix:///system?socket=destination-host-proxy-sock
In this example, the src libvirtd is able to establish a connection to the
destination libvirtd via the unix socket proxy. However, the migration-uri
appears to require either tcp or rdma network connection. If I force the
migration-uri to be a unix socket, I receive an error [1] indicating that
qemu+unix is not a valid transport.
qemu+unix is a syntax for libvirt's URI format. The URI scheme for
migration is not the same, so you can't simply plug in qemu+unix here.
Technically with qemu+kvm I believe what we're attempting should be
possible (even though it is inefficient). Please correct me if I'm wrong.
Is there a way to achieve this migration-via-unix-socket functionality using
Libvirt? Also, is there a reason why the migration uri is limited to
tcp/rdma?
Internally libvirt does exactly this when using its TUNNELLED live migration
mode. In this mode QEMU is passed an anonymous UNIX socket and the data is all
copied over the libvirtd <-> libvirtd connection and then copied again back
Sorry for the delayed response here, I've only just picked this task back
up again recently.
With the TUNNELLED and PEER2PEER migration flags set, Libvirt won't allow
the libvirtd <-> libvirtd connection over a unix socket.
Libvirt returns this error "Attempt to migrate guest to the same host".
The virDomainMigrateCheckNotLocal() function ensures that a peer2peer
migration won't occur when the destination is a unix socket.
Is there any way around this? We'd like to tunnel the destination connection
through a unix socket. The other side of the unix socket is a network proxy
in a different network namespace which properly performs the remote
connection.
IMHO that is there just for additional safety, since a check which serves the
same purpose is done again in a more sensible manner later on (checking that
the hostnames and UUIDs are different). Actually it's just an older check from
before the UUID and hostname were sent in the migration cookie, and it's been
there for quite some time.
IMHO that check can go. In the worst case we can skip that check
(!tempuri->server) if you ask for unsafe migration.
Also, just to try it out, you *might* be able to work around that check by using
something like unix://localhost.localdomain/path/to/unix.socket (basically
adding any hostname different than localhost there), but I might be wrong there.
I tried a few variations of this and none of them worked :(
Any chance we can get the safety check removed for the next Libvirt
release? Does there need to be an issue opened to track this?
Regardless of Martin's answer :): Please file one.
Please file an RFE requesting the change and stating the motivation.

- fabian
Post by David Vossel
Post by Martin Kletzander
Post by David Vossel
to QEMU on another UNIX socket. This was done because QEMU has long had no
ability to encrypt live migration, so tunnelling over libvirtd's own
TLS-secured connection was the only secure mechanism.
We've done work in QEMU to natively support TLS now so that we can get rid
of this tunnelling, as this architecture decreased performance and consumed
precious CPU memory bandwidth, which is particularly bad when libvirtd and
QEMU were on different NUMA nodes. It is already a challenge to get live
migration to successfully complete even with a direct network connection.
Although QEMU can do it at the low level, we've never exposed anything
other than direct network transports at the API level.
Regards,
Daniel
Martin Kletzander
2018-10-12 08:50:53 UTC
Post by Fabian Deutsch
Post by David Vossel
Any chance we can get the safety check removed for the next Libvirt
release? Does there need to be an issue opened to track this?
Regardless of Martin's answer :): Please file one.
Please file an RFE requesting the change and stating the motivation.
Is there any BZ or issue created where I could post an update? I spent some
time with this and I got stuck at what looks like the daemon not having a remote
driver instantiated "at some times". Either it's something very peculiar or I'm
missing something.
David Vossel
2018-10-12 17:50:37 UTC
Post by Martin Kletzander
Post by Fabian Deutsch
Post by David Vossel
Any chance we can get the safety check removed for the next Libvirt
release? Does there need to be an issue opened to track this?
Regardless of Martin's answer :): Please file one.
Please file an RFE requesting the change and stating the motivation.
Is there any BZ or issue created where I could post an update? I spent some
time with this and I got stuck at what looks like the daemon not having a remote
driver instantiated "at some times". Either it's something very peculiar or I'm
missing something.
Hey, I've created a few BZ's for issues we encountered attempting to
introduce live migrations into KubeVirt.

Unable to migrate between two libvirt environments with the same hostname
https://bugzilla.redhat.com/show_bug.cgi?id=1638882

Unable to perform tunnelled migration to destination libvirt over unix
socket
https://bugzilla.redhat.com/show_bug.cgi?id=1638889

Libvirt Lifecycle events not firing after migration.
https://bugzilla.redhat.com/show_bug.cgi?id=1638894
