OpenStack Juno Live-Migration never completes for instances with high load and size >64GB

Red Cricket asked:

I have run into situations where live migrations never seem to complete, yet never error out either.

Here is how I have been able to reproduce the problem.

Here is the instance I am migrating:

[root@osc1-mgmt-001 tmp]# nova show gb72-net-002-org-001
+--------------------------------------+---------------------------------------------------------------------+
| Property                             | Value                                                               |
+--------------------------------------+---------------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                              |
| OS-EXT-AZ:availability_zone          | nova                                                                |
| OS-EXT-SRV-ATTR:host                 | osc1-net-002.example.com                                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | osc1-net-002.example.com                                          |
| OS-EXT-SRV-ATTR:instance_name        | gb72-net-002-org-001                                                |
| OS-EXT-STS:power_state               | 1                                                                   |
| OS-EXT-STS:task_state                | migrating                                                           |
| OS-EXT-STS:vm_state                  | active                                                              |
| OS-SRV-USG:launched_at               | 2016-05-12T20:01:23.000000                                          |
| OS-SRV-USG:terminated_at             | -                                                                   |
| accessIPv4                           |                                                                     |
| accessIPv6                           |                                                                     |
| config_drive                         |                                                                     |
| created                              | 2016-05-12T20:00:58Z                                                |
| flavor                               | gb72_vm (668ca3b4-a7c0-4309-a11e-4fb5377e4180)                      |
| hostId                               | 44206a2390a038b0ede2a4375f1239b0cef917149bd5976fcada6781            |
| id                                   | 3b176ee2-fcf3-41a6-b658-361ffd19639e                                |
| image                                | CentOS-7-x86_64-GenericCloud (588e035d-2e1e-4720-94c4-8b000bf9d2ef) |
| key_name                             | nk                                                                  |
| metadata                             | {}                                                                  |
| name                                 | gb72-net-002-org-001                                                |
| os-extended-volumes:volumes_attached | [{"id": "16afe52c-31b0-4a3a-b718-aa1789df2852"}]                    |
| public-47 network                    | 10.29.105.13                                                        |
| security_groups                      | default                                                             |
| status                               | MIGRATING                                                           |
| tenant_id                            | 9d011b7c8d104af1b887e229cee436d2                                    |
| updated                              | 2016-05-13T17:07:48Z                                                |
| user_id                              | fa8b956c89304124967bb4bcea54124b                                    |
+--------------------------------------+---------------------------------------------------------------------+

The flavor gb72_vm is one I created and looks like this:

[root@osc1-mgmt-001 tmp]# nova flavor-show gb72_vm
+----------------------------+--------------------------------------+
| Property                   | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 20                                   |
| extra_specs                | {}                                   |
| id                         | 668ca3b4-a7c0-4309-a11e-4fb5377e4180 |
| name                       | gb72_vm                              |
| os-flavor-access:is_public | True                                 |
| ram                        | 72000                                |
| rxtx_factor                | 1.0                                  |
| swap                       | 16000                                |
| vcpus                      | 8                                    |
+----------------------------+--------------------------------------+

After I launched the instance I installed stress and I am running stress on the instance like so:

[centos@gb72-net-002-org-001 stress-1.0.4]$ stress -c 6 -m 4 --vm-bytes 512M

I am also running top on the instance and this is what that looks like:

top - 17:17:02 up 21:15,  1 user,  load average: 10.11, 10.08, 10.06
Tasks: 149 total,  12 running, 137 sleeping,   0 stopped,   0 zombie
%Cpu(s): 62.0 us, 38.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 72323392 total, 70503632 free,  1344768 used,   474988 buff/cache
KiB Swap: 16383996 total, 16383996 free,        0 used. 70740048 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
10273 centos    20   0    7260     96      0 R  86.7  0.0   1008:21 stress
10276 centos    20   0    7260     96      0 R  84.7  0.0   1008:22 stress
10271 centos    20   0    7260     96      0 R  84.1  0.0   1008:00 stress
10275 centos    20   0    7260     96      0 R  82.1  0.0   1009:28 stress
10270 centos    20   0  531552 218716    176 R  80.7  0.3   1011:42 stress
10272 centos    20   0  531552 142940    176 R  80.4  0.2   1012:40 stress
10269 centos    20   0    7260     96      0 R  78.7  0.0   1008:38 stress
10274 centos    20   0  531552 333404    176 R  73.1  0.5   1012:32 stress
10267 centos    20   0    7260     96      0 R  70.4  0.0   1008:41 stress
10268 centos    20   0  531552  38452    176 R  65.8  0.1   1011:29 stress
    1 root      20   0  191352   6652   3908 S   0.0  0.0   0:06.00 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.02 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:01.45 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H
    6 root      20   0       0      0      0 S   0.0  0.0   0:00.12 kworker/u16:0
    7 root      rt   0       0      0      0 S   0.0  0.0   0:00.62 migration/0
    8 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh
    9 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/0
   10 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/1
   11 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/2
   12 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/3
   13 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/4
   14 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/5
   15 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/6
   16 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcuob/7
   17 root      20   0       0      0      0 R   0.0  0.0   0:02.42 rcu_sched
   18 root      20   0       0      0      0 S   0.0  0.0   0:00.44 rcuos/0
   19 root      20   0       0      0      0 S   0.0  0.0   0:00.29 rcuos/1
   20 root      20   0       0      0      0 S   0.0  0.0   0:00.32 rcuos/2

I issued the command …

# nova live-migration gb72-net-002-org-001 osc6-net-001.example.com

… at May 12 20:10:41 GMT 2016. It is currently Fri May 13 17:13:46 GMT 2016
and the live migration is still going. It will complete successfully as soon
as I kill “stress” on the instance.

In production environments I have instances that are running hot for one reason or
another, and I would like to live migrate them without causing application outages
by killing off the applications that are causing the high load.

Is there some configuration item I can tweak or some virsh trick I can use to migrate
the instance without first reducing the load on the instance?

UPDATE: What version of Qemu do I have?

Thank you for an excellent answer, Michael. I am trying to figure out what version of qemu I have:

# rpm -qa | grep qemu
ipxe-roms-qemu-20130517-8.gitc4bce43.el7_2.1.noarch
libvirt-daemon-driver-qemu-1.2.17-13.el7_2.3.x86_64
qemu-img-rhev-2.1.2-23.el7_1.4.x86_64
qemu-kvm-common-rhev-2.1.2-23.el7_1.4.x86_64
qemu-kvm-rhev-2.1.2-23.el7_1.4.x86_64
[root@osc1-net-002 ~]# virsh -v
1.2.17

UPDATE II:

I just want to make sure I am issuing the virsh command correctly:

On the compute node where my VM lives, I can see that I have a suitable version of qemu:

[root@osc1-net-002 ~]# qemu-io --version
qemu-io version 2.1.2

Now I do a virsh list to get the instance name of the VM I want to live migrate like so:

[root@osc1-net-002 ~]# virsh list
 Id    Name                           State
----------------------------------------------------
 50    gb72-net-002-org-001           running

So based on that I would execute this command on my compute server, osc1-net-002, to throttle gb72-net-002-org-002:

[root@osc1-net-002 ~]# virsh qemu-monitor-command gb72-net-002-org-002 --hmp migrate_set_capability auto-converge on

Then I can attempt to perform my live migrations like so:

[root@osc1-mgmt-001 ~]# nova live-migration gb72-net-002-org-002 osc6-net-001.example.com

Is that the correct set of commands to issue?

UPDATE III:

Michael got back to me and verified that the virsh command looks alright. Thanks, Michael!

I have issued the live migration as I mentioned above, and I am seeing this in /etc/nova/nova-compute.log on osc1-net-002:

DEBUG nova.virt.libvirt.driver [-] [instance: bf616c8b-0054-47ee-a547-42c2a946be2e] Migration running for 2405 secs, memory 2% remaining; (bytes processed=2520487990540, remaining=1604055040, total=75515105280) _live_migration_monitor /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py:5721

Something I noticed is that the live migration has been running for 40 minutes. Also, the bytes processed (2525969442377) is far greater than the total (75515105280), which makes me think that if my VM is being throttled, it is not being throttled enough.
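As a rough check on those counters, here is a minimal Python sketch that parses the debug line quoted above and works out how many times the guest's memory has effectively been sent, plus the average transfer rate. The function name and regular expression are illustrative assumptions, not part of nova; only the numbers come from the log line.

```python
import re

# The progress counters from the nova-compute debug line quoted above.
LOG_LINE = (
    "Migration running for 2405 secs, memory 2% remaining; "
    "(bytes processed=2520487990540, remaining=1604055040, total=75515105280)"
)


def parse_migration_progress(line):
    """Extract the counters from a live-migration progress line (hypothetical helper)."""
    pattern = (r"running for (\d+) secs.*processed=(\d+), "
               r"remaining=(\d+), total=(\d+)")
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("not a migration progress line")
    secs, processed, remaining, total = (int(g) for g in match.groups())
    return {
        "secs": secs,
        "processed": processed,
        "remaining": remaining,
        "total": total,
        # How many times the full guest RAM has been transferred so far.
        "passes": processed / total,
        # Average transfer rate over the whole migration.
        "bytes_per_sec": processed / secs,
    }


progress = parse_migration_progress(LOG_LINE)
print("RAM sent %.1f times over, at about %.2f GB/s"
      % (progress["passes"], progress["bytes_per_sec"] / 1e9))
```

A pass count well above 1 means the same pages are being re-sent because the guest keeps dirtying them, which fits the behaviour described here.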

UPDATE IV:

I was able to successfully live migrate a VM that was experiencing heavy load. On the compute server I was migrating off of I executed:

[root@osc1-net-002 ~]# virsh qemu-monitor-command gb72-net-002-org-001 -hmp stop
error: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainMigratePerform3)
[root@osc1-net-002 ~]# virsh suspend gb72-net-002-org-001
Domain gb72-net-002-org-001 suspended

I am not sure why I am getting the error but it doesn’t seem to matter.

Now I checked to see if the live migration has completed:

[root@osc1-net-002 ~]# nova list
+--------------------------------------+----------------------+--------+------------+-------------+-----------------------+
| ID                                   | Name                 | Status | Task State | Power State | Networks              |
+--------------------------------------+----------------------+--------+------------+-------------+-----------------------+
| de335b04-8632-48e3-b17c-d80ac2d02983 | gb72-net-002-org-001 | ACTIVE | -          | Running     | public-47=10.29.105.9 |
| 229d8775-3a3c-46a6-8f40-7f86ca99af88 | test-net-001-org     | ACTIVE | -          | Running     | public-47=10.29.105.4 |
| 6d2ddad3-3851-4495-bf14-b787fed2ad99 | test-net-001-org-2   | ACTIVE | -          | Running     | public-47=10.29.105.7 |
+--------------------------------------+----------------------+--------+------------+-------------+-----------------------+
[root@osc1-net-002 ~]# nova show gb72-net-002-org-001
+--------------------------------------+---------------------------------------------------------------------+
| Property                             | Value                                                               |
+--------------------------------------+---------------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                              |
| OS-EXT-AZ:availability_zone          | nova                                                                |
| OS-EXT-SRV-ATTR:host                 | osc6-net-001.example.com                                          |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | osc6-net-001.example.com                                          |
| OS-EXT-SRV-ATTR:instance_name        | gb72-net-002-org-001                                                |
| OS-EXT-STS:power_state               | 1                                                                   |
| OS-EXT-STS:task_state                | -                                                                   |
| OS-EXT-STS:vm_state                  | active                                                              |
...

Suspending the VM did not seem to interfere with any of the processes running on it. Maybe I just didn’t look hard enough.

Then on the destination compute server, osc6-net-001.example.com, I executed these commands:

[root@osc6-net-001 ~]# virsh qemu-monitor-command --hmp gb72-net-002-org-001 cont


[root@osc6-net-001 ~]# virsh resume gb72-net-002-org-001
Domain gb72-net-002-org-001 resumed

My answer:


Live migrations that fail to complete because the VM is too busy are a recognized problem. Unfortunately, while qemu provides a solution, it is not exposed via the libvirt API, and is thus unavailable to OpenStack.

Qemu’s solution is called auto-convergence. This means that, if a VM is so busy that the migration is predicted to never complete, the execution of the VM will be throttled, so as to possibly allow the migration to finish.
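To illustrate why a busy guest can stall a pre-copy migration, and why throttling helps, here is a toy model in Python. It is only a sketch under stated assumptions: the bandwidth, dirty rates and 0.3 s downtime budget are made-up illustrative values, the 2 GB working set is loosely based on the four 512M stress workers in the question, and the function is not how qemu actually decides anything.

```python
def precopy_rounds(total_bytes, working_set, bandwidth, dirty_rate,
                   max_downtime=0.3, max_rounds=50):
    """Toy pre-copy model (hypothetical helper, not qemu's algorithm).

    Each round sends everything still outstanding; while it is being sent,
    the guest dirties more memory, capped at its working set. Returns the
    number of copy rounds before the final stop-and-copy becomes possible,
    or None if that point is never reached within max_rounds.
    """
    remaining = total_bytes
    for completed in range(max_rounds):
        # The final pause is acceptable once the rest fits in the downtime budget.
        if remaining / bandwidth <= max_downtime:
            return completed
        send_time = remaining / bandwidth
        remaining = min(working_set, dirty_rate * send_time)
    return None


# Assumed figures: 72 GB guest, 2 GB hot working set, 1 GB/s link.
busy = precopy_rounds(72e9, 2e9, bandwidth=1e9, dirty_rate=4e9)
throttled = precopy_rounds(72e9, 2e9, bandwidth=1e9, dirty_rate=0.5e9)
print("busy guest:", busy)
print("throttled guest:", throttled)
```

In this model the unthrottled guest re-dirties its whole working set during every round, so the outstanding data never shrinks below what can be sent during the allowed pause. Once the dirty rate drops below the link bandwidth, the outstanding data shrinks each round and the migration can finish.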

Auto-convergence is available from qemu 1.6 onward, which should be present in your OpenStack Juno installation. In this version, the amount of throttling is fixed. Since qemu 2.5 (which at this time is completely new, and you won’t have it yet) the throttling is dynamic: a busy VM can be throttled by anything up to 99%, but only by as much as is necessary to allow the migration to finish.

Because this monitor command isn’t exposed in the libvirt API, OpenStack can’t take advantage of it. For the moment you will have to apply auto-converge manually to a running VM. For example, log in as root to the compute node which is currently running the VM and execute:

virsh qemu-monitor-command instance-000007e1 --hmp migrate_set_capability auto-converge on

which should output nothing and return 0 if it succeeded. You can then begin the migration.
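Since the command prints nothing on success, it can be reassuring to read the setting back. The Python sketch below assumes the HMP query `info migrate_capabilities` is available in your qemu build and that its output consists of `name: on|off` pairs. The exact layout differs between qemu versions, so the sample string is an assumption and both helper functions are hypothetical.

```python
import re
import subprocess


def parse_capabilities(text):
    """Map capability name -> bool from 'info migrate_capabilities' output."""
    pairs = re.findall(r"([\w-]+):\s*(on|off)\b", text)
    return {name: state == "on" for name, state in pairs}


def auto_converge_enabled(domain):
    """Ask the qemu monitor, via virsh, whether auto-converge is on for a domain."""
    output = subprocess.check_output(
        ["virsh", "qemu-monitor-command", domain, "--hmp",
         "info migrate_capabilities"],
        universal_newlines=True,
    )
    return parse_capabilities(output).get("auto-converge", False)


# Assumed sample output; real output may be spread over several lines.
SAMPLE = ("capabilities: xbzrle: off rdma-pin-all: off "
          "auto-converge: on zero-blocks: off")
print(parse_capabilities(SAMPLE))
```

On the compute node this would be called as `auto_converge_enabled("instance-000007e1")`, run as root like the virsh command above.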

In qemu 2.5 you can tune the dynamic throttling with the monitor commands migrate_set_parameter x-cpu-throttle-initial ## and migrate_set_parameter x-cpu-throttle-increment ##, which set the initial throttling percentage and the increment of additional throttling applied if the migration still won’t complete, respectively.
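How the two parameters interact can be sketched as a simple escalation schedule. The Python below is illustrative only: the values 20 and 10 are assumed examples, the 99% ceiling is the one mentioned above, and qemu only moves to the next step while the migration is still failing to converge.

```python
def throttle_schedule(initial, increment, ceiling=99):
    """List the throttle percentages tried in order (hypothetical helper)."""
    if increment <= 0:
        raise ValueError("increment must be positive")
    steps = []
    level = initial
    while level < ceiling:
        steps.append(level)
        level += increment
    # The last step is clamped to the ceiling rather than overshooting it.
    steps.append(ceiling)
    return steps


# Assumed example values for x-cpu-throttle-initial and x-cpu-throttle-increment.
print(throttle_schedule(20, 10))
```

A larger increment reaches heavy throttling in fewer steps, at the cost of slowing the guest more than may have been strictly necessary.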

Hopefully these will eventually be added to the libvirt API so that a future version of OpenStack can manage this directly.


View the full question and answer on Server Fault.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.