So this one is pretty simple. However, I found a lot of misinformation along the way, so I figured I would jot down the proper (and simplest) process here.
Symptoms: a RHEL (or variant) VM that takes a very long time to boot. On the VM console, you can see the following output while the boot process is stalled waiting on a timeout. Note that the message below has nothing to do with cloud-init, but it's the output that I have most often seen on the console while waiting for a VM to boot.
[  106.325574] random: crng init done
Note that I have run into this issue both in OpenStack (when booting from external provider networks) and in KVM.
Upon initial boot of the VM, run the command below.
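A minimal sketch, assuming the fix here is cloud-init's documented kill-switch file (supported in recent cloud-init releases; check the docs for your version):

# touch /etc/cloud/cloud-init.disabled
(creating this file tells cloud-init to skip itself entirely on the next boot)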
Seriously, that’s it. No need to disable or remove cloud-init services. See reference.
Now we need to resize the underlying filesystems using “virt-resize“. Note, however, that “virt-resize” CANNOT resize disk images in-place. So we need to make a backup copy, use the backup copy of the qcow as input, and use the original qcow as output. See example below.
First, we make a backup copy of the disk as shown below.
# cp undercloud.qcow2 undercloud-orig.qcow2
Then we run the command below to grow /dev/sda.
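The grow is really two steps. Here is a sketch, assuming we are adding 20G and expanding the first partition (both values are examples; adjust them to your layout): first grow the original qcow, then let “virt-resize” copy from the backup into it while expanding the partition.

# qemu-img resize undercloud.qcow2 +20G
# virt-resize --expand /dev/sda1 undercloud-orig.qcow2 undercloud.qcow2
(the backup copy is the input, and the freshly grown original is the output)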
NOTE: In this example /dev/sda1 is not the /boot partition, so be careful that you are growing the correct partition on your qcow.
According to Wikipedia, NUMA (Non-Uniform Memory Access) is “a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can access its own local memory faster than non-local memory (memory local to another processor or memory shared between processors). The benefits of NUMA are limited to particular workloads, notably on servers where the data are often associated strongly with certain tasks or users.”
So what does this mean for virtual machine optimization under KVM/libvirt? It means that for best performance, you want to configure your multi-vCPU VMs to use only cores from the same physical CPU (or NUMA node).
So how do we do this? See the example below from one of my homelab servers. This machine has two hyperthreaded quad-core Xeons (X5550), for a total of 8 physical cores and 16 logical CPUs.
First we use the “lscpu” command to determine which CPU cores are tied to which physical CPU. The relevant lines are the two NUMA node entries at the bottom of the output below.
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
On-line CPU(s) list: 0-15
Thread(s) per core: 2
Core(s) per socket: 4
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model name: Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
CPU MHz: 2668.000
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-3,8-11
NUMA node1 CPU(s): 4-7,12-15
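To pin a guest to node0's cores (0-3 and 8-11 on this box), the pinning goes into the guest XML via “virsh edit”. The snippet below is a minimal sketch, assuming a 4-vCPU test guest named “mytestvm”; the cpuset and nodeset values come straight from the lscpu output above.

<vcpu placement='static'>4</vcpu>
<cputune>
  <!-- cpuset values assume the NUMA node0 core list shown above -->
  <vcpupin vcpu='0' cpuset='0'/>
  <vcpupin vcpu='1' cpuset='1'/>
  <vcpupin vcpu='2' cpuset='2'/>
  <vcpupin vcpu='3' cpuset='3'/>
</cputune>
<numatune>
  <!-- keep guest memory on the same node as its vCPUs -->
  <memory mode='strict' nodeset='0'/>
</numatune>

The numatune element keeps the guest's memory on the same node as its vCPUs, which is the whole point of the exercise.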
Using the “virsh” command, we can then inspect the CPU pinning for “mytestvm“.
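If the guest is pinned as in the XML sketch above, the output should look something like this (the exact table format varies a bit between libvirt versions):

# virsh vcpupin mytestvm
VCPU: CPU Affinity
----------------------------------
   0: 0
   1: 1
   2: 2
   3: 3

The same command can also re-pin a vCPU on the fly, for example:

# virsh vcpupin mytestvm 0 8
(this re-pins vcpu 0 to host core 8; pick a core from the NUMA node you are targeting)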