How to Manage Libvirt VMs via OpenStack Ironic

[Image: the OpenStack Ironic mascot, "Bear Metal"]

 

In this post I will document the steps that I use to create a fully virtualized OSP 13 environment in my lab. The undercloud node is a VM, as are the overcloud nodes. We will configure libvirt so that Ironic can boot and shut down the VMs on the underlying hypervisor.

Add the stack user on your hypervisor. In this case my hypervisor's hostname is virt01; however, we will refer to it as hypervisor for clarity.

hypervisor# useradd stack
hypervisor# echo "password" | passwd stack --stdin

Modify polkit to allow stack user to manage libvirt.

hypervisor # cat << EOF > /etc/polkit-1/localauthority/50-local.d/50-libvirt-user-stack.pkla
[libvirt Management Access]
Identity=unix-user:stack
Action=org.libvirt.unix.manage
ResultAny=yes
ResultInactive=yes
ResultActive=yes
EOF

Now attempt to connect to libvirt as stack via a remote session. Here we are just connecting back to virt01. In the example below, 10.1.99.112 is the IP of the hypervisor, and the undercloud has an IP of 10.1.99.10.

undercloud# virsh --connect qemu+ssh://stack@10.1.99.112/system list --all

Now SSH to your undercloud VM as the stack user.

Copy stack's public key to your hypervisor (virt01 in this case). In the command below, replace the IP address shown with the IP that your undercloud VM will use to connect to libvirt on the hypervisor.

undercloud# ssh-copy-id -i ~/.ssh/id_rsa.pub stack@10.1.99.112

Now we need to create a few virtual machines. Specifically, I am building an environment with five virtual machines to run virtualized Red Hat OpenStack 13. My overcloud will consist of two compute nodes and three controller nodes.

I will use the command below to create five qcow2 disk images.

hypervisor# cd /var/lib/libvirt/images/
hypervisor# for i in {1..5}; do qemu-img create -f qcow2 \
-o preallocation=metadata overcloud-node$i.qcow2 60G; done
Formatting 'overcloud-node1.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
Formatting 'overcloud-node2.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
Formatting 'overcloud-node3.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
Formatting 'overcloud-node4.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
Formatting 'overcloud-node5.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
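Note that the virt-install command in the next step assumes a libvirt network named provisioning already exists on the hypervisor, along with the br99 bridge. If you still need to create the provisioning network, something along these lines will do it. This is only a sketch: the bridge name provbr0 is a placeholder, and the network is deliberately isolated with no IP or DHCP, since the undercloud will handle PXE/DHCP on it.

hypervisor# cat << EOF > /tmp/provisioning.xml
<network>
  <name>provisioning</name>
  <!-- provbr0 is a placeholder bridge name; pick one that fits your environment -->
  <bridge name='provbr0'/>
</network>
EOF
hypervisor# virsh net-define /tmp/provisioning.xml
hypervisor# virsh net-autostart provisioning
hypervisor# virsh net-start provisioning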

The loop below will generate five XML files and use them to define my five VMs.

hypervisor# for i in {1..5}; do \
virt-install --ram 16384 --vcpus 4 --os-variant rhel7 \
--disk path=/var/lib/libvirt/images/overcloud-node$i.qcow2,device=disk,bus=virtio,format=qcow2 \
--noautoconsole --vnc --network network:provisioning --network bridge:br99 \
--network network:default --name overcloud-node$i \
--dry-run --print-xml > /tmp/overcloud-node$i.xml; \
virsh define --file /tmp/overcloud-node$i.xml; done

You should end up with the following virtual machines:

hypervisor# virsh list --all
 Id    Name                State
----------------------------------------------------
 1     undercloud          running
 --    overcloud-node1     shut off
 --    overcloud-node2     shut off
 --    overcloud-node3     shut off
 --    overcloud-node4     shut off
 --    overcloud-node5     shut off

Back on the undercloud, we use the command below to grab the provisioning network MAC address from each virtual machine running on the hypervisor. We could run this command locally on the hypervisor, but since we need the MAC addresses for Ironic on the undercloud, we will run it here.

undercloud$ for i in {1..5}; do \
virsh -c qemu+ssh://stack@192.168.122.1/system domiflist overcloud-node$i | awk '$3 == "provisioning" {print $5};'; \
done > /tmp/nodes.txt
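If everything worked, /tmp/nodes.txt contains one provisioning MAC address per node, in node order. The addresses below are made up and only illustrate the format:

undercloud$ cat /tmp/nodes.txt
52:54:00:3a:11:01
52:54:00:3a:11:02
52:54:00:3a:11:03
52:54:00:3a:11:04
52:54:00:3a:11:05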

Now we use the temp file above to populate the instackenv.json that we will import into Ironic. See the gist below.
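For reference, a minimal sketch of the pxe_ssh-style instackenv.json looks roughly like this: one entry per VM, the mac value comes from nodes.txt, and pm_password holds the private SSH key matching the public key we copied to the hypervisor. Every value shown here is a placeholder.

{
  "nodes": [
    {
      "name": "overcloud-node1",
      "pm_type": "pxe_ssh",
      "pm_addr": "10.1.99.112",
      "pm_user": "stack",
      "pm_password": "<contents of stack's private key>",
      "mac": ["52:54:00:3a:11:01"],
      "cpu": "4",
      "memory": "16384",
      "disk": "60",
      "arch": "x86_64"
    }
  ]
}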

At this point we are ready to import our nodes via Ironic.
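In OSP 13 the import and introspection typically look something like the commands below; this is a sketch assuming the default stackrc and file locations.

undercloud$ source ~/stackrc
undercloud$ openstack overcloud node import ~/instackenv.json
undercloud$ openstack overcloud node introspect --all-manageable --provide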

Note that I do not claim to be the original author of the steps documented above; rather, I wanted to ensure that I could easily consume these steps in the future.

Also, I look forward to experimenting with the vbmc ironic driver and might stop using pxe_ssh altogether.
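For reference, the basic virtualbmc workflow on the hypervisor is only a few commands; the port and credentials below are arbitrary placeholders, and Ironic would then reach each node over IPMI using the hypervisor's IP and that port.

hypervisor# vbmc add overcloud-node1 --port 6231 --username admin --password password
hypervisor# vbmc start overcloud-node1
hypervisor# vbmc list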

Cockpit for Centos and RHEL 7: Install and Configure

Introduction

I recently purchased three Dell servers and put myself to the task of building out a new lab. My old lab was in desperate need of updating, as I had long since passed the point where 48GB of memory per node was sufficient. The cost of memory, old or new, was not even close to in line with the cheap server-grade CPUs that are perfect for lab servers. Today you can buy a used E7540, a low-power Xeon with 12 logical cores (6 cores, HT enabled), for less than $30 (USD) from a reputable retailer. Cram two of these into an 11th-generation Dell and you are in business.

So, three new (to me) Dell rackmounts, deployed as virtualization servers, and I want a simple way to view performance stats in a nice, clean, single pane of glass. I am not in any way, shape, or form looking to build fancy dashboards or set up any sort of historical monitoring. I just want to know where the performance hot spots are when my environment seems to be running slowly.

I had installed Cockpit on a laptop or two before and thought it might fit the bill, especially since you can use one dashboard for multiple nodes.

So here we are going to deploy Cockpit on all three nodes; the steps are the same on each.

Prerequisites

First we must open a firewall port on each node.
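Cockpit listens on TCP port 9090, and recent firewalld builds ship a cockpit service definition, so on my nodes this amounts to something like the following (fall back to --add-port=9090/tcp if the service definition is missing on your build):

# firewall-cmd --permanent --add-service=cockpit
# firewall-cmd --reload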

Creating and Deleting OpenStack Pacemaker IP Addresses

You can use the steps below if you need to change managed IP resources, for example, if you need to re-IP your RHEL OSP Overcloud endpoints.

In this example, we are changing a managed VIP from one IP to another.
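If you are not sure of the exact resource name, grepping the full resource list for IPaddr2 will show every managed VIP. This is just a sketch of the output, trimmed to the resource we care about; ctrl01 is one of my controllers.

# pcs resource show | grep IPaddr2
 ip-99.239.203.25 (ocf::heartbeat:IPaddr2): Started ctrl01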

First, we get a good look at the resource that we want to delete. Here we are going to delete the resource ip-99.239.203.25. This resource starts the VIP, 99.239.203.25.

# pcs resource show ip-99.239.203.25
 Resource: ip-99.239.203.25 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=99.239.203.25 cidr_netmask=32
  Operations: start interval=0s timeout=20s (ip-99.239.203.25-start-interval-0s)
              stop interval=0s timeout=20s (ip-99.239.203.25-stop-interval-0s)
              monitor interval=10s timeout=20s (ip-99.239.203.25-monitor-interval-10s)

Now let’s actually delete it.

# pcs resource delete ip-99.239.203.25
Attempting to stop: ip-99.239.203.25…Stopped

Now let's create the replacement VIP.

# pcs resource create ip-99.239.203.10 ocf:heartbeat:IPaddr2 ip=99.239.203.10 cidr_netmask=32 op monitor interval=10s

Now, let’s take a good look at it.

# pcs resource show ip-99.239.203.10
 Resource: ip-99.239.203.10 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=99.239.203.10 cidr_netmask=32
  Operations: start interval=0s timeout=20s (ip-99.239.203.10-start-interval-0s)
              stop interval=0s timeout=20s (ip-99.239.203.10-stop-interval-0s)
              monitor interval=10s (ip-99.239.203.10-monitor-interval-10s)

Now we need to check to make sure that the VIP started on one of our OpenStack controllers.

# pcs status | grep 99.239.203.10
ip-99.239.203.10 (ocf::heartbeat:IPaddr2): Started ctrl01

For good measure, let’s make sure we can ping it.

# ping 99.239.203.10
PING 99.239.203.10 (99.239.203.10) 56(84) bytes of data.
64 bytes from 99.239.203.10: icmp_seq=1 ttl=64 time=0.781 ms
64 bytes from 99.239.203.10: icmp_seq=2 ttl=64 time=1.21 ms

Configuring ControlPlaneSubnetCidr in RHEL OSP 7.2

Background

In previous versions of RHEL OSP 7 the Control Plane/Provisioning network interface was assigned via DHCP and not managed via Heat. Starting in 7.2, this interface is now managed via Heat.

Sample Heat Template

Below is an example from /home/stack/templates/nic-configs/compute.yaml or /home/stack/templates/nic-configs/controller.yaml. In this example we are hard-coding the interface name; however, this is not required (although I recommend it).

resources:
  OsNetConfigImpl:
    type: OS::Heat::StructuredConfig
    properties:
      group: os-apply-config
      config:
        os_net_config:
          network_config:
            -
              type: interface
              name: em3
              use_dhcp: false
              addresses:
                -
                  ip_netmask:
                    list_join:
                      - '/'
                      - - {get_param: ControlPlaneIp}
                        - {get_param: ControlPlaneSubnetCidr}
              routes:
                -
                  ip_netmask: 169.254.169.254/32
                  next_hop: {get_param: EC2MetadataIp}

Note that this new configuration requires an additional parameter to be added to your top-level template, usually named network-environment.yaml.

ControlPlaneSubnetCidr: "23"

Stick this next to the "ControlPlaneIP" entry under "parameter_defaults":

  ControlPlaneSubnetCidr: "23"
  ControlPlaneDefaultRoute: 172.99.99.1
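In context, that chunk of network-environment.yaml ends up looking something like the snippet below. EC2MetadataIp (referenced in the nic-config above) should point at your undercloud's ctlplane IP; the value here is only an example.

parameter_defaults:
  ControlPlaneSubnetCidr: "23"
  ControlPlaneDefaultRoute: 172.99.99.1
  EC2MetadataIp: 172.99.99.1   # example value; use your undercloud ctlplane IP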

Note that if you forget to add this parameter, the CIDR for this network will default to "24", which may or may not be correct for your environment. So watch out.

RHEL 7 Two-Factor SSH Via Google Authenticator

In this post, I am going to walk you through the process of installing and configuring two-factor SSH authentication via Google Authenticator. My base system is running a fresh install of RHEL 7.2.

Installation Steps

The first step on my system was to install autoconf, automake, and libtool. These packages are required by the bootstrap.sh script that we will run in a couple more steps.

# yum -y install autoconf automake libtool

Now, we are going to install Git.

# yum -y install git

One more dependency to knock out. Install pam-devel as shown below.

# yum -y install pam-devel

Next, we clone the google-authenticator Git repo. In this example, I am cloning to /root

# git clone https://github.com/google/google-authenticator.git
Cloning into 'google-authenticator'...
remote: Counting objects: 1435, done.
remote: Total 1435 (delta 0), reused 0 (delta 0), pack-reused 1435
Receiving objects: 100% (1435/1435), 2.32 MiB | 0 bytes/s, done.
Resolving deltas: 100% (758/758), done.

Now change directory as shown below and run bootstrap.sh.

# cd /root/google-authenticator/libpam

# ./bootstrap.sh

Now run the following commands to finalize the module installs.

# ./configure

# make

# make install

Assuming that you do not run into any errors, the following modules will be installed.

  • /usr/local/lib/security/pam_google_authenticator.so
  • /usr/local/lib/security/pam_google_authenticator.la
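From here the remaining work is PAM and sshd configuration plus generating a secret for each user. As a rough sketch of the typical approach (note the full module path, since the module landed under /usr/local):

# echo "auth required /usr/local/lib/security/pam_google_authenticator.so" >> /etc/pam.d/sshd

Then set "ChallengeResponseAuthentication yes" in /etc/ssh/sshd_config, restart sshd (systemctl restart sshd), and run google-authenticator as each user who will be logging in to generate their secret. Keep a root session open while testing so you do not lock yourself out.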

NUMA Node to PCI Slot Mapping in Red Hat Enterprise Linux

 

[Image: Sandybridge I/O Controller to PCI-E Mapping]

Using a few simple commands, you can easily map a PCI slot back to its directly connected NUMA node. This information comes in very handy when implementing NFV-related technologies such as CPU pinning and SR-IOV.

 

First, you will need to install hwloc and hwloc-gui if they are not already installed on your system. hwloc-gui provides the lstopo command, so you will need to install the GUI package even if you are going to run the command on a headless system.

# yum -y install hwloc.x86_64 hwloc-gui.x86_64

Now you can run lstopo. Below is the output from one of my dual socket, quad core Xeon systems.

# lstopo
Machine (40GB)
NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (8192KB)
L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
PU L#0 (P#0)
PU L#1 (P#8)
L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
PU L#2 (P#1)
PU L#3 (P#9)
L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
PU L#4 (P#2)
PU L#5 (P#10)
L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
PU L#6 (P#3)
PU L#7 (P#11)
NUMANode L#1 (P#1 24GB) + Socket L#1 + L3 L#1 (8192KB)
L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
PU L#8 (P#4)
PU L#9 (P#12)
L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
PU L#10 (P#5)
PU L#11 (P#13)
L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
PU L#12 (P#6)
PU L#13 (P#14)
L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
PU L#14 (P#7)
PU L#15 (P#15)
HostBridge L#0
PCIBridge
PCI 8086:10c9
Net L#0 "enp8s0f0"
PCI 8086:10c9
Net L#1 "enp8s0f1"
PCIBridge
PCIBridge
PCIBridge
PCI 8086:10e8
Net L#2 "enp5s0f0"
PCI 8086:10e8
Net L#3 "enp5s0f1"
PCIBridge
PCI 8086:10e8
Net L#4 "enp4s0f0"
PCI 8086:10e8
Net L#5 "enp4s0f1"
PCIBridge
PCI 102b:0532
GPU L#6 "card0"
GPU L#7 "controlD64"
PCI 8086:3a22
Block L#8 "sr0"
Block L#9 "sda"
Block L#10 "sdb"
Block L#11 "sdc"

The first 27 lines of output tell you which cores are in each socket.

Lines starting with "HostBridge L#0" list the PCI devices attached to socket 0. On more modern dual-socket systems (think Sandybridge) you would have a "HostBridge L#8" section as well.

 

“The PCI host bridge provides an interconnect between the processor and peripheral components. Through the PCI host bridge, the processor can directly access main memory independent of other PCI bus masters. For example, while the CPU is fetching data from the cache controller in the host bridge, other PCI devices can also access the system memory through the host bridge. The advantage of this architecture lies in its separation of the I/O bus from the processor’s host bus.”

 

Unfortunately, my lab systems are Nehalem-based machines, which implement what is called QPI to share a host bridge between CPU sockets. See the image below.

 

[Image: Nehalem QPI Architecture]

 

Nonetheless, we are able to determine which CPU socket is associated with a specific PCI device. For this example, we will focus on the devices below since they are both directly attached to the PCI Host Bridge and not the PCI Bus.

 

HostBridge L#0
PCIBridge
PCI 8086:10c9
Net L#0 "enp8s0f0"
PCI 8086:10c9
Net L#1 "enp8s0f1"

Now using the lspci command I can find the exact devices per NUMA node.

# lspci -nn | grep 8086:10c9
08:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
08:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
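You can also ask sysfs directly for the NUMA locality of a given device, as in the sketch below. A value of 0 or 1 identifies the node, while -1 means the platform does not expose the locality, which is not unusual on these older, shared-IOH Nehalem boxes.

# cat /sys/bus/pci/devices/0000:08:00.0/numa_node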

 

 

 

 

Red Hat OpenStack Technical Preview Features


According to this page, a Technology Preview is a feature that is currently unsupported, may not have complete functionality, and is not suitable for deployment in production. However, Red Hat provides these features and makes them available to customers as a courtesy, with the primary goal of exposing the feature to a wider audience.

I am quite often asked by my clients which features are in tech preview for each release. Instead of spending time looking these things up each time, I figured I would document them here.