OpenStack Ironic Troubleshooting – Neutron Port in Use


We ran into an issue with a deploy via RHEL OSP director, which caused our Heat stack compute scale-out to fail.

We corrected the issue and then attempted our deploy again. This time around we were able to scale out by several compute nodes; however, several of the new nodes failed to deploy properly.

After trudging through the logs for a bit, we found the error below in /var/log/nova-conductor.log:

2016-07-25 18:34:39.453 27374 ERROR nova.scheduler.utils [req-1caa3ca0-9e13-4340-b8b2-80b1cf2f8a7f 8408462bd5a8445c9742ea4dfbc20d70 cc44dc9e68064e64899697ac610c8f06 - - -] [instance: e44e3a04-a47e-4a8e-9eb5-0037c1175e4d] Error from last host: fatmin.lab.localdomain (node c5177fbf-ae0f-49db-94f4-087537b3dd53): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n filter_properties)\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2058, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance e44e3a04-a47e-4a8e-9eb5-0037c1175e4d was re-scheduled: Port 14:18:77:3e:1a:bf is still in use.\n']

To resolve the issue, we ran the following command, using the MAC address from the error above to narrow down the search.

# neutron port-list | grep "14:18:77:3e:1a:bf"
| 25a366df-1e6c-4eb8-853c-5b7db82637f0 | | 14:18:77:3e:1a:bf | {"subnet_id": "0310f210-63ad-4616-9338-d59ac13cc0be", "ip_address": "10.20.0.134"} |

Running neutron port-show against this port showed that it was DOWN, and its IP address was not responding to ping.
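For reference, that check looks something like this, using the port UUID from the listing above (the status field is the one to watch):

# neutron port-show 25a366df-1e6c-4eb8-853c-5b7db82637f0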

We then deleted the port via neutron:

# neutron port-delete 25a366df-1e6c-4eb8-853c-5b7db82637f0

We then re-ran our deploy and were able to scale without issue.

 

Managing RHEV VMs Via the Virsh CLI


Out of the box, you are not going to be able to run virsh commands on the CLI as root: on a RHEV host, libvirt is locked down with SASL authentication, so virsh will prompt for credentials that do not exist yet.

You can, however, follow the procedure below to create a username and password to use for authentication.

# saslpasswd2 -a libvirt fatmin
Password:
Again (for verification):
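If you want to confirm the user was created, you can list the users in libvirt's SASL database. The database path below is the usual default on RHEL-family hosts, though it may differ on your system:

# sasldblistusers2 -f /etc/libvirt/passwd.db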

Now run your virsh command, and enter the credentials you created above when prompted.

# virsh list --all
Please enter your authentication name: fatmin
Please enter your password:
 Id    Name                           State
----------------------------------------------------
 10    HostedEngine                   running

Now you can shut down or start a VM. Here I am shutting down my RHEV HostedEngine.

# virsh destroy HostedEngine
Please enter your authentication name: fatmin
Please enter your password:
Domain HostedEngine destroyed
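Keep in mind that virsh destroy is a hard power-off, not a graceful shutdown. To power the VM back on, virsh start works with the same credentials:

# virsh start HostedEngine

For the HostedEngine specifically, the HA tooling on the host is the more usual route, and note that the HA agent may restart the engine on its own unless the cluster is in maintenance mode:

# hosted-engine --vm-start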


RHEV: Remotely Connect to Hosted Engine Console via VNC


 

Honestly, this one is not hard to figure out, as it is documented in multiple places. However, I have found that the documentation varies greatly depending on whether you are using RHEV or oVirt, and the version of each that you are using seems to matter as well. At least, that has been my experience trying to figure out how to get this working.

So I figured I would document it here so that I would not have to try to remember which google result worked for me.

Note that this example is on RHEV 3.6.1.
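If you are not sure which host is currently running the HostedEngine, the hosted-engine tool can tell you. Run the following on any host in the hosted-engine cluster and look for the host reporting the engine VM as up:

# hosted-engine --vm-status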

First, you need to connect to the RHEV-H host that is running the HostedEngine. Then you need to set a console password. See the example below.

Note: This is a one-time password, and must be set each time you want to connect to the console.

# hosted-engine --add-console-password
Enter password:
code = 0
message = 'Done'

Now, from your remote machine (mine is Linux), run the following command, replacing the IP address below with the IP or hostname of your RHEV-H host.

$ remote-viewer vnc://10.1.0.112:5900

If everything is successful, a console window should pop up.


Note: I have run into several issues getting this to work in the past. I am not sure why, but if I hit any again, I will document them here.


RHEL 7 Two-Factor SSH Via Google Authenticator


In this post, I am going to walk you through the process of installing and configuring two-factor SSH authentication via Google Authenticator. My base system is running a fresh install of RHEL 7.2.

Installation Steps

The first step on my system was to install autoconf, automake, and libtool. These packages are required by the bootstrap.sh script that we will need to run in a couple more steps.

# yum -y install autoconf automake libtool

Now, we are going to install Git.

# yum -y install git

One more dependency to knock out. Install pam-devel as shown below.

# yum -y install pam-devel

Next, we clone the google-authenticator Git repo. In this example, I am cloning to /root.

# git clone https://github.com/google/google-authenticator.git
Cloning into 'google-authenticator'...
remote: Counting objects: 1435, done.
remote: Total 1435 (delta 0), reused 0 (delta 0), pack-reused 1435
Receiving objects: 100% (1435/1435), 2.32 MiB | 0 bytes/s, done.
Resolving deltas: 100% (758/758), done.

Now change directory as shown below and run bootstrap.sh.

# cd /root/google-authenticator/libpam

# ./bootstrap.sh

Now run the following commands to finalize the module installs.

# ./configure

# make

# make install

Assuming that you do not run into any errors, the following modules will be installed.

  • /usr/local/lib/security/pam_google_authenticator.so
  • /usr/local/lib/security/pam_google_authenticator.la
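With the module built, the remaining setup follows the usual pattern: generate a secret for your user, then wire the module into PAM and sshd. A minimal sketch, assuming default paths; note that the module installed under /usr/local/lib/security, so you may need to reference it by its full path:

# run as the user who will log in, and follow the prompts
$ google-authenticator

# /etc/pam.d/sshd -- add a line like:
#   auth required /usr/local/lib/security/pam_google_authenticator.so

# /etc/ssh/sshd_config -- enable challenge-response authentication:
#   ChallengeResponseAuthentication yes

# restart sshd to pick up the changes
# systemctl restart sshd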


NUMA Node to PCI Slot Mapping in Red Hat Enterprise Linux

 


Sandybridge I/O Controller to PCI-E Mapping 

 

Using a few simple commands, you can easily map a PCI slot back to its directly connected NUMA node. This information comes in very handy when implementing NFV technologies such as CPU pinning and SR-IOV.

 

First, you will need to install hwloc and hwloc-gui, if they are not already installed on your system. On RHEL, hwloc-gui provides the lstopo command, so you will need to install the GUI package even if you are going to run the command on a headless system.

# yum -y install hwloc.x86_64 hwloc-gui.x86_64
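As an aside, lstopo can also render the topology graphically; if you pass it an output filename, the format is inferred from the extension:

# lstopo topology.png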

Now you can run lstopo. Below is the output from one of my dual socket, quad core Xeon systems.

# lstopo
Machine (40GB)
  NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (8192KB)
    L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
      PU L#0 (P#0)
      PU L#1 (P#8)
    L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
      PU L#2 (P#1)
      PU L#3 (P#9)
    L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
      PU L#4 (P#2)
      PU L#5 (P#10)
    L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
      PU L#6 (P#3)
      PU L#7 (P#11)
  NUMANode L#1 (P#1 24GB) + Socket L#1 + L3 L#1 (8192KB)
    L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
      PU L#8 (P#4)
      PU L#9 (P#12)
    L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
      PU L#10 (P#5)
      PU L#11 (P#13)
    L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
      PU L#12 (P#6)
      PU L#13 (P#14)
    L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
      PU L#14 (P#7)
      PU L#15 (P#15)
  HostBridge L#0
    PCIBridge
      PCI 8086:10c9
        Net L#0 "enp8s0f0"
      PCI 8086:10c9
        Net L#1 "enp8s0f1"
    PCIBridge
      PCIBridge
        PCIBridge
          PCI 8086:10e8
            Net L#2 "enp5s0f0"
          PCI 8086:10e8
            Net L#3 "enp5s0f1"
        PCIBridge
          PCI 8086:10e8
            Net L#4 "enp4s0f0"
          PCI 8086:10e8
            Net L#5 "enp4s0f1"
    PCIBridge
      PCI 102b:0532
        GPU L#6 "card0"
        GPU L#7 "controlD64"
    PCI 8086:3a22
      Block L#8 "sr0"
      Block L#9 "sda"
      Block L#10 "sdb"
      Block L#11 "sdc"

The first 27 lines of output tell you which cores are in each socket.

Lines starting with "HostBridge L#0" list the PCI devices attached to socket 0. On more modern dual-socket systems (think Sandy Bridge), you would have a "HostBridge L#8" section as well.

 

“The PCI host bridge provides an interconnect between the processor and peripheral components. Through the PCI host bridge, the processor can directly access main memory independent of other PCI bus masters. For example, while the CPU is fetching data from the cache controller in the host bridge, other PCI devices can also access the system memory through the host bridge. The advantage of this architecture lies in its separation of the I/O bus from the processor’s host bus.”

 

Unfortunately, my lab systems are Nehalem-based machines, which use QPI (QuickPath Interconnect) to share a single I/O hub, and thus a single host bridge, between both CPU sockets. See the image below.

 


Nehalem QPI Architecture

 

Nonetheless, we are able to determine which CPU socket is associated with a specific PCI device. For this example, we will focus on the devices below, since they are attached directly to the PCI host bridge rather than further down the PCI bus.

 

HostBridge L#0
  PCIBridge
    PCI 8086:10c9
      Net L#0 "enp8s0f0"
    PCI 8086:10c9
      Net L#1 "enp8s0f1"

Now, using the lspci command, I can find the exact devices and their PCI addresses.

# lspci -nn | grep 8086:10c9
08:00.0 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
08:00.1 Ethernet controller [0200]: Intel Corporation 82576 Gigabit Network Connection [8086:10c9] (rev 01)
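As a cross-check, sysfs exposes the NUMA node of a PCI device directly. A value of -1 means the kernel could not determine the locality, which is what you will typically see on a single-I/O-hub Nehalem box like this one; on a Sandy Bridge or newer system you would see 0 or 1:

# cat /sys/bus/pci/devices/0000:08:00.0/numa_node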


Mapping Libvirt VM Names with OpenStack Instance Names


Within OpenStack, each virtual machine instance running on a Compute node is also a virtual machine running under libvirt on that node.

If you ssh to a Compute node and run the command below, you can get the names of each VM running or registered on that Compute node.

# virsh list --all
 Id    Name                           State
----------------------------------------------------
 2     instance-000000f0              running
 -     instance-00000024              shut off
 -     instance-00000039              shut off
 -     instance-000000ea              shut off

So there is only one VM currently running on this Compute node, but which VM is it?

Well, we can figure that out pretty easily. See below.

# virsh dumpxml instance-000000f0 | grep uuid | grep name

<entry name='uuid'>3103d38c-447d-40af-9607-56b26473ee72</entry>
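As a shortcut, virsh domuuid should return the same value, since Nova sets the libvirt domain UUID to the instance UUID:

# virsh domuuid instance-000000f0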

Now we just have to map this UUID back to an OpenStack instance name.

Here is a nasty little grep/awk one-liner to get the UUID and name of each OpenStack instance running in our cluster.

# nova list | grep -v "+" | grep -v ID | awk '{print $2 $3 $4}'

a402716b-73d0-4303-9331-202bc2386ab8|storage-perf-a
bd73f092-88c9-4af1-b569-c1176290841c|storage-perf-b
0ab8decb-6623-4e68-b8e4-b9cd522f6ea9|storage-perf-c
493c8afe-836e-4846-bdd0-029bb6e7f70e|storage-perf-d

Note that you can also get the instance name using the UUID; see below.

# nova show db79f6a2-455e-4f17-88d0-b3018d279c7c | grep instance_name
| OS-EXT-SRV-ATTR:instance_name | instance-0000002a |


Red Hat OpenStack Technical Preview Features


 

According to this page, a Technology Preview is a feature that is currently unsupported, may not have complete functionality, and is not suitable for deployment in production. However, Red Hat provides these features and makes them available to customers as a courtesy, with the primary goal of exposing the feature to a wider audience.

I am quite often asked by my clients which features are in Technology Preview for each release. Instead of spending time looking these things up each time, I figured I would document them here.