OpenStack Staging-Ovirt Driver: global name 'sdk' is not defined


Getting Started

The staging-ovirt driver allows OpenStack to easily use oVirt/RHV virtual machines as overcloud nodes. For those of us running virtualized OpenStack labs, it’s a huge step forward, as we previously had to hack our way around with pxe_ssh or vbmc. Neither was a great solution.

To use the staging-ovirt driver, I first needed to enable it on the undercloud. See undercloud.conf below.


[DEFAULT]
local_ip = 10.1.98.2/24
undercloud_public_vip = 10.1.98.3
undercloud_admin_vip = 10.1.98.4
local_interface = eth1
masquerade_network = 10.1.98.0/24
dhcp_start = 10.1.98.100
dhcp_end = 10.1.98.120
network_cidr = 10.1.98.0/24
network_gateway = 10.1.98.2
inspection_iprange = 10.1.98.130,10.1.98.150
inspection_runbench = false
undercloud_debug = false
store_events = false
enabled_hardware_types = staging-ovirt
inspection_enable_uefi = false


Then create an instackenv.json.  In the example below pm_addr is the IP of my local RHV manager.


{
  "nodes": [
    {
      "arch": "x86_64",
      "cpu": "1",
      "disk": "10",
      "mac": [
        "00:1a:4a:16:01:5a"
      ],
      "memory": "1024",
      "name": "ospd13-ctrl01",
      "pm_addr": "10.1.99.10",
      "pm_password": "redhat",
      "pm_type": "staging-ovirt",
      "pm_user": "admin@internal",
      "pm_vm_name": "ospd13-ctrl01",
      "capabilities": "profile:control,boot_option:local"
    }
  ]
}


You should then be able to import your nodes.

[simterm]
$ openstack overcloud node import instackenv.json
[/simterm]
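Before running the import, it can help to sanity-check the file. The sketch below is only an illustration: it assumes that every field set in the node example above is required, which is my assumption rather than a rule taken from the driver, and `check_nodes` is a hypothetical helper name.

```python
import json

# Fields used by the staging-ovirt node example in this post. Treating all of
# them as mandatory is an assumption of this sketch.
REQUIRED_FIELDS = ("name", "mac", "pm_type", "pm_addr", "pm_user",
                   "pm_password", "pm_vm_name")


def check_nodes(instackenv_text):
    """Return a list of readable problems found in an instackenv.json string."""
    problems = []
    data = json.loads(instackenv_text)
    for position, node in enumerate(data.get("nodes", [])):
        # Fall back to the list position when a node has no name.
        label = node.get("name", "node #%d" % position)
        for field in REQUIRED_FIELDS:
            if not node.get(field):
                problems.append("%s: missing %s" % (label, field))
    return problems
```

Reading instackenv.json and passing its contents to `check_nodes` returns an empty list when every node carries all of the fields.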

Troubleshooting

I ran into an error when importing my nodes. The error is shown below.

[{u'result': u'Node 09dfefec-e5c3-42c4-93d0-45fb44ce37a8 did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 09dfefec-e5c3-42c4-93d0-45fb44ce37a8. Error: global name \'sdk\' is not defined'}, {u'result': u'Node 59dce2eb-3aea-41f9-aec2-3f13deece30b did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 59dce2eb-3aea-41f9-aec2-3f13deece30b. Error: global name \'sdk\' is not defined'}, {u'result': u'Node 0895a6d0-f934-44d0-9c26-25e61b6679cb did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 0895a6d0-f934-44d0-9c26-25e61b6679cb. Error: global name \'sdk\' is not defined'}, {u'result': u'Node 68bdf1cb-fe1f-48ab-b96d-fb5edaf17154 did not reach state "manageable", the state is "enroll", error: Failed to get power state for node 68bdf1cb-fe1f-48ab-b96d-fb5edaf17154. Error: global name \'sdk\' is not defined'}]

Help was found here.

Apparently I was missing a package. I needed to install the package shown below with yum and then restart ironic-conductor.

[simterm]
# sudo yum -y install python-ovirt-engine-sdk4.x86_64
# sudo systemctl restart openstack-ironic-conductor.service
[/simterm]
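To confirm the fix took, you can check whether the SDK module is importable. This is a minimal sketch: as far as I know the package installs a module named `ovirtsdk4`, but treat that name as an assumption, and `module_available` is a hypothetical helper.

```python
import importlib


def module_available(name):
    """Return True if the named Python module can be imported."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True
```

Calling `module_available("ovirtsdk4")` with the same Python interpreter that ironic-conductor uses should return True once the package is installed.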

Red Hat OpenStack 13: Containerized Services Operations Guide


Contain Yourself

With the release of Red Hat OpenStack 13, the move to containerized overcloud services is complete. Traditional systemd services such as RabbitMQ, HAProxy, MariaDB, etc., are now all running as containers in the overcloud. This move to containers is meant to provide additional stability, control, and security to the platform. Future upgrades should be easier, and future deploys should be more flexible.

However, the move to containers brings with it a new challenge: operations.

The average OpenStack administrator no longer restarts services; they restart containers.

They no longer view a rabbit cluster’s status on the controller node, but rather within a container on the controller node.  Log locations have changed. Config file locations have changed.

So let’s retrain ourselves.


How to Manage Libvirt VMs via OpenStack Ironic (OSP10)


Bear Metal

 

In this post I will document the steps that I am using to create a fully virtualized OSP 10 environment in my lab. The undercloud node is a VM, as are the overcloud nodes. We will configure libvirt so that Ironic can boot and shut down the VMs on the underlying hypervisor.

Add the stack user on your hypervisor. In this case my hypervisor’s hostname is virt01, however we will refer to it as hypervisor for clarity.

[simterm]
hypervisor# useradd stack
hypervisor# echo "password" | passwd stack --stdin
[/simterm]

Modify polkit to allow stack user to manage libvirt.

[simterm]hypervisor # cat << EOF > /etc/polkit-1/localauthority/50-local.d/50-libvirt-user-stack.pkla
[libvirt Management Access]
Identity=unix-user:stack
Action=org.libvirt.unix.manage
ResultAny=yes
ResultInactive=yes
ResultActive=yes
EOF

[/simterm]

Now attempt to connect to libvirt as stack via a remote session. Here we are just connecting back to the local host, virt01. In the example below, 10.1.99.112 is the IP of the hypervisor. The undercloud has an IP of 10.1.99.10.

[simterm]undercloud# virsh --connect qemu+ssh://stack@10.1.99.112/system list --all

[/simterm]

Now ssh as stack to your undercloud VM.

Copy stack’s public key to your hypervisor (virt01 in this case). In the command below, replace the IP address shown with the IP that your undercloud VM will use to connect to libvirt on the hypervisor.

[simterm]undercloud# ssh-copy-id -i ~/.ssh/id_rsa.pub stack@10.1.99.112

[/simterm]

Now we need to create a few virtual machines. Specifically, I am building an environment with 5 virtual machines to run virtualized Red Hat OpenStack 13. My overcloud will consist of two compute nodes and three controller nodes.

I will use the command below to create 5 qcows.

[simterm]hypervisor# cd /var/lib/libvirt/images/

hypervisor# for i in {1..5}; do qemu-img create -f qcow2 \
-o preallocation=metadata overcloud-node$i.qcow2 60G; done
Formatting 'overcloud-node1.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
Formatting 'overcloud-node2.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
Formatting 'overcloud-node3.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
Formatting 'overcloud-node4.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off
Formatting 'overcloud-node5.qcow2', fmt=qcow2 size=64424509440 encryption=off cluster_size=65536 preallocation='metadata' lazy_refcounts=off

[/simterm]

The command below will create 5 xml files and use those to spawn my 5 VMs.

[simterm] hypervisor# for i in {1..5}; do \
virt-install --ram 16384 --vcpus 4 --os-variant rhel7 \
--disk path=/var/lib/libvirt/images/overcloud-node$i.qcow2,device=disk,bus=virtio,format=qcow2 \
--noautoconsole --vnc --network network:provisioning --network bridge:br99 \
--network network:default --name overcloud-node$i \
--dry-run --print-xml > /tmp/overcloud-node$i.xml; \
virsh define --file /tmp/overcloud-node$i.xml; done

[/simterm]

You should end up with the following virtual machines

 

[simterm]hypervisor# virsh list --all
Id Name State
----------------------------------------------------
1 undercloud running
- overcloud-node1 shut off
- overcloud-node2 shut off
- overcloud-node3 shut off
- overcloud-node4 shut off
- overcloud-node5 shut off

[/simterm]

Back on the undercloud we use the command below to grab the provisioning network mac address from each virtual machine running on the hypervisor. We could run this command locally on the hypervisor, but since we need the mac addresses for ironic on the undercloud, we will run it here.

[simterm]undercloud$ for i in {1..5}; do virsh -c qemu+ssh://stack@10.1.99.112/system domiflist overcloud-node$i | awk '$3 == "provisioning" {print $5}'; done > /tmp/nodes.txt

[/simterm]

Now we use our temp file above to populate the instackenv.json that we will import into ironic. See gist below


undercloud$ jq . << EOF > ~/instackenv.json
{
"ssh-user": "stack",
"ssh-key": "$(cat ~/.ssh/id_rsa)",
"power_manager": "nova.virt.baremetal.virtual_power_driver.VirtualPowerManager",
"host-ip": "192.168.122.1",
"arch": "x86_64",
"nodes": [
{
"name": "overcloud-node1",
"pm_addr": "192.168.122.1",
"pm_password": "$(cat ~/.ssh/id_rsa)",
"pm_type": "pxe_ssh",
"mac": [
"$(sed -n 1p /tmp/nodes.txt)"
],
"cpu": "4",
"memory": "8192",
"disk": "60",
"arch": "x86_64",
"pm_user": "stack"
},
{
"name": "overcloud-node2",
"pm_addr": "192.168.122.1",
"pm_password": "$(cat ~/.ssh/id_rsa)",
"pm_type": "pxe_ssh",
"mac": [
"$(sed -n 2p /tmp/nodes.txt)"
],
"cpu": "4",
"memory": "8192",
"disk": "60",
"arch": "x86_64",
"pm_user": "stack"
},
{
"name": "overcloud-node3",
"pm_addr": "192.168.122.1",
"pm_password": "$(cat ~/.ssh/id_rsa)",
"pm_type": "pxe_ssh",
"mac": [
"$(sed -n 3p /tmp/nodes.txt)"
],
"cpu": "4",
"memory": "8192",
"disk": "60",
"arch": "x86_64",
"pm_user": "stack"
},
{
"name": "overcloud-node4",
"pm_addr": "192.168.122.1",
"pm_password": "$(cat ~/.ssh/id_rsa)",
"pm_type": "pxe_ssh",
"mac": [
"$(sed -n 4p /tmp/nodes.txt)"
],
"cpu": "4",
"memory": "8192",
"disk": "60",
"arch": "x86_64",
"pm_user": "stack"
},
{
"name": "overcloud-node5",
"pm_addr": "192.168.122.1",
"pm_password": "$(cat ~/.ssh/id_rsa)",
"pm_type": "pxe_ssh",
"mac": [
"$(sed -n 5p /tmp/nodes.txt)"
],
"cpu": "4",
"memory": "8192",
"disk": "60",
"arch": "x86_64",
"pm_user": "stack"
}
]
}
EOF
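The five node entries above differ only in their name and MAC address, so they can also be generated. The following is a rough sketch under that assumption: `read_macs` and `build_instackenv` are hypothetical helpers, and only the "nodes" list is built; the top-level keys from the heredoc (ssh-user, ssh-key, power_manager, host-ip, arch) are left out.

```python
import json


def read_macs(text):
    """Return the non-empty lines of a nodes.txt-style string."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_instackenv(macs, pm_addr, pm_user, ssh_private_key,
                     cpu="4", memory="8192", disk="60"):
    """Build one pxe_ssh node entry per MAC address, numbered from 1."""
    nodes = []
    for number, mac in enumerate(macs, start=1):
        nodes.append({
            "name": "overcloud-node%d" % number,
            "pm_type": "pxe_ssh",
            "pm_addr": pm_addr,
            "pm_user": pm_user,
            # For pxe_ssh the "password" is the private key contents.
            "pm_password": ssh_private_key,
            "mac": [mac],
            "cpu": cpu,
            "memory": memory,
            "disk": disk,
            "arch": "x86_64",
        })
    return {"nodes": nodes}
```

Writing the result with `json.dumps(..., indent=2)` also escapes the newlines inside the private key.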


At this point we are ready to import our nodes via Ironic.

Note that I do not claim to be the original author of the steps documented above; rather, I wanted to ensure that I could easily consume these steps in the future.

Also, I look forward to experimenting with the vbmc ironic driver and might stop using pxe_ssh altogether.

OpenStack: 9 tips to properly configure your OpenStack Instances


Qcow vs Raw, Performance Tweaks, Cloud-init, and a short guide on Kernel Tuning – courtesy of redhatstackblog.redhat.com

via 9 tips to properly configure your OpenStack Instance

OpenStack: Deleting Zombie Cinder Volumes and VMs


First off let me start by saying that the new Cinder logo is wonderful. Nothing helps me think of backend storage better than the backend of a horse.

In an environment I am working in, we have a large number of cinder volumes that are in error state, due to the backend storage being ripped out. The volumes were not deleted, nor were they detached from the VMs.

The end result: you cannot delete the zombie VM (as it has an attached volume) and you cannot delete the zombie/orphaned volume (as it is attached to a VM).

The following process allows you to work around the chicken-and-egg scenario above.

First we get a list of all volumes in error state.

# openstack volume list --all | grep -i error

Then we take a closer look at the volume to see if it exists/existed on the backend that was removed.

# openstack volume show 05b372ef-ee45-499b-9676-72cc4170e1b3

We check the host attribute to confirm that the volume lives on the affected backend; in this case it does.

| os-vol-host-attr:host | hostgroup@dellsc#openstack_dellsc

We also check for any current attachments. Below we see that this volume is attached to a vm with the uuid shown below.

| attachments | [{u'server_id': u'd142eb4b-823d-4abd-95a0-3b02a3194c9f',

Now we reset the state of the volume, so that it is no longer in an error state.

# cinder reset-state --state available 05b372ef-ee45-499b-9676-72cc4170e1b3

Now we detach the volume via cinder.

# cinder reset-state --attach-status detached 05b372ef-ee45-499b-9676-72cc4170e1b3

Now we are free to delete the volume

# openstack volume delete 05b372ef-ee45-499b-9676-72cc4170e1b3

Confirm volume deletion

# openstack volume show 05b372ef-ee45-499b-9676-72cc4170e1b3
No volume with a name or ID of '05b372ef-ee45-499b-9676-72cc4170e1b3' exists

Now we can delete the VM.

# openstack server delete d142eb4b-823d-4abd-95a0-3b02a3194c9f

And now we confirm its deletion.

# openstack server show d142eb4b-823d-4abd-95a0-3b02a3194c9f
No server with a name or ID of 'd142eb4b-823d-4abd-95a0-3b02a3194c9f' exists.
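When many volumes are affected, the order of operations matters more than the individual commands. The sketch below only assembles the commands used in this post, in the order described above, for one volume/server pair; `cleanup_commands` is a hypothetical helper, and nothing is executed.

```python
def cleanup_commands(volume_id, server_id):
    """Return the CLI commands from this post in the order they must run."""
    return [
        # 1. Clear the error state on the volume.
        ["cinder", "reset-state", "--state", "available", volume_id],
        # 2. Mark the volume as detached so it can be deleted.
        ["cinder", "reset-state", "--attach-status", "detached", volume_id],
        # 3. Delete the volume, then the VM it was attached to.
        ["openstack", "volume", "delete", volume_id],
        ["openstack", "server", "delete", server_id],
    ]
```

Each entry can be handed to `subprocess.check_call` once you have reviewed the list; I would run the confirmation `show` commands by hand rather than script them.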

OpenStack: Mapping Ironic Hostnames to Nova Hostnames


The Hostname Problem

When deploying OpenStack via Red Hat OSP director, you configure the hostname of your baremetal (ironic) nodes at time of import. This is done via a JSON file, by default named instackenv.json (but often renamed nodes.json). Below is an excerpt from that file.

{
"nodes": [
{
"arch": "x86_64",
"cpu": "4",
"disk": "40",
"mac": [
"58:8a:5a:e6:c0:40"
],
"memory": "6144",
"name": "fatmin-ctrl0",
"pm_addr": "10.10.1.100",
"pm_password": "Mix-A-Lot",
"pm_type": "pxe_ipmitool",
"pm_user": "sir"
}

 

In the sample above, I am importing a node named “fatmin-ctrl0”. This will be the server name as it appears in Ironic. When heat deploys the overcloud, this node will by default be renamed overcloud-controller-0, and each additional controller node will increment the index by 1. The same applies to compute nodes.

What is preferable is to configure what is referred to as “Predictable Hostnames”. Using “Predictable Hostnames” we can do one of two things.

  1. Specify the hostname format to use and allow nova to iterate through nodes on its own.
  2. Specify the exact hostname for nova to use for each baremetal node

Nova Scheduler Hints

Before we can use either of the two options above, we must first update each baremetal node with a nova scheduler hint. In the examples below we are tagging one node to build as controller-0 (overcloud-controller-0) and one node to build as compute-0 (overcloud-compute-0).

For Controllers: Repeat for each controller

# ironic node-update <id> replace properties/capabilities="node:controller-0,boot_option:local"

For Compute Node: Repeat for each compute node

# ironic node-update <id> replace properties/capabilities="node:compute-0,boot_option:local"

You will then need to set your nova hints.

parameter_defaults:
  ControllerSchedulerHints:
    'capabilities:node': 'controller-%index%'
  ComputeSchedulerHints:
    'capabilities:node': 'compute-%index%'

FYI: the same process can be used with the following scheduler hint parameters.

  • ControllerSchedulerHints
  • ComputeSchedulerHints
  • BlockStorageSchedulerHints
  • ObjectStorageSchedulerHints
  • CephStorageSchedulerHints
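To make the matching concrete, here is a small model of how a hint such as 'controller-%index%' lines up with the capabilities string set on each Ironic node. It illustrates the idea only and is not Nova’s scheduler code; the function names and the node name fatmin-cmp0 are hypothetical.

```python
def parse_capabilities(capabilities):
    """Turn 'node:controller-0,boot_option:local' into a dict."""
    parsed = {}
    for item in capabilities.split(","):
        key, _, value = item.partition(":")
        parsed[key.strip()] = value.strip()
    return parsed


def expand_hint(hint, index):
    """Substitute the node index into a hint such as 'controller-%index%'."""
    return hint.replace("%index%", str(index))


def pick_node(nodes, hint, index):
    """Return the name of the node whose 'node' capability matches the hint."""
    wanted = expand_hint(hint, index)
    for name, capabilities in nodes.items():
        if parse_capabilities(capabilities).get("node") == wanted:
            return name
    return None
```

With nodes tagged as in the `ironic node-update` commands above, index 0 of the controller hint selects the node tagged node:controller-0, and an index with no matching tag selects nothing.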

Custom Nova Hostname Format

Referring to option 1 shown above, we can set a specific format to be used for hostnames instead of the default.

ControllerHostnameFormat: 'fatmin-controller-%index%'
ComputeHostnameFormat: 'fatmin-compute-%index%'

Using the method above, the first controller node will be named fatmin-controller-0, and the first compute node will be named fatmin-compute-0. Additional nodes will increment the index.

While this is nice, as it allows us to set a customized hostname format for each type of node, it does not allow us to specify the exact hostname to be used for a specific ironic node. We can do that with the HostnameMap.

HostnameMap

Now you may want to take this a bit further. You may want to use a custom nova name for each compute or controller node. You can accomplish this using a HostnameMap as shown below.

HostnameMap:
  overcloud-controller-0: fatmin-controller-0
  overcloud-controller-1: fatmin-controller-1
  overcloud-controller-2: fatmin-controller-2
  overcloud-compute-0: fatmin-compute-0
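The way the hostname format and the HostnameMap combine can be modelled in a few lines. This is a sketch of the lookup as I understand it, not TripleO code: `resolve_hostname` is a hypothetical name, and the default format '%stackname%-controller-%index%' used in the usage note is an assumption.

```python
def resolve_hostname(hostname_format, index, hostname_map, stack_name="overcloud"):
    """Expand the format for one node, then apply HostnameMap if it has an entry."""
    generated = (hostname_format
                 .replace("%stackname%", stack_name)
                 .replace("%index%", str(index)))
    # Names without a HostnameMap entry keep their generated form.
    return hostname_map.get(generated, generated)
```

With the map above, index 0 of '%stackname%-controller-%index%' resolves to fatmin-controller-0, while an index that has no entry in the map keeps its generated overcloud-controller-N name.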

 

Note: when specifying the flavors in the deploy command for preassigned nodes, use ‘baremetal’ instead of ‘control’ and ‘compute’. This means that you will not have to assign a profile to each host; the nova scheduler hints handle the placement decision.

--control-flavor baremetal \
--compute-flavor baremetal \

So at this point we are able to align the compute or controller index in Ironic with the index in Nova. For example, you can now map your Ironic node name fatmin-ctrl0 to fatmin-controller-0.

Special Notes for Special People

  1. I do not suggest setting the nova name to exactly the same name that you defined for the ironic name. While the indexes should match, the name formats should vary enough that you can easily tell if you are looking at a nova name or an ironic name.
  2. The use of HostnameMap makes it easy to replace a failed node, because you can provision the new node with the same nova name that was used by the original node before its premature death. Otherwise, nova will blacklist the nova name of the failed node. For example, if controller0 dies and you need to replace and redeploy it, the replacement will be named with the next unused index (controller3 in a three-controller deployment).

 

 

OpenStack Heat and os-collect-config


os-collect-config

os-collect-config is a tool that is started by systemd when a system boots. It keeps running after boot, looking for changes in Heat metadata. In a nutshell, os-collect-config is responsible for monitoring and downloading metadata from the Heat API.

When data changes, os-collect-config makes a call to os-refresh-config. This data provides the node with all of the information it needs to make configuration changes to the host – this data will be node specific.

os-collect-config polls for data from its sources (nova-metadata and heat) and stores it in /var/lib/os-collect-config/.

Example of the contents of the directory above.

-rw-------. 1 root root 42412 Jul 23 14:40 ComputeAllNodesDeployment.json
-rw-------. 1 root root 42412 Jun 4 20:56 ComputeAllNodesDeployment.json.last
-rw-------. 1 root root 35447 Feb 20 2017 ComputeAllNodesDeployment.json.orig
-rw-------. 1 root root 23852 Jul 23 14:40 ComputeHostsDeployment.json
-rw-------. 1 root root 23852 Jun 4 20:07 ComputeHostsDeployment.json.last
-rw-------. 1 root root 8074 Feb 20 2017 ComputeHostsDeployment.json.orig
-rw-------. 1 root root 1071 Jul 23 14:40 ec2.json
-rw-------. 1 root root 1071 Feb 20 2017 ec2.json.last
-rw-------. 1 root root 1071 Feb 20 2017 ec2.json.orig
-rw-------. 1 root root 441 Feb 20 2017 heat_local.json
-rw-------. 1 root root 441 Feb 20 2017 heat_local.json.last
-rw-------. 1 root root 441 Feb 20 2017 heat_local.json.orig
-rw-------. 1 root root 2635 Jul 23 14:40 NetworkDeployment.json
-rw-------. 1 root root 2635 Mar 1 2017 NetworkDeployment.json.last
-rw-------. 1 root root 2636 Feb 20 2017 NetworkDeployment.json.orig
-rw-------. 1 root root 13259 Jul 23 14:40 NovaComputeDeployment.json
-rw-------. 1 root root 13259 Jan 16 2018 NovaComputeDeployment.json.last
-rw-------. 1 root root 11069 Feb 20 2017 NovaComputeDeployment.json.orig
-rw-------. 1 root root 311 Jun 4 22:02 os_config_files.json
-rw-------. 1 root root 252762 Jul 23 14:40 request.json
-rw-------. 1 root root 252762 Jun 4 22:07 request.json.last
-rw-------. 1 root root 23440 Feb 20 2017 request.json.orig

You will usually find three versions of each config: current, original, and last (previous).

You can view this metadata, using python, as shown below.

# python -m json.tool ComputeAllNodesDeployment.json
{
"hiera": {
"datafiles": {
"all_nodes": {
"mapped_data": {
"ca_certs_enabled": "true",
"ca_certs_short_node_names": [

…trunc…
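Since the current and .last copies sit side by side, a quick comparison shows what changed between two polls. A minimal sketch, assuming both files hold a JSON object at the top level; `changed_keys` is a hypothetical helper name.

```python
import json

_MISSING = object()  # marks a key that is absent from one of the documents


def changed_keys(current_text, previous_text):
    """Return the sorted top-level keys whose values differ between two JSON objects."""
    current = json.loads(current_text)
    previous = json.loads(previous_text)
    keys = set(current) | set(previous)
    return sorted(
        key for key in keys
        if current.get(key, _MISSING) != previous.get(key, _MISSING)
    )
```

Feed it the contents of, for example, NetworkDeployment.json and NetworkDeployment.json.last; an empty list means the two copies are identical at the top level.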

os-refresh-config

os-refresh-config is called by os-collect-config once it recognizes that metadata has changed within the Heat API for that specific node.

The important steps that os-refresh-config takes are shown below.

1) Apply sysctl settings
2) Run os-apply-config (see below)
3) Configure the networking for the host
4) Download and set the hieradata files for puppet parameters
5) Configure /etc/hosts
6) Deploy software configuration with Heat

os-apply-config

os-apply-config is called by os-refresh-config to set up configuration files on specific nodes. It is called via /usr/libexec/os-refresh-config/configure.d/20-os-apply-config.

As it does with the undercloud deploy, os-refresh-config executes scripts under /usr/libexec/os-refresh-config/ in a specific order based on numbering.

First, scripts within the pre-configure.d/ directory are run, then configure.d/ scripts are applied, and finally scripts in post-configure.d/.

It is within these scripts that the metadata downloaded by os-collect-config will be acted upon.
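The ordering described above can be sketched as follows. This is a simplified model, assuming scripts within a phase run in plain lexical order of their file names; `execution_order` is a hypothetical helper that takes a mapping of phase directory to script names rather than reading the filesystem.

```python
# Phase directories in the order described above.
PHASES = ("pre-configure.d", "configure.d", "post-configure.d")


def execution_order(scripts_by_phase):
    """Return 'phase/script' paths in the order the scripts would run."""
    ordered = []
    for phase in PHASES:
        # Within a phase, the numeric prefix decides the order.
        for name in sorted(scripts_by_phase.get(phase, [])):
            ordered.append(phase + "/" + name)
    return ordered
```

On a node you could build the mapping with `os.listdir` for each of the three directories under /usr/libexec/os-refresh-config/.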

Any call to os-apply-config uses the files in /var/lib/os-collect-config as its configuration source.

The appropriate script files for doing so are as follows:

overcloud$ ll /usr/libexec/os-refresh-config/configure.d/
total 32
-rwxr-xr-x. 1 root root 396 Aug 5 07:31 10-sysctl-apply-config
-rwxr-xr-x. 1 root root 42 Aug 5 07:31 20-os-apply-config
-rwxr-xr-x. 1 root root 189 Aug 5 07:31 20-os-net-config
-rwxr-xr-x. 1 root root 629 Aug 5 07:31 25-set-network-gateway
-rwxr-xr-x. 1 root root 2265 Aug 5 07:31 40-hiera-datafiles
-rwxr-xr-x. 1 root root 1387 Aug 5 07:31 51-hosts
-rwxr-xr-x. 1 root root 5784 Aug 5 07:31 55-heat-config

If we run os-apply-config manually, we can see that it does the following:

overcloud$ sudo sh /usr/libexec/os-refresh-config/configure.d/20-os-apply-config
[2015/08/07 01:17:40 PM] [INFO] writing /etc/os-net-config/config.json
[2015/08/07 01:17:40 PM] [INFO] writing /var/run/heat-config/heat-config
[2015/08/07 01:17:40 PM] [INFO] writing /etc/puppet/hiera.yaml
[2015/08/07 01:17:40 PM] [INFO] writing /etc/os-collect-config.conf
[2015/08/07 01:17:40 PM] [INFO] success

 

os-net-config

The directory /etc/os-net-config/ holds the config.json file that is used to modify the networking configuration on each host. The config found in this file is derived from the os-collect-config data in /var/lib/os-collect-config/.

Again, you can use this file to review your networking configuration and compare it to your templates. Formatting is off in the file below, but you get the point.

# python -m json.tool config.json
{
"network_config": [
{
"addresses": [
{
"ip_netmask": "172.20.4.113/24"
}
],
"dns_servers": [
"96.239.250.57",
"96.239.250.58"
],
"name": "em3",
"routes": [
{
"ip_netmask": "169.254.169.254/32",
"next_hop": "172.20.4.20"
}
],
"type": "interface",
"use_dhcp": false
},
{
"members": [
{
"bonding_options": "mode=4 lacp_rate=1 updelay=1000 miimon=50",
"members": [
{
"mtu": 9216,
"name": "em1",
"primary": true,
"type": "interface"
},
{
"mtu": 9216,
"name": "em2",
"type": "interface"
}
],
"mtu": 9216,
"name": "bond1",
"type": "linux_bond"
},
{
"addresses": [
{
"ip_netmask": "172.20.3.30/24"
}
],
"device": "bond1",
"mtu": 9000,
"type": "vlan",
"vlan_id": 52