Ceph: Troubleshooting Failed OSD Creation

Introduction to Ceph

According to Wikipedia, “Ceph is a free software storage platform designed to present object, block, and file storage from a single distributed computer cluster. Ceph’s main goals are to be completely distributed without a single point of failure, scalable to the exabyte level, and freely-available.”

More information pertaining to Ceph can be found here.

Lab Buildout

In my homelab I am building out a small Ceph cluster for testing and learning purposes. My small cluster consists of 4 virtual machines, as shown below. I plan to use this cluster primarily as a backend for OpenStack.

Monitor Servers
    Count               1
    CPU                 2
    Memory (GB)         2
    Primary Disk (GB)   16

OSD Servers
    Count               3
    CPU                 2
    Memory (GB)         2
    Primary Disk (GB)   16
    OSD Disk (GB)       10
    OSD Disk (GB)       10
    OSD Disk (GB)       10
    SSD Journal (GB)    6

Troubleshooting OSD Creation

On my monitor server, which is also serving as my admin node, I ran the following command to remove all partitioning on all disks that I intended to use for Ceph.

# for disk in sdb sdc sdd sde; do ceph-deploy disk zap osd01:/dev/$disk; done
Next I ran the command below to prepare each OSD and specify the journal disk to use for each OSD. This command “should” create a partition on each OSD disk, format it, label it as a Ceph disk, and then create a journal partition for each OSD on the journal disk (sde in this case).
# ceph-deploy osd prepare osd01:sdb:sde osd01:sdc:sde osd01:sdd:sde
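The HOST:DISK:JOURNAL arguments follow a regular pattern, so they can be generated rather than typed out. The snippet below is only a minimal sketch of that idea: the function name osd_prepare_args is my own invention, and the last line merely prints the command as a dry run instead of executing it.

```shell
#!/bin/sh
# Hypothetical helper: build HOST:DISK:JOURNAL arguments for
# "ceph-deploy osd prepare".
# Usage: osd_prepare_args HOST JOURNAL DISK...
osd_prepare_args() {
    host="$1"
    journal="$2"
    shift 2
    args=""
    for disk in "$@"; do
        # Append with a separating space, except before the first entry.
        args="${args}${args:+ }${host}:${disk}:${journal}"
    done
    printf '%s\n' "$args"
}

# Dry run: print the command instead of running it.
echo "ceph-deploy osd prepare $(osd_prepare_args osd01 sde sdb sdc sdd)"
```

Printing the command first makes it easy to spot a mistyped disk name before anything is written to disk.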
Unfortunately, the prepare command kept failing: it reported that it was unable to create some of the partitions, even though it had created partitions on some of the disks and mounted them locally. This left my OSDs in a bad state, as running the command again would throw all sorts of errors. So I figured that I would start over and run the zap command again. However, this command was now failing with errors as well, because some of the disks were mounted and Ceph was running.
The next step was to ssh into the OSD server, aptly named osd01, and stop Ceph.
# /etc/init.d/ceph stop
Then unmount any OSDs that were mounted.
# umount /var/lib/ceph/osd/ceph-7 /var/lib/ceph/osd/ceph-8 /var/lib/ceph/osd/ceph-9
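The OSD numbers in the mount paths will differ from host to host, so it can help to list what is actually mounted first. Here is a rough sketch, assuming the OSD data directories live under /var/lib/ceph/osd/ as they did on my servers; the function name mounted_osd_dirs is hypothetical, and the optional file argument exists only so the function can be tried against a saved copy of the mount table.

```shell
#!/bin/sh
# Hypothetical helper: print every mount point under /var/lib/ceph/osd/
# found in a mounts table (default: /proc/mounts on Linux).
mounted_osd_dirs() {
    awk '$2 ~ /^\/var\/lib\/ceph\/osd\// { print $2 }' "${1:-/proc/mounts}"
}

# Possible use on the OSD server, after Ceph has been stopped:
#   for dir in $(mounted_osd_dirs); do umount "$dir"; done
```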
Then, using fdisk, delete any existing partitions; this seemed to be necessary to remove the partitions created on the SSD journal disk. Next, run partx to force the OS to re-read the partition table on each disk.
# for disk in sdb sdc sdd sde; do partx -a /dev/$disk; done
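Before going back to the admin node, it may be worth checking that the kernel no longer lists any partitions on those disks. The following is a sketch rather than something I ran at the time: it reads the Linux /proc/partitions table, the function name leftover_partitions is made up, and the pattern assumes sdX-style names where a partition is the disk name followed by digits.

```shell
#!/bin/sh
# Hypothetical helper: print the partitions the kernel still knows about
# for one disk, e.g. "sdb" -> sdb1, sdb2, ...
# The second argument (default /proc/partitions) is only for testing
# against a saved copy of the table.
leftover_partitions() {
    awk -v d="$1" '$4 ~ ("^" d "[0-9]+$") { print $4 }' "${2:-/proc/partitions}"
}

# Possible use on the OSD server:
#   for disk in sdb sdc sdd sde; do leftover_partitions "$disk"; done
# No output means no partitions are left on those disks.
```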
At this point I was able to log back into the admin node and re-run the prepare command.

Additional Troubleshooting

So, apparently this was not the end of all my woes. I ran into the same issue on my second OSD server, osd02. The first thing I did was ssh into the OSD server and run the command below.
[root@osd02 ceph]# /etc/init.d/ceph status
=== osd.3 ===
osd.3: not running.
=== osd.13 ===
osd.13: running {"version":"0.94.1"}
=== osd.14 ===
osd.14: running {"version":"0.94.1"}
So I stopped Ceph.
[root@osd02 ceph]# /etc/init.d/ceph stop
=== osd.14 ===
Stopping Ceph osd.14 on osd02...kill 224396...kill 224396...done
=== osd.13 ===
Stopping Ceph osd.13 on osd02...kill 223838...kill 223838...done
=== osd.3 ===
Stopping Ceph osd.3 on osd02...done
Then I unmounted osd.3.
[root@osd02 ceph]# umount /var/lib/ceph/osd/ceph-3
Then I locally prepared osd.3, where /dev/sdb is the OSD disk and /dev/sde is the journal disk.
[root@osd02 ceph]# ceph-disk -v prepare --fs-type xfs --cluster ceph -- /dev/sdb /dev/sde
I then verified that I had three Ceph journal partitions on my SSD.
[root@osd02 ceph]# fdisk -l /dev/sde
WARNING: fdisk GPT support is currently new, and therefore in an experimental phase. Use at your own discretion.

Disk /dev/sde: 6442 MB, 6442450944 bytes, 12582912 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: gpt

#         Start          End    Size  Type            Name
1         2048      4098047      2G  unknown         ceph journal
2      4098048      8194047      2G  unknown         ceph journal
3      8194048     12290047      2G  unknown         ceph journal
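The same check can be done without reading the table by eye. This is a minimal sketch that counts the lines whose partition name is "ceph journal" in fdisk output like the listing above; the function name count_journal_partitions is hypothetical, and the match relies on the name column looking the way it does on my system.

```shell
#!/bin/sh
# Hypothetical helper: count partitions named "ceph journal" in
# "fdisk -l" output read from stdin.
count_journal_partitions() {
    grep -c 'ceph journal'
}

# Possible use on the OSD server:
#   fdisk -l /dev/sde | count_journal_partitions
```

With three OSD disks sharing one journal SSD, I would expect this to print 3.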

Then I checked my OSDs again. All were running.
[root@osd02 ceph]# /etc/init.d/ceph status
=== osd.13 ===
osd.13: running {"version":"0.94.1"}
=== osd.14 ===
osd.14: running {"version":"0.94.1"}
=== osd.18 ===
osd.18: running {"version":"0.94.1"}
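With more than a handful of OSDs, scanning the status output by eye gets tedious. Below is a small sketch that picks out the OSDs reported as "not running"; it assumes the output format shown in this post (Ceph 0.94.1 with the sysvinit script), and the function name osds_not_running is my own.

```shell
#!/bin/sh
# Hypothetical helper: read "/etc/init.d/ceph status" output on stdin and
# print the name of every OSD reported as "not running".
osds_not_running() {
    awk '/^osd\.[0-9]+: not running/ { sub(/:$/, "", $1); print $1 }'
}

# Possible use on the OSD server:
#   /etc/init.d/ceph status | osds_not_running
# No output means every OSD on the host is running.
```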