The last week for linux.conf.au 2019 proposals!

Dear humans of the Internet — there is ONE WEEK LEFT to propose talks for linux.conf.au 2019. LCA is one of the world’s best open source conferences, and we’d love to hear you speak!
 
Unsure what to propose? Not sure if your talk is what the conference would normally take? Just want a chat? You’re welcome to reach out to papers-chair@linux.org.au to talk things through.
 

Rejected talk proposal: Design at scale: OpenStack versus Kubernetes


This proposal was submitted for pyconau 2018. It wasn’t accepted, but given I’d put the effort into writing up the proposal, I’ll post it here in case it’s useful some other time. The oblique references to OpenStack are because pycon had an “anonymous” review system in 2018, and I was avoiding saying things which directly identified me as the author.


OpenStack and Kubernetes solve very similar problems, yet they approach those problems in very different ways. What can we learn from the different approaches taken? The differences aren’t just technical, though; there are some interesting social differences too.


Accepted talk proposal: Learning from the mistakes that even big projects make


This proposal was submitted for pyconau 2018. It was accepted, but hasn’t been presented yet. The oblique references to OpenStack are because pycon had an “anonymous” review system in 2018, and I was avoiding saying things which directly identified me as the author.


Since 2011, I’ve worked on a large Open Source project in python. It kind of got out of hand – 1000s of developers and millions of lines of code. Yet despite being well resourced, we made the same mistakes that those tiny scripts you whip up to solve a small problem make. Come learn from our fail.



How to make a privileged call with oslo privsep


Once you’ve added oslo privsep to your project, how do you make a privileged call? It’s actually really easy to do. In this post I will assume you already have privsep running for your project, which at the time of writing limits you to OpenStack Nova in the OpenStack universe.
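As a taste of what that looks like, here is a minimal, hypothetical sketch of a privileged call with oslo.privsep. The context and helper names ("myproject", sys_admin_pctxt, mount) are made up for illustration, and a real project also needs configuration for how the privsep daemon gets started:

    from oslo_concurrency import processutils
    from oslo_privsep import capabilities
    from oslo_privsep import priv_context

    # A privileged context which grants only the Linux capabilities we need.
    sys_admin_pctxt = priv_context.PrivContext(
        'myproject',
        cfg_section='myproject_sys_admin',
        pypath=__name__ + '.sys_admin_pctxt',
        capabilities=[capabilities.CAP_SYS_ADMIN])

    # Calling this from unprivileged code transparently dispatches it to the
    # privsep daemon, which runs it with the capabilities declared above.
    @sys_admin_pctxt.entrypoint
    def mount(device, mountpoint):
        processutils.execute('mount', device, mountpoint)

The key detail is that the decorated function contains no special plumbing itself: oslo.privsep handles marshalling the arguments and return value over its channel to the privileged daemon.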



On Selecting a Well Engaged Open Source Vendor


Aptira is in an interesting position in the Open Source market, because we don’t usually sell software. Instead, our customers come to us seeking assistance with deciding which OpenStack to use, or how to embed ONAP into their nationwide networks, or how to move their legacy networks to the software defined future. Therefore, our most common role is as a trusted advisor to help our customers decide which Open Source products to buy.

(My boss would insist that I point out here that we do customisation of Open Source for our customers, and have assisted many in the past with deploying pure upstream solutions. Basically, we do what is the right fit for the customer, and aren’t obsessed with fitting customers into pre-defined moulds that suit our partners.)

That makes it important that we recommend products from companies that are well engaged with their upstream Open Source communities. That might be OpenStack, or ONAP, or even something like Open Daylight. This raises the obvious question – what makes a company well engaged with an upstream project?

Read more over at my employer’s blog


So you want to setup a Ceph dev environment using OSA


Support for installing and configuring Ceph was added to openstack-ansible in Ocata, so now that I need a Ceph development environment it seems logical to build one as an openstack-ansible Ocata AIO (all in one). There were a few gotchas, so I want to explain the process I used.

First off, Ceph is enabled in an openstack-ansible AIO using a thing I’ve never seen before called a “Scenario”. Basically this means that you need to export an environment variable called “SCENARIO” before running the AIO install. Something like this will do the trick:

    export SCENARIO=ceph
    

Next you need to set the global pg_num in the ceph role or the install will fail. I did that with this patch:

    --- /etc/ansible/roles/ceph.ceph-common/defaults/main.yml       2017-05-26 08:55:07.803635173 +1000
    +++ /etc/ansible/roles/ceph.ceph-common/defaults/main.yml       2017-05-26 08:58:30.417019878 +1000
    @@ -338,7 +338,9 @@
     #     foo: 1234
     #     bar: 5678
     #
    -ceph_conf_overrides: {}
    +ceph_conf_overrides:
    +  global:
    +    osd_pool_default_pg_num: 8
    
     #############
    @@ -373,4 +375,4 @@
     # Set this to true to enable File access via NFS.  Requires an MDS role.
     nfs_file_gw: true
     # Set this to true to enable Object access via NFS. Requires an RGW role.
    -nfs_obj_gw: false
    \ No newline at end of file
    +nfs_obj_gw: false
    

That of course needs to be done after the Ceph role has been fetched, but before it is executed, so in other words after the AIO bootstrap, but before the install.

And that was about it (although of course that took a fair while to work out). I have this automated in my little install helper thing, so I’ll never need to think about it again, which is nice.

Once Ceph is installed, you interact with it via the monitor container, not the utility container, which is a bit odd. That said, all you really need is the Ceph config file and the Ceph utilities, so you could move those elsewhere.

    root@labosa:/etc/openstack_deploy# lxc-attach -n aio1_ceph-mon_container-a3d8b8b1
    root@aio1-ceph-mon-container-a3d8b8b1:/# ceph -s
        cluster 24424319-b5e9-49d2-a57a-6087ab7f45bd
         health HEALTH_OK
         monmap e1: 1 mons at {aio1-ceph-mon-container-a3d8b8b1=172.29.239.114:6789/0}
                election epoch 3, quorum 0 aio1-ceph-mon-container-a3d8b8b1
         osdmap e20: 3 osds: 3 up, 3 in
                flags sortbitwise,require_jewel_osds
          pgmap v36: 40 pgs, 5 pools, 0 bytes data, 0 objects
                102156 kB used, 3070 GB / 3070 GB avail
                      40 active+clean
    root@aio1-ceph-mon-container-a3d8b8b1:/# ceph osd tree
    ID WEIGHT  TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
    -1 2.99817 root default
    -2 2.99817     host labosa
     0 0.99939         osd.0        up  1.00000          1.00000
     1 0.99939         osd.1        up  1.00000          1.00000
     2 0.99939         osd.2        up  1.00000          1.00000
    

How we got to test_init_instance_retries_reboot_pending_soft_became_hard


I’ve been asked some questions about a recent change to nova that I am responsible for, and I thought it would be easier to address those in this format than trying to explain what’s happening in IRC. That way whenever someone compliments me on possibly the longest unit test name ever written, I can point them here.

Let’s start with some definitions. What is the difference between a soft reboot and a hard reboot in Nova? The short answer is that a soft reboot gives the operating system running in the instance an opportunity to respond to an ACPI power event gracefully before the rug is pulled out from under the instance, whereas a hard reboot just punches the instance in the face immediately.

There is a bit more complexity than that of course, because this is OpenStack. A hard reboot also re-fetches image meta-data, and rebuilds the XML description of the instance that we hand to libvirt. It also re-populates any missing backing files, ensures that the networking is configured correctly, and then boots the instance again. In other words, a hard reboot is kind of like an initial instance boot, in that it makes fewer assumptions about how much you can trust the current state of the instance on the hypervisor node. Finally, a soft reboot which fails (probably because the instance operating system didn’t respond to the ACPI event in a timely manner) is turned into a hard reboot after libvirt.wait_soft_reboot_seconds. So, we already perform hard reboots when a user has asked for a soft reboot in certain error cases.

It’s important to note that the actual reboot mechanism is similar though — it’s just how patient we are and what side effects we create that change. In libvirt they both end up as a shutdown of the virtual machine and then a startup.
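To make that concrete, here is a rough illustration of the two paths in terms of raw libvirt-python calls. This is not Nova’s actual driver code (which, as described above, also rebuilds XML, backing files and networking); it just shows the shutdown-then-startup shape, and the escalation from soft to hard when the guest doesn’t cooperate:

    import time

    import libvirt

    def reboot(conn, name, soft=True, wait_soft_reboot_seconds=120):
        dom = conn.lookupByName(name)

        if soft:
            # Ask the guest to power off gracefully via an ACPI event...
            dom.shutdown()
            for _ in range(wait_soft_reboot_seconds):
                if not dom.isActive():
                    break
                time.sleep(1)

        if dom.isActive():
            # ...and if it doesn't comply in time (or we wanted a hard
            # reboot all along), terminate the domain immediately.
            dom.destroy()

        # Either way, we then start the domain up again.
        dom.create()

Nova’s hard reboot path additionally tears down and redefines the guest with freshly generated XML, rather than just restarting the existing definition.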

Bug 1072751 reported an interesting edge case with a soft reboot though. If nova-compute crashes after shutting down the virtual machine, but before the virtual machine is started again, then the instance is left in an inconsistent state. We can demonstrate this with a devstack installation:

Setup the right version of nova:

    cd /opt/stack/nova
    git checkout dc6942c1218279097cda98bb5ebe4f273720115d

Patch nova so it crashes on a soft reboot:

    cat - > /tmp/patch <<EOF
    > diff --git a/nova/virt/libvirt/driver.py b/nova/virt/libvirt/driver.py
    > index ce19f22..6c565be 100644
    > --- a/nova/virt/libvirt/driver.py
    > +++ b/nova/virt/libvirt/driver.py
    > @@ -34,6 +34,7 @@ import itertools
    >  import mmap
    >  import operator
    >  import os
    > +import sys
    >  import shutil
    >  import tempfile
    >  import time
    > @@ -2082,6 +2083,10 @@ class LibvirtDriver(driver.ComputeDriver):
    >              # is already shutdown.
    >              if state == power_state.RUNNING:
    >                  dom.shutdown()
    > +
    > +            # NOTE(mikal): temporarily crash
    > +            sys.exit(1)
    > +
    >              # NOTE(vish): This actually could take slightly longer than the
    >              #             FLAG defines depending on how long the get_info
    >              #             call takes to return.
    > EOF
    patch -p1 < /tmp/patch

...now restart nova-compute inside devstack to make sure you're running the patched version...

Boot a victim instance:

    cd ~/devstack
    source openrc admin
    glance image-list
    nova boot --image=cirros-0.3.4-x86_64-uec --flavor=1 foo

Soft reboot, and verify its gone:

    nova list
    nova reboot cacf99de-117d-4ab7-bd12-32cc2265e906
    sudo virsh list

...virsh list should now show no virtual machines running as nova-compute crashed before it could start the instance again. However, nova-api knows that the instance should be rebooting...

    $ nova list
    +--------------------------------------+------+---------+----------------+-------------+------------------+
    | ID                                   | Name | Status  | Task State     | Power State | Networks         |
    +--------------------------------------+------+---------+----------------+-------------+------------------+
    | cacf99de-117d-4ab7-bd12-32cc2265e906 | foo  | REBOOT  | reboot_started | Running     | private=10.0.0.3 |
    +--------------------------------------+------+---------+----------------+-------------+------------------+

...now start nova-compute again, nova-compute detects the missing instance on boot, and tries to start it up again...

    sg libvirtd '/usr/local/bin/nova-compute --config-file /etc/nova/nova.conf' \
    > & echo $! >/opt/stack/status/stack/n-cpu.pid; fg || \
    > echo "n-cpu failed to start" | tee "/opt/stack/status/stack/n-cpu.failure"
    [...snip...]
    Traceback (most recent call last):
      File "/opt/stack/nova/nova/conductor/manager.py", line 444, in _object_dispatch
        return getattr(target, method)(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 213, in wrapper
        return fn(self, *args, **kwargs)
      File "/opt/stack/nova/nova/objects/instance.py", line 728, in save
        columns_to_join=_expected_cols(expected_attrs))
      File "/opt/stack/nova/nova/db/api.py", line 764, in instance_update_and_get_original
        expected=expected)
      File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 216, in wrapper
        return f(*args, **kwargs)
      File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 146, in wrapper
        ectxt.value = e.inner_exc
      File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 195, in __exit__
        six.reraise(self.type_, self.value, self.tb)
      File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 136, in wrapper
        return f(*args, **kwargs)
      File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 2464, in instance_update_and_get_original
        expected, original=instance_ref))
      File "/opt/stack/nova/nova/db/sqlalchemy/api.py", line 2602, in _instance_update
        raise exc(**exc_props)
    UnexpectedTaskStateError: Conflict updating instance cacf99de-117d-4ab7-bd12-32cc2265e906.
    Expected: {'task_state': [u'rebooting_hard', u'reboot_pending_hard', u'reboot_started_hard']}. Actual: {'task_state': u'reboot_started'}

So what happened here? This is a bit confusing because we asked for a soft reboot of the instance, but the error we are seeing is that a hard reboot was attempted. Specifically, we’re trying to update an instance object, but all the task states we expect the instance to be in relate to a hard reboot, while the task state we’re actually in is for a soft reboot.

We need to take a tour of the compute manager code to understand what happened here. nova-compute is implemented at nova/compute/manager.py in the nova code base. Specifically, ComputeVirtAPI.init_host() sets up the service to start handling compute requests for a specific hypervisor node. As part of startup, this method calls ComputeVirtAPI._init_instance() once per instance on the hypervisor node. This method tries to do some sanity checking for each instance that nova thinks should be on the hypervisor:

  • Detecting if the instance was part of a failed evacuation.
  • Detecting instances that are soft deleted, deleting, or in an error state and ignoring them apart from a log message.
  • Detecting instances which we think are fully deleted but aren’t in fact gone.
  • Moving instances which we thought were booting, but which never completed, into an error state. This happens if nova-compute crashes during the instance startup process.
  • Similarly, instances which were rebuilding are moved to an error state as well.
  • Clearing the task state for uncompleted tasks like snapshots or preparing for resize.
  • Finishing the deletion of instances which were partially deleted last time we saw them.
  • And finally, if an instance should be running but isn’t, trying to reboot it to get it running.

It is this final state which is relevant in this case — we think the instance should be running and it’s not, so we’re going to reboot it. We do that by calling ComputeVirtAPI.reboot_instance(). The code which does this work looks like this:

    try_reboot, reboot_type = self._retry_reboot(context, instance)
    current_power_state = self._get_power_state(context, instance)

    if try_reboot:
        LOG.debug("Instance in transitional state (%(task_state)s) at "
                  "start-up and power state is (%(power_state)s), "
                  "triggering reboot",
                  {'task_state': instance.task_state,
                   'power_state': current_power_state},
                  instance=instance)
        self.reboot_instance(context, instance, block_device_info=None,
                             reboot_type=reboot_type)
        return

    [...snip...]

    def _retry_reboot(self, context, instance):
        current_power_state = self._get_power_state(context, instance)
        current_task_state = instance.task_state
        retry_reboot = False
        reboot_type = compute_utils.get_reboot_type(current_task_state,
                                                    current_power_state)

        pending_soft = (current_task_state == task_states.REBOOT_PENDING and
                        instance.vm_state in vm_states.ALLOW_SOFT_REBOOT)
        pending_hard = (current_task_state == task_states.REBOOT_PENDING_HARD
                        and instance.vm_state in vm_states.ALLOW_HARD_REBOOT)
        started_not_running = (current_task_state in
                               [task_states.REBOOT_STARTED,
                                task_states.REBOOT_STARTED_HARD] and
                               current_power_state != power_state.RUNNING)

        if pending_soft or pending_hard or started_not_running:
            retry_reboot = True

        return retry_reboot, reboot_type

So, we ask ComputeVirtAPI._retry_reboot() if a reboot is required, and if so what type. ComputeVirtAPI._retry_reboot() just uses nova.compute.utils.get_reboot_type() (aliased as compute_utils.get_reboot_type) to determine what type of reboot to use. This is the crux of the matter. Read on for a surprising discovery!

nova.compute.utils.get_reboot_type() looks like this:

    def get_reboot_type(task_state, current_power_state):
        """Checks if the current instance state requires a HARD reboot."""
        if current_power_state != power_state.RUNNING:
            return 'HARD'
        soft_types = [task_states.REBOOT_STARTED, task_states.REBOOT_PENDING,
                      task_states.REBOOTING]
        reboot_type = 'SOFT' if task_state in soft_types else 'HARD'
        return reboot_type

So, after all that it comes down to this. If the instance isn’t running, then it’s a hard reboot. In our case, we shut down the instance but haven’t started it yet, so it’s not running. This will therefore be a hard reboot. This is where our problem lies — we chose a hard reboot. The code doesn’t blow up until later though — when we try to do the reboot itself.

    @wrap_exception()
    @reverts_task_state
    @wrap_instance_event
    @wrap_instance_fault
    def reboot_instance(self, context, instance, block_device_info,
                        reboot_type):
        """Reboot an instance on this host."""
        # acknowledge the request made it to the manager
        if reboot_type == "SOFT":
            instance.task_state = task_states.REBOOT_PENDING
            expected_states = (task_states.REBOOTING,
                               task_states.REBOOT_PENDING,
                               task_states.REBOOT_STARTED)
        else:
            instance.task_state = task_states.REBOOT_PENDING_HARD
            expected_states = (task_states.REBOOTING_HARD,
                               task_states.REBOOT_PENDING_HARD,
                               task_states.REBOOT_STARTED_HARD)

        context = context.elevated()
        LOG.info(_LI("Rebooting instance"),
                 context=context, instance=instance)

        block_device_info = self._get_instance_block_device_info(context,
                                                                 instance)

        network_info = self.network_api.get_instance_nw_info(context,
                                                              instance)

        self._notify_about_instance_usage(context, instance, "reboot.start")

        instance.power_state = self._get_power_state(context, instance)
        instance.save(expected_task_state=expected_states)

        [...snip...]

And there’s our problem. We have a reboot_type of HARD, which means we set the expected_states to those matching a hard reboot. However, the state the instance is actually in will be one correlating to a soft reboot, because that’s what the user requested. We therefore experience an exception when we try to save our changes to the instance. This is the exception we saw above.

The fix in my patch is simply to change the current task state for an instance in this situation to one matching a hard reboot. It all just works then.
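In rough terms, the retry path in _init_instance() now nudges the task state across before asking for the reboot. This is a simplified sketch of the idea, not the exact committed change:

    if try_reboot:
        # If a soft reboot got far enough to shut the instance down before
        # nova-compute died, the retry is going to be a hard reboot. Move
        # the task state to a hard reboot state so that reboot_instance()
        # finds the instance in one of its expected states.
        soft_states = [task_states.REBOOTING,
                       task_states.REBOOT_PENDING,
                       task_states.REBOOT_STARTED]
        if reboot_type == 'HARD' and instance.task_state in soft_states:
            instance.task_state = task_states.REBOOT_PENDING_HARD
            instance.save()

        self.reboot_instance(context, instance, block_device_info=None,
                             reboot_type=reboot_type)

With the task state adjusted, reboot_instance() sees one of its expected hard reboot states and the save succeeds.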

So why do we decide to use a hard reboot if the current power state is not RUNNING? This code was introduced in this patch and there isn’t much discussion in the review comments as to why a hard reboot is the right choice here. That said, we already fall back to a hard reboot in error cases of a soft reboot inside the libvirt driver, and a hard reboot requires less trust of the surrounding state for the instance (block device mappings, networks and all those side effects mentioned at the very beginning), so I think it is the right call.

In conclusion, we use a hard reboot for soft reboots that fail, and a nova-compute crash during a soft reboot counts as one of those failure cases. So, when nova-compute detects a failed soft reboot, it converts it to a hard reboot and tries again.
