Introducing Shaken Fist

Share

The first public commit to what would become OpenStack Nova was made ten years ago today — at Thu May 27 23:05:26 2010 PDT to be exact. So first off, happy tenth birthday to Nova!

A lot has happened in that time — OpenStack has gone from being two separate Open Source projects to a whole ecosystem, developers have come and gone (and passed away), and OpenStack has weathered the cloud wars of the last decade. OpenStack survived its early growth phase by deliberately offering a “big tent” to the community and associated vendors, with an expansive definition of what should be included. This has resulted in most developers being associated with a corporate sponser, and hence the decrease in the number of developers today as corporate interest wanes — OpenStack has never been great at attracting or retaining hobbist contributors.

My personal involvement with OpenStack started in November 2011, so while I missed the very early days I was around for a lot and made many of the mistakes that I now see in OpenStack.

What do I see as mistakes in OpenStack in hindsight? Well, embracing vendors who later lose interest has been painful, and has increased the complexity of the code base significantly. Nova itself is now nearly 400,000 lines of code, and that’s after splitting off many of the original features of Nova such as block storage and networking. Additionally, a lot of our initial assumptions are no longer true — for example in many cases we had to write code to implement things, where there are now good libraries available from third parties.

That’s not to say that OpenStack is without value — I am a daily user of OpenStack to this day, and use at least three OpenStack public clouds at the moment. That said, OpenStack is a complicated beast with a lot of legacy that makes it hard to maintain and slow to change.

For at least six months I’ve felt the desire for a simpler cloud orchestration layer — both for my own personal uses, and also as a test bed for ideas for what a smaller, simpler cloud might look like. My personal use case involves a relatively small environment which echos what we now think of as edge compute — less than 10 RU of machines with a minimum of orchestration and management overhead.

At the time that I was thinking about these things, the Australian bushfires and COVID-19 came along, and presented me with a lot more spare time than I had expected to have. While I’m still blessed to be employed, all of my social activities have been cancelled, so I find myself at home at a loose end on weekends and evenings at lot more than before.

Thus Shaken Fist was born — named for a Simpson’s meme, Shaken Fist is a deliberately small and highly opinionated cloud implementation aimed at working well in small deployments such as homes, labs, edge compute locations, deployed systems, and so forth.

I’d taken a bit of trouble with each feature in Shaken Fist to think through what the simplest and highest value way of doing something is. For example, instances always get a config drive and there is no metadata server. There is also only one supported type of virtual networking, and one supported hypervisor. That said, this means Shaken Fist is less than 5,000 lines of code, and small enough that new things can be implemented very quickly by a single middle aged developer.

Shaken Fist definitely has feature gaps — API authentication and scheduling are the most obvious at the moment — but I have plans to fill those when the time comes.

I’m not sure if Shaken Fist is useful to others, but you never know. Its apache2 licensed, and available on github if you’re interested.

Share

Exporting volumes from Cinder and re-creating COW layers

Share

Today I wandered into a bit of a rat hole discovering how to export data from OpenStack Cinder volumes when you don’t have admin permissions, and I thought it was worth documenting here so I remember it for next time.

Let’s assume that you have a Cinder volume named “child1”, which is a 64gb volume originally cloned from “parent1”. parent1 is a 7.9gb VMDK, but the only way I can find to extract child1 is to convert it to a glance image and then download the entire volume as a raw. Something like this:

$ cinder upload-to-image $child1 "extract:$child1"

Where $child1 is the UUID of the Cinder volume. You then need to find the UUID of the image in Glance, which the Cinder upload-to-image command will have told you, but you can also find by searching Glance for your image named “extract:$child1”:

$ glance image-list | grep "extract:$cinder_uuid"

You now need to watch that Glance image until the status of the image is “active”. It will go through a series of steps with names like “queued”, and “uploading” first.

Now you can download the image from Glance:

$ glance image-download --file images/$child1.raw --progress $glance_uuid

And then delete the intermediate glance image:

$ glance image-delete $glance_uuid

I have a bad sample script which does this in my junk code repository if that is helpful.

What you have at the end of this is a 64gb raw disk file in my example. You can convert that file to qcow2 like this:

$ qemu-img convert $child1.raw $child1.qcow2

But you’re left with a 64gb qcow2 file for your troubles. I experimented with virt-sparsify to reduce the size of this image, but it doesn’t work in my case (no space is saved), I suspect because the disk image has multiple partitions because it originally came from a VMWare environment.

Luckily qemu-img can also re-create the COW layer that existing on the admin-only side of the public cloud barrier. You do this by rebasing the converted qcow2 file onto the original VMDK file like this:

$ qemu-img create -f qcow2 -b $parent1.qcow2 $child1.delta.qcow2
$ qemu-img rebase -b $parent1.vmdk $child1.delta.qcow2

In my case I ended up with a 289mb $child1.delta.qcow2 file, which isn’t too shabby. It took about five minutes to produce that delta on my Google Cloud instance from a 7.9gb backing file and a 64gb upper layer.

Share

What’s missing from the ONAP community — an open design process

Share

I’ve been thinking a fair bit about ONAP and its future releases recently. This is in the context of trying to implement a system for a client which is based on ONAP. Its really hard though, because its hard to determine how various components of ONAP are intended to work, or interoperate.

It took me a while, but I’ve realised what’s missing here…

OpenStack has an open design process. If you want to add a new feature to Nova for example, the first step is you need to write down what the feature is intended to do, how it integrates with the rest of Nova, and how people might use it. The target audience for that document is both the Nova development team, but also people who operate OpenStack deployments.

ONAP has no equivalent that I can find. So for example, they say that in Casablanca they are going to implement a “AAI Enricher” to ease lookup of data from external systems in their inventory database, but I can’t find anywhere where they explain how the integration between arbitrary external systems and ONAP AAI will work.

I think ONAP would really benefit from a good hard look at their design processes and how approachable they are for people outside their development teams. The current use case proposal process (videos, conference talks, and powerpoint presentations) just isn’t great for people who are trying to figure out how to deploy their software.

Share

Learning from the mistakes that even big projects make

Share

The following is a blog post version of a talk presented at pyconau 2018. Slides for the presentation can be found here (as Microsoft powerpoint, or as PDF), and a video of the talk (thanks NextDayVideo!) is below:

 

OpenStack is an orchestration system for setting up virtual machines and associated other virtual resources such as networks and storage on clusters of computers. At a high level, OpenStack is just configuring existing facilities of the host operating system — there isn’t really a lot of difference between OpenStack and a room full of system admins frantically resolving tickets requesting virtual machines be setup. The only real difference is scale and predictability.

To do its job, OpenStack needs to be able to manipulate parts of the operating system which are normally reserved for administrative users. This talk is the story of how OpenStack has done that thing over time, what we learnt along the way, and what I’d do differently if I had my time again. Lots of systems need to do these things, so even if you never use OpenStack hopefully there are things to be learnt here.

Continue reading “Learning from the mistakes that even big projects make”

Share

The last week for linux.conf.au 2019 proposals!

Share
Dear humans of the Internet — there is ONE WEEK LEFT to propose talks for linux.conf.au 2019. LCA is one of the world’s best open source conferences, and we’d love to hear you speak!
 
Unsure what to propose? Not sure if your talk is what the conference would normally take? Just want a chat? You’re welcome to reach out to papers-chair@linux.org.au to talk things through.
 
https://linux.conf.au/call-for-papers/
Share

Rejected talk proposal: Design at scale: OpenStack versus Kubernetes

Share

This proposal was submitted for pyconau 2018. It wasn’t accepted, but given I’d put the effort into writing up the proposal I’ll post it here in case its useful some other time. The oblique references to OpensStack are because pycon had an “anonymous” review system in 2018, and I was avoiding saying things which directly identified me as the author.


OpenStack and Kubernetes solve very similar problems. Yet they approach those problems in very different ways. What can we learn from the different approaches taken? The differences aren’t just technical though, there are some interesting social differences too. Continue reading “Rejected talk proposal: Design at scale: OpenStack versus Kubernetes”

Share

Accepted talk proposal: Learning from the mistakes that even big projects make

Share

This proposal was submitted for pyconau 2018. It was accepted, but hasn’t been presented yet. The oblique references to OpensStack are because pycon had an “anonymous” review system in 2018, and I was avoiding saying things which directly identified me as the author.


Since 2011, I’ve worked on a large Open Source project in python. It kind of got out of hand – 1000s of developers and millions of lines of code. Yet despite being well resourced, we made the same mistakes that those tiny scripts you whip up to solve a small problem make. Come learn from our fail.

Continue reading “Accepted talk proposal: Learning from the mistakes that even big projects make”

Share

Adding oslo privsep to a new project, a worked example

Share

You’ve decided that using sudo to run command lines as root is lame and that it is time to step up and do things properly. How do you do that? Well, here’s a simple guide to adding oslo privsep to your project!

Continue reading “Adding oslo privsep to a new project, a worked example”

Share

How to make a privileged call with oslo privsep

Share

Once you’ve added oslo privsep to your project, how do you make a privileged call? Its actually really easy to do. In this post I will assume you already have privsep running for your project, which at the time of writing limits you to OpenStack Nova in the OpenStack universe.

Continue reading “How to make a privileged call with oslo privsep”

Share

On Selecting a Well Engaged Open Source Vendor

Share

Aptira is in an interesting position in the Open Source market, because we don’t usually sell software. Instead, our customers come to us seeking assistance with deciding which OpenStack to use, or how to embed ONAP into their nationwide networks, or how to move their legacy networks to the software defined future. Therefore, our most common role is as a trusted advisor to help our customers decide which Open Source products to buy.

(My boss would insist that I point out here that we do customisation of Open Source for our customers, and have assisted many in the past with deploying pure upstream solutions. Basically, we do what is the right fit for the customer, and aren’t obsessed with fitting customers into pre-defined moulds that suit our partners.)

That makes it important that we recommend products from companies that are well engaged with their upstream Open Source communities. That might be OpenStack, or ONAP, or even something like Open Daylight. This raises the obvious question – what makes a company well engaged with an upstream project?

Read more over at my employer’s blog

Share