Reactions to a history of block storage at Amazon EC2

  • Post category: EC2

This is an interesting read about the history of the EBS subsystem in Amazon EC2. This quote particularly stands out to me: "What I didn’t realize until I joined Amazon, and seems obvious in hindsight, is that you can design an organization much the same way you can design a software system. Different algorithms have different benefits and tradeoffs in how your organization functions. Where practical, Amazon chooses a divide and conquer approach, and keeps teams small and focused on a self-contained component with well-defined APIs."


The Kubernetes Book (2024 edition)

  • Post category: Book

This is yet another accidental purchase of a self-published book, although I think this one makes a lot of sense as a self-published work. Writing a technical reference book isn't a particularly lucrative pastime for most authors, and self-publishing likely makes it more worthwhile than the traditional publisher route, especially if you can rustle up a good set of technical editors and reviewers yourself. That said, I think one of the risks with self-published technical books like this is that they are overly credulous, and I think this book falls into that trap early by describing Kubernetes as the "cloud operating system". Look, I get it, you're excited about Kubernetes, but claiming that all of the cloud runs on Kubernetes just undermines your work before you've even really started. I can't find any public data, academic or anecdotal, that supports the assertion that Kubernetes is even the most popular way to run workloads in clouds. I'm sure that AWS, for example, has more VMs not running Kubernetes than running it. That said, it is clear at this point that Kubernetes is the dominant player for container clustering. So why not just say…


Providing stable EBS volume device files

  • Post category: AWS

So I had a little adventure at work today, and I am sure this is going to come up again. Imagine that you have an AWS instance with more than one EBS volume attached. On modern instance types the EBS volumes appear as NVMe device files, but the naming of those device files is not stable -- it depends on which PCI device the kernel detects first, and so on. It turns out that providing stable names for the device files is a solved problem though! Specifically, CoreOS has udev rules which use a short script to look up the EC2 EBS device name from the vendor-specific portion of the NVMe id-ctrl data, and provide an appropriate symlink. This saved me a fair bit of mucking around providing stable UUIDs for EBS volume templates, because we can instead just set the device name in the launch template and then have udev enforce that device name on boot. So that's nice. There is of course no real equivalent for OpenStack, as OpenStack generally uses QEMU virtual disks, not emulated NVMe disks. I should think about that some more sometime. For what it's worth, GCE seems to use the device serial number based…
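For the curious, the CoreOS approach looks roughly like this. This is a minimal sketch from memory rather than the exact rules CoreOS ships: the rule file path, the helper script path, and the byte range for the vendor-specific data are all illustrative.

    # /etc/udev/rules.d/90-ebs-nvme.rules (path is illustrative)
    # For each NVMe disk that identifies as EBS, ask a helper script for the
    # device name requested in the launch template, and symlink to it.
    KERNEL=="nvme[0-9]*n[0-9]*", ENV{DEVTYPE}=="disk", ATTRS{model}=="Amazon Elastic Block Store", PROGRAM="/opt/ebs-nvme-mapping /dev/%k", SYMLINK+="%c"

    #!/bin/bash
    # /opt/ebs-nvme-mapping (hypothetical helper): extract the requested
    # device name from the vendor-specific bytes of the NVMe identify
    # controller data, which on EBS start at offset 3072.
    vol=$(nvme id-ctrl --raw-binary "${1}" | cut -c3073-3104 | tr -d '[:space:]')
    vol=${vol#/dev/}
    [ -n "${vol}" ] && echo "${vol}"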


Starting your first instance on Shaken Fist (a video tutorial)

As a bit of an experiment, I've made this quick and dirty "vlog" style tutorial video to show you how to install Shaken Fist on a single machine and boot your first instance. I demonstrate how to install, set up your first virtual network, start the instance, inspect events that the instance has experienced, and then log in. Let me know if you think it's useful.


Deciding when to filter out large scale refactorings from code analysis

  • Post category: OpenStack

I want to be able to see the level of change between OpenStack releases. However, there are a relatively small number of changes with simply huge amounts of delta in them -- they're generally large refactors, or the deletion which happens when part of a repository is spun out into its own project. I therefore wanted to explore what a reasonable size for a change in OpenStack was, so that I could decide what maximum size to filter away as likely to be a refactor. After playing with a couple of approaches, including just randomly picking a number, it seems the logical way to decide is to simply plot a histogram of the various sizes, and then pick a reasonable place on the curve as the cutoff. Due to the large range of values (from zero lines of change to over a million!), I ended up deciding a logarithmic axis was the way to go. For the projects listed in the OpenStack compute starter kit reference set, that produces the following histogram. Filtering out commits over 10,000 lines of delta feels justified based on that graph. For reference, the raw histogram buckets are:
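As a sketch of the approach, something like the Python below would do it. The git invocation, the parsing, and the bucket edges here are my assumptions for illustration, not the exact code behind the graph in the post.

    import subprocess
    import numpy as np
    import matplotlib.pyplot as plt

    # Sum added plus removed lines per commit in the current repository.
    out = subprocess.check_output(
        ['git', 'log', '--pretty=format:@commit@', '--numstat'], text=True)
    sizes, current = [], 0
    for line in out.splitlines():
        if line == '@commit@':
            if current:
                sizes.append(current)
            current = 0
        elif line.strip():
            added, removed, _ = line.split('\t', 2)
            if added != '-':  # binary files report '-' for line counts
                current += int(added) + int(removed)
    if current:
        sizes.append(current)

    # Sizes span zero to over a million lines, so use logarithmic buckets.
    plt.hist(sizes, bins=np.logspace(0, 6, 30))
    plt.xscale('log')
    plt.xlabel('Lines of delta per commit')
    plt.ylabel('Number of commits')
    plt.savefig('histogram.png')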


A quick summary of OpenStack release tags

I wanted a quick summary of OpenStack git release tags for a talk I am working on, and it turned out to be way more complicated than I expected. I ended up having to compile a table, and then turn that into a code snippet. In case it's useful to anyone else, here it is in Python form for those so inclined:

    RELEASE_TAGS = {
        'austin': {'all': '2010.1'},
        'bexar': {'all': '2011.1'},
        'cactus': {'all': '2011.2'},
        'diablo': {'all': '2011.3'},
        'essex': {'all': '2012.1.3'},
        'folsom': {'all': '2012.2.4'},
        'grizzly': {'all': '2013.1.5'},
        'havana': {'all': '2013.2.4'},
        'icehouse': {'all': '2014.1.5'},
        'juno': {'all': '2014.2.4'},
        'kilo': {'all': '2015.1.4'},
        'liberty': {'glance': '11.0.2', 'keystone': '8.1.2',
                    'neutron': '7.2.0', 'nova': '12.0.6'},
        'mitaka': {'glance': '12.0.0', 'keystone': '9.3.0',
                   'neutron': '8.4.0', 'nova': '13.1.4'},
        'newton': {'glance': '13.0.0', 'keystone': '10.0.3',
                   'neutron': '9.4.1', 'nova': '14.1.0'},
        'ocata': {'glance': '14.0.1', 'keystone': '11.0.4',
                  'neutron': '10.0.7', 'nova': '15.1.5'},
        'pike': {'glance': '15.0.2', 'keystone': '12.0.3',
                 'neutron': '11.0.8', 'nova': '16.1.8'},
        'queens': {'glance': '16.0.1', 'keystone': '13.0.4',
                   'neutron': '12.1.1', 'nova': '17.0.13'},
        'rocky': {'glance': '17.0.1', 'keystone': '14.2.0',
                  'neutron': '13.0.7', 'nova': '18.3.0'},
        'stein': {'glance': '18.0.1', 'keystone': '15.0.1',
                  'neutron': '14.4.2', 'nova': '19.3.2'},
        'train': {'glance': '19.0.4', 'keystone': '16.0.1',
                  'neutron': '15.3.0', 'nova': '20.4.1'},
        'ussuri': {…
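A hypothetical lookup helper shows how the table's structure is meant to be used -- get_release_tag is my name for illustration, not something from the post. Early releases tagged every project identically (the 'all' key), while later releases tagged each project separately:

    def get_release_tag(release, project):
        # Prefer a per-project tag, falling back to the shared 'all' tag
        # used by the early coordinated releases.
        tags = RELEASE_TAGS[release]
        return tags.get(project, tags.get('all'))

    print(get_release_tag('liberty', 'nova'))  # 12.0.6
    print(get_release_tag('kilo', 'nova'))     # 2015.1.4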


Rejected talk proposal: Shaken Fist, thought experiments in simpler IaaS clouds

This proposal was submitted for FOSDEM 2021. Given that acceptances were meant to be sent out on 25 December and it's basically a week later, I think we can assume that it's been rejected. I've recently been writing up my rejected proposals, partially because I've put in the effort to write them and they might be useful elsewhere, but also because I think it's important to demonstrate that it's not unusual for experienced speakers to be rejected from these events. OpenStack today is a complicated beast -- not only does it try to perform well for large clusters, but it also embraces a diverse set of possible implementations spanning hypervisors, storage, networking, and more. This was a deliberate tactical choice made by the OpenStack community years ago, forming a so-called "Big Tent" for vendors to collaborate in to build Open Source cloud options. It made a lot of sense at the time, to be honest. However, OpenStack today finds itself constrained by the large number of permutations it must support, ten years of software and backwards compatibility legacy, and a decreasing investment from those same vendors that OpenStack courted so actively. Shaken Fist makes a series of simplifying assumptions…


Introducing Shaken Fist

The first public commit to what would become OpenStack Nova was made ten years ago today -- at Thu May 27 23:05:26 2010 PDT, to be exact. So first off, happy tenth birthday to Nova! A lot has happened in that time -- OpenStack has gone from being two separate Open Source projects to a whole ecosystem, developers have come and gone (and passed away), and OpenStack has weathered the cloud wars of the last decade. OpenStack survived its early growth phase by deliberately offering a "big tent" to the community and associated vendors, with an expansive definition of what should be included. This has resulted in most developers being associated with a corporate sponsor, and hence the decrease in the number of developers today as corporate interest wanes -- OpenStack has never been great at attracting or retaining hobbyist contributors. My personal involvement with OpenStack started in November 2011, so while I missed the very early days, I was around for a lot of it and made many of the mistakes that I now see in OpenStack. What do I see as mistakes in OpenStack in hindsight? Well, embracing vendors who later lose interest has been painful, and has increased the…


Configuring load balancing and location headers on Google Cloud

I have a need at the moment to know where my users are in the world. This helps me to identify which compute resources to serve their requests with in order to reduce the latency they experience. So how do you do that with Google Cloud? The first step is to set up a series of test backends to send traffic to. I built three regions: Sydney; London; and Los Angeles. In hindsight that wasn't actually necessary though -- this would work with a single backend just as well. For my backends I chose a minimal Ubuntu install, running this simple backend HTTP service. I had some initial trouble finding a single page which walked through the setup of the Google Cloud load balancer to do what I wanted, which is the main reason for writing this post. The steps are: create your test instances and configure the backend on them. I ended up with a setup like this: Next, set up instance groups to contain these instances. I chose unmanaged instance groups (that is, I don't want autoscaling). You need to create one per region. But wait! There's one more layer of abstraction. We need a backend…
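As a rough sketch of the gcloud steps for the instance group and backend service layers, it would look something like the below. The names, zones, and health check are placeholders, and this is from memory rather than the exact commands in the post:

    # Create an unmanaged instance group per region and add the test
    # instance to it (repeat for each region).
    gcloud compute instance-groups unmanaged create lb-sydney \
        --zone australia-southeast1-b
    gcloud compute instance-groups unmanaged add-instances lb-sydney \
        --instances backend-sydney --zone australia-southeast1-b

    # Create a global backend service and attach each instance group.
    gcloud compute backend-services create lb-backend \
        --protocol HTTP --health-checks lb-healthcheck --global
    gcloud compute backend-services add-backend lb-backend \
        --instance-group lb-sydney \
        --instance-group-zone australia-southeast1-b --global

    # The location headers from the post title: ask the load balancer to
    # add the client's region and city to each request it proxies.
    gcloud compute backend-services update lb-backend --global \
        --custom-request-header 'X-Client-Geo:{client_region},{client_city}'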


Setting up VXLAN between nested virt VMs on Google Compute Engine

I wanted to play with a VXLAN mesh between VMs on more than one hypervisor node, but the setup for VXLAN ended up being a separate post because it was a bit long. Read that post first if you want to follow the instructions here. Now that we have a working VXLAN mesh between our two nodes, we can move on to installing libvirt (which is called libvirt-daemon-system on Debian, not libvirt-bin as on Ubuntu):

    sudo apt-get install -y qemu-kvm libvirt-daemon-system
    sudo virsh net-start default
    sudo virsh net-autostart --network default

I'm going to use a little python helper to launch my VMs, so I need some other dependencies as well:

    sudo apt-get install -y python3-pip pkg-config libvirt-dev git
    git clone https://github.com/mikalstill/shakenfist
    cd shakenfist
    git checkout 6bfac153d249752b27d224ad9d079095b640498e
    sudo mkdir /srv/shakenfist
    sudo cp template.debian.xml /srv/shakenfist/template.xml
    sudo pip3 install -r requirements.txt

Let's launch a quick test VM to make sure the helper works:

    sudo python3 daemon.py
    sudo virsh list

You can destroy that VM for now, it was just testing the install:

    sudo virsh destroy ...name...

Next we need to tweak the template that shakenfist is using to start instances so that it uses the bridge for networking (that template is the one…
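The networking tweak in question would look something like this in the libvirt domain template. This is a sketch, since the excerpt cuts off before the actual template change, and br-vxlan is a placeholder for whatever your VXLAN bridge is called:

    <!-- In the domain XML template: attach the instance NIC to the
         VXLAN bridge instead of the default libvirt network. -->
    <interface type='bridge'>
      <source bridge='br-vxlan'/>
      <model type='virtio'/>
    </interface>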

