I was talking to a friend the other day about our shared appreciation of virtio-vsock, and it made me wonder something. How do virtual machines on Linux actually work? I know it involves qemu and the kernel’s KVM virtual machine implementation, but exactly how do they interact? How does the kernel get qemu to do emulation tasks as required?
qemu is several things hanging out together in a trench coat, but one of those things is software which can configure Linux’s built-in KVM virtual machine functionality to run a virtual machine, and then handle emulation of the devices attached to that virtual machine which cannot be backed by actual physical hardware. This part of qemu is called a “KVM client” in the Linux kernel documentation. It’s called that because, if we ignore the emulation part for now, it is literally just a client calling established Linux kernel APIs.
I’ve fallen into this pattern where I do an hour or so of self-directed learning in the mornings before going to work. Until recently it was an excellent CMU course on the design of SQL database systems, which I’ve mentioned previously here. I’ve finished that, so I thought I would do something shorter and fun as a break before finding another course to do. I chose the freeCodeCamp.org “hot dog or not hot dog” TensorFlow course. 90 minutes seemed achievable, and I too wish to know if an object in front of me is a hot dog or not.
I make no claim to be an expert at this, but I did just need to convert a project from a slightly complicated setup.py / PBR configuration to pyproject.toml and thought I should write up where I landed. I say “slightly complicated” because there are a few very OpenStacky things I like to do in these things. Specifically:
version numbers are driven by git tags not hard coded in the configuration file.
console scripts are a thing.
I often include data files in the built package.
So here’s an example of all of those things that is working ok for me:
I’ve been using virtio-serial for communications between Linux hypervisors and guest virtual machines for ages. Lots of other people do it too; the qemu guest agent, for example, is implemented like this. In fact, I think that’s where I got my original thoughts on the matter from. However, virtio-serial is actually fairly terrible to write against as a programming model, because you’re left to do all the multiplexing of various requests down the channel yourself, and surely there’s something better?
Well… There is! virtio-vsock is basically the same concept, except it uses the socket interface. You can have more than one connection open and the sockets layer handles multiplexing by magic. This massively simplifies the programming model for supporting concurrent users down the channel. So that’s actually pretty cool. I should credit Kata Containers with noticing this quality of life improvement nearly a decade before I did, but I get there in the end.
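To give a feel for what "uses the socket interface" means in practice, here is a minimal guest-side sketch using Python's standard socket module on Linux. The port number, the `send_request` helper and the "send, half-close, read until EOF" framing are all assumptions made up for illustration; they are not taken from any particular project.

```python
import socket

# Hypothetical port chosen for illustration; a host-side service would
# need to be listening on the same vsock port.
EXAMPLE_PORT = 1234


def send_request(payload: bytes, port: int = EXAMPLE_PORT) -> bytes:
    """Send one request from a guest to the hypervisor over vsock."""
    # AF_VSOCK addresses are (context id, port) pairs, and
    # VMADDR_CID_HOST is the well-known context id of the host.
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as s:
        s.connect((socket.VMADDR_CID_HOST, port))
        s.sendall(payload)
        # Assumed framing: half-close to mark the end of the request,
        # then read the reply until the peer closes its side.
        s.shutdown(socket.SHUT_WR)
        chunks = []
        while chunk := s.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)
```

Each call opens its own connection, so several callers can talk to the host at once without any hand-rolled multiplexing, which is the quality of life improvement over virtio-serial.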
I got interested today in trying to come up with a solid way of determining when updates were last applied to a RHEL-derived Linux instance. Previously we’d been inferring it from the kernel version, but it turns out there is a convenient “yum history” or “dnf history” command which will show you all the previous transactions that the package database has seen. However, the output is hard to parse in a script.
In New python syntax I was previously unaware of, I discussed some new operators I'd recently discovered. One of them is called the Walrus operator, which lets you write code like this:

    items = ['a', 'b', 'c']

    def get_one():
        if not items:
            return None
        return items.pop()

    while one := get_one():
        print(one)

See where we do the assignment inside the while? That code prints:

    c
    b
    a

Which is as expected. However, the Walrus operator is strict about needing a falsy value such as None to end the iteration. I had code which was more like this:

    items = [('a', 1), ('b', 2), ('c', 3)]

    def get_one():
        if not items:
            return None, None
        return items.pop()

    while one := get_one():
        print(one)

And the while loop never terminates, because a non-empty tuple such as (None, None) is always truthy. It just prints (None, None) over and over. So there you go.
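One possible way around this, sketched here as a variation of the second example, is to keep the walrus assignment but test the first element of the returned tuple explicitly instead of relying on the truthiness of the tuple itself. The names `items` and `seen` are just for illustration.

```python
items = [('a', 1), ('b', 2), ('c', 3)]


def get_one():
    # Same shape as the problem case: always returns a two-tuple.
    if not items:
        return None, None
    return items.pop()


seen = []
# The parentheses matter: assign the whole tuple to one, then check
# its first element against None to decide when to stop.
while (one := get_one())[0] is not None:
    name, value = one
    seen.append((name, value))
    print(name, value)
```

This prints the three pairs in reverse order and then stops, because the loop condition now looks inside the tuple and no longer depends on the tuple being falsy.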
This post documents the new syntax features I learned about while reading cpython internals.

You can create more than one context manager on a single line. So for example Shaken Fist contains code like this:

    with open(path + '.new', 'w') as o:
        with open(path, 'r') as i:
            ...

That can now be written like this:

    with open(path + '.new', 'w') as o, open(path, 'r') as i:
        ...

You can assign values in a while statement, but only one. Instead of this:

    d = f.read(8000)
    while d:
        ...
        d = f.read(8000)

You can write this:

    while d := f.read(8000):
        ...

But unfortunately this doesn't work:

    while a, b := thing():
        ...

You can use underscores as commas in long numbers to make them easier to read. For example, you can write 1000000 or 1_000_000 and they both mean the same thing.

You can refer to positional arguments by name, but you can also disable that. I didn't realise that this was valid python:

    def foo(bar=None):
        print(bar)

    foo(bar='banana')

You can turn it off with a forward slash in the argument list though, which separates positional-only arguments from those that can also be passed by name:

    def foo(bar, /, extra=None):
        print(bar)
        print(extra)

    foo('banana', extra='frog')

The above example…
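As a small runnable check of the last two points (this needs Python 3.8 or later), the sketch below confirms that underscores in numeric literals are purely visual, and that an argument listed before the forward slash can no longer be passed by name. The variable names `same_number`, `result` and `rejected` are just for illustration.

```python
def foo(bar, /, extra=None):
    """bar is positional-only; extra may still be passed by name."""
    return bar, extra


# Underscores in numeric literals do not change the value.
same_number = 1_000_000 == 1000000

# Passing bar positionally works as before.
result = foo('banana', extra='frog')

# Passing bar by name is rejected with a TypeError.
try:
    foo(bar='banana')
    rejected = False
except TypeError:
    rejected = True

print(same_number, result, rejected)
```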
I have been paid money to write Python code since about 2006, so I figured it was probably time that I should understand some of the inner workings of Python. I therefore picked up two books on the topic, this one being the first of the two.
This book, to be honest, isn’t completely what I expected. It’s very well written and quite interesting, but it’s more about the things you’d need to know to become a Python core developer than the things you should know as a user of Python, like how the Python dictionary implementation is built.
(If you want that specifically, this video is an excellent introduction).
Get your guided tour through the Python 3.9 interpreter: Unlock the inner workings of the Python language, compile the Python interpreter from source code, and participate in the development of CPython. Are there certain parts of Python that just seem like magic? This book explains the concepts, ideas, and technicalities of the Python interpreter in an approachable and hands-on fashion. Once you see how Python works at the interpreter level, you can optimize your applications and fully leverage the power of Python.
So last night Shaken Fist CI jobs started failing with errors like this (edited lightly for clarity):

    Building wheels for collected packages: shakenfist-ci
      Building wheel for shakenfist-ci (setup.py): started
      Building wheel for shakenfist-ci (setup.py): finished with status 'error'
      error: subprocess-exited-with-error

      × python setup.py bdist_wheel did not run successfully.
      │ exit code: 1
      ╰─> [86 lines of output]
          ...
          ...setuptools/command/install.py:37: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
            setuptools.SetuptoolsDeprecationWarning,
          installing to build/bdist.linux-x86_64/wheel
          running install
          ...
          warning: install_lib: byte-compiling is disabled, skipping.
          running install_egg_info
          Copying shakenfist_ci.egg-info to build/bdist.linux-x86_64/wheel/shakenfist_ci-0.0.1.dev2544-py3.7.egg-info
          running install_scripts
          error: invalid command 'bdist_wininst'
          [end of output]

This was pretty concerning. I know that a setup.py / setup.cfg style install is a little old school, but it was unexpected that it broke entirely. At first I thought I'd have to convert to poetry to unblock this, but Chet helpfully pointed out that this is as simple as adding a pyproject.toml file to the directory which contains your setup.py and setup.cfg. The basic issue is that a modern pip doesn't assume that you're going to use setuptools, so you need to tell it that you're doing that in pyproject.toml. Then you're unblocked. So, just create a file named…
So, as of today my Shaken Fist CI jobs for Debian 10 are failing to install bcrypt, with an error that looks like this:

    Running setup.py install for bcrypt: started
    Running setup.py install for bcrypt: finished with status 'error'
    [ ... snip ... ]
    running build_rust

    =============================DEBUG ASSISTANCE=============================
    If you are seeing a compilation error please try the following steps to
    successfully install bcrypt:
    1) Upgrade to the latest pip and try again. This will fix errors for most
       users. See: https://pip.pypa.io/en/stable/installing/#upgrading-pip
    2) Ensure you have a recent Rust toolchain installed. bcrypt requires
       rustc >= 1.56.0.

    Python: 3.7.3
    platform: Linux-4.19.0-21-amd64-x86_64-with-debian-10.12
    pip: 18.1
    setuptools: 65.2.0
    setuptools_rust: 1.5.1
    rustc: n/a
    =============================DEBUG ASSISTANCE=============================

I'm not really interested in debating why installing a python package requires a rust compiler; that has been discussed elsewhere. This specific breakage has been caused by bcrypt releasing 4.0.0, which has this in the changelog: "bcrypt is now implemented in Rust. Users building from source will need to have a Rust compiler available. Nothing will change for users downloading wheels." Unfortunately, you can't just install rustc with apt, as it is both quite big (350 MB) and too old (version 1.41.1 versus the required 1.56.0 or better). I also couldn't…