virtio-vsock: python examples of running the server in the guest

I've been using virtio-serial for communications between Linux hypervisors and guest virtual machines for ages. Lots of other people do it too -- the qemu guest agent, for example, is implemented like this. In fact, I think that's where I got my original thoughts on the matter from. However, virtio-serial is actually fairly terrible to write against as a programming model, because you're left to do all the multiplexing of various requests down the channel, and surely there's something better?

Well... There is! virtio-vsock is basically the same concept, except it uses the socket interface. You can have more than one connection open and the sockets layer handles multiplexing by magic. This massively simplifies the programming model for supporting concurrent users down the channel. So that's actually pretty cool. I should credit Kata Containers with noticing this quality-of-life improvement nearly a decade before I did, but I get there in the end.

The virtio-vsock model is only a little bit weird. The "address" for the guest virtual machine is a "CID" (Context ID). The hypervisor process is always at CID 0, CID 1 is reserved and unused, and CID 2 is any process on the host which is not…
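A minimal sketch of what a guest-side server might look like, assuming Python 3.7 or later on Linux with a vsock transport available in the guest. The port number 9000 and both function names are hypothetical, chosen for illustration and not taken from the post.

```python
import socket

VSOCK_PORT = 9000  # hypothetical port, not from the post


def echo_once(conn):
    """Read one message from a connected socket and send it straight back."""
    data = conn.recv(4096)
    conn.sendall(data)
    return data


def serve_in_guest(port=VSOCK_PORT):
    """Accept vsock connections inside the guest, one at a time."""
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as srv:
        # VMADDR_CID_ANY binds to whatever CID this guest was assigned.
        srv.bind((socket.VMADDR_CID_ANY, port))
        srv.listen()
        while True:
            # A vsock peer address is a (CID, port) tuple.
            conn, (peer_cid, peer_port) = srv.accept()
            with conn:
                echo_once(conn)
```

A client on the host would connect with the same socket family, using the guest's CID and the port as the address tuple. Each connection is its own stream, which is the multiplexing that virtio-serial leaves to the programmer.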

Minor questions in Linux file semantics

  • Post category:Linux

I’ve known for a long time that if you delete a file on Unix / Linux but that file is open somewhere, the blocks used by the file aren’t freed until that user closes the file (or is terminated), but I was left wondering about some other edge cases.

Shaken Fist has a distributed blob store. It also has a cache of images that virtual machines are using. If the blob store and the image cache are on the same filesystem, sometimes the image cache entry can be a hard link to an entry in the blob store (for example, if the entry in the blob store doesn’t need to be transcoded before use by the virtual machine). However, if they are on different filesystems, I instead use a symbolic link.
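The choice between the two link types can be sketched roughly as follows. This is an illustration only, not Shaken Fist's actual code: the function name `link_into_cache` and the file names are hypothetical, and comparing `st_dev` is just one way of guessing whether two paths share a filesystem.

```python
import os
import tempfile


def link_into_cache(blob_path, cache_path):
    """Hard link when both paths share a filesystem, otherwise symlink."""
    cache_dir = os.path.dirname(cache_path) or "."
    if os.stat(blob_path).st_dev == os.stat(cache_dir).st_dev:
        os.link(blob_path, cache_path)
        return "hard"
    os.symlink(os.path.abspath(blob_path), cache_path)
    return "symbolic"


with tempfile.TemporaryDirectory() as d:
    blob = os.path.join(d, "blob")
    with open(blob, "w") as f:
        f.write("image data")
    kind = link_into_cache(blob, os.path.join(d, "cache-entry"))
    # st_nlink counts the directory entries pointing at the same inode.
    print(kind, os.stat(blob).st_nlink)
```

A more defensive version would probably attempt the hard link and fall back to a symbolic link if `os.link` raises `OSError`, since matching device numbers do not guarantee that a hard link is permitted.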

This raises questions: what happens if you rename a file which is open for writing in a program? What happens if you change a symbolic link to point somewhere else while it is open? I suspect in both cases the right thing happens, but I decided I should test these theories out.
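One way to poke at both questions is a small throwaway script like the one below. It is a sketch of the kind of test described, not the author's own test code, and all file names are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    old, new = os.path.join(d, "old"), os.path.join(d, "new")

    # Case 1: rename a file while it is open for writing.
    with open(old, "w") as f:
        f.write("before ")
        f.flush()
        os.rename(old, new)
        f.write("after")
    with open(new) as f:
        renamed_content = f.read()

    # Case 2: repoint a symlink while a file is open through it.
    a, b, link = (os.path.join(d, n) for n in ("a", "b", "link"))
    for path in (a, b):
        with open(path, "w") as f:
            f.write(os.path.basename(path))
    os.symlink(a, link)
    with open(link) as f:
        os.unlink(link)
        os.symlink(b, link)
        via_open_handle = f.read()
    with open(link) as f:
        via_new_open = f.read()

print(renamed_content, via_open_handle, via_new_open)
```

On a typical Linux filesystem I would expect this to print `before after a b`: the open descriptor follows the inode rather than the name, so writes land in the renamed file, and a symlink is only resolved at the moment the file is opened.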

Linux bridges have their MTU overwritten when you add an interface

  • Post category:Linux

I discovered last night that network bridges on Linux have their Maximum Transmission Unit (MTU) overwritten by whatever is the MTU value of the most recent interface added to the bridge. This is bad. Very bad. Specifically, this is bad because MTU matters for accurately describing the capabilities of the network path the packets will travel on, so it shouldn't be clobbered willy-nilly.

Here's an example of the behaviour:

    # ip link add egr-br-ens1f0 mtu 1500 type bridge
    # ip link show dev egr-br-ens1f0
    3: egr-br-ens1f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 7e:33:1b:30:d8:00 brd ff:ff:ff:ff:ff:ff
    # ip link add egr-eaa64a-o mtu 8950 type veth peer name egr-eaa64a-i
    # ip link show dev egr-br-ens1f0
    3: egr-br-ens1f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether 7e:33:1b:30:d8:00 brd ff:ff:ff:ff:ff:ff
    # brctl addif egr-br-ens1f0 egr-eaa64a-o
    # ip link show dev egr-br-ens1f0
    3: egr-br-ens1f0: <BROADCAST,MULTICAST> mtu 8950 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether da:82:cf:34:13:60 brd ff:ff:ff:ff:ff:ff

So you can see here that the bridge had an MTU of 1,500 bytes. We create a veth pair with an MTU of 8,950 bytes and add it to…
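If you want a script to notice when this has happened, a small check along these lines might help. It is a sketch that assumes the kernel exposes each device's MTU at `/sys/class/net/<device>/mtu`; the function names and the `sysfs_root` parameter are hypothetical, the latter added only so the code can be pointed at a fake directory tree.

```python
import os


def read_mtu(dev, sysfs_root="/sys/class/net"):
    """Return the MTU the kernel currently reports for a network device."""
    with open(os.path.join(sysfs_root, dev, "mtu")) as f:
        return int(f.read().strip())


def mtu_changed(dev, expected, sysfs_root="/sys/class/net"):
    """True if the device's MTU no longer matches the value we configured."""
    return read_mtu(dev, sysfs_root) != expected
```

With the bridge from the example above, calling `mtu_changed("egr-br-ens1f0", 1500)` after the `brctl addif` step would be expected to return `True`.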

The KSM and I

  • Post category:Linux

I spent much of yesterday playing with KSM (Kernel Shared Memory, or Kernel Samepage Merging depending on which universe you come from). Unix kernels store memory in "pages" which are moved in and out of memory as a single block. On most Linux architectures pages are 4,096 bytes long. KSM is a Linux kernel feature which scans memory looking for identical pages, and then de-duplicates them. So instead of having two pages, we just have one and have two processes point at that same page. This has obvious advantages if you're storing lots of repeating data.

Why would you be doing such a thing? Well, the traditional answer is virtual machines. Take my employer's systems for example. We manage virtual learning environments for students, where every student gets a set of virtual machines to do their learning thing on. So, if we have 50 students in a class, we have 50 sets of the same virtual machine. That's a lot of duplicated memory. The promise of KSM is that instead of storing the same thing 50 times, we can store it once and therefore fit more virtual machines onto a single physical machine.

For my experiments I used libvirt /…
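To get a rough feel for how much memory KSM is saving on a given machine, something like the sketch below could be used. It assumes the kernel's KSM counters are exposed under `/sys/kernel/mm/ksm`, where `pages_sharing` counts the additional sites sharing a merged page. The function name and its parameters are hypothetical.

```python
import os


def ksm_saved_bytes(ksm_root="/sys/kernel/mm/ksm", page_size=None):
    """Estimate the memory saved by KSM from the kernel's sysfs counters."""
    if page_size is None:
        # Ask the OS for the page size rather than assuming 4,096 bytes.
        page_size = os.sysconf("SC_PAGE_SIZE")
    with open(os.path.join(ksm_root, "pages_sharing")) as f:
        pages_sharing = int(f.read())
    # Each extra site sharing a merged page is one page not stored twice.
    return pages_sharing * page_size
```

As a sanity check on the arithmetic: 100 shared sites at 4,096 bytes per page works out to 409,600 bytes saved.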

The last week for linux.conf.au 2019 proposals!

Dear humans of the Internet -- there is ONE WEEK LEFT to propose talks for linux.conf.au 2019. LCA is one of the world's best open source conferences, and we'd love to hear you speak!

Unsure what to propose? Not sure if your talk is what the conference would normally take? Just want a chat? You're welcome to reach out to papers-chair@linux.org.au to talk things through.

https://linux.conf.au/call-for-papers/

Giving serial devices meaningful names

  • Post category:Linux

This is a hack I've been using for ages, but I thought it deserved a write up. I have USB serial devices. Lots of them. I use them for home automation things, as well as for talking to devices such as the console ports on switches and so forth. For the permanently installed serial devices, one of the challenges is having them show up in predictable places so that the scripts which know how to drive each device are talking to the right place.

For the trivial case, this is pretty easy with udev:

    $ cat /etc/udev/rules.d/60-local.rules
    KERNEL=="ttyUSB*", \
        ATTRS{idVendor}=="0403", ATTRS{idProduct}=="6001", \
        ATTRS{serial}=="A8003Ye7", \
        SYMLINK+="radish"

This says that for any USB serial device that is discovered (either inserted post boot, or at boot), if the USB vendor and product ID match the relevant values, udev should symlink the device to "/dev/radish". You find out the vendor and product ID from lsusb like this:

    $ lsusb
    Bus 003 Device 003: ID 0624:0201 Avocent Corp.
    Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 007 Device 002: ID 0665:5161 Cypress Semiconductor USB to Serial
    Bus 007 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
    Bus 006 Device 001: ID 1d6b:0001…
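If you have a lot of devices, pulling the vendor and product IDs out of the lsusb output by hand gets tedious. The sketch below shows one way it might be scripted, assuming lsusb lines follow the `Bus NNN Device NNN: ID vvvv:pppp name` layout seen above. The regular expression and the function name are my own illustration.

```python
import re

LSUSB_LINE = re.compile(
    r"Bus (?P<bus>\d{3}) Device (?P<device>\d{3}): "
    r"ID (?P<vendor>[0-9a-f]{4}):(?P<product>[0-9a-f]{4})\s*(?P<name>.*)"
)


def parse_lsusb(output):
    """Yield (vendor, product, name) for each device line in lsusb output."""
    for line in output.splitlines():
        match = LSUSB_LINE.match(line.strip())
        if match:
            yield match["vendor"], match["product"], match["name"].strip()


sample = """\
Bus 003 Device 003: ID 0624:0201 Avocent Corp.
Bus 007 Device 002: ID 0665:5161 Cypress Semiconductor USB to Serial
"""
for vendor, product, name in parse_lsusb(sample):
    # Print the two match clauses a udev rule for this device would need.
    print(f'ATTRS{{idVendor}}=="{vendor}", ATTRS{{idProduct}}=="{product}"  # {name}')
```

The printed fragments still need a `KERNEL` match and a `SYMLINK` name added before they form a complete rule.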
