FastCDC, puzzlefs, and de-duplicating container and VM images

Since about 2017, a group at Cisco has been working on an “OCI native operating system” under the title “project machine”, which is a terrible project name. I note that most of the people publicly involved in the project according to GitHub commits no longer work at Cisco, so I cannot vouch for the health of the overall project. That said, they did come up with some interesting ideas along the way, and given it's a quiet time of year I figured I could do some reading.


Interpreting whiteout files in Docker image layers

I've been playing again with Docker images and their internal layers a little more over the last week -- you can see some of my previous adventures at Manipulating Docker images without Docker installed. The general thrust of these adventures is understanding the format and how to manipulate it, by building a tool called Occy Strap which can manipulate the format in useful ways. My eventual goal there is to be able to build OCI-compliant image bundles and then have a container runtime like runc execute them, and I must say I am getting a lot closer. This time I was interested in the exact mechanisms used by whiteout files in those layers and how they interact with Linux kernel overlay filesystem types.

Firstly, what is a whiteout file? Well, when you delete a file or directory from a lower layer in the Docker image, it doesn't actually get removed from that lower layer, as layers are immutable. Instead, the uppermost layer records that the file or directory has been removed, and it is therefore no longer visible in the Docker image that the container sees. This has obvious security implications if you delete a file like a password you…
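To make the whiteout idea concrete, here is a minimal sketch of how an upper layer's whiteout entries could hide paths from lower layers. It assumes the tar-level naming convention from the OCI image specification, where a file named `.wh.<name>` hides `<name>` and a file named `.wh..wh..opq` hides everything already inside its directory. The helper names `classify_entry` and `apply_layer` are hypothetical and made up for illustration; this is not necessarily how Occy Strap or Docker implement it, and it works on plain path strings rather than real tar members.

```python
WHITEOUT_PREFIX = '.wh.'
OPAQUE_MARKER = '.wh..wh..opq'


def classify_entry(path):
    """Classify one layer entry (hypothetical helper).

    Returns a (kind, target) tuple where kind is 'opaque', 'whiteout'
    or 'regular', and target is the path the entry refers to.
    """
    dirname, _, basename = path.rpartition('/')
    if basename == OPAQUE_MARKER:
        # Hides everything lower layers placed inside dirname.
        return 'opaque', dirname
    if basename.startswith(WHITEOUT_PREFIX):
        hidden = basename[len(WHITEOUT_PREFIX):]
        target = dirname + '/' + hidden if dirname else hidden
        return 'whiteout', target
    return 'regular', path


def apply_layer(visible, layer_paths):
    """Return the set of paths visible after stacking one more layer.

    Whiteouts are applied to the lower layers' paths first, then the
    layer's own regular entries are added on top.
    """
    result = set(visible)
    for path in layer_paths:
        kind, target = classify_entry(path)
        if kind == 'whiteout':
            # Remove the path itself and anything underneath it.
            result = {p for p in result
                      if p != target and not p.startswith(target + '/')}
        elif kind == 'opaque':
            # Remove the directory's existing children, keep the directory.
            prefix = target + '/' if target else ''
            result = {p for p in result if not p.startswith(prefix)}
    for path in layer_paths:
        kind, _ = classify_entry(path)
        if kind == 'regular':
            result.add(path)
    return result


# Made-up example data: a lower layer and an upper layer that deletes things.
lower = {'etc', 'etc/passwd', 'etc/secret', 'var', 'var/cache', 'var/cache/a'}
upper = ['etc/.wh.secret', 'var/cache', 'var/cache/.wh..wh..opq', 'var/cache/b']
merged = apply_layer(lower, upper)
print(sorted(merged))
```

Note that `etc/secret` is only hidden in the merged view. It is still present in the lower layer's data, which is the security implication described above.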


Manipulating Docker images without Docker installed

Recently I've been playing a bit more with Docker images and Docker image repositories. I had in the past written a quick hack to let me extract files from a Docker image, but I wanted to do something a little more mature than that. For example, sometimes you want to download an image from a Docker image repository without using Docker. Naively, if you had Docker, you'd do something like this:

    docker pull busybox
    docker save busybox

However, that assumes that you have Docker installed on the machine downloading the images, and that's sometimes not possible for security reasons. The most obvious example I can think of is airgapped secure environments where you need to walk the data between two networks, and the unclassified network machine doesn't allow administrator access to install Docker.

So I wrote a little tool to do image manipulation for me. The tool is called Occy Strap, is written in Python, and is available on PyPI. That means installing it is relatively simple:

    python3 -m venv ~/virtualenvs/occystrap
    . ~/virtualenvs/occystrap/bin/activate
    pip install occystrap

Which doesn't require administrator permissions. There are then a few things we can do with Occy Strap. Downloading an image from a repository and…


Quick hack: extracting the contents of a Docker image to disk

Hello! Please note I've written a little Python tool called Occy Strap which makes this a bit easier, and can do some fancy things around importing and exporting multiple images. You might want to read about it?

For various reasons, I wanted to inspect the contents of a Docker image without starting a container. Docker makes it easy to get an image as a tar file, like this:

    docker save -o foo.tar image

But if you extract that tar file you'll find a configuration file and manifest as JSON files, and then a series of tar files, one per image layer. You use the manifest to determine in what order you extract the tar files to build the container filesystem. That's fiddly and annoying. So I wrote this quick Python hack to extract an image tarball into a directory on disk that I could inspect:

    #!/usr/bin/python3

    # Call me like this:
    #  docker-image-extract tarfile.tar extracted

    import tarfile
    import json
    import os
    import sys

    image_path = sys.argv[1]
    extracted_path = sys.argv[2]

    image = tarfile.open(image_path)
    manifest = json.loads(image.extractfile('manifest.json').read())

    for layer in manifest[0]['Layers']:
        print('Found layer: %s' % layer)
        layer_tar = tarfile.open(fileobj=image.extractfile(layer))

        for tarinfo in layer_tar:
            print(' ... %s' % tarinfo.name)

            if tarinfo.isdev():
                print(' -->…

