For various reasons, I wanted to inspect the contents of a Docker image without starting a container. Docker makes it easy to get an image as a tar file, like this:
docker save -o foo.tar image
But if you extract that tar file you’ll find a configuration file and manifest as JSON files, and then a series of tar files, one per image layer. You use the manifest to determine in what order you extract the tar files to build the container filesystem.
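To make the ordering concrete, here is a minimal sketch that reads the manifest out of the saved tarball and returns the layer paths in the order they should be applied. The function name `list_layers` is made up for illustration, and it assumes the archive holds a single image, so it only looks at the first manifest entry.

```python
import json
import tarfile


def list_layers(image_path):
    """Return the layer tarball paths from a `docker save` archive, in apply order."""
    with tarfile.open(image_path) as image:
        # manifest.json sits at the top level of the archive.
        manifest = json.load(image.extractfile('manifest.json'))
    # The manifest is a JSON list with one entry per saved image.
    # Assumption: only one image was saved, so take the first entry.
    return manifest[0]['Layers']
```

Calling `list_layers('foo.tar')` on the tarball from the `docker save` command above should give the layer paths, base layer first.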
That’s fiddly and annoying. So I wrote this quick Python hack to extract an image tarball into a directory on disk that I could inspect:
#!/usr/bin/python3

# Call me like this:
#  docker-image-extract tarfile.tar extracted

import tarfile
import json
import os
import sys

image_path = sys.argv[1]
extracted_path = sys.argv[2]

image = tarfile.open(image_path)

# manifest.json is a list with one entry per saved image; use the first.
manifest = json.loads(image.extractfile('manifest.json').read())

# Layers are listed in the order they must be applied.
for layer in manifest[0]['Layers']:
    print('Found layer: %s' % layer)
    layer_tar = tarfile.open(fileobj=image.extractfile(layer))

    for tarinfo in layer_tar:
        print(' ... %s' % tarinfo.name)
        if tarinfo.isdev():
            print('  --> skip device files')
            continue

        # A later layer can replace a file from an earlier one.
        dest = os.path.join(extracted_path, tarinfo.name)
        if not tarinfo.isdir() and os.path.exists(dest):
            print('  --> remove old version of file')
            os.unlink(dest)

        layer_tar.extract(tarinfo, path=extracted_path)
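One thing the script above does not do is process whiteout entries, which is how a later layer records that a file from an earlier layer was deleted: an entry named `.wh.<name>` means `<name>` in the same directory should be removed. If that matters for what you are inspecting, a rough sketch of handling it might look like the following. The helper name `apply_whiteout` is hypothetical, and opaque directory markers (`.wh..wh..opq`) are recognised but deliberately left unhandled here.

```python
import os
import shutil

WHITEOUT_PREFIX = '.wh.'
OPAQUE_MARKER = '.wh..wh..opq'


def apply_whiteout(extracted_path, member_name):
    """Apply a whiteout entry; return True if member_name was a whiteout marker."""
    parent, base = os.path.split(member_name)
    if not base.startswith(WHITEOUT_PREFIX):
        return False
    if base == OPAQUE_MARKER:
        # Opaque directory marker: not handled in this sketch, just skipped.
        return True
    # Strip the prefix to get the name that an earlier layer extracted.
    target = os.path.join(extracted_path, parent, base[len(WHITEOUT_PREFIX):])
    if os.path.isdir(target) and not os.path.islink(target):
        shutil.rmtree(target)
    elif os.path.lexists(target):
        os.unlink(target)
    return True
```

In the extraction loop this would be called with `tarinfo.name` before extracting, skipping to the next entry whenever it returns `True`, so the marker files themselves never land on disk.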
Hopefully that’s useful to someone else (or future me).