Here I try to figure out possible ways of invoking nspawn for the
prepare, run, and cleanup steps of GitLab custom runners. The results might be
useful invocations beyond GitLab's scope of application.
I begin with a chroot which will be the base for our build environments:
debootstrap --variant=minbase --include=git,build-essential buster workdir
Fully ephemeral nspawn
This would be fantastic: set up a reusable chroot, mount it read-only, and run the CI
in a working directory mounted on tmpfs. It sets up quickly, it cleans up after
itself, and it would make the prepare and cleanup steps no-ops:
mkdir workdir/var/lib/gitlab-runner
systemd-nspawn --read-only --directory workdir --tmpfs /var/lib/gitlab-runner "$@"
However, the run step gets run multiple times, so I need the side effects of
each run to persist inside the chroot between runs.
Also, if the CI uses a large amount of disk space, the tmpfs may run out of room, since it is backed by memory and swap.
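If the tmpfs approach were still wanted, the --tmpfs option of systemd-nspawn also accepts a colon-separated path and mount option string, so the size could be chosen explicitly. A minimal sketch, assuming a hypothetical helper name (tmpfs_arg) and an arbitrary default of 2G that is not from the original setup:

```shell
# Hypothetical helper: build a --tmpfs argument with an explicit size.
# mode=0755 is given explicitly because, once mount options are passed,
# the kernel default access mode is used instead of nspawn's 0755.
tmpfs_arg() {
    path=$1
    size=${2:-2G}   # assumed default, pick what fits the CI jobs
    printf '%s=%s:size=%s,mode=0755\n' --tmpfs "$path" "$size"
}

# Example (needs root and the chroot created above):
# systemd-nspawn --read-only --directory workdir \
#     "$(tmpfs_arg /var/lib/gitlab-runner 4G)" "$@"
```

This only moves the failure from memory exhaustion to a "no space left on device" error inside the job, so it does not remove the underlying limitation.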
nspawn with overlay
Federico used --overlay to keep the base chroot readonly while allowing persistent writes on a temporary directory on the filesystem.
Note that using --overlay requires systemd and systemd-container from buster-backports because of systemd bug #3847.
mkdir -p tmp-overlay
systemd-nspawn --quiet -D workdir \
    --overlay="`pwd`/workdir:`pwd`/tmp-overlay:/"
I can run this twice, and changes in the file system will persist between systemd-nspawn executions. Great! However, any process will be killed at the end of each execution.
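Since the writable layer is just a directory, each CI job could get its own. A minimal sketch of building the --overlay argument from a base chroot and an upper directory; the helper name overlay_arg and the mktemp template in the usage comment are hypothetical:

```shell
# Hypothetical helper: build the --overlay argument from a base chroot
# and a writable upper directory. Both must exist; they are turned into
# absolute paths, like the `pwd` calls in the invocation above.
overlay_arg() {
    base=$(cd "$1" && pwd) || return 1
    upper=$(cd "$2" && pwd) || return 1
    printf '%s=%s:%s:/\n' --overlay "$base" "$upper"
}

# Example (needs root): one throwaway upper directory per CI job
# upper=$(mktemp -d ./overlay-XXXXXX)
# systemd-nspawn --quiet -D workdir "$(overlay_arg workdir "$upper")"
# rm -rf "$upper"
```

Paths containing a colon would need escaping as "\:" for systemd-nspawn; the sketch does not handle that.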
I can give a name to systemd-nspawn invocations using --machine, and it allows me to run multiple commands during the machine lifespan using machinectl and systemd-run.
machinectl can also fully manage chroots and disk images in /var/lib/machines, but I haven't found a way with machinectl to start multiple machines sharing the same underlying chroot.
It's ok, though: I managed to do that with systemd-nspawn itself.
I can use the --machine=name argument to systemd-nspawn to make it visible to machinectl. I can use the --boot argument to systemd-nspawn to start enough infrastructure inside the container to allow machinectl to interact with it.
This gives me any number of persistent, named running systems that share the same underlying chroot and can clean up after themselves. I can run commands in any of those systems as I like, and their side effects persist until a system is stopped.
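Stopping one of these systems and discarding its writable layer could be sketched as below. The helper name cleanup_machine, the DRY_RUN switch, and the example names in the usage comment are hypothetical. machinectl terminate kills the machine without a clean shutdown; machinectl poweroff would be the gentler alternative.

```shell
# Hypothetical helper: stop a named machine and remove its overlay directory.
# With DRY_RUN set, only print what would be done.
cleanup_machine() {
    machine=$1
    upper=$2
    # Refuse to continue with empty arguments, to avoid a stray rm -rf.
    if [ -z "$machine" ] || [ -z "$upper" ]; then
        echo "usage: cleanup_machine MACHINE OVERLAY_DIR" >&2
        return 1
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "machinectl terminate $machine" "rm -rf -- $upper"
    else
        machinectl terminate "$machine"
        # Assumption: the machine is gone by now; a real script may need
        # to wait until it no longer shows up in "machinectl list".
        rm -rf -- "$upper"
    fi
}

# Example (needs root):
# cleanup_machine ci-job-1 "$PWD/overlay-ci-job-1"
```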
The chroot needs systemd and dbus for machinectl to be able to interact with it:
debootstrap --variant=minbase --include=git,systemd,dbus,build-essential buster workdir
Let's boot the machine:
mkdir -p overlay
systemd-nspawn --quiet -D workdir \
    --overlay="`pwd`/workdir:`pwd`/overlay:/" --machine=test --boot
Let's try machinectl:
# machinectl list
MACHINE CLASS     SERVICE        OS     VERSION ADDRESSES
test    container systemd-nspawn debian 10      -

1 machines listed.
# machinectl shell --quiet test /bin/ls -la /
total 60
[…]
To run commands, rather than machinectl shell, I need to use
systemd-run --wait --pipe --machine=name, otherwise machined won't forward the exit
code. The result however is
pretty good, with working stdin/stdout/stderr redirection and a forwarded exit
code.
Good, I'm getting somewhere.
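Running a command inside a named machine with systemd-run --wait --pipe --machine could be wrapped roughly like this. The function name run_in_machine and the DRY_RUN switch are hypothetical conveniences, not part of the original setup:

```shell
# Hypothetical wrapper: run a command inside a named machine and return
# whatever exit code systemd-run reports for it.
# With DRY_RUN set, only print the command line instead of executing it.
run_in_machine() {
    machine=$1
    shift
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "systemd-run --quiet --wait --pipe --machine=$machine $*"
    else
        systemd-run --quiet --wait --pipe --machine="$machine" "$@"
    fi
}

# Example (needs root and the booted "test" machine):
# echo hello | run_in_machine test /bin/cat
# run_in_machine test /bin/sh -c 'exit 3'; echo "exit code: $?"
```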
The terminal where I ran systemd-nspawn is currently showing a nice getty for the booted system, which is cute, but not what I want for the setup process of a CI.
Spawning machines without needing a terminal
machinectl uses /lib/systemd/system/systemd-nspawn@.service to start
machines. I suppose there's limited magic in there: start
systemd-nspawn as a service, use
--machine to give it a name, and
machinectl manages it as if
it started it itself.
What if, instead of installing a unit file for each CI run, I try to do the
same thing with systemd-run?
systemd-run \
    -p 'KillMode=mixed' \
    -p 'Type=notify' \
    -p 'RestartForceExitStatus=133' \
    -p 'SuccessExitStatus=133' \
    -p 'Slice=machine.slice' \
    -p 'Delegate=yes' \
    -p 'TasksMax=16384' \
    -p 'WatchdogSec=3min' \
    systemd-nspawn --quiet -D `pwd`/workdir \
    --overlay="`pwd`/workdir:`pwd`/overlay:/" --machine=test --boot
It works! I can interact with it using machinectl, and fine-tune
the unit properties as needed to lock CI machines down.
This setup has a race condition where if I try to run a command inside the machine in the short time window before the machine has finished booting, it fails:
# systemd-run […] systemd-nspawn […] ; machinectl --quiet shell test /bin/ls -la /
Failed to get shell PTY: Protocol error
# machinectl shell test /bin/ls -la /
Connected to machine test. Press ^] three times within 1s to exit session.
total 60
[…]
systemd-nspawn has the option --notify-ready=yes that solves exactly this problem:
# systemd-run […] systemd-nspawn […] --notify-ready=yes ; machinectl --quiet shell test /bin/ls -la /
Running as unit: run-r5a405754f3b740158b3d9dd5e14ff611.service
total 60
[…]
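Putting the systemd-run properties, the overlay, and --notify-ready=yes together, starting a named machine could be sketched as a single function. The function name start_machine, its argument order, the DRY_RUN switch, and the paths in the usage comment are hypothetical; the options themselves are the ones used in the invocations above.

```shell
# Hypothetical helper: start a booted, named machine as a transient unit.
# Arguments: machine name, absolute path of the base chroot, absolute path
# of the overlay upper directory.
# With DRY_RUN set, only print the command line instead of executing it.
start_machine() {
    machine=$1
    base=$2
    upper=$3
    set -- systemd-run \
        -p KillMode=mixed -p Type=notify \
        -p RestartForceExitStatus=133 -p SuccessExitStatus=133 \
        -p Slice=machine.slice -p Delegate=yes \
        -p TasksMax=16384 -p WatchdogSec=3min \
        systemd-nspawn --quiet -D "$base" \
        "--overlay=$base:$upper:/" "--machine=$machine" \
        --boot --notify-ready=yes
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Example (needs root):
# mkdir -p "$PWD/overlay"
# start_machine test "$PWD/workdir" "$PWD/overlay"
# machinectl --quiet shell test /bin/ls -la /
```

The dry-run output is meant for inspection only; it is not quoted for re-execution if the paths contain spaces.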
On nspawn's side, I should now have all I need.
My next step will be wrapping it all together in a GitLab runner.