Systemd containers with unittest

This is part of a series of posts on ideas for an ansible-like provisioning system, implemented in Transilience.

Unit testing some parts of Transilience, like the apt and systemd actions, or remote Mitogen connections, can really use a containerized system for testing.

To have that, I reused my work on nspawn-runner. to build a simple and very fast system of ephemeral containers, with minimal dependencies, based on systemd-nspawn and btrfs snapshots:

Setup

To be able to use systemd-nspawn --ephemeral, the chroots needs to be btrfs subvolumes. If you are not running on a btrfs filesystem, you can create one to run the tests, even on a file:

fallocate -l 1.5G testfile
/usr/sbin/mkfs.btrfs testfile
sudo mount -o loop testfile test_chroots/

I created a script to setup the test environment, here is an extract:

mkdir -p test_chroots

cat << EOF > "test_chroots/CACHEDIR.TAG"
Signature: 8a477f597d28d172789f06886806bc55
# chroots used for testing transilience, can be regenerated with make-test-chroot
EOF

btrfs subvolume create test_chroots/buster
eatmydata debootstrap --variant=minbase --include=python3,dbus,systemd buster test_chroots/buster

CACHEDIR.TAG is a nice trick to tell backup software not to bother backing up the contents of this directory, since it can be easily regenerated.

eatmydata is optional, and it speeds up debootstrap quite a bit.

Running unittest with sudo

Here's a simple helper to drop root as soon as possible, and regain it only when needed. Note that it needs $SUDO_UID and $SUDO_GID, that are set by sudo, to know which user to drop into:

class ProcessPrivs:
    """
    Drop root privileges and regain them only when needed
    """
    def __init__(self):
        self.orig_uid, self.orig_euid, self.orig_suid = os.getresuid()
        self.orig_gid, self.orig_egid, self.orig_sgid = os.getresgid()

        if "SUDO_UID" not in os.environ:
            raise RuntimeError("Tests need to be run under sudo")

        self.user_uid = int(os.environ["SUDO_UID"])
        self.user_gid = int(os.environ["SUDO_GID"])

        self.dropped = False

    def drop(self):
        """
        Drop root privileges
        """
        if self.dropped:
            return
        os.setresgid(self.user_gid, self.user_gid, 0)
        os.setresuid(self.user_uid, self.user_uid, 0)
        self.dropped = True

    def regain(self):
        """
        Regain root privileges
        """
        if not self.dropped:
            return
        os.setresuid(self.orig_suid, self.orig_suid, self.user_uid)
        os.setresgid(self.orig_sgid, self.orig_sgid, self.user_gid)
        self.dropped = False

    @contextlib.contextmanager
    def root(self):
        """
        Regain root privileges for the duration of this context manager
        """
        if not self.dropped:
            yield
        else:
            self.regain()
            try:
                yield
            finally:
                self.drop()

    @contextlib.contextmanager
    def user(self):
        """
        Drop root privileges for the duration of this context manager
        """
        if self.dropped:
            yield
        else:
            self.drop()
            try:
                yield
            finally:
                self.regain()


privs = ProcessPrivs()
privs.drop()

As soon as this module is loaded, root privileges are dropped, and can be regained for as little as possible using a handy context manager:

   with privs.root():
       subprocess.run(["systemd-run", ...], check=True, capture_output=True)

Using the chroot from test cases

The infrastructure to setup and spin down ephemeral machine is relatively simple, once one has worked out the nspawn incantations:

class Chroot:
    """
    Manage an ephemeral chroot
    """
    running_chroots: Dict[str, "Chroot"] = {}

    def __init__(self, name: str, chroot_dir: Optional[str] = None):
        self.name = name
        if chroot_dir is None:
            self.chroot_dir = self.get_chroot_dir(name)
        else:
            self.chroot_dir = chroot_dir
        self.machine_name = f"transilience-{uuid.uuid4()}"

    def start(self):
        """
        Start nspawn on this given chroot.

        The systemd-nspawn command is run contained into its own unit using
        systemd-run
        """
        unit_config = [
            'KillMode=mixed',
            'Type=notify',
            'RestartForceExitStatus=133',
            'SuccessExitStatus=133',
            'Slice=machine.slice',
            'Delegate=yes',
            'TasksMax=16384',
            'WatchdogSec=3min',
        ]

        cmd = ["systemd-run"]
        for c in unit_config:
            cmd.append(f"--property={c}")

        cmd.extend((
            "systemd-nspawn",
            "--quiet",
            "--ephemeral",
            f"--directory={self.chroot_dir}",
            f"--machine={self.machine_name}",
            "--boot",
            "--notify-ready=yes"))

        log.info("%s: starting machine using image %s", self.machine_name, self.chroot_dir)

        log.debug("%s: running %s", self.machine_name, " ".join(shlex.quote(c) for c in cmd))
        with privs.root():
            subprocess.run(cmd, check=True, capture_output=True)
        log.debug("%s: started", self.machine_name)
        self.running_chroots[self.machine_name] = self

    def stop(self):
        """
        Stop the running ephemeral containers
        """
        cmd = ["machinectl", "terminate", self.machine_name]
        log.debug("%s: running %s", self.machine_name, " ".join(shlex.quote(c) for c in cmd))
        with privs.root():
            subprocess.run(cmd, check=True, capture_output=True)
        log.debug("%s: stopped", self.machine_name)
        del self.running_chroots[self.machine_name]

    @classmethod
    def create(cls, chroot_name: str) -> "Chroot":
        """
        Start an ephemeral machine from the given master chroot
        """
        res = cls(chroot_name)
        res.start()
        return res

    @classmethod
    def get_chroot_dir(cls, chroot_name: str):
        """
        Locate a master chroot under test_chroots/
        """
        chroot_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "test_chroots", chroot_name))
        if not os.path.isdir(chroot_dir):
            raise RuntimeError(f"{chroot_dir} does not exists or is not a chroot directory")
        return chroot_dir


# We need to use atextit, because unittest won't run
# tearDown/tearDownClass/tearDownModule methods in case of KeyboardInterrupt
# and we need to make sure to terminate the nspawn containers at exit
@atexit.register
def cleanup():
    # Use a list to prevent changing running_chroots during iteration
    for chroot in list(Chroot.running_chroots.values()):
        chroot.stop()

And here's a TestCase mixin that starts a containerized systems and opens a Mitogen connection to it:

class ChrootTestMixin:
    """
    Mixin to run tests over a setns connection to an ephemeral systemd-nspawn
    container running one of the test chroots
    """
    chroot_name = "buster"

    @classmethod
    def setUpClass(cls):
        super().setUpClass()
        import mitogen
        from transilience.system import Mitogen
        cls.broker = mitogen.master.Broker()
        cls.router = mitogen.master.Router(cls.broker)
        cls.chroot = Chroot.create(cls.chroot_name)
        with privs.root():
            cls.system = Mitogen(
                    cls.chroot.name, "setns", kind="machinectl",
                    python_path="/usr/bin/python3",
                    container=cls.chroot.machine_name, router=cls.router)

    @classmethod
    def tearDownClass(cls):
        super().tearDownClass()
        cls.system.close()
        cls.broker.shutdown()
        cls.chroot.stop()

Running tests

Once the tests are set up, everything goes on as normal, except one needs to run nose2 with sudo:

sudo nose2-3

Spin up time for containers is pretty fast, and the tests drop root as soon as possible, and only regain it for as little as needed.

Also, dependencies for all this are minimal and available on most systems, and the setup instructions seem pretty straightforward