Latest posts for tag systemd

This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.

We've started implementing reloading of the media player when media on disk changes.

One challenge is that LibreOffice doesn't always stop when asked. Try this, and you will see that the presentation keeps going:

$ loimpress --nodefault --norestore --nologo --nolockcheck --show example.odp
$ pkill -TERM loimpress

It turns out that loimpress forks various processes. After killing it, these processes will still be running:

/usr/lib/libreoffice/program/oosplash --impress --nodefault --norestore --nologo --nolockcheck --show talk.odp
/usr/lib/libreoffice/program/soffice.bin --impress --nodefault --norestore --nologo --nolockcheck --show talk.odp

Is there a way to run the media players in such a way that, if needed, they can easily be killed, together with any other process they might have spawned meanwhile?

systemd-run

Yes there is: systemd provides a systemd-run command to run simple commands under systemd's supervision:

$ systemd-run --scope --slice=player --user \
      loimpress --nodefault --norestore --nologo --nolockcheck --show media/talk.odp

This will run the player contained in a cgroup with a custom name, and we can simply use that name to stop all the things:

$ systemctl --user stop player.slice

Resulting python code

The result is this patch which simplifies the code, and isolates and easily kills all subprocesses run as players.
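The patch is linked rather than quoted here. As a rough sketch of the idea only (not the actual Himblick code: the function names and the default slice name are assumptions made for illustration), a player command line can be wrapped in systemd-run, and the whole slice stopped by name:

```python
import subprocess
from typing import List, Sequence


def player_command(cmdline: Sequence[str], slice_name: str = "player") -> List[str]:
    """Wrap a player command line so that it runs in a transient scope
    inside a named slice of the user's systemd instance."""
    return ["systemd-run", "--scope", f"--slice={slice_name}", "--user", *cmdline]


def stop_command(slice_name: str = "player") -> List[str]:
    """Command that stops the slice, and with it every process it contains."""
    return ["systemctl", "--user", "stop", f"{slice_name}.slice"]


def run_player(cmdline: Sequence[str], slice_name: str = "player") -> subprocess.Popen:
    """Start the player under systemd-run (hypothetical helper)."""
    return subprocess.Popen(player_command(cmdline, slice_name))


def stop_players(slice_name: str = "player") -> None:
    """Stop everything running in the slice, including forked children."""
    subprocess.run(stop_command(slice_name), check=False)
```

Keeping the command construction separate from its execution makes it possible to check the generated command lines without a running systemd user session.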

This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.

Another nice-to-have in a system like Himblick is a root filesystem mounted read-only, with a volatile tmpfs overlay on top. This should guarantee a clean boot without leftovers from a previous run, especially in a system where the most likely mode of shutdown is pulling the plug.

This is no guarantee against SD card issues developing over time in such a scenario, but it should at least cover the software side of things.

This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.

Time to set up ssh. We want admin access to the pi user, and broader access to a different, locked-down user, used to manage media on the boxes via sftp.

This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.

A RaspberryPi boots using a little FAT partition which contains kernel, device tree, configuration, and everything else necessary to boot.

It has the convenience of letting us plug the SD card into pretty much any system and tweak the knobs that are exposed through it.

While we don't expect that people would want to modify the config.txt that controls the boot process, we would like to give people a convenient way to set up things like host name (which makes the device findable on the net), timezone, screen orientation, and wifi passwords.

This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.

In modern times, there are provisioning tools that do useful things and allow storing an entire system configuration in text files committed to git. They are good at setting up a system reproducibly, and they make it possible to inspect its contents by looking at the provisioning configuration instead of wading into the system itself.

I normally use Ansible. It does have a chroot connector, but it has some serious limitations.

The biggest issue is that ansible's chroot connector does not mount /dev, /proc and so on, which greatly limits what can be run inside it. Specifically, installing many .deb packages will fail.

We work around it by copying what Ansible needs inside the chroot (including Ansible itself), and then running it under systemd-nspawn using the local connector.

This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.

Raspbian is designed to be an interactive system, but we want to build a noninteractive black box out of it, which should never ever get a keyboard plugged into it. See the "Museum ceiling" use case.

Ideally we should use a plain Debian as a base, but the Raspberry Pi model 4 is not supported yet for that.

Instead, we start from Raspbian Lite, and remove the bits that get in the way.

Review of raspbian's customizations

Here is a review of the Raspbian customizations that we've found, and how we chose to keep them or remove them.

raspberrypi-bootloader, raspberrypi-kernel

It's the code in /boot, and I guess also how it can get updated: keep.

raspbian-archive-keyring

This makes it possible to use the raspbian apt repositories: keep.

raspberrypi-net-mods

Source: https://github.com/RPi-Distro/raspberrypi-net-mods

It's the part that copies /boot/wpa_supplicant.conf to /etc/wpa_supplicant and does other system tweaks.

This we need to remove, to do our own customization.

raspberrypi-sys-mods

Source: https://github.com/RPi-Distro/raspberrypi-sys-mods

It contains a lot of hardware-specific setups and udev rules that should probably be kept.

It also contains the sudo rule that allows pi to sudo without password.

It does have a number of services that we need to disable:

What is the purpose of rpi-display-backlight.service? I could not find an explanation on why it is needed in the file or in the git logs.

raspi-config

Source: https://github.com/RPi-Distro/raspi-config

It's the core of Raspbian's interactive configuration, which we need to remove, to avoid interactive prompts, and replace with doing the configuration we need at rootfs setup time.

It's still useful as a reference on what is the standard way in Raspbian to do things like changing keyboard and timezone, or setting up graphical autologin.

Removing this leaves various things to be done:

  • configuring keyboard and timezone
  • setting a CPU scaling governor at boot (cpufrequtils seems to do it automatically)
  • sed -i 's| init=/usr/lib/raspi-config/init_resize\.sh||' /boot/cmdline.txt, or boot will fail!

The last one is important: on first boot, Raspbian won't boot the standard system, but run a script to resize the root partition, remove itself from the kernel command line, and reboot into the system proper.

We took care of partitioning ourselves and we do not need this: it would actually fail, leaving the boot stuck at an interactive prompt, since it does not expect to find our media partition after the rootfs.

raspi-copies-and-fills

Partial source: https://github.com/bavison/arm-mem (the .deb packaging is missing).

This installs a library, preloaded via /etc/ld.so.preload, with hardware-accelerated replacements for common functions.

Since Raspbian is supposed to run unmodified on all RaspberryPi hardware, the base libc is not optimized, and preloads are applied according to platform.

The package installs a /etc/ld.so.preload configuration which contains:

/usr/lib/arm-linux-gnueabihf/libarmmem-${PLATFORM}.so

In my case, ${PLATFORM} is not getting replaced inside the chroot environment, giving slower execution and filling the console with linker warnings.

Since we know we're running on the RaspberryPi 4, we can replace ${PLATFORM} with aarch64 in the rootfs setup.

triggerhappy

It does no harm, but it's a running service that we don't need yet, so it makes sense to remove it.

dhcpcd5

dhcpcd5 is a network configurator.

We would rather use systemd-networkd, which is somewhat more standard and should play well with a read-only root filesystem.

Replace Raspbian's customizations

For the boot partition:

    def cleanup_raspbian_boot(self):
        """
        Remove the interactive raspbian customizations from the boot partition
        """
        # Remove ' init=/usr/lib/raspi-config/init_resize.sh' from cmdline.txt
        # This is present by default in raspbian to perform partition
        # resize on the first boot, and it removes itself and reboots after
        # running. We do not need it, as we do our own partition resizing.
        # Also, we can't keep it, since we remove raspi-config and the
        # init_resize.sh script would break without it
        self.file_contents_replace(
                relpath="cmdline.txt",
                search=" init=/usr/lib/raspi-config/init_resize.sh",
                replace="")
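The file_contents_replace helper is not shown in this post. A minimal standalone sketch of what such a helper might do, assuming a plain substring replacement on a text file (the signature here takes a full path and is an assumption, not Himblick's actual method):

```python
def file_contents_replace(path: str, search: str, replace: str) -> bool:
    """Replace every occurrence of `search` in the file at `path`.

    Return True if the file was changed, False if `search` was not found.
    """
    with open(path, "rt") as fd:
        original = fd.read()

    updated = original.replace(search, replace)
    if updated == original:
        # Nothing to do: leave the file untouched
        return False

    with open(path, "wt") as fd:
        fd.write(updated)
    return True
```

Returning whether anything changed makes the operation safe to repeat: running it a second time on an already cleaned-up cmdline.txt is a no-op.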

For the rootfs:

    def cleanup_raspbian_rootfs(self):
        """
        Remove the interactive raspbian customizations from the rootfs
        partition
        """
        # To support multiple arm systems, ld.so.preload tends to contain something like:
        # /usr/lib/arm-linux-gnueabihf/libarmmem-${PLATFORM}.so
        # I'm not sure where that ${PLATFORM} would be expanded, but it
        # does not happen in a chroot/nspawn. Since we know we're working
        # on the 4B, we can expand it ourselves.
        self.file_contents_replace(
                relpath="/etc/ld.so.preload",
                search="${PLATFORM}",
                replace="aarch64")

        # Deinstall unneeded Raspbian packages
        self.dpkg_purge(["raspberrypi-net-mods", "raspi-config", "triggerhappy", "dhcpcd5", "ifupdown"])

        # Disable services we do not need
        self.systemctl_disable("apply_noobs_os_config")
        self.systemctl_disable("regenerate_ssh_host_keys")
        self.systemctl_disable("sshswitch")

        # Enable systemd-networkd and systemd-resolved
        self.systemctl_disable("wpa_supplicant")
        self.systemctl_enable("wpa_supplicant@wlan0")
        self.systemctl_enable("systemd-networkd")
        self.write_symlink("/etc/resolv.conf", "/run/systemd/resolve/stub-resolv.conf")
        self.systemctl_enable("systemd-resolved")
        self.write_file("/etc/systemd/network/wlan0.network", """[Match]
Name=wlan0

[Network]
DHCP=ipv4

[DHCP]
RouteMetric=20
""")
        self.write_file("/etc/systemd/network/eth0.network", """[Match]
Name=eth0

[Network]
DHCP=yes

[DHCP]
RouteMetric=10
""")

After this point, /etc/resolv.conf in the chroot will be a broken symlink unless systemd-resolved is running. To continue working in the chroot and have internet access, we can temporarily replace it with the host's resolv.conf:

    @contextmanager
    def working_resolvconf(self, relpath: str):
        """
        Temporarily replace /etc/resolv.conf in the chroot with the current
        system one
        """
        abspath = self.abspath(relpath)
        if os.path.lexists(abspath):
            fd, tmppath = tempfile.mkstemp(dir=os.path.dirname(abspath))
            os.close(fd)
            os.rename(abspath, tmppath)
            shutil.copy("/etc/resolv.conf", os.path.join(self.root, "etc/resolv.conf"))
        else:
            tmppath = None
        try:
            yield
        finally:
            if os.path.lexists(abspath):
                os.unlink(abspath)
            if tmppath is not None:
                os.rename(tmppath, abspath)

This leaves keyboard, timezone, wifi, ssh, and autologin still to be configured. We'll do that in the next step.

This is part of a series of posts on the design and technical steps of creating Himblick, a digital signage box based on the Raspberry Pi 4.

Now that we have taken care of updating the SD image partitioning, we are free to customize the system.

Normally one would mount the system partition, chroot into it, and run scripts, install packages, run ansible playbooks. But this is an arm system, and the development systems are amd64, which won't run arm binaries. Or won't they?

It turns out that in 2019, they do:

# apt install qemu binfmt-support qemu-user-static
# chroot /media/root/rootfs
/# /usr/bin/file /usr/bin/file
/usr/bin/file: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, …
/# echo foo > /dev/null
bash: /dev/null: Permission denied

Chroot doesn't mount /dev, /sys, /proc, and the rest of the filesystems needed to run many commands. Now that we live in the future, there's a straightforward way to get that done, too:

# apt install systemd-container
# systemd-nspawn -D /media/root/rootfs
Spawning container rootfs on /media/enrico/rootfs.
Press ^] three times within 1s to kill container.
~# /usr/bin/file /usr/bin/file
/usr/bin/file: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, …
~# echo foo > /dev/null
~# cat /dev/null
~#

So, to recap:

  1. apt install qemu binfmt-support qemu-user-static systemd-container
  2. use systemd-nspawn -D instead of chroot

and you can chroot into a mostly set-up filesystem for any architecture supported by QEMU.

Joy!
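When scripting this, the same invocation can be built from Python. This is only a sketch: the function names are made up for illustration, and it assumes systemd-nspawn is installed, the script runs as root, and qemu-user-static with binfmt-support is set up as described above.

```python
import subprocess
from typing import List, Sequence


def nspawn_command(rootfs: str, cmd: Sequence[str]) -> List[str]:
    """Build a systemd-nspawn command line that runs `cmd` inside `rootfs`."""
    return ["systemd-nspawn", "-D", rootfs, *cmd]


def run_in_rootfs(rootfs: str, cmd: Sequence[str]) -> subprocess.CompletedProcess:
    """Run `cmd` inside the rootfs, raising CalledProcessError if it fails."""
    return subprocess.run(nspawn_command(rootfs, cmd), check=True)
```

For example, run_in_rootfs("/media/root/rootfs", ["/usr/bin/file", "/usr/bin/file"]) would repeat the check shown in the transcript above.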

Update:

I have been playing with system images using ansible and chroots, and I figured that using systemd-nspawn to handle the chroots would make things nice, giving ansible commands the benefit of a running system.

There has been an attempt which was rejected.

Here is my attempt. It does boot the machine then run commands inside it, and it works nicely. The only thing I missed is a way of shutting down the machine at the end, since ansible seems to call close() at the end of each command, and I do not know enough ansible internals to do this right.

I hope this can serve as inspiration for something that works well.

# Based on chroot.py (c) 2013, Maykel Moya <mmoya@speedyrails.com>
# Based on chroot.py (c) 2015, Toshio Kuratomi <tkuratomi@ansible.com>
# (c) 2018, Enrico Zini <enrico@debian.org>
#
# This is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# Ansible is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with Ansible.  If not, see <http://www.gnu.org/licenses/>.
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type

import distutils.spawn
import os
import os.path
import pipes
import subprocess
import time
import hashlib

from ansible import constants as C
from ansible.errors import AnsibleError
from ansible.plugins.connection import ConnectionBase, BUFSIZE
from ansible.module_utils.basic import is_executable

try:
    from __main__ import display
except ImportError:
    from ansible.utils.display import Display
    display = Display()


class Connection(ConnectionBase):
    ''' Local systemd-nspawn based connections '''

    transport = 'schroot'
    has_pipelining = True
    # su currently has an undiagnosed issue with calculating the file
    # checksums (so copy, for instance, doesn't work right)
    # Have to look into that before re-enabling this
    become_methods = frozenset(C.BECOME_METHODS).difference(('su',))

    def __init__(self, play_context, new_stdin, *args, **kwargs):
        super(Connection, self).__init__(play_context, new_stdin, *args, **kwargs)

        self.chroot = self._play_context.remote_addr
        # We need short and fast rather than secure
        m = hashlib.sha1()
        m.update(os.path.abspath(self.chroot))
        self.machine_name = "ansible-" + m.hexdigest()

        if os.geteuid() != 0:
            raise AnsibleError("nspawn connection requires running as root")

        # we're running as root on the local system so do some
        # trivial checks for ensuring 'host' is actually a chroot'able dir
        if not os.path.isdir(self.chroot):
            raise AnsibleError("%s is not a directory" % self.chroot)

        chrootsh = os.path.join(self.chroot, 'bin/sh')
        # Want to check for a usable bourne shell inside the chroot.
        # is_executable() == True is sufficient.  For symlinks it
        # gets really complicated really fast.  So we punt on finding that
        # out.  As long as it's a symlink we assume that it will work
        if not (is_executable(chrootsh) or (os.path.lexists(chrootsh) and os.path.islink(chrootsh))):
            raise AnsibleError("%s does not look like a chrootable dir (/bin/sh missing)" % self.chroot)

        self.nspawn_cmd = distutils.spawn.find_executable('systemd-nspawn')
        if not self.nspawn_cmd:
            raise AnsibleError("systemd-nspawn command not found in PATH")
        self.machinectl_cmd = distutils.spawn.find_executable('machinectl')
        if not self.machinectl_cmd:
            raise AnsibleError("machinectl command not found in PATH")
        self.run_cmd = distutils.spawn.find_executable('systemd-run')
        if not self.run_cmd:
            raise AnsibleError("systemd-run command not found in PATH")

        existing = subprocess.call([self.machinectl_cmd, "show", self.machine_name], stdout=open("/dev/null", "wb"))
        self.machine_exists = existing == 0

    def set_host_overrides(self, host, hostvars=None):
        super(Connection, self).set_host_overrides(host, hostvars)

    def _connect(self):
        ''' connect to the chroot, booting the nspawn machine if it is not already running '''
        super(Connection, self)._connect()
        if not self._connected:
            if not self.machine_exists:
                display.vvv("Starting nspawn machine", host=self.chroot)
                self.chroot_proc = subprocess.Popen([self.nspawn_cmd, "-D", self.chroot, "-M", self.machine_name, "--register=yes", "--boot"], stdout=open("/dev/null", "w"))
                time.sleep(0.5)
            else:
                self.chroot_proc = None
                display.vvv("Reusing nspawn machine", host=self.chroot)
            self._connected = True

    def _local_run_cmd(self, cmd, stdin=None):
        display.vvv(" -exec %s" % repr(cmd), host=self.chroot)
        display.vvv(" -  or %s" % " ".join(pipes.quote(x) for x in cmd), host=self.chroot)
        p = subprocess.Popen(cmd, shell=False, stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stdout, stderr = p.communicate(stdin)
        display.vvv(" - got %d" % p.returncode, host=self.chroot)
        display.vvv(" - out %s" % repr(stdout), host=self.chroot)
        display.vvv(" - err %s" % repr(stderr), host=self.chroot)
        return p.returncode, stdout, stderr

    def _systemd_run_cmd(self, cmd, stdin=None):
        local_cmd = [self.run_cmd, "-M", self.machine_name, "-q", "--pipe", "--wait", "-E", "HOME=/root", "-E", "USER=root", "-E", "LOGNAME=root"] + cmd
        local_cmd = [x.encode("utf8") if isinstance(x, unicode) else x for x in local_cmd]
        return self._local_run_cmd(local_cmd, stdin=stdin)

    def exec_command(self, cmd, in_data=None, sudoable=False):
        ''' run a command on the chroot '''
        super(Connection, self).exec_command(cmd, in_data=in_data, sudoable=sudoable)

        display.vvv("cmd: %s" % repr(cmd), host=self.chroot)
        return self._systemd_run_cmd(["/bin/sh", "-c", cmd], stdin=in_data)

    def _prefix_login_path(self, remote_path):
        ''' Make sure that we put files into a standard path

            If a path is relative, then we need to choose where to put it.
            ssh chooses $HOME but we aren't guaranteed that a home dir will
            exist in any given chroot.  So for now we're choosing "/" instead.
            This also happens to be the former default.

            Can revisit using $HOME instead if it's a problem
        '''
        if not remote_path.startswith(os.path.sep):
            remote_path = os.path.join(os.path.sep, remote_path)
        return os.path.normpath(remote_path)

    def put_file(self, in_path, out_path):
        ''' transfer a file from local to chroot '''
        super(Connection, self).put_file(in_path, out_path)
        display.vvv("PUT %s TO %s" % (in_path, out_path), host=self.chroot)

        out_path = pipes.quote(self._prefix_login_path(out_path))
        p = subprocess.Popen([self.machinectl_cmd, "-q", "copy-to", self.machine_name, in_path, out_path], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stdout, stderr = p.communicate()
        if p.returncode != 0:
            raise AnsibleError("failed to transfer file %s to %s:\n%s\n%s" % (in_path, out_path, stdout, stderr))

    def fetch_file(self, in_path, out_path):
        ''' fetch a file from chroot to local '''
        super(Connection, self).fetch_file(in_path, out_path)
        display.vvv("FETCH %s TO %s" % (in_path, out_path), host=self.chroot)

        in_path = pipes.quote(self._prefix_login_path(in_path))
        p = subprocess.Popen([self.machinectl_cmd, "-q", "copy-from", self.machine_name, in_path, out_path], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        stdout, stderr = p.communicate()
        if p.returncode != 0:
            raise AnsibleError("failed to transfer file %s from %s:\n%s\n%s" % (out_path, in_path, stdout, stderr))

    def close(self):
        super(Connection, self).close()

# FIXME: how can we power off the machine? close and __del__ seem to be called after each command
#    def __del__(self):
#        ''' terminate the connection; nothing to do here '''
#        # super(Connection, self).close()
#        display.vvv("CLOSE", host=self.chroot)
#        if self._connected:
#            p, stdout, stderr = self._local_run_cmd([self.machinectl_cmd, "poweroff", self.machine_name])
#            if p == 0 and self.chroot_proc:
#                self.chroot_proc.wait()
#            self._connected = False

At SnowCamp I migrated Front Desk-related repositories to Salsa gitlab and worked on setting up Continuous Integration for the web applications I maintain in Debian.

The result is a reusable Django app that integrates with gitlab's webhooks.

It is currently working for https://contributors.debian.org and I'll soon reuse it for https://nm.debian.org and https://debtags.debian.org.

The only setup needed on the DSA side is to enable systemd linger on the deploy user.

The CI/deploy workflow is this:

  • gitlab runs tests in the CI
  • gitlab notifies pipeline status changes via a webhook
  • when a selected pipeline changes status to success, the application queues a deploy for that shasum by creating a shasum.deploy file in a queue directory
  • a systemd .path unit running as the deploy user triggers when the new file is created and runs manage.py deploy as the deploy user

And manage.py deploy does this:

  • git fetch
  • abort if the shasum of the head of the deploy branch does not match one of the .deploy files in the queue directory
  • abort if the head of the deploy branch is not signed by a gpg key present in a deploy keyring
  • abort if the head of the deploy branch is not a successor of the currently deployed commit
  • update the working copy
  • run a deploy script
  • remove all .deploy files seen when the script was called
  • send an email to the site admins with a log of the whole deploy process, whether it succeeded or it was aborted
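The queue-directory check at the heart of this workflow can be sketched as follows. This is not the app's actual code (see its README.md for that): the function names are assumptions, and the sketch only covers the shasum.deploy bookkeeping, not the git, gpg, or email steps.

```python
import os
from typing import List


def queue_deploy(queue_dir: str, shasum: str) -> str:
    """Queue a deploy by creating an empty <shasum>.deploy file.

    Creating this file is what a systemd .path unit would react to.
    """
    path = os.path.join(queue_dir, shasum + ".deploy")
    with open(path, "wt"):
        pass
    return path


def queued_shasums(queue_dir: str) -> List[str]:
    """List the shasums that currently have a .deploy file in the queue."""
    suffix = ".deploy"
    return sorted(
        name[:-len(suffix)]
        for name in os.listdir(queue_dir)
        if name.endswith(suffix)
    )


def can_deploy(queue_dir: str, head_shasum: str) -> bool:
    """Return True only if the head of the deploy branch was queued."""
    return head_shasum in queued_shasums(queue_dir)
```

Using plain files as the queue keeps the web application and the deploy process decoupled: the former only needs write access to one directory, and the latter runs as a different user.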

For more details, see the app's README.md

I find it wonderful that we got to a stage where we can have this in Debian, and I am very grateful for all the work that has been done and is being done in setting up and maintaining Salsa.

These are the notes of a training course on systemd I gave as part of my work with Truelite.

.socket units

Socket units tell systemd to listen on a given IPC, network socket, or file system FIFO, and use another unit to service requests to it.

For example, this creates a network service that listens on port 55555:

# /etc/systemd/system/ddate.socket
[Unit]
Description=ddate service on port 55555

[Socket]
ListenStream=55555
Accept=true

[Install]
WantedBy=sockets.target
# /etc/systemd/system/ddate@.service
[Unit]
Description=Run ddate as a network service

[Service]
Type=simple
ExecStart=/bin/sh -ec 'while true; do /usr/bin/ddate; sleep 1m; done'
StandardOutput=socket
StandardError=journal

Note that the .service file is called ddate@ instead of ddate: units whose name ends in '@' are template units which can be activated multiple times, by adding any string after the '@' in the unit name.

If I run nc localhost 55555 a couple of times, and then check the list of running units, I see ddate@… instantiated twice, adding the local and remote socket endpoints to the unit name:

$ systemctl list-units 'ddate@*'
  UNIT                                             LOAD   ACTIVE SUB     DESCRIPTION
  ddate@15-127.0.0.1:55555-127.0.0.1:36936.service loaded active running Run ddate as a network service (127.0.0.1:36936)
  ddate@16-127.0.0.1:55555-127.0.0.1:37002.service loaded active running Run ddate as a network service (127.0.0.1:37002)

This allows me to monitor each running service individually.
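To make the naming scheme concrete, here is a small sketch that splits a unit name into template name, instance, and unit type. The function name is made up for illustration, and the sketch does not undo systemd's escaping of special characters in instance names.

```python
from typing import Optional, Tuple


def split_unit_name(unit: str) -> Tuple[str, Optional[str], str]:
    """Split 'name@instance.type' into (name, instance, type).

    The instance is None for units that are not templates or instances,
    and an empty string for the template unit itself (e.g. 'ddate@.service').
    """
    # The unit type is whatever follows the last dot
    stem, _, unit_type = unit.rpartition(".")
    # The instance is whatever follows the first '@'
    name, sep, instance = stem.partition("@")
    return name, (instance if sep else None), unit_type
```

Applied to the first unit in the listing above, this gives the template name ddate, the instance 15-127.0.0.1:55555-127.0.0.1:36936, and the type service.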

systemd also automatically creates a slice unit called system-ddate.slice grouping all services together:

$ systemctl status system-ddate.slice
 system-ddate.slice
   Loaded: loaded
   Active: active since Thu 2017-09-21 14:25:02 CEST; 9min ago
    Tasks: 4
   CGroup: /system.slice/system-ddate.slice
           ├─ddate@15-127.0.0.1:55555-127.0.0.1:36936.service
           │ ├─18214 /bin/sh -ec while true; do /usr/bin/ddate; sleep 1m; done
           │ └─18661 sleep 1m
           └─ddate@16-127.0.0.1:55555-127.0.0.1:37002.service
             ├─18228 /bin/sh -ec while true; do /usr/bin/ddate; sleep 1m; done
             └─18670 sleep 1m

This also makes it possible to work with all running services for this template unit as a whole, sending a signal to all their processes and setting up resource control features for the group.

See: