Enrico's blog

argparse subcommands are great, but they have a quirk: an option is only accepted at the level of the parser that defines it.

So, if you add the --verbose / -v argument to your main parser and you have subcommands, you need to give the -v option before the subcommand name. For example, given this script:

#!/usr/bin/python3
import argparse

parser = argparse.ArgumentParser(description="test")
parser.add_argument("-v", "--verbose", action="store_true")
subparsers = parser.add_subparsers(dest="handler", required=True)
subparsers.add_parser("test")

args = parser.parse_args()
print(args.verbose)

You get this behaviour:

$ ./mycmd test
False
$ ./mycmd -v test
True
$ ./mycmd test -v
usage: mycmd [-h] [-v] {test} ...
mycmd: error: unrecognized arguments: -v

This sometimes makes sense, but many other times it's really annoying, since the user has to remember at which level each option was defined.

Last night some pieces clicked in my head, and I created a not-too-dirty ArgumentParser subclass that adds a shared option to arguments, propagating them to subparsers:

#!/usr/bin/python3
from hacks import argparse

parser = argparse.ArgumentParser(description="test")
parser.add_argument("-v", "--verbose", action="store_true", shared=True)
subparsers = parser.add_subparsers(dest="handler", required=True)
subparsers.add_parser("test")

args = parser.parse_args()
print(args.verbose)

And finally, -v can be given at all levels:

$ ./mycmd test
False
$ ./mycmd -v test
True
$ ./mycmd test -v
True

It even works recursively, forwarding arguments to sub-subparsers, but at the moment it can only do so for store_true and store kinds of arguments.
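
The hacks module itself is not shown here; as a rough idea of the approach, this is a minimal sketch of how such a subclass could work (my own reconstruction, not necessarily what hacks.argparse does), handling only store_true and store actions:

#!/usr/bin/python3
# Rough sketch of the idea, not the actual `hacks` module: remember arguments
# marked shared=True and re-add them to every subparser, with default=SUPPRESS
# so a subparser does not overwrite a value already parsed by its parent.
import argparse


class SharedArgumentParser(argparse.ArgumentParser):
    def __init__(self, *args, **kw):
        super().__init__(*args, **kw)
        self._shared_args = []

    def add_argument(self, *args, shared=False, **kw):
        if shared:
            # Remember the definition so subparsers can replicate it
            self._shared_args.append((args, kw))
        return super().add_argument(*args, **kw)

    def add_subparsers(self, **kw):
        kw.setdefault("parser_class", SharedArgumentParser)
        subparsers = super().add_subparsers(**kw)
        shared_args = self._shared_args
        orig_add_parser = subparsers.add_parser

        def add_parser(name, **kwargs):
            parser = orig_add_parser(name, **kwargs)
            for p_args, p_kw in shared_args:
                p_kw = dict(p_kw, default=argparse.SUPPRESS)
                parser.add_argument(*p_args, **p_kw)
                # Keep propagating to sub-subparsers
                parser._shared_args.append((p_args, p_kw))
            return parser

        subparsers.add_parser = add_parser
        return subparsers

With default=argparse.SUPPRESS on the copies, a subparser only touches the attribute when the option is actually given after the subcommand name, so a value parsed before the subcommand is not clobbered by the subparser's defaults.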

Further reading

Talk notes

Intro

  • I'm not speaking for the whole of DAM
  • Motivation in part is personal frustration, and need to set boundaries and negotiate expectations

Debian Account Managers

  • history

Responsibility for official membership

  • approve account creation
  • manage the New Member Process and nm.debian.org
  • close MIA accounts
  • occasional emergency termination of accounts
  • handle Emeritus
  • with lots of help from FrontDesk and MIA teams (big shoutout)

What DAM is not

  • we are not mediators
  • we are not a community management team
  • a list or IRC moderation team
  • we are not responsible for vision or strategic choices about how people are expected to interact in Debian
  • we shouldn't try to solve things just because they need solving

Unexpected responsibilities

  • Over time, the community has grown larger and more complex, in a larger and more complex online environment
  • Enforcing the Diversity Statement and the Code of Conduct
  • Emergency list moderation
    • we have ended up using DAM warnings to compensate for the lack of list moderation, at least twice
  • contributors.debian.org (mostly only because of me, but it would be good to have its own team)

DAM warnings

  • except for rare glaring cases, patterns of behaviour / intentions / taking feedback in, are more relevant than individual incidents
  • we do not set out to fix people. It is enough for us to get people to acknowledge a problem
    • if they can't acknowledge a problem they're probably out
    • once a problem is acknowledged, fixing it could be their implementation detail
    • then again it's not that easy to get a number of troublesome people to acknowledge problems, so we go back to the problem of deciding when enough is enough

DAM warnings?

  • I got to a point where I look at DAM warnings as potential signals that DAM has ended up with the ball that everyone else in Debian dropped.
  • A DAM warning means we haven't gotten to a last-resort situation yet, which means it probably shouldn't be DAM dealing with this at this point
  • Everyone in the project can write to a person "do you realise there's an issue here? Can you do something to stop?", and give them a chance to reflect on issues or ignore them, and build their reputation accordingly.
  • People in Debian should not have to endure, completely powerless, as trolls drag painful list discussions indefinitely until all the trolled people run out of energy and leave. At the same time, people who abuse a list should expect to be suspended or banned from the list, not have their Debian membership put into question (unless it is a recurring pattern of behaviour).
  • The push to grow DAM warnings as a tool is a sign of the rest of Debian passing on their responsibilities, and DAM picking them up.
  • Then in DAM we end up passing on things, too, because we also don't have the energy to face another intensive megametathread, and as we take actions for things that shouldn't quite be our responsibility, we face a higher level of controversy, and therefore demotivation.
  • Also, as we take actions for things that shouldn't be our responsibility, and work on a higher level of controversy, our legitimacy is undermined (and understandably so)
    • there's a pothole on my street that never gets filled, so at some point I go out and fill it. Then people thank me, people complain I shouldn't have, people complain I didn't fill it right, people appreciate the gesture and invite me to learn how to fix potholes better, people point me to more potholes, and then complain that potholes don't get fixed properly on the whole street. I end up being the problem, instead of whoever had responsibility for the potholes but wasn't fixing them
  • The Community Team, the Diversity Team, and individual developers, have no energy or entitlement for explaining what a healthy community looks like, and DAM is left with that responsibility in the form of accountability for their actions: to issue, say, a DAM warning for bullying, we are expected to explain what is bullying, and how that kind of behaviour constitutes bullying, in a way that is understandable by the whole project.
  • Since there isn't consensus in the project about what bullying looks like, we end up having to define it in a warning, which again is a responsibility we shouldn't have, and we need to do it because we have an escalated situation at hand, but we can't do it right

House rules

Interpreting house rules

  • you can't encode common sense about people's behaviour in written rules: no matter how hard you try, people will find ways to cheat that
  • so one can use rules as a guideline, with someone responsible for the bits that can't go into rules.
    • context matters, privilege/oppression matters, patterns matter, history matters
  • example:
    • call a person out for breaking a rule
    • get DARVO in response
    • state that DARVO is not acceptable
    • get concern trolling against marginalised people and accuse them of DARVO if they complain
  • example: assume good intentions vs enabling
  • example: rule lawyering and Figure skating
  • this cannot be solved by GRs: I/we (DAM)/possibly also we (Debian) don't want to do GRs about evaluating people

Governance by bullying

  • How to DoS discussions in Debian
    • example: gender, minority groups, affirmative action, inclusion, anything about the community team itself, anything about the CoC, systemd, usrmerge, dam warnings, expulsions
      • think of a topic. Think about sending a mail to debian-project about it. If you instinctively shiver at the thought, this is probably happening
      • would you send a mail about that to -project / -devel?
      • can you think of other topics?
    • it is an effective way of governance as it excludes topics from public discussion
  • A small number of people abuse all this, intentionally or not, to effectively manipulate decision making in the project.
  • Instead of using the rules of the community to bring forth the issues one cares about, it costs less energy to make it unthinkable or unbearable to have a discussion on issues one doesn't want to progress. What one can't stop constructively, one can oppose destructively.
  • even regularly diverting the discussion away from the original point or concern is enough to derail it without people realising you're doing it
  • This is an effective strategy for a few reckless people to unilaterally direct change, in the current state of Debian, at the cost of the health and the future of the community as a whole.
  • There are now a number of important issues nobody has the energy to discuss, because experience says that the energy required to bring them to the foreground and deal with the consequences would be disproportionate.
  • This is grave, as we're talking about trolling and bullying as malicious power moves to work around the accepted decision making structures of our community.
  • Solving this is out of scope for this talk, but it is urgent nevertheless, and can't be solved by expecting DAM to fix it

How about the Community Team?

  • It is also a small group of people who cannot pick up the responsibility of doing what the community isn't doing for itself
  • I believe we need to recover the Community Team: for years now, every time they write something in public they get bullied by the same recurring small group of people (see governance by bullying above)

How about DAM?

  • I was just saying that we are not the emergency catch all
  • When the only enforcement you have is "nuclear escalation", there's nothing you can do until it's too late, and meanwhile lots of people suffer (this was written before Russia invaded Ukraine)
  • Also, when issues happen on public lists, the BTS, or on IRC, some of the perpetrators are also outside of the jurisdiction of DAM, which shows how DAM is not the tool for this

How about the DPL?

  • Talking about emergency catch alls, don't they have enough to do already?

Concentrating responsibility

  • Concentrating all responsibility for social issues in a single point creates a scapegoat: we're blamed for any conduct issue, and we're blamed for any action we take on conduct issues
    • also, when you are a small group you are personally identified with it. Taking action on a person may mean making a new enemy, and becoming a target for harassment, retaliation, or even just the general unwarranted hostility of someone who is left with an axe to grind
  • As long as responsibility is centralised, any action one takes as a response to one micro-aggression (or one micro-aggression too many) is an overreaction. Distributing that responsibility allows a finer granularity of actions to be taken
    • you don't call the police to tell someone they're being annoying at the pub: the people at the pub will tell you you're being annoying, and the police is called if you want to beat them up in response
  • We are also a community where we have no tool to give feedback to posts, so it still looks good to nitpick stupid details with smart-looking trenchant one-liners, or elaborate confrontational put-downs, and one doesn't get the feedback of "that did not help". Compare with discussing https://salsa.debian.org/debian/grow-your-ideas/ which does have this kind of feedback
    • the lack of moderation and enforcement makes the Debian community ideal for easy baiting, concern trolling, dog whistling, and related fun, and people who are not empowered can thus be manipulated into trolling those responsible
    • if you're fragile in Debian, people will play cat and mouse with you. It might be social awkwardness, or people taking themselves too seriously, but it can easily become bullying, and with no feedback it's hard to tell and course correct
  • Since DAM and DPL are where the ball stops, everyone else in Debian can afford to let the ball drop.
  • More generally, if only one group is responsible, nobody else is

Empowering developers

  • Police alone does not make a community safe: a community makes a community safe.
  • DDs currently have no power to act besides complaining to DAM, or complaining to Community Team that then can only pass complaints on to DAM.
    • you could act directly, but currently nobody has your back if the (micro-)aggression then starts extending to you, too
  • From no power comes no responsibility. And yet, the safety of a community is sustainable only if it is the responsibility of every member of the community.
  • don't wait for DAM as the only group who can do something
  • people should be able to address issues in smaller groups, without escalation at project level
  • but people don't have the tools for that
  • I/we've shouldered this responsibility for far too long because nobody else was doing it, and it's time the whole Debian community gets its act together and picks up this responsibility as it should. You don't get to not care just because there's a small number of people who are caring for you.

What needs to happen

  • distinguish DAM decisions from decisions that are more about vision and direction, and would require more representation
  • DAM warnings shouldn't belong in DAM
  • who is responsible for interpretation of the CoC?
  • deciding what to do about controversial people shouldn't belong in DAM
  • curation of the community shouldn't belong in DAM
  • can't do this via GRs: it's a mess to do a GR to decide how acceptable a specific person's behaviour is, and a lot of this requires more, and more frequent, micro-decisions than one would do via GRs

Back in 2017 I worked on setting up a cross-building toolchain for Qt Creator that takes advantage of Debian's packaging for the whole dependency ecosystem.

It ended with cbqt, a little script that sets up a chroot to hold cross-build-dependencies, to avoid conflicts with packages on the host system, and sets up a qmake alternative to make use of them.

Today I'm dusting off that work, to ensure it works on Debian bullseye.

Resetting Qt Creator

To make things reproducible, I wanted to reset Qt Creator's configuration.

Besides purging and reinstalling the package, one needs to manually remove:

  • ~/.config/QtProject
  • ~/.cache/QtProject/
  • /usr/share/qtcreator/QtProject, which is where configuration is stored if you used sdktool to programmatically configure Qt Creator (see for example this post and Debian bug #1012561)

Updating cbqt

Easy start, change the distribution for the chroot:

-DIST_CODENAME = "stretch"
+DIST_CODENAME = "bullseye"

Adding LIBDIR

Something else does not work:

Test$ qmake-armhf -makefile
Info: creating stash file …/Test/.qmake.stash
Test$ make
[...]
/usr/bin/arm-linux-gnueabihf-g++ -Wl,-O1 -Wl,-rpath-link,…/armhf/lib/arm-linux-gnueabihf -Wl,-rpath-link,…/armhf/usr/lib/arm-linux-gnueabihf -Wl,-rpath-link,…/armhf/usr/lib/ -o Test main.o mainwindow.o moc_mainwindow.o   …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Widgets.so …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Gui.so …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Core.so -lGLESv2 -lpthread
/usr/lib/gcc-cross/arm-linux-gnueabihf/10/../../../../arm-linux-gnueabihf/bin/ld: cannot find -lGLESv2
collect2: error: ld returned 1 exit status
make: *** [Makefile:146: Test] Error 1

I figured that now I also need to set QMAKE_LIBDIR and not just QMAKE_RPATHLINKDIR:

--- a/cbqt
+++ b/cbqt
@@ -241,18 +241,21 @@ include(../common/linux.conf)
 include(../common/gcc-base-unix.conf)
 include(../common/g++-unix.conf)

+QMAKE_LIBDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
+QMAKE_LIBDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
+QMAKE_LIBDIR += {chroot.abspath}/usr/lib/
 QMAKE_RPATHLINKDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/

Now it links again:

Test$ qmake-armhf -makefile
Test$ make
/usr/bin/arm-linux-gnueabihf-g++ -Wl,-O1 -Wl,-rpath-link,…/armhf/lib/arm-linux-gnueabihf -Wl,-rpath-link,…/armhf/usr/lib/arm-linux-gnueabihf -Wl,-rpath-link,…/armhf/usr/lib/ -o Test main.o mainwindow.o moc_mainwindow.o   -L…/armhf/lib/arm-linux-gnueabihf -L…/armhf/usr/lib/arm-linux-gnueabihf -L…/armhf/usr/lib/ …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Widgets.so …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Gui.so …/armhf/usr/lib/arm-linux-gnueabihf/libQt5Core.so -lGLESv2 -lpthread

Making it work in Qt Creator

Time to try it in Qt Creator, and sadly it fails:

/armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.

QMAKE_CXX.COMPILER_MACROS is not defined

I traced it to this bit in armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf (nonrelevant bits deleted):

isEmpty($${target_prefix}.COMPILER_MACROS) {
    msvc {
        # …
    } else: gcc|ghs {
        vars = $$qtVariablesFromGCC($$QMAKE_CXX)
    }
    for (v, vars) {
        # …
        $${target_prefix}.COMPILER_MACROS += $$v
    }
    cache($${target_prefix}.COMPILER_MACROS, set stash)
} else {
    # …
}

It turns out that qmake is not able to realise that the compiler is gcc, so vars does not get set, nothing is set in COMPILER_MACROS, and qmake fails.

Reproducing it on the command line

When run manually, however, qmake-armhf worked, so it would be good to know how Qt Creator is actually running qmake. Since it frustratingly does not show what commands it runs, I'll have to strace it:

strace -e trace=execve --string-limit=123456 -o qtcreator.trace -f qtcreator

And there it is:

$ grep qmake- qtcreator.trace
1015841 execve("/usr/local/bin/qmake-armhf", ["/usr/local/bin/qmake-armhf", "-query"], 0x56096e923040 /* 54 vars */) = 0
1015865 execve("/usr/local/bin/qmake-armhf", ["/usr/local/bin/qmake-armhf", "…/Test/Test.pro", "-spec", "arm-linux-gnueabihf", "CONFIG+=debug", "CONFIG+=qml_debug"], 0x7f5cb4023e20 /* 55 vars */) = 0

I run the command manually and indeed I reproduce the problem:

$ /usr/local/bin/qmake-armhf Test.pro -spec arm-linux-gnueabihf CONFIG+=debug CONFIG+=qml_debug
…/armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.

I try removing options until I find the one that breaks it and... now it's always broken! Even manually running qmake-armhf, like I did earlier, stopped working:

$ rm .qmake.stash
$ qmake-armhf -makefile
…/armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/features/toolchain.prf:76: Variable QMAKE_CXX.COMPILER_MACROS is not defined.

Debugging toolchain.prf

I tried purging and reinstalling qtcreator, and recreating the chroot, but qmake-armhf is staying broken. I'll let that be, and try to debug toolchain.prf.

By grepping gcc in the mkspecs directory, I managed to figure out that:

  • The } else: gcc|ghs { test is matching the value(s) of QMAKE_COMPILER
  • QMAKE_COMPILER can have multiple values, separated by space
  • If in armhf/usr/lib/arm-linux-gnueabihf/qt5/mkspecs/arm-linux-gnueabihf/qmake.conf I set QMAKE_COMPILER = gcc arm-linux-gnueabihf-gcc, then things work again.

Sadly, I failed to find reference documentation for QMAKE_COMPILER's syntax and behaviour. I also failed to find why qmake-armhf worked earlier, and I am also failing to restore the system to a situation where it works again. Maybe I dreamt that it worked? Or had some manual change lying around from some previous fiddling with things?

Anyway at least now I have the fix:

--- a/cbqt
+++ b/cbqt
@@ -248,7 +248,7 @@ QMAKE_RPATHLINKDIR += {chroot.abspath}/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/arm-linux-gnueabihf
 QMAKE_RPATHLINKDIR += {chroot.abspath}/usr/lib/

-QMAKE_COMPILER          = {chroot.arch_triplet}-gcc
+QMAKE_COMPILER          = gcc {chroot.arch_triplet}-gcc

 QMAKE_CC                = /usr/bin/{chroot.arch_triplet}-gcc

Fixing a compiler mismatch warning

In setting up the kit, Qt Creator also complained that the compiler from qmake did not match the one configured in the kit. That was easy to fix, by pointing at the host system cross-compiler in qmake.conf:

 QMAKE_COMPILER          = {chroot.arch_triplet}-gcc

-QMAKE_CC                = {chroot.arch_triplet}-gcc
+QMAKE_CC                = /usr/bin/{chroot.arch_triplet}-gcc

 QMAKE_LINK_C            = $$QMAKE_CC
 QMAKE_LINK_C_SHLIB      = $$QMAKE_CC

-QMAKE_CXX               = {chroot.arch_triplet}-g++
+QMAKE_CXX               = /usr/bin/{chroot.arch_triplet}-g++

 QMAKE_LINK              = $$QMAKE_CXX
 QMAKE_LINK_SHLIB        = $$QMAKE_CXX

Updated setup instructions

Create an armhf environment:

sudo cbqt ./armhf --create --verbose

Create a qmake wrapper that builds with this environment:

sudo ./cbqt ./armhf --qmake -o /usr/local/bin/qmake-armhf

Install the build-dependencies that you need:

# Note: :arch is added automatically to package names if no arch is explicitly specified
sudo ./cbqt ./armhf --install libqt5svg5-dev libmosquittopp-dev qtwebengine5-dev

Build with qmake

Use qmake-armhf instead of qmake and it works perfectly:

qmake-armhf -makefile
make

Set up Qt Creator

Configure a new Kit in Qt Creator:

  1. Tools/Options, then Kits, then Add
  2. Name: armhf (or anything you like)
  3. In the Qt Versions tab, click Add then set the path of the new Qt to /usr/local/bin/qmake-armhf. Click Apply.
  4. Back in the Kits tab, select the Qt version you just created in the Qt version field
  5. In Compilers, select the ARM versions of GCC. If they do not appear, install crossbuild-essential-armhf, then in the Compilers tab click Re-detect and then Apply to make them available for selection
  6. Dismiss the dialog with "OK": the new kit is ready

Now you can choose the default kit to build and run locally, and the armhf kit for remote cross-development.

I tried looking at sdktool to automate this step, and it requires a nontrivial amount of work to do it reliably, so these manual instructions will have to do.

Credits

This has been done as part of my work with Truelite.

A common logging pattern in Python is to have loggers related to module names:

import logging

log = logging.getLogger(__name__)


class Bill:
    def load_bill(self, filename: str):
        log.info("%s: loading file", filename)

However, I often find myself wanting loggers related to something context-dependent, like the kind of file that is being processed. For example, I'd like to log bill loading when it is done by the expenses module, and not when it is done by the printing module.

I came up with a little hack that keeps the same API as before, and allows propagating a context-dependent logger to the code being called:

# Call this file log.py
from __future__ import annotations
import contextlib
import contextvars
import logging

_log: contextvars.ContextVar[logging.Logger] = contextvars.ContextVar('log', default=logging.getLogger())


@contextlib.contextmanager
def logger(name: str):
    """
    Set a default logger for the duration of this context manager
    """
    old = _log.set(logging.getLogger(name))
    try:
        yield
    finally:
        _log.reset(old)


def debug(*args, **kw):
    _log.get().debug(*args, **kw)


def info(*args, **kw):
    _log.get().info(*args, **kw)


def warning(*args, **kw):
    _log.get().warning(*args, **kw)


def error(*args, **kw):
    _log.get().error(*args, **kw)

And now I can do this:

from . import log

# …
    with log.logger("expenses"):
        bill = load_bill(filename)


# This code did not change!
class Bill:
    def load_bill(self, filename: str):
        log.info("%s: loading file", filename)
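
To see which logger ends up handling each message, one can put the logger name in the log format (a minimal illustration, assuming the module above is saved as log.py in the same package; the format string is only there to show the effect):

import logging

from . import log

logging.basicConfig(level=logging.INFO, format="%(name)s: %(message)s")

with log.logger("expenses"):
    log.info("%s: loading file", "bill.pdf")   # expenses: bill.pdf: loading file

with log.logger("printing"):
    log.info("%s: loading file", "bill.pdf")   # printing: bill.pdf: loading file

log.info("%s: loading file", "bill.pdf")       # root: bill.pdf: loading file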

Anarcat's "procmail considered harmful" post convinced me to get my act together and finally migrate my venerable procmail based setup to sieve.

My setup was nontrivial, so I migrated with an intermediate step in which sieve scripts would by default pipe everything to procmail, which allowed me to slowly move rules from procmailrc to sieve until nothing remained in procmailrc.

Here's what I did.

Literature review

https://brokkr.net/2019/10/31/lets-do-dovecot-slowly-and-properly-part-3-lmtp/ has a guide quite aligned with current Debian, and could be a starting point to get an idea of the work to do.

https://wiki.dovecot.org/HowTo/PostfixDovecotLMTP is way more terse, but more aligned with my intentions. Reading the former helped me in understanding the latter.

https://datatracker.ietf.org/doc/html/rfc5228 has the full Sieve syntax.

https://doc.dovecot.org/configuration_manual/sieve/pigeonhole_sieve_interpreter/ has the list of Sieve features supported by Dovecot.

https://doc.dovecot.org/settings/pigeonhole/ has the reference on Dovecot's sieve implementation.

https://raw.githubusercontent.com/dovecot/pigeonhole/master/doc/rfc/spec-bosch-sieve-extprograms.txt is the hard to find full reference for the functions introduced by the extprograms plugin.

Debugging tools:

  • doveconf to dump dovecot's configuration to see if what it understands matches what I mean
  • sieve-test parses sieve scripts: sieve-test file.sieve /dev/null is a quick and dirty syntax check

Backup of all mails processed

One thing I did with procmail was to generate a monthly mailbox with all incoming email, with something like this:

BACKUP="/srv/backupts/test-`date +%Y-%m-d`.mbox"

:0c
$BACKUP

I did not find an obvious way in sieve to create monthly mailboxes, so I redesigned that system using Postfix's always_bcc feature, piping everything to an archive user.

I'll then recreate the monthly archiving using a chewmail script that I can simply run via cron.

Configure dovecot

apt install dovecot-sieve dovecot-lmtpd

I added this to the local dovecot configuration:

service lmtp {
  unix_listener /var/spool/postfix/private/dovecot-lmtp {
    user = postfix
    group = postfix
    mode = 0666
  }
}

protocol lmtp {
  mail_plugins = $mail_plugins sieve
}

plugin {
  sieve = file:~/.sieve;active=~/.dovecot.sieve
}

This makes Dovecot ready to receive mail from Postfix via a lmtp unix socket created in Postfix's private chroot.

It also activates the sieve plugin, and uses ~/.sieve as a sieve script.

The script can be a file or a directory; if it is a directory, ~/.dovecot.sieve will be a symlink pointing to the .sieve file to run.

This is a feature I'm not yet using, but if one day I want to try enabling UIs to edit sieve scripts, that part is ready.

Delegate to procmail

To make sieve scripts that delegate to procmail, I enabled the sieve_extprograms plugin:

 plugin {
   sieve = file:~/.sieve;active=~/.dovecot.sieve
+  sieve_plugins = sieve_extprograms
+  sieve_extensions = +vnd.dovecot.pipe
+  sieve_pipe_bin_dir = /usr/local/lib/dovecot/sieve-pipe
+  sieve_trace_dir = ~/.sieve-trace
+  sieve_trace_level = matching
+  sieve_trace_debug = yes
 }

and then created a script for it:

mkdir -p /usr/local/lib/dovecot/sieve-pipe/
(echo "#!/bin/sh"; echo "exec /usr/bin/procmail") > /usr/local/lib/dovecot/sieve-pipe/procmail
chmod 0755 /usr/local/lib/dovecot/sieve-pipe/procmail

And I can have a sieve script that delegates processing to procmail:

require "vnd.dovecot.pipe";

pipe "procmail";

Activate the postfix side

These changes switched local delivery over to Dovecot:

--- a/roles/mailserver/templates/dovecot.conf
+++ b/roles/mailserver/templates/dovecot.conf
@@ -25,6 +25,8 @@
+auth_username_format = %Ln
+
diff --git a/roles/mailserver/templates/main.cf b/roles/mailserver/templates/main.cf
index d2c515a..d35537c 100644
--- a/roles/mailserver/templates/main.cf
+++ b/roles/mailserver/templates/main.cf
@@ -64,8 +64,7 @@ virtual_alias_domains =
-mailbox_command = procmail -a "$EXTENSION"
-mailbox_size_limit = 0
+mailbox_transport = lmtp:unix:private/dovecot-lmtp

Without auth_username_format = %Ln dovecot won't be able to understand usernames sent by postfix in my specific setup.

Moving rules over to sieve

This is mostly straightforward, with the luxury of being able to do it a bit at a time.

The last tricky bit was how to call spamc from sieve, as in some situations I reduce system load by running the spamfilter only on a prefiltered selection of incoming emails.

For this I enabled the filter directive in sieve:

 plugin {
   sieve = file:~/.sieve;active=~/.dovecot.sieve
   sieve_plugins = sieve_extprograms
-  sieve_extensions = +vnd.dovecot.pipe
+  sieve_extensions = +vnd.dovecot.pipe +vnd.dovecot.filter
   sieve_pipe_bin_dir = /usr/local/lib/dovecot/sieve-pipe
+  sieve_filter_bin_dir = /usr/local/lib/dovecot/sieve-filter
   sieve_trace_dir = ~/.sieve-trace
   sieve_trace_level = matching
   sieve_trace_debug = yes
 }

Then I created a filter script:

mkdir -p /usr/local/lib/dovecot/sieve-filter/
(echo "#!/bin/sh"; echo "exec /usr/bin/spamc") > /usr/local/lib/dovecot/sieve-filter/spamc
chmod 0755 /usr/local/lib/dovecot/sieve-filter/spamc

And now what was previously:

:0 fw
| /usr/bin/spamc

:0
* ^X-Spam-Status: Yes
.spam/

Can become:

require "vnd.dovecot.filter";
require "fileinto";

filter "spamc";

if header :contains "x-spam-level" "**************" {
    discard;
} elsif header :matches "X-Spam-Status" "Yes,*" {
    fileinto "spam";
}

Updates

Ansgar mentioned that it's possible to replicate the monthly mailbox using the variables and date extensions, with a hacky trick from the extensions' RFC:

require "date";
require "variables";
# fileinto and its :create argument need these two extensions as well
require "fileinto";
require "mailbox";

if currentdate :matches "month" "*" { set "month" "${1}"; }
if currentdate :matches "year" "*" { set "year" "${1}"; }

fileinto :create "${month}-${year}";

Suppose you have a tool that archives images, or scientific data, and it has a test suite. It would be good to collect sample files for the test suite, but they are often so big one can't really bloat the repository with them.

But does the test suite need everything that is in those files? Not necessarily. For example, if one is testing code that reads EXIF metadata, one doesn't care about what is in the image itself, so the sample can keep its metadata and have the actual data payload blanked out.

That technique works extremely well: I can take GRIB files that are several megabytes in size, zero out their data payload, and get nice 1Kb samples for the test suite.

I've started to collect and organise the little hacks I use for this into a tool I called mktestsample:

$ mktestsample -v samples1/*
2021-11-23 20:16:32 INFO common samples1/cosmo_2d+0.grib: size went from 335168b to 120b
2021-11-23 20:16:32 INFO common samples1/grib2_ifs.arkimet: size went from 4993448b to 39393b
2021-11-23 20:16:32 INFO common samples1/polenta.jpg: size went from 3191475b to 94517b
2021-11-23 20:16:32 INFO common samples1/test-ifs.grib: size went from 1986469b to 4860b

Those are massive savings, but I'm not satisfied with those almost 94Kb of JPEG:

$ ls -la samples1/polenta.jpg
-rw-r--r-- 1 enrico enrico 94517 Nov 23 20:16 samples1/polenta.jpg
$ gzip samples1/polenta.jpg
$ ls -la samples1/polenta.jpg.gz
-rw-r--r-- 1 enrico enrico 745 Nov 23 20:16 samples1/polenta.jpg.gz

I believe I did all I could: completely blank out image data, set quality to zero, maximize subsampling, and tweak quantization to throw everything away.
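
In Pillow terms, that kind of squeeze looks roughly like this (a sketch of the general idea, not the actual mktestsample code; polenta-sample.jpg is a made-up output name):

# Sketch: same dimensions as the original, all-black pixels, lowest quality,
# maximum chroma subsampling, keeping the original EXIF block if present.
from PIL import Image

with Image.open("samples1/polenta.jpg") as img:
    blank = Image.new("RGB", img.size)
    save_args = dict(quality=0, subsampling=2, optimize=True)
    exif = img.info.get("exif")
    if exif:
        save_args["exif"] = exif
    blank.save("samples1/polenta-sample.jpg", "JPEG", **save_args)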

Still, the result is a 94Kb file that can be gzipped down to 745 bytes. Is there something I'm missing?

I suppose JPEG is better at storing an image than at storing the lack of an image. I cannot really complain :)

I can still commit compressed samples of large images to a git repository, taking very little data indeed. That's really nice!

This morning we realised that a test case failed only on Fedora 34 (the link is in Italian) and we set out to debug it.

The initial analysis

This is the initial reproducer:

$ PROJ_DEBUG=3 python setup.py test
test_recipe (tests.test_litota3.TestLITOTA3NordArkimetIFS) ... pj_open_lib(proj.db): call fopen(/lib64/../share/proj/proj.db) - succeeded
proj_create: Open of /lib64/../share/proj/proj.db failed
pj_open_lib(proj.db): call fopen(/lib64/../share/proj/proj.db) - succeeded
proj_create: no database context specified
Cannot instantiate source_crs
EXCEPTION in py_coast(): ProjP: cannot create crs to crs from [EPSG:4326] to [+proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +over +units=m +no_defs]
ERROR

Note that opening /lib64/../share/proj/proj.db sometimes succeeds, sometimes fails. It's some kind of Schrödinger path, which works or not depending on how you observe it:

# ls -lad /lib64
lrwxrwxrwx 1 1000 1000 9 Jan 26  2021 /lib64 -> usr/lib64

$ ls -la /lib64/../share/proj/proj.db
-rw-r--r-- 1 root root 8925184 Jan 28  2021 /lib64/../share/proj/proj.db

$ cd /lib64/../share/proj/

$ cd /lib64
$ cd ..
$ cd share
-bash: cd: share: No such file or directory

And indeed, stat(2) finds it, and sqlite doesn't (the file is a sqlite database):

$ stat /lib64/../share/proj/proj.db
  File: /lib64/../share/proj/proj.db
  Size: 8925184     Blocks: 17432      IO Block: 4096   regular file
Device: 33h/51d Inode: 56907       Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2021-11-08 14:09:12.334350779 +0100
Modify: 2021-01-28 05:38:11.000000000 +0100
Change: 2021-11-08 13:42:51.758874327 +0100
 Birth: 2021-11-08 13:42:51.710874051 +0100

$ sqlite3 /lib64/../share/proj/proj.db
Error: unable to open database "/lib64/../share/proj/proj.db": unable to open database file

A minimal reproducer

Later on we started stripping layers of code towards a minimal reproducer: here it is. It works or doesn't work depending on whether proj is linked explicitly, or via MagPlus:

$ cat tc.cc
#include <magics/ProjP.h>

int main() {
    magics::ProjP p("EPSG:4326", "+proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +over +units=m +no_defs");
    return 0;
}

$ g++ -o tc  tc.cc -I/usr/include/magics  -lMagPlus
$ ./tc
proj_create: Open of /lib64/../share/proj/proj.db failed
proj_create: no database context specified
terminate called after throwing an instance of 'magics::MagicsException'
  what():  ProjP: cannot create crs to crs from [EPSG:4326] to [+proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +over +units=m +no_defs]
Aborted (core dumped)

$ g++ -o tc  tc.cc -I/usr/include/magics -lproj  -lMagPlus
$ ./tc

What is going on here?

A difference between the two is the path used to link to libproj.so:

$ ldd ./tc | grep proj
    libproj.so.19 => /lib64/libproj.so.19 (0x00007fd4919fb000)
$ g++ -o tc  tc.cc -I/usr/include/magics   -lMagPlus
$ ldd ./tc | grep proj
    libproj.so.19 => /lib64/../lib64/libproj.so.19 (0x00007f6d1051b000)

Common sense screams that this should not matter, but we chased an intuition and found that one of the ways proj looks for its database is relative to its shared library.

Indeed, gdb in hand, we saw that proj's dladdr call returns /lib64/../lib64/libproj.so.19.

From /lib64/../lib64/libproj.so.19, proj strips two path components from the end, presumably to go from something like /something/usr/lib/libproj.so to /something/usr.

So, dladdr returns /lib64/../lib64/libproj.so.19, which becomes /lib64/../, which becomes /lib64/../share/proj/proj.db, which exists on the file system and is used as a path to the database.
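
The same arithmetic is easy to replay (in Python here, just to illustrate the lookup; proj itself is C++):

import os

libpath = "/lib64/../lib64/libproj.so.19"               # what dladdr returned
prefix = os.path.dirname(os.path.dirname(libpath))      # strip two components: "/lib64/.."
candidate = os.path.join(prefix, "share/proj/proj.db")  # "/lib64/../share/proj/proj.db"
print(candidate, os.path.exists(candidate))             # True on that Fedora system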

But depending on how you look at it, that path might or might not be valid: it passes the stat(2) check that stops the lookup for candidate paths, but sqlite is unable to open it.

Why does the other path work?

By linking libproj.so in the other way, dladdr returns /lib64/libproj.so.19, which becomes /share/proj/proj.db, which doesn't exist, which triggers a fallback to a PROJ_LIB constant defined at compile time, which is a path that works no matter how you look at it.

Why that weird path with libMagPlus?

To complete the picture, we found that libMagPlus.so is packaged with an rpath set, which is known to cause trouble:

# readelf -d /usr/lib64/libMagPlus.so|grep rpath
 0x000000000000000f (RPATH)              Library rpath: [$ORIGIN/../lib64]

The workaround

We found that one can set PROJ_LIB in the environment to override the normal proj database lookup. Building on that, we came up with a simple way to override it on Fedora 34 only:

    if distro is not None and distro.linux_distribution()[:2] == ("Fedora", "34") and "PROJ_LIB" not in os.environ:
        self.env_overrides["PROJ_LIB"] = "/usr/share/proj/"

This has been a most edifying and educational debugging session, with only the necessary modicum of curses and swearwords. Working in a team of excellent people really helps.

help2man is quite nice for autogenerating manpages from command line help, making sure that they stay up to date as command line options evolve.

It works quite well, except for commands with subcommands, like Python programs that use argparse's add_subparsers.

So, here's a quick hack that calls help2man for each subcommand, and stitches everything together in a simple manpage.

#!/usr/bin/python3

import re
import shutil
import sys
import subprocess
import tempfile

# TODO: move to argparse
command = sys.argv[1]

# Use setup.py to get the program version
res = subprocess.run([sys.executable, "setup.py", "--version"], stdout=subprocess.PIPE, text=True, check=True)
version = res.stdout.strip()

# Call the main commandline help to get a list of subcommands
res = subprocess.run([sys.executable, command, "--help"], stdout=subprocess.PIPE, text=True, check=True)
subcommands = re.sub(r'^.+\{(.+)\}.+$', r'\1', res.stdout, flags=re.DOTALL).split(',')

# Generate a help2man --include file with an extra section for each subcommand
with tempfile.NamedTemporaryFile("wt") as tf:
    print("[>DESCRIPTION]", file=tf)

    for subcommand in subcommands:
        res = subprocess.run(
                ["help2man", f"--name={command}", "--section=1",
                 "--no-info", "--version-string=dummy", f"./{command} {subcommand}"],
                stdout=subprocess.PIPE, text=True, check=True)
        subcommand_doc = re.sub(r'^.+.SH DESCRIPTION', '', res.stdout, flags=re.DOTALL)
        print(".SH ", subcommand.upper(), " SUBCOMMAND", file=tf)
        tf.write(subcommand_doc)

    with open(f"{command}.1.in", "rt") as fd:
        shutil.copyfileobj(fd, tf)

    tf.flush()

    # Call help2man on the main command line help, with the extra include file
    # we just generated
    subprocess.run(
            ["help2man", f"--include={tf.name}", f"--name={command}",
             "--section=1", "--no-info", f"--version-string={version}",
             "--output=arkimaps.1", "./arkimaps"],
            check=True)

I had to package a nontrivial Python codebase, and I needed to put dependencies in setup.py.

I could do git grep -h import | sort -u, then review the output by hand, but I lacked the motivation for it. Much better to take a stab at solving the general problem.

The result is at https://github.com/spanezz/python-devel-tools.

One fun part is scanning a directory tree, using ast to find import statements scattered around the code:

class Scanner:
    def __init__(self):
        self.names: Set[str] = set()

    def scan_dir(self, root: str):
        for dirpath, dirnames, filenames, dir_fd in os.fwalk(root):
            for fn in filenames:
                if fn.endswith(".py"):
                    with dirfd_open(fn, dir_fd=dir_fd) as fd:
                        self.scan_file(fd, os.path.join(dirpath, fn))
                st = os.stat(fn, dir_fd=dir_fd)
                if st.st_mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
                    with dirfd_open(fn, dir_fd=dir_fd) as fd:
                        try:
                            lead = fd.readline()
                        except UnicodeDecodeError:
                            continue
                        if re_python_shebang.match(lead):
                            fd.seek(0)
                            self.scan_file(fd, os.path.join(dirpath, fn))

    def scan_file(self, fd: TextIO, pathname: str):
        log.info("Reading file %s", pathname)
        try:
            tree = ast.parse(fd.read(), pathname)
        except SyntaxError as e:
            log.warning("%s: file cannot be parsed", pathname, exc_info=e)
            return

        self.scan_tree(tree)

    def scan_tree(self, tree: ast.AST):
        for stm in tree.body:
            if isinstance(stm, ast.Import):
                for alias in stm.names:
                    if not isinstance(alias.name, str):
                        print("NAME", repr(alias.name), stm)
                    self.names.add(alias.name)
            elif isinstance(stm, ast.ImportFrom):
                if stm.module is not None:
                    self.names.add(stm.module)
            elif hasattr(stm, "body"):
                self.scan_tree(stm)

Another fun part is grouping the imported module names by where in sys.path they have been found:

    scanner = Scanner()
    scanner.scan_dir(args.dir)

    sys.path.append(args.dir)
    by_sys_path: Dict[str, List[str]] = collections.defaultdict(list)
    for name in sorted(scanner.names):
        spec = importlib.util.find_spec(name)
        if spec is None or spec.origin is None:
            by_sys_path[""].append(name)
        else:
            for sp in sys.path:
                if spec.origin.startswith(sp):
                    by_sys_path[sp].append(name)
                    break
            else:
                by_sys_path[spec.origin].append(name)

    for sys_path, names in sorted(by_sys_path.items()):
        print(f"{sys_path or 'unidentified'}:")
        for name in names:
            print(f"  {name}")

An example: it's kind of nice how it can at least tell stdlib modules apart, so one doesn't need to read through those:

$ ./scan-imports …/himblick
unidentified:
  changemonitor
  chroot
  cmdline
  mediadir
  player
  server
  settings
  static
  syncer
  utils
…/himblick:
  himblib.cmdline
  himblib.host_setup
  himblib.player
  himblib.sd
/usr/lib/python3.9:
  __future__
  argparse
  asyncio
  collections
  configparser
  contextlib
  datetime
  io
  json
  logging
  mimetypes
  os
  pathlib
  re
  secrets
  shlex
  shutil
  signal
  subprocess
  tempfile
  textwrap
  typing
/usr/lib/python3/dist-packages:
  asyncssh
  parted
  progressbar
  pyinotify
  setuptools
  tornado
  tornado.escape
  tornado.httpserver
  tornado.ioloop
  tornado.netutil
  tornado.web
  tornado.websocket
  yaml
built-in:
  sys
  time

Maybe such a tool already exists and works much better than this? From a quick search I didn't find it, and it was fun to (re)invent it.

Updates:

Jakub Wilk pointed me to an old python-modules script that finds Debian dependencies.

The AST scanning code should be refactored to use ast.NodeVisitor.
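
A rough sketch of what that refactoring could look like (my own untested idea, not code from the repository):

import ast
from typing import Set


class ImportVisitor(ast.NodeVisitor):
    """Collect imported module names from a parsed source tree."""

    def __init__(self) -> None:
        self.names: Set[str] = set()

    def visit_Import(self, node: ast.Import) -> None:
        for alias in node.names:
            self.names.add(alias.name)

    def visit_ImportFrom(self, node: ast.ImportFrom) -> None:
        if node.module is not None:
            self.names.add(node.module)


# Usage: visitor = ImportVisitor(); visitor.visit(tree); visitor.names

Unlike the hand-rolled recursion, NodeVisitor's generic_visit descends into every node, so imports inside functions and methods are picked up as well.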

I had this nightmare where I had a very, very important confcall.

I joined with Chrome. Chrome said Failed to access your microphone - Cannot use microphone for an unknown reason. Could not start audio source.

I joined with Firefox. Firefox chose Monitor of Built-in Audio Analog Stereo as a microphone, and did not let me change it. Not in the browser, not in pavucontrol.

I joined with the browser on my phone, and the webpage said This meeting needs to use your microphone and camera. Select *Allow* when your browser asks for permissions. But the question never came.

I could hear people talking. I had very important things to say. I tried typing them in the chat window, but they weren't seeing it. The meeting ended. I was on the verge of tears.

Tell me, Mr. Anderson, what good is a phone call when you are unable to speak?

Since this nightmare happened for real, including the bit about tears at the end, let's make sure it doesn't happen again. I should now have three working systems, which hopefully won't all break again at the same time.

Fixing Chrome

I can reproduce this reliably, on Bullseye's standard Chromium 90.0.4430.212-1, just launched on an empty profile, no extensions.

The webpage has camera and microphone allowed. Chrome doesn't show up in the recording tab of pulseaudio. Nothing on Chrome's stdout/stderr.

JavaScript console has:

Logger.js:154 2021-09-10Txx:xx:xx.xxxZ [features/base/tracks] Failed to create local tracks
Array(2)
DOMException: Could not start audio source

I found the answer here:

I had the similar problem once with chromium. i could solve it by switching in preferences->microphone-> from "default" to "intern analog stereo".

Opening the little popup next to the microphone/mute button allows choosing other microphones, which work. Only "Same as system (Default)" does not work.

Fixing Firefox

I have firefox-esr 78.13.0esr-1~deb11u1. In Jitsi, microphone selection is disabled on the toolbar and in the settings menu. In pavucontrol, changing the recording device for Firefox has no effect. If for some reason the wrong microphone got chosen, those are not ways of fixing it.

What I found works is to click on the camera permission icon, remove microphone permission, then reload the page. At that point Firefox will ask for permission again, and that microphone selection seems to work.

Relevant bugs: on Jitsi and on Firefox. Since this is well known (once you find the relevant issues), I'd have appreciated Jitsi at least showing a link to an explanation of workarounds on Firefox, instead of just disabling microphone selection.

Fixing Jitsi on the phone side

I really don't want to preemptively give camera and microphone permissions to my phone browser. I noticed that there's a Jitsi app on F-Droid and, much as I hate to use an app when a website would work, at least in this case it's a way to keep the permission sets separate, so I installed that.

Fixing pavucontrol?

I tried to find out why I can't change the input device for Firefox in pavucontrol. I only managed to find an Ask Ubuntu question with no answer and a Unix StackExchange question with no answer.