Enrico's blog

Last 10 blog posts

2019-03-15 11:41:40+01:00

gitpython: list all files in a git commit

A little gitpython recipe to list the paths of all files in a commit:

#!/usr/bin/python3

import git
from pathlib import Path
import sys


def list_paths(root_tree, path=Path(".")):
    for blob in root_tree.blobs:
        yield path / blob.name
    for tree in root_tree.trees:
        yield from list_paths(tree, path / tree.name)


repo = git.Repo(".", search_parent_directories=True)
commit = repo.commit(sys.argv[1])
for path in list_paths(commit.tree):
    print(path)

It can be a good base, for example, for writing a script that, given two git branches, shows which django migrations are in one and not in the other, without doing any git checkout of the code.
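
A minimal sketch of that idea, working on plain lists of paths such as those produced by list_paths above. The helper names `migration_paths` and `migrations_only_in` are hypothetical, and the sketch assumes migrations are `.py` files that live in directories named `migrations`:

```python
from pathlib import Path
from typing import Iterable, List, Set


def migration_paths(paths: Iterable[Path]) -> Set[Path]:
    """Keep only files that look like Django migrations (assumed layout)."""
    return {
        p for p in paths
        if p.suffix == ".py"
        and p.parent.name == "migrations"
        and p.name != "__init__.py"
    }


def migrations_only_in(first: Iterable[Path], second: Iterable[Path]) -> List[Path]:
    """Return the migrations found in `first` but missing from `second`."""
    return sorted(migration_paths(first) - migration_paths(second))
```

With gitpython, the two arguments could be `list_paths(repo.commit("branch-a").tree)` and `list_paths(repo.commit("branch-b").tree)`, so no checkout is needed.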

debian eng git gitpython hacks pdo python sw
2019-03-08 00:00:00+01:00

Starting tornado on a random free port

One of the programs I maintain for work is a GUI data browser that uses Tornado as a backend and a web browser as a front-end.

It is quite convenient to start the command and have the browser open automatically on the right URL. It's quite annoying to start the command and be told that the default port is already in use.

I've needed this trick quite often, also when writing unit tests, and it's time I note it down somewhere, so it's easier to find than going through Tornado's unittest code where I found it the first time.

This is how to start Tornado on a free random port:

from tornado.options import define, options
import tornado.netutil
import tornado.httpserver

define("web_port", type=int, default=None, help="listening port for web interface")

application = Application(self.db_url)

if options.web_port is None:
    # Port 0 asks the kernel to pick any free port
    sockets = tornado.netutil.bind_sockets(0, '127.0.0.1')
    # Read back the port number that was actually assigned
    self.web_port = sockets[0].getsockname()[1]
    server = tornado.httpserver.HTTPServer(application)
    server.add_sockets(sockets)
else:
    server = tornado.httpserver.HTTPServer(application)
    server.listen(options.web_port)
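
The trick relies on the kernel choosing a free port when asked to bind to port 0. Here is a minimal sketch of the same behaviour using only the standard library socket module; `bind_free_port` is a hypothetical helper, not part of Tornado, and it only handles IPv4:

```python
import socket


def bind_free_port(host="127.0.0.1"):
    """Bind a listening TCP socket to a free port chosen by the kernel."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 means "any free port"
    sock.bind((host, 0))
    sock.listen()
    # getsockname() returns (host, port) with the port actually assigned
    return sock, sock.getsockname()[1]


if __name__ == "__main__":
    sock, port = bind_free_port()
    print("listening on http://127.0.0.1:%d/" % port)
    sock.close()
```

Keeping the socket open and handing it to the server, as add_sockets does above, avoids the race of finding a free port, closing the socket, and binding again later.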
debian devel eng hacks pdo python tornado
2019-03-07 00:00:00+01:00

Getting rusage of child processes on python asyncio

I am writing a little application server for microservices written as compiled binaries, and I would like to log execution statistics from getrusage(2).

The application server is written using asyncio, and processes are managed using asyncio subprocesses.

Unfortunately, asyncio uses os.waitpid instead of os.wait4 to reap child processes, and to get rusage information one has to delve into the asyncio innards, and provide a custom ChildWatcher implementation. Here's how I did it:

import asyncio
from asyncio.log import logger
from contextlib import contextmanager
import os


class ExtendedResults:
    def __init__(self):
        self.rusage = None
        self.returncode = None


class SafeChildWatcherWithRusage(asyncio.SafeChildWatcher):
    """
    SafeChildWatcher that uses os.wait4 to also get rusage information.
    """
    rusage_results = {}

    @classmethod
    @contextmanager
    def monitor(cls, proc):
        """
        Return an ExtendedResults that gets filled when the process exits
        """
        assert proc.pid > 0
        pid = proc.pid
        extended_results = ExtendedResults()
        cls.rusage_results[pid] = extended_results
        try:
            yield extended_results
        finally:
            cls.rusage_results.pop(pid, None)

    def _do_waitpid(self, expected_pid):
        # The original is in asyncio/unix_events.py; on new python versions, it
        # makes sense to check changes to it and port them here
        assert expected_pid > 0

        try:
            pid, status, rusage = os.wait4(expected_pid, os.WNOHANG)
        except ChildProcessError:
            # The child process is already reaped
            # (may happen if waitpid() is called elsewhere).
            pid = expected_pid
            returncode = 255
            # No rusage is available for a child reaped elsewhere
            rusage = None
            logger.warning(
                "Unknown child process pid %d, will report returncode 255",
                pid)
        else:
            if pid == 0:
                # The child process is still alive.
                return

            returncode = self._compute_returncode(status)
            if self._loop.get_debug():
                logger.debug('process %s exited with returncode %s',
                             expected_pid, returncode)

        extended_results = self.rusage_results.get(pid)
        if extended_results is not None:
            extended_results.rusage = rusage
            extended_results.returncode = returncode

        try:
            callback, args = self._callbacks.pop(pid)
        except KeyError:  # pragma: no cover
            # May happen if .remove_child_handler() is called
            # after os.waitpid() returns.
            if self._loop.get_debug():
                logger.warning("Child watcher got an unexpected pid: %r",
                               pid, exc_info=True)
        else:
            callback(pid, returncode, *args)

    @classmethod
    def install(cls):
        loop = asyncio.get_event_loop()
        child_watcher = cls()
        child_watcher.attach_loop(loop)
        asyncio.set_child_watcher(child_watcher)

To use it:

from .hacks import SafeChildWatcherWithRusage
SafeChildWatcherWithRusage.install()

...

    @coroutine
    def run(self, *args, **kw):
        kw["stdin"] = asyncio.subprocess.PIPE
        kw["stdout"] = asyncio.subprocess.PIPE
        kw["stderr"] = asyncio.subprocess.PIPE
        self.started = time.time()

        self.proc = yield from asyncio.create_subprocess_exec(*args, **kw)

        from .hacks import SafeChildWatcherWithRusage
        with SafeChildWatcherWithRusage.monitor(self.proc) as results:
            yield from asyncio.tasks.gather(
                self.write_stdin(self.proc.stdin),
                self.read_stdout(self.proc.stdout),
                self.read_stderr(self.proc.stderr)
            )
        self.returncode = yield from self.proc.wait()
        self.rusage = results.rusage
        self.ended = time.time()
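
For reference, this is what os.wait4 returns when used outside of asyncio. A minimal sketch for Unix only; `run_with_rusage` is a hypothetical helper that blocks until the child exits:

```python
import os
import subprocess
import sys


def run_with_rusage(argv):
    """Run argv to completion and return (returncode, rusage)."""
    proc = subprocess.Popen(argv)
    # wait4 reaps the child like waitpid, and also returns its resource usage
    pid, status, rusage = os.wait4(proc.pid, 0)
    if os.WIFSIGNALED(status):
        returncode = -os.WTERMSIG(status)
    else:
        returncode = os.WEXITSTATUS(status)
    # The child has already been reaped: record it so Popen does not wait again
    proc.returncode = returncode
    return returncode, rusage


if __name__ == "__main__":
    returncode, rusage = run_with_rusage([sys.executable, "-c", "pass"])
    print(returncode, rusage.ru_utime, rusage.ru_stime, rusage.ru_maxrss)
```

The rusage value has the same fields as the one returned by resource.getrusage, such as ru_utime, ru_stime and ru_maxrss.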
asyncio debian devel eng hacks pdo python
2019-03-06 00:00:00+01:00

Serving debian-distributed javascript libraries in Tornado

Debian conveniently distributes JavaScript libraries, and expects packaged software to use them rather than embedding its own copies.

Here is a convenient custom StaticFileHandler for Tornado that looks for the Debian-distributed versions of JavaScript libraries, and falls back to the vendored versions if they are not found:

from tornado import web
import pathlib


class StaticFileHandler(web.StaticFileHandler):
    """
    StaticFileHandler that allows overriding paths in the static directory with
    system provided versions
    """
    SYSTEM_ASSET_PATH = pathlib.Path("/usr/share/javascript")

    @classmethod
    def get_absolute_path(cls, root, path):
        path = pathlib.PurePath(path)
        if not path.parts:
            return super().get_absolute_path(root, path)

        system_dir = cls.SYSTEM_ASSET_PATH.joinpath(path.parts[0])
        if system_dir.is_dir():
            # If that asset directory exists in the system, look for things in
            # there
            return cls.SYSTEM_ASSET_PATH.joinpath(path)
        else:
            # Else go ahead with the default static dir
            return super().get_absolute_path(root, path)

    def validate_absolute_path(self, root, absolute_path):
        """
        Rewrite of tornado's validate_absolute_path not to raise an error for
        paths in /usr/share/javascript/
        """
        root = pathlib.Path(root)
        absolute_path = pathlib.Path(absolute_path)

        is_system_root = absolute_path.parts[:len(self.SYSTEM_ASSET_PATH.parts)] == self.SYSTEM_ASSET_PATH.parts
        is_static_root = absolute_path.parts[:len(root.parts)] == root.parts

        if not is_system_root and not is_static_root:
            raise web.HTTPError(403, "%s is not in root static directory or system assets path",
                                self.path)

        if absolute_path.is_dir() and self.default_filename is not None:
            # need to look at the request.path here for when path is empty
            # but there is some prefix to the path that was already
            # trimmed by the routing
            if not self.request.path.endswith("/"):
                self.redirect(self.request.path + "/", permanent=True)
                return
            absolute_path = absolute_path.joinpath(self.default_filename)
        if not absolute_path.exists():
            raise web.HTTPError(404)
        if not absolute_path.is_file():
            raise web.HTTPError(403, "%s is not a file", self.path)
        return str(absolute_path)

This is how to use it:

import tornado.web


class DebianApplication(tornado.web.Application):
    def __init__(self, *args, **settings):
        from .static import StaticFileHandler
        settings.setdefault("static_handler_class", StaticFileHandler)
        super().__init__(*args, **settings)

And from HTML it's simply a matter of matching the first path component to what is used by Debian's packages under /usr/share/javascript:

    <link rel="stylesheet" href="{{static_url('bootstrap4/css/bootstrap.min.css')}}">
    <script src="{{static_url('jquery/jquery.min.js')}}"></script>
    <script src="{{static_url('popper.js/umd/popper.min.js')}}"></script>
    <script src="{{static_url('bootstrap4/js/bootstrap.min.js')}}"></script>

I find it quite convenient: this way I can start writing prototype code without worrying about fetching javascript libraries to bundle.

I only need to start worrying about it if I need to deploy outside of Debian, or to old stable versions of Debian that don't contain the required JavaScript dependencies. In that case, I just cp -r from a working /usr/share/javascript into Tornado's static directory, and I'm done.
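
The lookup rule used by get_absolute_path can also be tried in isolation, without Tornado. This is a minimal sketch: `resolve_asset` is a hypothetical helper, and it does none of the safety checks that validate_absolute_path performs:

```python
from pathlib import Path, PurePosixPath


def resolve_asset(static_root: Path, system_root: Path, url_path: str) -> Path:
    """Prefer the system copy of a library, else use the static directory."""
    rel = PurePosixPath(url_path)
    # The first path component names the library, for example "jquery"
    if rel.parts and (system_root / rel.parts[0]).is_dir():
        return system_root.joinpath(*rel.parts)
    return static_root.joinpath(*rel.parts)


if __name__ == "__main__":
    print(resolve_asset(Path("static"), Path("/usr/share/javascript"),
                        "jquery/jquery.min.js"))
```

On a Debian system with the jquery library package installed, the example resolves inside /usr/share/javascript; elsewhere it falls back to the static directory.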

debian devel eng hacks pdo tornado
2019-03-05 17:57:33+01:00

Python hacks: opening a compressed mailbox

Python's mailbox.mbox cannot open compressed mailboxes: it silently reports them as empty:

>>> import mailbox
>>> print(len(mailbox.mbox("/tmp/test.mbox")))
9
>>> print(len(mailbox.mbox("/tmp/test.mbox.gz")))
0
>>> print(len(mailbox.mbox("/tmp/test1.mbox.xz")))
0

For a prototype rewrite of the MIA team's Echelon (the engine behind mia-query), I needed to scan compressed mailboxes, and I had to work around this limitation.

Here is the alternative mailbox.mbox implementation:

import lzma
import gzip
import bz2
import mailbox
from pathlib import Path
from typing import BinaryIO, Generator


class StreamMbox(mailbox.mbox):
    """
    mailbox.mbox does not support opening a stream, which is sad.

    This is a subclass that works around it
    """
    def __init__(self, fd: BinaryIO, factory=None, create: bool = True):
        # Do not call parent __init__, just redo everything here to be able to
        # open a stream. This will need to be re-reviewed for every new version
        # of python's stdlib.

        # Mailbox constructor
        self._path = None
        self._factory = factory

        # _singlefileMailbox constructor
        self._file = fd
        self._toc = None
        self._next_key = 0
        self._pending = False       # No changes require rewriting the file.
        self._pending_sync = False  # No need to sync the file
        self._locked = False
        self._file_length = None    # Used to record mailbox size

        # mbox constructor
        self._message_factory = mailbox.mboxMessage

    def flush(self):
        raise NotImplementedError("StreamMbox is a readonly class")


class UsageExample:
    DECOMPRESS = {
        ".xz": lzma.open,
        ".gz": gzip.open,
        ".bz2": bz2.open,
    }

    @classmethod
    def scan(cls, path: Path) -> Generator[ScannedEmail, None, None]:
        decompress = cls.DECOMPRESS.get(path.suffix)
        if decompress is None:
            with open(path.as_posix(), "rb") as fd:
                yield from cls.scan_fd(path, fd)
        else:
            with decompress(path.as_posix(), "rb") as fd:
                yield from cls.scan_fd(path, fd)

    @classmethod
    def scan_fd(cls, path: Path, fd: BinaryIO) -> Generator[ScannedEmail, None, None]:
        mbox = StreamMbox(fd)
        for msg in mbox:
            ...
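
If relying on the private attributes of mailbox.mbox is a concern, a different technique is to decompress the mailbox to a temporary file and open that with the stock class. This is a minimal sketch of that alternative, not the approach used above: `count_messages` and `OPENERS` are hypothetical names, and it costs extra disk space and time on large mailboxes:

```python
import bz2
import gzip
import lzma
import mailbox
import shutil
import tempfile
from pathlib import Path

# Same suffix-based dispatch as the DECOMPRESS table above
OPENERS = {".xz": lzma.open, ".gz": gzip.open, ".bz2": bz2.open}


def count_messages(path: Path) -> int:
    """Count the messages in a plain or compressed mbox file."""
    opener = OPENERS.get(path.suffix)
    with tempfile.TemporaryDirectory() as tmpdir:
        if opener is None:
            plain = path
        else:
            # Decompress to a temporary file that mailbox.mbox can open
            plain = Path(tmpdir) / "plain.mbox"
            with opener(str(path), "rb") as src, open(plain, "wb") as dst:
                shutil.copyfileobj(src, dst)
        mbox = mailbox.mbox(str(plain), create=False)
        try:
            return len(mbox)
        finally:
            mbox.close()
```

The streaming subclass remains the better choice when the mailboxes are large, since it never writes a decompressed copy to disk.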
debian devel eng hacks pdo python
2018-08-14 15:08:02+02:00

DebConf 18

This is a quick recap of what happened during my DebConf 18.

24 July:

25 July:

26 July:

27 July:

28 July:

29 July:

30 July:

31 July:

01 August:

02 August:

03 August:

04 August:

debian eng pdo
2018-08-03 11:26:51+02:00

Multiple people

These are the notes from my DebConf18 talk

Slides are available in pdf and odp.

Abstract:

Starting from Debian, I have been for a long time part of various groups where diversity is accepted and valued, and it has been an invaluable supply of inspiration, allowing my identity to grow with unexpected freedom.

During the last year, I have been thinking passionately about things such as diversity, gender identity, sexual orientation, neurodiversity, and preserving identity in a group.

I would like to share some of those thoughts, and some of that passion.

Multiple people

"Debian is a relationship between multiple people", it (I) says at the entrance.

I grew up in a small town, being different was not allowed. Difference attracted "education", well-meaning concern, bullying.

Everyone knew everyone, there was no escape to a more fitting bubble, there was very little choice of bubbles.

I had to learn from a very young age the skill of making myself accepted by my group of peers.

It was an essential survival strategy.

Not being a part of the one group meant becoming a dangerous way of defining the identity of the group: "we are not like him". And one would face the consequences.

"Debian is a relationship between multiple people", it (I) says at the entrance.

Debian was one of the first opportunities for me to directly experience that.

Where I could begin to exist

Where I could experience the value and pleasure of diversity.

Including mine.

I am extremely grateful for that.

I love that.

This talk is also a big thank you to you all for that.

"Debian is a relationship between multiple people", it (I) says at the entrance.

Multiple people does not mean being all the same kind of person, all doing all the same kind of things.

Each of us is a different individual, each of us brings their unique creativity to Debian.

Classifying people

How would you describe a person?

There are binary definitions:

Labels: (like package sections)

Spectra: (like debtags)

We classify packages better than we classify people.

Identity / spectrums

I'm going to show a few examples of spectra; I chose them not because they are more or less important than others, but because they have recently been particularly relevant to me, and it's easier for me to talk about them.

If you wonder where you are in each spectrum, know that every place is ok.

Think about who you are, not about who you should be.

Gender identity

My non-binary awareness began with d-w and gender-neutral documentation.

Sexual orientation

https://en.wikipedia.org/wiki/Human_sexuality_spectrum

table of sexual preference prefixes combinations

Neurodiversity

I'll introduce neurodiversity by introducing allism

An allistic person learns subconsciously that ey is dependent on others for eir emotional experience. Consequently, ey tends to develop the habit of manipulating the form and content of social interactions in order to elicit from others expressions of emotion that ey will find pleasing when incorporated into eir mind.

https://fysh.org/~zefram/allism/allism_intro.txt

The more I reason about this (and I reasoned about this a lot, before, during and after therapy), the more I consider it a very rational adaptation, derived from a very clear message I received since I was a small child: nobody cared who I was, and to be accepted socially I needed to act a social part, which changed from group to group. Therefore, socially, I was wrong, and I had to work to deserve the right to exist.

What one usually sees of me in large groups or when out of comfort zone, is a social mask of me.

This paper is also interesting: analyzing tweets of people and their social circle, they got to the point of being able to predict what a person will write by running statistics only on what their friends are writing.

Is it measuring identity or social conformance?

Discussion about the autism spectrum tends to get very medical very fast, and the DSM often gets criticised for a tendency to turn diversity into mental health issues.

I stick to my experience from a different end of the spectrum, and there are various resources online to explore if you are interested in more.

Other spectra

I hope you get the idea about spectrum and identity.

There are many more, those were at the top of my head because of my recent experiences.

Skin color, age, wealth, disability, language proficiency, ...

How to deal with diversity

How to deal with my diversity

Let's all assume for a moment that each and every single one of us is ok.

I am ok.

You are ok.

You've been ok since the moment you were born.

Being born is all you need to deserve to exist.

You are ok, and you will always be ok.

Like every single person alive.

I'm ok.

You're ok.

We're all ok.

Hold on to that thought for the next 5 minutes. Hold onto it for the rest of your life.

Ok. A lot of problems are now solved. A lot of energy is not wasted anymore. What next?

Get to know myself

Awareness:

Get in touch with my feelings, get to know my needs.

Here's a simple algorithm to get to know your feelings and needs:

  1. If you are happy, take this phrase: I feel … because my need of … is being met
  2. If you are not happy, take this phrase: I feel … because my need of … is not being met
  3. Fill the first space with one of the words from here
  4. Fill the second space with one of the words from here
  5. Done!

To know more about Non-Violent Communication, I suggest this video

This other video I also liked.

Forget absolute truths, center on my own experience. Have a look here for more details.

Learn to communicate and express myself

Communicating/being oneself

Find out where to be myself

Look for safe spaces where I can activate parts of myself

Learn to protect myself

I will make mistakes acting as myself:

Learn to know my boundaries

Learn to recognise when they are being crossed

Negotiate

Use my anger to protect my integrity. I do not need to hurt others to protect myself

How to deal with the diversity of others

Diversity is a good thing

Once I realised I can be ok in my diversity, it was easier to appreciate the diversity of others

Opening to others doesn't need to sacrifice oneself.

Curiosity is a good default.

Do not assume. Assume better and I'll be disappointed. Assume worse and I'll miss good interactions

When facing the new and unsettling, use curiosity if I have the energy, or be aware that I don't, and take a step back

The goal of the game is to affirm all identities, especially oneself.

Love freely.

Expect nothing.

Liberate myself from imagined expectations.

YKINMKBYKIOK.

What is not acceptable

https://en.wikipedia.org/wiki/Paradox_of_tolerance

The paradox of tolerance, as a comic strip

Less well known is the paradox of tolerance: Unlimited tolerance must lead to the disappearance of tolerance. If we extend unlimited tolerance even to those who are intolerant, if we are not prepared to defend a tolerant society against the onslaught of the intolerant, then the tolerant will be destroyed, and tolerance with them. — In this formulation, I do not imply, for instance, that we should always suppress the utterance of intolerant philosophies; as long as we can counter them by rational argument and keep them in check by public opinion, suppression would certainly be unwise. But we should claim the right to suppress them if necessary even by force; for it may easily turn out that they are not prepared to meet us on the level of rational argument, but begin by denouncing all argument; they may forbid their followers to listen to rational argument, because it is deceptive, and teach them to answer arguments by the use of their fists or pistols. We should therefore claim, in the name of tolerance, the right not to tolerate the intolerant.

Use diversity for growth

Identifying where I am gives me more awareness about myself.

Identifying where I am shows me steps I might be interested in making.

Identity can change, evolve, move. I like the idea of talking about my identity in the past tense.

Diversity as empowerment, spectrum as empowerment

Take control of your narrative: what is your narrative? Do you like it? Does it tell you now what you're going to like next year, or in 5 years? Is it a problem if it does?

Conceptual space is not limited. Allocating mental space for new diversity doesn't reduce one's own mental space, but it expands it

Is someone trying to control your narrative? gaslighting, negging, patronising.

Debian and diversity

Impostor syndrome

Entering a new group: impostor syndrome. Am I good enough for this group?

Expectations, perceived expectations, perceived changes in perceived identity, perceived requirements on identity

I worked some months with a therapist to deal with that, to, it turned out, learn to give up the need to work to belong.

In the end, it was all there in the Diversity Statement:

No matter how I identify myself or how others perceive me: I am welcome, as long as I interact constructively with my community.

Ability of the group to grow, evolve, change, adapt, create

And here, according to Trout, was the reason human beings could not reject ideas because they were bad: “Ideas on Earth were badges of friendship or enmity. Their content did not matter. Friends agreed with friends, in order to express friendliness. Enemies disagreed with enemies, in order to express enmity.

“The ideas Earthlings held didn’t matter for hundreds of thousands of years, since they couldn’t do much about them anyway. Ideas might as well be badges as anything.

(Kurt Vonnegut, "Breakfast of Champions", 1973)

Keep one's identity in Debian

If your identity is your identity, and the group changes, it actually extends, because you keep being who you are.

If your identity is a function of the group identity, you become a control freak for where the group is going.

When people define their identity in terms of belonging to a group, that group cannot change anymore, because if it does, it faces resistance from its members, that will see their own perceived identity under threat.

The threat is that rituals, or practices, that validated my existence, that previously used to work, cease to function. systemd?

Free software

Us, and our users, we are a diverse ecosystem

Free Software is a diverse ecosystem

Free software can be a spectrum (free hardware, free firmware, free software, free javascript in browsers...)

Vision

Debian exists, and can move in a diverse and constantly changing upstream ecosystem

Vision / non limiting the future of Debian (if your narrative tells you what you're going to like next year, you might have a problem) (but before next year I'd like to get to a point that I can cope with X)

Debian doesn't need to be what people need to define their own identity, but it is defined by the relationship between different, diverse, evolving people

Appreciate diversity, because there's always something you don't know / don't understand, and more in the future.

Nobody can know all of Debian now, and in the future, if we're successful, we're going to get even bigger and more complex.

We're technically complex and diverse, we're socially complex and diverse. We got to learn to deal with that.

Because we're awesome. We got to learn to deal with that.

Ode to the diversity statement

https://www.debian.org/intro/diversity

debconf debian eng life pdo talk
2018-07-26 14:36:30+02:00

debug-on-porterbox

This work has been brought to you by the wonderful DebCamp.

I needed to reproduce a build issue on an i386 architecture, so I started going through the instructions for finding a porterbox and setting up a chroot.

And then I thought, this is long and boring. A program could do that.

So I created a program to do that:

$ debug-on-porterbox  --help
usage: debug-on-porterbox [-h] [--verbose] [--debug] [--cleanup] [--git]
                          [--reuse] [--dist DIST] [--host HOST]
                          arch [package]

set up a build environment to debug a package on a porterbox

positional arguments:
  arch           architecture name
  package        package name

optional arguments:
  -h, --help     show this help message and exit
  --verbose, -v  verbose output
  --debug        debug output
  --cleanup      cleanup a previous build, removing porterbox data and git
                 remotes
  --git          setup a git clone of the current branch
  --reuse        reuse an existing session
  --dist DIST    distribution (default: sid)
  --host HOST    hostname to use (autodetected by default)

On a source directory, you can run debug-on-porterbox i386 and it will:

The only thing left for you to do is to log into the machine debug-on-porterbox tells you, run the command porterbox tells you to enter the chroot, and debug away.

At the end you can clean everything up, including the remote chroot and the git remote in the local repo, with: debug-on-porterbox [--git] --cleanup i386

The code is on Salsa: have fun!

debian eng pdo sw
2018-06-13 13:43:09+02:00

Progress bar for file descriptors

I ran gzip on an 80GB file: it's processing, but who knows how much it has done so far, and when it will end? I wish gzip had a progressbar. Or MySQL. Or…

Ok. Now every program that reads a file sequentially can have a progressbar:

https://gitlab.com/spanezz/fdprogress

fdprogress

Print progress indicators for programs that read files sequentially.

fdprogress monitors file descriptor offsets and prints progressbars comparing them to file sizes.

Pattern can be any glob expression.

usage: fdprogress [-h] [--verbose] [--debug] [--pid PID] [pattern]

show progress from file descriptor offsets

positional arguments:
  pattern            file name to monitor

optional arguments:
  -h, --help         show this help message and exit
  --verbose, -v      verbose output
  --debug            debug output
  --pid PID, -p PID  PID of process to monitor
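
The idea can be sketched in a few lines of Python. This is an illustration under the assumption of a Linux /proc filesystem, not code taken from fdprogress; `fd_progress` is a hypothetical helper:

```python
import os
from pathlib import Path


def fd_progress(pid, fd):
    """Return (offset, size) for file descriptor fd of process pid."""
    # /proc/<pid>/fd/<fd> is a symlink to the open file; stat follows it
    size = os.stat("/proc/%d/fd/%d" % (pid, fd)).st_size
    pos = 0
    # /proc/<pid>/fdinfo/<fd> has a "pos:" line with the current offset
    fdinfo = Path("/proc/%d/fdinfo/%d" % (pid, fd))
    for line in fdinfo.read_text().splitlines():
        if line.startswith("pos:"):
            pos = int(line.split()[1])
    return pos, size
```

Polling this for the input file descriptor of a running process, and dividing the offset by the size, gives the fraction of the file read so far. Reading another user's /proc entries requires the appropriate permissions.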

pv

pv has a --watchfd option that does most of what fdprogress is trying to do: use that instead.

fivi

fivi also exists, with specific features to show progressbars for filter commands.

debian eng pdo sw
2018-05-15 14:06:06+02:00

Starting user software in X

There are currently many ways of starting software when a user session starts.

This is an attempt to collect a list of pointers to piece the big picture together. It's partial and some parts might be imprecise or incorrect, but it's a start, and I'm happy to keep it updated if I receive corrections.

x11-common

man xsession

systemd --user

dbus activation

X session manager

xdg autostart

Other startup notes

~/.Xauthority

To connect to an X server, a client needs to send a token from ~/.Xauthority, which proves that they can read the user's private data.

~/.Xauthority contains a token generated by the display manager and communicated to X at startup.

To view its contents, use xauth -i -f ~/.Xauthority list

debian eng pdo sw