Last 10 blog posts

Thanks to Moritz Naumann who found the issues and wrote a very useful report, I fixed a number of Cross Site Scripting vulnerabilities on https://debtags.debian.org.

The core of the issue was code like this in a Django view:

def pkginfo_view(request, name):
    pkg = bmodels.Package.by_name(name)
    if pkg is None:
        return http.HttpResponseNotFound("Package %s was not found" % name)
    # …

The default content-type of HttpResponseNotFound is text/html, and the string passed is sent as raw HTML with no escaping, so whatever is in the name variable, including arbitrary HTML and <script> code, gets injected into the page.

I was so used to Django doing proper auto-escaping that I missed this place in which it can't do that.
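To make the problem concrete, here is a minimal sketch that uses only `html.escape` from Python's standard library rather than Django. The `not_found_body` helper and the sample payload are made up for illustration. It shows what the response body looks like with and without escaping:

```python
from html import escape


def not_found_body(name, escaped):
    """Build the 404 body the way the view does (hypothetical helper)."""
    shown = escape(name) if escaped else name
    return "Package %s was not found" % shown


# A package "name" as an attacker could place it in the URL
payload = "<script>alert(1)</script>"

# Unescaped: the browser receives a live <script> element
print(not_found_body(payload, escaped=False))

# Escaped: the markup is rendered as inert text
print(not_found_body(payload, escaped=True))
```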

There are various things that can be improved in that code.

One could introduce escaping (and while one's at it, migrate the old % to format):

from django.utils.html import escape

def pkginfo_view(request, name):
    pkg = bmodels.Package.by_name(name)
    if pkg is None:
        return http.HttpResponseNotFound("Package {} was not found".format(escape(name)))
    # …

Alternatively, set content_type to text/plain:

def pkginfo_view(request, name):
    pkg = bmodels.Package.by_name(name)
    if pkg is None:
        return http.HttpResponseNotFound("Package {} was not found".format(name), content_type="text/plain")
    # …

Even better, raise Http404:

from django.http import Http404

def pkginfo_view(request, name):
    pkg = bmodels.Package.by_name(name)
    if pkg is None:
        raise Http404(f"Package {name} was not found")
    # …

Even better, use standard shortcuts and model functions if possible:

from django.shortcuts import get_object_or_404

def pkginfo_view(request, name):
    pkg = get_object_or_404(bmodels.Package, name=name)
    # …

And finally, though not security related, it's about time to switch to class-based views:

class PkgInfo(TemplateView):
    template_name = "reports/package.html"

    def get_context_data(self, **kw):
        ctx = super().get_context_data(**kw)
        ctx["pkg"] = get_object_or_404(bmodels.Package, name=self.kwargs["name"])
        # …
        return ctx

I then reviewed the other Django sites I maintain, in case I had made the same mistake there too.

Several sites have started disabling paste in input fields, mostly password fields, but also other fields for no apparent reason.

Random links on the topic:

  • https://developers.google.com/web/tools/lighthouse/audits/password-pasting
  • https://www.ncsc.gov.uk/blog-post/let-them-paste-passwords
  • https://www.troyhunt.com/the-cobra-effect-that-is-disabling/
  • https://www.wired.com/2015/07/websites-please-stop-blocking-password-managers-2015/

This said, I am normally uneasy about copy-pasting passwords, as any X window can sniff the clipboard contents at any time, and I like password managers like impass that would type it for you instead of copying it to the clipboard.

However, today I ended up far more frustrated than I could handle after filling in nonsensical, always slightly different 17-digit INPS payment codelines in input fields that disabled paste for no reason whatsoever (they are not secret).

I thought "never again", I put together some code from impass and wmctrl and created xtypeinto:

$ ./xtypeinto --help
usage: xtypeinto [-h] [--verbose] [--debug] [string]

Type text into a window

positional arguments:
  string         string to type (default: stdin)

optional arguments:
  -h, --help     show this help message and exit
  --verbose, -v  verbose output
  --debug        debug output

Pass a string to xtypeinto as an argument, or as standard input.

xtypeinto will show a crosshair to pick a window, and the text will be typed into that window.

Focus the right field before running xtypeinto, so that the text is typed where you need it.

The LineageOS updater notified me that there will be no more updates for LineageOS 14, because development for my phone has moved to LineageOS 16, so I set aside some time and carefully followed the upgrade instructions.

I now have a phone with LineageOS 16, but the whole modem subsystem does not work.

Advice on #lineageos was that "the wiki instructions are often a bit generic.. offical thread often has the specific details".

Official thread is here, and the missing specific detail was "Make sure you had Samsung's Oreo firmware bootloader and modem before installing this.".

It looks like nothing has installed firmware updates since the stock Android that came with my phone ages ago. I can either wipe everything and install a stock Android to let it do the upgrade, then replace it with LineageOS, or try a firmware upgrade.

This link has instructions for firmware upgrades using heimdall, which is in Debian, instead of Odin, which runs on Windows.

Finding firmware is embarrassing. It only seems to be available from links on shady download sites, or commercial sites run by who knows whom. I verify sha256sums on LineageOS images, F-Droid has reproducible builds, but at the base of this wonderful stack there's going to be a blob downloaded off some forum on the internet.

In this case, this link points to some collection of firmware blobs.

I downloaded the pack and identified the ones for my phone, then unpacked the tar files and uncompressed the lz4 blobs.

With heimdall, I identified the mapping from partition names to blob names:

heimdall print-pit --no-reboot

Then I did the flashing:

heimdall flash --resume --RADIO modem.bin --CM cm.bin --PARAM param.bin --BOOTLOADER sboot.bin

The first time flashing didn't work, and I got stuck in download mode. This explains how to get out of download mode (power + volume down for 10s).

Second attempt worked fine, and now I have a working phone again:

heimdall flash --RADIO modem.bin --CM cm.bin --PARAM param.bin --BOOTLOADER sboot.bin

«Bullshit is unavoidable whenever circumstances require someone to talk without knowing what he is talking about. Thus the production of bullshit is stimulated whenever a person’s obligations or opportunities to speak about some topic are more excessive than his knowledge of the facts that are relevant to that topic.

This discrepancy is common in public life, where people are frequently impelled— whether by their own propensities or by the demands of others—to speak extensively about matters of which they are to some degree ignorant.

Closely related instances arise from the widespread conviction that it is the responsibility of a citizen in a democracy to have opinions about everything, or at least everything that pertains to the conduct of his country’s affairs.

The lack of any significant connection between a person’s opinions and his apprehension of reality will be even more severe, needless to say, for someone who believes it his responsibility, as a conscientious moral agent, to evaluate events and conditions in all parts of the world.»

(From Harry G. Frankfurt's On Bullshit)

Opinion Sort

In a world where it is more important to have a quick opinion than a thorough understanding, I propose this novel sorting algorithm.

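from typing import Any, Callable, List

# NotSortedException is whatever post() raises when it disagrees; it is
# expected to carry the indices of two out-of-order elements


def opinion_sort(items: List[Any], post: Callable[[List[Any]], None]) -> None:
    """
    items: a list of elements to sort in place
    post: a callable that requires a sorted list as input and does
          proper error checking, as they should do
    """
    # Form an opinion by looking at the first two elements only
    if len(items) > 1 and items[0] > items[1]:
        items[0], items[1] = items[1], items[0]
    while True:
        try:
            # Assert opinion: "It is a sorted list!"
            post(items)
        except NotSortedException as e:
            # Someone disagrees, and they have a good point
            i, j = e.unsorted_idx_1, e.unsorted_idx_2
            items[i], items[j] = items[j], items[i]
        else:
            break
    # The list is now sorted, and the callable has to agree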

This algorithm is the most efficient sorting algorithm, because it can sort a list by only looking at the first two elements.
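For the algorithm to have someone to argue with, `post` has to be the kind of callable the docstring asks for. A minimal sketch of one follows. `NotSortedException` is not defined in the post, so the class below, with the two index attributes the algorithm reads, is an assumption, and so is the name `strict_consumer`.

```python
class NotSortedException(Exception):
    """Assumed shape: carries the indices of two out-of-order elements."""

    def __init__(self, unsorted_idx_1, unsorted_idx_2):
        super().__init__(
            "elements %d and %d are out of order" % (unsorted_idx_1, unsorted_idx_2))
        self.unsorted_idx_1 = unsorted_idx_1
        self.unsorted_idx_2 = unsorted_idx_2


def strict_consumer(items):
    """Refuse any list that is not sorted, pointing at the first bad pair."""
    for idx in range(len(items) - 1):
        if items[idx] > items[idx + 1]:
            raise NotSortedException(idx, idx + 1)
    # A real consumer would now go on and use the sorted list
```

Since this checker always reports an adjacent pair, every swap removes exactly one inversion, so the exchange of opinions is guaranteed to end.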

I might have accidentally forked live-wrapper.

I sometimes need to build Debian live iso images for work, and some time ago got into an inconvenient situation in which live-wrapper required software not available in Debian anymore, and there was no obvious replacement for it, so I forked it and tried to forward-port things and fill the gaps.

Over time this kind of grew: I ported it to python3, removed difficult dependencies, added several new features that I needed, and removed several that I didn't need.

I recently had a chance to document the result, which makes it good enough to be announced, so here it is. The README has an introduction and links to documentation, recipes and examples.

I'm not actively maintaining this except when work requires, so if there's anything extra you need for it, the best way to get it is via a merge request.

I'm not sure how much of live-wrapper is still left in the fork. If anyone starts using it, we should probably look into a new name.

Updated: re-run 2019-04-02

Updated: re-run after merging Andreas Tille's addresses

Updated: re-run after fixing a bug in the code that skips signatures

Updated: re-run on a mailbox with only the post-nomination discussion.

I made a script to compute some statistics on debian-vote's election discussions.

Here are the results as of 2019-04-02 12:00 UTC+1:

This is the number of mails sent by each person who posted more than 2 messages:

Name                              Mails
=======================================
Jonathan Carter                      33
Joerg Jaspert                        31
Martin Michlmayr                     20
Sam Hartman                          18
Andreas Tille                        11
Lucas Nussbaum                        8
Stefano Zacchiroli                    8
Sean Whitton                          7
Jose Miguel Parrella                  6
Jonas Meurer                          5
Paulo Henrique de Lima Santana        5
Laura Arjona Reina                    4
martin f krafft                       4
Louis-Philippe Véronneau              3
Alexander Wirt                        3
Ansgar                                3
Ian Jackson                           3
Paul Wise                             3
Raphael Hertzog                       3

These are the sums and averages of lines of non-quoted message text sent by each person:

Name                                Sum   Avg
=============================================
Jonathan Carter                    1475    45
Sam Hartman                         799    44
Joerg Jaspert                       684    22
Martin Michlmayr                    665    33
Andreas Tille                       287    26
Lucas Nussbaum                      204    26
Jose Miguel Parrella                167    28
Ian Jackson                         140    47
Stefano Zacchiroli                  127    16
Sean Whitton                        109    16
Jonas Meurer                         96    19
Paulo Henrique de Lima Santana       89    18
Laura Arjona Reina                   80    20
Ansgar                               68    23
martin f krafft                      68    17
Louis-Philippe Véronneau             67    22
Raphael Hertzog                      43    14
Alexander Wirt                       36    12
Paul Wise                            23     8
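
The script itself is not shown here. As a rough sketch of what counting non-quoted lines can look like, the function below skips lines starting with `>`, stops at the conventional `-- ` signature separator, and ignores blank lines. All three rules, and the function name, are assumptions and not necessarily what the script does:

```python
def count_own_lines(body):
    """Count lines of a plain-text mail body written by the sender."""
    count = 0
    for line in body.splitlines():
        if line == "-- ":
            # Conventional signature separator: the rest is signature
            break
        if line.startswith(">"):
            # Quoted text from an earlier message
            continue
        if not line.strip():
            # Blank lines carry no text
            continue
        count += 1
    return count
```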

These are the top keywords of messages sent by the candidates so far, scored by an improvised TFIDF metric:

Sam Hartman
  valuable, people, helping, things, consensus, project, think
Jonathan Carter
  software, think, make, help, time, free, project
Joerg Jaspert
  thats, nice, whatever, stuff, people, simple, need
Martin Michlmayr
  believe, foss, where, maybe, people, world, these
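
The metric used for these keywords is improvised and not shown. For reference, a textbook-style TF-IDF can be sketched with the standard library alone. The tokenisation, the `tf * log(N / df)` weighting and all names below are assumptions, and the results will not match the lists above:

```python
import math
import re
from collections import Counter


def top_keywords(docs, count=7):
    """
    docs: dict mapping an author to all the text they wrote.
    Returns a dict mapping each author to their highest-scoring words.
    """
    # Term frequencies per author, on lowercase alphabetic words
    tf = {name: Counter(re.findall(r"[a-z]+", text.lower()))
          for name, text in docs.items()}

    # Document frequency: in how many authors' texts each word appears
    df = Counter()
    for counts in tf.values():
        df.update(counts.keys())

    result = {}
    for name, counts in tf.items():
        scores = {word: freq * math.log(len(docs) / df[word])
                  for word, freq in counts.items()}
        # Sort by descending score, then alphabetically for stable output
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[name] = ranked[:count]
    return result
```

Words used by every author get a score of zero, which is what pushes generic vocabulary out of each author's list.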

A little gitpython recipe to list the paths of all files in a commit:

#!/usr/bin/python3

import git
from pathlib import Path
import sys


def list_paths(root_tree, path=Path(".")):
    for blob in root_tree.blobs:
        yield path / blob.name
    for tree in root_tree.trees:
        yield from list_paths(tree, path / tree.name)


repo = git.Repo(".", search_parent_directories=True)
commit = repo.commit(sys.argv[1])
for path in list_paths(commit.tree):
    print(path)

It can be a good base, for example, for writing a script that, given two git branches, shows which django migrations are in one and not in the other, without doing any git checkout of the code.
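
As a sketch of that idea, the comparison step can be written as a pure function over two collections of paths, such as the ones yielded by `list_paths` for the tip commit of each branch. The function name, and the assumption that migrations are `.py` files directly inside a `migrations` directory, are illustrative:

```python
from pathlib import Path


def migrations_only_in_first(paths_a, paths_b):
    """
    Return the sorted migration paths present in paths_a but not in paths_b.
    """
    def migrations(paths):
        # Keep only <app>/migrations/*.py, leaving out the package marker
        return {
            Path(p) for p in paths
            if Path(p).parent.name == "migrations"
            and Path(p).suffix == ".py"
            and Path(p).name != "__init__.py"
        }

    return sorted(migrations(paths_a) - migrations(paths_b))
```

With the recipe above, this could be called as `migrations_only_in_first(list_paths(repo.commit("branch-a").tree), list_paths(repo.commit("branch-b").tree))`, where the branch names are placeholders.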

One of the programs I maintain for work is a GUI data browser that uses Tornado as a backend and a web browser as a front-end.

It is quite convenient to start the command and have the browser open automatically on the right URL. It's quite annoying to start the command and be told that the default port is already in use.

I've needed this trick quite often, also when writing unit tests, and it's time I note it down somewhere, so it's easier to find than going through Tornado's unittest code where I found it the first time.

This is how to start Tornado on a free random port:

from tornado.options import define, options
import tornado.netutil
import tornado.httpserver

define("web_port", type=int, default=None, help="listening port for web interface")

application = Application(self.db_url)

if options.web_port is None:
    # Port 0 asks the kernel for any free port
    sockets = tornado.netutil.bind_sockets(0, '127.0.0.1')
    # getsockname() returns (host, port) for IPv4 sockets
    self.web_port = sockets[0].getsockname()[1]
    server = tornado.httpserver.HTTPServer(application)
    server.add_sockets(sockets)
else:
    server = tornado.httpserver.HTTPServer(application)
    server.listen(options.web_port)
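
The mechanism underneath is not specific to Tornado: binding to port 0 asks the kernel to pick any free port, and `getsockname()` reports which one was chosen. A minimal standard-library sketch follows; the `bind_free_port` helper is a made-up name:

```python
import socket


def bind_free_port(host="127.0.0.1"):
    """Bind a listening TCP socket to a kernel-chosen free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 means "any free port"
    sock.bind((host, 0))
    sock.listen()
    # For AF_INET, getsockname() returns a (host, port) tuple
    port = sock.getsockname()[1]
    return sock, port


sock, port = bind_free_port()
print("http://127.0.0.1:%d/" % port)
sock.close()
```

Keeping the socket open and handing it to the server, as `add_sockets` does above, avoids the race of closing it and hoping the port is still free a moment later.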

I am writing a little application server for microservices written as compiled binaries, and I would like to log execution statistics from getrusage(2).

The application server is written using asyncio, and processes are managed using asyncio subprocesses.

Unfortunately, asyncio uses os.waitpid instead of os.wait4 to reap child processes, and to get rusage information one has to delve into the asyncio innards, and provide a custom ChildWatcher implementation. Here's how I did it:

import asyncio
from asyncio.log import logger
from contextlib import contextmanager
import os


class ExtendedResults:
    def __init__(self):
        self.rusage = None
        self.returncode = None


class SafeChildWatcherWithRusage(asyncio.SafeChildWatcher):
    """
    SafeChildWatcher that uses os.wait4 to also get rusage information.
    """
    rusage_results = {}

    @classmethod
    @contextmanager
    def monitor(cls, proc):
        """
        Return an ExtendedResults that gets filled when the process exits
        """
        assert proc.pid > 0
        pid = proc.pid
        extended_results = ExtendedResults()
        cls.rusage_results[pid] = extended_results
        try:
            yield extended_results
        finally:
            cls.rusage_results.pop(pid, None)

    def _do_waitpid(self, expected_pid):
        # The original is in asyncio/unix_events.py; on new python versions, it
        # makes sense to check changes to it and port them here
        assert expected_pid > 0

        try:
            pid, status, rusage = os.wait4(expected_pid, os.WNOHANG)
        except ChildProcessError:
            # The child process is already reaped
            # (may happen if waitpid() is called elsewhere).
            pid = expected_pid
            returncode = 255
            # No rusage is available for a process reaped elsewhere
            rusage = None
            logger.warning(
                "Unknown child process pid %d, will report returncode 255",
                pid)
        else:
            if pid == 0:
                # The child process is still alive.
                return

            returncode = self._compute_returncode(status)
            if self._loop.get_debug():
                logger.debug('process %s exited with returncode %s',
                             expected_pid, returncode)

        extended_results = self.rusage_results.get(pid)
        if extended_results is not None:
            extended_results.rusage = rusage
            extended_results.returncode = returncode

        try:
            callback, args = self._callbacks.pop(pid)
        except KeyError:  # pragma: no cover
            # May happen if .remove_child_handler() is called
            # after os.waitpid() returns.
            if self._loop.get_debug():
                logger.warning("Child watcher got an unexpected pid: %r",
                               pid, exc_info=True)
        else:
            callback(pid, returncode, *args)

    @classmethod
    def install(cls):
        loop = asyncio.get_event_loop()
        child_watcher = cls()
        child_watcher.attach_loop(loop)
        asyncio.set_child_watcher(child_watcher)

To use it:

from .hacks import SafeChildWatcherWithRusage
SafeChildWatcherWithRusage.install()

...

    @coroutine
    def run(self, *args, **kw):
        kw["stdin"] = asyncio.subprocess.PIPE
        kw["stdout"] = asyncio.subprocess.PIPE
        kw["stderr"] = asyncio.subprocess.PIPE
        self.started = time.time()

        self.proc = yield from asyncio.create_subprocess_exec(*args, **kw)

        from .hacks import SafeChildWatcherWithRusage
        with SafeChildWatcherWithRusage.monitor(self.proc) as results:
            yield from asyncio.tasks.gather(
                self.write_stdin(self.proc.stdin),
                self.read_stdout(self.proc.stdout),
                self.read_stderr(self.proc.stderr)
            )
        self.returncode = yield from self.proc.wait()
        self.rusage = results.rusage
        self.ended = time.time()
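
Outside of asyncio, the difference between the two calls is easy to see in isolation. The sketch below is Unix-only and uses only the standard library; `run_with_rusage` is a hypothetical helper, not part of the application server:

```python
import os
import sys


def run_with_rusage(argv):
    """Run argv, wait for it, and return (returncode, rusage)."""
    # Start the child without waiting; spawnvp returns its pid
    pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)
    # Unlike os.waitpid, os.wait4 also returns a resource usage struct
    _, status, rusage = os.wait4(pid, 0)
    if os.WIFEXITED(status):
        returncode = os.WEXITSTATUS(status)
    else:
        # Killed by a signal: mirror subprocess's negative return codes
        returncode = -os.WTERMSIG(status)
    return returncode, rusage


returncode, rusage = run_with_rusage([sys.executable, "-c", "pass"])
print(returncode, rusage.ru_utime, rusage.ru_stime, rusage.ru_maxrss)
```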

Debian conveniently distributes JavaScript libraries, and expects packaged software to use them rather than embedding its own copy.

Here is a convenient custom StaticFileHandler for Tornado that looks for the Debian-distributed versions of JavaScript libraries, and falls back to the vendored versions if they are not found:

from tornado import web
import pathlib


class StaticFileHandler(web.StaticFileHandler):
    """
    StaticFileHandler that allows overriding paths in the static directory with
    system provided versions
    """
    SYSTEM_ASSET_PATH = pathlib.Path("/usr/share/javascript")

    @classmethod
    def get_absolute_path(cls, root, path):
        path = pathlib.PurePath(path)
        if not path.parts:
            return super().get_absolute_path(root, path)

        system_dir = cls.SYSTEM_ASSET_PATH.joinpath(path.parts[0])
        if system_dir.is_dir():
            # If that asset directory exists in the system, look for things in
            # there
            return cls.SYSTEM_ASSET_PATH.joinpath(path)
        else:
            # Else go ahead with the default static dir
            return super().get_absolute_path(root, path)

    def validate_absolute_path(self, root, absolute_path):
        """
        Rewrite of tornado's validate_absolute_path not to raise an error for
        paths in /usr/share/javascript/
        """
        root = pathlib.Path(root)
        absolute_path = pathlib.Path(absolute_path)

        is_system_root = absolute_path.parts[:len(self.SYSTEM_ASSET_PATH.parts)] == self.SYSTEM_ASSET_PATH.parts
        is_static_root = absolute_path.parts[:len(root.parts)] == root.parts

        if not is_system_root and not is_static_root:
            raise web.HTTPError(403, "%s is not in root static directory or system assets path",
                                self.path)

        if absolute_path.is_dir() and self.default_filename is not None:
            # need to look at the request.path here for when path is empty
            # but there is some prefix to the path that was already
            # trimmed by the routing
            if not self.request.path.endswith("/"):
                self.redirect(self.request.path + "/", permanent=True)
                return
            absolute_path = absolute_path.joinpath(self.default_filename)
        if not absolute_path.exists():
            raise web.HTTPError(404)
        if not absolute_path.is_file():
            raise web.HTTPError(403, "%s is not a file", self.path)
        return str(absolute_path)

This is how to use it:

class DebianApplication(tornado.web.Application):
    def __init__(self, *args, **settings):
        from .static import StaticFileHandler
        settings.setdefault("static_handler_class", StaticFileHandler)
        super().__init__(*args, **settings)

And from HTML it's simply a matter of matching the first path component to what is used by Debian's packages under /usr/share/javascript:

    <link rel="stylesheet" href="{{static_url('bootstrap4/css/bootstrap.min.css')}}">
    <script src="{{static_url('jquery/jquery.min.js')}}"></script>
    <script src="{{static_url('popper.js/umd/popper.min.js')}}"></script>
    <script src="{{static_url('bootstrap4/js/bootstrap.min.js')}}"></script>

I find it quite convenient: this way I can start writing prototype code without worrying about fetching JavaScript libraries to bundle.

I only need to start worrying about it if I need to deploy outside of Debian, or to old stable versions of Debian that don't contain the required JavaScript dependencies. In that case, I just cp -r from a working /usr/share/javascript into Tornado's static directory, and I'm done.
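
The lookup rule itself is independent of Tornado and can be sketched as a plain function. `resolve_asset` and its parameters are illustrative names, and unlike the handler above it performs no validation of the resulting path:

```python
from pathlib import Path, PurePath


def resolve_asset(static_root, path, system_root="/usr/share/javascript"):
    """
    Map a static URL path to a file, preferring system-provided libraries.
    """
    static_root = Path(static_root)
    system_root = Path(system_root)
    path = PurePath(path)
    if path.parts and (system_root / path.parts[0]).is_dir():
        # The first path component names a library installed system-wide
        return system_root / path
    # Otherwise fall back to the application's own static directory
    return static_root / path
```

Because the decision is made on the first path component only, copying a library directory into the static directory keeps the same URLs working on systems where the package is not installed.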