Index of categories

pdo

Pages exported to http://planet.debian.org.

Pressure

I've just stumbled on this bit that seems relevant to me:

Insist on using objective criteria

The final step is to use mutually agreed and objective criteria for evaluating the candidate solutions. During this stage they encourage openness and surrender to principle not pressure.

http://www.wikisummaries.org/Getting_to_Yes

I find the concept of "pressure" very relevant, and I like the idea of discussions being guided by content rather than pressure.

I'm exploring the idea of filing most of the things described in codes of conduct under this concept of "pressure", and I'm toying with looking at gender or race issues from the point of view of making people surrender to pressure.

In that context, most codes of conduct seem to give only a partial definition of "pressure". I've been uncomfortable at DebConf this year, because the conference's PG12 code of conduct would cause me trouble for talking about what lessons Debian can learn from consent culture in BDSM communities, but it would still allow situations in which people have to yield to pressure, as long as the pressure avoids the behaviours blacklisted by the CoC.

Pressure could be the phrase "you are wrong" without further explanation, spoken by someone with more reputation than I have in a project. It could be someone with the time for writing ten emails a day discussing with someone with barely the time to write one. It could be someone using elaborate English discussing with someone who needs to look up every other word in a dictionary. It could be just ignoring emails from people who have issues different than mine.

I like the idea of having "please do not use pressure to bring your issues forward" written somewhere, rather than spending time blacklisting all possible ways of pressuring people.

I love how the Diversity Statement elegantly captures all this where it says: «We welcome contributions from everyone as long as they interact constructively with our community.»

However, I also find it hard not to fall back to using pressure, even just for self-preservation: I have often found myself in the situation of having the responsibility to get a job done, and not having the time or emotional resources to even read the emails I get about the subject. All my life I've seen people in such a situation yell "shut up and let me work!", and I feel a burning thirst for other kinds of role models.

A CoC saying "do not use pressure" would not help me much here, but being around people who do that, learning to notice when and how they do it, and knowing that I could learn from them, that certainly would.

If you can link to examples, I'd like to add them here.

Posted Tue Sep 23 16:18:22 2014 Tags:

Laptop, I demand that you suspend!

Dear Lazyweb,

Sometimes some application prevents suspend on my laptop. I want to disable that feature: how?

I understand that there may exist some people who like that feature. I, on the other hand, consider a scenario like this inconceivable:

  1. I'm on a plane working with my laptop, the captain announces preparations for landing, so I quickly hit the suspend button (or close the lid) on my laptop and stow it away.
  2. One connecting flight later, I pick up my backpack, I feel it unusually hot and realise that my laptop has been on all along, and is now dead from either running out of battery or thermal protection.
  3. I think things that, if spoken aloud in front of a pentacle, might invoke major lovecraftian horrors.

I do not want this scenario to ever be possible. I want my suspend button to suspend the laptop no matter what. If a process does not agree, I'm fine with suspending it anyway, or killing it.

If I want my laptop to suspend, I generally have a good enough real-world reason for it, and I cannot conceive that software could ever be allowed to override my command.

How do I change this? I don't know if I should look into systemd, upowerd, pm-utils, the kernel, the display manager or something else entirely. I worry that I cannot even figure out where to start looking for a solution.

This happened to me multiple times already, and I consider it ridiculous. I know that it can cause me data loss. I know that it can cause me serious trouble in case I was relying on having some battery or state left at my arrival. I know that depending on what is in my backpack, this could also be physically dangerous.

So, what knob do I tweak for this? How do I make suspend reliable?

Update

Systemd has an inhibitor system, and systemd-inhibit --list only lists 'delay' blocks on my system. It is an interesting feature that seems to be implemented in the right way, and it could mean that I can finally get my screen locked before the system is suspended.

It is possible to configure the inhibitor system in /etc/systemd/logind.conf, including ways to ignore inhibitors, and a maximum time after which inhibitors are ignored if not yet released.
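For reference, these are the knobs I mean (option names from logind.conf(5); the values are just an illustration of making the suspend request win):

# /etc/systemd/logind.conf (excerpt)
HandleSuspendKey=suspend
HandleLidSwitch=suspend
# Honour the suspend request even if a process holds a block inhibitor
SuspendKeyIgnoreInhibited=yes
LidSwitchIgnoreInhibited=yes
# Give delay inhibitors (like screen lockers) at most 5 seconds
InhibitDelayMaxSec=5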

Try as I might to run everything that I was running on the plane that time, I could not manage to see anything take an inhibitor block that could have prevented my suspend. I now suspect that what happened to me was a glitch caused by something else (hardware? kernel? cosmic rays!) during that specific suspend.

When I had this issue in the past, the infrastructure was far more primitive than what we have now with systemd, so I guess that when writing my blog post I simply correlated my old experiences with a one-off suspend glitch.

If I want to investigate or tune further, to test the situation with a runaway block, I can use commands like systemd-inhibit --mode=block sleep 3600.

I'm quite happy to see that we're moving to a standard and sane system for this. In the meantime, I have learnt that pm-utils has now become superfluous and can be deinstalled, and so can acpi-support and acpi-support-base.

Thanks vbernat, mbiebl, and ah, on #debian-devel for all the help.

Posted Thu Sep 11 14:32:40 2014 Tags:

On relationships

Good relationships are like a good video game

with an easy, intuitive interface

and lots of interesting content.

(Lynoure)

Posted Wed Aug 6 16:47:40 2014 Tags:

Fear of losing

If I am afraid of breaking my laptop, then I may leave it at home, and it will be as if I didn't have a laptop.

If I am afraid of losing faith, then I may closely follow the dictates of the church. I will be keeping the church's faith, but not mine.

If I am afraid of losing my children, they may have to run away from me to be free to grow into adults.

If I am afraid of losing my inner child, then I might not expose it to the world, and so my inner child will never live. I will just be a box, a mask of what the world expects from me, that shelters and cages the Me that would like to live.

If I am afraid of losing you, then I may get obsessed with preserving the beautiful image I have of you. I may stop seeing, experiencing you as you live, think, grow. I may become afraid of your depth, of your being different each day, of your being alive. I may end up with cherishing a perfect image of you in my head, while you will have become a stranger to me.

I am really asking when I can accept answers.

I am really living when I can accept myself.

I am really loving when I can accept you.

Posted Mon Jun 16 09:21:52 2014 Tags:

Perfection

We like perfection.

Perfection is the ultimate achievement, there is nothing beyond.

Perfection is fully understood. It is not going to change, it is fact, we can rely upon it.

Perfection is final. Perfection is death.

Ideas can be perfect, and perfect ideas are easy to understand.

Perfect ideas are final and unchangeable. Perfect ideas are hard to correct, hard to refute.

Perfect ideas spread easily. They are helpful. They shed light on a little corner of our world, give it shape. They bring stability. They can be relied upon. Perfect ideas make good memes.

Perfect ideas are shared standards through which we act, interact, coordinate, cooperate. They don't change, so they are a solid base for habits, that make a bit of our life a little easier.

That we should not kill, is a perfect idea. So is racism. So are the ten commandments, so, for many, is love.

Thanks Lynoure for saying the right thing at the right time.

Posted Wed Jun 4 22:53:37 2014 Tags:

When I said "I love you"

All people ever say is: thank you (a celebration of life) and please (an opportunity to make life more wonderful). (Marshall Rosenberg)

I have said "I love you" many times in my life, and many times I have failed to say it, because, for me, it is not an easy thing to say.

It is not easy when I have no idea what the other person will make of it: will they be frightened? Will they feel awkward around me afterwards? Will they disappear from my life?

But do I know what I myself mean when I say it?

I have said "I love you" because I thought you somehow expected it of me. "please, consider me worth of you".

I have said "I love you" to beg for affection. "please, love me back".

I have said "I love you" because I was grateful to you for existing in my life. "thank you".

I now understand why it has not been easy for me to say "I love you" when I was feeling, or imagining, that I had to say it.

I now understand why I have sometimes made myself awkward, as I was begging.

I now understand why, when I said "I love you" out of gratitude, when I said it to celebrate that you exist in my life, that's when I felt no trouble, no fear, and when I felt that my words really were fitting with what I was feeling and what I was wanting to say.

Posted Mon Jun 2 10:18:56 2014 Tags:

Wheezy for industrial software development

I'm helping with setting up a wheezy-based toolchain for industrial automation.

The basic requirements are: live-build, C++11, Qt 5.3, and a frozen internal wheezy mirror.

debmirror

A good part of a day's work was lost because of #749734 and possibly #628779. The mirror rebuild is still ongoing; fingers crossed.

This is Italy, and you can't simply download 21GB of debs just to see how it goes.
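For the record, the mirror is built with a debmirror invocation along these lines (host, sections and target path are just our setup, not part of the bug reports):

debmirror --method=http --host=ftp.it.debian.org --root=debian \
          --dist=wheezy --section=main,contrib,non-free \
          --arch=i386,amd64 --progress /srv/mirror/debian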

C++11

Stable toolchains for C++11 now exist and have gained fast adoption. It makes sense: given what is in C++11, it is unthinkable nowadays to start a new C++ project with the old standard.

C++11 is supported by g++ 4.8+ or clang 3.3+. Neither is available on wheezy or wheezy-backports.

Backports of g++ 4.8 exist only for Ubuntu 12.04, but they are uninstallable on wheezy due, at least, to a different libc6. I tried rebuilding g++ 4.8 on wheezy but quickly gave up.

clang 3.3 has a build dependency on g++ 4.8. LOL.

However, LLVM provides an APT repository with their most recent compiler, and it works, too. C++11 problem solved!
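A sketch of the setup, following the instructions published on llvm.org/apt at the time (suite name and key URL as documented there):

# /etc/apt/sources.list.d/llvm.list
deb http://llvm.org/apt/wheezy/ llvm-toolchain-wheezy main

wget -O - http://llvm.org/apt/llvm-snapshot.gpg.key | apt-key add -
apt-get update && apt-get install clang-3.5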

Qt 5.3

Qt 5.3 is needed because of the range of platforms it can target. There is no wheezy backport that I can find.

I cannot simply get it from Qt's Download page and install it, since we need it packaged, to build live ISOs with it.

I'm attempting to backport the packages from experimental to wheezy.

Here are its build dependencies:

libxcb-1.10 (needed by qt5)

Building this is reasonably straightforward.

libxkbcommon 0.4.0 (needed by qt5)

The version from jessie builds fine on wheezy, provided you remove --fail-missing from the dh_install invocation.
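That is, a one-line change to debian/rules (a sketch, assuming the package overrides dh_install explicitly):

 override_dh_install:
-    dh_install --fail-missing
+    dh_install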

libicu 52.1 (needed by harfbuzz)

The jessie packages build on wheezy, provided that mentions of clang are deleted from source/configure.ac, since it fails to build with clang 3.5 (the one currently available for wheezy on llvm.org).

libharfbuzz-dev

Backporting this is a bloodbath: the Debian packages from jessie depend on a forest of gobject hipsterisms of doom, all unavailable on wheezy. I gave up.

qt 5.3

qtbase-opensource-src-5.3.0+dfsg can be made to build with an embedded version of harfbuzz, with just this change:

diff -Naur a/debian/control b/debian/control
--- a/debian/control    2014-05-20 18:48:27.000000000 +0200
+++ b/debian/control    2014-05-29 17:45:31.037215786 +0200
@@ -28,7 +28,6 @@
                libgstreamer-plugins-base0.10-dev,
                libgstreamer0.10-dev,
                libgtk2.0-dev,
-               libharfbuzz-dev,
                libicu-dev,
                libjpeg-dev,
                libmysqlclient-dev,
diff -Naur a/debian/rules b/debian/rules
--- a/debian/rules  2014-05-18 01:56:37.000000000 +0200
+++ b/debian/rules  2014-05-29 17:45:25.738634371 +0200
@@ -108,7 +108,6 @@
                -plugin-sql-tds \
                -system-sqlite \
                -platform $(platform_arg) \
-               -system-harfbuzz \
                -system-zlib \
                -system-libpng \
                -system-libjpeg \

(thanks Lisandro Damián Nicanor Pérez Meyer for helping me there!)

There are probably going to be further steps in the Qt5 toolchain.

Actually, let's try prebuilt binaries

The next day, with a fresh mind, we realised that it is preferable to reduce our tampering with the original wheezy to a minimum. Our current plan is to use wheezy's original Qt and Qt-using packages, and Qt's prebuilt binaries in /opt for all our custom software.

We ran Qt's installer, tarred the result, and wrapped it in a Debian package like this:

$ cat debian/rules
#!/usr/bin/make -f

QT_VERSION = 5.3

%:
    dh $@

override_dh_auto_build:
    dh_auto_build
    sed -re 's/@QT_VERSION@/$(QT_VERSION)/g' debian-rules.inc.in > debian-rules.inc

override_dh_auto_install:
    dh_auto_install
    # Download and untar the prebuilt Qt5 binaries
    install -d -o root -g root -m 0755 debian/our-qt5-sdk/opt/Qt
    curl http://localserver/Qt$(QT_VERSION).tar.xz | xz -d | tar -C debian/our-qt5-sdk/opt -xf -
    # Move the runtime part to our-qt5
    install -d -o root -g root -m 0755 debian/our-qt5/opt/Qt
    mv debian/our-qt5-sdk/opt/Qt/$(QT_VERSION) debian/our-qt5/opt/Qt/
    # Makes dpkg-shlibdeps work on packages built with Qt from /opt
    # Hack. Don't try this at home. Don't ever do this unless you
    # know what you are doing. This voids your warranty. If you
    # know what you are doing, you won't do this.
    find debian/our-qt5/opt/Qt/$(QT_VERSION)/gcc_64/lib -maxdepth 1 -type f -name "lib*.so*" \
        | sed -re 's,^.+/(lib[^.]+)\.so.+$$,\1 5 our-qt5 (>= $(QT_VERSION)),' > debian/our-qt5.shlibs


$ cat debian-rules.inc.in
export PATH := /opt/Qt/@QT_VERSION@/gcc_64/bin:$(PATH)
export QMAKESPEC=/opt/Qt/@QT_VERSION@/gcc_64/mkspecs/linux-clang/

To build one of our packages using Qt5.3 and clang, we just add this to its debian/rules:

include /usr/share/our-qt5/debian-rules.inc
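So a package's debian/rules ends up looking, minimally, like this (a sketch):

#!/usr/bin/make -f

include /usr/share/our-qt5/debian-rules.inc

%:
    dh $@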

Wrap up

We got the dependencies sorted. Hopefully the mirror will rebuild itself tonight, and tomorrow we can resume working on our custom live system.

Posted Thu May 29 18:05:17 2014 Tags:

On responsibilities

I feel like in my Debian projects I have two roles: the person with the responsibility of making the project happen, and the person who does the work to make it happen.

As the person responsible for the project, I need to keep track of vision, goals, milestones, status. To make announcements, find contributors, motivate them, deal with users and bug reports, maintain documentation, digest feedback.

As the person who does the work to make it happen, I need quiet time, I need to study technology, design code, write unit tests, merge patches, code, code, code, ask around about deployment information, more code.

I have a hard time doing both things at the same time: the first engages my social skills and extroversion, requires low-latency interaction, and acting when outside things happen. The second engages my technical skills and introversion, requires quiet uninterrupted periods of flow, and acting when inspiration strikes. I never managed to make good use of "gift bugs" or "minions": I often found the phrase "it's easier for me to do it than to explain it" sadly relevant. Now I understand that it's not because of the objective difficulty of explaining or doing things, nor about the value of doing or of involving people. It's about switching from one kind of workflow to another. I would now rephrase it as "it's easier for me to stay in flow and fix it than to switch my entire attitude to ask for help".

Of course this does not scale: we've all been saying it since I can remember.

Looking at the situation from the point of view of those two roles, however, I now wonder if those two roles shouldn't really require two people. In other worlds they already are: the project managers, taking responsibility for making the project happen, and the software designers, artists, and all other kinds of artisans doing the work to make it happen.

Of course I don't want the kind of project manager that shifts responsibilities to artisans, does nothing and takes the credit for the project: not in paid work, not in Debian.

Project management is something else.

I would be interested instead in having the kind of project manager that takes responsibility for the project, checks how the artisans are doing, communicates what is happening to the rest of the world, deals with the community, and motivates more people to help, test, try, use, and give feedback on things as they happen. A project manager / community manager.

So that while I'm in flow there is someone who tags bugs as "gift", mentors people to find code and documentation, and remembers to write an announcement if I implemented three cool things in a row and I'm already busy working on the fourth.

So that I don't write cool ideas in my todo list where nobody can read them, but share them on a mailing list where someone picks up a relevant one and finds someone to make it happen while I'm busy refactoring old code that only I can understand.

So that if I say "sorry, paid work calls, I won't be able to work on this project for a month", I'll be able to completely forget about that project for a whole month, without leaving the community out there to die.

That's an interesting job for non-uploading DDs: please take over my projects. Let's share a vision, and team up to make it happen. Give me the freedom of being the craftsman I enjoy being, and take away from me those responsibilities that I've never asked for.

The worst project managers are those that never asked to be one, but were promoted to it. Let's not repeat that mistake in Debian.

A good part of the credits for this post go to Francesca Ciceri, for the discussions we had on our way back from MiniDebConf Barcelona 2014.

P.S. I'm seeing how a non-uploading DD could be in the Maintainer field for one or more packages, with uploading DDs being, well, uploaders. Food for thought.

Posted Tue Mar 18 16:58:58 2014 Tags:

Habits

Beware of habits. I've seen them turn into expectations over time, without any kind of negotiation.

Posted Thu Feb 13 01:22:59 2014 Tags:

An absolute truth

Every time people phrase their own opinions as absolute truths, they look grotesque and they incite violence.

If you now feel like stabbing me, then you may be seeing my point.

Posted Tue Feb 11 14:33:18 2014 Tags:
Posted Sat Jun 6 00:57:39 2009

Pages about OpenMoko.

Released nodm 0.7

I have released version 0.7 of nodm.

It only fixes one silly typo in autotools, which made it fail to build on Fedora.

Posted Sun May 23 21:36:52 2010 Tags:

Released nodm 0.6

I have released version 0.6 of nodm.

It is purely a bug fix release, trying harder to detect a console in order to get rid of a bug introduced with version 0.5.

Posted Mon Aug 3 12:34:16 2009 Tags:

Released nodm 0.5

I have released version 0.5 of nodm.

New features:

  • truncate ~/.xsession-errors on startup: finally that file stops growing, and growing, and growing...
  • dynamic VT allocation: it can now avoid opening a virtual terminal if it is already in use.
Posted Fri Jul 24 02:29:55 2009 Tags:

Getting dbus signatures right from Vala

I am trying to play a bit with Vala on the FreeRunner.

The freesmartphone.org stack on the OpenMoko is heavily based on DBus. Using DBus from Vala is rather simple, if mostly undocumented: you get a few examples in the Vala wiki and you make do with those.

All works fine with simple methods. But what about providing callbacks to signals that have complex nested structures in their signatures, like aa{sv}? You try, and if you don't get the method signature right, the signal is silently not delivered, because it does not match the method signature.

So this is how to provide a callback to org.freesmartphone.Usage.ResourceChanged, with signature sba{sv}:

public void on_resourcechanged(dynamic DBus.Object pos,
                   string name,
                   bool state,
                   HashTable<string, Value?> attributes)
{
    stderr.printf("Resource %s changed\n", name);
}

And this is how to provide a callback to org.freesmartphone.GPS.UBX.DebugPacket, with signature siaa{sv}:

protected void on_ubxdebug_packet(dynamic DBus.Object ubx, string clid, int length,
        HashTable<string, Value?>[] wrongdata)
{
    stderr.printf("Received UBX debug packet");

    // Ugly ugly work-around
    PtrArray< HashTable<string, Value?> >* data = (PtrArray< HashTable<string, Value?> >)wrongdata;

    stderr.printf("%u elements received", data->len);
}

What is happening here is that the only method signature I found matching the DBus signature is this one. However, the unmarshaller for some reason gets it wrong and passes a PtrArray instead of a HashTable array, so you need to cast it back to what you have actually been passed.
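For completeness, this is roughly how such a callback gets connected (a sketch: the bus name and object path are those of the FSO usage daemon, and the += signal syntax follows the Vala DBus examples of the time):

var conn = DBus.Bus.get(DBus.BusType.SYSTEM);
dynamic DBus.Object usage = conn.get_object(
        "org.freesmartphone.ousaged",     // bus name
        "/org/freesmartphone/Usage",      // object path
        "org.freesmartphone.Usage");      // interface
usage.ResourceChanged += on_resourcechanged;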

Figuring all this out took several long hours and was definitely not fun.

Posted Wed Jul 15 12:30:50 2009 Tags:

Mapping using the Openmoko FreeRunner headset

The FreeRunner has a headset which includes a microphone and a button. When doing OpenStreetMap mapping, it would be very useful to be able to keep tangogps on the display and be able to mark waypoints using the headset button, and to record an audio track using the headset microphone.

In this way, I can use tangogps to see where I need to go, where it's already mapped and where it isn't, and then I can use the headset to mark waypoints corresponding to the audio track, so that later I can take advantage of JOSM's audio mapping features.

Enter audiomap:

$ audiomap --help
Usage: audiomap [options]

Create a GPX and audio track

Options:
  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -v, --verbose  verbose mode
  -m, --monitor  only keep the GPS on and monitor satellite status
  -l, --levels   only show input levels

If called without parameters, or with -v (which is suggested), it will:

  1. Fix the mixer settings so that it can record from the headset and detect headset button presses.
  2. Show a monitor of GPS satellite information until it gets a fix.
  3. Synchronize the system time with the GPS time so that the timestamps of the files that are created afterwards are accurate.
  4. Start recording a GPX track.
  5. Start recording audio.
  6. Record a GPX waypoint for every headset button press.

When you are done, you stop audiomap with ^C, and it will properly close the .wav file, close the tags in the GPX waypoint and track files, and restore the mixer settings.

You can unplug the headset and record using the handset microphone, but then you will not be able to set waypoints until you plug the headset back in.

After you stop audiomap, you will have a track, waypoints and .wav file ready to be loaded in JOSM.

Big thanks go to Luca Capello for finding out how to detect headset button presses.

Posted Sun Jun 7 23:51:37 2009 Tags:

Simple tool to query the GPS using the OpenMoko FSO stack

I was missing a simple command line tool that allows me to perform basic GPS queries in shellscripts.

Enter getgps:

# getgps --help
Usage: getgps [options]

Simple GPS query tool for the FSO stack

Options:
  --version          show program's version number and exit
  -h, --help         show this help message and exit
  -v, --verbose      verbose mode
  -q, --quiet        suppress normal output
  --fix              check if we have a fix
  -s, --sync-time    set system time from GPS time
  --info             get all GPS information
  --info-connection  get GPS connection information
  --info-fix         get GPS fix information
  --info-position    get GPS position information
  --info-accuracy    get GPS accuracy information
  --info-course      get GPS course information
  --info-time        get GPS time information
  --info-satellite   get GPS satellite information

So finally I can write little GPS-aware scripts:

if getgps --fix -q
then
    start_gps_aware_program
else
    start_gps_normal_program
fi

Or this.

Posted Sun Jun 7 17:59:32 2009 Tags:

Voice-controlled waypoints

I have it in my TODO list to implement taking waypoints when pressing the headset button of the openmoko, but that is not done yet.

In the meantime, I did some experiments with audio mapping, and since I did not manage to enter waypoints while recording, I was looking for a way to make use of the recordings anyway.

Enter findvoice:

$ ./findvoice  --help
Usage: findvoice [options] wavfile

Find the times in the wav file when there is clear voice among the noise

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -v, --verbose         verbose mode
  -p NUM, --percentile=NUM
            percentile to use to discriminate noise from voice
            (default: 90)
  -t, --timestamps      print timestamps instead of human readable information

You give it a wav file, and it will output a list of timestamps corresponding to where it thinks that you were talking clearly and near the FreeRunner / voice recorder, instead of leaving the recorder dangling to pick up background noise.

Its algorithm is crude and improvised, because I have no background whatsoever in audio processing, but it basically finds those parts of the audio file where the variance of the samples is above a given percentile: the higher the percentile, the fewer timestamps you get; the lower the percentile, the more likely it is to pick a period of louder noise.

For example, you can automatically extract waypoints out of an audio file by using it together with gpxinterpolate:

./findvoice -t today.wav | ./gpxinterpolate today.gpx > today-waypoints.gpx

The timestamps it outputs are computed using the modification time of the .wav file: if your system clock was decently synchronised (which you can do with getgps), then the mtime of the wav file is the time of the end of the recording, which gives the reference needed to compute timestamps that are absolute in time.

For example:

getgps --sync-time
arecord file.wav
^C
./findvoice -t file.wav | ./gpxinterpolate today.gpx > today-waypoints.gpx
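In other words, the arithmetic is (a Python sketch; the file name is just an example):

import os, wave

w = wave.open("file.wav")
duration = w.getnframes() / float(w.getframerate())
end = os.path.getmtime("file.wav")  # mtime marks the END of the recording
start = end - duration              # absolute time of the start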
Posted Sun Jun 7 02:48:40 2009 Tags:

Geocoding Unix timestamps

Geocoding EXIF tags in JPEG images is fun, but there is more that can benefit from interpolating timestamps over a GPX track.

Enter gpxinterpolate:

$ ./gpxinterpolate --help
Usage: gpxinterpolate [options] gpxfile [gpxfile...]

Read one or more GPX files and a list of timestamps on standard input. Output
a GPX file with waypoints at the location of the GPX track at the given
timestamps.

Options:
  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -v, --verbose  verbose mode

For example, you can create waypoints interpolating file modification times:

find . -printf "%Ts %p\n" | ./gpxinterpolate ~/tracks/*.gpx > myfiles.gpx

In case you wonder where you were when you modified or accessed a file, now you can find out.

Posted Sun Jun 7 02:07:43 2009 Tags:

Recording audio on the FreeRunner

The FreeRunner can record audio. It is nice to record audio: for example, I can run the recording in the background while I keep tangogps on the screen, and take audio notes about where I am while I am doing mapping for OpenStreetMap.

Here is the script that I put together to create geocoded audio notes:

#!/bin/sh

WORKDIR=~/rec
TMPINFO=`mktemp $WORKDIR/info.XXXXXXXX`

# Sync system time and get GPS info
echo "Synchronising system time..."
getgps --sync-time --info > $TMPINFO

# Compute an accurate basename for the files we generate
BASENAME=~/rec/rec-$(date +%Y-%m-%d-%H-%M-%S)
# Then give a proper name to the file with saved info
mv $TMPINFO $BASENAME.info

# Proper mixer settings for recording
echo "Recording..."
alsactl -f /usr/share/openmoko/scenarios/voip-handset.state restore
arecord -D hw -f cd -r 8000 -t wav $BASENAME.wav

echo "Done"

It works like this:

  1. It synchronizes the system time from the GPS (if there is a fix) so that the timestamps on the wav files will be as accurate as possible.
  2. It also gets all sorts of information from the GPS and stores it in a file, should you want to inspect it later.
  3. It records audio until it gets interrupted.

The file name of the files that it generates corresponds to the beginning of the recording. The mtime of the wav file obviously corresponds to the end of the recording. This can be used to later georeference the start and end point of the recording.

You can use this to check mixer levels and that you're actually getting any input:

arecord -D hw -f cd -r 8000 -t wav -V mono /dev/null

The getgps script is now described in its own post.

You may now want to experiment, in JOSM, with "Preferences / Audio settings / Modified times (time stamps) of audio files".

Posted Sun Jun 7 01:30:37 2009 Tags:

How to read the Freerunner's accelerometers

This code has been taken from moko_eightball by Jakob Westhoff: it just continuously prints the values of the three accelerometers.

#include <stdio.h>
#include <stdint.h>

void processInputEvents(FILE* in)
{
    int x = 0, y = 0, z = 0;
    while (1)
    {
        char padding[16];
        uint16_t type, code;
        int32_t value;

        // Skip the timestamp
        fread(padding, 1, 8, in);

        // Read the type
        fread(&type, 1, 2, in);

        // Read the code
        fread(&code, 1, 2, in);

        // Read the value
        fread(&value, 1, 4, in);

        switch( type )
        {
            case 0:
                switch( code )
                {
                    case 0:
                        fprintf(stdout, "x%d y%d z%d\n", x, y, z);
                        break;
                    default:
                        //warning( "Unknown code ( 0x%02x ) for type 0x%02x\n", code, type );
                        break;
                }
                break;
            case 2:
                switch ( code )
                {
                    case 0:
                        // Update to the new value
                        x = value;
                        break;
                    case 1:
                        // Update to the new value
                        y = value;
                        break;
                    case 2:
                        // Update to the new value
                        z = value;
                        break;
                    default:
                        //warning( "Unknown code ( 0x%02x ) for type 0x%02x\n", code, type );
                        break;
                }
                break;

            default:
                //warning( "Unknown type ( 0x%02x ) in accelerometer input stream\n", type );
                break;
        }


    }
}


int main()
{
    FILE* in = fopen("/dev/input/event2", "r");
    if (in == NULL)
    {
        // Bail out if the input device cannot be opened
        perror("/dev/input/event2");
        return 1;
    }
    processInputEvents(in);
    fclose(in);
    return 0;
}
Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009
ppy

Posts for Planet Python.

Custom function decorators with TurboGears 2

I am exposing some library functions using a TurboGears2 controller (see web-api-with-turbogears2). It turns out that some functions return a dict, some a list, some a string, and TurboGears 2 only allows JSON serialisation for dicts.

A simple work-around for this is to wrap the function result into a dict, something like this:

@expose("json")
@validate(validator_dispatcher, error_handler=api_validation_error)
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return dict(r=res)

It would be nice, however, to have an @webapi() decorator that automatically wraps the function result with the dict:

def webapi(func):
    def dict_wrap(*args, **kw):
        return dict(r=func(*args, **kw))
    return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This works, as long as @webapi appears last in the list of decorators: that way it is the first to wrap the function, and so it does not interfere with the tg.decorators machinery.

Would it be possible to create a decorator that can be put anywhere in the decorator list? Yes, it is possible but tricky, and it gives me the feeling that it may break in any future version of TurboGears:

class webapi(object):
    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        # Migrate the decoration attribute to our new function
        if hasattr(func, 'decoration'):
            dict_wrap.decoration = func.decoration
            dict_wrap.decoration.controller = dict_wrap
            delattr(func, 'decoration')
        return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

As a convenience, TurboGears 2 offers, in the decorators module, a way to build decorator "hooks":

class before_validate(_hook_decorator):
    '''A list of callables to be run before validation is performed'''
    hook_name = 'before_validate'

class before_call(_hook_decorator):
    '''A list of callables to be run before the controller method is called'''
    hook_name = 'before_call'

class before_render(_hook_decorator):
    '''A list of callables to be run before the template is rendered'''
    hook_name = 'before_render'

class after_render(_hook_decorator):
    '''A list of callables to be run after the template is rendered.

    Will be run before it is returned up the WSGI stack'''

    hook_name = 'after_render'

The way these are invoked can be found in the _perform_call function in tg/controllers.py.

To show an example use of those hooks, let's add some polygen wisdom to every data structure we return:

class wisdom(decorators.before_render):
    def __init__(self, grammar):
        super(wisdom, self).__init__(self.add_wisdom)
        self.grammar = grammar
    def add_wisdom(self, remainder, params, output):
        from subprocess import Popen, PIPE
        output["wisdom"] = Popen(["polyrun", self.grammar], stdout=PIPE).communicate()[0]

# ...in the controller...

    @wisdom("genius")
    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

These hooks cannot, however, be used for what I need, that is, to wrap the result inside a dict. The reason is that they are called in this way:

        controller.decoration.run_hooks(
                'before_render', remainder, params, output)

and not in this way:

        output = controller.decoration.run_hooks(
                'before_render', remainder, params, output)

So it is possible to modify the output (if it is a mutable structure) but not to exchange it with something else.

Can we do even better? Sure we can. We can assimilate @expose and @validate inside @webapi to avoid repeating the same stack of decorator lines over and over again:

class webapi(object):
    def __init__(self, error_handler = None):
        self.error_handler = error_handler

    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        res = expose("json")(dict_wrap)
        res = validate(validator_dispatcher, error_handler=self.error_handler)(res)
        return res

# ...in the controller...

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(e="validation error on input fields", form_errors=pylons.c.form_errors)

    @webapi(error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This got rid of @expose and @validate, and provides almost all the default values that I need. Unfortunately, I could not find a way to access api_validation_error from inside the decorator so that I could pass it to the validator, so I am left with the inconvenience of having to pass it explicitly every time.

Posted Wed Nov 4 17:52:38 2009 Tags:

Building a web-based API with Turbogears2

I am using TurboGears2 to export a Python API over the web. Every API method is wrapped by a controller method that validates the parameters and returns the results encoded in JSON.

The basic idea is this:

@expose("json")
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return res

To validate the parameters we can use forms, it's their job after all:

class ListColoursForm(TableForm):
    fields = [
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
            twf.TextField("productID", help_text="Please enter the product ID"),
            twf.TextField("maxResults", validator=twfv.Int(min=0), default=200, size=5, help_text="Please enter the maximum number of results"),
    ]
list_colours_form=ListColoursForm()

#...

    @expose("json")
    @validate(list_colours_form, error_handler=list_colours_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

All straightforward so far. However, this means that we need two exposed methods for every API call: one for the API call and one error handler. For every API call, we have to type the name several times, which is error-prone and risks getting things mixed up.

We can, however, have a single error handler for all methods:

def get_method():
    '''
    The method name is the first url component after the controller name that
    does not start with 'test'
    '''
    found_controller = False
    for name in pylons.c.url.split("/"):
        if not found_controller and name == "controllername":
            found_controller = True
            continue
        if name.startswith("test"):
            continue
        if found_controller:
            return name
    return None

class ValidatorDispatcher:
    '''
    Validate using the right form according to the value of the "method" field
    '''
    def validate(self, args, state):
        method = args.get("method", None)
        # Extract the method from the URL if it is missing
        if method is None:
            method = get_method()
            args["method"] = method
        return forms[method].validate(args, state)

validator_dispatcher = ValidatorDispatcher()

This validator will try to find the method name, either as a form field or by parsing the URL. It will then use the method name to find the form to use for validation, and pass control to the validate method of that form.

We then need to add an extra "method" field to our forms, and arrange the forms inside a dictionary:

# The dictionary the dispatcher looks forms up in, keyed by method name
forms = {}

class ListColoursForm(TableForm):
    fields = [
            # One hidden field to have a place for the method name
            twf.HiddenField("method"),
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
    #...

forms["list_colours"] = ListColoursForm()

And now our methods become much nicer to write:

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(form_errors=pylons.c.form_errors)

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

api_validation_error is interesting: it returns a proper HTTP error status, and a JSON body with the details of the error, taken straight from the form validators. It took me a while to find out that the form errors are in pylons.c.form_errors (and for reference, the form values are in pylons.c.form_values). pylons.response is a WebOb Response that we can play with.

So now our client side is able to call the API methods, and get a proper error if it calls them wrong.

But now that we have the forms ready, it doesn't take much to display them in web pages as well:

def _describe(self, method):
    "Return a dict describing an API method"
    ldesc = getattr(self.engine, method).__doc__.strip()
    sdesc = ldesc.split("\n")[0]
    return dict(name=method, sdesc = sdesc, ldesc = ldesc)

@expose("myappserver.templates.myappapi")
def index(self):
    '''
    Show an index of exported API methods
    '''
    methods = dict()
    for m in forms.keys():
        methods[m] = self._describe(m)
    return dict(methods=methods)

@expose('myappserver.templates.testform')
def testform(self, method, **kw):
    '''
    Show a form with the parameters of an API method
    '''
    kw["method"] = method
    return dict(method=method, action="/myapp/test/"+method, value=kw, info=self._describe(method), form=forms[method])

@expose(content_type="text/plain")
@validate(validator_dispatcher, error_handler=testform)
def test(self, method, **kw):
    '''
    Run an API method and show its prettyprinted result
    '''
    res = getattr(self, str(method))(**kw)
    return pprint.pformat(res)

In a few lines, we have all we need: an index of the API methods (including their documentation taken from the docstrings!), and for each method a form to invoke it and a page to see the results.

Make the forms children of AjaxForm, and you can even see the results together with the form.

Posted Thu Oct 15 15:45:39 2009 Tags:

Creating pipelines with subprocess

It is possible to create process pipelines using subprocess.Popen, by just using stdout=subprocess.PIPE and stdin=otherproc.stdout.

Almost.

In a pipeline created in this way, the stdout of all processes except the last is opened twice: once in the script that has run the subprocess and another time in the standard input of the next process in the pipeline.

This is a problem because if a process closes its stdin, the previous process in the pipeline does not get SIGPIPE when trying to write to its stdout, because that pipe is still open in the caller process. When this happens, a wait on that process will hang forever: the child process waits for the parent to read its stdout, while the parent process waits for the child process to exit.

The trick is to close the stdout of each process in the pipeline except the last just after creating them:

#!/usr/bin/python
# coding=utf-8

import subprocess

def pipe(*args):
    '''
    Takes as parameters several dicts, each with the same
    parameters passed to popen.

    Runs the various processes in a pipeline, connecting
    the stdout of every process except the last with the
    stdin of the next process.
    '''
    if len(args) < 2:
        raise ValueError, "pipe needs at least 2 processes"
    # Set stdout=PIPE in every subprocess except the last
    for i in args[:-1]:
        i["stdout"] = subprocess.PIPE

    # Runs all subprocesses connecting stdins and stdouts to create the
    # pipeline. Closes stdouts to avoid deadlocks.
    popens = [subprocess.Popen(**args[0])]
    for i in range(1,len(args)):
        args[i]["stdin"] = popens[i-1].stdout
        popens.append(subprocess.Popen(**args[i]))
        popens[i-1].stdout.close()

    # Returns the array of subprocesses just created
    return popens

At this point, it's nice to write a function that waits for the whole pipeline to terminate and returns an array of result codes:

def pipe_wait(popens):
    '''
    Given an array of Popen objects returned by the
    pipe method, wait for all processes to terminate
    and return the array with their return values.
    '''
    results = [0] * len(popens)
    while popens:
        last = popens.pop(-1)
        results[len(popens)] = last.wait()
    return results

And, lo and behold, we can now easily run a pipeline and get the return codes of every single process in it:

process1 = dict(args='sleep 1; grep line2 testfile', shell=True)
process2 = dict(args='awk \'{print $3}\'', shell=True)
process3 = dict(args='true', shell=True)
popens = pipe(process1, process2, process3)
result = pipe_wait(popens)
print result

Update: Colin Watson suggests an improvement to compensate for Python's nonstandard SIGPIPE handling.
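I have not folded it in here, but the usual technique (not necessarily his exact fix) is to restore the default SIGPIPE disposition in each child with preexec_fn:

import signal
import subprocess

def restore_sigpipe():
    # Python starts with SIGPIPE set to SIG_IGN; give children the default
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

# e.g. when creating each Popen in the pipeline:
proc = subprocess.Popen(["grep", "line2"], stdout=subprocess.PIPE,
                        preexec_fn=restore_sigpipe)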

Colin Watson has a similar library for C.

Posted Wed Jul 1 09:08:06 2009 Tags:

TurboGears RemoteForm tip

In case your RemoteForm mysteriously behaves like a normal HTTP form, refreshing the page on submit, and the only hint that there's something wrong is this bit in Iceweasel's error console:

Errore: uncaught exception: [Exception... "Component returned failure
code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIXMLHttpRequest.open]"
nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame ::
javascript: eval(__firebugTemp__); :: anonymous :: line 1"  data: no]

the problem can just be a missing action= attribute to the form.

I found out after:

  1. reading the TurboGears remoteform wiki: "For some reason, the RemoteForm is acting like a regular html form, serving up a new page instead of performing the replacements we're looking for. I'll update this page as soon as I figure out why this is happening."

  2. finding this page on Google and meditating for a while while staring at it. I don't speak German, but often enough I manage to solve problems after meditating over Google results in all sorts of languages unknown or unreadable to me. I will call this practice Webomancy.

Posted Sat Jun 6 00:57:39 2009 Tags:

Python scoping

How do you create a list of similar functions in Python?

As a simple example, let's say we want to create an array of 10 elements like this:

a[0] = lambda x: x
a[1] = lambda x: x+1
a[2] = lambda x: x+2
...
a[9] = lambda x: x+9

Simple:

>>> a = []
>>> for i in range(0,10): a.append(lambda x: x+i)
...

...but wrong:

>>> a[0](1)
10

What happened here? In Python, that lambda x: x+i uses the value that i will have when the function is invoked.

This is the trick to get it right:

>>> a = []
>>> for i in range(0,10): a.append(lambda x, i=i: x + i)
...
>>> a[0](1)
1

What happens here is explained in the section "A Jedi Mind Trick" of the Instant Python article: i=i assigns as the default value of the parameter i the current value of i.

Strangely enough, the same article has "A Note About Python 2.1 and Nested Scopes", which seems to imply that from Python 2.2 the scoping has changed to "work as it should". I don't understand: the examples above were run on Python 2.4.4.

Googling for keywords related to python closure scoping only yields various sorts of complicated PEPs and an even uglier list trick:

a lot of people might not know about the trick of using a list to box variables within a closure.
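For reference, the boxing trick looks like this: since the closure cannot rebind a name in the enclosing scope, it mutates a list that holds the value instead.

def make_counter():
    count = [0]            # box the counter in a list
    def counter():
        count[0] += 1      # mutate the box: no rebinding needed
        return count[0]
    return counter

c = make_counter()
print c(), c()             # prints: 1 2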

Now I know about the trick, but I wish I didn't need to know :-(

Posted Sat Jun 6 00:57:39 2009 Tags:

Linking to self in turbogears

I want to put in my master.kid some icons that allow changing the current language for the session.

First, all user-accessible methods need to handle a 'language' parameter:

@expose(template="myapp.templates.foobar")
def index(self, someparam, **kw):
    if 'language' in kw: turbogears.i18n.set_session_locale(kw['language'])

Then, we need a way to edit the current URL so that we can generate modified links to self that preserve the existing path_info and query parameters. In your main controller, add:

def linkself(**kw):
    params = {}
    params.update(cherrypy.request.params)
    params.update(kw)
    url = cherrypy.request.browser_url.split('?', 1)[0]
    return url + '?' + '&'.join(['='.join(x) for x in params.iteritems()])

def add_custom_stdvars(vars):
    return vars.update({"linkself": linkself})

turbogears.view.variable_providers.append(add_custom_stdvars)

(see the turbogears stdvars documentation and the cherrypy request documentation (cherrypy 2 documentation at the bottom of the page))

And finally, in master.kid:

<div id="footer">
  <div id="langselector">
    <span class="language">
      <a href="${tg.linkself(language='it_IT')}">
        <img src="${tg.url('/static/images/it.png')}"/>
      </a>
    </span>

    <span class="language">
      <a href="${tg.linkself(language='C')}">
        <img src="${tg.url('/static/images/en.png')}"/>
      </a>
    </span>
  </div><!-- langselector -->
</div><!-- footer -->
Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears quirks when testing controllers that use SingleSelectField

Suppose you have a User that can be a member of a Company. In SQLObject you model it something like this:

    class Company(SQLObject):
        name = UnicodeCol(length=16, alternateID=True, alternateMethodName="by_name")
        display_name = UnicodeCol(length=255)

    class User(InheritableSQLObject):
        company = ForeignKey("Company", notNull=False, cascade='null')

Then you want to make a form that allows to choose what is the company of a user:

def companies():
    return [ [ -1, 'None' ] ] + [ [c.id, c.display_name] for c in Company.select() ]

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies)

Ok. Now you want to run tests:

  1. nosetests imports the controller to see if there's any initialisation code.
  2. The NewUserFields class is created.
  3. The SingleSelectField is created.
  4. The SingleSelectField constructor tries to guess the validator and peeks at the first option.
  5. This calls companies.
  6. companies accesses the database.
  7. The testing database has not yet been created, because nosetests imported the module before giving the test code a chance to set up the test database.
  8. Bang.

The solution is to add an explicit validator to disable this guessing code that is a source of so many troubles:

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies, validator=v.Int(not_empty=True))
Posted Sat Jun 6 00:57:39 2009 Tags:

Passing values to turbogears widgets at display time (the general case)

Last time I dug this up I was not clear enough in documenting my findings, so I had to find them again. Here is the second attempt.

In Turbogears, in order to pass parameters to arbitrary widgets in a compound widget, the syntax is:

form.display(PARAMNAME=dict(WIDGETNAME=VALUE))

And if you have more complex nested widgets and would like to know what goes on, this monkey patch is good for inspecting the params lookup functions:

import turbogears.widgets.forms
old_rpbp = turbogears.widgets.forms.retrieve_params_by_path
def inspect_rpbp(params, path):
    print "RPBP", repr(params), repr(path)
    res = old_rpbp(params, path)
    print "RPBP RES", res
    return res
turbogears.widgets.forms.retrieve_params_by_path = inspect_rpbp

The code for the lookup itself is, as the name suggests, in the retrieve_params_by_path function in the file widgets/forms.py in the Turbogears source code.

Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears form quirk

I had a great idea:

@validate(model_form)
@error_handler()
@expose(template='kid:myproject.templates.new')
def new(self, id, tg_errors=None, **kw):
    """Create new records in model"""
    if tg_errors:
        # Ask until there is still something missing
        return dict(record = defaults, form = model_form)
    else:
        # We have everything: save it
        i = Item(**kw)
        flash("Item was successfully created.")
        raise redirect("../show/%d" % i.id)

It was perfect: one simple method, simple error handling, nice helpful messages all around. Except that check boxes and select fields would not get the default values, while all other fields would.

After two hours searching and cursing and tracing things into widget code, I found this bit in InputWidget.adjust_value:

# there are some input fields that when nothing is checked/selected
# instead of sending a nice name="" are totally missing from
# input_values, this little workaround let's us manage them nicely
# without interfering with other types of fields, we need this to
# keep track of their empty status otherwise if the form is going to be
# redisplayed for some errors they end up to use their defaults values
# instead of being empty since FE doesn't validate a failing Schema.
# posterity note: this is also why we need if_missing=None in
# validators.Schema, see ticket #696.

So, what is happening here is that since check boxes and option fields don't behave nicely when unselected, TurboGears has to work around it. In order to tell the difference between "I selected 'None'" and "I didn't select anything", it reasons that if the input has been validated, the user has made some selections, so a missing value defaults to "the user selected 'None'". If the input has not been validated, then we're showing the form for the first time, and a missing value means "use the default provided".

Since I was doing the validation all the time, check boxes and select fields would never use the default values.

Hence, if you use those fields then you necessarily need two different controller methods, one to present the form and one to save it:

@expose(template='kid:myproject.templates.new')
def new(self, id, **kw):
    """Create new records in model"""
    return dict(record = defaults(), form = model_form)

@validate(model_form)
@error_handler(new)
@expose()
def savenew(self, id, **kw):
    """Create new records in model"""
    i = Item(**kw)
    flash("Item was successfully created.")
    raise redirect("../show/%d"%i.id)

If someone else stumbles on the same problem, I hope they'll find this post and they won't have to spend another two awful hours tracking it down again.

Posted Sat Jun 6 00:57:39 2009 Tags:

Passing values to turbogears widgets at display time

In turbogears, I often need to pass data to widgets at display time. Sometimes it works automatically, but sometimes, in cases like passing option lists to CheckBoxLists or number of repetitions in a RepeatingFieldSet, it doesn't.

All the examples use precomputed lists or pass simple code functions. In most of my cases, I want them computed by the controller every time.

Passing a function hasn't worked, as I did not find any obvious way for the function to know about the controller.

So I need to pass things to the display() method of the widgets, but I could not work out how to pass the option list and default list for a CheckBoxList that is part of a WidgetsList in a TableForm.

On IRC came the answer, thanks to Xentac:

you should be able to...
    tableform.display(options=dict(checkboxname=[optionlist]))

And yes, it works. I can pass the default value as one of the normal form values:

    tableform.display(values=dict(checkboxname=[values]), options=dict(checkboxname=[optionlist]))
Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009

Localising free software for Taiwanese Aboriginal cultures.


Character list for the Amis language

We mapped the available glyphs and accents for the Amis language.

The letters in alphabetical order:

    a c d f ng h i k l m n o p r s t u w y

Every one of them can get an acute or circumflex accent on top. ng can get a dot on top of the g.

The accents are literally on top: i would get the dot PLUS the accent on top.

Not all accented characters exist directly in Unicode as precomposed glyphs; however, Unicode provides combining characters to take care of these cases.
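
For instance, the i with both the dot and the acute accent can be built from the base letter plus combining marks. A minimal sketch in Python, using the standard unicodedata module (the variable names are mine):

import unicodedata

ACUTE = u"\u0301"      # COMBINING ACUTE ACCENT
DOT_ABOVE = u"\u0307"  # COMBINING DOT ABOVE

i_acute = u"i" + DOT_ABOVE + ACUTE   # i with dot and acute accent on top
ng_dot = u"ng" + DOT_ABOVE           # ng with a dot on top of the g

for s in (i_acute, ng_dot):
    print s.encode("utf-8"), [unicodedata.name(c) for c in s]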

Then we need an input method that would insert ng instead of g and allow typing all the accent combinations.

Here is the full character set:

    a     á    â
    c     ć    ĉ
    d     d́    d̂
    f     f́    f̂
    ng    nǵ   nĝ  nġ
    h     h́    ĥ
    i     i̇́    i̇̂
    k     ḱ    k̂
    l     ĺ    l̂
    m     ḿ    m̂
    n     ń    n̂
    o     ó    ô
    p     ṕ    p̂
    r     ŕ    r̂
    s     ś    ŝ
    t     t́    t̂
    u     ú    û
    w     ẃ    ŵ
    y     ý    ŷ

Update: this character list has been improved and the good version is found in the Debian wiki.

The list is not displayed correctly with many fonts or rendering engines. Arne made a test page that explicitly sets a font that works.

The accents are not taken into account when sorting.
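
If accent-insensitive sorting is needed in the meantime, one simple approach (my own sketch, not an existing tool) is to decompose the strings and drop the combining marks from the sort key:

import unicodedata

def collation_key(word):
    # NFD splits precomposed characters into base letter + combining
    # marks, which we then drop so accents do not affect the comparison
    decomposed = unicodedata.normalize("NFD", word)
    return u"".join(c for c in decomposed if not unicodedata.combining(c))

words = [u"t\u0301u", u"tu", u"ta"]
print [w.encode("utf-8") for w in sorted(words, key=collation_key)]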

Uppercase letters are not used.

Note: the page has been updated to reflect further input from Unicode and Amis people.

Update: there is now a wiki page on the Debian wiki.

Posted Sat Jun 6 00:57:39 2009 Tags:

Happy new year

A year ago we got in touch with various Taiwanese aboriginal tribes to try to start localisation efforts.

Thanks to the research the Taroko people did during 2007 and the prototype work of tonight, the Taroko people in Taiwan can see the computer calendar of the new year in their own language:

[screenshot: the Gnome calendar in the trv_TZW locale]

Posted Sat Jun 6 00:57:39 2009 Tags:

Character list for the Paiwan language

We mapped the available glyphs and accents for the Paiwan language.

The letters in alphabetical order:

a b c d e f h i j k l m n p q r s t u v w y z ḏ nġ ḻ ṟ ṯ 

No uppercase.

Update: this character list has been improved and the good version is found in the Debian wiki.

All the characters exist in Unicode except nġ, which already needed to be requested for the Amis script.

We need to design an input method to enter the underlined letters and the nġ.

Update: there is now a wiki page on the Debian wiki.

Posted Sat Jun 6 00:57:39 2009 Tags:

Amis and Paiwan input method and character set

Arne Götje (高盛華) created the input methods and character set support for Amis and Paiwan.

The scripts, especially Amis, make heavy use of Unicode combining characters. They should display well, at least with the DejaVu Sans font, in many applications.

Try it out: if it displays correctly, you should see:

  • accented letters instead of letters next to accents.
  • i with both the dot and the accent.

Update: there is now a wiki page on the Debian wiki.

Posted Sat Jun 6 00:57:39 2009 Tags:

Creating a new locale

I'm currently in Cilamitay, in the east of Taiwan. There is a little meeting of Taiwanese Free Software people and people from the Amis, Taroko and Puyuma tribes, with the idea of starting localisation efforts for some aboriginal languages.

These are some of the issues we are going to discuss:

Language code

A new ISO standard (639-3) will hopefully be formalised in January that will include language codes for the Taiwanese aboriginal tribes. We'll have to work out some temporary solution, but there's good hope that it won't have to stay temporary for long.

List of characters

Because of Christian missionary influence, both Amis and Taroko use a roman alphabet, with accents. We need to work out the complete list of characters and accent combinations, see if everything is in Unicode, and see how they sort.

We then need to find a comfortable way to input them using the keyboards normally available here (English US layout): compose key? Dead keys? How about on Windows?

Womble2 on IRC tells me that on Windows one can work with MSKLC.

Technical terms and country list

We need to work out how to map terms that do not exist in the language.

Technical terms are usually borrowed from Japanese.

Names for all the countries in the world probably do not exist.

Translation interface

We need to find an easy-to-use interface to input the translations.

There is Rosetta.

There is Pootle. (Thanks to Christian Perrier for pointing me at it)

There is Webpot.

Update: there is now a wiki page on the Debian wiki.

Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009

Food and recipes.

Coppone and spinach, oriental style

Ingredients:

  • one coppone (pork shoulder) steak
  • frozen spinach
  • oil
  • garlic
  • ginger
  • chilli
  • star anise
  • soy sauce
  • toasted sesame oil
  • pepper

The supermarket often has coppone steaks on offer. They are excellent grilled, but lacking a grill I once improvised this, and every now and then I make it again. It is a very quick dinner that can be put together when there is nothing in the house, with ingredients taken straight from the freezer.

Gently fry the garlic, ginger and star anise in the oil.

Add the coppone cut into small pieces and brown it. While it is cooking, add the crumbled chilli, the sesame oil and a little soy sauce.

When the meat has taken on some colour, add the thawed spinach and sauté it together with the meat and its juices.

Adjust the salt with the soy sauce and dust with ground pepper before serving.

Posted Sat Mar 9 18:42:22 2013 Tags:

Coffee and anchovy béchamel

Inspired by a coffee and broccoli soufflé eaten at the wonderful trattoria Antichi Sapori in Parma, I too tried combining coffee and broccoli.

The idea was to make a sauce to pour over freshly boiled broccoli. Chef Davide Sensi had talked about coffee and anchovies, so I decided that the flavour of the sauce should come from coffee and Thai fish sauce. To thicken them, even just a classic roux would do.

The result: a béchamel in which the roux is thinned not with milk, but with coffee, Thai fish sauce and the broccoli cooking water.

The first attempt came out a bit too heavy on the coffee. On broccoli, though, it works quite well.

Posted Fri Mar 8 19:17:04 2013 Tags:

Aubergine soup

I, too, have been guilty of discovering a shrivelled aubergine in the bottom of the fridge, and I think I improved on the recipe a bit.

First I softened the onion in butter, then I added the crushed garlic, the aubergine peeled and diced, and the cumin seeds. I let them all roast in the pan for a while, until the aubergines took some colour, then I added the stock. Carefully, as the pan with the roasting aubergines is far above 100°C and the first splash of water turns into steam very quickly.

When it was all soft and yummy, I added two spoonfuls of tahini, as a thickener, a generous grating of nutmeg, and blended the lot.

What came out is basically a soup version of baba ganoush, and it is yummy!

Posted Thu Nov 8 21:15:29 2012 Tags:

Spaghetti with friggitelli and mozzarella

Serves 4:

  • 300 g spaghetti
  • 300 g friggitelli (small sweet Italian peppers)
  • 70 g soft bread crumb
  • 125 g mozzarella
  • grated Parmesan, to taste
  • 4 small mint leaves
  • extra virgin olive oil, to taste
  • salt & pepper

Wash the peppers, remove the seeds and the stalk, dry them and cut them into thin strips.

Grind the bread in a food processor and brown it in a pan with 3 tablespoons of oil until it turns crispy, then set it aside.

Dice the mozzarella and set it aside as well.

Heat another 5 tablespoons of oil, add the peppers and cook them over a high flame for 10 minutes, seasoning with salt and pepper.

Cook the pasta, drain it and toss it in the pan with the peppers.

Sauté everything over a lively flame for a few minutes.

Add the diced mozzarella, the crispy bread and the Parmesan.

As a final touch, add the torn mint leaves and serve.

(via http://friggitelli.it/)

Made it today: good. For the bread crumbs I used leftovers from the stuffing of last night's carciofi alla romana.

With the artichoke cooking water, tonight: risotto.

Posted Mon Apr 30 16:08:55 2012 Tags:

Chicken and spinach bundles

For a while I have been trying to figure out how to cook a good steak, and I finally found a cooking site that speaks my language.

So let's play with the Maillard reaction. After a decent success with a cheap little steak, the time came to try chicken, which is the only meat my girlfriend likes.

Chicken meat has proteins, but not enough sugars for the Maillard reaction to happen. Ergo, we marinate the meat in something that contains sugars.

Googling "pollo" and "marinata" turns up the nice recipe "Petti di pollo ripieni al miele e aceto balsamico" (chicken breasts stuffed with honey and balsamic vinegar). The recipe says to "brown the chicken (3-4 minutes per side, over medium heat)", but I wanted to look Maillard in the eye and "medium heat" was not enough for me; besides, in England there is no speck to be found, and I had spinach at home rather than "Tatsoi salad". Ergo, I fiddled with the recipe as usual:

Chicken roast stuffed with spinach

Ingredients:

  • 3 chicken breasts, sliced
  • 3 slices of lean bacon, without rind (we are in England...)
  • 6-7 cubes of frozen spinach
  • 2 tablespoons of honey (acacia if possible)
  • 2 tablespoons of balsamic vinegar
  • 1 tablespoon of soy sauce
  • 1 spring onion
  • crumbled dried chilli
  • oil, salt, black pepper
  • several cloves of garlic

I thawed the spinach in a small pan over low heat, together with 4 or 5 crushed cloves of garlic.

Meanwhile I made the marinade with the honey, the balsamic vinegar, the soy sauce, the finely chopped spring onion, a crushed clove or two of garlic, the chilli, two tablespoons of oil, salt and pepper.

I then flattened the chicken breasts a bit, laid a slice of bacon on each, plastered on a layer of spinach, rolled everything up and tied it with kitchen string.

Once the bundles were ready, I put them to soak in the marinade. I left them there a good half hour, then turned them over and left them there another half hour, so that they would soak up the marinade and take on its colour.

At this point I turned on the oven at 180°C (for later) and put a pan (I used the non-stick wok) on the stove with a little olive oil.

I made sure not to do any damage with the high flame: the Maillard reaction happens above 140°C, the smoke point of olive oil is between 190°C and 240°C, and that of the non-stick pan's teflon is 300°C, so there is some margin.

High flame, hot oil, in goes the first chicken bundle: two minutes per side, with the pan covered to limit the damage from splatters. As each bundle was done on both sides I moved it to an oven dish, drizzled a little oil over it, and put everything in the oven for about 20 minutes, to be on the safe side: even over a high flame, 2 minutes per side did not seem enough to cook the chicken and the bacon inside.

Between one bundle and the next it is worth removing most of the cooking residue from the pan and setting it aside, otherwise, after 3 bundles on 2 sides each, it risks burning. At the end, with the pan nicely crusted, I poured in some wine, added the residue I had set aside, and deglazed everything over low heat with a wooden spoon. I then added a little sugar to counter the sourness of the wine and let it reduce, after which I strained it through a sieve and obtained a delicious little sauce to pour over the bundles when serving.

Unfortunately I have no photo because, whether it was the looks or the smell, all three bundles disappeared before it occurred to us to take one.

Since some inviting fat was left on the bottom of the oven dish and the oven was still hot, I then roasted some potatoes in it. It may be 40 degrees in Italy, but here we struggle to reach 20.

All of it washed down with a bottle of Dolcetto del Monferrato that we had found on offer at the supermarket a while ago: with chicken as flavourful as this, a white wine would have stood no chance.

Posted Sat Jun 6 00:57:39 2009 Tags:

Mushroom risotto (with a bit of banana)

I felt like experimenting, and I had bananas at home. What can one do with bananas?

Enter http://www.foodpairing.be/

This site groups ingredients by the chemical compounds that give them their flavour. And who sits next to banana? MUSHROOMS!

So let's make a mushroom risotto: the usual base of onion fried in butter until it turns translucent, then in goes a piece of banana cut into thin little slices, to fry along and caramelise a bit as well. Finally, a few pieces of dried porcini revived in warm water.

Then add the rice, let it toast a little in the mixture too, and then add the stock (I had a stock cube made especially for mushroom risotto, bought in the nice little shop nearby).

No salt, no pepper, no butter for creaming, nothing. Once cooked, I just let it rest for 5 minutes.

The result was delicious. "Did you put cream in it? How can it be so creamy?" Savoury but not sweet. And you can tell the banana is there, but you cannot tell that it is banana.

From today, I think the secret ingredient in my mushroom risotto will be a piece of banana.

Note that, as one can read at http://khymos.org/pairings.php, banana also goes well with parsley, and so do mushrooms. I forgot the parsley in the risotto this time, but it will be in the next experiment: we even have it fresh in the garden. The same page also mentions a likely cocoa-mushroom molecular pairing... who knows.


Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009

Python-related posts.

Custom function decorators with TurboGears 2

I am exposing some library functions using a TurboGears2 controller (see web-api-with-turbogears2). It turns out that some functions return a dict, some a list, some a string, and TurboGears 2 only allows JSON serialisation for dicts.

A simple work-around for this is to wrap the function result into a dict, something like this:

@expose("json")
@validate(validator_dispatcher, error_handler=api_validation_error)
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return dict(r=res)

It would be nice, however, to have an @webapi() decorator that automatically wraps the function result with the dict:

def webapi(func):
    def dict_wrap(*args, **kw):
        return dict(r=func(*args, **kw))
    return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This works, as long as @webapi appears last in the list of decorators. This is because if it appears last it will be the first to wrap the function, and so it will not interfere with the tg.decorators machinery.

Would it be possible to create a decorator that can be put anywhere among the decorator list? Yes, it is possible but tricky, and it gives me the feeling that it may break in any future version of TurboGears:

class webapi(object):
    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        # Migrate the decoration attribute to our new function
        if hasattr(func, 'decoration'):
            dict_wrap.decoration = func.decoration
            dict_wrap.decoration.controller = dict_wrap
            delattr(func, 'decoration')
        return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

As a convenience, TurboGears 2 offers, in the decorators module, a way to build decorator "hooks":

class before_validate(_hook_decorator):
    '''A list of callables to be run before validation is performed'''
    hook_name = 'before_validate'

class before_call(_hook_decorator):
    '''A list of callables to be run before the controller method is called'''
    hook_name = 'before_call'

class before_render(_hook_decorator):
    '''A list of callables to be run before the template is rendered'''
    hook_name = 'before_render'

class after_render(_hook_decorator):
    '''A list of callables to be run after the template is rendered.

    Will be run before it is returned returned up the WSGI stack'''

    hook_name = 'after_render'

The way these are invoked can be found in the _perform_call function in tg/controllers.py.

To show an example use of those hooks, let's add some polygen wisdom to every data structure we return:

class wisdom(decorators.before_render):
    def __init__(self, grammar):
        super(wisdom, self).__init__(self.add_wisdom)
        self.grammar = grammar
    def add_wisdom(self, remainder, params, output):
        from subprocess import Popen, PIPE
        output["wisdom"] = Popen(["polyrun", self.grammar], stdout=PIPE).communicate()[0]

# ...in the controller...

    @wisdom("genius")
    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

These hooks cannot however be used for what I need, that is, to wrap the result inside a dict. The reason is that they are called in this way:

        controller.decoration.run_hooks(
                'before_render', remainder, params, output)

and not in this way:

        output = controller.decoration.run_hooks(
                'before_render', remainder, params, output)

So it is possible to modify the output (if it is a mutable structure) but not to exchange it with something else.

Can we do even better? Sure we can. We can assimilate @expose and @validate inside @webapi, to avoid repeating the same long list of decorators over and over again:

class webapi(object):
    def __init__(self, error_handler = None):
        self.error_handler = error_handler

    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        res = expose("json")(dict_wrap)
        res = validate(validator_dispatcher, error_handler=self.error_handler)(res)
        return res

# ...in the controller...

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(e="validation error on input fields", form_errors=pylons.c.form_errors)

    @webapi(error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This gets rid of @expose and @validate, and provides almost all the default values that I need. Unfortunately I could not find out how to access api_validation_error from inside the decorator so that I could pass it to the validator by default, so I am left with the inconvenience of having to pass it explicitly every time.

Posted Wed Nov 4 17:52:38 2009 Tags:

Building a web-based API with Turbogears2

I am using TurboGears2 to export a python API over the web. Every API method is wrapped by a controller method that validates the parameters and returns the results encoded in JSON.

The basic idea is this:

@expose("json")
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return res

To validate the parameters we can use forms; it's their job, after all:

class ListColoursForm(TableForm):
    fields = [
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
            twf.TextField("productID", help_text="Please enter the product ID"),
            twf.TextField("maxResults", validator=twfv.Int(min=0), default=200, size=5, help_text="Please enter the maximum number of results"),
    ]
list_colours_form=ListColoursForm()

#...

    @expose("json")
    @validate(list_colours_form, error_handler=list_colours_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

All straightforward so far. However, this means that we need two exposed methods for every API call: one for the API call and one for the error handler. For every API call, we have to type the name several times, which is error-prone and risks getting things mixed up.

We can however have a single error handler for all methods:

def get_method():
    '''
    The method name is the first url component after the controller name that
    does not start with 'test'
    '''
    found_controller = False
    for name in pylons.c.url.split("/"):
        if not found_controller and name == "controllername":
            found_controller = True
            continue
        if name.startswith("test"):
            continue
        if found_controller:
            return name
    return None

class ValidatorDispatcher:
    '''
    Validate using the right form according to the value of the "method" field
    '''
    def validate(self, args, state):
        method = args.get("method", None)
        # Extract the method from the URL if it is missing
        if method is None:
            method = get_method()
            args["method"] = method
        return forms[method].validate(args, state)

validator_dispatcher = ValidatorDispatcher()

This validator will try to find the method name, either as a form field or by parsing the URL. It will then use the method name to find the form to use for validation, and pass control to the validate method of that form.

We then need to add an extra "method" field to our forms, and arrange the forms inside a dictionary:

class ListColoursForm(TableForm):
    fields = [
            # One hidden field to have a place for the method name
            twf.HiddenField("method")
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
    #...

forms["list_colours"] = ListColoursForm()

And now our methods become much nicer to write:

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(form_errors=pylons.c.form_errors)

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

api_validation_error is interesting: it returns a proper HTTP error status, and a JSON body with the details of the error, taken straight from the form validators. It took me a while to find out that the form errors are in pylons.c.form_errors (and for reference, the form values are in pylons.c.form_values). pylons.response is a WebOb Response that we can play with.

So now our client side is able to call the API methods, and get a proper error if it calls them wrong.

But now that we have the forms ready, it doesn't take much to display them in web pages as well:

def _describe(self, method):
    "Return a dict describing an API method"
    ldesc = getattr(self.engine, method).__doc__.strip()
    sdesc = ldesc.split("\n")[0]
    return dict(name=method, sdesc = sdesc, ldesc = ldesc)

@expose("myappserver.templates.myappapi")
def index(self):
    '''
    Show an index of exported API methods
    '''
    methods = dict()
    for m in forms.keys():
        methods[m] = self._describe(m)
    return dict(methods=methods)

@expose('myappserver.templates.testform')
def testform(self, method, **kw):
    '''
    Show a form with the parameters of an API method
    '''
    kw["method"] = method
    return dict(method=method, action="/myapp/test/"+method, value=kw, info=self._describe(method), form=forms[method])

@expose(content_type="text/plain")
@validate(validator_dispatcher, error_handler=testform)
def test(self, method, **kw):
    '''
    Run an API method and show its prettyprinted result
    '''
    res = getattr(self, str(method))(**kw)
    return pprint.pformat(res)

In a few lines, we have all we need: an index of the API methods (including their documentation taken from the docstrings!), and for each method a form to invoke it and a page to see the results.

Make the forms children of AjaxForm, and you can even see the results together with the form.

Posted Thu Oct 15 15:45:39 2009 Tags:

Creating pipelines with subprocess

It is possible to create process pipelines using subprocess.Popen, by just using stdout=subprocess.PIPE and stdin=otherproc.stdout.

Almost.

In a pipeline created in this way, the stdout of every process except the last is open twice: once in the script that spawned the subprocesses, and once as the standard input of the next process in the pipeline.

This is a problem because if a process closes its stdin, the previous process in the pipeline does not get SIGPIPE when trying to write to its stdout, because that pipe is still open on the caller process. If this happens, a wait on that process will hang forever: the child process waits for the parent to read its stdout, the parent process waits for the child process to exit.

The trick is to close the stdout of each process in the pipeline except the last just after creating them:

#!/usr/bin/python
# coding=utf-8

import subprocess

def pipe(*args):
    '''
    Takes as parameters several dicts, each with the same
    parameters passed to popen.

    Runs the various processes in a pipeline, connecting
    the stdout of every process except the last with the
    stdin of the next process.
    '''
    if len(args) < 2:
        raise ValueError("pipe needs at least 2 processes")
    # Set stdout=PIPE in every subprocess except the last
    for i in args[:-1]:
        i["stdout"] = subprocess.PIPE

    # Runs all subprocesses connecting stdins and stdouts to create the
    # pipeline. Closes stdouts to avoid deadlocks.
    popens = [subprocess.Popen(**args[0])]
    for i in range(1,len(args)):
        args[i]["stdin"] = popens[i-1].stdout
        popens.append(subprocess.Popen(**args[i]))
        popens[i-1].stdout.close()

    # Returns the array of subprocesses just created
    return popens

At this point, it's nice to write a function that waits for the whole pipeline to terminate and returns an array of result codes:

def pipe_wait(popens):
    '''
    Given an array of Popen objects returned by the
    pipe method, wait for all processes to terminate
    and return the array with their return values.
    '''
    results = [0] * len(popens)
    while popens:
        last = popens.pop(-1)
        results[len(popens)] = last.wait()
    return results

And, lo and behold, we can now easily run a pipeline and get the return codes of every single process in it:

process1 = dict(args='sleep 1; grep line2 testfile', shell=True)
process2 = dict(args='awk \'{print $3}\'', shell=True)
process3 = dict(args='true', shell=True)
popens = pipe(process1, process2, process3)
result = pipe_wait(popens)
print result

Update: Colin Watson suggests an improvement to compensate for Python's nonstandard SIGPIPE handling.
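
The usual workaround is along these lines (a sketch of the general technique, not necessarily Colin's exact code): the Python interpreter sets SIGPIPE to be ignored, and children inherit that, so the default behaviour is restored in each child just before exec:

import signal
import subprocess

def restore_sigpipe():
    # Python ignores SIGPIPE by default; give children the default handler
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)

proc = subprocess.Popen(["true"], stdout=subprocess.PIPE,
                        preexec_fn=restore_sigpipe)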

Colin Watson has a similar library for C.

Posted Wed Jul 1 09:08:06 2009 Tags:

TurboGears RemoteForm tip

In case your RemoteForm mysteriously behaves like a normal HTTP form, refreshing the page on submit, and the only hint that there's something wrong is this bit in Iceweasel's error console:

Errore: uncaught exception: [Exception... "Component returned failure
code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIXMLHttpRequest.open]"
nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame ::
javascript: eval(__firebugTemp__); :: anonymous :: line 1"  data: no]

the problem may just be a missing action= attribute on the form.
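
In other words, something like this (a minimal sketch, assuming the TurboGears 1 widgets API; the field list and the update target are made up for illustration):

from turbogears import widgets

form = widgets.RemoteForm(
    fields=[widgets.TextField("name")],
    update="result",  # hypothetical id of the element to fill with the response
)

# ...and in the template, give it an action at display time:
#   ${form.display(action='/save')}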

I found out after:

  1. reading the TurboGears remoteform wiki: "For some reason, the RemoteForm is acting like a regular html form, serving up a new page instead of performing the replacements we're looking for. I'll update this page as soon as I figure out why this is happening."

  2. finding this page on Google and meditating for a while while staring at it. I don't speak German, but often enough I manage to solve problems after meditating over Google results in all sorts of languages unknown or unreadable to me. I will call this practice Webomancy.

Posted Sat Jun 6 00:57:39 2009 Tags:

Python scoping

How do you create a list of similar functions in Python?

As a simple example, let's say we want to create an array of 10 elements like this:

a[0] = lambda x: x
a[1] = lambda x: x+1
a[2] = lambda x: x+2
...
a[9] = lambda x: x+9

Simple:

>>> a = []
>>> for i in range(0,10): a.append(lambda x: x+i)
...

...but wrong:

>>> a[0](1)
10

What happened here? In Python, that lambda x: x+i uses the value that i will have when the function is invoked.

This is the trick to get it right:

>>> a = []
>>> for i in range(0,10): a.append(lambda x, i=i: x + i)
...
>>> a[0](1)
1

What happens here is explained in the section "A Jedi Mind Trick" of the Instant Python article: i=i assigns as the default value of the parameter i the current value of i.

Strangely enough, the same article has "A Note About Python 2.1 and Nested Scopes" which seems to imply that from Python 2.2 the scoping has changed to "work as it should". The examples above were run on Python 2.4.4, and the catch is this: nested scopes do make the name i visible inside the lambda, but the closure captures the variable itself, not its value at definition time, so by the time the lambdas are invoked they all see the final value of i.

Googling for keywords related to python closure scoping only yields various sorts of complicated PEPs and an even uglier list trick:

a lot of people might not know about the trick of using a list to box variables within a closure.
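
For reference, the boxing trick looks more or less like this: a nested function in Python 2 cannot rebind a name from the enclosing scope, but it can mutate a list that the enclosing scope holds:

def make_counter():
    count = [0]          # the "box"
    def counter():
        count[0] += 1    # mutate the box: no rebinding needed
        return count[0]
    return counter

c = make_counter()
print c(), c(), c()      # 1 2 3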

Now I know about the trick, but I wish I didn't need to know :-(

Posted Sat Jun 6 00:57:39 2009 Tags:

Linking to self in turbogears

I want to put in my master.kid some icons that allow changing the current language for the session.

First, all user-accessible methods need to handle a 'language' parameter:

@expose(template="myapp.templates.foobar")
def index(self, someparam, **kw):
    if 'language' in kw: turbogears.i18n.set_session_locale(kw['language'])

Then, we need a way to edit the current URL so that we can generate modified links to self that preserve the existing path_info and query parameters. In your main controller, add:

import urllib

def linkself(**kw):
    params = {}
    params.update(cherrypy.request.params)
    params.update(kw)
    url = cherrypy.request.browser_url.split('?', 1)[0]
    # urlencode takes care of escaping the values properly
    return url + '?' + urllib.urlencode(params)

def add_custom_stdvars(vars):
    return vars.update({"linkself": linkself})

turbogears.view.variable_providers.append(add_custom_stdvars)

(see the turbogears stdvars documentation and the cherrypy request documentation (cherrypy 2 documentation at the bottom of the page))

And finally, in master.kid:

<div id="footer">
  <div id="langselector">
    <span class="language">
      <a href="${tg.linkself(language='it_IT')}">
        <img src="${tg.url('/static/images/it.png')}"/>
      </a>
    </span>

    <span class="language">
      <a href="${tg.linkself(language='C')}">
        <img src="${tg.url('/static/images/en.png')}"/>
      </a>
    </span>
  </div><!-- langselector -->
</div><!-- footer -->
Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears quirks when testing controllers that use SingleSelectField

Suppose you have a User that can be a member of a Company. In SQLObject you model it something like this:

    class Company(SQLObject):
        name = UnicodeCol(length=16, alternateID=True, alternateMethodName="by_name")
        display_name = UnicodeCol(length=255)

    class User(InheritableSQLObject):
        company = ForeignKey("Company", notNull=False, cascade='null')

Then you want to make a form that allows choosing the company of a user:

def companies():
    return [ [ -1, 'None' ] ] + [ [c.id, c.display_name] for c in Company.select() ]

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies)

Ok. Now you want to run tests:

  1. nosetests imports the controller to see if there's any initialisation code.
  2. The NewUserFields class is created.
  3. The SingleSelectField is created.
  4. The SingleSelectField constructor tries to guess the validator and peeks at the first option.
  5. This calls companies.
  6. companies accesses the database.
  7. The testing database has not yet been created because nosetests imported the module before giving the test code a chance to setup the test database.
  8. Bang.

The solution is to add an explicit validator, to disable this guessing code that is the source of so much trouble:

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies, validator=v.Int(not_empty=True))
Posted Sat Jun 6 00:57:39 2009 Tags:

Passing values to turbogears widgets at display time (the general case)

Last time I dug this up I was not clear enough in documenting my findings, so I had to find them again. Here is the second attempt.

In Turbogears, in order to pass parameters to arbitrary widgets in a compound widget, the syntax is:

form.display(PARAMNAME=dict(WIDGETNAME=VALUE))

And if you have more complex nested widgets and would like to know what goes on, this monkey patch is good for inspecting the params lookup functions:

import turbogears.widgets.forms
old_rpbp = turbogears.widgets.forms.retrieve_params_by_path
def inspect_rpbp(params, path):
    print "RPBP", repr(params), repr(path)
    res = old_rpbp(params, path)
    print "RPBP RES", res
    return res
turbogears.widgets.forms.retrieve_params_by_path = inspect_rpbp

The code for the lookup itself is, as the name suggests, in the retrieve_params_by_path function in the file widgets/forms.py in the Turbogears source code.

Posted Sat Jun 6 00:57:39 2009 Tags:

Posted Sat Jun 6 00:57:39 2009

Pages about Debtags.

Evolution's old odd mail folders to mbox

Something went wrong in my dad's Evolution. It would just get stuck checking mail forever, with no useful diagnostics that I could find. Fun. Not.

Anyway, I solved it by resetting everything to factory defaults, moving away all gconf entries and .evolution/ files. Then it started working again; of course, I then needed to reconfigure it from scratch.

It turned out, however, that some old mail was archived only locally, in a weird format that looks like this:

$ ls -la Enrico/
total 336
drwx------ 2 enrico enrico   4096 Jul 23 03:05 .
drwxr-xr-x 7 enrico enrico   4096 Jul 23 03:12 ..
-rw------- 1 enrico enrico   3230 Dec  4  2010 113.HEADER
-rw------- 1 enrico enrico  14521 Dec  4  2010 113.TEXT
-rw------- 1 enrico enrico   3209 Oct 22  2010 134.HEADER
-rw------- 1 enrico enrico   2937 Oct 22  2010 134.TEXT
-rw------- 1 enrico enrico   3116 Jun 27  2011 15.
-rw------- 1 enrico enrico   3678 Jun 27  2011 168.
-rw------- 1 enrico enrico     73 Apr 27  2009 22.1.MIME
-rw------- 1 enrico enrico   3199 Apr 27  2009 22.2
-rw------- 1 enrico enrico     88 Apr 27  2009 22.2.MIME
[...]

I couldn't even find the name of that mail folder layout, let alone conversion tools. So I had to sit down and waste my Sunday break writing software to convert that to an mbox file. Here's the tool, may it save you the awful time I had today: http://anonscm.debian.org/gitweb/?p=users/enrico/evo2mbox.git
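
The core of the conversion is roughly this kind of reassembly (a simplified sketch, not the actual tool: it only handles the N.HEADER/N.TEXT pairs and ignores the multipart *.MIME files):

import glob
import os
import time

def folder_to_mbox(srcdir, dst):
    out = open(dst, "w")
    for header in sorted(glob.glob(os.path.join(srcdir, "*.HEADER"))):
        text = header[:-len("HEADER")] + "TEXT"
        out.write("From - %s\n" % time.asctime())
        out.write(open(header).read().rstrip("\n") + "\n\n")
        if os.path.exists(text):
            for line in open(text):
                # Escape body lines that would look like mbox separators
                if line.startswith("From "):
                    line = ">" + line
                out.write(line)
        out.write("\n")
    out.close()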

Note: feel free to fork it, or send patches, but don't bother with feature requests. Evolution isn't and won't be a personal interest of mine. Anything that makes an afternoon at my parents' more tiresome than a whole busy month of paid work doesn't deserve to be.

Luckily they now seem to have changed the local folder format to Maildir.

Posted Mon Jul 23 03:27:50 2012 Tags:

Giving away distromatch

At last year's Fosdem I tried to inject a lot of energy into distromatch, but shortly afterwards I had to urgently rewrite the nm.debian.org website.

After Lars Wirzenius' GTDFH talks in Bologna and Varese, I wrote a tool which, among other things, is able to scan my home dir and list how many projects I'm working on.

The output was scary. Like, there were too many of them. Like, I couldn't even recite the list from memory. And since I couldn't do that, I had no idea there were so many. And I kept feeling stressed because I couldn't manage to take care of them all properly.

Now that I have become conscious of the situation, it's time to deal with it like a grown-up, and politely back off from some of my irresponsible responsibilities.

Distromatch is one of them. It had just started as a proof of concept prototype, and I had the vision that it could be the basis for a fantastic culture of sharing and exchange of information across distributions.

I need to distinguish the vision from the responsibility. I still have that vision for distromatch, but I cannot take responsibility for making it happen.

So I am giving it up to anyone who has the time and resources to pick up that responsibility.

Current status

It works well enough as a prototype. I believe it can successfully map a large enough slice of packages that one can prototype stuff based on it.

I have for example used it to export the Debtags categories for other distros, and the resulting file looked big enough to be used for prototyping category-based features on distributions that don't have them yet.

I think it also works well enough to support a few common use cases, like sharing screenshots, or doing most of the work of converting dependency lists from one distro to another.

And finally, anyone can deploy it, and work on it.

Existing data sources

Everything I index in the Debian distromatch deployment is available at http://dde.debian.net/exports/distromatch/. The rpm-based data in there comes from an export script I wrote that runs on Sophie, but which I cannot maintain properly.

This is an experimental export of Fedora and OpenSUSE data: http://tmp.vuntz.net/misc/distromatch/distromatch-opensuse-fedora.tar

All existing export scripts are found in distromatch git repo on gitorious.

Contacts I gathered at Fosdem

At Fosdem I devoted quite some work to get contacts from all possible distributions and software repositories, so that distromatch could be hooked into them. Here is a dump of what I have collected:

  • Debian: me
  • OpenSuse: Vincent Untz and Adrian Schröter
  • Fedora: Tom "Spot" Callaway
  • Arch: Tasser on IRC
  • CPAN: contact the people of https://metacpan.org/, on irc.perl.org:#metacpan or make an issue on github
  • NetBSD: ask on #netbsd on Freenode
  • FreeBSD: Baptiste Daroussin (bapt)
  • Mageia: Olivier Thauvin

Some of those contacts may have "expired" in the meantime: I wouldn't assume all of them still remember talking with me, although most probably still do.

My commitment for the time being

I am happy to commit, at the moment, to maintaining a working data export for Debian data. I can take responsibility for making sure the Debian data stays up to date, and for fixing it ASAP when that is not the case.

I hope that now someone can take distromatch over from me, and make it grow to achieve its great potential.

Posted Sat Jul 21 16:54:18 2012 Tags:

More diversity in Debian skills

This blog post has been co-authored with Francesca Ciceri.

In his Debconf talk, zack said:

We need to understand how to invite people with different backgrounds than packaging to join the Debian project [...] I don't know what exactly, but we need to do more to attract those kinds of people.

Francesca and I know what we could do: make other kinds of contributions visible.

Basically, we should track and acknowledge the contributions of webmasters, translators, programmers, sysadmins, event organisers, and so on, at the same level as what we do for packagers: DDPO, minechangelogs, Portfolio...

For any non-packaging activity that we can make visible and credited, we get:

  • to acknowledge the people who do it, and show that they are active contributors in the project;

  • to acknowledge the work that gets done, and show the actual amount of non-packaging work that gets done in Debian every day;

  • to allow non-packagers to have a reputation, too: first of all, they deserve it, and among other things, it would make nm processing trivial.  

Here's an example: who's the lead translator for German? And if you are German, who's the lead translator for Spanish? Czech? Thai? I (Enrico) don't know the answers, not even for Italian, but we all should! Or at least it should be trivial to find out.

Starting to change this is just a matter of programming.

Francesca already worked on a list of trackable data sources, at least for translators.

Here are some more details, related to translation:

  • Translations can be tracked via the i18n robot (and relative statistics). This works only with teams who activated the robot and actively use the pseudo-urls in their messages on localisation mailing lists. Some translators don't bother to do it but it's ok to only support the main workflow. It beats extracting .po files from l10n-tagged BTS bugs at any rate.

  • DPN and website translations: for wml pages there's a specific field to be extracted for each translated page: grep for maintainer="name" on normal wml pages, while for DPN translations we have a specific translator="name" field. The problem is that this field is not mandatory, so sometimes there's no indication of the maintainer. Again, it's ok to only support the main workflow.

    Anyway, this is preferable to the cvs log: often the commit is done by the coordinator of the team and not by the actual translator. See above for the alternative solution of using the statistics provided by the i18n bot.

  • DDTSS: since the new release of DDTSS-Django, done by Martijn van Oosterhout about a year ago, the contributions are by default non-anonymous. This should be easy to track.

  • http://wiki.debian.org: it is more complicated because in the wiki we do not have a proper l10n translation workflow, so the only things that can be tracked are the changelogs of the $LANG/* pages. A nice idea would be to have translated pages list the version of the page that was translated and who did the translation.

  • translation of debian manuals and release notes: usually in the translation of manuals and long documentation there is a specific translator field.

And here are some notes about other fields:

  • DPN editors: for each issue there's a list of editors at the bottom of the page. In the wml: grep for editor=.

  • Artwork: artwork submitted via debianart are easy to track on the portal. Anyway usually you can find the author in the license and copyright file.

  • Programming: the only thing we have is the list of services which can be expanded if needed.

  • Press and publicity: there seems to be not much besides svn logs.

  • l10n-english: The Smith Review Project page has some tracking links. Other activities can probably only be tracked, at the moment, via mailing list activity.

  • Events: we can use the "main coordinator" field on www.debian.org/events/$year/$date-$eventname.wml: grep for <define-tag coord>; for events not published on the http://www.debian.org, but only on http://wiki.debian.org, the coordinator or the contact for the event is usually present on the page itself.

  • Sysadmins: we haven't asked DSA.

And finally, if you are still wondering who those translation coordinators are, they are listed here, although not all teams keep that page up to date.

Of course, when a data source is too hard to mine, it can make sense to see if the workflow could be improved, rather than spending months writing complicated mining code.

This is a fun project for people at Debconf to get together and try.

If by the end of the conference we had a way to credit some group of non-packaging contributors, even if just one like translators or website contributors, at least we would finally have started having official trackers for the activities of non-packagers.

Posted Thu Jul 12 14:01:54 2012 Tags:

Debtags for derivative distributions

Sometimes I do cool stuff and I forget to announce it.

Ok, so I recently announced a new Debtags website.

I forgot to say in the announcement that the new website does not only know about Debian packages: see for example this page; at the very bottom it says: "Distributions: oneiric, precise, sid, testing".

This means that already, here and now, debtags.debian.net can be used to tag packages from both Debian and Ubuntu, and can easily be extended to cover the entire Debian ecosystem.

If you are a package maintainer, you will notice that your maintainer page shows your packages from everywhere. If you want to filter things a bit, for example hide obsolete packages from an old Debian Stable or Ubuntu LTS, just click on the "Settings" link on the top right to configure the page.

How it works

The magic is in this mergepackages script, which is run daily and exports merged Packages files at dde.debian.net. What debtags.debian.net sees as its Packages and Sources files are just those all-merged.gz and all-merged-sources.gz.

The merging is simple: that rebuild script processes files in order, and the first version of a package that is found is chosen as the base for the one that will go in the merged Packages file. Some fields like "Description" are just taken from this pivot package, others like Architecture or dependencies are merged into it. It's arbitrary, but works for me: the result has all the packages with all their possible architectures and dependencies, and is ready to be indexed with apt-xapian-index.
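
In pseudo-Python, the idea is something like this (a sketch of the concept, not the actual script; which fields get merged rather than taken from the pivot is an assumption here):

def merge_packages(stanza_lists):
    """Merge parsed Packages files, given as lists of field dicts."""
    merged = {}
    for stanzas in stanza_lists:
        for pkg in stanzas:
            name = pkg["Package"]
            if name not in merged:
                # The first version found is the pivot: single-value
                # fields like Description come from here
                merged[name] = dict(pkg)
            else:
                # Other fields are merged as a union of the values
                for field in ("Architecture", "Depends"):
                    values = set(merged[name].get(field, "").split(", "))
                    values |= set(pkg.get(field, "").split(", "))
                    values.discard("")
                    if values:
                        merged[name][field] = ", ".join(sorted(values))
    return merged.values()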

At the moment I pull data from Debian and Ubuntu, but you can see that the script can easily be extended to pull data from any Debian-style ftp archive, so any Debian derivative can go in. I've already started negotiations with the Derivatives Census on how to add any Debian derivative and keep the list up to date.

How to export tags for your own distribution

I'll use Ubuntu as an example since the data is already available.

The way you add Debtags to the Ubuntu packages file is just this:

  1. Get the full reviewed tag database
  2. Optionally filter out those packages that you are not interested in
  3. Tweak this script to build an overrides file.
  4. Give the overrides file to your favourite ftp archive building tool.

The make-overrides script is a bit rusty: if you improve it, please send me your changes.

That is it, nothing else required, no excuses, it's ready, here, now!

Hitches and gotchas

This merged Packages file is a bit of a hack, and suffers from name conflicts across distributions, where two different pieces of software are packaged in two different distributions under the same name.

Ideally, name conflicts should not happen: if a derivative decided to package kate and call it gedit, they deserve to have it tagged uitoolkit::gtk. I think it's rather important that the whole Debian ecosystem works as much as possible with a single package namespace.

However, that reasoning fails if you take time into account: packages get renamed, like git and chromium, and may mean completely different things, for example, if you compare Debian Stable with Debian Sid.

This last is a problem caused by debtags only working with package names but not package versions. I have a strategy in mind based on being able to override the stable tag database using headers in debian/control; it still needs some details sorted out, but I'm confident we will be able to address these issues properly soon enough.

Why stop at the Debian ecosystem?

Why indeed. I'm clearly trying to use FOSDEM, and the CrossDistribution devroom as the venue to discuss just that.

Posted Fri Jan 20 15:12:33 2012 Tags:

Deploying distromatch

I have been working on allowing anyone to set up their own distromatch instance.

For Debian and Ubuntu, I can easily generate the distromatch input using UDD and the Contents files found in any mirrors.

For the whole RPM world, thanks to Olivier Thauvin I have been able to set up regular exports from the vast Sophie database.

I have set up distromatch access on DDE, which can also serve as a list of all working distributions so far. If you have access to the full dataset of package names and package contents for a distribution not in that list, please get in touch and we can add it.

I'm also exporting the full raw dataset, which enables anyone to set up the same distromatch environment on their own machines.

Here is how:

# Get distromatch
git clone git://gitorious.org/appstream/distromatch.git
cd distromatch

# Fetch distribution information (updated every 2 days)
wget http://dde.debian.net/exports/distromatch-all.tar.gz

# Unpack it
mkdir data
tar -C data -zxf distromatch-all.tar.gz

# Reindex it (use --verbose if you are curious)
./distromatch --datadir=data --reindex --verbose

# Run it
./distromatch --datadir=data debian gedit

What does this mean? For example it means that if another distribution has some data (categories, screenshots...) that your distribution doesn't have, you can use distromatch to translate package names, then go and get it!

My next step is going to be to improve the distromatch functionality in DDE and possibly build a simple, user-friendly web interface to it. If you have some jQuery experience and would like to help, don't hesitate to get in touch.

Posted Fri Feb 18 13:46:30 2011 Tags:

update-apt-xapian-index on other distros

I've drafted a little HOWTO on using apt-xapian-index on non-Debian distributions.

The procedure has been tried on Mageia with some success, and there's no reason it wouldn't work everywhere else: the index itself does not depend on anything distro-specific.

Posted Tue Jan 25 23:01:45 2011 Tags:

Cross-distro Meeting on Application Installer

I have been to a Cross-distro Meeting on Application Installer which to the best of our knowledge is also the first one of its kind. Credit goes to Vincent Untz for organising it, to OpenSUSE for hosting it and to the various sponsors for getting us there.

It went surprisingly well. We got along, got stuff done, did as much work as possible to agree on as many formats, protocols and technologies as we possibly could.

The timing of it is very important, as most major distros would like to adopt some of the features that just became popular in the various new app markets and stores, such as screenshots, user comments and ratings. It looks like a lot of new code is about to be written, or a lot of existing code is about to gain quite a bit of popularity.

For my part, I presented the work on Debtags and apt-xapian-index.

With regards to Debtags, other distros seem to be missing a comprehensive classification system, and Debtags is, well, it.

With regards to apt-xapian-index, we just noticed that it's the perfect back-end for what everyone would like to do: the index structure is rather distribution-agnostic, and it has been road-tested with considerable success by at least software-center, so it attracted quite a bit of interest and will likely attract some more.

Just to prove a point I put together a prototype webby markety appy thing in just a few hours of work.

The meeting was also the ideal place to create a joint effort to match package names across distributions, which means that a lot of things that were hard to share before, such as screenshots, tags and patches, are suddenly not hard to share anymore.

Posted Sat Jan 22 01:40:50 2011 Tags:

A prototype webby markety appy thing

What better way to introduce my work at an Application Installer meeting than to come with a prototype package browser modeled after shopping sites developed in just a few hours?

It's a little Flask webapp that just works on any Debian system, using the local apt-xapian-index as a backend. It has fast keyword search, faceted navigation and screenshots, and it runs on your system showing the packages that you have available.

Screenshot of packageshelf

To try it:

git clone git://git.debian.org/users/enrico/pkgshelf.git
cd pkgshelf
./web-server.py

Then visit http://localhost:5000

It hasn't had much interface polish, as it's just a quick technology demo. However, you can see that:

  • keyword search is fast (fast enough that it could be made to search as you type);
  • relevant tags appear on the left, grouped by facets;
  • the most relevant tags are highlighted;
  • the less relevant tags could be hidden behind a [more] expander;
  • you can choose several strategies to hide packages you may find irrelevant.

Things that need doing:

  • hiding uninteresting facets;
  • making it pretty.

It's essentially JavaScript and CSS work. Anyone want to play?
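
If you want to peek at how little back-end code such a thing needs, here is a minimal sketch (hypothetical code, not pkgshelf's actual source) of the kind of query that runs against the system apt-xapian-index:

import xapian

# Hypothetical sketch: query the system-wide index built by
# update-apt-xapian-index; each matching document's data is the package name
db = xapian.Database("/var/lib/apt-xapian-index/index")
qp = xapian.QueryParser()
qp.set_database(db)
qp.set_default_op(xapian.Query.OP_AND)
query = qp.parse_query("text editor", xapian.QueryParser.FLAG_PARTIAL)
enquire = xapian.Enquire(db)
enquire.set_query(query)
for m in enquire.get_mset(0, 10):
    print m.document.get_data()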

Posted Sat Jan 22 01:40:50 2011 Tags:

Match package names across distributions

What would happen if we had a quick and reliable way to match package names across distributions?

These ideas came up at the appinstaller2011 meeting:

  • it would be easy to lookup screenshots in the local distro, and if there are none then fall back on other distributions;
  • it would be easy to port Debtags to other distributions, and possibly get changes back;
  • it would be trivial to add a [patches in $DISTRO] link to the PTS;
  • it would be easy to point to other BTSes.

We thought they were good ideas, so we started hacking.

To try it, you need to get the code and build the index first:

git clone git://git.debian.org/users/enrico/distromatch.git
cd distromatch
# Careful: 90Mb
wget http://people.debian.org/~enrico/dist-info.tar.gz
tar zxf dist-info.tar.gz
# Takes a long time to do the indexing
./distromatch --reindex --verbose

Then you can query it this way:

./distromatch $DISTRO $PKGNAME [$PKGNAME1 ...]

This would give you, for the package $PKGNAME in $DISTRO, the corresponding package names in all other distros for which we have data. If you do not provide package names, it automatically shows output for all packages in $DISTRO.

For example:

$ time ./distromatch debian libdigest-sha1-perl
debian:libdigest-sha1-perl fedora:perl-Digest-SHA1
debian:libdigest-sha1-perl mandriva:perl-Digest-SHA1
debian:libdigest-sha1-perl suse:perl-Digest-SHA1

real    0m0.073s
user    0m0.056s
sys 0m0.016s

Yes it's quick. It builds a Xapian index with the information it needs, and then it reuses it. As soon as I find a moment, I intend to deploy an instance of it on DDE.

It is using a range of different heuristics:

  • match packages by name;
  • match packages by desktop files contained within;
  • match packages by pkg-config metadata files contained within;
  • match packages by [/usr]/bin/* files contained within;
  • match packages by shared library files contained within;
  • match packages by devel library files contained within;
  • match packages by man pages contained within;
  • match stemmed form of development library package names;
  • match stemmed form of shared library package names;
  • match stemmed form of perl library package names;
  • match stemmed form of python library package names.

This list may soon become obsolete as more heuristics are implemented.

Heuristics will never cover all the corner cases we surely have, but the idea is that if we can match a sizable amount of packages, the rest can somehow be fixed by hand as needed.
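
To give an idea of what such a heuristic looks like, here is a hypothetical sketch (not distromatch's actual code) of stemming perl library package names so that the Debian and RPM naming conventions meet in the middle:

import re

def stem_perl(name):
    # Hypothetical sketch: reduce a perl library package name to a
    # distro-neutral stem (not distromatch's actual code)
    name = name.lower()
    m = re.match(r"^lib(.+)-perl$", name)   # Debian-style naming
    if m:
        return m.group(1)
    m = re.match(r"^perl-(.+)$", name)      # Fedora/Mandriva/SUSE-style naming
    if m:
        return m.group(1)
    return None

# Both names stem to "digest-sha1", so the packages can be matched:
print stem_perl("libdigest-sha1-perl")  # debian
print stem_perl("perl-Digest-SHA1")     # fedora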

The data it requires for a distribution should be rather straightforward to generate:

  1. a file which maps binary package names to source package names
  2. a file with the list of files in all the packages

For example:

$ ls -l dist-debian/
total 39688
-rw-r--r--  1 enrico enrico  1688249 Jan 20 17:37 binsrc
drwxr-xr-x  2 enrico enrico     4096 Jan 21 19:12 db
-rw-r--r--  1 enrico enrico 29960406 Jan 21 10:02 files.gz
-rw-r--r--  1 enrico enrico  8914771 Jan 21 18:39 interesting-files

$ head dist-debian/binsrc 
openoffice.org-dev openoffice.org
ext4-modules-2.6.32-5-4kc-malta-di linux-kernel-di-mipsel-2.6
linux-headers-2.6.30-2-common linux-2.6
libnspr4 nspr
ipfm ipfm
libforks-perl libforks-perl
med-physics debian-med
libntfs-3g-dev ntfs-3g
libguppi16 guppi
selinux selinux

$ zcat dist-debian/files.gz | head
memstat etc/memstat.conf
memstat usr/bin/memstat
memstat usr/share/doc/memstat/changelog.gz
memstat usr/share/doc/memstat/copyright
memstat usr/share/doc/memstat/memstat-tutorial.txt.gz
memstat usr/share/man/man1/memstat.1.gz
libdirectfb-dev usr/bin/directfb-config
libdirectfb-dev usr/bin/directfb-csource
libdirectfb-dev usr/include/directfb-internal/core/clipboard.h
libdirectfb-dev usr/include/directfb-internal/core/colorhash.h

interesting-files and db are generated when indexing.

To prove the usefulness of the idea (but does it need proving?), you can find in the same git repo a little example app (it took me 10 minutes to write) that uses the distromatch engine to export Debtags tags to other distributions:

$ ./exportdebtags fedora | head
memstat: admin::benchmarking, interface::commandline, role::program, use::monitor
libdirectfb-dev: devel::lang:c, devel::library, implemented-in::c, interface::framebuffer, role::devel-lib
libkonqsidebarplugin4a: implemented-in::c++, role::shared-lib, suite::kde, uitoolkit::qt
libemail-simple-perl: devel::lang:perl, devel::library, implemented-in::perl, role::devel-lib, role::shared-lib, works-with::mail
libpoe-component-pluggable-perl: devel::lang:perl, devel::library, implemented-in::perl, role::shared-lib
manpages-ja: culture::japanese, made-of::man, role::documentation
libhippocanvas-dev: devel::library, qa::low-popcon, role::devel-lib
libexpat-ocaml-dev: devel::lang:ocaml, devel::library, implemented-in::c, implemented-in::ocaml, role::devel-lib, works-with-format::xml
libgnutls-dev: devel::library, role::devel-lib, suite::gnu

Just in case this made you itch to play with Debtags in a non-Debian distribution, I've generated the full datasets for Fedora, Mandriva and OpenSUSE.

Others have been working on the same matching problem. After we started writing code, we became aware of existing work:

I'd like to make use of those efforts, maybe to cross-validate results, or maybe, even better, as yet another heuristic.

Update:

I built a simple distromatch query system into DDE!

Posted Sat Jan 22 01:40:50 2011 Tags:

fuss-launcher: an application launcher built on apt-xapian-index

Long ago I blogged about using apt-xapian-index to write an application launcher.

Now I just added a couple of new apt-xapian-index plugins that look like they have been made just for that.

In fact, they have indeed been made just for that.

After my blog post in 2008, people from Truelite and the FUSS project took up the challenge and wrote a launcher applet around my example engine.

The prototype has been quite successful in FUSS, and as a consequence I've been asked (and paid) to bring in some improvements.

The result, which I have just uploaded to NEW, is a package called fuss-launcher:

* New upstream release
   - Use newer apt-xapian-index: removed need of local index
   - Dragging a file in the launcher shows the applications that can open it
   - Remembers the applications launched more frequently
   - Allow to set a list of favourite applications

To get it:

  • apt-get install fuss-launcher (after it passed NEW);
  • or git clone http://git.fuss.bz.it/git/launcher.git/ and apt-get install python-gtk2 python-xapian python-xdg apt-xapian-index app-install-data

It requires apt-xapian-index >= 0.35.

To try it:

  1. Make sure your index is up to date, especially if you just installed app-install-data: just run update-apt-xapian-index as root.
  2. Run fuss-launcher.
  3. Click on the new tray icon to open the launcher dialog.
  4. Type some keywords and see the list of matching applications come to life as you type.

It's worth mentioning again that all this work was sponsored by Truelite and the Fuss project, which rocks.

Some screenshots:

When you open the launcher, by default it shows the most frequently started applications and the favourite applications:

launcher just opened

When you type some keywords, you get results as you type, and context-sensitive completion:

keyword search

When you drag a file on the launcher you only see the applications that can open that file:

drag files to the launcher

Posted Mon May 17 10:41:09 2010 Tags:
Posted Sat Jun 6 00:57:39 2009

Pages about Turbogears.

Custom function decorators with TurboGears 2

I am exposing some library functions using a TurboGears2 controller (see web-api-with-turbogears2). It turns out that some functions return a dict, some a list, some a string, and TurboGears 2 only allows JSON serialisation for dicts.

A simple work-around for this is to wrap the function result into a dict, something like this:

@expose("json")
@validate(validator_dispatcher, error_handler=api_validation_error)
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return dict(r=res)

It would be nice, however, to have an @webapi() decorator that automatically wraps the function result with the dict:

def webapi(func):
    def dict_wrap(*args, **kw):
        return dict(r=func(*args, **kw))
    return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This works, as long as @webapi appears last in the list of decorators. This is because if it appears last it will be the first to wrap the function, and so it will not interfere with the tg.decorators machinery.

Would it be possible to create a decorator that can be put anywhere among the decorator list? Yes, it is possible but tricky, and it gives me the feeling that it may break in any future version of TurboGears:

class webapi(object):
    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        # Migrate the decoration attribute to our new function
        if hasattr(func, 'decoration'):
            dict_wrap.decoration = func.decoration
            dict_wrap.decoration.controller = dict_wrap
            delattr(func, 'decoration')
        return dict_wrap

# ...in the controller...

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    @webapi
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

As a convenience, TurboGears 2 offers, in the decorators module, a way to build decorator "hooks":

class before_validate(_hook_decorator):
    '''A list of callables to be run before validation is performed'''
    hook_name = 'before_validate'

class before_call(_hook_decorator):
    '''A list of callables to be run before the controller method is called'''
    hook_name = 'before_call'

class before_render(_hook_decorator):
    '''A list of callables to be run before the template is rendered'''
    hook_name = 'before_render'

class after_render(_hook_decorator):
    '''A list of callables to be run after the template is rendered.

    Will be run before it is returned up the WSGI stack'''

    hook_name = 'after_render'

The way these are invoked can be found in the _perform_call function in tg/controllers.py.

To show an example use of those hooks, let's add some polygen wisdom to every data structure we return:

class wisdom(decorators.before_render):
    def __init__(self, grammar):
        super(wisdom, self).__init__(self.add_wisdom)
        self.grammar = grammar
    def add_wisdom(self, remainder, params, output):
        from subprocess import Popen, PIPE
        output["wisdom"] = Popen(["polyrun", self.grammar], stdout=PIPE).communicate()[0]

# ...in the controller...

    @wisdom("genius")
    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

These hooks cannot, however, be used for what I need, that is, to wrap the result inside a dict. The reason is that they are called like this:

        controller.decoration.run_hooks(
                'before_render', remainder, params, output)

and not in this way:

        output = controller.decoration.run_hooks(
                'before_render', remainder, params, output)

So it is possible to modify the output (if it is a mutable structure) but not to exchange it with something else.

Can we do even better? Sure we can. We can assimilate @expose and @validate inside @webapi to avoid repeating the same long list of decorators over and over again:

class webapi(object):
    def __init__(self, error_handler = None):
        self.error_handler = error_handler

    def __call__(self, func):
        def dict_wrap(*args, **kw):
            return dict(r=func(*args, **kw))
        res = expose("json")(dict_wrap)
        res = validate(validator_dispatcher, error_handler=self.error_handler)(res)
        return res

# ...in the controller...

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(e="validation error on input fields", form_errors=pylons.c.form_errors)

    @webapi(error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)

        # Return result
        return res

This got rid of @expose and @validate, and provides almost all the default values that I need. Unfortunately I could not find out how to access api_validation_error from the decorator so that I could pass it to the validator, so I am stuck with the inconvenience of having to pass it explicitly every time.

Posted Wed Nov 4 17:52:38 2009 Tags:

Building a web-based API with Turbogears2

I am using TurboGears2 to export a python API over the web. Every API method is wrapped by a controller method that validates the parameters and returns the results encoded in JSON.

The basic idea is this:

@expose("json")
def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
    # Call API
    res = self.engine.list_colours(filter, productID, maxResults)

    # Return result
    return res

To validate the parameters we can use forms; it's their job, after all:

class ListColoursForm(TableForm):
    fields = [
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
            twf.TextField("productID", help_text="Please enter the product ID"),
            twf.TextField("maxResults", validator=twfv.Int(min=0), default=200, size=5, help_text="Please enter the maximum number of results"),
    ]
list_colours_form=ListColoursForm()

#...

    @expose("json")
    @validate(list_colours_form, error_handler=list_colours_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

All straightforward so far. However, this means that we need two exposed methods for every API call: one for the API call and one error handler. For every API call, we have to type the name several times, which is error prone and risks getting things mixed up.

We can however have a single error handler for all methods:

def get_method():
    '''
    The method name is the first url component after the controller name that
    does not start with 'test'
    '''
    found_controller = False
    for name in pylons.c.url.split("/"):
        if not found_controller and name == "controllername":
            found_controller = True
            continue
        if name.startswith("test"):
            continue
        if found_controller:
            return name
    return None

class ValidatorDispatcher:
    '''
    Validate using the right form according to the value of the "method" field
    '''
    def validate(self, args, state):
        method = args.get("method", None)
        # Extract the method from the URL if it is missing
        if method is None:
            method = get_method()
            args["method"] = method
        return forms[method].validate(args, state)

validator_dispatcher = ValidatorDispatcher()

This validator will try to find the method name, either as a form field or by parsing the URL. It will then use the method name to find the form to use for validation, and pass control to the validate method of that form.

We then need to add an extra "method" field to our forms, and arrange the forms inside a dictionary:

class ListColoursForm(TableForm):
    fields = [
            # One hidden field to have a place for the method name
            twf.HiddenField("method"),
            # One field per parameter
            twf.TextField("filter", help_text="Please enter the string to use as a filter"),
    #...

forms["list_colours"] = ListColoursForm()

And now our methods become much nicer to write:

    @expose("json")
    def api_validation_error(self, **kw):
        pylons.response.status = "400 Error"
        return dict(form_errors=pylons.c.form_errors)

    @expose("json")
    @validate(validator_dispatcher, error_handler=api_validation_error)
    def list_colours(self, filter=None, productID=None, maxResults=100, **kw):
        # Parameter validation is done by the form
    
        # Call API
        res = self.engine.list_colours(filter, productID, maxResults)
    
        # Return result
        return res

api_validation_error is interesting: it returns a proper HTTP error status, and a JSON body with the details of the error, taken straight from the form validators. It took me a while to find out that the form errors are in pylons.c.form_errors (and for reference, the form values are in pylons.c.form_values). pylons.response is a WebOb Response that we can play with.

So now our client side is able to call the API methods, and get a proper error if it calls them wrong.
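
For example, a call with an out-of-range parameter would get back something like this (hypothetical transcript, with placeholder host and path; the message text comes from the FormEncode validator):

$ curl -i 'http://localhost:8080/myapp/list_colours?maxResults=-1'
HTTP/1.1 400 Error
Content-Type: application/json

{"form_errors": {"maxResults": "Please enter a number that is 0 or greater"}}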

But now that we have the forms ready, it doesn't take much to display them in web pages as well:

def _describe(self, method):
    "Return a dict describing an API method"
    ldesc = getattr(self.engine, method).__doc__.strip()
    sdesc = ldesc.split("\n")[0]
    return dict(name=method, sdesc = sdesc, ldesc = ldesc)

@expose("myappserver.templates.myappapi")
def index(self):
    '''
    Show an index of exported API methods
    '''
    methods = dict()
    for m in forms.keys():
        methods[m] = self._describe(m)
    return dict(methods=methods)

@expose('myappserver.templates.testform')
def testform(self, method, **kw):
    '''
    Show a form with the parameters of an API method
    '''
    kw["method"] = method
    return dict(method=method, action="/myapp/test/"+method, value=kw, info=self._describe(method), form=forms[method])

@expose(content_type="text/plain")
@validate(validator_dispatcher, error_handler=testform)
def test(self, method, **kw):
    '''
    Run an API method and show its prettyprinted result
    '''
    res = getattr(self, str(method))(**kw)
    return pprint.pformat(res)

In a few lines, we have all we need: an index of the API methods (including their documentation taken from the docstrings!), and for each method a form to invoke it and a page to see the results.

Make the forms children of AjaxForm, and you can even see the results together with the form.

Posted Thu Oct 15 15:45:39 2009 Tags:

TurboGears RemoteForm tip

In case your RemoteForm mysteriously behaves like a normal HTTP form, refreshing the page on submit, and the only hint that something is wrong is this bit in Iceweasel's error console:

Errore: uncaught exception: [Exception... "Component returned failure
code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIXMLHttpRequest.open]"
nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "JS frame ::
javascript: eval(__firebugTemp__); :: anonymous :: line 1"  data: no]

the problem can just be a missing action= attribute to the form.

I found out after:

  1. reading the TurboGears remoteform wiki: "For some reason, the RemoteForm is acting like a regular html form, serving up a new page instead of performing the replacements we're looking for. I'll update this page as soon as I figure out why this is happening."

  2. finding this page on Google and meditating for a while, while staring at it. I don't speak German, but often enough I manage to solve problems by meditating over Google results in all sorts of languages unknown or unreadable to me. I will call this practice Webomancy.

Posted Sat Jun 6 00:57:39 2009 Tags:

Linking to self in turbogears

I want to put in my master.kid some icons that allow changing the current language for the session.

First, all user-accessible methods need to handle a 'language' parameter:

@expose(template="myapp.templates.foobar")
def index(self, someparam, **kw):
    if 'language' in kw: turbogears.i18n.set_session_locale(kw['language'])

Then, we need a way to edit the current URL so that we can generate modified links to self that preserve the existing path_info and query parameters. In your main controller, add:

def linkself(**kw):
    params = {}
    params.update(cherrypy.request.params)
    params.update(kw)
    url = cherrypy.request.browser_url.split('?', 1)[0]
    return url + '?' + '&'.join(['='.join(x) for x in params.iteritems()])

def add_custom_stdvars(vars):
    return vars.update({"linkself": linkself})

turbogears.view.variable_providers.append(add_custom_stdvars)

(see the turbogears stdvars documentation and the cherrypy request documentation (cherrypy 2 documentation at the bottom of the page))

And finally, in master.kid:

<div id="footer">
  <div id="langselector">
    <span class="language">
      <a href="${tg.linkself(language='it_IT')}">
        <img src="${tg.url('/static/images/it.png')}"/>
      </a>
    </span>

    <span class="language">
      <a href="${tg.linkself(language='C')}">
        <img src="${tg.url('/static/images/en.png')}"/>
      </a>
    </span>
  </div><!-- langselector -->
</div><!-- footer -->
Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears quirks when testing controllers that use SingleSelectField

Suppose you have a User that can be a member of a Company. In SQLObject you model it somehow like this:

    class Company(SQLObject):
        name = UnicodeCol(length=16, alternateID=True, alternateMethodName="by_name")
        display_name = UnicodeCol(length=255)

    class User(InheritableSQLObject):
        company = ForeignKey("Company", notNull=False, cascade='null')

Then you want to make a form that allows choosing the company of a user:

def companies():
    return [ [ -1, 'None' ] ] + [ [c.id, c.display_name] for c in Company.select() ]

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies)

Ok. Now you want to run tests:

  1. nosetests imports the controller to see if there's any initialisation code.
  2. The NewUserFields class is created.
  3. The SingleSelectField is created.
  4. The SingleSelectField constructor tries to guess the validator and peeks at the first option.
  5. This calls companies.
  6. companies accesses the database.
  7. The testing database has not yet been created because nosetests imported the module before giving the test code a chance to setup the test database.
  8. Bang.

The solution is to add an explicit validator to disable this guessing code, which is a source of so much trouble:

class NewUserFields(WidgetsList):
    """Fields for editing general settings"""
    user_name = TextField(label="User name")
    companyID = SingleSelectField(label="Company", options=companies, validator=v.Int(not_empty=True))
Posted Sat Jun 6 00:57:39 2009 Tags:

Passing values to turbogears widgets at display time (the general case)

Last time I dug this up I was not clear enough in documenting my findings, so I had to find them again. Here is the second attempt.

In Turbogears, in order to pass parameters to arbitrary widgets in a compound widget, the syntax is:

form.display(PARAMNAME=dict(WIDGETNAME=VALUE))

And if you have more complex nested widgets and would like to know what goes on, this monkey patch is good for inspecting the params lookup functions:

import turbogears.widgets.forms
old_rpbp = turbogears.widgets.forms.retrieve_params_by_path
def inspect_rpbp(params, path):
    print "RPBP", repr(params), repr(path)
    res = old_rpbp(params, path)
    print "RPBP RES", res
    return res
turbogears.widgets.forms.retrieve_params_by_path = inspect_rpbp

The code for the lookup itself is, as the name suggests, in the retrieve_params_by_path function in the file widgets/forms.py in the Turbogears source code.

Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears form quirk

I had a great idea:

@validate(model_form)
@error_handler()
@expose(template='kid:myproject.templates.new')
def new(self, id, tg_errors=None, **kw):
    """Create new records in model"""
    if tg_errors:
        # Ask until there is still something missing
        return dict(record = defaults, form = model_form)
    else:
        # We have everything: save it
        i = Item(**kw)
        flash("Item was successfully created.")
        raise redirect("../show/%d" % i.id)

It was perfect: one simple method, simple error handling, nice helpful messages all around. Except, check boxes and select fields would not get the default values while all other fields would.

After two hours searching and cursing and tracing things into widget code, I found this bit in InputWidget.adjust_value:

# there are some input fields that when nothing is checked/selected
# instead of sending a nice name="" are totally missing from
# input_values, this little workaround let's us manage them nicely
# without interfering with other types of fields, we need this to
# keep track of their empty status otherwise if the form is going to be
# redisplayed for some errors they end up to use their defaults values
# instead of being empty since FE doesn't validate a failing Schema.
# posterity note: this is also why we need if_missing=None in
# validators.Schema, see ticket #696.

So, what is happening here is that since check boxes and option fields don't behave nicely when unselected, turbogears has to work around it. In order to detect the difference between "I selected 'None'" and "I didn't select anything", it reasons that if the input has been validated, then the user has made some selections, so a missing value defaults to "the user selected 'None'". If the input has not been validated, then we're showing the form for the first time, and a missing value means "use the default provided".

Since I was doing the validation all the time, this meant that Checkboxes and Select fields would never use the default values.

Hence, if you use those fields then you necessarily need two different controller methods, one to present the form and one to save it:

@expose(template='kid:myproject.templates.new')
def new(self, id, **kw):
    """Create new records in model"""
    return dict(record = defaults(), form = model_form)

@validate(model_form)
@error_handler(new)
@expose()
def savenew(self, id, **kw):
    """Create new records in model"""
    i = Item(**kw)
    flash("Item was successfully created.")
    raise redirect("../show/%d"%i.id)

If someone else stumbles on the same problem, I hope they'll find this post and they won't have to spend another two awful hours tracking it down again.

Posted Sat Jun 6 00:57:39 2009 Tags:

Passing values to turbogears widgets at display time

In turbogears, I often need to pass data to widgets at display time. Sometimes it works automatically, but sometimes, in cases like passing option lists to CheckBoxLists or number of repetitions in a RepeatingFieldSet, it doesn't.

All the examples use precomputed lists or pass simple code functions. In most of my cases, I want them computed by the controller every time.

Passing a function hasn't worked, as I did not find any obvious way to have the function know about the controller.

So I need to pass things to the display() method of the widgets, but I could not work out how to pass the option list and default list for a CheckBoxList that is part of a WidgetsList in a TableForm.

On IRC came the answer, thanks to Xentac:

you should be able to...
    tableform.display(options=dict(checkboxname=[optionlist]))

And yes, it works. I can pass the default value as one of the normal form values:

    tableform.display(values=dict(checkboxname=[values]), options=dict(checkboxname=[optionlist]))
Posted Sat Jun 6 00:57:39 2009 Tags:

File downloads with TurboGears

In TurboGears, I had to implement a file download method, but the file required access controls so it was put in a directory not exported by Apache.

In #turbogears I've been pointed at: http://cherrypy.org/wiki/FileDownload and this is everything put together:

from cherrypy.lib.cptools import serveFile
# In cherrypy 3 it should be:
#from cherrypy.lib.static import serve_file

@expose()
def get(self, *args, **kw):
    """Access the file pointed by the given path"""
    pathname = check_auth_and_compute_pathname()
    return serveFile(pathname)

Then I needed to export some CSV:

@expose()
def getcsv(self, *args, **kw):
    """Get the data in CSV format"""
    rows = compute_data_rows()
    headers = compute_headers(rows)
    filename = compute_file_name()

    cherrypy.response.headers['Content-Type'] = "application/x-download"
    cherrypy.response.headers['Content-Disposition'] = 'attachment; filename="'+filename+'"'

    csvdata = StringIO.StringIO()
    writer = csv.writer(csvdata)
    writer.writerow(headers)
    writer.writerows(rows)

    return csvdata.getvalue()

In my case it's not an issue, as I can only compute the headers after I have computed all the data, but I still have to find out how to serve the CSV file while I'm generating it, instead of storing it all into a big string and returning that.
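
One direction to try is returning a generator so that each row is sent as it is computed; a minimal sketch, assuming the server is configured to stream responses rather than buffer them:

import csv
import StringIO

def gencsv(headers, rows):
    # Hypothetical sketch: yield the CSV one row at a time instead of
    # building one big string; it only helps if the web server streams
    # the response instead of buffering it
    def fmt(row):
        buf = StringIO.StringIO()
        csv.writer(buf).writerow(row)
        return buf.getvalue()
    yield fmt(headers)
    for row in rows:
        yield fmt(row)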

Posted Sat Jun 6 00:57:39 2009 Tags:

Turbogears i18n quirks

Collecting strings from .kid files

tg-admin i18n collect won't collect strings from your .kid files: you need the toolbox web interface for that.

Indentation problems in .kid files

The toolbox web interface chokes on indentation errors in your .kid files.

To see the name of the .kid file that causes the error, look at the tg-admin toolbox output in the terminal for lines like Working on app/Foo/templates/bar.kid.

What happens is that the .kid files are converted to python files, and if there are indentation glitches they end up in the python files, and python will complain.

Once you see from the tg-admin toolbox standard error which .kid file has the problem, edit it and try to make sure that all closing tags are at exactly the same indentation level as their corresponding opening tags. Even a single space matters.

Bad i18n bug in TurboKid versions earlier than 1.0.1

faide on #turbogears also says:

It is of utmost importance that you use TurboKid 1.0.1 because it is the first version that corrects a BIG bug regarding i18n filters ...

The version below had a bug where the filters kept being added at each page load in such a way that after a few hundreds of pages you could have page loading times as long as 5 minutes!

If one has a previous version of TurboKid, one (and only one) of these is needed:

So, in short, all i18n users should upgrade to TurboGears 1.0.2.2 or patch TurboKid using http://trac.turbogears.org/ticket/1301.

Posted Sat Jun 6 00:57:39 2009 Tags:
Posted Sat Jun 6 00:57:39 2009
osm

Pages about OpenStreetMap.

Computing time offsets between EXIF and GPS

I like the idea of matching photos to GPS traces. In Debian there is gpscorrelate but it's almost unusable to me because of bug #473362 and it has an awkward way of specifying time offsets.

Here at SoTM10 someone told me that exiftool gained -geosync and -geotag options. So it's just a matter of creating a little tool that shows a photo and asks you to type the GPS time you see in it.
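
Once the tool has printed the offset, the actual geotagging would then be a single exiftool run; hypothetically, with placeholder paths and the -geosync value taken from the tool's output:

exiftool -geosync=+0:01:12 -geotag trace.gpx /path/to/photos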

Apparently there are no bindings or GIR files for gtkimageview in Debian, so I'll have to use C.

Here is a C prototype:

/*
 * gpsoffset - Compute EXIF time offset from a photo of a gps display
 *
 * Use with exiftool -geosync=... -geotag trace.gpx DIR
 *
 * Copyright (C) 2009--2010  Enrico Zini <enrico@enricozini.org>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 */


#define _XOPEN_SOURCE /* glibc2 needs this */
#include <time.h>
#include <gtkimageview/gtkimageview.h>
#include <libexif/exif-data.h>
#include <stdio.h>
#include <stdlib.h>

static int load_time(const char* fname, struct tm* tm)
{
    ExifData* exif_data = exif_data_new_from_file(fname);
    ExifEntry* exif_time = exif_data_get_entry(exif_data, EXIF_TAG_DATE_TIME);
    if (exif_time == NULL)
    {
        fprintf(stderr, "Cannot find EXIF timestamp\n");
        return -1;
    }

    char buf[1024];
    exif_entry_get_value(exif_time, buf, 1024);
    //printf("val2: %s\n", exif_entry_get_value(t2, buf, 1024));

    if (strptime(buf, "%Y:%m:%d %H:%M:%S", tm) == NULL)
    {
        fprintf(stderr, "Cannot match EXIF timestamp\n");
        return -1;
    }

    return 0;
}

static time_t exif_ts;
static GtkWidget* res_lbl;

void date_entry_changed(GtkEditable *editable, gpointer user_data)
{
    const gchar* text = gtk_entry_get_text(GTK_ENTRY(editable));
    struct tm parsed = {0}; /* zero tm_isdst & co. before strptime/mktime */
    if (strptime(text, "%Y-%m-%d %H:%M:%S", &parsed) == NULL)
    {
        gtk_label_set_text(GTK_LABEL(res_lbl), "Please enter a date as YYYY-MM-DD HH:MM:SS");
    } else {
        time_t img_ts = mktime(&parsed);
        int c;
        int res;
        if (exif_ts < img_ts)
        {
            c = '+';
            res = img_ts - exif_ts;
        }
        else
        {
            c = '-';
            res = exif_ts - img_ts;
        }
        char buf[1024];
        if (res >= 3600)
            snprintf(buf, 1024, "Result: %c%ds -geosync=%c%d:%02d:%02d",
                    c, res, c, res / 3600, (res / 60) % 60, res % 60);
        else if (res >= 60)
            snprintf(buf, 1024, "Result: %c%ds -geosync=%c%02d:%02d",
                    c, res, c, (res / 60) % 60, res % 60);
        else
            snprintf(buf, 1024, "Result: %c%ds -geosync=%c%d",
                    c, res, c, res);
        gtk_label_set_text(GTK_LABEL(res_lbl), buf);
    }
}

int main (int argc, char *argv[])
{
    // Work in UTC to avoid mktime applying DST or timezones
    setenv("TZ", "UTC", 1);

    const char* filename = "/home/enrico/web-eddie/galleries/2010/04-05-Uppermill/P1080932.jpg";

    gtk_init (&argc, &argv);

    struct tm exif_time = {0};
    if (load_time(filename, &exif_time) != 0)
        return 1;

    printf("EXIF time: %s\n", asctime(&exif_time));
    exif_ts = mktime(&exif_time);

    GtkWidget* window = gtk_window_new(GTK_WINDOW_TOPLEVEL);
    GtkWidget* vb = gtk_vbox_new(FALSE, 0);
    GtkWidget* hb = gtk_hbox_new(FALSE, 0);
    GtkWidget* lbl = gtk_label_new("Timestamp:");
    GtkWidget* exif_lbl;
    {
        char buf[1024];
        strftime(buf, 1024, "EXIF time: %Y-%m-%d %H:%M:%S", &exif_time);
        exif_lbl = gtk_label_new(buf);
    }
    GtkWidget* date_ent = gtk_entry_new();
    res_lbl = gtk_label_new("Result:");
    GtkWidget* view = gtk_image_view_new();
    GdkPixbuf* pixbuf = gdk_pixbuf_new_from_file(filename, NULL);

    gtk_box_pack_start(GTK_BOX(hb), lbl, FALSE, TRUE, 0);
    gtk_box_pack_start(GTK_BOX(hb), date_ent, TRUE, TRUE, 0);

    gtk_signal_connect(GTK_OBJECT(date_ent), "changed", (GCallback)date_entry_changed, NULL);
    {
        char buf[1024];
        strftime(buf, 1024, "%Y-%m-%d %H:%M:%S", &exif_time);
        gtk_entry_set_text(GTK_ENTRY(date_ent), buf);
    }

    gtk_widget_set_size_request(view, 500, 400);
    gtk_image_view_set_pixbuf(GTK_IMAGE_VIEW(view), pixbuf, TRUE);
    gtk_container_add(GTK_CONTAINER(window), vb);
    gtk_box_pack_start(GTK_BOX(vb), view, TRUE, TRUE, 0);
    gtk_box_pack_start(GTK_BOX(vb), hb, FALSE, TRUE, 0);
    gtk_box_pack_start(GTK_BOX(vb), exif_lbl, FALSE, TRUE, 0);
    gtk_box_pack_start(GTK_BOX(vb), res_lbl, FALSE, TRUE, 0);
    gtk_widget_show_all(window);

    gtk_main ();

    return 0;
}

And here is its simple makefile:

CFLAGS=$(shell pkg-config --cflags gtkimageview libexif)
LDLIBS=$(shell pkg-config --libs gtkimageview libexif)

gpsoffset: gpsoffset.c

It's a simple prototype but it's a working prototype and seems to do the job for me.

I currently cannot figure out why, after I click on the text box, there seems to be no way to give the focus back to the image viewer so I can control it with keys.

There is another nice algorithm for computing time offsets that could be implemented: you choose a photo taken from a known place and drag it onto that place on a map; you can then look for the nearest point on your GPX trace and compute the time offset from that.
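
A minimal sketch of that algorithm (hypothetical code; trace is a list of (timestamp, lat, lon) tuples and photo_ts is the photo's EXIF timestamp):

def time_offset(photo_ts, photo_lat, photo_lon, trace):
    # Hypothetical sketch of the drag-on-map algorithm: find the trace
    # point nearest to where the photo was taken and use its timestamp
    # to compute the offset between EXIF time and GPS time
    def dist2(point):
        ts, lat, lon = point
        return (lat - photo_lat) ** 2 + (lon - photo_lon) ** 2
    nearest = min(trace, key=dist2)
    return photo_ts - nearest[0]  # seconds, ready to feed to -geosync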

I have seen that there are programs for geotagging photos that implement all such algorithms, and have a nice UI, but I haven't seen any in Debian.

Is there any such software that could be packaged?

If not, the interpolation and annotation tasks can already be performed by exiftool, so it's just a matter of building a good UI, and I would love to see someone pick up the task.

Posted Sun Jul 11 12:34:04 2010 Tags:

Searching OSM nodes in Spatialite

Third step of my SoTM10 pet project: finding the POIs.

I put together a query to find all nodes with a given tag inside a bounding box, and also a query to find all the tag values for a given tag name inside a bounding box.

The result is this simple POI search engine:

#
# poisearch - simple geographical POI search engine
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#

from pysqlite2 import dbapi2 as sqlite
import simplejson

class PoiDB(object):
    def __init__(self):
        self.db = sqlite.connect("pois.db")
        self.db.enable_load_extension(True)
        self.db.execute("SELECT load_extension('libspatialite.so')")
        self.oldsearch = []
        self.bbox = None

    def set_bbox(self, xmin, xmax, ymin, ymax):
        '''Set bbox for searches'''
        self.bbox = (xmin, xmax, ymin, ymax)

    def tagid(self, name, val):
        '''Get the database ID for a tag'''
        c = self.db.cursor()
        c.execute("SELECT id FROM tag WHERE name=? AND value=?", (name, val))
        res = None
        for row in c:
            res = row[0]
        return res

    def tagnames(self):
        '''Get all tag names'''
        c = self.db.cursor()
        c.execute("SELECT DISTINCT name FROM tag ORDER BY name")
        for row in c:
            yield row[0]

    def tagvalues(self, name, use_bbox=False):
        '''
        Get all tag values for a given tag name,
        optionally in the current bounding box
        '''
        c = self.db.cursor()
        if self.bbox is None or not use_bbox:
            c.execute("SELECT DISTINCT value FROM tag WHERE name=? ORDER BY value", (name,))
        else:
            c.execute("SELECT DISTINCT tag.value FROM poi, poitag, tag"
                      " WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE ("
                      "       xmin >= ? AND xmax <= ? AND ymin >= ? AND ymax <= ?) )"
                      "   AND poitag.tag = tag.id AND poitag.poi = poi.id"
                      "   AND tag.name=?",
                      self.bbox + (name,))
        for row in c:
            yield row[0]

    def search(self, name, val):
        '''Get all name:val tags in the current bounding box'''
        # First resolve the tagid
        tagid = self.tagid(name, val)
        if tagid is None: return

        c = self.db.cursor()
        c.execute("SELECT poi.name, poi.data, X(poi.geom), Y(poi.geom) FROM poi, poitag"
                  " WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE ("
                  "       xmin >= ? AND xmax <= ? AND ymin >= ? AND ymax <= ?) )"
                  "   AND poitag.tag = ? AND poitag.poi = poi.id",
                  self.bbox + (tagid,))
        self.oldsearch = []
        for row in c:
            self.oldsearch.append(row)
            yield row[0], simplejson.loads(row[1]), row[2], row[3]

    def count(self, name, val):
        '''Count all name:val tags in the current bounding box'''
        # First resolve the tagid
        tagid = self.tagid(name, val)
        if tagid is None: return

        c = self.db.cursor()
        c.execute("SELECT COUNT(*) FROM poi, poitag"
                  " WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE ("
                  "       xmin >= ? AND xmax <= ? AND ymin >= ? AND ymax <= ?) )"
                  "   AND poitag.tag = ? AND poitag.poi = poi.id",
                  self.bbox + (tagid,))
        for row in c:
            return row[0]

    def replay(self):
        for row in self.oldsearch:
            yield row[0], simplejson.loads(row[1]), row[2], row[3]
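
A hypothetical usage session, assuming a pois.db built with the importer described in the post below:

db = PoiDB()
db.set_bbox(2.80, 2.85, 41.97, 42.00)   # xmin, xmax, ymin, ymax
for name, data, x, y in db.search("amenity", "fountain"):
    print name, x, y
print db.count("amenity", "fountain"), "fountains in the box"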

Problem 3 solved: now on to the next step, building a user interface for it.

Posted Sat Jul 10 15:50:31 2010 Tags:

Importing OSM nodes into Spatialite

Second step of my SoTM10 pet project: creating a searchable database with the points. What a fantastic opportunity to learn Spatialite.

Learning Spatialite is easy. For example, you can use the two tutorials with catchy titles that assume your best wish in life is to create databases out of shapefiles using a pre-built, i386-only executable GUI binary downloaded over an insecure HTTP connection.

To be fair, the second of those tutorials is called "An almost Idiot's Guide", making explicit the requirement of being an almost idiot in order to happily acquire and run software that way.

Alternatively, you can use A quick tutorial to SpatiaLite, which is so quick it has examples that lead you to write SQL queries that trigger all sorts of vague exceptions at insert time. But at least it brought me a long way forward, at which point I could cross-reference things with the PostGIS documentation to find out the right way of doing things.

So, here's the importer script, which will probably become my reference example for how to get started with Spatialite, and how to use Spatialite from Python:

#!/usr/bin/python

#
# poiimport - import nodes from OSM into a spatialite DB
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#

import xml.sax
import xml.sax.handler
from pysqlite2 import dbapi2 as sqlite
import simplejson
import sys
import os

class OSMPOIReader(xml.sax.handler.ContentHandler):
    '''
    Filter SAX events in a OSM XML file to keep only nodes with names
    '''
    def __init__(self, consumer):
        self.consumer = consumer

    def startElement(self, name, attrs):
        if name == "node":
            self.attrs = attrs
            self.tags = dict()
        elif name == "tag":
            self.tags[attrs["k"]] = attrs["v"]

    def endElement(self, name):
        if name == "node":
            lat = float(self.attrs["lat"])
            lon = float(self.attrs["lon"])
            id = int(self.attrs["id"])
            #dt = parse(self.attrs["timestamp"])
            uid = self.attrs.get("uid", None)
            uid = int(uid) if uid is not None else None
            user = self.attrs.get("user", None)

            self.consumer(lat, lon, id, self.tags, user=user, uid=uid)

class Importer(object):
    '''
    Create the spatialite database and populate it
    '''
    TAG_WHITELIST = set(["amenity", "shop", "tourism", "place"])

    def __init__(self, filename):
        self.db = sqlite.connect(filename)
        self.db.enable_load_extension(True)
        self.db.execute("SELECT load_extension('libspatialite.so')")
        self.db.execute("SELECT InitSpatialMetaData()")
        self.db.execute("INSERT INTO spatial_ref_sys (srid, auth_name, auth_srid,"
                        " ref_sys_name, proj4text) VALUES (4326, 'epsg', 4326,"
                        " 'WGS 84', '+proj=longlat +ellps=WGS84 +datum=WGS84"
                        " +no_defs')")
        self.db.execute("CREATE TABLE poi (id int not null unique primary key,"
                        " name char, data text)")
        self.db.execute("SELECT AddGeometryColumn('poi', 'geom', 4326, 'POINT', 2)")
        self.db.execute("SELECT CreateSpatialIndex('poi', 'geom')")
        self.db.execute("CREATE TABLE tag (id integer primary key autoincrement,"
                        " name char, value char)")
        self.db.execute("CREATE UNIQUE INDEX tagidx ON tag (name, value)")
        self.db.execute("CREATE TABLE poitag (poi int not null, tag int not null)")
        self.db.execute("CREATE UNIQUE INDEX poitagidx ON poitag (poi, tag)")
        self.tagid_cache = dict()

    def tagid(self, k, v):
        key = (k, v)
        res = self.tagid_cache.get(key, None)
        if res is None:
            c = self.db.cursor()
            c.execute("SELECT id FROM tag WHERE name=? AND value=?", key)
            for row in c:
                self.tagid_cache[key] = row[0]
                return row[0]
            self.db.execute("INSERT INTO tag (id, name, value) VALUES (NULL, ?, ?)", key)
            c.execute("SELECT last_insert_rowid()")
            for row in c:
                res = row[0]
            self.tagid_cache[key] = res
        return res

    def __call__(self, lat, lon, id, tags, user=None, uid=None):
        # Acquire tag IDs
        tagids = []
        for k, v in tags.iteritems():
            if k not in self.TAG_WHITELIST: continue
            for val in v.split(";"):
                tagids.append(self.tagid(k, val))

        # Skip elements that don't have the tags we want
        if not tagids: return

        geom = "POINT(%f %f)" % (lon, lat)
        self.db.execute("INSERT INTO poi (id, geom, name, data)"
                        "     VALUES (?, GeomFromText(?, 4326), ?, ?)", 
                (id, geom, tags.get("name"), simplejson.dumps(tags)))  # some POIs have no name tag

        for tid in tagids:
            self.db.execute("INSERT INTO poitag (poi, tag) VALUES (?, ?)", (id, tid))


    def done(self):
        self.db.commit()

# Get the output file name
filename = sys.argv[1]

# Ensure we start from scratch
if os.path.exists(filename):
    print >>sys.stderr, filename, "already exists"
    sys.exit(1)

# Import
parser = xml.sax.make_parser()
importer = Importer(filename)
handler = OSMPOIReader(importer)
parser.setContentHandler(handler)
parser.parse(sys.stdin)
importer.done()

Let's run it:

$ ./poiimport pois.db < pois.osm 
SpatiaLite version ..: 2.4.0    Supported Extensions:
        - 'VirtualShape'        [direct Shapefile access]
        - 'VirtualDbf'          [direct Dbf access]
        - 'VirtualText'         [direct CSV/TXT access]
        - 'VirtualNetwork'      [Dijkstra shortest path]
        - 'RTree'               [Spatial Index - R*Tree]
        - 'MbrCache'            [Spatial Index - MBR cache]
        - 'VirtualFDO'          [FDO-OGR interoperability]
        - 'SpatiaLite'          [Spatial SQL - OGC]
PROJ.4 Rel. 4.7.1, 23 September 2009
GEOS version 3.2.0-CAPI-1.6.0
$ ls -l --si pois*
-rw-r--r-- 1 enrico enrico 17M Jul  9 23:44 pois.db
-rw-r--r-- 1 enrico enrico 37M Jul  9 16:20 pois.osm
$ spatialite pois.db
SpatiaLite version ..: 2.4.0    Supported Extensions:
        - 'VirtualShape'        [direct Shapefile access]
        - 'VirtualDbf'          [direct DBF access]
        - 'VirtualText'         [direct CSV/TXT access]
        - 'VirtualNetwork'      [Dijkstra shortest path]
        - 'RTree'               [Spatial Index - R*Tree]
        - 'MbrCache'            [Spatial Index - MBR cache]
        - 'VirtualFDO'          [FDO-OGR interoperability]
        - 'SpatiaLite'          [Spatial SQL - OGC]
PROJ.4 version ......: Rel. 4.7.1, 23 September 2009
GEOS version ........: 3.2.0-CAPI-1.6.0
SQLite version ......: 3.6.23.1
Enter ".help" for instructions
spatialite> select id from tag where name="amenity" and value="fountain";
24
spatialite> SELECT poi.name, poi.data, X(poi.geom), Y(poi.geom) FROM poi, poitag WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE (xmin >= 2.56 AND xmax <= 2.90 AND ymin >= 41.84 AND ymax <= 42.00) ) AND poitag.tag = 24 AND poitag.poi = poi.id;
Font Picant de la Cellera|{"amenity": "fountain", "name": "Font Picant de la Cellera"}|2.616045|41.952449
Font de Can Pla|{"amenity": "fountain", "name": "Font de Can Pla"}|2.622354|41.974724
Font de Can Ribes|{"amenity": "fountain", "name": "Font de Can Ribes"}|2.62311|41.979193

It's impressive: I've got all sorts of useful information for the whole of Spain in just 17Mb!

Let's put it to practice: I'm thirsty, is there any water fountain nearby?

spatialite> SELECT count(1) FROM poi, poitag WHERE poi.rowid IN (SELECT pkid FROM idx_poi_geom WHERE (xmin >= 2.80 AND xmax <= 2.85 AND ymin >= 41.97 AND ymax <= 42.00) ) AND poitag.tag = 24 AND poitag.poi = poi.id;
0

Ouch! No water fountains mapped in Girona... yet.

Problem 2 solved: now on to the next step, trying to show the results in some usable way.

Posted Sat Jul 10 09:10:35 2010 Tags:

Filtering nodes out of OSM files

I have a pet project here at SoTM10: create a tool for searching nearby POIs while offline.

The idea is to have something in my pocket (FreeRunner or N900), which doesn't require an internet connection, and which can point me at the nearest fountains, post offices, atm machines, bars and so on.

The first step is to obtain a list of POIs.

In theory one can use Xapi but all the known Xapi servers appear to be down at the moment.

Another attempt is to obtain it by filtering all nodes with the tags we want out of a planet OSM extract. I downloaded the Spanish one and set to work.

First I tried with xmlstarlet, but it ate all the RAM and crashed my laptop, because for some reason, on my laptop, Linux kernels up to 2.6.32 (I don't know about later ones) like to swap out ALL running apps to cache I/O operations. This means that heavy I/O operations swap out the very programs performing them, so the system gets caught in some infinite I/O loop and dies. Or at least this is what I've figured out so far.

So, we need SAX. I put together this prototype in Python, which can process a nice 8MB/s of OSM data for quite some time with a constant, low RAM usage:

#!/usr/bin/python

#
# poifilter - extract interesting nodes from OSM XML files
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
#


import xml.sax
import xml.sax.handler
import xml.sax.saxutils
import sys

class XMLSAXFilter(xml.sax.handler.ContentHandler):
    '''
    A SAX filter that is a ContentHandler.

    There is xml.sax.saxutils.XMLFilterBase in the standard library but it is
    undocumented, and most of the examples using it you find online are wrong.
    You can look at its source code, and at that point you find out that it is
    an offensive practical joke.
    '''
    def __init__(self, downstream):
        self.downstream = downstream

    # ContentHandler methods

    def setDocumentLocator(self, locator):
        self.downstream.setDocumentLocator(locator)

    def startDocument(self):
        self.downstream.startDocument()

    def endDocument(self):
        self.downstream.endDocument()

    def startPrefixMapping(self, prefix, uri):
        self.downstream.startPrefixMapping(prefix, uri)

    def endPrefixMapping(self, prefix):
        self.downstream.endPrefixMapping(prefix)

    def startElement(self, name, attrs):
        self.downstream.startElement(name, attrs)

    def endElement(self, name):
        self.downstream.endElement(name)

    def startElementNS(self, name, qname, attrs):
        self.downstream.startElementNS(name, qname, attrs)

    def endElementNS(self, name, qname):
        self.downstream.endElementNS(name, qname)

    def characters(self, content):
        self.downstream.characters(content)

    def ignorableWhitespace(self, chars):
        self.downstream.ignorableWhitespace(chars)

    def processingInstruction(self, target, data):
        self.downstream.processingInstruction(target, data)

    def skippedEntity(self, name):
        self.downstream.skippedEntity(name)

class OSMPOIHandler(XMLSAXFilter):
    '''
    Filter SAX events in a OSM XML file to keep only nodes with names
    '''
    PASSTHROUGH = ["osm", "bound"]
    TAG_WHITELIST = set(["amenity", "shop", "tourism", "place"])

    def startElement(self, name, attrs):
        if name in self.PASSTHROUGH:
            self.downstream.startElement(name, attrs)
        elif name == "node":
            self.attrs = attrs
            self.tags = []
            self.propagate = False
        elif name == "tag":
            if self.tags is not None:
                self.tags.append(attrs)
                if attrs["k"] in self.TAG_WHITELIST:
                    self.propagate = True
        else:
            self.tags = None
            self.attrs = None

    def endElement(self, name):
        if name in self.PASSTHROUGH:
            self.downstream.endElement(name)
        elif name == "node":
            if self.propagate:
                # Replay the buffered node to the downstream handler
                self.downstream.startElement("node", self.attrs)
                for attrs in self.tags:
                    self.downstream.startElement("tag", attrs)
                    self.downstream.endElement("tag")
                self.downstream.endElement("node")

    def ignorableWhitespace(self, chars):
        pass

    def characters(self, content):
        pass

# Simple stdin->stdout XML filter
parser = xml.sax.make_parser()
handler = OSMPOIHandler(xml.sax.saxutils.XMLGenerator(sys.stdout, "utf-8"))
parser.setContentHandler(handler)
parser.parse(sys.stdin)

Let's run it:

$ bzcat /store/osm/spain.osm.bz2 | pv | ./poifilter > pois.osm
[...]
$ ls -l --si pois.osm
-rw-r--r-- 1 enrico enrico 19M Jul 10 23:56 pois.osm
$ xmlstarlet val pois.osm 
pois.osm - valid

Problem 1 solved; now on to the next step: importing the nodes into a database.

Posted Fri Jul 9 16:28:15 2010 Tags:

Mapping using the Openmoko FreeRunner headset

The FreeRunner has a headset which includes a microphone and a button. When doing OpenStreetMap mapping, it would be very useful to be able to keep tangogps on the display and be able to mark waypoints using the headset button, and to record an audio track using the headset microphone.

In this way, I can use tangogps to see where I need to go, where it's already mapped and where it isn't, and then I can use the headset to mark waypoints corresponding to the audio track, so that later I can take advantage of JOSM's audio mapping features.

Enter audiomap:

$ audiomap --help
Usage: audiomap [options]

Create a GPX and audio track

Options:
  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -v, --verbose  verbose mode
  -m, --monitor  only keep the GPS on and monitor satellite status
  -l, --levels   only show input levels

If called without parameters, or with -v, which is suggested, it will:

  1. Fix the mixer settings so that it can record from the headset and detect headset button presses.
  2. Show a monitor of GPS satellite information until it gets a fix.
  3. Synchronize the system time with the GPS time so that the timestamps of the files that are created afterwards are accurate.
  4. Start recording a GPX track.
  5. Start recording audio.
  6. Record a GPX waypoint for every headset button press.

When you are done, you stop audiomap with ^C and it will properly close the .wav file, close the tags in the GPX waypoint and track files and restore the mixer settings.
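
The clean shutdown on ^C is a pattern worth showing; here is a minimal sketch of the idea (my own illustration with a fake recording loop, not audiomap's actual code):

#!/usr/bin/python
# Sketch of closing the GPX tags cleanly on ^C; an illustration of the
# idea, not audiomap's actual code.
import time

def record_track(gpx):
    gpx.write('<gpx><trk><trkseg>\n')
    try:
        while True:
            # Stand-in for polling the GPS: write a fixed point once a second
            gpx.write('<trkpt lat="44.5" lon="11.3"/>\n')
            time.sleep(1)
    except KeyboardInterrupt:
        # ^C: stop recording and fall through to the cleanup
        pass
    finally:
        # Always close the open tags, so the file is left as valid XML
        gpx.write('</trkseg></trk></gpx>\n')

if __name__ == "__main__":
    with open("track.gpx", "w") as gpx:
        record_track(gpx)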

You can unplug the headset and record using the handset microphone, but then you will not be able to set waypoints until you plug the headset back in.

After you stop audiomap, you will have a track, waypoints and .wav file ready to be loaded in JOSM.

Big thanks go to Luca Capello for finding out how to detect headset button presses.

Posted Sun Jun 7 23:51:37 2009 Tags:

Simple tool to query the GPS using the OpenMoko FSO stack

I was missing a simple command line tool that allows me to perform basic GPS queries in shell scripts.

Enter getgps:

# getgps --help
Usage: getgps [options]

Simple GPS query tool for the FSO stack

Options:
  --version          show program's version number and exit
  -h, --help         show this help message and exit
  -v, --verbose      verbose mode
  -q, --quiet        suppress normal output
  --fix              check if we have a fix
  -s, --sync-time    set system time from GPS time
  --info             get all GPS information
  --info-connection  get GPS connection information
  --info-fix         get GPS fix information
  --info-position    get GPS position information
  --info-accuracy    get GPS accuracy information
  --info-course      get GPS course information
  --info-time        get GPS time information
  --info-satellite   get GPS satellite information

So finally I can write little GPS-aware scripts:

if getgps --fix -q
then
    start_gps_aware_program
else
    start_gps_normal_program
fi

Or this.

Posted Sun Jun 7 17:59:32 2009 Tags:

Voice-controlled waypoints

I have it on my TODO list to implement taking waypoints when pressing the headset button of the openmoko, but that is not done yet.

In the meantime, I did some experiments with audio mapping, and since I did not manage to enter waypoints while recording, I was looking for a way to make use of the recordings anyway.

Enter findvoice:

$ ./findvoice  --help
Usage: findvoice [options] wavfile

Find the times in the wav file when there is clear voice among the noise

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -v, --verbose         verbose mode
  -p NUM, --percentile=NUM
            percentile to use to discriminate noise from voice
            (default: 90)
  -t, --timestamps      print timestamps instead of human readable information

You give it a wav file, and it will output a list of timestamps corresponding to where it thinks that you were talking clearly and near the FreeRunner / voice recorder, instead of leaving the recorder dangling to pick up background noise.

Its algorithm is crude and improvised, because I have no background whatsoever in audio processing, but it basically finds those parts of the audio file where the variance of the samples is above a given percentile: the higher the percentile, the fewer timestamps you get; the lower the percentile, the more likely it is to pick up periods of louder noise.
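
Just to illustrate the idea (a sketch of mine, not the actual findvoice code; the fixed window size and the use of numpy are my assumptions), such a filter can be written in a few lines:

#!/usr/bin/python
# Sketch of the "variance above a percentile" heuristic described above;
# not the actual findvoice implementation.
import sys
import wave
import numpy

def loud_windows(fname, percentile=90, winsize=8000):
    w = wave.open(fname)
    rate = w.getframerate()
    # Assume 16 bit mono samples for simplicity
    samples = numpy.frombuffer(w.readframes(w.getnframes()), dtype=numpy.int16)
    # Split into fixed-size windows and compute each window's variance
    nwin = len(samples) // winsize
    windows = samples[:nwin * winsize].astype(float).reshape(nwin, winsize)
    variances = windows.var(axis=1)
    threshold = numpy.percentile(variances, percentile)
    for idx in numpy.nonzero(variances > threshold)[0]:
        # Offset of the window start, in seconds from the start of the file
        yield idx * winsize / float(rate)

if __name__ == "__main__":
    for offset in loud_windows(sys.argv[1]):
        print("%.1fs" % offset)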

For example, you can automatically extract waypoints out of an audio file by using it together with gpxinterpolate:

./findvoice -t today.wav | ./gpxinterpolate today.gpx > today-waypoints.gpx

The timestamps it outputs are computed using the modification time of the .wav file: if your system clock was decently synchronised (which you can do with getgps), then the mtime of the wav is the time of the end of the recording, which gives the reference needed to compute absolute timestamps.
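
In other words (a hypothetical sketch of the arithmetic, not the actual findvoice code): the absolute time of an offset into the recording is the mtime minus the time left until the end of the file:

import os
import wave

def absolute_time(wavfile, offset):
    """Absolute Unix timestamp of `offset` seconds into the recording."""
    w = wave.open(wavfile)
    duration = w.getnframes() / float(w.getframerate())
    end = os.path.getmtime(wavfile)  # mtime marks the end of the recording
    return end - (duration - offset)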

For example:

getgps --sync-time
arecord file.wav
^C
./findvoice -t file.wav | ./gpxinterpolate today.gpx > today-waypoints.gpx

Posted Sun Jun 7 02:48:40 2009 Tags:

Geocoding Unix timestamps

Geocoding EXIF tags in JPEG images is fun, but there is more that can benefit from interpolating timestamps over a GPX track.

Enter gpxinterpolate:

$ ./gpxinterpolate --help
Usage: gpxinterpolate [options] gpxfile [gpxfile...]

Read one or more GPX files and a list of timestamps on standard input. Output
a GPX file with waypoints at the location of the GPX track at the given
timestamps.

Options:
  --version      show program's version number and exit
  -h, --help     show this help message and exit
  -v, --verbose  verbose mode

For example, you can create waypoints interpolating file modification times:

find . -printf "%Ts %p\n" | ./gpxinterpolate ~/tracks/*.gpx > myfiles.gpx

In case you wonder where you were when you modified or accessed a file, now you can find out.
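
The core of such a tool is plain linear interpolation between the two track points that bracket each timestamp. A minimal sketch of that step (mine, not gpxinterpolate's actual code), assuming the track has already been parsed into a sorted list of (time, lat, lon) tuples:

import bisect

def interpolate(track, t):
    """Interpolated (lat, lon) at time t, or None if t is outside the track.

    track: list of (time, lat, lon) tuples, sorted by time.
    """
    times = [p[0] for p in track]
    i = bisect.bisect_right(times, t)
    if i == 0 or i == len(track):
        return None
    t0, lat0, lon0 = track[i - 1]
    t1, lat1, lon1 = track[i]
    f = (t - t0) / float(t1 - t0)
    return (lat0 + (lat1 - lat0) * f, lon0 + (lon1 - lon0) * f)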

Posted Sun Jun 7 02:07:43 2009 Tags:

Recording audio on the FreeRunner

The FreeRunner can record audio. It is nice to record audio: for example, I can run the recording in the background while I keep tangogps on the screen, and take audio notes about where I am while mapping for OpenStreetMap.

Here is the script that I put together to create geocoded audio notes:

#!/bin/sh

WORKDIR=~/rec
TMPINFO=`mktemp $WORKDIR/info.XXXXXXXX`

# Sync system time and get GPS info
echo "Synchronising system time..."
getgps --sync-time --info > $TMPINFO

# Compute an accurate basename for the files we generate
BASENAME=~/rec/rec-$(date +%Y-%m-%d-%H-%M-%S)
# Then give a proper name to the file with saved info
mv $TMPINFO $BASENAME.info

# Proper mixer settings for recording
echo "Recording..."
alsactl -f /usr/share/openmoko/scenarios/voip-handset.state restore
arecord -D hw -f cd -r 8000 -t wav $BASENAME.wav

echo "Done"

It works like this:

  1. It synchronizes the system time from the GPS (if there is a fix) so that the timestamps on the wav files will be as accurate as possible.
  2. It also gets all sorts of information from the GPS and stores it in a file, should you want to inspect it later.
  3. It records audio until it gets interrupted.

The names of the files it generates correspond to the beginning of the recording. The mtime of the wav file obviously corresponds to the end of the recording. This can be used to later georeference the start and end points of the recording.

You can use this to check mixer levels and verify that you're actually getting input:

arecord -D hw -f cd -r 8000 -t wav -V mono /dev/null

The getgps script is now described in its own post.

You may now want to experiment, in JOSM, with "Preferences / Audio settings / Modified times (time stamps) of audio files".

Posted Sun Jun 7 01:30:37 2009 Tags:

Uploading gpsdrive tracks to openstreetmap

I've got some gpsdrive tracks and my area is blank on openstreetmap.

People pointed me at gpsbabel, and it took me a while to figure out how it works. For the record, don't do any of this:

gpsbabel -i gpsdrive -o gpx -f track0000.sav -F track0000.gpx
gpsbabel -i gpsdrive -o gpx track0000.sav track0000.gpx

What you would have to do is this:

gpsbabel -i gpsdrive -f track0000.sav -o gpx -F track0000.gpx

However, it would choke on gpsdrive's "missing" points with all values set to 1001. You can grep them out, and then gpsbabel would work, but openstreetmap would reject the data because the points have no timestamp: gpsbabel won't carry timestamps over from the gpsdrive tracks to the GPX tracks.

The way to go is here, which contains a link to a tiny little perl script that will do the proper conversion for you:

./gpsdrive2gpx.pl track0000.sav > track0000.gpx

Those you can upload to openstreetmap, at last.

Posted Sat Jun 6 00:57:39 2009 Tags:

Pages about Ubuntu.

First practical lesson

Notes after today's training session.

Small index of most used shell commands:

  • ls - list directory contents
  • cp - copy files and directories
  • mv - move (rename) files
  • rm - remove files or directories
  • find - search for files in a directory hierarchy
  • cat - concatenate files and print on the standard output
  • more - file perusal filter for crt viewing
  • less - opposite of more (quit with 'q')
  • cd - Change the current directory to DIR. (use "help cd" instead of "man cd")
  • mkdir - make directories
  • rmdir - remove empty directories

Small index of commands useful for combining in pipelines:

  • grep, egrep, fgrep, rgrep - print lines matching a pattern
  • tail - output the last part of files
  • head - output the first part of files
  • sort - sort lines of text files
  • uniq - report or omit repeated lines
  • sed - stream editor
  • wc - print the number of newlines, words, and bytes in files

Problems found during the lesson:

  • If you set the system default locale to Amharic, the gdm login will be in Amharic input mode. We didn't find out how to switch it back to inputting roman characters: right-clicking on the input field to set the input method doesn't work. Since usernames are not in Amharic, you're locked out.
  • So you press CTRL+ALT+F1, log in and try dpkg-reconfigure locales. On Ubuntu Dapper, it does not work anymore.
  • So you dig and dig and dig and finally find that you can force a locale in /etc/default/gdm (but not in /etc/gdm/locale.conf, nor in /etc/gdm/gdm.conf).
  • Then the internet works for a bit and you look up how to reconfigure locales in Ubuntu. Turns out you have to use localeconf, which is not installed by default, is not in universe and thus not on the CDs, and needs to be downloaded from the Internet.
  • The Ubuntu wiki is all on https, which defeats any attempt at proxy caching.
  • An Internet proxy needs to be configured 3 times: in Gnome, in Firefox and in Synaptic (well, apt). This is especially tricky when you forget to set up the proxy in Synaptic and seemingly unrelated applications fail, like the Ubuntu language selector, which internally invokes the package manager to download missing langpacks.
  • Some short descriptions in the NAME section of manpages are hard to understand, or wrong. Noted on apt-get, apt-cache and less. Top prize goes to apt-cache:

     NAME
            apt-cache - APT package handling utility -- cache manipulator
     DESCRIPTION
            [...] apt-cache does not manipulate the state of the system but
            does provide operations to search and generate interesting output
            from the package metadata. [...]
    

    So apt-cache is a manipulator that doesn't manipulate. A possible improvement could be "query the APT package cache".

  • The language selector in Ubuntu Breezy doesn't really exit and keeps the package database locked. This seems to be fixed in Dapper, and probably had been fixed in some Breezy update. System updates here are a problem: my Dapper (with some Universe things in it) wanted to download more than 120MB of data, and the Uni network was giving me 14Kbps. It's been a nice opportunity to teach about fuser -uva and kill.
  • dict, squid and many other packages from 'main' are not on the normal Ubuntu CDs: is there an easy way to build a CD with them? Or do Ubuntu CDs with extra packages already exist? I'll have to find out.
  • cupsys has documentation outside of /usr/share/doc, in /usr/share/cups/doc-root.
  • man works on all commands, except cd, which is an internal shell command and thus needs help instead of man. I should remember to ponder autogenerating manpages from help output.
  • Is there an index-like manpage with a list of the core Unix commands and their short descriptions? If there's not, it's easy to generate:

     #!/bin/sh
     DIR=${1:-"/bin"}
     (
     find "$DIR" | while read FILE
     do
         if [ -x "$FILE" ] && ! [ -d "$FILE" ]
         then
             LANG=C COLUMNS=2000 man `basename "$FILE"` | \
                      grep ^SYNOPSIS -B 100 | grep ^NAME -A 100 | \
                      tail -n +2 | head -n 2 | \
                      grep -v '^[ \t]*$'
         fi
     done
     ) | sort | uniq | sed 's/^ \+//'
    

    Try running it on /bin and /sbin: it's great! Also, since it doesn't redirect stderr, it nicely exposes a number of manpage problems.

Lots of bugs to report when I come home: from here it would take ages and lots of money on the hotel internet connection, and some are Ubuntu-specific, so I'd need to do everything online with Malone.

As usual, teaching is one of the best ways to find bugs.

I propose an Etch training session a month before release.

Other things to do:

  • Find more info about that live CD with Wikipedia browsable without the Internet.
  • Make a collection of Free technical E-books: even those Indian low-cost book editions are too expensive here, so E-books mean a lot.

Update: Matt Zimmerman writes:

I read your blog entry at http://www.enricozini.org/blog/eng/second-day-in-addis and wanted to respond as follows:

  • localeconf is not the standard way to configure locales in Ubuntu; what documentation told you that? It's an unsupported package from Progeny. If what you wanted was to set the system default locale from the command line, editing /etc/environment is probably the best way.

  • I suggest filing a bug report at <https://launchpad.net/products/ubuntu-website> about the HTTPS issue; I don't think it's necessary for the entire wiki to be HTTPS, only authentication.

  • Synaptic may be able to use the GNOME proxy settings without introducing undesirable dependencies; please file a wishlist bug.

  • dict, squid and other packages from main are not on the Ubuntu CDs because there is no space. The DVD contains these packages.

  • The cupsys documentation bug was quite likely inherited from Debian and should be reported there.

  • You can file bugs in Malone via email; this has been possible for a long time now. Please don't reinforce this misconception.

    https://help.launchpad.net/UsingMaloneEmail

Posted Sat Jun 6 00:57:39 2009 Tags:

Fixing problems after upgrade to Dapper

Laptop: Asus M3Ae

Problem: Can't mount root partition because of various ACPI errors. Breezy kernel works.

Solution:

  1. Boot with the old kernel.
  2. echo "libata noacpi=1" | sudo tee -a /etc/mkinitramfs/modules (note that "sudo echo ... >> file" would not append as root: the redirection is performed by the unprivileged shell)
  3. sudo mv /boot/initrd.img-2.6.15-25-686 /boot/initrd.img-2.6.15-25-686.backup
  4. sudo mkinitramfs -o /boot/initrd.img-2.6.15-25-686 2.6.15-25-686

Thanks: Matthew Garrett

Posted Sat Jun 6 00:57:39 2009 Tags:

Live CD on a removable disk

Eros is a hardware guru that happened to be the unknown guy sitting next to me on a plane.

He happens to be a happy Kubuntuer. While chatting, he told me one of his systems is an external hard drive made by copying a Kubuntu live CD image onto it.

Why did you do so? I asked.

Because this way I can plug it in any computer, and it'll do hardware detection at boot. However it's a hard drive, so it's fast, and I can keep my home and all my customisations on it.

I had never thought of it.

That's an interesting and smart (ab)use of a live CD.

Now I wonder: what would be required to plug the live CD boot-time hardware detection infrastructure into an existing Debian or Ubuntu installation?

Update: slh on IRC suggests (a bit edited by me):

A lot of the former "obscure black magic" for live CDs isn't needed anymore. What is needed is: a kernel with static usb-storage, libusual, ehci-hcd, ohci-hcd, uhci-hcd (or an appropriate initrd/initramfs). udev takes care of most h/w detection issues these days.

As long as everything needed to boot is contained in a single partition you don't need an fstab: udev, hal and pmount take care of the rest; procfs, sysfs, devpts, usbfs and shm are mounted by sysvinit.

All that is left is a tool to create the xorg.conf while booting (those tools exist and just need to be called early).

Everything else is just a matter of convenience: extending the life span of the USB key by moving frequently-written data onto tmpfs, etc.; if passwordless logins are required then xsession and inittab need to be changed; new ssh host keys generated on boot; small stuff.

With ordinary flash storage, jffs2 and something to reduce write access is a good idea (perhaps unionfs for /var/ and /home/, bind mounting /tmp/ on /var/tmp/), but that's also not strictly necessary.

Mostly it boils down to running the xorg-creation script at every boot time.

There are various tools to do that. Some are here, but there is surely more. (Enrico's note: do we have anything in Debian that we can install and just does that?)

USB and PS/2 mice have shared the same device since kernel 2.6, so that part of xorg.conf doesn't strictly need to be detected; the same goes for the keyboard (alps and synaptics touchpads can be easily detected), and X.org can use the screen's ddc info, although it's not always reliable.

It can boil down to just detecting the video chipset: something like this, that uses PCI IDs from discover1-data.

It can also become a lot easier with X.org's own ddc detection, which almost boils down to configuring input devices and selecting the video driver. If I understand Daniel Stone correctly, X.org will soon improve its detection routines (fail-safe X (auto-)configuration) as well in X.org 7.3.

xresprobe is in debian: it's pretty similar to ddcxinfo-kanotix; both forked off RedHat's kudzu package, and all fail miserably on amd64. That's why ddcxinfo has a fallback to 1024x768 @ 75 Hz, which "always works (+manual overrides)".

Posted Sat Jun 6 00:57:39 2009 Tags:

Live CD on a removable disk, the Debian way

In [live-cd-on-removable-disk] at some point I wrote:

Enrico's note: do we have anything in Debian that we can install and just does that?

Here are the answers:

Sven Mueller writes:

Well, Enrico, a tool I really grew fond of, which auto-configures X on Debian systems, is xdebconfigurator. It lacks being auto-run on each system start, which I consider a feature on normal systems; but for your proposed usage (i.e. a portable USB-storage based Debian system), it would certainly be the right thing.

Essentially, it never failed on me, except for VMware virtual machines, where all it did wrong was to propose too high resolutions, which resulted from the dual-screen Windows setup I ran VMware on. You might want to give it a try.

Tollef Fog Heen writes:

I added the support in casper for doing this almost a year ago and it has saved me lots of debugging time. Booting the live CD that way is almost as fast as booting an installed system. If you couple this with using the persistent storage support in casper, you can get the configure-on-boot support together with persistency.

In a later update, slh is quoted saying that xresprobe doesn't work on AMD64. This is wrong: I wrote that support, based on code by Matthew Garrett, a little more than nine months ago. I wouldn't recommend incorporating it in newly written code, but rather using libx86.

And finally, Marco Amadori writes:

Without needing to look for tools external to Debian, there is already the Debian Live software in sid: live-package, which creates a live system, and casper, which generates an initramfs that can configure a Debian system on the fly.

So far there is no hard disk target for live-package, but the "Iso" target can already do the job quite well. At boot time, casper's initramfs scans all the block devices, so it also works for USB keys and hard drives.

To obtain a hard drive image, you just need to invoke "make-live" with the options for the required software, then copy the contents of the iso (or of the ./debian-live/binary directory) onto a partition and install the boot loader.

This is what the future "HD" target of live-package will do; so far it can only build ISO and Netboot images.

Posted Sat Jun 6 00:57:39 2009 Tags: