More diversity in Debian skills

This blog post has been co-authored with Francesca Ciceri.

In his Debconf talk, zack said:

We need to understand how to invite people with different backgrounds than packaging to join the Debian project [...] I don't know what exactly, but we need to do more to attract those kinds of people.

Francesca and I know what we could do: make other kinds of contributions visible.

Basically, we should track and acknowledge the contributions of webmasters, translators, programmers, sysadmins, event organisers, and so on, at the same level as what we do for packagers: DDPO, minechangelogs, Portfolio...

For any non-packaging activity that we can make visible and credited, we get:

  • to acknowledge the people who do it, and show that they are active contributors in the project;

  • to acknowledge the work that gets done, and show the actual amount of non-packaging work that gets done in Debian every day;

  • to allow non-packagers to have a reputation, too: first of all, they deserve it, and among other things, it would make nm processing trivial.  

Here's an example: who's the lead translator for German? And if you are German, who's the lead translator for Spanish? Czech? Thai? I (Enrico) don't know the answers, not even for Italian, but we all should! Or at least it should be trivial to find out.

To start to change this, is just a matter of programming.

Francesca already worked on a list of trackable data sources, at least for translators.

Here are some more details, related to translation:

  • Translations can be tracked via the i18n robot (and relative statistics). This works only with teams who activated the robot and actively use the pseudo-urls in their messages on localisation mailing lists. Some translators don't bother to do it but it's ok to only support the main workflow. It beats extracting .po files from l10n-tagged BTS bugs at any rate.

  • DPN and website translations: for wml pages there's a specific field to be extracted for each translated page: grep for maintainer="name" on normal wml pages, while for DPN translations we have a specific translator="name" field. The problem is that this field is not mandatory, so sometimes there's no indication of the maintainer. Again, it's ok to only support the main workflow.

    Anyway, this is preferable to the cvs log: often the commit is done by the coordinator of the team and not by the actual translator. See above for the alternative solution of using the statistics provided by the i18n bot.

  • DDTSS: since the new release of DDTSS-Django, done by Martijn van Oosterhout about a year ago, the contributions are by default non-anonymous. This should be easy to track.

  • http://wiki.debian.org: it is more complicated because in the wiki we do not have a proper l10n translation workflow, so the only thing that can be tracked are changelogs $LANG/* pages. A nice idea would be to have translated pages list the version of the page that was translated and who did the translation.

  • translation of debian manuals and release notes: usually in the translation of manuals and long documentation there is a specific translator field.

And here are some notes about other fields:

  • DPN editors: for each issue there's a list of editors at the bottom of the page. In the wml: grep for editor=.

  • Artwork: artwork submitted via debianart are easy to track on the portal. Anyway usually you can find the author in the license and copyright file.

  • Programming: the only thing we have is the list of services which can be expanded if needed.

  • Press and publicity: there seems to be not much besides svn logs.

  • l10n-english: The Smith Review Project page has some tracking links. Other activities can probably only be tracked, at the moment, via mailing list activity.

  • Events: we can use the "main coordinator" field on www.debian.org/events/$year/$date-$eventname.wml: grep for <define-tag coord>; for events not published on the http://www.debian.org, but only on http://wiki.debian.org, the coordinator or the contact for the event is usually present on the page itself.

  • Sysadmins: we haven't asked DSA.

And finally, if you are still wondering who those translation coordinators are, they are listed here, although not all teams keep that page up to date.

Of course, when a data source is too hard to mine, it can make sense to see if the workflow could be improved, rather than spending months writing compicated mining code.

This is a fun project for people at Debconf to get together and try.

If by the end of the conference we had a way to credit some group of non-packaging contributors, even if just one like translators or website contributors, at least we would finally have started having official trackers for the activities of non-packagers.