Pages about Debian.

Audit your debian uploads

My bank sends me an e-mail every time I log into the home banking system, so that I can spot malicious logins.

My credit card sends me an SMS message every time it gets charged, so that I can spot malicious charges.

Can I get a notification of every Debian upload done with my key, so that I can spot if my key has been stolen?

Let's work on that. As a start, thanks to Ganneff, here is how to do a one-off audit:

# go to merkel to access projectb, which is the postgresql database
# with all dak information
$ ssh merkel
merkel$ psql projectb
# look up the database id of my fingerprint
projectb=> select id, fingerprint from fingerprint where fingerprint like '%797EBFAB';
 id  |               fingerprint
-----+------------------------------------------
 394 | 66B4DFB68CB24EBBD8650BC4F4B4B0CC797EBFAB
(1 row)
# get a list of all uploads done with my key, sorted by date
projectb=> select * from source where sig_fpr=394 order by install_date desc;
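
The two queries above can also be folded into a single join. Here is a minimal sketch of that idea in Python 3, using an in-memory sqlite3 database as a stand-in for projectb. The table names and the id, fingerprint, sig_fpr and install_date columns come from the session above; the source and version columns, the uploads_for_key helper and all the sample rows are assumptions made up for illustration, and the real database is PostgreSQL with a richer schema.

```python
import sqlite3

def uploads_for_key(conn, fpr_suffix):
    """Return (install_date, source, version) rows signed by the key whose
    fingerprint ends with fpr_suffix, most recent first."""
    cur = conn.execute(
        """
        SELECT s.install_date, s.source, s.version
          FROM source s
          JOIN fingerprint f ON s.sig_fpr = f.id
         WHERE f.fingerprint LIKE ?
         ORDER BY s.install_date DESC
        """,
        ("%" + fpr_suffix,),
    )
    return cur.fetchall()

# In-memory stand-in for projectb, filled with made-up sample data
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE fingerprint (id INTEGER PRIMARY KEY, fingerprint TEXT);
    CREATE TABLE source (source TEXT, version TEXT, sig_fpr INTEGER, install_date TEXT);
    INSERT INTO fingerprint VALUES (394, '66B4DFB68CB24EBBD8650BC4F4B4B0CC797EBFAB');
    INSERT INTO fingerprint VALUES (7, '0000000000000000000000000000000000000000');
    INSERT INTO source VALUES ('examplepkg', '1.0-1', 394, '2008-03-01');
    INSERT INTO source VALUES ('examplepkg', '1.1-1', 394, '2008-04-20');
    INSERT INTO source VALUES ('otherpkg', '2.0-1', 7, '2008-04-25');
    """
)

for row in uploads_for_key(conn, "797EBFAB"):
    print(row)
```

Passing the fingerprint suffix as a query parameter, instead of pasting it into the SQL string, keeps the lookup safe if it ever ends up behind a web form.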

First you get to do it (done); then you document it (done); then you automate it. It's quite trivial at this point, so enjoy the new Debian upload monitor.

It's got search as you type to find your full fingerprint, then you get an HTML page with the log of your uploads in the last 2 months, and the page has an RSS feed that you can use to track your own uploads.

Also, generating all this static content is acceptably fast:

merkel$ time ./deb-key-audit 

real    0m7.145s
user    0m4.244s
sys     0m0.384s

If you want to see the code, you can git clone http://merkel.debian.org/~enrico/keylog.git

Currently it wrongly encodes UTF-8 characters: I suppose the strings come out of the database as ASCII instead of UTF-8. A patch would be welcome to fix that.
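
I have not tracked down the actual cause, so take this only as an illustration of the kind of bug I suspect: UTF-8 bytes that get interpreted with a single-byte encoding somewhere along the way. The sample name is made up, and the snippet is Python 3 with no connection to the real deb-key-audit code.

```python
name = "Nuñez"

# UTF-8 bytes, as they would be stored in the database
raw = name.encode("utf-8")

# Wrong: interpreting the bytes with a single-byte encoding garbles the text
garbled = raw.decode("latin-1")

# Right: decoding them as UTF-8 gives the original string back
fixed = raw.decode("utf-8")

print(repr(garbled), repr(fixed))
```

If the output pages show two odd characters wherever one accented letter should be, this is a likely suspect.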

I will now contact QA to see what we can do with it; if it ends up fitting in some bigger picture then it may be that the RSS links will change, but I'll post about it in that case.

Posted Thu 01 May 2008 17:15:50 CEST Tags: debian
How to work with python-m2crypto

This is a little howto on how to understand the m2crypto Python module, beyond the examples provided:

  1. Build the documentation as explained here
  2. Look at the documentation, which does not contain a single comment
  3. Click the links that show the source (for example, of M2Crypto.SMIME.SMIME.verify)
  4. Look at which openssl function gets called by the function you are interested in (for example, pkcs7_verifysomething)
  5. Google the name of that openssl function, trying a few variations, until you find the openssl manpage that tells you what it does, in detail (for example, PKCS7_verify)
  6. Understand what it does, and backport the understanding to the thin convenience layer that m2crypto adds on top of it
  7. Realise that now you also understand the corresponding openssl smime -verify commandline invocation, and that next time you should look at the openssl API docs instead of reading the openssl manpage.
Posted Fri 08 Feb 2008 16:45:50 CET Tags: debian
Happy new year

A year ago we got in touch with various Taiwanese aboriginal tribes to try to start localisation efforts.

Thanks to the research the Taroko people did during 2007 and the prototype work of tonight, the Taroko people in Taiwan can see the computer calendar of the new year in their own language:

trv_TZW Gnome calendar

Posted Mon 31 Dec 2007 16:58:30 CET Tags: debian
Meet the EeePC

Being in Taiwan, we swiftly got hold of an EeePC.

Instead of installing Debian into it, we decided to keep the original system and see how it works. It's a Debian derivative, and the feeling inside a terminal window is quite familiar.

The boot is very fast. Two seconds after the video bios quickly shows on the screen, the X cursor appears. It's definitely worth having a look at how this devil boots.

The "Asus Launcher" is worth a look. IMHO it's nicer and more useful than the usual launcher menu that we get in Gnome or KDE 3, although it probably only makes sense on a small display. It replaces the desktop background, has tabs and no clutter, and lets you launch applications. It turns out it's customisable as well.

What's on the system

KDE 3.4.2, with some applications renamed so that their names are more human. For example, konsole became console.

vim! \o/ But not emacs :)

mc! Someone out there wanted to make my life easier.

fbreader. I had never heard of it, but it's such a good discovery that I've now started to use it on my laptop as well.

Little howtos

To get to a terminal, hit Ctrl+T in the file manager, or Ctrl+Alt+T elsewhere.

The root password is the same as the user password.

To change the system language, I managed with a simple dpkg-reconfigure locales.

Ways they simplified the unix system

It's single user: I didn't find a way to create multiple users besides the terminal, and the login program does not ask for a username, only for the password.

The "win" key has a house painted on it, and it's used as "hide/show all applications" key. When all applications are minimised, the Asus launcher is visible instead of the X background: this behaviour basically turns the key into a sort of "run application" key. The key still works as a kind of shift, although it probably was not intended to.

The repository management is interesting. /etc/apt/sources.list contains:

deb http://update.eeepc.asus.com/p701 p701 main
deb http://update.eeepc.asus.com/p701/tw p701 main

which means they have a repository per EeePC model and a subrepository per localised version.

The "Internet" group of applications has a Wikipedia toplevel application: it's nice to see the ecosystem of free software / free culture coming together to provide a nice user experience.

An extra link to the SD card mount point (besides the one in /mount) appears in the home directory automatically when the SD card is inserted. This means that when you do "save as" from all sorts of applications, the SD card is there, easy to reach. This helps if one decides not to use the internal flash for data, and just save everything in the SD card: I like doing this, as it allows me to quickly move the SD card with all the data between the EeePC and other computers.

Changes I made so far

Activate en_GB.UTF-8 via dpkg-reconfigure locales.

Add en_GB.UTF-8 to /etc/scim/global, to get SCIM input methods to work.

Little flaws

Virtual screens are enabled, so W+arrow switches virtual screen. The feeling you get if you hit W+arrow is that all your applications have disappeared. This could be improved by having the window manager keep the Asus launcher at the bottom of the current virtual screen, instead of just at the bottom of the first screen, or by disabling virtual screens by default.

It is possible to drag the lower panel around, maybe accidentally: that's another of our fancy default "features" that should be disabled by default.

It is also possible to remove applets from the applet bar by mistake: for example I wanted to disconnect the wireless, and I instead ended up quitting the wireless applet. Luckily, the next time I started the computer it magically came back.

~/.xsession-errors is continuously getting the useless stdout/stderr debugging flood of GUI apps. No one usually bothers, except that in this case the file is on flash, where unneeded writes are very much unwanted. I'm considering symlinking it to /dev/null, but ideally we should get GUI apps to only write out what is really important.

Battery charging doesn't show how long it is going to take until the battery is fully charged.

No capslock or numlock LEDs. This probably calls for disabling or remapping capslock: numlock is very hard to hit by accident, but capslock is not.

Random thoughts

If you buy an EeePC, I really suggest you think of it as a mass consumption appliance and stay on the original OS for a while. Most of what's in here is what we use every day, just in a different context. Try to use it as an appliance and see if it is perfect, and if it isn't, try to find out what is missing. It is a fantastic way to find out important bits that are missing in Debian as well.

Also, if you're used to tailoring everything to yourself before starting to use a Linux system, this is a great way to try the usage experience that we can offer by default. The Firefox welcome page the first time you connect, for example, is surprisingly nice. Everything we know as doable comes a bit as a surprise because this time someone has done it for us.

I wish that that someone can be invited to talk at the next Debconf: the possibility of having a look at the work that has been done in bending Debian to this nice little device is to me one of the most valuable things so far about the eeepc.

Help/About KDE/Credits

It's reachable from most applications, and says:

The development team would like to thank the following people and organizations for their contributions:

  • the Debian Project,
  • the GNU Project,
  • the KDE Project,
  • the Mozilla Project,
  • the OpenOffice.org Project,
  • the SAMBA Project,
  • the X.Org Foundation,

Linus Torvalds and the other Linux kernel developers, and Free software developers around the world.

I'm using an appliance that is thanking me, and others like me: priceless!

Posted Sat 29 Dec 2007 06:30:05 CET Tags: debian
mod_proxy_html and compressed pages

After putting it behind a reverse proxy, our phpmyadmin setup started showing empty pages.

After one morning of deep cursing, this is what happened:

  1. the web server where phpmyadmin runs generates compressed html pages;
  2. mod_proxy_html tries to edit them, and "normalises" them, adding <html>...</html> headers around the compressed data;
  3. Firefox fails to decompress because there is extra garbage, and shows a blank page instead of complaining.
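
The failure in steps 2 and 3 is easy to reproduce in isolation. This is only a model of the symptom, in Python 3 with the standard gzip module: the exact bytes that mod_proxy_html adds are an assumption, and try_decompress is a helper made up for this example.

```python
import gzip
import zlib

html = b"<html><body>phpmyadmin</body></html>"
compressed = gzip.compress(html)

# Markup wrapped around the compressed bytes, as in step 2
wrapped = b"<html><body>" + compressed + b"</body></html>"

def try_decompress(body):
    """Return the decompressed body, or None if it is not valid gzip data."""
    try:
        return gzip.decompress(body)
    except (OSError, EOFError, zlib.error):
        return None

print(try_decompress(compressed))  # the original page
print(try_decompress(wrapped))     # None: the gzip header is not where it should be
```

A browser in this situation has nothing sensible to render, which is consistent with the blank page.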

Other things to note:

How I found it:

  1. nc -l -p 444;
  2. configure mod_proxy to send connections to netcat instead of the web server;
  3. compare curl headers and Firefox headers;
  4. add the headers from Firefox to curl one by one, until the output breaks.
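
If netcat is not at hand, the listening side of this procedure can be sketched in a few lines of Python 3. The function names are made up for this example, and the demo at the end talks to itself on a free local port instead of port 444, which would need root; point the proxy (or curl) at the listener to capture real requests.

```python
import socket

def make_listener(host="127.0.0.1", port=0):
    """Open a listening TCP socket; port 0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(1)
    return sock

def capture_request(listener, timeout=5.0):
    """Accept one connection and return the raw bytes received up to the
    end of the request headers (or until the peer closes the connection)."""
    listener.settimeout(timeout)
    conn, _addr = listener.accept()
    with conn:
        conn.settimeout(timeout)
        data = b""
        while b"\r\n\r\n" not in data:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return data

def header_lines(raw):
    """Split a raw request into its header lines, for easy comparison."""
    head = raw.split(b"\r\n\r\n", 1)[0]
    return head.decode("latin-1").split("\r\n")

# Demo: send a request to ourselves and show the captured headers
listener = make_listener()
client = socket.create_connection(listener.getsockname())
client.sendall(b"GET / HTTP/1.1\r\nHost: example.org\r\nAccept-Encoding: gzip\r\n\r\n")
captured = capture_request(listener)
client.close()
listener.close()
print(header_lines(captured))
```

Capturing one request from curl and one from Firefox this way gives two header lists to compare line by line; in my case the interesting difference was in the headers that ask for compressed content.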

How to solve it:

  1. a2enmod deflate
  2. Replace SetOutputFilter proxy-html with SetOutputFilter INFLATE;proxy-html;DEFLATE so that mod_proxy_html always works on decompressed HTML.
Posted Mon 03 Dec 2007 14:48:00 CET Tags: debian
apt-xapian-index: search as you type

I've recently posted:

Note that I've rewritten all the old posts to only show the main code snippets: if you were put off by the large lumps of code, you may want to give it another go.

Today I'll show how to implement a very attractive feature for a user interface: search as you type. The idea is that you don't need to press enter to fire up a query: instead, the results materialise in front of your eyes as you type.

The example I created uses curses, but the idea is good on any interactive user interface.

The main thing to keep in mind with search as you type is that the last word is likely to be partially typed, unless maybe some timeout expired since the user's last keystroke.

Xapian comes to help here, as it allows us to expand the partially typed word into an OR query with all the terms that start with it. This means that if we are typing, for example, "progr", we can turn the query into "program OR programmer OR programming OR programmed [...and so on...]".
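
To make the idea concrete without needing an index at hand, here is a minimal sketch of prefix expansion over a plain sorted list of terms, in Python 3. It is a stand-in for what the index does for us, not Xapian code: expand_prefix and the term list are made up for this example.

```python
import bisect

def expand_prefix(sorted_terms, prefix):
    """Return all terms in sorted_terms that start with prefix."""
    # In a sorted list, all terms sharing a prefix are adjacent, and the
    # first of them is at the position where the prefix itself would go
    start = bisect.bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches

terms = sorted(["process", "program", "programmed", "programmer",
                "programming", "project"])
print(" OR ".join(expand_prefix(terms, "progr")))
```

The shorter the prefix, the longer the OR query gets, so in a real interface it may be worth waiting for a couple of characters before expanding.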

I won't show the UI code, except a simple input loop that triggers the query at every keystroke:

    def mainloop(self):
        while True:
            c = self.win.getch()
            self.line += chr(c)
            self.results.update(self.line)

The interesting part is in the update function.

First we split the line in words and convert the words into a query:

        # Split the line in words
        args = self.splitline.split(line)
        # Convert the words into terms for the query
        terms = termsForSimpleQuery(args)

Then we expand the last word with all possible completions:

        # Since the last word can be partially typed, we add all words that
        # begin with the last one.
        terms.extend([x.term for x in db.allterms(args[-1])])

Now we can build the query. Of course you can add all other sorts of things to the query, for example a boolean tag filter expression like in axi-query-pkgtype.py; Xapian will cope.

        # Build the query
        query = xapian.Query(xapian.Query.OP_OR, terms)

Then we run the query. For bonus points you can do the adaptive cutoff trick to discard bad results.

In my case, since I don't implement scrolling of results, I also limit them to what fits in the window:

        # Retrieve as many results as we can show
        mset = enquire.get_mset(0, self.size - 1)

Finally, draw the results on screen:

        # Redraw the window
        self.win.clear()

        # Header
        self.win.addstr(0, 0, "%i results found." % mset.get_matches_estimated(), curses.A_BOLD)

        # Results
        for y, m in enumerate(mset):
            # /var/lib/apt-xapian-index/README tells us that the Xapian document data
            # is the package name.
            name = m[xapian.MSET_DOCUMENT].get_data()

            # Get the package record out of the Apt cache, so we can retrieve the short
            # description
            pkg = cache[name]

            # Print the match, together with the short description
            self.win.addstr(y+1, 0, "%i%% %s - %s" % (m[xapian.MSET_PERCENT], name, pkg.summary))

        self.win.refresh()

That's it, try it out.

You can use the wsvn interface to get to the full source code and the module it uses.

You can see a similar technique working in goplay, where it is also integrated with an interactive tag filter.

Posted Tue 06 Nov 2007 23:02:46 CET Tags: debian
apt-xapian-index: smart way of querying tags

I've recently posted:

Note that I've rewritten all the old posts to only show the main code snippets: if you were put off by the large lumps of code, you may want to give it another go.

Today I'll show how to implement a really good way of searching for Debtags tags. When I say really good, I mean the sort of good that after you run it you wonder how could it possibly manage to do it.

The idea is simple: you run a package search, but instead of showing the resulting packages, you ask Xapian to suggest tags like we saw in axi-query-expand.py.

For extra points, I'll use an adaptive cutoff in choosing the packages that go in the rset.

So, let's ask the user to enter some keywords to look for tags, and use them to run a normal package query:

# Build the base query
query = xapian.Query(xapian.Query.OP_OR, termsForSimpleQuery(args))

# Perform the query
enquire = xapian.Enquire(db)
enquire.set_query(query)

Now, instead of showing the results of the query, we ask Xapian what are the tags in the index that are most relevant to this search.

First, we pick some representative packages for the expand:

# Use an adaptive cutoff to avoid picking bad results as references
matches = enquire.get_mset(0, 1)
topWeight = matches[0].weight
enquire.set_cutoff(0, topWeight * 0.7)

# Select the first 10 documents as the key ones to use to compute relevant
# terms
rset = xapian.RSet()
for m in enquire.get_mset(0, 10):
    rset.add_document(m[xapian.MSET_DID])

Then we define the filter that only keeps tags:

# Filter out all the keywords that are not tags
class Filter(xapian.ExpandDecider):
    def __call__(self, term):
        "Return true if we want the term, else false"
        return term[:2] == "XT"

Then we print the tags:

# This is the "Expansion set" for the search: the 10 most relevant terms that
# match the filter
eset = enquire.get_eset(10, rset, Filter())

# Print out the results
for res in eset:
    print("%.2f %s" % (res.weight, res.term[2:]))
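
If you want to play with the idea without an index, here is a toy model in Python 3 of what the filter and the expansion set do together. Plain frequency counting is swapped in for Xapian's own term weighting, so the numbers mean something different from the weights printed above; suggest_tags and the sample documents are made up for this example.

```python
from collections import Counter

def suggest_tags(relevant_docs, limit=10):
    """Count in how many relevant documents each tag term occurs and
    return the most frequent ones, without the 'XT' prefix."""
    counts = Counter()
    for terms in relevant_docs:
        # Keep only Debtags terms, which are indexed with the 'XT' prefix
        counts.update(t for t in set(terms) if t.startswith("XT"))
    return [(tag[2:], n) for tag, n in counts.most_common(limit)]

# Each list stands for the terms of one package picked for the rset
rset = [
    ["dungeon", "explore", "XTgame::rpg:rogue", "XTuse::gameplaying"],
    ["dungeon", "XTgame::rpg", "XTuse::gameplaying"],
    ["explore", "XTgame::rpg:rogue", "XTuse::gameplaying", "XTuitoolkit::ncurses"],
]
print(suggest_tags(rset, limit=2))
```

The point is the same as in the real thing: the tags come from the packages that matched, not from matching the keywords against tag descriptions.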

That's it. We turned a package search into a tag search, and this allows us to search for tags using keywords that are not present in the tag descriptions at all:

$ ./axi-query-tags.py explore the dungeons
27.50 game::rpg:rogue
26.14 use::gameplaying
17.53 game::rpg
10.27 uitoolkit::ncurses
...

$ ./axi-query-tags.py total world domination
7.55 use::gameplaying
5.68 x11::application
5.35 interface::x11
5.05 game::strategy
...

You can use the wsvn interface to get to the full source code and the module it uses.

You can see a similar technique working in the Debtags tag editor: enter a package, then choose "Available tags: search".

Next in the series: search as you type.

Posted Tue 06 Nov 2007 23:02:46 CET Tags: debian
apt-xapian-index: adaptive quality cutoff

I've recently posted:

Note that I've rewritten all the old posts to only show the main code snippets: if you were put off by the large lumps of code, you may want to give it another go.

Today I'll show how to implement an adaptive cutoff to get rid of the worst results.

Recall that Xapian shows results by decreasing order of quality, and as we pull more and more results out, we reach a point where the matches are so approximate that they look random.

This can be a problem if we want to change the order of the result, for example we may want to sort by package size, or by popcon popularity. There are many scenarios in which a really bad match could end up at the top of the results.

For most cases, you just want to say "discard all results whose quality is less than 70%". But sometimes you have queries that OR lots of terms, and even your top result, while still being a very good result, may be below the cutoff you decided.

Implementing an adaptive cutoff is extremely simple: first, you get the quality estimate of the top result:

# Retrieve the first result, and check its relevance
matches = enquire.get_mset(0, 1)
topWeight = matches[0].weight

Then you tell Xapian that you want a cutoff value that is, for example, 70% of that:

# Tell Xapian that we only want results that are at least 70% as good as that
enquire.set_cutoff(0, topWeight * 0.7)

Finally, you repeat the query. If you want, you can go for bigger result sets, as the cutoff will make it so that if you have lots of results, they will very likely be all good results:

matches = enquire.get_mset(0, 200)
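
The same logic can be tried out on a plain list of weights. This is a minimal sketch in Python 3 that does not use Xapian at all: adaptive_cutoff and the sample weights are made up for this example.

```python
def adaptive_cutoff(results, factor=0.7):
    """Keep the results whose weight is at least factor times the top weight.

    results is a list of (weight, name) pairs sorted by decreasing weight.
    """
    if not results:
        return []
    threshold = results[0][0] * factor
    return [(w, name) for w, name in results if w >= threshold]

# A query with a strong top result...
print(adaptive_cutoff([(27.5, "a"), (26.1, "b"), (19.3, "c"), (10.2, "d"), (3.0, "e")]))
# ...and one where even the best match has a low weight
print(adaptive_cutoff([(4.0, "x"), (3.0, "y"), (1.0, "z")]))
```

In the second query a fixed threshold tuned for the first one would have thrown everything away; the adaptive one still keeps the best of what is there.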

This is it.

You can use the wsvn interface to get to the full source code and the module it uses.

Next in the series: smart way of querying tags.

Posted Sat 27 Oct 2007 21:51:43 CEST Tags: debian
apt-xapian-index: performing a simple query

I've recently posted an introduction of apt-xapian-index.

Today I'll show how to make simple queries to apt-xapian-index. If you feel like reimplementing my examples in another language, let me know and I'll include it in the post.

What I'm going to build is a replacement for apt-cache search that:

This is just a beginning: in future blog posts I'll show how to enhance a search with interesting advanced features.

First thing, we need to import the Xapian module, found in the package python-xapian. Documentation on the Python Xapian API can be found in /usr/share/doc/python-xapian.

import xapian

Then we open the apt-xapian-index database:

# Instantiate a xapian.Database object for read only access to the index
db = xapian.Database("/var/lib/apt-xapian-index/index")

Now we build a query from the command line arguments. We'll assume that if an argument is in the form foo::bar, then it's a Debtags tag instead of a normal keyword.

For normal keywords, we also search for the stemmed version, so that we can, for example, find "editing" when the user searches for "edit".

import sys

# The command line arguments are the words to search for
args = sys.argv[1:]

# Stemmer function to generate stemmed search keywords
stemmer = xapian.Stem("english")

# Build the terms that will go in the query
terms = []
for word in args:
    if word.islower() and word.find("::") != -1:
        # According to /var/lib/apt-xapian-index/README, Debtags tags are
        # indexed with the 'XT' prefix.
        terms.append("XT"+word)
    else:
        # If it is not a Debtags tag, then we consider it a normal keyword.
        word = word.lower()  # The index stores keywords all in lowercase
        terms.append(word)
        # If the word has a stemmed version, add it to the query.
        # /var/lib/apt-xapian-index/README tells us that stemmed terms have a
        # 'Z' prefix.
        stem = stemmer(word)
        if stem != word:
            terms.append("Z"+stem)

Now we have the terms for the query, and we can create a query that ORs them together.

One may ask, why OR and not AND? The reason is that, contrary to apt-cache, Xapian scores results according to how well they matched.

Matches that match all the terms will score higher than the others, so if we build an OR query what we really have is an AND query that gracefully degrades to looser matches when it runs out of perfect results.

This allows stemmed searches to work nicely: if you look for 'editing', then the query will be 'editing OR Zedit'. Packages with the word 'editing' will match both and score higher, and packages with the word 'edited' will still match 'Zedit' and get included in the results.
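
Here is a toy model of that behaviour in Python 3, with no Xapian involved. The score is just the number of query terms that a package matches, which is much cruder than Xapian's real weighting; or_query and the tiny index are made up for this example.

```python
def or_query(index, terms):
    """Return (score, name) pairs for the documents matching any of the
    terms, best first. The score is the number of terms matched."""
    wanted = set(terms)
    scored = []
    for name, doc_terms in index.items():
        score = len(wanted & doc_terms)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties are broken by name to get a stable order
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored

# Package name -> set of indexed terms
index = {
    "gimp": {"image", "editing", "Zedit"},
    "vim": {"text", "editor", "Zedit"},
    "eog": {"image", "viewer"},
    "bash": {"shell"},
}
print(or_query(index, ["image", "editing", "Zedit"]))
```

The package that matches everything comes first, the partial matches follow, and packages that match nothing are left out: an AND query would have returned only the first one.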

# OR the terms together into a Xapian query
query = xapian.Query(xapian.Query.OP_OR, terms)

We then run the query. Queries are run through a xapian.Enquire object:

# Perform the query
enquire = xapian.Enquire(db)
enquire.set_query(query)

The Enquire object returns results as an mset. An mset represents a view of the result set, and can be iterated to access the resulting documents. Here we iterate the mset and output the result of the query, looking up short descriptions with apt:

import apt

# Open the Apt cache, used to look up the short descriptions
cache = apt.Cache()

# Display the top 20 results, sorted by how well they match
matches = enquire.get_mset(0, 20)
print("%i results found." % matches.get_matches_estimated())
print("Results 1-%i:" % matches.size())
for m in matches:
    # /var/lib/apt-xapian-index/README tells us that the Xapian document data
    # is the package name.
    name = m[xapian.MSET_DOCUMENT].get_data()

    # Get the package record out of the Apt cache, so we can retrieve the short
    # description
    pkg = cache[name]

    # Print the match, together with the short description
    print("%i%% %s - %s" % (m[xapian.MSET_PERCENT], name, pkg.summary))

This is it.

You can use the wsvn interface to get to the full source code.

You can run the code passing keywords and Debtags tags. Try running it as:

    ./axi-query-simple.py role::program image edit
    ./axi-query-simple.py role::program game::arcade
    ./axi-query-simple.py kernel image

You can search Debtags tags using debtags tagsearch. In a later blog post, I'll show how to use apt-xapian-index to implement a better-than-you-would-ever-have-thought-possible tag search.

Next in the series: adding simple result filters to the query.

Posted Fri 26 Oct 2007 16:43:21 CEST Tags: debian
apt-xapian-index: searching for similar packages

I've recently posted:

Today I'll show how to abuse Xapian to show a list of packages similar to a given one.

This time I'll try just linking to the code in wsvn, and showing only the most important bits in the blog.

So, we have a package name, and we want to show which packages are similar to that one.

To do it, we simply build a big OR query with all the terms indexed for that package: Xapian will show us the packages whose terms are most similar, and that does the trick.

This works because Xapian gives us the best results first, therefore even if no package except the given one will give an exact match, we still get the nearest matches first.

In order to get the list of indexed terms given a package name we need to do two things:

  1. Get the Xapian document for the package.
  2. Get the termlist of the document.

To get the Xapian document we search for a term that only that document can have. In the index, the package name is indexed with the special prefix "XP", so we can search for that:

def docForPackage(pkgname):
    "Get the document corresponding to the package with the given name"
    # Query the term with the package name
    query = xapian.Query("XP"+pkgname)
    enquire = xapian.Enquire(db)
    enquire.set_query(query)
    # Get the top result only
    matches = enquire.get_mset(0, 1)
    if matches.size() == 0:
        return None
    else:
        m = matches[0]
        return m[xapian.MSET_DOCUMENT]

Then we build the big term list, by iterating the termlist of the document:

# Build a term list with all the terms in the given package
terms = []
# Get the document corresponding to the package name
doc = docForPackage(pkgname)
if doc is None:
    raise SystemExit("Package %s not found in the index" % pkgname)
# Retrieve all the terms in the document, except the package name itself
for t in doc.termlist():
    if len(t.term) < 2 or t.term[:2] != 'XP':
        terms.append(t.term)

Note that it's trivial to fetch terms from more than one document, if you want to query "all packages a bit like this one and a bit like that one", although that's less of a useful feature.

Lastly, we build the final query:

# Build the big OR query
query = xapian.Query(xapian.Query.OP_AND_NOT,
            # Terms we want
            xapian.Query(xapian.Query.OP_OR, terms),
            # AND NOT the input packages
            xapian.Query("XP"+pkgname))

I add an AND_NOT part with the input package name so that the package we asked for does not show up in the output.
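
The whole procedure can be modelled on a toy index in Python 3, if you want to see it work without Xapian. Counting shared terms is a crude stand-in for Xapian's scoring; similar_packages and the sample index are made up for this example.

```python
def similar_packages(index, pkgname, limit=5):
    """Rank the other packages by how many indexed terms they share
    with pkgname."""
    # All the terms of the package, except the 'XP' package name term
    terms = {t for t in index[pkgname] if not t.startswith("XP")}
    scored = []
    for name, doc_terms in index.items():
        if name == pkgname:
            continue  # the AND_NOT part: skip the package we asked about
        shared = len(terms & doc_terms)
        if shared:
            scored.append((shared, name))
    # Most shared terms first; ties are broken by name
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:limit]

# Package name -> set of indexed terms
index = {
    "debtags": {"XPdebtags", "tag", "package", "search", "XTuse::searching"},
    "debtags-edit": {"XPdebtags-edit", "tag", "package", "editor", "XTuse::searching"},
    "tagcoll": {"XPtagcoll", "tag", "collection"},
    "bash": {"XPbash", "shell"},
}
print(similar_packages(index, "debtags"))
```

No other package matches all the terms, but the ranking still puts the nearest ones first, which is all we need.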

This is it:

$ ./axi-query-similar.py debtags
20309 results found.
Results 1-20:
33% debtags-edit - GUI browser and editor for Debian Package Tags
27% tagcolledit - GUI editor for tagged collections
25% libtagcoll2-dev - Functions used to manipulate tagged collections (development version)
24% tagcoll - Commandline tool to perform operations on tagged collections
19% packagesearch - GUI for searching packages and viewing package information
18% doodle - Desktop Search Engine (client)
18% doodled - Desktop Search Engine (daemon)
18% libept0 - High-level library for managing Debian package information
18% upgrade-system - system upgrader from Konflux
18% libept-dev - High-level library for managing Debian package information
17% ept-cache - Commandline tool to search the package archive
16% tracker-utils - metadata database, indexer and search tool - commandline tools
[...]

You can use the wsvn interface to get to the full source code and the module it uses.

Next in the series: adaptive quality cutoff.

Posted Fri 26 Oct 2007 16:40:33 CEST Tags: debian