Introducing apt-xapian-index

apt-xapian-index has just been approved into experimental, and over the next few days I'm going to blog more about it.

The package contains a tool called update-apt-xapian-index that indexes Debian package metadata into a Xapian index located at /var/lib/apt-xapian-index/index.

Only update-apt-xapian-index can write to the index. For everyone else it is read-only but world-readable: every user can query it at any time, concurrently, even while the index is being updated.

The index can contain more than package descriptions. update-apt-xapian-index indexes data using plugins located in /usr/share/apt-xapian-index/plugins, and any package can add its own. For example, debtags will provide a plugin to index tags.

Since Xapian can index numeric values as well, if anyone makes a popcon package that downloads popcon information, they can provide a plugin to index popcon values. If anyone makes an iterating package that downloads ratings, they can provide a plugin to index ratings.

Another plugin could be a specialised Debian stemmer that generates tokens such as foo out of libfoo-dev.
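To make the stemmer idea concrete, here is a minimal Python sketch of just the token-derivation step. It is not an actual plugin and does not use the plugin interface: the function name debian_tokens and the list of suffixes are my own assumptions, chosen only for illustration.

```python
import re

# Hypothetical list of packaging suffixes to strip; illustrative, not exhaustive.
SUFFIXES = ("-dev", "-dbg", "-doc", "-common", "-data")


def debian_tokens(package_name):
    """Return extra search tokens derived from a Debian package name.

    Assumed rule set: drop one known packaging suffix, drop a leading
    "lib", then drop trailing version digits, dots and dashes.
    """
    original = package_name.lower()
    name = original

    # Strip at most one packaging suffix such as -dev.
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break

    # Strip the "lib" prefix, unless nothing would be left.
    if name.startswith("lib") and len(name) > 3:
        name = name[3:]

    # Strip trailing version characters (libfoo2 -> foo); keep the name
    # unchanged if stripping would leave it empty.
    name = re.sub(r"[\d.\-]+$", "", name) or name

    # Only report a token if it differs from the package name itself.
    return [name] if name != original else []
```

With these rules, debian_tokens("libfoo-dev") yields the token foo, while a name like vim yields no extra tokens.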

I think you get the idea: it's very extensible. You can have a look at the initial set of plugins in subversion.

The index is also self-documenting, so that one can keep track of all the interesting things that can be found in it. update-apt-xapian-index not only maintains the index, but also the file /var/lib/apt-xapian-index/README, which aggregates the documentation provided by the plugins.

To query the index, you just use Xapian. Debian contains Xapian bindings for various languages.
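As a rough sketch of what that could look like with the Python bindings: the code below opens the index read-only and runs an AND query. It rests on two assumptions of mine: that plain words are indexed as lowercase terms without a prefix, and that each document's data field holds the package name. Check the generated README to see what your index really contains. The helper names query_terms and search are made up for this example.

```python
INDEX_PATH = "/var/lib/apt-xapian-index/index"


def query_terms(text):
    """Split free text into lowercase terms (assumed to match the index)."""
    return [word.lower() for word in text.split()]


def search(text, limit=20, index_path=INDEX_PATH):
    """Return (percent, data) pairs for documents matching all the terms."""
    # Imported here so the helper above works without the bindings installed.
    import xapian

    # Opening a Database (not a WritableDatabase) gives read-only access.
    db = xapian.Database(index_path)
    enquire = xapian.Enquire(db)
    enquire.set_query(xapian.Query(xapian.Query.OP_AND, query_terms(text)))

    results = []
    for match in enquire.get_mset(0, limit):
        data = match.document.get_data()
        # Newer bindings return bytes; decode for display.
        if isinstance(data, bytes):
            data = data.decode("utf-8", "replace")
        # Assumption: the document data is the package name.
        results.append((match.percent, data))
    return results
```

Calling search("image editor") on a machine with the index installed should then list the best matching documents, most relevant first.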

Over the next few days I am going to post various example queries and interesting tricks that the index makes possible.

It's going to be fun.

Next in the series: performing a simple query.