Introducing apt-xapian-index
apt-xapian-index has just been accepted into experimental, and over the next few days I'm going to blog more about it.
The package contains a tool called
update-apt-xapian-index that indexes Debian package
metadata into a Xapian index
located at /var/lib/apt-xapian-index/index.
Only update-apt-xapian-index can write to the
index; for everyone else it is read-only. It is, however,
world-readable: every user can query it at any time, concurrently
with other users, even during index updates.
The index can contain more than package descriptions.
update-apt-xapian-index indexes data using plugins
located in /usr/share/apt-xapian-index/plugins, and
any package can add its own. For example, debtags will provide a
plugin to index tags.
Since Xapian can index numeric values as well, if anyone makes a popcon package that downloads popcon information, they can provide a plugin to index popcon values. If anyone makes an iterating package that downloads ratings, they can provide a plugin to index ratings.
Another plugin could be a specialised Debian stemmer that generates tokens such as foo out of libfoo-dev.
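To make the stemmer idea concrete, here is a minimal sketch of the token-generating part only. The function name and the lists of prefixes and suffixes are my own assumptions for illustration. This is not code from apt-xapian-index, and it does not show the plugin interface itself.

```python
# Hypothetical affix lists: a real Debian stemmer would need a more careful set.
PREFIXES = ("lib",)
SUFFIXES = ("-dev", "-dbg", "-doc", "-common", "-data")


def debian_stem(package_name):
    """Return the list of extra tokens derived from a Debian package name.

    The list is empty when nothing could be stripped from the name.
    """
    stem = package_name
    # Strip at most one known prefix, never leaving an empty string.
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) > len(prefix):
            stem = stem[len(prefix):]
            break
    # Strip at most one known suffix, never leaving an empty string.
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) > len(suffix):
            stem = stem[:-len(suffix)]
            break
    return [stem] if stem != package_name else []
```

This is deliberately naive: any name that merely starts with "lib" gets the prefix removed, so a real plugin would want a stricter rule.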
I think you get the idea: it's very extensible. You can have a look at the initial set of plugins in Subversion.
The index is also self-documenting, so that one can
keep track of all the interesting things that can be found in it.
update-apt-xapian-index maintains not only the
index, but also the file
/var/lib/apt-xapian-index/README, which aggregates
the documentation provided by the plugins.
To query the index, you just use Xapian. Debian contains Xapian bindings for various languages:
- libxapian-dev for C++
- libsearch-xapian-perl for Perl
- libxapian-ruby1.8 for Ruby
- python-xapian for Python
- tclxapian for Tcl
- php5-xapian for PHP
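As a rough idea of what querying looks like with python-xapian, here is a minimal sketch. It assumes that package descriptions are indexed as plain lowercase terms and that each document's data field holds something printable, such as the package name. Neither is stated above, so check the README in the index directory for what is really stored. The helper names are mine.

```python
# Index location as given in this post.
INDEX_PATH = "/var/lib/apt-xapian-index/index"


def build_terms(words):
    """Lowercase the query words and drop empty ones.

    Assumption: the index stores plain lowercase terms.
    """
    return [w.lower() for w in words if w]


def search(words, path=INDEX_PATH, limit=10):
    """Return (percent, data) pairs for documents matching all the words."""
    import xapian  # provided by python-xapian; imported only when searching

    db = xapian.Database(path)  # opens the index read-only
    enquire = xapian.Enquire(db)
    enquire.set_query(xapian.Query(xapian.Query.OP_AND, build_terms(words)))

    results = []
    for match in enquire.get_mset(0, limit):
        data = match.document.get_data()
        if isinstance(data, bytes):
            data = data.decode("utf-8", "replace")
        results.append((match.percent, data))
    return results
```

Calling search(["image", "editor"]) would then return up to ten matches, best first, each with its relevance percentage.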
Over the next few days I am going to post various example queries and interesting tricks that the index makes possible.
It's going to be fun.
Next in the series: performing a simple query.