apt-xapian-index - Everything you always wanted to index about Debian packages, but were afraid to ask
apt-xapian-index is a maintenance tool for a Xapian index of Debian package information. This package provides update-apt-xapian-index, a tool to maintain a Xapian index of Debian package information in /var/lib/apt-xapian-index.
update-apt-xapian-index allows plugins to be installed in /usr/share/apt-xapian-index to index all sorts of extra information, such as Debtags tags, popcon information, package ratings and anything else that would fit.
The index generated by update-apt-xapian-index is self-documenting, as it contains an autogenerated README file with information on the index layout and all the data that can be found in it.
News
Cross-distro Meeting on Application Installer
I have been to a Cross-distro Meeting on Application Installer which to the best of our knowledge is also the first one of its kind. Credit goes to Vincent Untz for organising it, to OpenSUSE for hosting it and to the various sponsors for getting us there.
It went surprisingly well. We got along, got stuff done, did as much work as possible to agree on as many formats, protocols and technologies as we possibly could.
The timing of it is very important, as most major distros would like to adopt some of the features that just became popular in the various new app markets and stores, such as screenshots, user comments and ratings. It looks like a lot of new code is about to be written, or a lot of existing code is about to gain quite a bit of popularity.
For my part, I presented the work on Debtags and apt-xapian-index.
With regard to Debtags, other distros seem to be missing a comprehensive classification system, and Debtags is, well, it.
With regard to apt-xapian-index, we noticed that it is the perfect back-end for what everyone would like to do: the index structure is rather distribution-agnostic, and it has been road-tested with considerable success by at least software-center. It attracted quite a bit of interest, and will likely attract some more.
Just to prove a point I put together a prototype webby markety appy thing in just a few hours of work.
The meeting was also the ideal place to create a joint effort to match package names across distributions, which means that a lot of things that were hard to share before, such as screenshots, tags and patches, are suddenly not hard to share anymore.
A prototype webby markety appy thing
What better way to introduce my work at an Application Installer meeting than to come with a prototype package browser, modeled after shopping sites and developed in just a few hours?
It's a little Flask webapp that just works on any Debian system, using the local apt-xapian-index as a backend. It has fast keyword search, faceted navigation and screenshots, and it runs on your system showing the packages that you have available.
To try it:
git clone git://git.debian.org/users/enrico/pkgshelf.git
cd pkgshelf
./web-server.py
Then visit http://localhost:5000
It hasn't had much interface polishing, as it's just a quick technology demo. However, you can see that:
- keyword search is fast (fast enough that it could be made to search as you type);
- relevant tags appear on the left, grouped by facets;
- the most relevant tags are highlighted;
- the less relevant tags could be hidden behind a [more] expander;
- you can choose several strategies to hide packages you may find irrelevant.
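Grouping tags by facet is simple because Debtags tag names have the form facet::tag. The following is only a minimal sketch of that grouping step, not the code used in pkgshelf; the function name group_by_facet is made up for illustration, and the sample tags are the ones from the apt-xapian-index alias example.

```python
from collections import defaultdict


def group_by_facet(tags):
    """Group 'facet::tag' names by facet (hypothetical helper, not from pkgshelf)."""
    facets = defaultdict(list)
    for tag in tags:
        # Split on the first '::' only: the tag part may itself contain ':'
        facet, sep, name = tag.partition("::")
        if not sep:
            # Not in facet::tag form, so it cannot be placed under a facet
            continue
        facets[facet].append(name)
    # Sort the tag names inside each facet for stable display
    return {facet: sorted(names) for facet, names in facets.items()}


if __name__ == "__main__":
    sample = [
        "office::spreadsheet",
        "works-with::image:raster",
        "works-with::3dmodel",
        "office::presentation",
    ]
    for facet, names in sorted(group_by_facet(sample).items()):
        print(facet, names)
```

A real interface would also need to rank the tags by relevance before showing them, which this sketch leaves out.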
Things that need doing:
- hiding uninteresting facets;
- making it pretty.
It's essentially JavaScript and CSS work. Anyone want to play?
Using the apt-xapian-index catalogedtime plugin
apt-xapian-index has recently gained a catalogedtime plugin, which stores the timestamp of when the package was first cataloged.
At first glance this does not mean much, so I'll explain with an example. aptitude can show a list of "New Packages": those packages that it sees now for the first time. It is a very useful feature, but you cannot ask it "what packages became available during the last week?"
With the catalogedtime plugin, the apt xapian index can now answer that question. You can also sort package results by "newness", or implement a user interface with some sort of "newness timeline" package filter option.
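To make the "last week" question concrete without needing an index at hand, here is a rough pure-Python sketch of the same logic: compute a cutoff timestamp, keep what was first seen after it, and sort newest first. The function name new_packages, the dictionary of first-seen timestamps and the package names are all invented for illustration; the real data lives in the Xapian index.

```python
import time

SECONDS_PER_DAY = 86400


def new_packages(first_seen, days, now=None):
    """Return (name, timestamp) pairs first seen in the last `days` days, newest first.

    `first_seen` is a hypothetical {package name: unix timestamp} mapping
    standing in for the catalogedtime values stored in the index.
    """
    if now is None:
        now = time.time()
    cutoff = now - SECONDS_PER_DAY * days
    recent = [(name, ts) for name, ts in first_seen.items() if ts >= cutoff]
    # Newest first, which is the "sort by newness" order
    recent.sort(key=lambda item: item[1], reverse=True)
    return recent


if __name__ == "__main__":
    now = time.time()
    sample = {
        "example-old": now - 30 * SECONDS_PER_DAY,
        "example-recent": now - 2 * SECONDS_PER_DAY,
        "example-today": now - 3600,
    }
    for name, ts in new_packages(sample, 7, now):
        print(time.strftime("%c", time.localtime(ts)), name)
```

A "newness timeline" filter would amount to calling this with different values of days.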
Of course you will find documentation about it in /var/lib/apt-xapian-index/README, since the index is self-documenting.
As usual in my apt-xapian-index posts, here is a code example:
#!/usr/bin/python
# coding: utf-8

# Show packages that were NEW in the last N days

# (C) 2010 Enrico Zini <enrico@enricozini.org>
# License: WTFPL version 2 (http://sam.zoy.org/wtfpl/)

import axi
import xapian
import time
import sys

class App:
    def __init__(self, days):
        self.db = xapian.Database(axi.XAPIANINDEX)

        # Read the contents of /var/lib/apt-xapian-index/values
        self.values, self.descs = axi.readValueDB()

        # Find the value ID for the catalogedtime timestamps
        self.cattime_vid = self.values["catalogedtime"]

        # Keep only packages first seen within the last `days` days
        self.query = xapian.Query(xapian.Query.OP_VALUE_GE, self.cattime_vid,
                                  xapian.sortable_serialise(time.time() - 86400*days))

    def main(self):
        # Perform the query
        self.enquire = xapian.Enquire(self.db)
        self.enquire.set_query(self.query)

        # Sort by cataloged time, newest first
        self.enquire.set_sort_by_value(self.cattime_vid, True)

        # Print the first 200
        matches = self.enquire.get_mset(0, 200)
        for m in matches:
            name = m.document.get_data()
            val = m.document.get_value(self.cattime_vid)
            fval = xapian.sortable_unserialise(val)
            tm = time.localtime(fval)
            timestr = time.strftime("%c", tm)
            print("%s %s" % (timestr, name))

if __name__ == "__main__":
    try:
        days = int(sys.argv[1])
    except (IndexError, ValueError):
        # Missing or non-numeric argument
        sys.stderr.write("Usage: %s days\n" % sys.argv[0])
        sys.exit(1)

    app = App(days)
    app.main()
    sys.exit(0)
Enjoy!
fuss-launcher: an application launcher built on apt-xapian-index
Long ago I blogged about using apt-xapian-index to write an application launcher.
Now I just added a couple of new apt-xapian-index plugins that look like they have been made just for that.
In fact, they have indeed been made just for that.
After my blog post in 2008, people from Truelite and the FUSS project took up the challenge and wrote a launcher applet around my example engine.
The prototype has been quite successful in FUSS, and as a consequence I've been asked (and paid) to bring in some improvements.
The result, which I have just uploaded to NEW, is a package called fuss-launcher:
* New upstream release
- Use newer apt-xapian-index: removed need of local index
- Dragging a file in the launcher shows the applications that can open it
- Remembers the applications launched more frequently
- Allow to set a list of favourite applications
To get it:
- apt-get install fuss-launcher (after it has passed NEW);
- or git clone http://git.fuss.bz.it/git/launcher.git/ and apt-get install python-gtk2 python-xapian python-xdg apt-xapian-index app-install-data
It requires apt-xapian-index >= 0.35.
To try it:
- Make sure your index is up to date, especially if you just installed app-install-data: just run update-apt-xapian-index as root.
- Run fuss-launcher.
- Click on the new tray icon to open the launcher dialog.
- Type some keywords and see the list of matching applications come to life as you type.
It's worth mentioning again that all this work was sponsored by Truelite and the FUSS project, which rocks.
Some screenshots:
When you open the launcher, by default it shows the most frequently started applications and the favourite applications:
When you type some keywords, you get results as you type, and context-sensitive completion:
When you drag a file on the launcher you only see the applications that can open that file:
New apt-xapian-index plugins
Besides a fair bit of refactoring and cleanup, I've recently added two new plugins to apt-xapian-index:
app-install
If app-install-data is installed, information about .desktop files will now enter the index.
This makes it possible, for example, to limit query results to only those packages that contain .desktop files, which is quite useful for building desktop-oriented package managers.
aliases
It reads term->aliases mappings from files in /etc/apt-xapian-index/aliases/ or /usr/share/apt-xapian-index/aliases/, and feeds them as synonyms into the index.
apt-xapian-index ships an example alias file, to give people who know the wrong software names a chance to find the right ones:
# Aliases expanding names of popular applications
excel XToffice::spreadsheet
powerpoint XToffice::presentation
photoshop XTworks-with::image:raster
coreldraw XTworks-with::image:vector
autocad XTworks-with::3dmodel
Notice how it is possible to use index terms that happen to be Debtags tags as synonyms, which yields better results, language independence and extra coolness.
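For the curious, reading a file in this format takes only a few lines. The sketch below is not the plugin's actual code: it assumes, based only on the example above, that lines starting with # are comments, that fields are separated by whitespace, and that the first field is the term and the remaining fields are its aliases. The name parse_aliases is hypothetical.

```python
def parse_aliases(lines):
    """Parse alias lines into a {term: [aliases]} mapping.

    Assumed format (inferred from the shipped example, not from the plugin source):
    '#' starts a comment line, the first word is the term, the rest are aliases.
    """
    mapping = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        term, *aliases = line.split()
        # A term may appear on several lines: accumulate its aliases
        mapping.setdefault(term, []).extend(aliases)
    return mapping


if __name__ == "__main__":
    example = """\
# Aliases expanding names of popular applications
excel XToffice::spreadsheet
powerpoint XToffice::presentation
photoshop XTworks-with::image:raster
"""
    for term, aliases in parse_aliases(example.splitlines()).items():
        print(term, "->", ", ".join(aliases))
```

Merging the files from both directories would then be a matter of feeding each file's lines through the same function.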
apt-xapian-index now comes with a query tool
I've just uploaded a new version of apt-xapian-index to unstable. Now it comes with a little query tool called axi-cache.
You can search this way:
axi-cache search foo bar baz facet::tag sec:section
In fact, you can use most of the things described here.
You can then say axi-cache more to get more results, or axi-cache again to
retry a search, or axi-cache again wibble wabble to add keywords to the
last search.
This lets you start with a search and tweak it. For this to work, axi-cache needs to
save the last search so that again or more can amend it. Searches are saved in
~/.cache/axi-cache.state.
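The idea of persisting the last search can be sketched as follows. This is not how axi-cache stores its state: the JSON format, the field names terms and first, the function names and the default count of 20 are all assumptions made up for this illustration.

```python
import json


def save_state(path, state):
    """Write the search state to `path` (JSON is an assumption, not the real format)."""
    with open(path, "w", encoding="utf-8") as fd:
        json.dump(state, fd)


def load_state(path):
    """Read back the state written by save_state."""
    with open(path, encoding="utf-8") as fd:
        return json.load(fd)


def search(path, terms):
    """Start a new search: remember its terms and start from the first result."""
    state = {"terms": list(terms), "first": 0}
    save_state(path, state)
    return state


def again(path, extra_terms=()):
    """Repeat the last search, optionally adding terms, restarting from the top."""
    state = load_state(path)
    state["terms"].extend(extra_terms)
    state["first"] = 0
    save_state(path, state)
    return state


def more(path, count=20):
    """Move the window of shown results forward by `count`."""
    state = load_state(path)
    state["first"] += count
    save_state(path, state)
    return state


if __name__ == "__main__":
    import os
    import tempfile

    path = os.path.join(tempfile.mkdtemp(), "demo.state")
    search(path, ["foo", "bar"])
    print(again(path, ["wibble", "wabble"]))
    print(more(path))
```

Because each command reloads the file, the three commands can run as separate processes, which is what a command line tool needs.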
You can search tags instead of packages by adding --tags.
It will suggest extra terms for the search, and also suggest extra tags.
It can even correct spelling mistakes in the query terms once the index has
been rebuilt with this new version of update-apt-xapian-index.
I need to thank Carl Worth who, with notmuch, reminded me that if I just build a nice interface on top of Xapian's query parser I go quite a long way towards making a Xapian database extremely useful indeed.
axi-cache also integrates with bash-completion so that tab completion is
context-sensitive to the command line being typed:
$ axi-cache search image pro
probability process processors programmability provides
problem processing production pronounced proving
$ axi-cache search kernel pro
problems processor production proved provided
processing processors programming provide provides
Thanks to David Paleino who wrote the bash completion script.
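To show why the suggestions differ between the two command lines, here is a toy model of context-sensitive completion: candidates are taken only from documents that already match the keywords typed so far. It is an in-memory stand-in, not the completion script or the --tabcomplete implementation, and the function name and sample word sets are invented.

```python
def complete(documents, keywords, prefix):
    """Suggest words starting with `prefix`, drawn from documents matching all `keywords`.

    `documents` is a hypothetical list of word collections standing in for the index.
    """
    wanted = set(keywords)
    candidates = set()
    for words in documents:
        words = set(words)
        # Only documents containing every keyword typed so far contribute words
        if wanted <= words:
            candidates.update(w for w in words if w.startswith(prefix))
    # Do not suggest words that are already on the command line
    return sorted(candidates - wanted)


if __name__ == "__main__":
    docs = [
        {"image", "processing", "provides"},
        {"kernel", "processor", "provides"},
        {"image", "production"},
    ]
    print(complete(docs, ["image"], "pro"))
    print(complete(docs, ["kernel"], "pro"))
```

The same prefix gives different suggestions depending on the keywords before it, which is the behaviour seen in the shell session above.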
Just for reference, this is the command line help:
$ axi-cache help
Usage: axi-cache [options] command [args]

Query the Apt Xapian index.

Commands:
  axi-cache help            show a summary of commands
  axi-cache search [terms]  start a new search
  axi-cache again [query]   repeat the last search, possibly adding query terms
  axi-cache more [count]    show more terms from the last search

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -s SORT, --sort=SORT  sort by the given value, as listed in
                        /var/lib/apt-xapian-index/values
  --tags                show matching tags, rather than packages
  --tabcomplete=TYPE    suggest words for tab completion of the current
                        command line (type is 'plain' or 'partial')
If you install the package for the first time, you may need to rebuild the
index by running update-apt-xapian-index as root before using axi-cache.



