apt-xapian-index: search as you type

I've recently posted:

Note that I've rewritten all the old posts to only show the main code snippets: if you were put off by the large lumps of code, you may want to give it another go.

Today I'll show how to implement a very attractive feature for a user interface: search as you type. The idea is that you don't need to press enter to fire up a query: instead, the results materialise in front of your eyes as you type them.

The example I created uses curses, but the idea is good on any interactive user interface.

The main thing to keep in mind with search as you type is that the last word is likely to be partially typed, unless maybe some timeout expired since the user's last keystroke.

Xapian comes into help here, as it allows us to expand the partially typed word into an OR query with all the terms that start with it. This means that if we are typing, for example, "progr", we can turn the query into "program OR programmer OR programming OR programmed [...and so on...]".

I won't show the UI code, except a simple input loop that triggers the query at every keystroke:

    def mainloop(self):
        while True:
            c = self.win.getch()
            self.line += chr(c)

The interesting part is in the update function.

First we split the line in words and convert the words into a query:

        # Split the line in words
        args = self.splitline.split(line)
        # Convert the words into terms for the query
        terms = termsForSimpleQuery(args)

Then we expand the last word with all possible completions:

        # Since the last word can be partially typed, we add all words that
        # begin with the last one.
        terms.extend([x.term for x in db.allterms(args[-1])])

Now we can build the query. Of course you can add all other sorts of things to the query, for example a boolean expression of tag filter like in axi-query-pkgtype.py; Xapian will cope.

        # Build the query
        query = xapian.Query(xapian.Query.OP_OR, terms)

Finally the query. For bonus points you can do the adaptive cutoff trick to discard bad results.

In my case, since I don't implement scrolling of results, I also limit them to what fits in the window:

        # Retrieve as many results as we can show
        mset = enquire.get_mset(0, self.size - 1)

Finally, draw the results on screen:

        # Redraw the window

        # Header
        self.win.addstr(0, 0, "%i results found." % mset.get_matches_estimated(), curses.A_BOLD)

        # Results
        for y, m in enumerate(mset):
            # /var/lib/apt-xapian-index/README tells us that the Xapian document data
            # is the package name.
            name = m[xapian.MSET_DOCUMENT].get_data()

            # Get the package record out of the Apt cache, so we can retrieve the short
            # description
            pkg = cache[name]

            # Print the match, together with the short description
            self.win.addstr(y+1, 0, "%i%% %s - %s" % (m[xapian.MSET_PERCENT], name, pkg.summary))


That's it, try it out.

You can use the wsvn interface to get to the full source code and the module it uses.

You can see a similar technique working in goplay, where it is also integrated with an interactive tag filter.

Next in the series: dynamic tag cloud.