Using the apt-xapian-index catalogedtime plugin

apt-xapian-index has recently gained a catalogedtime plugin, which stores, for each package, the timestamp of when it was first cataloged.

At first glance this does not mean much, so I'll explain with an example. aptitude can show a list of "New Packages": those packages that it sees now for the first time. It is a very useful feature, but you cannot ask it "what packages became available during the last week?"

With the catalogedtime plugin, apt-xapian-index can now answer that question. You can also sort package results by "newness", or implement a user interface with some sort of "newness timeline" package filter option.
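
To give a rough idea of what a "newness timeline" could look like, here is a minimal sketch that groups packages into age buckets from their cataloged timestamps. The bucket labels, the thresholds, and the newness_bucket and timeline helpers are my own assumptions for this example and are not part of apt-xapian-index; in real use the timestamps would come from the catalogedtime value of each document.

```python
import time

DAY = 86400

# Hypothetical buckets: (label, maximum age in days), checked in order
BUCKETS = [
    ("last day", 1),
    ("last week", 7),
    ("last month", 30),
]

def newness_bucket(cataloged, now=None):
    """Return the label of the first bucket a cataloged timestamp fits in."""
    if now is None:
        now = time.time()
    age_days = (now - cataloged) / float(DAY)
    for label, max_days in BUCKETS:
        if age_days <= max_days:
            return label
    return "older"

def timeline(packages, now=None):
    """Group (name, cataloged timestamp) pairs by bucket, newest first."""
    groups = {}
    newest_first = sorted(packages, key=lambda p: p[1], reverse=True)
    for name, cataloged in newest_first:
        groups.setdefault(newness_bucket(cataloged, now), []).append(name)
    return groups
```

A user interface could then offer the keys of the dictionary returned by timeline() as filter options.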

Of course you will find documentation about it in /var/lib/apt-xapian-index/README, since the index is self-documenting.

As usual in my apt-xapian-index posts, here is a code example:

#!/usr/bin/python
# coding: utf-8

# Show packages that were NEW in the last N days
# (C) 2010 Enrico Zini <enrico@enricozini.org>
# License: WTFPL version 2 (http://sam.zoy.org/wtfpl/)

import axi
import xapian
import time
import sys

class App:
    def __init__(self, days):
        self.db = xapian.Database(axi.XAPIANINDEX)
        # Read the contents of /var/lib/apt-xapian-index/values
        self.values, self.descs = axi.readValueDB()
        # Find the value ID for the catalogedtime timestamps
        self.cattime_vid = self.values["catalogedtime"]
        # Keep only packages first seen within the last `days` days
        self.query = xapian.Query(xapian.Query.OP_VALUE_GE, self.cattime_vid,
                    xapian.sortable_serialise(time.time() - 86400*days))

    def main(self):
        # Perform the query
        self.enquire = xapian.Enquire(self.db)
        self.enquire.set_query(self.query)
        # Sort by cataloged time, newest first (True reverses the order)
        self.enquire.set_sort_by_value(self.cattime_vid, True)
        # Print the first 200
        matches = self.enquire.get_mset(0, 200)
        for m in matches:
            name = m.document.get_data()
            val = m.document.get_value(self.cattime_vid)
            fval = xapian.sortable_unserialise(val)
            tm = time.localtime(fval)
            timestr = time.strftime("%c", tm)
            print "%s %s" % (timestr, name)


if __name__ == "__main__":
    try:
        days = int(sys.argv[1])
    except (IndexError, ValueError):
        print >>sys.stderr, "Usage: %s days" % sys.argv[0]
        sys.exit(1)

    app = App(days)
    app.main()
    sys.exit(0)
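
The same value can also answer questions such as "what was new between two weeks ago and one week ago?". Here is a minimal sketch of that, assuming a range query built with xapian.Query.OP_VALUE_RANGE. The names window_bounds and window_query are hypothetical helpers of mine, and xapian is imported inside the function only so that the date arithmetic can be tried without the bindings installed.

```python
import time

def window_bounds(from_days_ago, to_days_ago, now=None):
    """Return (oldest, newest) timestamps for a window given in days ago."""
    if now is None:
        now = time.time()
    oldest = now - 86400 * max(from_days_ago, to_days_ago)
    newest = now - 86400 * min(from_days_ago, to_days_ago)
    return oldest, newest

def window_query(cattime_vid, from_days_ago, to_days_ago):
    """Build a query matching packages first cataloged inside the window."""
    import xapian  # imported here so window_bounds works without xapian
    oldest, newest = window_bounds(from_days_ago, to_days_ago)
    return xapian.Query(xapian.Query.OP_VALUE_RANGE, cattime_vid,
                        xapian.sortable_serialise(oldest),
                        xapian.sortable_serialise(newest))
```

In the script above, assigning window_query(self.cattime_vid, 14, 7) to self.query should list what appeared in the week before last.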

Enjoy!