Introductory notes about tagging

A group of nice Swiss researchers is doing some research on Debian, and as part of it they are about to tag around 400 libraries using Debtags. They feel that being able to distinguish projects by category will help them improve their analysis of component (library) reuse.

They asked me if I had useful Debtags information for them besides the website and the Debconf5 paper, so I did some digging and I am sharing the results here.

The list archives have various useful posts:

Also, the vocabulary itself has short and long comments for each tag, and the long comments sometimes have useful instructions.

To contribute properly reviewed tags, I suggested starting by posting tag patches on the mailing list: once we have discussed some of them and we trust each other, I'll be happy to give commit access to the svn repository.

To create a tag patch, one can use tagcoll:

svn cat svn:// > tags
cp tags tags.edited
[...edit tags.edited...]
tagcoll diff tags tags.edited
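
Before posting a patch, it can help to double-check which packages the edit actually touched. What follows is only a rough sketch in plain awk, not a replacement for tagcoll diff. It assumes the tag file has one line per package in the form "package: tag1, tag2", and the function name changed_packages is made up for illustration.

```shell
# Hypothetical helper (not part of tagcoll): print the names of the packages
# whose line differs between two tag files.
# Assumption: one "package: tag1, tag2" line per package.
changed_packages() {
    # $1 = original tag file, $2 = edited tag file
    awk -F': ' '
        # First file: remember the full line of every package.
        NR == FNR { before[$1] = $0; next }
        # Second file: report packages that are new or whose line changed.
        { seen[$1] = 1; if (before[$1] != $0) print $1 }
        # Finally, report packages that disappeared from the edited file.
        END { for (p in before) if (!(p in seen)) print p }
    ' "$1" "$2" | sort
}
```

Running `changed_packages tags tags.edited` should then list only the packages you meant to edit. If other names show up, the edited copy probably changed more than intended.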

Note that there are probably many new tags in the not-yet-reviewed database, so one may want to proceed this way:

  1. svn cat svn:// > tags
  2. remove from tags all the lines corresponding to the packages you're not interested in
  3. wget
  4. remove from tags-current all the lines corresponding to the packages you're not interested in
  5. tagcoll diff tags tags-current > changes
  6. edit changes, removing those changes that make no sense.
     You can review the edits you made to the changes patch this way:
       svn cat svn:// | tagcoll --patch-from=changes.orig > tmp1
       svn cat svn:// | tagcoll --patch-from=changes.edited > tmp2
       tagcoll diff tmp1 tmp2
  7. apply the reviewed patch to the svn repository:
       svn cat svn:// | tagcoll --patch-from=changes copy > tags-patched
  8. work from there
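
The "remove all the lines corresponding to the packages you're not interested in" steps can be tedious by hand when there are around 400 packages to keep. A minimal sketch of one way to automate them is shown here. It assumes each line of the tag file starts with the package name followed by ": ", and that you keep the names of the packages you care about in a separate file, one per line. The function name keep_packages and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper (not part of tagcoll): keep only the lines of a tag
# file that belong to the packages named in a list.
# Assumptions: the list has one package name per line with no trailing
# whitespace; the tag file has one "package: tag1, tag2" line per package.
keep_packages() {
    # $1 = file with the package names to keep, $2 = tag file to filter
    awk -F': ' '
        # First file: record every package name we want to keep.
        NR == FNR { want[$1] = 1; next }
        # Second file: print the line only if its package is wanted.
        $1 in want
    ' "$1" "$2"
}
```

For example, `keep_packages mylibs.txt tags > tags.mine` would write a filtered copy and leave the original file untouched. The same call can be repeated on tags-current before running tagcoll diff on the two filtered files.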

Everyone is of course free to use the list for any questions.