Static site generators

I decided to rethink the state of my personal site, and try out some of the new static site generators that are available now.

To do that, I jotted down a series of things that I want in a static site generator, then wrote a tool to convert my ikiwiki site to other formats, and set out to evaluate things.

As a benchmark I did a full rebuild of my site, which currently contains 1164 static files and 458 markdown pages.

My requirements

Free layout for my site

My / is mostly a high-level index to the site contents.

Blog posts are at /blog.

My talk archive is organised like a separate blog at /talks.

I want the freedom to create other sections of the site, each with its own rss feed, located wherever I want in the site hierarchy.

Assets next to posts

I occasionally blog just a photo with a little comment, and I would like the .md page with the comment to live next to the image in the file system.

I did not even know that I had this as a requirement until I found static site generators that mandated a completely different directory structure for markdown contents and for static assets.

Multiple RSS/Atom feeds

I want at least one RSS/Atom feed per tag, because I use tags for marking which articles go to http://planet.debian.org.

I also want RSS/Atom feeds for specific parts of the site, like the blog and talks.

Arbitrary contents in /index.html

I want total control over the contents of the main home page of the site.

Quick preview while editing contents

I do not like to wait several seconds for the site to be rebuilt at every review iteration of the pages I write.

This makes me feel like the task of editing is harder than it should be, and makes me lose motivation to post.

Reasonable time for a full site rebuild

I want to be able to run a full rebuild of the site in a reasonable time.

I could define "reasonable" in this case as how long I can stare at the screen without getting bored, starting to do something else, and forgetting what it was that I was doing with the site.

It is ok if a rebuild takes something like 10 or 30 seconds. It is not ok if it takes minutes.

Code and dependency ecosystems that I can work with

I can deal with Python and Go.

I cannot deal with Ruby or JavaScript.

I forgot all about Perl.

Also, if it isn't in Debian it does not exist.

Decent themes out of the box

One of my hopes in switching to a more mainstream generator is to pick and choose themes and easily give my site a more modern look.

Hugo

Hugo is written in Go and is in Debian testing.

Full rebuild time for my site is acceptable, and it can even parallelize:

  $ time hugo
  real      0m5.285s
  user      0m9.556s
  sys       0m1.052s

Free layout for my site was hard to get.

I could replace /index.html by editing the template page for it, but then I did not find out how to create another similar index in an arbitrary place.

Also, archetypes are applied only on the first path component of new posts, but I would like them instead to be matched on the last path component first, and failing that travelling up the path until the top. This should be easy to fix by reorganizing the content a bit around here.

For example, a path for a new blog post of mine could be blog/2016/debian/ and I would like it to match the debian archetype first, and failing that the blog archetype.
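The lookup order I have in mind can be sketched in a few lines of Python. This only illustrates the matching rule and is not Hugo code: the function name find_archetype and the set of archetype names are made up for the example.

```python
from pathlib import PurePosixPath


def find_archetype(post_path, available):
    """Return the archetype name for post_path, or None if nothing matches.

    Path components are tried from the last one up to the first, so
    blog/2016/debian/ prefers "debian" and falls back to "blog".
    """
    for component in reversed(PurePosixPath(post_path).parts):
        if component in available:
            return component
    return None


# Hypothetical archetype names, for illustration only
archetypes = {"blog", "debian", "talks"}
print(find_archetype("blog/2016/debian/", archetypes))  # debian
print(find_archetype("blog/2016/travel/", archetypes))  # blog
```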

Assets next to posts almost work.

Hugo automatically generates one feed per taxonomy element, and one feed per section. This would currently be sufficient for me, although I don't like the idea that sections map 1 to 1 to toplevel directories in the site structure.

Hugo has a server that watches the file system and rerenders pages as they are modified, so the quick preview while editing works fine.

About themes, it took me several tries to find a theme that would render navigation elements for both sections and tags. Most themes would render my pages with white components all around, and expect me to somehow dig in and tweak them. That frustrated me, because for quite a while I could not tell if I had misconfigured Hugo's taxonomies or if the theme was just somehow incomplete.

Nikola

Nikola is written in Python and is in Debian testing.

Full rebuild time for my site is almost two orders of magnitude more than Hugo, and I am miffed to find the phrases "Nikola is fast." or "Fast building process" on its front page and in its package description:

  $ time nikola build
  real      3m31.667s
  user      3m4.016s
  sys       0m24.684s

Free layout could be achieved by fiddling with the site configuration to tell it where to read sources.

Assets next to posts work after tweaking the configuration, but they require writing inconsistent links in the markdown source: https://github.com/getnikola/nikola/issues/2266 I have a hard time accepting that, because I want to author content with consistent semantic interlinking, so that 10 years from now I can parse it and convert it to something else if a new technology comes out.

Nikola generates one RSS/Atom feed per tag just fine. I have not tried generating feeds for different sections of the site.

Incremental generation inside its built-in server works fine.

Pelican

Pelican is written in Python and is in Debian testing.

Full rebuild time for my site is acceptable:

  $ time pelican -d
  real      0m18.207s
  user      0m16.680s
  sys       0m1.448s

By default, pelican seems to generate a single flat directory of html files regardless of the directory hierarchy of the sources. To have free layout, pelican needs some convincing in the configuration:

  PATH_METADATA = r"(?P<relpath>.+)\.md"
  ARTICLE_SAVE_AS = "{relpath}/index.html"

but even if I do that, the urls that it generates still point to just {slug}/index.html and I could not easily find a configuration option to fix that accordingly. I got quite uncomfortable at the idea of needing to configure content generation and linking to match, instead of having one automatically in sync with the other.

Having assets next to posts seems to be possible (by also setting STATIC_PATHS = ["."]), but I do not recall making progress on this front.

I did not manage to generate a feed for each tag out of the box, although there is probably some knob in the configuration for it.

I gave up on Pelican, as trying it out felt like a constant process of hacking the configuration away from defaults that do not make any sense for me, without even knowing if a configuration exists that would do what I need.

Ikiwiki

Ikiwiki is written in Perl and is in Debian. Although I am no longer proficient with Perl, I was already using ikiwiki, so it was worth considering.

Full rebuild time feels a bit on the slow side but is still acceptable:

  $ time ikiwiki --setup site.setup
  real      0m37.356s
  user      0m34.488s
  sys       0m1.536s

In terms of free site structure, and of feeds for all or part of the site, ikiwiki just excels.

I even considered writing a python web server that monitors the file system and calls ikiwiki --refresh when anything changes, and calling it a day.
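The monitoring half of that idea could look roughly like the sketch below. I never wrote this; it is a minimal polling loop using only the standard library, with no web server part, and the function names snapshot and watch are made up for the example.

```python
import os
import subprocess
import time


def snapshot(root):
    """Map every file path under root to its modification time."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtimes[path] = os.stat(path).st_mtime_ns
            except FileNotFoundError:
                # The file vanished between listing and stat: skip it
                pass
    return mtimes


def watch(root, command, interval=1.0):
    """Poll root forever, running command when a file is added, removed or modified."""
    previous = snapshot(root)
    while True:
        time.sleep(interval)
        current = snapshot(root)
        if current != previous:
            # A failed rebuild should not stop the watcher
            subprocess.run(command, check=False)
            previous = current
```

Assuming the sources live in a directory called src, it would be started with something like watch("src", ["ikiwiki", "--setup", "site.setup", "--refresh"]). Polling once a second is crude, but for a few hundred pages walking the tree costs next to nothing.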

However, when I tried to re-theme my website around a simple bootstrap boilerplate, I found that to be hard, as some of the HTML structure is hardcoded in Perl (and it's also my fault) and there is only so much that can be done by tweaking the (rather unreadable) templates.

siterefactor

During all these experiments I had built siterefactor to generate contents for all those static site engines, and it was going through all the contents quite fast:

  $ time ./siterefactor src dst -t hugo
  real  0m1.222s
  user  0m0.904s
  sys   0m0.308s

So I wondered how slow it would become if, instead of making it write markdown, I made it write HTML via python markdown and Jinja2:

  $ time ./siterefactor ~/zz/ikiwiki/pub/ ~/zz/ikiwiki/web -t web
  real  0m6.739s
  user  0m5.952s
  sys   0m0.416s

I then started wondering how much slower it would become if I implemented postprocessing of all local URLs generated by Markdown to make sure they are kept consistent even if the path of a generated page is different than the path of its source. Not much slower, really.
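To give an idea of what that postprocessing involves, here is a minimal sketch of rewriting a single link. It is not the actual siterefactor code: rewrite_link and the dst_by_src mapping from source paths to generated paths are names invented for the example, and all paths are assumed to be relative to the site root.

```python
import posixpath
from urllib.parse import urlsplit


def rewrite_link(href, src_page, dst_by_src):
    """Rewrite a link written in src_page so that it works from the generated page."""
    parts = urlsplit(href)
    if parts.scheme or parts.netloc or not parts.path or parts.path.startswith("/"):
        # External, fragment-only and absolute links are left alone
        return href
    # Resolve the link against the directory of the source page
    src_dir = posixpath.dirname(src_page)
    target_src = posixpath.normpath(posixpath.join(src_dir, parts.path))
    target_dst = dst_by_src.get(target_src)
    if target_dst is None:
        # Not a page or asset we know about: keep the link as written
        return href
    # Express the target relative to the directory of the generated page
    dst_dir = posixpath.dirname(dst_by_src.get(src_page, src_page))
    new_path = posixpath.relpath(target_dst, dst_dir)
    return parts._replace(path=new_path).geturl()


# Hypothetical site contents, for illustration only
site = {
    "blog/2016/hugo.md": "blog/2016/hugo/index.html",
    "blog/2016/photo.jpg": "blog/2016/photo.jpg",
    "blog/talks.md": "blog/talks/index.html",
}
print(rewrite_link("photo.jpg", "blog/2016/hugo.md", site))  # ../photo.jpg
```

The markdown source keeps linking to photo.jpg as a sibling file, and only the generated HTML knows that the page ended up one directory deeper.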

I then added taxonomies. And arbitrary Jinja2 templates in the input, able to generate page lists and RSS/Atom feeds.

And theming.

And realised that reading all the sources and cross-linking them took 0.2 seconds, and the rest was generation time. And that after cross-linking, each page can be generated independently from all the others.
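That independence means the generation step could in principle be handed to a pool of workers. The sketch below is not what my code does; it is only an illustration of the idea, where render is a stand-in for the real Markdown and template work and the page dictionaries are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def render(page):
    """Stand-in for the real work: Markdown plus template rendering."""
    return "<h1>{}</h1>\n{}".format(page["title"], page["body"])


def build(pages, max_workers=4):
    """Render every page, returning a dict from output path to HTML.

    No page needs the output of another, so the order of the calls does
    not matter and they can be spread over several workers.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        rendered = pool.map(render, pages)
        return {page["dst"]: html for page, html in zip(pages, rendered)}
```

Threads are used here only to keep the example short. Rendering is CPU-bound, so in Python a real speedup would more likely need ProcessPoolExecutor, which offers the same map interface.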

staticsite

So my site is now generated with staticsite:

  $ time ssite build
  real  0m6.833s
  user  0m5.804s
  sys   0m0.500s

It's comparable with Hugo, and in a single process.