Checking Sphinx code blocks

I'm too lazy to manually check code blocks in autogenerated Sphinx documentation to see if they are valid and reasonably up to date. Doing it automatically feels much more interesting to me: here's how I did it.

This is a simple Sphinx extension that extracts code blocks into a JSON file.

If the documentation is written well enough, I even get an annotation of the programming language each snippet is written in:

## Extract code blocks from sphinx

from docutils.nodes import literal_block, Text
import json

found = []

def find_code(app, doctree, fromdocname):
    for node in doctree.traverse(literal_block):
        # if "dballe.DB.connect" in str(node):
        lang = node.attributes.get("language", "default")
        for subnode in node.traverse(Text):
            # Record one entry per text chunk found in the code block
            found.append({
                "src": fromdocname,
                "lang": lang,
                "code": subnode.astext(),
                "source": node.source,
                "line": node.line,
            })

def output(app, exception):
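    # Skip writing results if the build failed
    if exception is not None:
        return

    dest = app.config.test_code_output
    # Nothing to do unless an output path is configured
    if dest is None:
        return
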
    with open(dest, "wt") as fd:
        json.dump(found, fd)

def setup(app):
    app.add_config_value('test_code_output', None, '')

    app.connect('doctree-resolved', find_code)
    app.connect('build-finished', output)

    return {
        "version": '0.1',
        'parallel_read_safe': True,
        'parallel_write_safe': True,
    }
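
To give an idea of what ends up in the JSON file, here is a minimal sketch that loads it and counts snippets per language. It assumes the file is a list of records with the same keys built in `find_code` (`src`, `lang`, `code`, `source`, `line`). The helper names `load_blocks` and `count_by_language`, the file name, and the sample records are made up for illustration.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def load_blocks(path):
    """Load the list of code block records from a JSON file."""
    with open(path, "rt") as fd:
        return json.load(fd)


def count_by_language(blocks):
    """Count how many extracted snippets are tagged with each language."""
    return Counter(block["lang"] for block in blocks)


if __name__ == "__main__":
    # Made-up sample records, shaped like the dicts built in find_code
    sample = [
        {"src": "index", "lang": "python", "code": "print('hello')\n",
         "source": "index.rst", "line": 10},
        {"src": "index", "lang": "default", "code": "$ make html\n",
         "source": "index.rst", "line": 20},
        {"src": "api", "lang": "python", "code": "import json\n",
         "source": "api.rst", "line": 5},
    ]

    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name: a real run would use the configured path
        path = Path(tmp) / "code_blocks.json"
        path.write_text(json.dumps(sample))

        for lang, count in count_by_language(load_blocks(path)).most_common():
            print(lang, count)
```

In a real build, the path would be whatever `test_code_output` is set to in `conf.py`. Snippets with no language annotation show up as `default`, which is the fallback used in `find_code`.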

And this is an early prototype of the Python code that runs each code block in a subprocess to see if it works.

It does interesting things, such as: