Writing a simple external data oriented Sphinx extension
What is Sphinx?
I love Sphinx. I was a little hesitant about RST at first. Coming from a MARKDOWN OR DEATH background, I was nervous about it being Yet Another Markup Language to learn, like wikicode and its ilk, but I was quickly won over by the power of directives.
But let’s back up. Sphinx lets you build a documentation website like those you’ve seen under the readthedocs.io domain, such as Boto3’s.
I built CloudWanderer’s documentation with Sphinx, and it automatically generates page after page of documentation by pulling in data from CloudWanderer and Boto3 and merging them together in a way that makes sense.
Here’s how I did it.
Creating a new Sphinx project
Grab yourself an empty folder and let’s install and initialise a new Sphinx project.
$ pip install sphinx sphinx_rtd_theme
$ sphinx-quickstart
Welcome to the Sphinx 4.0.0 quickstart utility.
Selected root path: .
> Separate source and build directories (y/n) [n]:
> Project name: Test
> Author name(s): Sam
> Project release []:
> Project language [en]:
Creating file ...
Finished: An initial directory structure has been created.
Great! Now open up the conf.py that’s been created and change
extensions = []
to
extensions = ["sphinx_rtd_theme"]
and
html_theme = "alabaster"
to
html_theme = "sphinx_rtd_theme"
This will replace the slightly old-school default theme with the Read the Docs theme and make us feel a bit more like we’re in 2021!
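For reference, here is a sketch of how the two changed settings might sit in conf.py once you’re done. Everything else that sphinx-quickstart generated is assumed to stay as it was.

```python
# conf.py (excerpt): only the two settings changed above are shown.

# Sphinx extensions to load; the Read the Docs theme is listed here.
extensions = ["sphinx_rtd_theme"]

# Theme used for the HTML output (sphinx-quickstart sets this to "alabaster").
html_theme = "sphinx_rtd_theme"
```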
Build HTML
Now you can build and open the HTML:
macOS
$ make html
$ open _build/html/index.html
PowerShell on Windows
> ./make.bat html
> ii _build/html/index.html
Voila! A newly built Sphinx documentation page!
Writing plain old RST
Open up index.rst and remove everything under:
Welcome to Test's documentation!
================================
(Tip: you don’t want to remove that, as that’s the RST equivalent of an H1 header.)
After that, type in any old thing you like. I find the best source of information on RST Sphinx markup is the Sublime and Sphinx Guide; just ignore the fact that it’s geared towards Sublime!
Once you’ve saved your index.rst with your new text, run make html or ./make.bat html and refresh your page.
Writing an extension
Sphinx has a bunch of built-in extensions, the most often used being autodoc and doctest, which respectively generate documentation from your code’s docstrings and let you test the example code snippets you write into your documentation. Both of these are MIND-blowingly awesome and you should check them out.
The tutorial I’m outlining here is based heavily on the tutorial from the official Sphinx documentation, streamlined a bit for my learning style, with a few extra bits sprinkled in to answer some of the things I found difficult to track down and understand when I went through it the first time.
Use Case
Sphinx’s extensions are hugely powerful, and we’re going to write one that brings data in from JSON files and makes it part of our documentation in a useful fashion.
Create a base extension
Start off by creating a folder called _ext with a file my_first_sphinx_extension.py inside it.
Your folder structure should now look a little like this.
$ tree -L 2
.
├── Makefile
├── _build
│ ├── doctrees
│ └── html
├── _ext
│ └── my_first_sphinx_extension.py
├── _static
├── _templates
├── conf.py
├── index.rst
└── make.bat
Open up your my_first_sphinx_extension.py file and enter the following:
from docutils import nodes
from sphinx.application import Sphinx
from sphinx.util.docutils import SphinxDirective


class ShowJsonDirective(SphinxDirective):
    has_content = True

    def run(self) -> list:
        # A directive returns the list of docutils nodes to insert into the page.
        return [nodes.paragraph("", "Hello World.")]


def setup(app: Sphinx) -> None:
    # Register the directive under the name used in the RST source.
    app.add_directive("show-json", ShowJsonDirective)
Now update conf.py to change
extensions = ["sphinx_rtd_theme"]
to include your new extension
extensions = ["sphinx_rtd_theme", "my_first_sphinx_extension"]
and also add
import sys
import os
sys.path.append(os.path.abspath("./_ext"))
anywhere in your conf.py so that it can find your new extension.
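If you’d rather not sprinkle path handling through conf.py, the same idea can be wrapped in a small helper. This is only a sketch: add_extension_dir is a made-up name, and it assumes (like the snippet above) that the path is resolved relative to the folder containing conf.py.

```python
import sys
from pathlib import Path


def add_extension_dir(conf_dir: str, folder: str = "_ext") -> str:
    """Put <conf_dir>/<folder> on sys.path (only once) and return its absolute path."""
    ext_dir = str(Path(conf_dir).resolve() / folder)
    if ext_dir not in sys.path:
        sys.path.insert(0, ext_dir)
    return ext_dir


# Hypothetical usage from conf.py, where "." is the folder holding conf.py.
add_extension_dir(".")
```

Calling it twice is harmless, because the folder is only added when it isn’t already on sys.path.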
Then add a reference to your new directive into index.rst, e.g.
Welcome to Test's documentation!
================================
.. show-json ::
And there we go! Well… that’s a bit underwhelming. Let’s at least get started with formatting our output!
Formatting
This is the bit I could not for the life of me figure out how to do when I read the official tutorial.
You’ll notice that we added our Hello World in a docutils.nodes.paragraph object.
So where do we find other semantic objects for us to place into our content?
As far as I can tell, the official canonical reference is the Docutils docs, but as they’re ironically littered with “to be completed”, I found it incredibly hard to find and understand the formatting elements I needed.
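For comparison, here is roughly what building a tiny bullet list out of node objects by hand can look like. Treat it as a sketch rather than the recommended way: build_bullet_list is a hypothetical helper, and it assumes docutils is importable (it is installed alongside Sphinx).

```python
from docutils import nodes


def build_bullet_list(items):
    """Return a docutils bullet_list node with one list_item per string."""
    bullet_list = nodes.bullet_list()
    for text in items:
        # Each list item wraps a paragraph, which in turn holds the text.
        item = nodes.list_item()
        item += nodes.paragraph(text=text)
        bullet_list += item
    return bullet_list


hello_world = build_bullet_list(["hello", "world"])
print(len(hello_world.children))
```

Even for two bullet points that is a fair bit of nesting to keep straight, which is part of why generating markup is appealing.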
So let’s just use RST! Screw creating the objects themselves, let’s just build out some RST markup and we’ll let Sphinx handle the rest!
To do that we need to add the following parse_rst method to our ShowJsonDirective class:
    def parse_rst(self, text: str) -> list:
        # Parse a string of RST with Sphinx's own parser and return the nodes.
        parser = RSTParser()
        parser.set_application(self.env.app)
        settings = OptionParser(
            defaults=self.env.settings,
            components=(RSTParser,),
            read_config_files=True,
        ).get_default_values()
        document = new_document("<rst-doc>", settings=settings)
        parser.parse(text, document)
        return document.children
This method expects one argument, a multi-line string of RST markup, and returns the corresponding docutils objects so we can return them from our extension’s run method.
To use this, we can change our run method to look like this:
    def run(self) -> list:
        return self.parse_rst("\n* hello\n* world")
This uses the * RST markup to render an unordered list with two elements (hello and world).
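Since the markup is just a string, it can be built from ordinary Python data. A minimal sketch, where to_bullet_list is a hypothetical helper name and not part of Sphinx:

```python
from typing import Iterable


def to_bullet_list(items: Iterable[str]) -> str:
    """Return RST for an unordered list, one "* item" line per entry."""
    return "".join(f"* {item}\n" for item in items)


rst = to_bullet_list(["hello", "world"])
print(rst, end="")
```

The resulting string could then be handed to parse_rst in place of the hard-coded one.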
Our whole extension file now looks like this (note the new imports):
from sphinx.application import Sphinx
from sphinx.parsers import RSTParser
from docutils.frontend import OptionParser
from sphinx.util.docutils import SphinxDirective
from docutils.utils import new_document


class ShowJsonDirective(SphinxDirective):
    has_content = True

    def run(self) -> list:
        return self.parse_rst("\n* hello\n* world")

    def parse_rst(self, text: str) -> list:
        # Parse a string of RST with Sphinx's own parser and return the nodes.
        parser = RSTParser()
        parser.set_application(self.env.app)
        settings = OptionParser(
            defaults=self.env.settings,
            components=(RSTParser,),
            read_config_files=True,
        ).get_default_values()
        document = new_document("<rst-doc>", settings=settings)
        parser.parse(text, document)
        return document.children


def setup(app: Sphinx) -> None:
    app.add_directive("show-json", ShowJsonDirective)
And if we run make html and refresh the page again we get…
Tip: If you run make html after editing your extension and nothing seems to have changed, it may be because Sphinx hasn’t detected any change in the doc source (because, well, there hasn’t been one!). You can force Sphinx to generate it again by removing the _build directory with rm -rf _build && make html
Hooking up the JSON
Now the easy part! Let’s make this dynamic!
Write a file called airports.json inside the base directory (alongside conf.py) and place the following inside it.
[
{"Name": "Heathrow", "Url": "https://www.heathrow.com/"},
{"Name": "Gatwick", "Url": "https://www.gatwickairport.com/"},
{"Name": "Luton", "Url": "https://www.london-luton.co.uk/"}
]
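Before wiring this into Sphinx, it can be worth checking that the data has the shape the extension will rely on. The sketch below is an assumption about what a useful check might look like; check_airports is a made-up helper, and the JSON is inlined so the example runs on its own.

```python
import json

SAMPLE = """
[
    {"Name": "Heathrow", "Url": "https://www.heathrow.com/"},
    {"Name": "Gatwick", "Url": "https://www.gatwickairport.com/"},
    {"Name": "Luton", "Url": "https://www.london-luton.co.uk/"}
]
"""


def check_airports(raw: str) -> list:
    """Parse the JSON text and make sure every entry has a Name and a Url."""
    airports = json.loads(raw)
    for entry in airports:
        missing = {"Name", "Url"} - set(entry)
        if missing:
            raise ValueError(f"airport entry {entry!r} is missing {sorted(missing)}")
    return airports


airports = check_airports(SAMPLE)
print([airport["Name"] for airport in airports])
```

Failing early with a clear message tends to be nicer than a KeyError halfway through a documentation build.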
Now update your extension’s file to the following:
import json

from sphinx.application import Sphinx
from sphinx.parsers import RSTParser
from docutils.frontend import OptionParser
from sphinx.util.docutils import SphinxDirective
from docutils.utils import new_document


class ShowJsonDirective(SphinxDirective):
    has_content = True

    def run(self) -> list:
        rst = ""
        # The path is relative to the directory make html is run from.
        with open("airports.json", "r") as f:
            airports = json.load(f)
        # Build one RST bullet per airport, each an external link.
        for airport in airports:
            rst += f"* `{airport['Name']} <{airport['Url']}>`_\n"
        print(rst)
        return self.parse_rst(rst)

    def parse_rst(self, text: str) -> list:
        # Parse a string of RST with Sphinx's own parser and return the nodes.
        parser = RSTParser()
        parser.set_application(self.env.app)
        settings = OptionParser(
            defaults=self.env.settings,
            components=(RSTParser,),
            read_config_files=True,
        ).get_default_values()
        document = new_document("<rst-doc>", settings=settings)
        parser.parse(text, document)
        return document.children


def setup(app: Sphinx) -> None:
    app.add_directive("show-json", ShowJsonDirective)
This will take our JSON, and convert it into the following RST:
* `Heathrow <https://www.heathrow.com/>`_
* `Gatwick <https://www.gatwickairport.com/>`_
* `Luton <https://www.london-luton.co.uk/>`_
Which creates a list of links as per the standard RST link syntax!
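The string-building step doesn’t need Sphinx at all, so it can be pulled out and tried on its own. A small sketch, with airports_to_rst as a hypothetical function name:

```python
from typing import Dict, Iterable


def airports_to_rst(airports: Iterable[Dict[str, str]]) -> str:
    """Return one RST bullet per airport, each rendered as an external link."""
    return "".join(
        f"* `{airport['Name']} <{airport['Url']}>`_\n" for airport in airports
    )


sample = [
    {"Name": "Heathrow", "Url": "https://www.heathrow.com/"},
    {"Name": "Gatwick", "Url": "https://www.gatwickairport.com/"},
]
print(airports_to_rst(sample), end="")
```

Keeping this separate from the directive makes it easy to check the generated markup without running a full build.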
The print will also make the above appear in the build output so that we can see the RST before it’s rendered when we run make html.
Once you have run make html again you can see our beautiful auto-generated output!
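One thing to be aware of: open("airports.json") is resolved against whatever directory the build is started from, which is fine when you run make html next to conf.py. If you want to be explicit about it, a sketch like the one below takes the folder as an argument. load_airports is a hypothetical helper; inside the directive the folder could, for instance, come from self.env.srcdir, assuming you want paths relative to the documentation source directory.

```python
import json
import tempfile
from pathlib import Path


def load_airports(base_dir, filename="airports.json"):
    """Load the airport list from <base_dir>/<filename>."""
    path = Path(base_dir) / filename
    with path.open("r", encoding="utf-8") as f:
        return json.load(f)


# Demo with a throwaway folder so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "airports.json").write_text(
        '[{"Name": "Luton", "Url": "https://www.london-luton.co.uk/"}]',
        encoding="utf-8",
    )
    demo = load_airports(tmp)

print(demo[0]["Name"])
```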
Conclusion & Further Reading
That’s all there is to it! It’s really easy to get started with Sphinx extensions.
You can find the full code on GitHub.