Implementing my Website in Flask - Part 4

Nov 3, 2021

This is part 4 in a series about implementing a personal blog in Python's Flask framework. In this post, I'll cover how I chose to add posts, access them, and render the syntax highlighting of the code.

If you're interested, other parts of the series can be found below:

If you are interested in further details, you can find the source of this blog on GitHub.

Adding New Posts

My choice for adding a new blog post is really simple. I just add a new HTML template somewhere Flask knows to look for it. Then Flask renders the blog post just like any other page.

At first, this might not sound that user-friendly since I essentially have to write code each time I want to add a blog post. But my alternatives would be to:

  • Use a markup language like the one used for GitHub's READMEs or Wiki articles
  • Implement a rich text editor in a separate Admin screen in my blog

In both cases, I would need to implement some sort of blog post rendering feature beyond what's already being provided by the browser and Jinja. I couldn't convince myself that I would be able to add a feature that wasn't already available with HTML (a markup language in itself). In the second case, I'd also need to create a dynamic Admin screen with a login feature, a database, and a non-trivial user input in the rich text editor.

Since my other options didn't seem like great alternatives, I went basic and just stuck to plain HTML. It also gives the added benefit of being version controlled (instead of being stuck in a database) and editable with Emacs (objectively the best text editor).

Accessing Posts

Serving the post contents with HTML templates is also easy. I can essentially route the URI matching the post name to the rendered HTML template contents with Flask. Had I used an Admin screen and a database to store posts, each request would have required a database lookup.
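
As a rough sketch of what that routing could look like (the route, the template names, and the in-memory DictLoader standing in for template files on disk are all my assumptions for illustration, not the blog's actual code):

```python
from flask import Flask, abort, render_template
from jinja2 import DictLoader, TemplateNotFound

app = Flask(__name__)

# Stand-in for real template files on disk (hypothetical post).
app.jinja_loader = DictLoader({
    "posts/hello-world.html": "<h1>Hello</h1><p>First post.</p>",
})


@app.route("/posts/<name>/")
def show_post(name):
    """Render the template whose file name matches the URL segment."""
    try:
        return render_template(f"posts/{name}.html")
    except TemplateNotFound:
        # No template with that name means no such post.
        abort(404)
```

Adding a post then amounts to adding one more template; no route has to change.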

One huge downside with the HTML template method is accessing metadata about the posts. I first ran into this problem when trying to implement SEO for the website. Search engines need a lot of data about each post including:

  • Author
  • Date
  • Description
  • Post contents
  • Title
  • URL

Due to how Flask/Jinja's HTML templates are defined, the code that renders the HTML (which contains the SEO content) is pretty much decoupled from the content in the template. The rendering code only knows the template location, not the actual contents of the template. However, I had to somehow get access to the rendered blog post contents before rendering the rest of the page.

What I came up with isn't great, and it can probably still be improved upon. If someone is considering taking my lazy approach of using HTML templates for posts instead of using a database, they'll have to keep this in mind if they'll need the post metadata.

My solution is to load all of the blog posts at startup, render them, and then parse the metadata and blog post contents into a container called the PostList. Afterwards, the PostList can be referenced by the code serving up the HTML if it needs access to any metadata or contents of a blog post. The actual posts in the container are sorted by date, which is convenient for rendering those pages which are time-ordered like the Archive and the index page.
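
To make the idea concrete, here is a minimal sketch of such a container using only the standard library. It assumes each rendered post carries its metadata in <meta name="..." content="..."> tags and that posts are sorted newest first; both are assumptions on my part, and the real PostList may parse things differently.

```python
import datetime
from dataclasses import dataclass
from html.parser import HTMLParser


class _MetaParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from rendered HTML."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")


@dataclass(frozen=True)
class Post:
    title: str
    date: datetime.date
    full_url: str
    contents: str

    @property
    def datestr(self):
        return self.date.strftime("%b %d, %Y")


class PostList:
    """Parse rendered posts once and keep them sorted newest first."""

    def __init__(self, rendered_posts):
        # rendered_posts maps a post URL to that post's rendered HTML.
        posts = []
        for url, html in rendered_posts.items():
            parser = _MetaParser()
            parser.feed(html)
            posts.append(Post(
                title=parser.meta["title"],
                date=datetime.date.fromisoformat(parser.meta["date"]),
                full_url=url,
                contents=html,
            ))
        self._posts = sorted(posts, key=lambda p: p.date, reverse=True)

    def __iter__(self):
        return iter(self._posts)

    def __len__(self):
        return len(self._posts)
```

Because the container is iterable, a template can loop over it directly.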

Once the PostList is in the template context, accessing the information within is pretty simple. Here's a snippet of the Archive template as it renders the list of titles and dates from the PostList:

{% for post in post_list %}
<div>
<a href='{{ post.full_url }}/'>
  <p>{{ post.datestr }}</p>
  <p>{{ post.title }}</p>
</a>
</div>
{% endfor %}

In this example, post_list is the PostList in the template context. The loop creates a link to each post, with the post's date and title as the link text. The reader then sees a short listing of all posts without having to comb through their content. If the Archive page needed the contents for some reason, the rendered contents of each post could be accessed just as easily through the contents attribute.

While the interface is simple, what I don't like about this solution is that I have to render and parse HTML templates before serving the same HTML templates to the client. The redundancy can really slow down processing since it needs to handle so much information between requests. This is most noticeable during unit tests, where the PostList needs to be regenerated for each test, sometimes several times.

If you'd like to take a look at my GitHub, the code I've been referencing is here.

If this was done with a database, the contents of the template wouldn't have to be rendered until it was served to the client. Then it would be up to the user to enter the metadata when they created the post. In most cases that I could imagine, the content would be served much faster at little cost to the maintainer of the blog (at least after they initially set up the database and Admin screen).

It's up to you which approach you go with. If I were to do it again, I'd still stick with the plain HTML. However, I'd probably figure out a way to load the posts once at the beginning of unit testing and then never again to improve testing speed. I'll go into why the slowness isn't a factor on the deployed website in my next post about deployment.
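
One way that load-once idea could be sketched, using only the standard library: cache the result of the expensive build step so every later call reuses it. Here build_post_list is a hypothetical placeholder for whatever renders and parses the templates, and the call counter exists only to show the effect.

```python
import functools


def build_post_list():
    """Placeholder for the expensive render-and-parse step."""
    build_post_list.calls += 1
    return ("first-post", "second-post")


# Counts how often the expensive step actually runs.
build_post_list.calls = 0


@functools.lru_cache(maxsize=None)
def get_post_list():
    """Build the post list on first use, then return the cached result."""
    return build_post_list()


# Every test that needs the posts calls get_post_list();
# only the first call pays for the build.
first = get_post_list()
second = get_post_list()
```

The trade-off is that tests now share one object, so this only suits tests that read the post list without modifying it.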

Syntax Highlighting

One of my biggest motivating factors to implement my website in Flask instead of using a drag-and-drop website builder was to have more control over the syntax highlighting in the code in my blog posts. To get the syntax highlighting I always dreamed of, I used the Python pygments module. This was actually surprisingly easy to set up and works great.

To get it working, first install with:

$ pip install pygments

A quick test in the console shows how it converts a simple hello world! statement into HTML:

>>> import pygments
>>> import pygments.formatters
>>> import pygments.lexers
>>> pygments.highlight('print("hello world!")',
...     pygments.lexers.PythonLexer(),
...     pygments.formatters.HtmlFormatter())

'<div class="highlight"><pre><span></span><span class="nb">print</span><span class="p">(</span><span class="s2">&quot;hello world!&quot;</span><span class="p">)</span>\n</pre></div>\n'

The pygments.highlight() function does all the work here. The first parameter is the code to highlight. In this example, that's simply the statement print("hello world!").

The second parameter is what's called a "lexer." This is how pygments knows which words and symbols to highlight. Pygments provides a ton of lexers to choose from including HTML, Python, CSS, and many others. In this example, I've chosen Python so that print will be recognized as a built-in (the nb class in the output above).

The third parameter is known as a "formatter". This lets pygments know how to output the highlighted code. Pygments doesn't provide quite as many formatters, and the only ones that were useful to me were the NullFormatter (for plain text) and the HtmlFormatter. However, the HtmlFormatter takes arguments which allow you to specify how the output code should be formatted, like whether to include line numbers. In this example, I've chosen to output the code as plain HTML without line numbers.
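
As a small illustration of how the lexer and formatter choices combine, this sketch highlights the same statement with and without line numbers. It looks the lexer up by name with get_lexer_by_name, which is just one option; the variable names are mine.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import get_lexer_by_name

code = 'print("hello world!")'

# Look the lexer up by its short name instead of importing the class.
lexer = get_lexer_by_name("python")

# Plain HTML output, no line numbers.
plain = highlight(code, lexer, HtmlFormatter())

# Same code, with line numbers rendered in a surrounding table.
numbered = highlight(code, lexer, HtmlFormatter(linenos=True))
```

Only the formatter differs between the two calls, so the lexer can be reused.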

The pygments.highlight() function is only useful if we can somehow pass code from the HTML templates to it. Luckily, Flask provides a way through context processors. These are functions which run before a template is rendered and inject new variables into the template context. They do so by returning a dict populated with the new variables. Each key becomes a variable name in the template context, and its value is whatever object the application supplies.

For example:

@app.context_processor
def inject_title():
    return dict(title='My Awesome Webpage')

This would set each instance of {{ title }} in the Jinja HTML template to My Awesome Webpage.

Since Python treats functions as objects like any other value, there's no reason we can't inject a function into the template context the same way!
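
A minimal sketch of that idea, where shout is a made-up helper used only for illustration:

```python
from flask import Flask, render_template_string

app = Flask(__name__)


def shout(text):
    """Toy helper: upper-case the text and add an exclamation mark."""
    return text.upper() + "!"


@app.context_processor
def inject_helpers():
    # The value is the function object itself, not the result of calling it.
    return dict(shout=shout)


# Templates can now call the injected function by its key.
with app.test_request_context():
    html = render_template_string("<p>{{ shout('hello') }}</p>")
```

The template ends up calling shout() like any other function, with arguments written right in the template.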

Instead of calling pygments.highlight() directly, I created a wrapper function called _codeify() which does some pre- and post-processing, like removing whitespace, around the call to pygments. I've exposed it to my templates through a Flask context processor. I'll show an example of how that works in a bit. First, I'd like to explain _codeify().

One of the most important things my _codeify() method does is tell Flask that the generated HTML is "safe" and can be rendered as markup. By default, Flask escapes all values rendered dynamically into HTML templates. Not doing so could be a huge security risk, since it would allow users to inject potentially malicious code.
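
The difference between escaped and trusted content can be seen directly with the markupsafe package that Flask and Jinja build on. This is only a sketch of the mechanism; the variable names are mine.

```python
from markupsafe import Markup, escape

snippet = "<b>bold</b>"

# A plain str is escaped, which is what autoescaping does in a template.
escaped = escape(snippet)

# A Markup object is treated as already safe and passes through unchanged.
trusted = escape(Markup(snippet))
```

Here escaped holds the text with its angle brackets turned into entities, while trusted still contains the original tags. That is why only HTML you generated yourself should ever be wrapped in Markup.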

In this case, I chose to use the flask.Markup() mechanism to mark my code as safe. My _codeify() method ends up looking like this:

def _codeify(self, code):
    # Remove leading/trailing whitespace
    formatted_code = code.strip()

    # Syntax highlight the stripped code
    formatted_code = pygments.highlight(
        formatted_code,
        pygments.lexers.PythonLexer(),
        pygments.formatters.HtmlFormatter())

    # Return code as Markup so Jinja does not escape it
    # (Flask 3.0 removed flask.Markup; use markupsafe.Markup there)
    return flask.Markup(f'<code>{formatted_code}</code>')

I also found it useful to add a third parameter to _codeify to allow specifying a language. Then I could dynamically select from a dict of lexers and formatters using the language name as a key. Here's an abbreviated version of my _lang_config dict:

self._lang_config = {
    'default': {
        'lexer': pygments.lexers.TextLexer(),
        'formatter': pygments.formatters.HtmlFormatter(
            wrapcode=True)
    },
    'py': {
        'lexer': pygments.lexers.PythonLexer(),
        'formatter': pygments.formatters.HtmlFormatter(
            linenos=True,
            wrapcode=True)
    },
    # ... remaining languages omitted
}

Then it could be integrated into the previous example like this:

def _codeify(self, code, lang=None):
    # Remove leading/trailing whitespace
    formatted_code = code.strip()

    # Syntax highlight, falling back to the default config
    fallback_config = self._lang_config['default']
    formatted_code = pygments.highlight(
        formatted_code,
        self._lang_config.get(lang, fallback_config)['lexer'],
        self._lang_config.get(lang, fallback_config)['formatter'])

    # Return code as Markup
    return flask.Markup(f'<code>{formatted_code}</code>')

Note how I used dict.get() instead of the [] operator so that I could fall back to a default config if a language was not specified. In this case, my default configuration was to use a TextLexer, which didn't add any highlighting.

Looking at the second entry of my _lang_config dict, you'll see that the HtmlFormatter I used for Python has wrapcode and linenos set to True. The wrapcode parameter wraps the code inside the <pre> block in a <code> element, as the HTML5 specification recommends. The linenos parameter specifies that line numbers should be added to the block.

An interesting note is that pygments structures the line numbers as a table, but not in the way you would think. In order to allow easy copy/pasting of code, the table is actually two columns and only one row. Lines are separated only by a newline character, not a new table row. That means that the numbers and code need to use the same text size and font or they will be misaligned. In my case, my table content was also centered vertically, which caused an unexpected misalignment. Keep this quirk in mind as you are developing your page stylesheet.

Speaking of stylesheets, the pygments quickstart page provides a way to generate a stylesheet that can be used with any pygment-ized HTML. Just execute this on the command line after installing pygments:

$ pygmentize -S default -f html > pygments.css

Then include the stylesheet in the head of your HTML along with the rest of the stylesheets.
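
The same stylesheet can also be produced from Python through the formatter's get_style_defs() method. In this sketch the .highlight selector is passed explicitly to scope the rules to the wrapper div that HtmlFormatter emits; the variable name is mine.

```python
from pygments.formatters import HtmlFormatter

# Build CSS rules for the "default" style, scoped to the .highlight wrapper.
css = HtmlFormatter(style="default").get_style_defs(".highlight")
```

The resulting string can be written to a file such as pygments.css, which may be handy if the stylesheet should be regenerated as part of a build step.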

Putting it all together, first define a method which returns the markup which should be injected into the page:

def _codeify(self, code, lang=None):
    # Contents abbreviated - see above for full example
    return flask.Markup(code)

Then expose the new method to the Jinja template engine by returning it from a context processor:

@app.context_processor
def add_context_processors():
    return dict(codeify=self._codeify)

Then use your new method by calling it within a Jinja template, making sure to include the pygments stylesheet:

<!DOCTYPE html>
<html lang='en'>
  <head>
    <title>Python print method - example usage</title>
    <link href='pygments.css' rel='stylesheet' />
  </head>
  <body>
    <h1>Example usage of print</h1>
    {{ codeify('print("Hello world!")', lang='py') }}
  </body>
</html>

And with that, you'll have syntax highlighting on your page!
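
For anyone who wants to try the whole flow in one file, here is a condensed, standalone sketch. It is not the blog's actual code: it uses a module-level function instead of a method, an inline template string instead of a template file, and a simplified LANG_CONFIG of my own that maps language names straight to lexers.

```python
import flask
import pygments
import pygments.formatters
import pygments.lexers
from markupsafe import Markup

app = flask.Flask(__name__)

# Simplified, hypothetical language table: name -> lexer.
LANG_CONFIG = {
    "default": pygments.lexers.TextLexer(),
    "py": pygments.lexers.PythonLexer(),
}


def codeify(code, lang=None):
    """Highlight code and mark the resulting HTML as safe to render."""
    lexer = LANG_CONFIG.get(lang, LANG_CONFIG["default"])
    formatter = pygments.formatters.HtmlFormatter(wrapcode=True)
    return Markup(pygments.highlight(code.strip(), lexer, formatter))


@app.context_processor
def add_context_processors():
    return dict(codeify=codeify)


PAGE = """<h1>Example usage of print</h1>
{{ codeify('print("Hello world!")', lang='py') }}"""

# Render outside a real request to inspect the generated HTML.
with app.test_request_context():
    html = flask.render_template_string(PAGE)
```

Printing html shows the highlighted spans embedded in the page unescaped, which is the effect of returning Markup from codeify().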

Summary

This post covered a lot, including much of the actual Python that went into this project. I hope anyone doing a similar project in the future can take some of the lessons learned from this one to accelerate their own, either by avoiding the pitfalls of the choices I made or by building on them.

Next time, I'd like to cover deployment to Netlify. See you then!