Writing my own static site generator in Python 3
Intro
Why would anyone write a static site generator instead of using an existing solution like Hugo or Jekyll?
Because it's pretty simple. I mean really simple. This whole blog was built in about an hour on a random weekday.
I wanted my own blog, under 100 KB in size. No ads. No database in the background. No CSS or JavaScript frameworks in the way.
What you currently see on your screen is how I accomplished it.
Blogs and my boredom
I've thought about writing blog posts for a long time, because some blogs helped me a lot on my journey through computer science and software engineering. I would not be the person I am today without the awesome folks out there who share their knowledge freely with everyone.
There are also some more specific reasons:
- I was bored and thought I could easily do it myself.
- All the existing solutions need to be configured, which would likely take as much time as it took to write this static site generator.
- All existing solutions seem very bloated.
Jinja2 and templating
I relied heavily on Jinja2 to create all my HTML files. If you know Ansible or have worked with Django before, you may already know it, or at least its syntax.
First I created a base.html file. It serves as the template for all other pages I will write in the future. Here is the first iteration of the base.html I used for this blog:
<html>
<head>
<title>{% block title %}{% endblock %} | pskirde.com</title>
<meta name="description" content="{% block description %}{% endblock %}">
</head>
<body>
<header>
<span class="site-title">pascal's dev blog</span>
</header>
<main>
{% block content %}{% endblock %}
</main>
</body>
</html>
The base.html is the building block for every other page. Blocks such as {% block title %}{% endblock %} are placeholders. When a page that extends base.html defines a block named "title", its content is rendered in that spot.
Let me show what I mean with the About me page of this blog:
{% extends "base.html" %}
{% block title %}About me{% endblock %}
{% block description %}Hi i am Pascal. I was born in 1998 and living in northern germany.{% endblock %}
{% block content %}
<!-- Content will go here -->
<h1>Hi i am Pascal</h1>
<p>I was born in 1998 and currently live in Hamburg, Germany.</p>
<br>
<p>My socials</p>
<p><a href="https://x.com/skirdogg">X.com</a></p>
<p><a href="https://github.com/pskirde">Github</a></p>
<p><a href="mailto:pascalskirde@web.de">Email</a></p>
{% endblock %}
You can see that the title, description and content blocks are defined in the about-me.j2 file, which extends base.html and fills its placeholders.
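To see the inheritance mechanism in isolation, here is a minimal sketch that keeps both templates in memory with Jinja2's DictLoader instead of reading them from a templates folder. The shortened template strings are stand-ins for illustration, not the real files of this blog.

```python
from jinja2 import DictLoader, Environment

# Shortened stand-ins for base.html and about-me.j2, kept in memory for the demo.
templates = {
    "base.html": (
        "<title>{% block title %}{% endblock %} | pskirde.com</title>"
        "<main>{% block content %}{% endblock %}</main>"
    ),
    "about-me.j2": (
        '{% extends "base.html" %}'
        "{% block title %}About me{% endblock %}"
        "{% block content %}<h1>Hi i am Pascal</h1>{% endblock %}"
    ),
}

env = Environment(loader=DictLoader(templates))

# Rendering the child template pulls in the parent and fills its blocks.
html = env.get_template("about-me.j2").render()
print(html)
```

The child template contributes only its blocks. Everything else in the output comes from the parent, which is why a change to base.html shows up on every page.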
Python code to render the pages
This is where the magic happens and the HTML pages are generated. It all takes place in the following Python script:
#!/usr/bin/env python3
from jinja2 import Environment, FileSystemLoader
from pathlib import Path


def render_templates():
    # Create output directory if it doesn't exist
    Path("public").mkdir(exist_ok=True)
    # Set up Jinja environment
    env = Environment(loader=FileSystemLoader("templates"))
    # Process all .j2 files
    for template_file in Path("templates").glob("**/*.j2"):
        # Get relative path and create output path
        rel_path = template_file.relative_to("templates")
        output_path = Path("public") / rel_path.with_suffix(".html")
        # Create subdirectories if needed
        output_path.parent.mkdir(parents=True, exist_ok=True)
        # Render template and save (Jinja expects "/" separators, hence as_posix)
        template = env.get_template(rel_path.as_posix())
        rendered = template.render()
        with open(output_path, "w", encoding="utf-8") as f:
            f.write(rendered)
        print(f"Rendered {template_file} -> {output_path}")


if __name__ == "__main__":
    render_templates()
The script renders every page from its blocks. base.html is the base, and every other page, such as index.j2, blog.j2 or about-me.j2, supplies its own title, description and content.
All pages are then written out as HTML files that the browser can display.
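The mapping from a template file to its output file is the part of the script that is easiest to get wrong, so here is a small sketch of just that step. The helper name output_path_for and the nested example path are made up for illustration; PurePosixPath is used so the example touches no files on disk.

```python
from pathlib import PurePosixPath


def output_path_for(template_file, templates_dir="templates", public_dir="public"):
    """Map a template path to the HTML file it is rendered into."""
    # Strip the templates folder, keep any subfolders, swap .j2 for .html.
    rel_path = PurePosixPath(template_file).relative_to(templates_dir)
    return PurePosixPath(public_dir) / rel_path.with_suffix(".html")


print(output_path_for("templates/about-me.j2"))          # public/about-me.html
print(output_path_for("templates/blog/first-post.j2"))   # public/blog/first-post.html
```

Because the relative path is preserved, subfolders under templates end up as the same subfolders under public.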
Indexing and sorting of the pages
One challenge I needed to solve was a preview of the blog articles. I want my latest 10 blog articles to be shown on my index.html page and all blog articles to be available under blog.html.
To accomplish this I first render all the blog articles in one run, then parse the generated files to get the title, the description and the timestamp that every blog article has.
Once I have all this information, I can easily sort the articles, pick the 10 newest, and create a short link with the title, the description and the time of writing for my index page.
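The sorting step can be sketched on its own with nothing but the standard library. The sample entries below are made up, and the sketch assumes every timestamp is written like "21 March 2025", with an English month name, which also requires the default English/C locale for %B to match.

```python
from datetime import datetime


def newest_first(articles, limit=None):
    """Return articles sorted by their 'time' field, newest first."""
    ordered = sorted(
        articles,
        key=lambda a: datetime.strptime(a["time"], "%d %B %Y"),
        reverse=True,
    )
    return ordered if limit is None else ordered[:limit]


# Made-up sample entries; real ones come from the rendered article pages.
sample = [
    {"title": "Older post", "time": "3 January 2024"},
    {"title": "Newest post", "time": "21 March 2025"},
    {"title": "Middle post", "time": "15 November 2024"},
]

for article in newest_first(sample, limit=2):
    print(article["time"], article["title"])
```

Sorting on the parsed datetime rather than on the raw string matters here: as plain text, "3 January 2024" would sort after "21 March 2025".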
My index.j2 has the following block in it:
{% for article in articles %}
<h3> <a href="{{ article.filepath }}">{{ article.title }}</a> </h3>
<p> {{ article.time }} </p>
<p> {{ article.description }} </p>
{% endfor %}
I can simply loop over the articles context and give every link the information it needs.
The following Python script shows how I parse the HTML files from step 1 with BeautifulSoup and build the articles context.
Then I pass the 10 newest entries of the sorted list to index.j2 and the full sorted list to blog.j2, which produces index.html and blog.html.
#!/usr/bin/env python3
from datetime import datetime
from jinja2 import Environment, FileSystemLoader
from pathlib import Path
from bs4 import BeautifulSoup


def sort_articles(articles):
    # Newest article first; the format string expects dates like "21 March 2025"
    def parse_date(date_str):
        return datetime.strptime(date_str, '%d %B %Y')
    return sorted(articles, key=lambda x: parse_date(x['time']), reverse=True)


def parse_html(file_path):
    html = Path(file_path).read_text()
    soup = BeautifulSoup(html, 'html.parser')
    # Strip whitespace so strptime does not choke on the timestamp later
    time = soup.find('time').text.strip()
    title = soup.find('h1').text.strip()
    description = soup.find('meta', {'name': 'description'})['content']
    # Assumes paths like "public/<name>.html" and keeps only the file name
    filepath = file_path.split("/")[1]
    return {
        'time': time,
        'title': title,
        'description': description,
        'filepath': filepath
    }


def render_templates():
    Path("public").mkdir(exist_ok=True)
    env = Environment(loader=FileSystemLoader("templates"))
    article_info = []
    # Paths of the rendered article pages from step 1, as strings.
    # The list is empty in this listing and has to be filled in.
    blog_articles = []
    for article in blog_articles:
        info = parse_html(article)
        article_info.append(info)
    # Sort newest first and keep the first 10 items for the index page
    sorted_list = sort_articles(article_info)
    first_10_articles = sorted_list[:10]
    # blog.html lists every article
    template = env.get_template("blog.j2")
    rendered = template.render(articles=sorted_list)
    output_path = Path("public/blog.html")
    output_path.write_text(rendered)
    print("Rendered blog.html")
    # index.html lists only the 10 newest articles
    template = env.get_template("index.j2")
    rendered = template.render(articles=first_10_articles)
    output_path = Path("public/index.html")
    output_path.write_text(rendered)
    print("Rendered index.html")


if __name__ == "__main__":
    render_templates()
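The script above uses BeautifulSoup for the parsing step. As an illustration of what that step extracts, here is a dependency-free sketch that swaps in html.parser from the standard library instead. The class name ArticleMeta, the helper parse_meta and the sample page are made up for this example and are not part of the blog's code.

```python
from html.parser import HTMLParser


class ArticleMeta(HTMLParser):
    """Collect the first <time>, the first <h1> and the meta description."""

    def __init__(self):
        super().__init__()
        self.meta = {"time": None, "title": None, "description": None}
        self._capture = None  # key that the next text chunk belongs to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "description":
            self.meta["description"] = attrs.get("content")
        elif tag == "time" and self.meta["time"] is None:
            self._capture = "time"
        elif tag == "h1" and self.meta["title"] is None:
            self._capture = "title"

    def handle_data(self, data):
        if self._capture:
            self.meta[self._capture] = data.strip()
            self._capture = None

    def handle_endtag(self, tag):
        # An empty <time></time> or <h1></h1> must not grab later text.
        if tag in ("time", "h1"):
            self._capture = None


def parse_meta(html):
    parser = ArticleMeta()
    parser.feed(html)
    return parser.meta


# Made-up sample page with the three pieces of information an article needs.
page = (
    '<html><head><meta name="description" content="A short summary"></head>'
    "<body><h1> My post </h1><time>21 March 2025</time></body></html>"
)
print(parse_meta(page))
```

One design difference worth noting: a page without a time tag yields None for that field here, whereas calling .text on the result of a failed soup.find raises an AttributeError. That makes it easy to skip pages that are not articles.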
TODO for the future
For the next iteration I will most likely add Open Graph information, such as image, author and time.
An autogenerated table of contents for the blog entries would also be a good addition.
Last but definitely not least, I would like the blog.html page to support filtering by year or by a specific topic, but this will be a task for "future pascal".