In my last blog post, I discussed Docusaurus: how I learnt to use it and what I learnt from its source code.
In this post, I'll discuss how I put those lessons into practice by adding a brand new, Docusaurus-inspired feature to my own project, til-page-builder.
I explained in detail what the feature is and how Docusaurus inspired it in my last blog post, so make sure to read that before continuing.
Table of Contents
1. Filing Issues 📁
2. Working on Issues
2.1. Adding necessary classes
2.2. Adding a basic TOC implementation
2.3. Adding unique ids to all headings while parsing
3. Filing follow up issues
4. Conclusion 🎇
Filing Issues 📁
The first step when implementing something new should be to file issues on GitHub or whichever platform you prefer. Issues not only document your work but also help you plan the design of your code.
Whenever you file an issue, you're essentially rethinking your approach, and many times you'll end up changing it. I also went through multiple iterations of revising my plan and ended up with the following issues.
Working on Issues
Now that I had my issues in place and plan in mind, it was time to build something.
Adding necessary classes
I started with my first issue, planning out my HeadingItem class. I first looked at the Docusaurus code for inspiration on which properties I should have.
I noticed id, value, and children, which would be sufficient for my purpose. I also added a static function to generate hashed ids based on a heading's text content and an HTML generation function, and ended up with this.
src\builder\toc_generator\heading_item.py
from typing import List, Optional
import hashlib


class HeadingItem:
    def __init__(self, value: str, children: Optional[List['HeadingItem']] = None):
        """Constructor

        The id is derived from `value` via `generate_heading_id`.

        :param value: inner html of the corresponding html element
        :param children: child nodes
        """
        self.id = HeadingItem.generate_heading_id(value)
        self.value = value
        # Build a fresh list per instance; a `[]` default would be shared
        self.children = children if children is not None else []

    @staticmethod
    def generate_heading_id(text_content: str) -> str:
        """Generate a hash id based on the heading text

        :param text_content: Text content of the heading
        :return: A `sha512` hex digest used as the heading id
        """
        # Create a SHA-512 hash object
        sha512_hash = hashlib.sha512()
        # Update the hash object with the bytes of the string
        sha512_hash.update(text_content.encode())
        # Get the hexadecimal representation of the hash
        return sha512_hash.hexdigest()

    def get_html(self) -> str:
        return f"""
        <li>
            <a href='#{self.id}'>{self.value}</a>
        </li>
        """
Looks like a mini React component, right? I only realized that after finishing the code.
And finally, I needed a TOC class to make use of our HeadingItem and generate the required TOC HTML.
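A side note on list defaults in constructors like this one: Python evaluates a default such as `[]` once, when the function is defined, so every instance created without that argument ends up holding the same list. The sketch below uses two hypothetical classes, `SharedDefaultNode` and `SafeNode` (neither is part of til-page-builder), to show the effect and the usual `None`-based fix.

```python
from typing import List, Optional


class SharedDefaultNode:
    """Hypothetical class with the pitfall: the default list is created once."""

    def __init__(self, children: List["SharedDefaultNode"] = []):
        self.children = children


class SafeNode:
    """Hypothetical class with the usual fix: default to None, build a fresh list."""

    def __init__(self, children: Optional[List["SafeNode"]] = None):
        self.children = children if children is not None else []


a, b = SharedDefaultNode(), SharedDefaultNode()
a.children.append(SharedDefaultNode())
shared_leak = len(b.children)  # b sees a's child because both share one list

c, d = SafeNode(), SafeNode()
c.children.append(SafeNode())
safe_leak = len(d.children)  # d keeps its own empty list
```

In a builder that reuses objects across many files, this kind of shared state is easy to miss, so defaulting to `None` is the safer habit.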
src\builder\toc_generator\index.py
from typing import List, Optional
from builder.toc_generator.heading_item import HeadingItem


class TOC:
    def __init__(self, items: Optional[List['HeadingItem']] = None):
        # Build a fresh list per instance; a `[]` default would be shared
        self.items = items if items is not None else []

    def add_item(self, item: HeadingItem):
        self.items.append(item)

    def get_html(self):
        return f"""
        <h2>Table of Contents</h2>
        <ul>
            {''.join(item.get_html() for item in self.items)}
        </ul>
        """

    def clear(self):
        self.items = []
This fixes #13 😉
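To show how the two classes fit together, here is a condensed, standalone sketch. The class bodies are trimmed stand-ins for the ones above (single-line HTML, no `children`), so treat it as an illustration rather than the project's actual code.

```python
import hashlib


class HeadingItem:
    """Condensed stand-in for the HeadingItem class above."""

    def __init__(self, value: str):
        self.id = hashlib.sha512(value.encode()).hexdigest()
        self.value = value

    def get_html(self) -> str:
        return f"<li><a href='#{self.id}'>{self.value}</a></li>"


class TOC:
    """Condensed stand-in for the TOC class above."""

    def __init__(self):
        self.items = []

    def add_item(self, item: HeadingItem) -> None:
        self.items.append(item)

    def get_html(self) -> str:
        links = "".join(item.get_html() for item in self.items)
        return f"<h2>Table of Contents</h2><ul>{links}</ul>"

    def clear(self) -> None:
        self.items = []


toc = TOC()
toc.add_item(HeadingItem("Filing Issues"))
toc.add_item(HeadingItem("Working on Issues"))
toc_html = toc.get_html()  # one <li> per heading, in insertion order
```

Each heading is registered once while parsing, and the HTML is only produced at the end, when `get_html()` is called.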
Adding a basic TOC implementation
Now it was time to add the actual implementation of an initial TOC prototype that would add some visual progress as well.
For this I had to make additions to my already existing HtmlBuilder class.
from builder.toc_generator.index import TOC, HeadingItem


class HtmlBuilder:
    TOC_PLACEHOLDER = '<div id="toc-placeholder"></div>'

    def __init__(self):
        self._cl_args = CommandlineParser().get_args()
        self._output_path = self._cl_args.output  # Output directory for files
        self._document_lang = self._cl_args.lang  # Language used for the document
        # TOC Object
        self.toc = TOC()

    def generate_html_for_file(self, file_path):
        # Start with a clean slate
        self.reset()
        ... ...

    def reset(self):
        self.toc.clear()

    def generate_document(self, virtual_doc, page_title, lines, input_file_path):
        doc, tag, text = virtual_doc
        line_cursor_position = 0
        ... ...
        ### Adding a TOC placeholder to the body
        with tag('body'):
            # Add an h1 if the title is present
            if has_txt_extension(input_file_path) and self._is_title_present(lines):
                with tag('h1'):
                    text(self._neutralize_newline_character(page_title))
                line_cursor_position += 3
            # Placeholder for TOC
            with tag('div', id='toc-placeholder'):
                doc.asis()
After having a TOC placeholder in place, it was time to make changes to the _process_text_block_for_markdown function.
elif line_queries.is_h2(text_block):
    formatted_block = text_block.replace(line_queries.H2_TOKEN, "")
    with tag('h2'):
        doc.asis(formatted_block)
    self.toc.add_item(HeadingItem(formatted_block))
elif line_queries.is_h3(text_block):
    formatted_block = text_block.replace(line_queries.H3_TOKEN, "")
    with tag('h3'):
        doc.asis(formatted_block)
    self.toc.add_item(HeadingItem(formatted_block))
With this, we now have all the HeadingItems stored in our TOC object.
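The collection step can be sketched on its own, outside the builder. The token values `"## "` and `"### "` and the helper `collect_headings` are assumptions made for illustration; the real values live in `line_queries`, which isn't shown here. The sketch strips only the leading token, which behaves differently from `str.replace` when the token text appears again later in the heading.

```python
from typing import List

# Assumed token values; the project's line_queries module may define them differently
H2_TOKEN = "## "
H3_TOKEN = "### "


def collect_headings(lines: List[str]) -> List[str]:
    """Hypothetical helper: return the text of every h2/h3 line, in order."""
    headings = []
    for line in lines:
        for token in (H3_TOKEN, H2_TOKEN):
            if line.startswith(token):
                # Slice off only the leading token, keep the rest untouched
                headings.append(line[len(token):].strip())
                break
    return headings


sample = ["# Title", "## Intro", "plain text", "### Details", "## A ## B"]
found = collect_headings(sample)

# For contrast: replace() removes every occurrence of the token
replaced = "## A ## B".replace(H2_TOKEN, "")
```

Headings that repeat the token are rare, but slicing off the prefix keeps the stored text identical to what the reader typed.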
As the final step, I had to generate the TOC HTML and swap it in for the placeholder we added before.
file_content = doc.getvalue() # Get the generated html string
file_content = file_content.replace(HtmlBuilder.TOC_PLACEHOLDER, self.toc.get_html()) # Replace TOC placeholder with actual TOC
file_content = indentation.indent(file_content) # Indent the html
gen_file_path = f"{self._output_path}/{os.path.basename(file_path)}"
return HtmlFile(gen_file_path, file_content)
And now we could see our TOC at the top of every generated page.
This fixes #14.
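The placeholder swap can be sketched in isolation with plain strings. The helper name `inject_toc` and the sample HTML are made up for illustration; only the placeholder string comes from the code above.

```python
TOC_PLACEHOLDER = '<div id="toc-placeholder"></div>'


def inject_toc(page_html: str, toc_html: str) -> str:
    """Hypothetical helper: swap the first placeholder for the rendered TOC."""
    return page_html.replace(TOC_PLACEHOLDER, toc_html, 1)


page = f"<body><h1>Title</h1>{TOC_PLACEHOLDER}<h2>Intro</h2></body>"
toc_html = "<ul><li><a href='#intro'>Intro</a></li></ul>"
result = inject_toc(page, toc_html)
```

This only works if the placeholder string matches the generated markup character for character, including attribute quoting. That is presumably also why the replacement above runs before the indentation step, which could otherwise split the empty `div` across lines.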
But I was still missing one last thing: making the TOC links functional.
Adding unique ids to all headings while parsing
Even though the links we generated do have appropriate ids in their href, we need to add those ids to the actual headings as well.
That is why I added a static method to generate hashed ids for a piece of text. I chose a hashing algorithm because it was easy to plug in, produces distinct ids for distinct heading text, and returns the same result for the same input, which is perfect for our use case.
I only needed the following minor change to fix this.
elif line_queries.is_h2(text_block):
    formatted_block = text_block.replace(line_queries.H2_TOKEN, "")
    with tag('h2', id=HeadingItem.generate_heading_id(formatted_block)):
        doc.asis(formatted_block)
    self.toc.add_item(HeadingItem(formatted_block))
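To see why the link and the heading line up, here is a small standalone check. The function mirrors `generate_heading_id` from above as a free function, and the heading text is just sample input.

```python
import hashlib


def generate_heading_id(text_content: str) -> str:
    """Same idea as HeadingItem.generate_heading_id: SHA-512 hex digest of the text."""
    return hashlib.sha512(text_content.encode()).hexdigest()


heading_text = "Filing Issues"

# Computed once when building the TOC link...
link_href = "#" + generate_heading_id(heading_text)
# ...and again, independently, when emitting the heading tag
heading_id = generate_heading_id(heading_text)

# Identical text always produces an identical id
duplicate_id = generate_heading_id("Filing Issues")
```

One consequence worth keeping in mind: two headings with exactly the same text get the same id, so a link to the second one would jump to the first.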
Alright, now that we have the correct ids on each heading, matching the links in the TOC, it's time to see it in action.
And this fixes #15.
Time to merge to master!
Filing follow up issues
Since there was still a lot to do, I filed the following follow up issues.
- Implement nesting of links and children: The current implementation just renders the links as a linear list. I need to implement it as a tree data structure so that proper indentation can be added.
- ID Generation: I need to simplify id generation, as sha512 is clearly overkill.
- Add a TOC token: Need to add support for a TOC token that the user can place anywhere in their markdown for Table of Contents generation.
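For the nesting issue, one possible approach is to build the tree with a stack while walking the headings in document order. This is only a sketch of the idea, not what til-page-builder does today: the `Node` class, `build_tree`, and `render` are hypothetical names, and the heading levels are assumed to be available as integers (2 for h2, 3 for h3).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for HeadingItem with a heading level attached."""
    level: int
    value: str
    children: List["Node"] = field(default_factory=list)


def build_tree(headings: List[Tuple[int, str]]) -> List[Node]:
    """Turn a flat (level, text) list into a tree of nodes."""
    roots: List[Node] = []
    stack: List[Node] = []  # path from a root down to the most recent heading
    for level, value in headings:
        node = Node(level, value)
        # Pop until the top of the stack is a shallower heading (the parent)
        while stack and stack[-1].level >= level:
            stack.pop()
        if stack:
            stack[-1].children.append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots


def render(nodes: List[Node]) -> str:
    """Render nodes as nested <ul> lists; an empty list renders as nothing."""
    if not nodes:
        return ""
    items = "".join(f"<li>{n.value}{render(n.children)}</li>" for n in nodes)
    return f"<ul>{items}</ul>"


tree = build_tree([(2, "Setup"), (3, "Install"), (3, "Configure"), (2, "Usage")])
nested_html = render(tree)
```

With the sample input, "Install" and "Configure" end up as children of "Setup", while "Usage" starts a new top-level entry.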
Conclusion 🎇
This is how I built an initial prototype for my new feature of TOC generation. I'll start working on the follow up issues and continue filing more, until I can call it complete.
Stay tuned for the next post!