A few days ago, students of the OSD600 course at Seneca College were assigned to contribute to a classmate's project by making a pull request. The idea behind this is to learn the dynamics of open-source development by contributing features and bug fixes in a more substantial way.
Instead of filing issues every time we found a detail or feature we would like to see in the project, the professor pushed us to be more proactive when contributing: instead of asking for the code to be written, write it yourself!
He taught us the mechanisms behind git and how one can use it to maintain multiple workflows at once. He explained the concept of branches, and how they are used in big open-source projects like Node.js.
The assignment specified that we should work on adding a Markdown feature, such as emphasis (*a*), strong emphasis (**a**), heading level 1 and 2 (##), and links (example.com).
I decided to file an issue related to the heading support. I wanted it to be as general as possible, so other developers could refer to this issue if they wanted to implement a specific feature. Instead of having to file a new issue, they can add more discussion to the original issue until every item on the checklist has been checked off!
I tried to touch as little of the original project as I could, since I wanted to avoid any kind of breakage in the original code. Since the Markdown feature can be well isolated from the original project, I created a new module in charge of Markdown parsing, the markdown_parser module. This module offers a simple public API to create a MarkdownDocument, which can then be printed as a string (the pun with document and print is totally unintentional, I swear).
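To make the idea of "a document that can be printed as a string" concrete, here is a minimal sketch of what such an API could look like. Only the name MarkdownDocument comes from the project; the `Block` enum, the `new` constructor, the `escape_html` helper, and the choice of HTML as the output format are all assumptions made for illustration, not the module's real design.

```rust
use std::fmt;

/// Hypothetical block-level nodes; the real module may model these differently.
#[derive(Debug, PartialEq)]
pub enum Block {
    Heading { level: u8, text: String },
    Paragraph(String),
}

/// Illustrative stand-in for the MarkdownDocument type mentioned in the post.
#[derive(Debug, PartialEq)]
pub struct MarkdownDocument {
    blocks: Vec<Block>,
}

impl MarkdownDocument {
    /// Hypothetical constructor that takes already-parsed blocks.
    pub fn new(blocks: Vec<Block>) -> Self {
        Self { blocks }
    }
}

/// Escape the characters that would otherwise be read as HTML markup.
fn escape_html(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
}

/// Implementing `Display` is what lets callers "print" the document:
/// `to_string()`, `format!`, and `println!` all work for free.
impl fmt::Display for MarkdownDocument {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for block in &self.blocks {
            match block {
                Block::Heading { level, text } => {
                    writeln!(f, "<h{}>{}</h{}>", level, escape_html(text), level)?
                }
                Block::Paragraph(text) => writeln!(f, "<p>{}</p>", escape_html(text))?,
            }
        }
        Ok(())
    }
}
```

With something like this, the rest of the program only needs to call `to_string()` on the document and never has to know how the parsing was done.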
Usually, implementation details should not be relied upon, since they might change at any moment, but I feel it is important to discuss them here. The markdown_parser module does not use any kind of regular expression engine to parse the text. If you are interested in the reasoning, you may read the Reasoning section found later in this post.
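As a rough illustration of parsing without a regex engine, the sketch below recognizes level 1 and 2 headings using only standard string methods. The function name `parse_heading` is hypothetical, and the rules are deliberately simplified (any leading whitespace is ignored, and closing hash sequences are not handled), so this should not be read as the module's actual code or as a full CommonMark implementation.

```rust
/// Try to read a `# Title` or `## Title` heading from a single line.
/// Returns the heading level and its text, or `None` for any other line.
pub fn parse_heading(line: &str) -> Option<(u8, &str)> {
    // Simplification: ignore all leading whitespace.
    let trimmed = line.trim_start();

    // Count the run of '#' characters at the start of the line.
    let hashes = trimmed.chars().take_while(|&c| c == '#').count();
    if hashes == 0 || hashes > 2 {
        // Not a heading, or deeper than the two levels this sketch supports.
        return None;
    }

    // '#' is a single byte in UTF-8, so slicing by the count is safe.
    let rest = &trimmed[hashes..];
    if rest.is_empty() {
        // A line made only of hashes is treated as an empty heading.
        return Some((hashes as u8, ""));
    }
    if !rest.starts_with(' ') && !rest.starts_with('\t') {
        // "#Title" without a separating space is ordinary text.
        return None;
    }
    Some((hashes as u8, rest.trim()))
}
```

Every branch here is a plain condition that can be read and tested on its own, which is the main appeal of this approach over one large pattern.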
Either way, the very few changes I made to the original code were related to checking the file extension. I wrote some code to ensure that we only processed files with the Markdown extension as Markdown.
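An extension check of this kind can be written with the standard library alone. The sketch below assumes that `.md` is the extension of interest and that the comparison should ignore case; both are assumptions, and `is_markdown_file` is a hypothetical name rather than a function from the project.

```rust
use std::path::Path;

/// Return true when the path ends in a `.md` extension (case-insensitive).
/// Assumption: `.md` is the only extension treated as Markdown.
pub fn is_markdown_file(path: &Path) -> bool {
    path.extension()
        // The extension is an `OsStr`; skip it if it is not valid UTF-8.
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| ext.eq_ignore_ascii_case("md"))
}
```

Using `Path::extension` instead of a manual `ends_with(".md")` check means that a file with no extension at all is handled without special cases.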
My pull request can be found here. The process was really straightforward, actually. I posted my pull request and linked the original issue with it. I privately messaged Andrew through Slack, telling him that I forgot to update the README of the project. Overall, he was satisfied with the functionality.
He did express that he was not particularly happy with the organization, but that it could be addressed in a later issue.
While I did not have a lot of issues when working with git, I wish I could automate the process of working on a particular issue or commit. I think I might write a small bash script that can automate this process in the future.
If you are reading this, you might be curious as to why I would not use a regular expression engine to parse the heading feature for the Markdown syntax.
My reasoning might be either very concise but dense, or long but discernible. I would like to have a perfect balance between both.
I have three main reasons:
- At the most fundamental level, regular expressions cannot parse an arbitrarily nested expression. This is only true on a theoretical level, since some regex engines have a stack. However, if you are not parsing the document, but just tokenizing it, then regex might be useful.
- Regex can get complicated, especially with how the CommonMark specification tries to cover so many cases. Trying to write a regular expression that would catch all of them can, to say the least, be a good way to lose your mind.
- There is no regex engine in the standard library of Rust, which means I would need to pull in a third-party crate. I generally do not like adding dependencies when I encounter a problem that I know I can solve in a reasonable amount of time. Adding a dependency to your project can be a risky move, since you need to be sure that the dependency suits your needs. Having as few dependencies as possible is a good thing, since most of the time you do not require all of the features that a library offers. I am not saying that adding dependencies is bad, only that it is something you should think about. Since I did not find any need for a regex engine, I decided not to include one.
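The first reason is easier to see with a small example. Checking that brackets are properly nested requires remembering every bracket that is still open, which is exactly what a stack does and what a purely regular expression cannot do for arbitrary depth. The sketch below is only an illustration of that idea; `is_balanced` is a hypothetical helper and is not part of the project.

```rust
/// Check that every `(` and `[` is closed by the matching bracket,
/// in the right order, at any nesting depth.
pub fn is_balanced(input: &str) -> bool {
    // Each opening bracket is pushed; each closing one must match the top.
    let mut stack: Vec<char> = Vec::new();
    for c in input.chars() {
        match c {
            '(' | '[' => stack.push(c),
            ')' => {
                if stack.pop() != Some('(') {
                    return false;
                }
            }
            ']' => {
                if stack.pop() != Some('[') {
                    return false;
                }
            }
            // Any other character is ordinary text.
            _ => {}
        }
    }
    // Leftover entries mean something was opened but never closed.
    stack.is_empty()
}
```

The same pattern, a loop plus a small amount of explicit state, extends naturally to nested Markdown constructs such as emphasis inside a link.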