For this week in my course in open-source development, I was tasked with refactoring my ssg project (rost_gen). A simple and though potentially time-consuming task.
So, I created a branch and started looking at my code. I usually refactor my code as I write it, i.e., I spend some time to clean up the code after making any changes. This made the task of refactoring kind of difficult since some of the work had already been done. I also find that it is quite easy to overdo things while refactoring. I used to have a tendency to break things down into tiny functions. While it gives a sense of accomplishment as the code is more "organized" into neat blocks, it also makes it hard to read as one must jump between functions to see what is happening. A tiny function does not do much and does not convey much information on its own. This is especially tiresome if the functions are only used once through the entire program.
So, this time I decided to focus on reducing duplication where possible. In one place, I was checking if an input file's extension was valid by using the same logic in two different if statements. So, I created a new function to perform the check. Upon further investigation, I also saw that the check was really checking if the file extension was one of two options (.txt or .md). And to do this, I was calling four chained methods on the path of the file twice:
Therefore, I decided to read the file extension into a local variable and check it against the valid file extensions.
I also decided to focus on improving readability. This can be done in multiple ways, by renaming variables, functions, moving things around, reducing nesting etc. I renamed some variables that could be potentially confusing to something that would be obvious from the start.
For example, I had a variable to count the number of bytes read from the file named
read_bytes. While this is fine, it could be misinterpreted as the buffer that stores the read bytes. Therefore, I renamed it to
read_bytes_count to make its use obvious.
Further, I also noticed that in some places, I was nesting if statements. Nested statements can make it hard to read through the code so I combined the nested if statements where I could.
Once I was happy with my refactoring, I did some manual testing to confirm that my changes had not broken anything. Following that, I did an iterative rebase using git (
git rebase -i) to squash all my refactoring commits into one larger commit before merging into the main branch. The process went smoothly, I merged my branch into the main branch and pushed the changes.
However, I forgot to push the refactoring branch after running the rebase command. This resulted in the main branch containing the squashed commit, and the refactoring branch containing several commits for the same change. I tried to fix it by merging main into the refactoring branch and pushing again but git thought that my refactoring branch was now one commit behind main (that being the rebase commit). However, since I had already merged my changes into main, I decided to delete the refactoring branch.
Latest comments (0)