The next issue I chose to tackle for Release 0.2 was something I guess most people would not consider a "normal" use of GitHub. The truth is that many people think GitHub is just a place to share and collaborate on code, but we can leverage the site for so much more.
Previously, I worked at the City of Toronto, and anyone there can tell you immediately that they follow the AODA (Accessibility for Ontarians with Disabilities Act) guidelines as if they had written them. This is something that I have come to value and appreciate. Everyone should have the same opportunities no matter what situation they are in.
I was really happy to stumble across an unusual but important repository. Everyone makes their impact on the world differently, and Michaela Greiler is no exception. She hosts the Software Engineering Unlocked podcast, which is dedicated to sharing how different software companies around the world work. With multiple guests on the podcast, you get a real feeling for all the different backgrounds and passions that people have in the field of software engineering.
The issue: podcasts aren't exactly accessible to everyone. For many reasons, a listener may need closed captions or a written transcript. The repository created by Michaela holds the transcripts for all of her podcast episodes (AI-transcribed by default).
This presents a few issues:
- AI is very good at picking up filler words. It is so sensitive that it picks up a lot of extra sounds and then transcribes them incorrectly.
- AI has a hard time picking up accents and different pronunciations of words.
- AI doesn't always recognize who is talking.
Because of this, the transcripts were not exactly accurate. To claim an episode, all I had to do was open an issue stating which episode I wanted to transcribe.
After I did this, I forked and cloned the repository. I reviewed the contribution guidelines and then the transcription guidelines. There were two things I had to keep in mind:
- Keep the markdown to 80 characters per line.
- Mark unintelligible audio with ??.
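Both rules are easy to check mechanically. Here is a minimal sketch of how that could look in Python. It is not part of the repository's tooling, and the function name `check_transcript` and the sample text are made up for illustration. It reports which lines go over 80 characters and which lines still contain the ?? marker.

```python
from typing import List, Tuple

MAX_WIDTH = 80         # line limit from the transcription guidelines
UNCLEAR_MARKER = "??"  # marker for unintelligible audio


def check_transcript(text: str, max_width: int = MAX_WIDTH) -> Tuple[List[int], List[int]]:
    """Return (numbers of lines over the limit, numbers of lines with the marker).

    Line numbers are 1-based so they match what an editor shows.
    """
    too_long: List[int] = []
    unclear: List[int] = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > max_width:
            too_long.append(number)
        if UNCLEAR_MARKER in line:
            unclear.append(number)
    return too_long, unclear


if __name__ == "__main__":
    # Hypothetical three-line transcript: line 2 is too long, line 3 has a marker.
    sample = "Host: Welcome to the show.\n" + "x" * 81 + "\nGuest: I work at ?? right now."
    long_lines, unclear_lines = check_transcript(sample)
    print("Over the limit:", long_lines)
    print("Still unclear:", unclear_lines)
```

Running something like this before opening a PR would catch long lines that are easy to miss by eye.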
I'll be honest, this was not an easy task. It was hard to pay attention to each word and check it against the AI-created transcript. I decided to split the episode into 4 sections and take a break between each. I ended up taking 2 days to complete it.
Challenges:
- I have a poor and short attention span.
- Markdown is not exactly something you want to open and edit in Notepad. Solution: I had to make sure I downloaded a good markdown editor. I went with Typora, a great tool that supported all the markdown syntax I needed.
- How do I make sure I am not exceeding 80 characters per line? Solution: I am a big Sublime Text fan, and I knew it has a ruler. Did I know how to turn it on? No.
- Learn how to use the ruler in Sublime Text.
Solution: Simple. I thought I needed to download a package, but don't do that! It's in the settings. Go to Sublime Text > Preferences > Settings. Then, inside the settings object, add "rulers": [80].
After saving, you should see a vertical line at column 80 in the editor.
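The ruler only shows where the limit is; the text still has to be wrapped by hand. As a rough sketch, and not something I used for this PR, Python's standard `textwrap` module can reflow a plain paragraph to 80 columns. The helper name `rewrap` is made up, and this is only safe for ordinary paragraphs, not for lists, tables, or code blocks in markdown.

```python
import textwrap


def rewrap(paragraph: str, width: int = 80) -> str:
    """Reflow one plain paragraph so that no line exceeds `width` characters.

    Long words (for example URLs) are left unbroken, so a single word
    longer than `width` will still produce an over-long line.
    """
    return textwrap.fill(
        paragraph,
        width=width,
        break_long_words=False,   # do not split URLs or long identifiers
        break_on_hyphens=False,   # keep hyphenated words together
    )


if __name__ == "__main__":
    # Hypothetical over-long transcript line.
    text = "So when we started the team we had no code review process at all " * 3
    print(rewrap(text))
```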
And voila! Another PR complete.