Shivam Sandbhor

5 things I learned aspiring for Google Summer of Code 2020

A short intro

A bit of background, since this is my first post on dev.to: I am a first-year engineering student from India. I was recently selected as a student developer for Google Summer of Code 2020. I am working on VulnerableCode, which is basically a Django app that collects data about software vulnerabilities and exposes it through an API (our goal is much larger, and much of it will be accomplished by the end of this GSoC). We welcome contributions (there are many, many things to work on). Unfortunately the documentation is still a work in progress; sorry for that in advance. Do star it!

These are the top 5 things I learned while going from a total beginner to getting selected for the prestigious GSoC (the TLDR section is at the bottom).


The skill of finding the right tools

As software developers, the problems we face are most likely not as unique as we think they are. What do I mean by this? Consider this real example:

The project I am working on basically collects data on all reported software vulnerabilities. These are typically JSON documents containing the name of the vulnerability, its short description, and the names of the vulnerable software packages. Typically a vulnerability doesn't affect just one release of a package; rather, it affects some range of versions of that package. Something like this:

    "django": {
        "advisory": "The administrative interface in django.contrib.admin in Django before 1.1.3, 1.2.x before 1.2.4, and 1.3.x before 1.3 beta 1 does not properly restrict use of the query string to perform certain object filtering, which allows remote authenticated users to obtain sensitive information via a series of requests containing regular expressions, as demonstrated by a created_by__password__regex parameter.",
        "cve": "CVE-2010-4534",
        "id": "pyup.io-33058",
        "specs": [
            "<1.1.3",
            ">=1.2,<1.2.4"
        ]
    },

The problem was this: given a list of all versions of a package (obtained via the corresponding package manager's API, PyPI in this example), split it into two lists. One holds every version that lies inside any of the version ranges given in the JSON document; the other holds the versions that lie inside none of them. Old me would've written a version class, implemented comparators, and realized halfway through that not all versions are entirely numeric. We have to deal with 'rc', 'beta', 'alpha', 'pre', etc. Some ecosystems use altogether weird comparison symbols (Ruby folks, '~>', seriously?). As you can see, the complexity keeps increasing.
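To make the task concrete, here is a rough sketch of the naive "old me" approach, not the code the project ended up with. It assumes that the clauses inside one `specs` entry are ANDed together and that the entries themselves are ORed, and every name in it (`parse_version`, `partition_versions`, and so on) is made up for illustration. It only understands purely numeric, dot-separated versions, which is exactly the limitation described above.

```python
import operator

# Comparison symbols a clause may start with. Two-character symbols come
# first so that ">=" is not mistaken for ">".
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn "1.2.4" into (1, 2, 4); raises ValueError for "1.3b1" and friends."""
    parts = [int(part) for part in text.split(".")]
    # Drop trailing zeros so that "1.2.0" and "1.2" compare as equal.
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def matches_clause(version, clause):
    """Check a parsed version against one clause such as "<1.2.4"."""
    for symbol, compare in OPERATORS.items():
        if clause.startswith(symbol):
            return compare(version, parse_version(clause[len(symbol):]))
    raise ValueError(f"unsupported clause: {clause!r}")


def is_affected(version_text, specs):
    """Affected means: every clause of at least one spec entry matches."""
    version = parse_version(version_text)
    return any(
        all(matches_clause(version, clause.strip()) for clause in spec.split(","))
        for spec in specs
    )


def partition_versions(all_versions, specs):
    """Split version strings into (affected, unaffected), keeping input order."""
    affected, unaffected = [], []
    for version_text in all_versions:
        target = affected if is_affected(version_text, specs) else unaffected
        target.append(version_text)
    return affected, unaffected


if __name__ == "__main__":
    specs = ["<1.1.3", ">=1.2,<1.2.4"]
    versions = ["1.0", "1.1.2", "1.1.3", "1.2", "1.2.3", "1.2.4", "1.3"]
    affected, unaffected = partition_versions(versions, specs)
    print("affected:", affected)
    print("unaffected:", unaffected)
```

Feeding it a version such as "1.3b1" raises a `ValueError`, and symbols like '~>' are rejected outright, which is where a hand-rolled solution starts to grow out of control.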

Instead, new me searches GitHub to see whether anybody has already solved the problem. If yes (which is true 89% of the time), I fork it; otherwise I implement it myself. For those curious about how we solved the version range problem: we basically pip installed a lesser-known module that already had everything we needed implemented. Problem solved. It even came with tests. Lesson learned: avoid reinventing the wheel.


Writing readable, beautiful code

![xkcd comic strip](https://imgs.xkcd.com/comics/code_quality.png)

Our project requires a PR to pass a style check before it can be merged, so initially I was, in a way, forced to write beautiful code. This made no sense to me (the whole PEP 8 and Zen of Python thing), as someone with an 'it's the answer that matters, not the solution' mentality. Gradually it made sense because:
  • It looks so beautiful that I honestly stare at it to feel good about myself.
  • It still makes sense when I read it a month later.
I totally agree with enforcing a style for the entire codebase to maintain consistency and readability. It also helps to increase the 'bus factor' of a project quickly. The Zen of Python is my compass when writing code now :P

The skill of diving into a huge codebase and being productive

My early experience with open source consisted mainly of reading pieces of code for hours and having no idea how it all fit together. This has improved after spending most of my time understanding how different libraries work and reviewing PRs. Rather than passively reading code, I found that making notes and diagrams of the different classes and their interactions was less painful and more productive for building an understanding of a codebase. I find this skill very underrated and rarely talked about. To develop it, I completed a self-challenge, mentioned in the next section.
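For a Python codebase, one possible way to bootstrap such notes is to list each class in a module together with its base classes and public methods, then draw the diagram from that. The sketch below is only an illustration, not the process described above: the helper name `outline_classes` is made up, and the standard library's `json.decoder` stands in for a real project module.

```python
import inspect
import json.decoder


def outline_classes(module):
    """Map each class defined in `module` to its base classes and public methods."""
    outline = {}
    for name, cls in inspect.getmembers(module, inspect.isclass):
        # Skip classes that were only imported into the module.
        if cls.__module__ != module.__name__:
            continue
        bases = [base.__name__ for base in cls.__bases__]
        # Only plain Python functions are listed; names starting with "_" are skipped.
        methods = [
            method_name
            for method_name, _ in inspect.getmembers(cls, inspect.isfunction)
            if not method_name.startswith("_")
        ]
        outline[name] = {"bases": bases, "methods": methods}
    return outline


if __name__ == "__main__":
    for name, info in outline_classes(json.decoder).items():
        bases = ", ".join(info["bases"])
        methods = ", ".join(info["methods"]) or "(no public methods)"
        print(f"{name}({bases}): {methods}")
```

The output is just raw material. The understanding still comes from working out by hand how those classes talk to each other.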

Contributing to open source projects is easy

I initially thought I needed a certain 'expertise' in a programming language before I could start contributing to open source. Fortunately, I adopted the mentality of learning on the fly, as required to contribute. I recently challenged myself (this actually deserves a separate post) to get a PR merged in a single day for a project that uses a language I don't know and is based on a field I have no idea about. I chose Hyperledger Besu, an open source Ethereum client written in Java. I am happy to say I successfully completed this challenge and got a tiny PR merged (it was actual code, the image is misleading :) ).

Technical Communication

![the most relevant xkcd I found](https://imgs.xkcd.com/comics/misinterpretation.png)

As the saying often attributed to the great Richard Feynman goes, *"If you can't explain something in simple terms, you don't understand it"*. Google Summer of Code's application consists of writing a proposal on how to implement the project idea, so I knew I had to work hard in this department. I used to believe I was fluent in English, but when it came to communicating technical ideas, I sometimes had no idea how to say them. This happened because there were many (huge) gaps in my understanding of the project. After explaining my project and proposal to my family, I am now pretty confident that I have a crystal clear high-level understanding of it. When communicating low-level ideas, I found it really useful to be more descriptive than necessary (sometimes down to pointing at the line number) to avoid leaving out context (opinions please). In my proposal I relied on pictures to show changes in the database schema. I also included a GIF to show how users will interact with the yet-to-be-made UI for the project.

Things I wished to put here

There are many programming-language-specific things I have resisted mentioning; I believe those demand a separate post. As for git-fu, well, it's a classic Karate Kid story regarding my git skills (spoiler: I used to edit code on the GitHub web interface). I also found a new hobby: thinking about the software architecture of various products. I think the design aspect of software is what earns us the 'engineer' in 'software engineer'; at the same time, the saying that software development is an art has started making sense. Object-oriented programming has started making sense too. I really recommend that folks learning OOP read some open source code to see how its concepts are applied.

TLDR/summary

Top 5 things I learned:

  1. Avoid reinventing the wheel
  2. Readability counts, and the Zen of Python makes sense. Follow PEP 8 or something similar.
  3. The barrier to entry to contribute to open source projects is probably lower than I thought.
  4. Diving into a huge codebase and making sense of it is a very valuable, underrated skill.
  5. Mastering technical communication makes life easy as a developer.

Top comments (3)

Samuel Huang

Thanks for this! It was a good read and I'm excited to try and contribute to an open source project myself too :) all the best for GSoC too!!

Shivam Sandbhor

Thank you, good luck for the open source journey!

atharwa_24

Stuff you spoke in this read is really helpful. Thanks for it!!!