DEV Community

Alex Smith

How do you get familiar with a new codebase?

I recently started a new gig as a Software Engineer here at DEV. Having changed jobs a few times, I've found that one of the more difficult parts of starting a new role is getting up to speed with a new codebase.

When learning Software Engineering (school, bootcamps, online tutorials, etc.) you're often starting from scratch or you're building a project where everything is defined upfront. For example, if you have done a Rails tutorial you've likely built a blog with a limited feature set. Decisions made in that tutorial were likely based on a defined scope - build a blog where users can create, edit, and delete articles. And then you're off to the races (shout out to scaffolding).

But like in many fields, this is great...in theory. In the "real world", you can often find yourself walking into an established codebase that has been through a lot.

What ends up happening is that you find lots of code that makes you pause and wonder, "why was this done this way!?". Sometimes there's a good answer, but it hinges on some historical piece of context, which leaves you having to ask someone who has been around longer.

So my questions to you all are:

  • How do you get up and running with a new and established codebase?
  • What tools do you use, if any?
  • What sort of tools do you wish you had to help with this?

I recently came across Codeflow (no affiliation, but it's a solid example), and I have had ideas in the past about building a tool to help onboard Software Engineers to a new codebase.

Lastly, I realize that in an ideal world things would be clearly documented and kept up to date. This all assumes the trade-off was made to (more or less) skip detailed documentation of nuances.

What do you think?

Discussion (17)

rhymes

Reading tests helps during onboarding, but it can be hard to do if you don't have a "direction". The thing that helped me the most in all the various codebases I had to work on was trying to fix bugs :)

Bugs, especially the non-trivial ones, have a tendency to send you deep into the code, with many tabs open in the IDE and some breakpoints set that you have to step through.

I documented one of those debugging sessions (of DEV's own code actually :D), in this post: Come with me on a journey through this website's source code by way of a bug

Also, don't underestimate the power of good "grepping" skills: rg '\bWORD\b' or rg WORD | rg -v WORD_IM_NOT_INTERESTED_IN is probably one of the most frequent things I type in the terminal these days :D (rg is ripgrep, which is what Visual Studio Code uses internally; the quotes keep the shell from stripping the backslashes). I can never thank @dmfay enough for pointing me in the direction of that tool :D
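
For anyone who wants the same whole-word search with an exclusion inside a script, here is a minimal Python sketch of the idea. It is not how ripgrep works internally: the function name `search_word` is made up for illustration, the exclusion is a plain substring check rather than a second regex, and unlike rg it does not respect .gitignore.

```python
import re
from pathlib import Path
from typing import Iterator, Optional, Tuple


def search_word(root: str, word: str,
                exclude: Optional[str] = None) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, line number, line) for lines containing `word` as a whole word.

    Lines that also contain the substring `exclude` are skipped, similar in
    spirit to piping the matches through `rg -v`.
    """
    # \b on both sides means "username" does not match a search for "user".
    pattern = re.compile(rf"\b{re.escape(word)}\b")
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line) and not (exclude and exclude in line):
                yield str(path), number, line.strip()
```

Calling `list(search_word(".", "WORD", exclude="WORD_IM_NOT_INTERESTED_IN"))` from the project root gives roughly what the pipeline above prints, just much more slowly on a large repository.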

Endy Tjahjono

Agree with picking one bug and just focusing on fixing that. Without this focus I got overwhelmed by the sheer amount of things to learn (back when I was in the same situation, having just joined a new company).

Ugh, I remember the stress of that period!

Alex Smith Author

Agreed - reading tests is one of my first steps. It helps that (hopefully) they're written with both positive cases (i.e. passing valid arguments) and negative cases (i.e. passing invalid arguments). It's quick to get a sense of what the code is supposed to do.

There was a time when I went deep down the rabbit hole in the specs of the codebase...turned out the specs were way out of date 🙈

At the risk of going on a slight tangent, do you use rg with VIM? I've read mixed reviews about switching away from The Silver Searcher with VIM, which has made me hesitant about switching at all 🤔

rhymes

I use rg in the terminal. My editor of choice is Sublime Text 3 (its built-in search is a bit slow).

Aaron Campbell

In my career, I have found that most times I am working with a legacy codebase that has some combination of messy code, lack of tests, and no documentation. My main approach for figuring out codebases like this is as follows:

Learning legacy codebases

The four Rs of learning legacy codebases are Read, Refactor, Write Tests and Repeat. (Don't let facts get in the way of a good naming scheme.)

1: Read

Read through a specific part of the codebase. Try to grok in your head as much as you can without touching anything.

2: Refactor

For long parts of the codebase that don't make sense, break them up into methods until they do. Is there a red-herring variable that does nothing? Remove it. Refactoring for readability makes it easier to keep everything in your head at once.

3: Write tests

Write some tests for your newly refactored code. Is it actually doing what you thought it was doing? Test your hypothesis. If your hypothesis is right, you now have tests for the next person (or, more importantly, your future self.) If your hypothesis is wrong, use the fifth of the four Rs of learning legacy codebases, Revert! (don't even worry about it)

Reverting

You broke everything, good job. Now you know one more thing this code does not do. Revert the code you wrote and start again. Do not focus on what the code should do or should look like. This will only lead to anger or shame on your part. Instead, accept the code as it is. This is the way to code zen.

4: Repeat

Now repeat the steps until you know enough about the codebase to do what you need to.
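
One way to picture the Refactor and Write tests steps is a small characterization test that pins the refactored code to the legacy behavior. The sketch below is purely illustrative: `legacy_shipping_cost`, `base_cost`, and `shipping_cost` are hypothetical stand-ins, not code from any real project.

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    """Hypothetical legacy function, standing in for code you did not write."""
    cost = 5 + weight_kg * 2
    unused = cost * 0  # red-herring variable that does nothing
    if express:
        cost = cost * 1.5
    return cost


def base_cost(weight_kg):
    """Piece extracted during the refactor."""
    return 5 + weight_kg * 2


def shipping_cost(weight_kg, express):
    """Refactored version: intended to behave exactly like the legacy one."""
    cost = base_cost(weight_kg)
    return cost * 1.5 if express else cost


class ShippingCostCharacterization(unittest.TestCase):
    def test_refactor_matches_legacy(self):
        # The hypothesis under test: the refactor changed nothing observable.
        for weight in (0, 1, 2.5, 10):
            for express in (False, True):
                with self.subTest(weight=weight, express=express):
                    self.assertEqual(
                        shipping_cost(weight, express),
                        legacy_shipping_cost(weight, express),
                    )
```

Run it with `python -m unittest`. If the test fails, that is the cue for Revert; if it passes, the legacy function can eventually go and the test stays behind for the next person.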

Mukund

Ironically, I find the fastest way is to not start with the code. Learning the business and use case beforehand is a great way to start. Answering what the application does and what the business use case is can help you get in the right frame of mind before exploring the code. This builds a solid foundation.

After that, I find just exploring the code with my first story adds an additional layer of understanding. Focus on structure more than implementation details in the first few weeks. If you are a more senior developer, try and understand the approach/paradigm and design pattern used in the underlying code. This will help you remember where everything is from a higher level perspective.

For tools, definitely use an IDE (I prefer IntelliJ). An IDE will give you the power to explore a codebase quickly and understand how everything is laid out in one application. It will also help you find potential bugs and opportunities for refactoring early on.

Vasily Loginov

I definitely agree! I also usually start from the business context, answering the question “how is it supposed to work?”. This way there will also be a clear division between normal (expected) things and the stuff usually described as legacy.

This is especially important if you are coming to a complex context you are not familiar with. Modeling the main system components in my head and on paper, in addition to digging through the code, made it much faster for me to discover what is what.

Alex Smith Author

I use paper to draw flow charts ALL the time 🙈 It's funny because the majority of the time the end result isn't even organized enough to be helpful at all, but the process of drawing it helps me understand a more organized flow in my mind...if that makes any sense!

Alex Smith Author

Business context and IDEs are great points!

Re: business context
It's funny you mention business context...closely related, I was just working on a bug where I found myself saying, "I'm not sure what use case is even possible to lead to this issue". The context is everything.

So that said, any ideas on how to bridge context with code for new hires beyond "vanilla" documentation? Or asking (read: low key bothering) more seasoned teammates?

Re: IDE
I tend to use VIM, but as soon as you mentioned IDE I could imagine where a heavier IDE would be very helpful for navigating the code. Maybe I'll try a heavier IDE for a bit 🤔

Josh Puetz

I am team "jump in and try to swim": I've always been the type who learns best via struggle 😂 A better way to put it would be "immersion learning": I'm diving headfirst into the codebase like traveling in a foreign country without a translator!

Alex Smith Author

Like jumping into the deep end of the pool with no floaties on before learning how to actually swim 😂

Baily Case

I started my first junior development job last year, so I have been diving into a lot of new codebases myself. I find the biggest thing that helps in new codebases is other developers. Asking them helped whenever I was stuck finding parts of the code, or just wanted to understand what they were thinking when they wrote a certain implementation.

A couple of neat tools I have found helpful are FZF and rgrep. I use Vim as my primary editor, so being able to quickly search for files, or even pieces of code in a file, is awesome! Not to mention it's super fast!

Alex Smith Author

Sounds like I'm going to spend some time this week re-configuring my VIM - goodbye (for now) Silver Searcher 💯

Baily Case

Glad to hear someone else out there is running VIM! My colleagues think I'm crazy for using it. I just find it a lot faster than using my mouse in VSCode.

Brandin Chiu

Two things to keep in mind here:

  • 9 times out of 10 you will be working on code you did not write. This is the reality of professional development.

  • Those same 9 times out of 10, there's no documentation or formal structure at all.

In all my years doing this, I have never encountered a codebase that wasn't internally described as "spaghetti".

In my experience, the most effective way to get acquainted with a new project is to try to deploy it. This gives you practical experience with the project's configuration and setup, which is the best way to get started.

Alex Smith Author

I couldn’t agree more, 9/10 times things are that way - the name of the game.

Deploying is a good idea, I like that!

jacqueline

Really liked Aaron’s answer.

From a front-end + design perspective, I offer the following advice:

  1. If you have this ability, check out a few different years' worth of files. E.g., look at an important page as it was in 2014, 2016, 2018. Oftentimes (and especially where CSS is concerned) methodologies change. If your codebase is anything like mine, they typically change in completely undocumented ways. :) Looking at a view over a few years helps you to understand how CSS is currently approached and how that compares to before.

  2. In the same vein, find a few historic examples of a popular component, like Button or Section, and make notes about how these have changed over time. Often design debt is accumulated due to developers not using the “new way” and needlessly overriding old CSS when they could have reached for the correct variable instead. Watch out for this especially if you know the project has gone through a major redesign.

  3. If you’re new to an established project, search, ask, and bug those around you until you are certain what you’re trying to build doesn’t already exist. Another common way to accumulate technical debt is developer A building a slightly different implementation of something the site already has. If it's a common component, like a loader, popover, modal, or button, assume the project already has a way to do it... and try to do it that way, unless you absolutely can’t, for a technical reason.
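
If the project lives in git, the "look at a file as it was in 2014, 2016, 2018" idea from the first point can be scripted. This is a rough Python sketch under that assumption: the helper names and the stylesheet path in the usage note are hypothetical, and it relies on `git rev-list -1 --before=<date>` to find the last commit before a date and `git show <commit>:<path>` to print the file at that commit.

```python
import subprocess
from typing import List


def last_commit_before_cmd(date: str, branch: str = "HEAD") -> List[str]:
    """Build the git command that prints the newest commit older than `date`."""
    return ["git", "rev-list", "-1", f"--before={date}", branch]


def show_file_cmd(commit: str, path: str) -> List[str]:
    """Build the git command that prints `path` as it was at `commit`."""
    return ["git", "show", f"{commit}:{path}"]


def file_as_of(repo: str, path: str, date: str) -> str:
    """Return the contents of `path` (relative to the repo root) as of `date`."""
    commit = subprocess.run(
        last_commit_before_cmd(date),
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout.strip()
    if not commit:
        raise LookupError(f"no commit found before {date}")
    return subprocess.run(
        show_file_cmd(commit, path),
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout
```

Something like `file_as_of(".", "app/assets/stylesheets/buttons.css", "2016-12-31")` would then return that (made-up) stylesheet as it looked at the end of 2016, ready to compare with the 2014 and 2018 versions.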

Great post! :)