Discussion on: What is the fastest way to onboard to a new codebase?

Short answer: First, RTFM. Then, if they're available, run the unit test suites that target the modules you're interested in and step through the code with a debugger.

Longer answer: I'm a Python guy, but the strategy has applied to most codebases I've encountered, unless the debugger at the time didn't really allow for it (I'm looking at YOU, early Golang). Typically (and hopefully), the project will have module-targeted tests. For instance, if I have a module project_name/auth.py, there would be accompanying tests in tests/test_auth.py or similar. Depending on what's being tested (is it a "true" unit test, or a unit test-y thing, as in the case of Django's client tests?), I'll either set a breakpoint just before a given call and step into each block of code, at least far enough to get my head wrapped around the control flow, or set a breakpoint as early in the executing code as I deem necessary.
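To make that concrete, here's a minimal sketch of the module-plus-test layout. The `authenticate` and `check_password` functions and the `USERS` table are hypothetical stand-ins for whatever actually lives in project_name/auth.py; only the built-in `breakpoint()` and pdb's commands are real.

```python
# Hypothetical stand-in for project_name/auth.py
USERS = {"alice": "s3cret"}


def check_password(username, password):
    """Return True if the stored password matches."""
    return USERS.get(username) == password


def authenticate(username, password):
    """Return a session dict on success, None otherwise."""
    if not check_password(username, password):
        return None
    return {"user": username, "authenticated": True}


# Hypothetical stand-in for tests/test_auth.py
def test_valid_login():
    # Uncomment the next line to pause just before the call, then use
    # pdb's `s` (step into) and `n` (next line) to follow the control flow
    # through authenticate() and check_password().
    # breakpoint()
    session = authenticate("alice", "s3cret")
    assert session == {"user": "alice", "authenticated": True}


def test_invalid_login():
    assert authenticate("alice", "wrong") is None
```

With the breakpoint uncommented, running just the one test (for example `python -m pytest tests/test_auth.py -k test_valid_login`, assuming the project uses pytest) drops you into the debugger right before the call you care about.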

In Django (and similar)-land, the unit test-ish style is kind of nice for this approach. Rather than testing the input and output of a single class method or function, it goes through the entire lifecycle of the request. Because of this, I can very quickly get a high-level overview of all the moving pieces.

Frustrations have mostly been hit when a language doesn't support the approach or I'm forced to use an IDE (thanks Java ;)). Python-specific frustrations have sometimes come when side effects occur unexpectedly (e.g. an outbound request or database state change on a property accessor).
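A small sketch of the property-accessor problem, in case it's not obvious why it bites while debugging. The `Account` class is made up for illustration, and the counter stands in for a real outbound request or database write.

```python
class Account:
    """Hypothetical model whose property hides a side effect."""

    def __init__(self, name):
        self.name = name
        self.fetch_count = 0  # stands in for outbound requests made

    def _fetch_balance(self):
        # Placeholder for a network call or a database write.
        self.fetch_count += 1
        return 100

    @property
    def balance(self):
        # Looks like plain attribute access, but triggers a fetch every time.
        return self._fetch_balance()


account = Account("alice")
account.balance  # e.g. typing `p account.balance` at a pdb prompt
account.balance  # or inspecting the same attribute a second time
print(account.fetch_count)  # 2
```

Merely looking at `account.balance` in the debugger changes the state of the thing you're trying to observe, which is exactly what makes this kind of code hard to step through.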

Grace Taylor • Apr 28 '20

Interesting. What if there aren't unit tests available?

Demian Brecht • Apr 28 '20

Then that would be an incredible source of frustration and likely not a codebase I'd want to work with ;) If actually presented with that problem though, you can always follow the same pattern: set breakpoints and exercise the system itself rather than running unit tests.

Of course, that story may very well change depending on the complexity of the system. For example, debugging a distributed system where you can't really poke at component-level pieces locally doesn't work quite as well. Then it becomes more a matter of hoping that there's sufficient logging to let you follow the code flow, reading through the logs and relating them back to the code as much as possible.
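For what "sufficient logging" might look like, here's a rough sketch using Python's standard `logging` module. The `orders` component, the two functions, and the `request_id` field are all assumptions made up for the example; the `%(funcName)s` and `%(lineno)d` format fields are standard `LogRecord` attributes and are what let you map a log line back to the code.

```python
import logging

# Function name and line number let each log line be traced back to the code.
# request_id is a custom field supplied via `extra` (an assumption for this
# sketch), so lines from one request can be followed across components.
LOG_FORMAT = (
    "%(asctime)s %(name)s %(funcName)s:%(lineno)d "
    "[%(request_id)s] %(message)s"
)

logger = logging.getLogger("orders")  # hypothetical component name


def reserve_stock(request_id, item):
    logger.info("reserving %s", item, extra={"request_id": request_id})
    return True


def place_order(request_id, item):
    logger.info("order received", extra={"request_id": request_id})
    ok = reserve_stock(request_id, item)
    outcome = "placed" if ok else "rejected"
    logger.info("order %s", outcome, extra={"request_id": request_id})
    return ok


def configure_logging():
    """Attach a stderr handler that uses the format above."""
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)


configure_logging()
place_order("req-42", "widget")
```

Each emitted line names the function and line that produced it, so searching the logs for one request ID gives you roughly the same trail a debugger would have walked you through.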

I guess I should have prefaced my blurb with: It all depends on what it is that you're trying to wrap your head around.