Data Scientists and Software Engineering

Bhaskar Karambelkar on April 14, 2017

Originally published on my personal blog Coding != Software Engineering As has been mentioned ad nauseam: Any one can code. What has ...
 
Fernando Calatayud

All looked good... until you advised adding comments. Good code doesn't need comments because it's self-explanatory. If your code is more readable with comments, add them... or try to write better code.

Bhaskar Karambelkar

Thanks for reading the post and for your comments. I appreciate a good, thoughtful discussion, although I never suspected that that point would be controversial.

Goodness of code is a subjective measure and not easy to quantify. So saying that code doesn't need documentation if it's good code is a somewhat hard sell. In a similar vein, someone could say good code doesn't need unit tests because it's good code and ergo bug-free.

Having said that, I will say that writing good comments is a skill and an art unto itself. Simply describing what the code is doing is not good commenting. Describing why the code takes the approach it does can be illuminating for someone who is reading the code and wondering about the choices the coder made in his implementation.
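To make the what-versus-why distinction concrete, here is a minimal sketch in Python. The function names and the sensor-glitch rationale are hypothetical, invented purely for illustration; they are not taken from the original post.

```python
import statistics


def trimmed_mean_what(values):
    # A "what" comment: sort the values, drop the first and last,
    # and average the rest. This only restates the code below.
    ordered = sorted(values)
    return statistics.mean(ordered[1:-1])


def trimmed_mean_why(values):
    # A "why" comment (hypothetical rationale): the upstream sensor is
    # assumed to emit at most one spurious low and one spurious high
    # reading per batch, so we discard the extremes instead of using a
    # plain mean, which a single glitch would skew badly.
    ordered = sorted(values)
    return statistics.mean(ordered[1:-1])
```

Both functions behave identically; only the second comment tells a later reader something the code itself cannot, which is why trimming was chosen over a plain mean.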

Fernando Calatayud

I have a constructive suggestion... paste here a piece of code which benefits from your comments, and I'll try to write the uncommented version. You'll decide which version is better ;-)

mrseanpaul81

I couldn't agree more... Ultimately the comment will drift away from the code and become a lie!

Your code should be your comment (there may be a very rare exception but it should be extremely rare like one comment per year!)

Fernando Calatayud

Indeed, that's the main point: people rarely update the comments when the code gets changed, so the comment becomes misleading.

But it's not the only reason... most of the time, comments are a smell that some piece of code is too complex. Of course, the problem is the complex code, not the comments... but if you feel the need to comment, it's better to refrain and try to refactor instead.

About when to comment... in my case, I do it when I write ugly code on purpose. It may be due to a performance tweak, ugly but needed, or because the cleaner version causes an unexpected problem. Both happen rarely, but they do happen, and then it's better to warn the next person against trying to refactor (or to be very careful).

Dr Janet Bastiman

Great post - I have a similar one drafted somewhere. It's difficult seeing individuals coming out of academia without an understanding of how to fit into an engineering team.

Re the comments issue - for a data scientist making the transition to an engineer, comments are helpful. Going from a solo effort to a team effort and the mind shift of portability/reuse isn't going to happen overnight. Magic numbers, poorly named variables etc will sneak in occasionally. In my team, we accept this and so I encourage comments, particularly on the "clever hacks". When they get to a point that the comments aren't necessary then they get dropped.
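As a rough illustration of the magic-number and naming point, here is a small Python sketch. All names and values (`BASELINE_MEAN`, `BASELINE_STD`, `MAX_Z_SCORE`, the outlier rule itself) are assumptions made up for this example, not anything from the post or from a real codebase.

```python
def f(xs):
    # Before: what are 50, 3 and 10? The reader has to guess.
    return [x for x in xs if abs(x - 50) > 3 * 10]


# After: the same logic with the magic numbers named (hypothetical values).
BASELINE_MEAN = 50.0  # assumed long-run average of the readings
BASELINE_STD = 10.0   # assumed long-run standard deviation
MAX_Z_SCORE = 3.0     # readings further out than this are flagged


def flag_outliers(readings, mean=BASELINE_MEAN, std=BASELINE_STD,
                  max_z=MAX_Z_SCORE):
    """Return the readings whose z-score magnitude exceeds max_z."""
    return [r for r in readings if abs(r - mean) / std > max_z]
```

In the second version the names carry most of what a comment would otherwise have to explain, which is roughly the point at which the comments can be dropped.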

I have an extra rule regarding git that I make sure I state explicitly: no developing in the master branch - it's easy to give a quick overview of source control and forget about branching :)

Unfrozen Caveman Dev

Exactly! I've got several years under my belt as a full stack web developer, and am currently moving more and more into data integration. Data scientists, business intelligence devs., analysts, etc. are all still doing software engineering. The ETL tools I'm using act a lot like code -- and are in fact extensible with code. The ideas are the same: make it readable, think about testing, automation, etc.

Everyone in my list above uses some combination of Python, R, JavaScript, Java, C#, VBScript, etc. for modeling, mockups, analysis, etc. Even full-on database engineering involves the same sorts of worries that programmers have in terms of input vs. output, how best to model data, usage, etc.

Finally, I'd even note that I used to work with actuaries and insurance underwriters. They set up massive Excel spreadsheets with dozens of macros and pivot tables to model and calculate various rating inputs for risk scenarios and charging for insurance premiums. If that's not software engineering, I don't know what is.

Espoir Murhabazi

Thanks a lot for the article! As a newbie coming from school, I've discovered that getting things done is not the most important thing; getting them done well is. I will follow all the suggestions given in your article!
I've started by learning git. I haven't finished yet (branches still give me headaches),
but I'm already using it in my project!
Next I will learn TDD
and CI.
All the best

Bhaskar Karambelkar

Thanks Espoir, and all the best for your learning.

Phil Ashby

Thanks for a nice start on thinking about engineering, not just cutting code :)

I find a lot of articles and advice assume that the person coding^H^H^H engineering software already understands their problem/task. This is frequently not the case! One of the best lectures I ever attended (back in the 90's!) was with Grady Booch, who advocated modelling the problem using flexible, physical objects (quite possibly post-it notes and string, although a whiteboard and pen also work) to properly understand what 'things' you are dealing with, how they interact and how they behave, before choosing a language and committing to code, as the effort required to change a code model is typically much greater.

My take on the comments/no-comments discussion: comments are "labels on the trees in the forest", they can help if describing non-obvious choices but they come with maintenance costs, and don't provide the bigger picture that a "map of the forest" can (the aforementioned model of the problem).

Full disclosure: I'm a technical architect by day, code junkie by night!

t0ss • Edited

Regarding the discussion below, "Don't write comments" is a mantra I was once guilty of shouting at every opportunity as well. It really seems like something we just yell any time comments are mentioned now.
Write good comments when you need them. Other devs new to your codebase will thank you for it. No matter how cleanly you write it, there are times you need to explain why you're doing something. If you're making a programmer backtrack through interfaces, objects, functions, etc. to see why something is happening, you're not writing clean code.

Bhaskar Karambelkar

Thanks for your comments! I am merely making suggestions based on my experiences, and I know that there are people who are completely anti-commenting. To each his own, I guess. :)

Bill White

Consider that the comments could be for YOU, the developer, sometime down the road (years possibly!), not necessarily for anybody else. If they are for others, then yes, you are in team development territory, and the code has to fit into the last stage of the software engineering calculus: maintenance and extension. The software development lifecycle (SDLC) is for serious software, like aircraft avionics, high-speed finance and control systems, and thus cannot cut corners on comments, documentation and training manuals/delivery. Writing web interfaces, storefronts and the like that get changed by the day is something different, and rarely anything I would consider engineering. I still have code/custom hardware running from 1995 (Visual BASIC 3), and thank goodness it's commented!

When I was a child I wrote no comments, then I put away childish things and learned to write in coherent sentences.

Steve Jaworanski

Comments or make your code readable ...

Bhaskar Karambelkar

They are not mutually exclusive. You can use good commenting practices to improve the readability of your code.