I recently watched a video posted by Prime Reacts based on this blog by Dan Lew on why nitpicking colleagues' code is bad. For those not familiar with the term, Dan Lew defines it as flagging small, insignificant problems in a code review that are not wrong but suboptimal. He also argues that fixing the "problems" will make you, the grouchy reviewer, feel smug but won't benefit your code base or your team's opinion of you.
Watching this video made me feel a petty criminal listening to a convict showing genuine remorse for a crime that I too had committed. Fortunately for me, I haven't been coding long enough to piss off hundreds of colleagues yet. But to those of you whose code I nitpicked - I apologise.
Upon reflection I realised that my previous career paths before software engineering had baked into me a culture of nitpicking. First as an editor, my task was to locate and remove language inconsistencies as well as create a coherent tone across an article. Then, as a compliance analyst I was tasked with nitpicking colleagues' work for typos, document dates and other trivialities that make you question the point of life.
But from here on out I'm a changed man. For the reasons clearly laid out in Dan's blog, I am turning a new leaf and accepting that code perfection does not make code better!
Now it's your turn
The reason you probably opened this blog was because you sought an answer to the question in the title. I've collected 4 code snippets and for each of them provided one or multiple issues. For relative consistency, there are no possible other issues (though please leave a comment if you want). Select things you would seriously considering leaving as feedback in a code review. At the end the blog I'll leave my two scores: my old self versus my new self. See how you fare.
Quiz
Question 1
def add_features(trip_df: DataFrame, zone_df: DataFrame) -> DataFrame:
"""Returns a pandas DataFrame containing
all the features derived from NYC
Taxi dataset to answer all business questions.
"""
# Rename columns
trip_df = rename_columns_as_lowercase(trip_df)
zone_df = rename_columns_as_lowercase(zone_df)
trip_df = update_payment_type_as_string_values(trip_df)
# Add features
trip_df = add_borough_and_zone(trip_df, zone_df, "pulocationid")
trip_df = add_borough_and_zone(trip_df, zone_df, "dolocationid")
return trip_df
Possible issues
- Duplicated function calls can be refactored into loops
- Comments are redundant or misleading
- Function name is not descriptive
Question 2
def build_payload(
train_type: str,
train_age: int,
train_length: float,
train_is_diesel: bool
) -> dict[str, str]:
input_parameters = [
train_type,
train_age,
train_length,
train_is_diesel,
]
payload_names = [
"train_type",
"train_age",
"train_length",
"train_is_diesel",
]
payload = {}
# Set payload defaults
current_time = datetime.datetime.now()
source = "pipeline"
for i, p_name in enumerate(payload_names):
if p_name == "train_is_diesel":
if train_is_diesel:
payload["engine_type"] = "diesel"
else:
payload[p_name] = input_parameters[i]
return payload
Possible issues
- Declare default values as global values for transparency
- Zipping is a more intuitive way to iterate over the lists, so replace
enumerate
withzip
- Refactoring the boolean function argument would make the payload mapping static
- Current time makes the function indeterministic, it should be parameterised
Question 3
def write_df(df: DataFrame) -> None:
# Select columns
small_df = df.select("name", "age", "dob", "gender")
# Cache df
df.cache()
df.write.csv("full_data.csv", df)
small_df.write.csv("subset_data.csv", df)
Possible issues
- This function should do something generic but lack of parameterisation makes it not possible to reuse
- Small df isn't an expressive name
- Comments don't give us useful information
Question 4
class DataRowExtractor:
def __init__(self):
pass
def extract_rows(self, data) -> dict[str, Any]:
extracted_rows = []
for i, row in data.iterrows()
extracted_data.append(row)
return extracted_rows
def main():
csv_path = os.path.join("/working_dir", "data", "table_data.csv")
write_path = os.path.join("/working_dir", "data", "row_data.json")
data = pd.read_csv(csv_path)
data_extractor = DataRowExtractor()
row_data = data_extractor.extract_rows(data)
with open(write_path) as f:
json_data = f"{row_data}"
f.write(row_data)
Possible issues
- OS Path module is verbose and should be replaced with the new pathlib Path API
- The RowExtractor doesn't add any real abstraction and overcomplicates the code
- Pandas has a built in method to create records from a dataframe
- Data is not a descriptive name
Have I really changed?
As promised, I'd give you my scores:
question | old_me | new_me |
---|---|---|
1 | 1 | 1 |
2 | 3 | 1 |
3 | 2 | 1 |
4 | 4 | 2 |
Total | 10 | 5 |
Conclusion
Now obviously I rigged the results to make my improvement look and feel real. Jokes aside, time (and colleagues) will tell whether I really improve at not nitpicking!
One instructive aspect I tried to get across in the examples is that - perhaps in a production environment - the bugs in question 2 and 4 (good work if you found them) may have gone unseen by a reviewer focussed on scrutinising irrelevant banalities.
This is particularly relevant for question two: if we focus on reviewing the design of the function as a whole, we see that by replacing a boolean argument with a string, we can simply return a dictionary or maybe even just drop the function all together. Simple design is less bug prone, and this case we'd avoid the bug in our nested if statement.
Now we can see concretely why nitpicking is dangerous. But what if (like me) - despite having only picked out critical issues in a code review - you still found a bunch of nits want to pick ( I'm thinking here about code simplification through better library knowledge i.e. with extract_rows
) ? I spoke to a family friend with 25> years experience coding about this and he suggested offering your nits in a non-committal to your colleague (I found nits, if you want to discuss them, we can). This makes the feedback loop something your colleague controls and could turn nits into valuable insights.
Top comments (0)