DEV Community

Stop Using "data" as a Variable Name

Devin Witherspoon on November 29, 2020

"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton Setting aside cache invalidation, wh...

Todd Pressley • Nov 29 '20

Thank you for articulating this in your own way and publishing :) This topic reminds me of something a favorite mentor once taught me, when confused about naming a particular function:

"If you're having trouble naming a function, then it's most likely doing too many things."

Years later, when encountering similar issues, I play this back in my head and have found it very useful. It can be expanded to naming just about anything.

Again, love the article!

Devin Witherspoon • Nov 29 '20

Thanks for sharing! That’s a great angle to look at it from. I love that quote, I’ve definitely encountered that scenario many times both as a reviewer and an author.

Given people seem to have appreciated this article, would you mind if I expand on this in the future along the lines of “X reasons why you might be struggling to name something”?

Todd Pressley • Nov 29 '20

That'd be awesome, man! Not at all!

Kelly Brown • Nov 30 '20

Also, stop using "temp" as parts of names. All variables are inherently temporary. They are local in scope. Telling me that it is "temp" adds no new information. There is always a superior name. Take the traditional swap algorithm:

swapValue = a;
a = b;
b = swapValue;

Knowing the variable is used to contain the swapped value is far more useful information than knowing it's a temporary store.

Chaitanya Malireddy • Nov 30 '20

In this context, if this were JavaScript, I'd dispense with the 'temp' variable altogether :) Also, often it makes sense to chain operations to avoid having to name intermediate states.

[a, b] = [b,a]; // destructuring assignment

Kelly Brown • Dec 2 '20

To me, it depends on the complexity of the intermediate state. I used to aggressively pack as many operations into a single line as possible, but more recently, I like capturing operations into local variables both for readability and for debugging (easily mouse over variables to see what the answer was). Obviously, this can be taken too far, so it's a judgment call.

Chaitanya Malireddy • Dec 2 '20

Yeah, I don't like to pack too much into a clever one-liner either - it makes the code hard to read and debug. I like chaining methods if I can, like say a bunch of array transformations. But you're right, it all depends on what you're trying to do - one has to strike a good balance per use case.

Kelly Brown • Dec 2 '20

Chain methods are amazing. I can put each call onto its own line, and it becomes a super clear series of steps.
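A minimal sketch of the "one call per line" chaining style described above. The accounts array and its fields are hypothetical sample data, invented for illustration:

```javascript
// Hypothetical sample data, invented for this sketch.
const accounts = [
  { owner: "Ada", balance: 120, active: true },
  { owner: "Grace", balance: 80, active: false },
  { owner: "Linus", balance: 200, active: true },
];

// Each step sits on its own line, so no intermediate state needs a name.
const activeBalanceTotal = accounts
  .filter((account) => account.active) // keep only active accounts
  .map((account) => account.balance) // pull out each balance
  .reduce((runningTotal, balance) => runningTotal + balance, 0); // add them up

console.log(activeBalanceTotal); // 320
```

If one of the steps ever needs inspecting in a debugger, it can be split out into a named local variable, which is the trade-off discussed in this thread.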

Devin Witherspoon • Nov 30 '20

Great call out! I don’t see this much in production code, but it's super common in interview questions, as well as when people are asking for help. I also like your alternative as a replacement without additional context.

Kelly Brown • Dec 2 '20

I kid you not: I have seen tempData before. Best of both worlds!

Etienne Burdet • Nov 30 '20

req and res are two good candidates too! Especially when you start caching, fetching from backend and api… resFromServ, resFromCache, resFromNetwork etc. make things much easier to understand!
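A small sketch of how source-specific names such as resFromCache and resFromNetwork might read in practice. The profileCache map, the loadProfile function, and the fetchProfileFromNetwork callback are all hypothetical names chosen for this example, and the "network" call is kept synchronous to stay short:

```javascript
// Hypothetical in-memory cache, invented for this sketch.
const profileCache = new Map();

// fetchProfileFromNetwork is an assumed callback supplied by the caller.
function loadProfile(userId, fetchProfileFromNetwork) {
  const resFromCache = profileCache.get(userId);
  if (resFromCache !== undefined) {
    return resFromCache; // the name says where this response came from
  }

  const resFromNetwork = fetchProfileFromNetwork(userId);
  profileCache.set(userId, resFromNetwork);
  return resFromNetwork;
}

// Usage with a stand-in for a real network call.
const profile = loadProfile(42, (id) => ({ id, name: "Ada" }));
console.log(profile.name); // "Ada"
```

With a single res variable, a reader would have to trace the control flow to learn which source a value came from; here the name carries that information.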

Vincent Milum Jr • Dec 23 '20

It is until it isn't. This is one of the big problems with the suggestion to use shorthands: they mean different things to different people.

Coming from a Win32 background, "res" is a "resource", such as icons, images, etc. They're non-code elements compiled into an exe file.

It's easy to get naming conflicts between people when shortening words like this.

Chad Windham • Dec 4 '20

I really like this example, good call out😉

𒎏Wii 🏳️‍⚧️ • Nov 30 '20

I don't like the get prefix at all. Returning a value is the default use of a function, so it shouldn't be part of the function's name.

The word "total" could mean many things. Regarding balance, we can guess that it's the sum, but it's still better to make it clear.

The accountIndex variable should be named i. Generally, one-letter variables are bad, but i, j and a few more are so ubiquitous that they can and should be used. Every competent programmer will know what they mean.

Instead, that loop shouldn't be written like that at all. It looks like C code. Javascript and comparable languages have much nicer ways to iterate over an array and those should be used instead.

A good solution would be either

const sumBalance =
   (accounts) => accounts.reduce((acc, account) => acc + account.balance, 0)

or

const sumBalance =
   (accounts) => accounts
      .map((account) => account.balance)
      .reduce((a, b) => a + b, 0) // JavaScript arrays have reduce, not inject

Devin Witherspoon • Nov 30 '20

Thanks for sharing! As far as defaults go, I try to always make the implicit into the explicit. For me that means saying get when it only returns a value. Same goes for the i value, I want my reader to do as little work as possible to understand my code. Using i can also result in referencing the wrong value when looping over multidimensional arrays. Someone who is a bit distracted may also have trouble keeping track of i and j simultaneously.

Regarding your point about the loop, I agree. Personally I don’t write loops using indices either, but many people do, and it’s the most universally recognizable for loop format. Changing the format of the loop would have obfuscated the intent of the exercise - showing how having rules or conventions can help us find better names for things.

𒎏Wii 🏳️‍⚧️ • Nov 30 '20

when looping over multidimensional arrays

For two dimensions, i and j are still easy enough to follow. Starting at 3, it's very rare to not have better names resulting from context, like x, y and z for spatial coordinates, etc.

but many people do

They shouldn't. Iterating over data structures with C-style numeric for loops is a much worse habit than calling a variable data. But fair point on the intent of the exercise, sometimes we have to write "bad" code to avoid having to think up a convoluted example just to illustrate a very basic principle.

cubiclesocial • Dec 3 '20

I've never understood why people use i and j for loop variable names. I prefer x, y, and z because they are generally used for looping spatially over an array (columns = x, rows = y, depth = z). Using x, y, and z also mirrors the Cartesian-like planes in mathematics fairly well. That is, anyone with a strong math background will understand x, y, z, and n intuitively while i and j are largely meaningless with i being used for imaginary numbers. i and j and l (lowercase L) are also the thinnest characters in many fonts, making them harder to read.

Mark • Dec 31 '20

Check out this Answer on StackOverflow:
stackoverflow.com/a/4137890/4035952

The answer is mostly math: given what "i" and "j" stood for there, it's pretty easy to understand how they made their way into code. Luckily, these days simple "for" loops can often be replaced with functional versions or "for...of".

Aedan • Dec 2 '20

When working on smaller scripts or just scribbling for myself, I always use i. However, once you work on a massive project with thousands of lines of code, where there are multiple loops and multidimensional arrays and they use i, x, z, etc., it gets so confusing. It honestly never hurts to write out a word like accountIndex. It's simply good practice that has no real downside. The argument that it takes longer to write accountIndex than just i is true, but with IDEs having autocomplete this isn't a real issue. Even when writing it manually, it takes maybe 1-2 seconds to write accountIndex. While it might cost you 1-2 seconds now, you'll easily make up for it down the road when you go over that code weeks later and immediately know what accountIndex refers to, rather than seeing the i for the 50th time. Even just scrolling through the code while working on it makes it so much more visible, saving you seconds here and there.

Josef Jelinek • Dec 3 '20

accountIndex is hard to read because the reader really needs to read it... for i there is instant recognition in the brain.

  for (let accountIndex = 0; accountIndex < accounts.length; accountIndex++) {
    totalBalance += accounts[accountIndex].balance;
  }

vs.

  for (let i = 0; i < accounts.length; i++) {
    totalBalance += accounts[i].balance;
  }

Fortunately, there is a superior alternative (without going into Array methods):

  for (const account of accounts) {
    totalBalance += account.balance;
  }

Mark • Dec 31 '20

Yeah, I think it's important to use words where it makes sense, but sometimes the syntax gets lost in long lines because of longer words. For variables, meaningful names are important, but I agree that for things like loops, it's better to use alternative methods or functional methods instead! +1 for "for...of"

Aedan • Nov 29 '20

Great article, thank you.

I've recently watched "Clean Code - Uncle Bob" (Lesson 1 and 2, both on YouTube). His conclusion is that whenever any line of code requires comments, it's a failure on the developer's part, because good code should contain variable and function names that explain themselves, thus making comments unnecessary (though I disagree on the comment part, because there are some good reasons for comments in code). He mentions that a lot of developers dislike longer names because they make the code look "uncool", bigger in file size, and bloated, and lead to horizontal scrollbars in the IDE (or wrapped lines), and some developers are even under the false impression that longer names lead to slower code. Just like in your article, he recommends using names that are perfectly understandable even if they consist of multiple words. We really need to stop being afraid of using longer names and start using names so that the name itself becomes self-explanatory and can potentially work without any additional commentary. Thanks again for the article.

Devin Witherspoon • Nov 29 '20

Thanks for the feedback, I'm glad you appreciated it. I agree with Robert Martin on his points about code communicating intent as much as possible. I wish he focused more on the people behind the code and expressing kindness to each other being as important as getting the code right.

Regarding comments, I personally try to add comments for context that is important and isn't really part of the code - e.g. intended lifespan of the code, maybe even linking to a ticket with an important conversation. For communicating intent - I try to resort to tests as much as possible since they're more inclined to change with the code. I don't consider it a failure to add a comment because it was hubris for me to ever think I didn't need them. I think it's an achievement of self awareness to know where to add proper comments rather than a failure of coding ability.

Vishnu Haridas • Dec 3 '20 • Edited

Giving meaningful names is important. Suddenly a few incidents popped up in my mind.

Once I was going through some legacy code that had a variable like this: bool isWasStarted. I still don't know what it was used for!

The same project had a function writeLog(text), and I expected the function to write the text to the output stream. The function body looked like this:

function writeLog(text){
   UserManager.createUser(text);
}

Seriously, why do people do this?!

Vlastimil Pospichal • Nov 30 '20

The words get, find, fetch, add, update, delete are not a prefix for me, but a name for methods. For classes that have 2-5 methods, this is completely satisfactory, I do not make longer classes. If one word is not enough, it means that the class violates the SRP.
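As an illustration of single-word method names on a small, focused class, here is a sketch. AccountRegistry and its fields are hypothetical, invented for this example rather than taken from the article or this comment:

```javascript
// Hypothetical class, invented for this sketch.
class AccountRegistry {
  // Private storage keyed by account id.
  #accountsById = new Map();

  add(account) {
    this.#accountsById.set(account.id, account);
  }

  find(id) {
    return this.#accountsById.get(id);
  }

  delete(id) {
    // Map.prototype.delete returns true if an entry was removed.
    return this.#accountsById.delete(id);
  }
}

const registry = new AccountRegistry();
registry.add({ id: 7, owner: "Ada" });
console.log(registry.find(7).owner); // "Ada"
```

Because the class name already supplies the noun, registry.find(7) reads as clearly as a free function named findAccount would.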

Devin Witherspoon • Nov 30 '20

Yes, I should have pointed out the context of that particular rule. It’s been a long time since I’ve worked with OOP, but you’re definitely right that these can be full method names for well named, well purposed classes.

Vlastimil Pospichal • Nov 30 '20

It was a long and difficult journey before I came to this knowledge. Unfortunately, many developers in OOP do not share this idea.

However, I do not have it as a dogma, but rather as a guide to clean code.

The next level is class names and namespaces. I prefer to combine a maximum of two words in classes, one word for one namespace element. I avoid the general words domain, manager, controller, infrastructure, etc.

Greg Fullard • Nov 30 '20

Reading this article restored some of my faith in humanity. Thank you!

Keno Clayton • Dec 3 '20

Good article 👍🏾 I prefer using i for any sort of single dimensional loop, but for multi-dimensional loops, it is important to know exactly which index you're referring to. e.g.

for ( rows = 0; ... )
  for ( columns = 0 ... )

Devin Witherspoon • Dec 3 '20

Yup, to each their own 👍 I tried to acknowledge that particular point as contentious. Just too much dogma around it. The important part is that the project is consistent and it's actually something that the team applies consistently.

official_dulin • Apr 29 '21 • Edited

If undefined is a meaningful value, replace it with a variable. E.g. We will do something if scoreType is all.

Bad:

if (scoreType === undefined) {
  // do something
}

Good:

const all = undefined;
if(scoreType === all) {
  // do something
}

Kelly Brown • Dec 2 '20

I later noticed that I do occasionally use data as a variable name. The scenario often seems to be passing along an opaque byte array. I suppose I could add a little flavor to it by way of receivedData or httpData or whatever context, but I don't know how to improve on that name when that particular method is not in charge of decoding that data.

Devin Witherspoon • Dec 2 '20

I think that falls well into the exception category. byteStream or byteArray may be helpful for describing the nature of the data, but if there’s not a clear intent to the data then I suppose the next option is to make sure that name for it is as short lived as is reasonable.

Ricardo Luz • Dec 4 '20

Hey Devin, thanks for sharing your thoughts in this outstanding article! You mentioned using a linter to enforce some naming patterns. However, I have never heard of a linter that does that; do you know of one? Otherwise, it's a great idea for a new open source project! I am already using a VS Code plugin for spell checking, and it helps me avoid this kind of error, since English isn't my mother tongue, as well as find and fix errors left by other devs.

Devin Witherspoon • Dec 4 '20

I’m not sure if there’s any open source project doing this or if it’s even feasible to do in a generic manner. Some things like checking for returning a promise/async function to avoid get would be easy enough.

Taking advantage of team conventions, you could enforce that there is a prefix on functions, and that some prefixes are applied correctly - though there’s a limit to how much you can do here (e.g. you could enforce that if it returns a Boolean literal it must begin with is, can, or should, but that doesn’t cover all cases).

Because this kind of linter would benefit from knowing team conventions, I don’t think it would be very effective as an open source project.

Christopher Wray • Nov 29 '20

Wow, totally agree! Another one is “payload”. Why in the world would you use that as a variable? What is the payload expected to be? That is what the variable should be named.

I also really like your definition of accountIndex vs i. Makes way more sense.

Devin Witherspoon • Nov 29 '20

payload is a great example! Super generic, could be anything, all it tells us is it’s probably not metadata. I think it has the same exceptions as data as well.

Sandro Miguel Marques • Nov 30 '20

Thank you for this article.
It reminded me of this tweet twitter.com/sandro_m_m/status/1166...

Phong Duong • Dec 3 '20

My native language is not English. I sometimes can't find a name for a variable or function, which confuses me when I read my code later. So I use fewer variables and try to use function chaining instead. Thank you for sharing.

Chaitanya Malireddy • Nov 30 '20

Great ideas but I wouldn't replace i with accountIndex. That doesn't add much value - it's a common convention to use i as a loop counter.

Also I think this is a good rule of thumb:
twitter.com/unclebobmartin/status/...

The length of a variable name should be proportional to its scope.
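One way to read that rule of thumb, as a sketch. The constant, the function, and the sample input are all hypothetical names invented for this example:

```javascript
// Module-wide scope: long-lived and used far from its definition,
// so the name spells out its meaning in full.
const MAX_FAILED_LOGIN_ATTEMPTS = 5;

function countLockedAccounts(failedAttemptsByUser) {
  return Object.values(failedAttemptsByUser)
    // One-expression scope: a short name stays readable here.
    .filter((n) => n >= MAX_FAILED_LOGIN_ATTEMPTS)
    .length;
}

console.log(countLockedAccounts({ ada: 5, grace: 2, linus: 9 })); // 2
```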

Winston Puckett • Nov 30 '20

Yes, yes, yes, yes, yes.

barbgegrasse • Nov 30 '20

It sounds like good advice

Roshan Shambharkar • Nov 30 '20

How do you use the code editor in this post? I also want to post some content on dev.to, but I cannot see a code editor option in my "write a post" section.

Devin Witherspoon • Nov 30 '20

I actually have a template comment for this!

Hey, thanks for sharing! BTW, you can tag code blocks with a language to get syntax highlighting by adding the language extension (e.g. js, c, cpp, java) after the first three backticks.

Devin Witherspoon • Dec 3 '20

Nice, I would have thought the joke was older!

Cameron Thompson • Dec 1 '20

Such a useful reminder!

Devin Witherspoon • Dec 3 '20

Not sure if you’re joking, but that wasn’t part of the original quote, it was a joke that was later added. I’m not sure who said that part first though, might not be as well documented.

Rifad ahmed • Nov 30 '20

Great article

jan paul • Dec 2 '20

This topic reminds me of this Reddit post: i.redd.it/6fttplx5kbk21.jpg

Joao Polo • Dec 3 '20

Let me add my 10 cents... the prefix "eval" is useful for evaluating something.
For example, evalAccountsTotalBalance(...) means I'll do some direct operation with arguments.