Introduction
Database naming conventions, despite SQL having been around for decades, are ad hoc. When moving from project to project, y...
For further actions, you may consider blocking this person and/or reporting abuse
Srry to be a pain, but the part with the primary key nit being id but tablename_id ... Well in school and a lot of books the teory sais its ok id,
Maybe what you presented was a local situation, but i don't see the ambiguity when you write the query, cuz you use the alias.id ... And tablename_id normally is a foreign key in other tables.
But as i sayed i think its a matter of situations.
I for one am with team just id as a primary key and tablebame_id as a foreign key when used in other tables.
Mytwocents
When I first published this article a long time ago, people complained that they needed to see the second version of the
id
query because they didn't see what was wrong with the first version. After all,origin.id = thread.id
looks fine.In the second version, because we use explicit names for the ids, you can immediately see that
origin.star_id = thread.email_id
is wrong. That's the point of this: if your naming conventions let you immediately see that a mistake is made, that's a huge win.I've been a SQL developer for 30 years. And this is SO wrong.... Table.id is the PK (or at least the unique constraint) on this table. Table.other_id is the Foreign key to the related table. Origin.id = email.id immediately stands out as being incorrect because at least one side should reference the FK field unless one of them is a straight 1:1 extension table.
Table names should be singular as each row represents a single occurrence of the modelled entity, for the same reason your class names are singular. If you are going to conflict with reserved words, pick better table names: AppUser, BillingConstraint, ShippingTable.
I personally don't like underscored table names it makes code scanning harder because fields do tend to be underscored as are system configurations.
Right at the top you said SQL databases are case insensitive. This depends on the collation settings for the instance, database and table. Case sensitive collations are a nightmare!
While I've laid out my reasoning and you should feel free to disagree. I point out my id naming reasons, you don't like them, that's fine.
As for table names being plural, as mentioned, many experts disagree strongly on this. Joe Celko, world-famous database expert and one of the authors of the SQL-89 and SQL-92 standards, argues in SQL for Smarties, that table names should be plural because they represent collections, not individual entities.
As for case-insensitivity, section 5.1 of the SQL-92 standard states (where the angle-bracketed words refer to a previously defined grammar):
SQL the language (not the data it manipulates) has been case-insensitive for decades.
I'll grant that there is no one perfect answer but I felt it was important to put across a different set of arguments. The most important thing is to be consistent with whatever naming convention you decide to use. The ANSI standard SQL-92 - The clue is in the title. That standard will be 30 years old next year, the world has moved on. Each vendor has proprietary extensions to the standard, my preferred flavour is MSSQL which allows you to define the collation rules for your database which includes whether the database and the data it contains is case sensitive, whether it uses the standard ASCIII character set or allows for 'foreign' characters such as Cyrillic, turkish or accented characters, whether Upper and lower case characters should be treated differently and whether NULL appears at the start or end of the index. This may not be standard, but it is 'real world'.
Please don't quote 'World Renowned' Joe Celko: I have had enough run-ins with him on SQLLServerCentral over the years to know that he is an absolutist and simply cannot accept that data is impure and that applications and their databases are not always (usually) developed by people with Doctorates in data management and who have the full picture of what the end solution will need to look like in 10 years time when the business needs are not well defined and you don't have an infinite amount of time. Joe has some good points and his books are comprehensive but on a personality basis he is way of the mark with understanding how to help people sove the problems they find themselves in. His attitude is - well you shouldn't have gotten there in the first place...
I know I am late to the party, but I am going to jump in here. I understand that it is common practice to name the PK, ID. But I have always disagreed with this. Reason being I don't like to spend time researching stuff that should be plainly obvious by reading the code. Here is an example:
WHERE tblUser.ID = tblCar.ID
Obviously, there is a problem here, we are looking at two PK fields, which shouldn't happen except in special cases. But which one is wrong? Unless I coded this, I can't 100% know which side is wrong. The policy may be that the PK is always on the left side, but who is to say that Joe Blow actually did that. We already know he messed up the criteria, so maybe when he messed up he did it all the way wrong and put the actual key he wanted to use on the wrong side. The bottom line is I can't actually know that without doing research on the table designs. This is a complete waste of time. Given that we know the intent, what is supposed to be accomplished here, helps. It may not give us the full picture. and for giggles, lets ay the intent is to see which mechanic is assigned to the car. Now we have a little light, but it is still a mess, Perhaps this is supposed to be the PK tblCar.ID to tblUser.CarID. After all, one mechanic can be assigned to fix many cars in one day. Or maybe, it is supposed to be Pk tblUser.ID to tblCar.UserID, after all, many mechanics can be assigned to fix a car if it is tough job. See my point? These situations are rare, but as programmers shouldn't we prepare of all issues as opposed to ommitting the rare cases?
Another example:
WHERE tblUser.UserID = tblCar.ID
This is clear, this states plainly that the userID field is the PK field. If our policy is to always name Key fields, Pk and Fk respectively, we know the issue is in the design of the tblCar as there is missing a name for the key field. Perhaps this field is a PK, or maybe someone didn't name the FK. The result though, is we know EXACTLY where to find the issue from just reading the code. In this case, we open the design of tblCar and name this field appropriately. If it turns out this field is the wrong field for our criteria, then good for us, we just fixed two issues by finding one.
OK. Lets unpack your comment as the arguments are common and as always - Make a decision and stick to it.
Firstly - don't name for tables tblUser and tblCar. table is the default object in the database. Yes User is probably a reserved word so find a better table name (like systemUser or employee). Also when you need to 'replace' the table with a view, for example if the fields get split into two separeate tables, to prevent changing ALL your code you will now need to create the view (or a synonym) called tblUser which is doubly confusing because it now isn't a table.....
A.ID = B.ID is patently wrong in the vast majority of cases. It would only really be valid if you have aliased a field as part of a view or CTE or table B is an extension table to A and is a 1:1 relationship so it is an instant code-smell and that makes it easy for a seasoned developer to spot (so it is a good thinig...) Yes you have to interrogate the code to find the intention but you are doing that anyway.
In you example you suggested that the User table in this context is the mechanic. If this was the case then I would be aliasing the tblUser and (tbl)Mechanic (I wouldn't use the tbl - but each to their own) and then presumably the tblCar should have a field MechanicID or MechanicUserID. Under your proposal the JOIN would look like
tblMechanic.UserID = Car.MechanicID
Under my proposal it would be
tblMechanic.ID = Car.MechanicID Which to me is much clearer and confirms that we are joining a PK to an FK. Your way requires more intimate knowlege of the database to be sure that UserID is the PK on the aliased table.
This becomes especially important if the tblUser is going to be used multiple times in the query (Mechanic, Salesman, Valet, Delivery driver, Service receptionist etc).
You said:
That is a many to many relationship and therefore should be modelled appropriately in the database with a bridging table (e.g. ServiceTeam)
ID INT IDENTITY(1,1), /*yes I know....*/
Car_ID INT NOT NULL, /*FK to Car*/
Mechanic_ID NOT NULL /*FK to User in the context of the role of Mechanic*/
(alternately as the same person may take on many roles...)
Role_ID /*FK to job role to identify Mechanic, Valet, Driver, Receptionist*/
User_ID /**FK to User table./
Feel free to respond as these discussions help junior developer to avoid pitfalls later on in the project.
Your example is already wrong, wrong wrong. origin.id = thread.id you should immediately know there is something wrong as it both points to primary keys in different tables, so you do know something is wrong already.
And with origin.star_id = thread.email_id, yeah you immediately see something is wrong, but it's wrong due to 2 things (at least if we're going from the notion at least one of the two is the primary key), so it should be origin.star_id = thread.star_id or origin.email_id = thread.email_id or origin.origin_id = thread.origin_id or origin.id = thread.origin_id (and so on).
Sorry about this but, from this i see that the problem us more of ambiguous aliases than column names.
But as i saud is a matter of opinion and project.
I really like this article, I've used pretty much the same conventions, mostly because this is what it was taught to me, and would say some things are common sense, that common sense that sometimes we miss when working on a project(s). (I'm guilty of the
id
usage instead of the descriptive one but changed it on a later project, and it definitely helps)At some point, I was feeling that naming columns shouldn't be wordy but realized it's better to be descriptive and simple if possible than trying to oversimplify things unnecessarily.
In programming, we often suggest that long variable names be used for variables that have a wider scope. It's important that those be descriptive. It's even more important, in my opinion, for databases, because many people will be staring at those names over and over and over again. They tend to persist even longer than that library I wrote last week. Since databases are often the foundation of apps, I try hard to put more effort into making them as understandable as possible.
Not sure where the ideas some of the commenters have regarding singular naming and camel case are coming from, but I strongly STRONGLY disagree. Plural table names imply that you're storing a collection of objects. Singular implies the opposite. Can you imagine if in code we called a collection "user" instead of "users"?
As for PascalCase or camelCase naming...this is just ugly.
The argument that it better represents what folks are coding for falls apart easy. Java uses camelCase, as does OO PHP. C, Python, and Ruby use snake_case. C# uses PascalCase. Go uses both camelCase and PascalCase depending on what you're doing. All those considered, when it comes to database columns, underscores make the most sense when we really need to focus on ultimate readability of the name in a data context. (Don't apply this to the programming language being used!!!). Heck most ORMs will convert this automatically in the first place. If anything, using the _ convention, as it's so common, is only going to help coding efforts. Take Laravel's Eloquent ORM for example It already knows that if you have underscores that it can turn those in to camelCase properties when you access the generated object. It's just such a bad argument that you tend to see from crotchety old DBAs that were working since before best practices were even an idea in tech.
I'm 100% with this article. The only disagreement I have is on the naming of the id column. This isn't one I've come across before, but here is why I disagree: It actually increases ambiguity rather than decreases it. If I see a tablename_id pattern, I'm going to assume a 1:1 relationship, not a PK. You could have a self referential column which, while I can't stand the idea, happens in legacy databases or in newer databases created by novice engineers. I get what you're saying, but I think this is where aliases really shine.
Full disclosure: I'm not a DBA, just an engineer that has to work with them. I can tell you right now which I'd prefer, as would all the engineers I have ever worked with.
My god, it seems that every single point you set as THE naming convention is exactly opposite as what I think is a good naming convention.
*Table names should be plural
Nop, use singular as every record is a single object, not multiple, and the more likely to be the same as a keyword is bullocks ofcourse, as it would also assert to fieldnames
*Spell out id fields (item_id instead of id)
This is certainly a topic that many people disagree on, many people think every table should have an Id field and only the reference from a foreign table should have the Id. Especially if you want to make things more generic. And ofcourse there are also a lot of people who think you should have the fullname in both tables.
Personally at the moment I opt for the Id variant as it makes a lot much easier.
*Don't use ambiguous column names
Ok, I give you this, but to me this is more due to logic then anything else.
"While not perfect, the above database naming conventions will make your database world a better place"
Well, that's all in the eye of the beholder, as many people have different idea's about what makes it a better place.
"Use underscore_names instead of CamelCase Nope, Use CamelCase as it reads more fluent and conforms better to the actual code it's being used."
I guess if the only language you use is C# then this makes sense. Otherwise why are you tying naming to the programming language in use? It should be totally agnostic. This is just bad practice in the first place.
It's got nothing to do with the language I'm using, I'm using even multiple languages. But it's got all to do with what I think reads better and looks more clean, and even less typing. everything lowercase and using underscores is like writing ancient english to me, it's just ugly.
And calling lowercase/underscore best practice, is all in the eye of the beholder. Most people I work with use the CamelCasing with the database designs now.
In reality, in these cases there is no real best practice, it's just what you prefer.
"conforms better to the actual code being used" directly implies the language you're using because it's not true in all cases (that it conforms) by a long shot.
I've never in my life seen that kind of casing in database design, like ever. Not sure who you're working with or where but yeah, I think this ain't the hill to die on.
FYI, first link is dead
Thank you, Tib. Fixed!
Thanks , great artical!
But how should I name my for database correctly?
very helpful! 👍🏻