Primary key/foreign Key naming convention

SqlDatabase DesignNaming Conventions

Sql Problem Overview


In our dev group we have a raging debate regarding the naming convention for Primary and Foreign Keys. There's basically two schools of thought in our group:

1:

Primary Table (Employee)   
Primary Key is called ID

Foreign table (Event)  
Foreign key is called EmployeeID

or

2:

Primary Table (Employee)  
Primary Key is called EmployeeID

Foreign table (Event)  
Foreign key is called EmployeeID

I prefer not to duplicate the name of the table in any of the columns (So I prefer option 1 above). Conceptually, it is consistent with a lot of the recommended practices in other languages, where you don't use the name of the object in its property names. I think that naming the foreign key EmployeeID (or Employee_ID might be better) tells the reader that it is the ID column of the Employee Table.

Some others prefer option 2 where you name the primary key prefixed with the table name so that the column name is the same throughout the database. I see that point, but you now can not visually distinguish a primary key from a foreign key.

Also, I think it's redundant to have the table name in the column name, because if you think of the table as an entity and a column as a property or attribute of that entity, you think of it as the ID attribute of the Employee, not the EmployeeID attribute of an employee. I don't go an ask my coworker what his PersonAge or PersonGender is. I ask him what his Age is.

So like I said, it's a raging debate and we go on and on and on about it. I'm interested to get some new perspectives.

Sql Solutions


Solution 1 - Sql

If the two columns have the same name in both tables (convention #2), you can use the USING syntax in SQL to save some typing and some boilerplate noise:

SELECT name, address, amount
  FROM employees JOIN payroll USING (employee_id)

Another argument in favor of convention #2 is that it's the way the relational model was designed.

> The significance of each column is > partially conveyed by labeling it with > the name of the corresponding domain.

Solution 2 - Sql

It doesn't really matter. I've never run into a system where there is a real difference between choice 1 and choice 2.

Jeff Atwood had a great article a while back on this topic. Basically people debate and argue the most furiously those topics which they cannot be proven wrong on. Or from a different angle, those topics which can only be won through filibuster style endurance based last-man-standing arguments.

Pick one and tell them to focus on issues that actually impact your code.

EDIT: If you want to have fun, have them specify at length why their method is superior for recursive table references.

Solution 3 - Sql

I think it depends on your how you application is put together. If you use ORM or design your tables to represent objects then option 1 may be for you.

I like to code the database as its own layer. I control everything and the app just calls stored procedures. It is nice to have result sets with complete column names, especially when there are many tables joined and many columns returned. With this stype of application, I like option 2. I really like to see column names match on joins. I've worked on old systems where they didn't match and it was a nightmare,

Solution 4 - Sql

Have you considered the following?

Primary Table (Employee)   
Primary Key is PK_Employee

Foreign table (Event)  
Foreign key is called FK_Employee

Solution 5 - Sql

Neither convention works in all cases, so why have one at all? Use Common sense...

e.g., for self-referencing table, when there are more than one FK column that self-references the same table's PK, you HAVE to violate both "standards", since the two FK columns can't be named the same... e.g., EmployeeTable with EmployeeId PK, SupervisorId FK, MentorId Fk, PartnerId FK, ...

Solution 6 - Sql

I agree that there is little to choose between them. To me a much more significant thing about either standard is the "standard" part.

If people start 'doing their own thing' they should be strung up by their nethers. IMHO :)

Solution 7 - Sql

The convention we use where I work is pretty close to A, with the exception that we name tables in the plural form (ie, "employees") and use underscores between the table and column name. The benefit of it is that to refer to a column, it's either "employees _ id" or "employees.id", depending on how you want to access it. If you need to specify what table the column is coming from, "employees.employees _ id" is definitely redundant.

Solution 8 - Sql

If you are looking at application code, not just database queries, some things seem clear to me:

  1. Table definitions usually directly map to a class that describes one object, so they should be singular. To describe a collection of an object, I usually append "Array" or "List" or "Collection" to the singular name, as it more clearly than use of plurals indicates not only that it is a collection, but what kind of a collection it is. In that view, I see a table name as not the name of the collection, but the name of the type of object of which it is a collection. A DBA who doesn't write application code might miss this point.

  2. The data I deal with often uses "ID" for non-key identification purposes. To eliminate confusion between key "ID"s and non-key "ID"s, for the primary key name, we use "Key" (that's what it is, isn't it?) prefixed with the table name or an abbreviation of the table name. This prefixing (and I reserve this only for the primary key) makes the key name unique, which is especially important because we use variable names that are the same as the database column names, and most classes have a parent, identified by the name of the parent key. This also is needed to make sure that it is not a reserved keyword, which "Key" alone is. To facilitate keeping key variable names consistent, and to provide for programs that do natural joins, foreign keys have the same name as is used in the table in which they are the primary key. I have more than once encountered programs which work much better this way using natural joins. On this last point, I admit a problem with self-referencing tables, which I have used. In this case, I would make an exception to the foreign key naming rule. For example, I would use ManagerKey as a foreign key in the Employee table to point to another record in that table.

Solution 9 - Sql

I like convention #2 - in researching this topic, and finding this question before posting my own, I ran into the issue where:

I am selecting * from a table with a large number of columns and joining it to a second table that similarly has a large number of columns. Both tables have an "id" column as the primary key, and that means I have to specifically pick out every column (as far as I know) in order to make those two values unique in the result, i.e.:

SELECT table1.id AS parent_id, table2.id AS child_id

Though using convention #2 means I will still have some columns in the result with the same name, I can now specify which id I need (parent or child) and, as Steven Huwig suggested, the USING statement simplifies things further.

Solution 10 - Sql

I've always used userId as a PK on one table and userId on another table as a FK. 'm seriously thinking about using userIdPK and userIdFK as names to identify one from the other. It will help me to identify PK and FK quickly when looking at the tables and it seems like it will clear up code when using PHP/SQL to access data making it easier to understand. Especially when someone else looks at my code.

Solution 11 - Sql

I use convention #2. I'm working with a legacy data model now where I don't know what stands for in a given table. Where's the harm in being verbose?

Solution 12 - Sql

How about naming the foreign key

role_id

where role is the role the referenced entity has relativ to the table at hand. This solves the issue of recursive reference and multiple fks to the same table.

In many cases will be identical to the referenced table name. In this cases it becomes identically to one of your proposals.

In any case havin long arguments is a bad idea

Solution 13 - Sql

"Where in "employee INNER JOIN order ON order.employee_id = employee.id" is there a need for additional qualification?".

There is no need for additional qualification because the qualification I talked of is already there.

"the reason that a business user refers to Order ID or Employee ID is to provide context, but at a dabase level you already have context because you are refereing to the table".

Pray, tell me, if the column is named 'ID', then how is that "refereing [sic] to the table" done exactly, unless by qualifying this reference to the ID column exactly in the way I talked of ?

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionJeremyView Question on Stackoverflow
Solution 1 - SqlSteven HuwigView Answer on Stackoverflow
Solution 2 - SqlRussell SteenView Answer on Stackoverflow
Solution 3 - SqlKM.View Answer on Stackoverflow
Solution 4 - SqlWouterView Answer on Stackoverflow
Solution 5 - SqlCharles BretanaView Answer on Stackoverflow
Solution 6 - SqlMatBailieView Answer on Stackoverflow
Solution 7 - SqlJarett MillardView Answer on Stackoverflow
Solution 8 - SqlBruce PatinView Answer on Stackoverflow
Solution 9 - SqlJYeltonView Answer on Stackoverflow
Solution 10 - SqlRossView Answer on Stackoverflow
Solution 11 - SqlOMG PoniesView Answer on Stackoverflow
Solution 12 - SqlJens SchauderView Answer on Stackoverflow
Solution 13 - SqlErwin SmoutView Answer on Stackoverflow