How to create a unique index on a NULL column?

SqlSql ServerIndexingConstraintsUnique

Sql Problem Overview


I am using SQL Server 2005. I want to constrain the values in a column to be unique, while allowing NULLS.

My current solution involves a unique index on a view like so:

CREATE VIEW vw_unq WITH SCHEMABINDING AS
    SELECT Column1
      FROM MyTable
     WHERE Column1 IS NOT NULL

CREATE UNIQUE CLUSTERED INDEX unq_idx ON vw_unq (Column1)

Any better ideas?

Sql Solutions


Solution 1 - Sql

Using SQL Server 2008, you can create a filtered index.

CREATE UNIQUE INDEX AK_MyTable_Column1 ON MyTable (Column1) WHERE Column1 IS NOT NULL

Another option is a trigger to check uniqueness, but this could affect performance.

Solution 2 - Sql

The calculated column trick is widely known as a "nullbuster"; my notes credit Steve Kass:

CREATE TABLE dupNulls (
pk int identity(1,1) primary key,
X  int NULL,
nullbuster as (case when X is null then pk else 0 end),
CONSTRAINT dupNulls_uqX UNIQUE (X,nullbuster)
)

Solution 3 - Sql

Pretty sure you can't do that, as it violates the purpose of uniques.

However, this person seems to have a decent work around: http://sqlservercodebook.blogspot.com/2008/04/multiple-null-values-in-unique-index-in.html

Solution 4 - Sql

It is possible to use filter predicates to specify which rows to include in the index.

From the documentation:

> WHERE Creates a filtered index by specifying which > rows to include in the index. The filtered index must be a > nonclustered index on a table. Creates filtered statistics for the > data rows in the filtered index.

Example:

CREATE TABLE Table1 (
  NullableCol int NULL
)

CREATE UNIQUE INDEX IX_Table1 ON Table1 (NullableCol) WHERE NullableCol IS NOT NULL;

Solution 5 - Sql

Strictly speaking, a unique nullable column (or set of columns) can be NULL (or a record of NULLs) only once, since having the same value (and this includes NULL) more than once obviously violates the unique constraint.

However, that doesn't mean the concept of "unique nullable columns" is valid; to actually implement it in any relational database we just have to bear in mind that this kind of databases are meant to be normalized to properly work, and normalization usually involves the addition of several (non-entity) extra tables to establish relationships between the entities.

Let's work a basic example considering only one "unique nullable column", it's easy to expand it to more such columns.

Suppose we the information represented by a table like this:

create table the_entity_incorrect
(
  id integer,
  uniqnull integer null, /* we want this to be "unique and nullable" */
  primary key (id)
);

We can do it by putting uniqnull apart and adding a second table to establish a relationship between uniqnull values and the_entity (rather than having uniqnull "inside" the_entity):

create table the_entity
(
  id integer,
  primary key(id)
);

create table the_relation
(
  the_entity_id integer not null,
  uniqnull integer not null,

  unique(the_entity_id),
  unique(uniqnull),
  /* primary key can be both or either of the_entity_id or uniqnull */
  primary key (the_entity_id, uniqnull), 
  foreign key (the_entity_id) references the_entity(id)
);

To associate a value of uniqnull to a row in the_entity we need to also add a row in the_relation.

For rows in the_entity were no uniqnull values are associated (i.e. for the ones we would put NULL in the_entity_incorrect) we simply do not add a row in the_relation.

Note that values for uniqnull will be unique for all the_relation, and also notice that for each value in the_entity there can be at most one value in the_relation, since the primary and foreign keys on it enforce this.

Then, if a value of 5 for uniqnull is to be associated with an the_entity id of 3, we need to:

start transaction;
insert into the_entity (id) values (3); 
insert into the_relation (the_entity_id, uniqnull) values (3, 5);
commit;

And, if an id value of 10 for the_entity has no uniqnull counterpart, we only do:

start transaction;
insert into the_entity (id) values (10); 
commit;

To denormalize this information and obtain the data a table like the_entity_incorrect would hold, we need to:

select
  id, uniqnull
from
  the_entity left outer join the_relation
on
  the_entity.id = the_relation.the_entity_id
;

The "left outer join" operator ensures all rows from the_entity will appear in the result, putting NULL in the uniqnull column when no matching columns are present in the_relation.

Remember, any effort spent for some days (or weeks or months) in designing a well normalized database (and the corresponding denormalizing views and procedures) will save you years (or decades) of pain and wasted resources.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionNuno GView Question on Stackoverflow
Solution 1 - SqlPhil HaseldenView Answer on Stackoverflow
Solution 2 - SqlonedaywhenView Answer on Stackoverflow
Solution 3 - SqlwillasaywhatView Answer on Stackoverflow
Solution 4 - SqlMartin StaufcikView Answer on Stackoverflow
Solution 5 - SqlroyView Answer on Stackoverflow