SQL NOT IN not working

Sql, Sql Server

Sql Problem Overview


I have two databases, one which holds the inventory, and another which contains a subset of the records of the primary database.

The following SQL statement is not working:

SELECT  stock.IdStock
        ,stock.Descr       
FROM    [Inventory].[dbo].[Stock] stock
WHERE   stock.IdStock NOT IN
		(SELECT foreignStockId FROM
		 [Subset].[dbo].[Products])

The NOT IN does not work. Removing the NOT gives the correct results, i.e. the products that are in both databases. However, with NOT IN the query returns no results at all.

What am I doing wrong, any ideas?

Sql Solutions


Solution 1 - Sql

SELECT foreignStockId
FROM   [Subset].[dbo].[Products]  

Probably returns at least one NULL.

A NOT IN query will not return any rows if any NULLs exist in the list of NOT IN values. You can explicitly exclude them using IS NOT NULL, as below.

SELECT stock.IdStock,
       stock.Descr
FROM   [Inventory].[dbo].[Stock] stock
WHERE  stock.IdStock NOT IN (SELECT foreignStockId
                             FROM   [Subset].[dbo].[Products]
                             WHERE  foreignStockId IS NOT NULL) 

Or rewrite using NOT EXISTS instead.

SELECT stock.IdStock,
       stock.Descr
FROM   [Inventory].[dbo].[Stock] stock
WHERE  NOT EXISTS (SELECT *
                   FROM   [Subset].[dbo].[Products] p
                   WHERE  p.foreignStockId = stock.IdStock) 

As well as having the semantics that you want, the execution plan for NOT EXISTS is often simpler.
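To see the behaviour end to end, here is a minimal sketch that reproduces the problem and both fixes. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, with a single in-memory database and made-up sample rows instead of the two databases in the question; those are assumptions for illustration only.

```python
import sqlite3

# Hypothetical miniature of the two tables; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stock (IdStock INTEGER PRIMARY KEY, Descr TEXT);
    CREATE TABLE Products (foreignStockId INTEGER);
    INSERT INTO Stock VALUES (1, 'bolt'), (2, 'nut'), (3, 'washer');
    INSERT INTO Products VALUES (1), (NULL);  -- one NULL is enough
""")

def stock_ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# The original query shape: the NULL in the subquery suppresses every row.
plain_not_in = stock_ids("""
    SELECT IdStock FROM Stock
    WHERE IdStock NOT IN (SELECT foreignStockId FROM Products)
    ORDER BY IdStock
""")

# Fix 1: filter the NULLs out of the subquery.
filtered_not_in = stock_ids("""
    SELECT IdStock FROM Stock
    WHERE IdStock NOT IN (SELECT foreignStockId FROM Products
                          WHERE foreignStockId IS NOT NULL)
    ORDER BY IdStock
""")

# Fix 2: NOT EXISTS with a correlated subquery.
not_exists = stock_ids("""
    SELECT s.IdStock FROM Stock s
    WHERE NOT EXISTS (SELECT 1 FROM Products p
                      WHERE p.foreignStockId = s.IdStock)
    ORDER BY s.IdStock
""")

print(plain_not_in)     # []
print(filtered_not_in)  # [2, 3]
print(not_exists)       # [2, 3]
```

Deleting the NULL row from the sample Products table makes all three queries agree, which is a quick way to confirm that a NULL is the cause.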

The difference in behaviour comes down to the three-valued logic used in SQL: predicates can evaluate to True, False, or Unknown.

A WHERE clause must evaluate to True in order for the row to be returned, but this is not possible with NOT IN when a NULL is present, as explained below.

'A' NOT IN ('X','Y',NULL) is equivalent to 'A' <> 'X' AND 'A' <> 'Y' AND 'A' <> NULL

  • 'A' <> 'X' = True
  • 'A' <> 'Y' = True
  • 'A' <> NULL = Unknown

True AND True AND Unknown evaluates to Unknown per the truth tables for three-valued logic, so the WHERE clause never evaluates to True and no row is returned.
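The evaluation above can be checked directly by selecting the predicates as values. This is a small sketch using Python's sqlite3 module rather than SQL Server (an assumption made so the example is runnable anywhere). SQLite reports True as 1, False as 0, and Unknown as NULL, which Python shows as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def evaluate(expr):
    """Evaluate a SQL expression and return its single result value."""
    return conn.execute(f"SELECT {expr}").fetchone()[0]

expressions = [
    "'A' <> 'X'",                 # True
    "'A' <> NULL",                # Unknown
    "'A' NOT IN ('X','Y')",       # True: no NULL in the list
    "'A' NOT IN ('X','Y',NULL)",  # Unknown: True AND True AND Unknown
    "'X' NOT IN ('X','Y',NULL)",  # False: 'X' <> 'X' is False
]

results = {expr: evaluate(expr) for expr in expressions}
for expr, value in results.items():
    print(f"{expr:28} -> {value}")
```

Note that the last expression is False rather than Unknown: with a NULL in the list, NOT IN can be False or Unknown, but never True.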


Solution 2 - Sql

If NOT IN does not work, you can always use a LEFT JOIN instead. Then filter in the WHERE clause on a column from the joined table being NULL, which marks the rows that found no match. This works provided the column you test cannot itself be NULL in a matched row; the column you join by is the usual choice.
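A minimal sketch of that LEFT JOIN pattern, again using Python's sqlite3 module as a stand-in for SQL Server, with a single in-memory database and made-up sample rows (both are assumptions for illustration):

```python
import sqlite3

# Hypothetical miniature of the two tables; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stock (IdStock INTEGER PRIMARY KEY, Descr TEXT);
    CREATE TABLE Products (foreignStockId INTEGER);
    INSERT INTO Stock VALUES (1, 'bolt'), (2, 'nut'), (3, 'washer');
    INSERT INTO Products VALUES (1), (NULL);
""")

# Keep every Stock row, attach the matching Products row if there is one,
# then keep only the rows where nothing was attached.
missing = [row[0] for row in conn.execute("""
    SELECT s.IdStock
    FROM   Stock s
           LEFT JOIN Products p
                  ON p.foreignStockId = s.IdStock
    WHERE  p.foreignStockId IS NULL
    ORDER  BY s.IdStock
""")]

print(missing)  # [2, 3]
```

The NULL row in Products never satisfies the join condition, so it cannot attach to any Stock row and does not affect the result.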

Solution 3 - Sql

Adding my 2 cents:

I've seen SQL Server return wrong results, even after switching to NOT EXISTS and LEFT JOIN, in corrupt databases. Run DBCC CHECKTABLE on the tables involved, look at the execution plan of the NOT IN query, and rebuild the indexes it uses. This should help.

Solution 4 - Sql

You can also use a CASE expression to tackle such issues:

SELECT  stock.IdStock
        ,stock.Descr        
FROM    [Inventory].[dbo].[Stock] stock
WHERE   (CASE WHEN stock.IdStock IN
        (SELECT foreignStockId FROM
        [Subset].[dbo].[Products]) THEN 1 ELSE 0 END) = 0 

This syntax works in SQL Server, Oracle, and PostgreSQL.
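The CASE version sidesteps the NULL problem because a WHEN condition that evaluates to Unknown is treated as not satisfied, so the row falls through to the ELSE branch and gets 0. Here is a minimal sketch using Python's sqlite3 module as a stand-in (SQLite is not one of the engines named above; it and the sample rows are assumptions for illustration):

```python
import sqlite3

# Hypothetical miniature of the two tables; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stock (IdStock INTEGER PRIMARY KEY, Descr TEXT);
    CREATE TABLE Products (foreignStockId INTEGER);
    INSERT INTO Stock VALUES (1, 'bolt'), (2, 'nut'), (3, 'washer');
    INSERT INTO Products VALUES (1), (NULL);
""")

# IN yields True for IdStock 1 and Unknown for 2 and 3 (because of the NULL).
# CASE maps True to 1 and everything else, including Unknown, to 0.
case_result = [row[0] for row in conn.execute("""
    SELECT IdStock
    FROM   Stock
    WHERE  (CASE WHEN IdStock IN (SELECT foreignStockId FROM Products)
                 THEN 1 ELSE 0 END) = 0
    ORDER  BY IdStock
""")]

print(case_result)  # [2, 3]
```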

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stack Overflow
Question | Sam | View Question on Stack Overflow
Solution 1 - Sql | Martin Smith | View Answer on Stack Overflow
Solution 2 - Sql | MPękalski | View Answer on Stack Overflow
Solution 3 - Sql | Alex from Jitbit | View Answer on Stack Overflow
Solution 4 - Sql | Raj Kamal | View Answer on Stack Overflow