SQL: find missing IDs in a table

Sql

Sql Problem Overview


I have table with a unique auto-incremental primary key. Over time, entries may be deleted from the table, so there are "holes" in this field's values. For example, table data may be as follows:

 ID  | Value    | More fields...
---------------------------------
 2   | Cat      | ... 
 3   | Fish     | ...
 6   | Dog      | ...
 7   | Aardvark | ...
 9   | Owl      | ...
 10  | Pig      | ...
 11  | Badger   | ...
 15  | Mongoose | ...
 19  | Ferret   | ...

I'm interested in a query that will return the list of missing IDs in the table. For the data above, the expected results are:

 ID 
----
 1
 4
 5
 8
 12
 13
 14
 16
 17
 18

Notes:

  1. It is assumed that the initial first ID was 1
  2. The maximum ID that should be examined is the final one, i.e. it's okay to assume that there were no additional entries after the current last one (see additional data on this point below)

A drawback of the above requirements is that the list will not return IDs that were created after ID 19 and that were deleted. I'm currently solving this case in code, because I hold the max ID created. However, if the query can take as a parameter MaxID, and also return those IDs between the current max and MaxID, that would be a nice "bonus" (but certainly not a must).

I'm currently working with MySQL, but consider moving to SQL Server, so I would like the query to fit both. Also, if you are using anything that can't run on SQLite, please mention it, thanks.

Sql Solutions


Solution 1 - Sql

I landed on this page hoping to find a solution for SQLITE as this was the only answer I found when searching for this same question for SQLITE.

The final solution I found was from this article here Float Middle Blog - SQLITE answer

Hope it helps someone else out :-)

the simple solution being:

SELECT DISTINCT id +1
FROM mytable
WHERE id + 1 NOT IN (SELECT DISTINCT id FROM mytable);

genius.

Solution 2 - Sql

This question often comes up, and sadly, the most common (and most portable) answer is to create a temporary table to hold the IDs that should be there, and do a left join. The syntax is pretty similar between MySQL and SQL Server. The only real difference is the temporary tables syntax.

In MySQL:

declare @id int
declare @maxid int

set @id = 1
select @maxid = max(id) from tbl

create temporary table IDSeq
(
    id int
)

while @id < @maxid
begin
    insert into IDSeq values(@id)

    set @id = @id + 1
end

select 
    s.id 
from 
    idseq s 
    left join tbl t on 
        s.id = t.id 
 where t.id is null

 drop table IDSeq

In SQL Server:

declare @id int
declare @maxid int

set @id = 1
select @maxid = max(id) from tbl

create table #IDSeq
(
    id int
)

while @id < @maxid --whatever you max is
begin
    insert into #IDSeq values(@id)

    set @id = @id + 1
end

select 
    s.id 
from 
    #idseq s 
    left join tbl t on 
        s.id = t.id 
 where t.id is null

 drop table #IDSeq

Solution 3 - Sql

Here's the query for SQL Server:

;WITH Missing (missnum, maxid)
AS
(
 SELECT 1 AS missnum, (select max(id) from @TT)
 UNION ALL
 SELECT missnum + 1, maxid FROM Missing
 WHERE missnum < maxid
)
SELECT missnum
FROM Missing
LEFT OUTER JOIN @TT tt on tt.id = Missing.missnum
WHERE tt.id is NULL
OPTION (MAXRECURSION 0); 

Hope this is helpful.

Solution 4 - Sql

PostgreSQL-only, inspired by other answers here.

SELECT all_ids AS missing_ids
FROM generate_series((SELECT MIN(id) FROM your_table), (SELECT MAX(id) FROM your_table)) all_ids
EXCEPT 
SELECT id FROM your_table

Solution 5 - Sql

I know it's an old question and already has an accepted answer, but using a temp table isn't really necessary. Fixed formatting (sorry for double post).

DECLARE @TEST_ID integer, @LAST_ID integer, @ID integer

SET @TEST_ID = 1 -- start compare with this ID 
SET @LAST_ID = 100 -- end compare with this ID

WHILE @TEST_ID <= @LAST_ID 
BEGIN 
  SELECT @ID = (SELECT <column> FROM <table> WHERE <column> = @TEST_ID) 
  IF @ID IS NULL 
  BEGIN 
    PRINT 'Missing ID: ' + CAST(@TEST_ID AS VARCHAR(10)) 
  END 
  SET @TEST_ID = @TEST_ID + 1 
END

Solution 6 - Sql

This is an Oracle only solution. It doesn't address the full question, but is left here for others that may be using Oracle.

select level id           -- generate 1 .. 19
from dual
connect by level <= 19

minus                     -- remove from that set

select id                 -- everything that is currently in the 
from table                -- actual table

Solution 7 - Sql

The single query can find the missing IDs..

SELECT distinct number

FROM master..spt_values

WHERE number BETWEEN 1 and (SELECT max(id) FROM MyTable)

AND number NOT IN (SELECT id FROM MyTable)

Solution 8 - Sql

> to get the missing rows from table

DECLARE @MaxID INT = (SELECT MAX(ID) FROM TABLE1)
SELECT SeqID AS MissingSeqID
FROM (SELECT ROW_NUMBER() OVER (ORDER BY column_id) SeqID from sys.columns) LkUp
LEFT JOIN dbo.TABLE1 t ON t.ID = LkUp.SeqID
WHERE t.ID is null and SeqID < @MaxID

Solution 9 - Sql

I just have found the solution for Postgres:

select min(gs) 
from generate_series(1, 1999) as gs 
where gs not in (select id from mytable)

Solution 10 - Sql

Update: This method took way too long so I wrote a linux command to find gaps in a text file. It does so in reverse order so first dump all id's to a text file like so;

nohup mysql --password=xx -e 'select id from tablename order by id desc' databasename > /home/ids.txt &

The first and last two lines are just to keep track of how long it took. 1.5million IDs(ish) took me 57sec & that's on a slow server. Set the max id in i and have at it.

T="$(date +%s)"; \
i=1574115; \
while read line; do \
	if  [[ "$line" != "$i" ]] ; then \
		if [[ $i -lt 1 ]] ; then break; fi; \
		if  [[ $line -gt 1 ]] ; then \
			missingsequenceend=$(( $line + 1 )); \
			minusstr="-"; \
			missingsequence="$missingsequenceend$minusstr$i"; \
			expectnext=$(( $line - 1 )); \
			i=$expectnext; \
			echo -e "$missingsequence"; \
		fi; \
	else \
		i=$(( $i - 1 )); \
	fi; \
done \
< /home/ids.txt; \
T="$(($(date +%s)-T))"; \
echo "Time in seconds: ${T}"

Example output:

1494505-1494507
47566-47572
Time in seconds: 57

Also, I got syntax errors with the code from Eric's answer, but after changing the delimiter, using semicolons in the proper places and storing it in a procedure, it works.

Make sure you set the proper max ID, database name and table name (it's in the select query). And if you want to change the procedure name, change it in all 3 places.

use dbname;
drop procedure if exists dorepeat;
delimiter #
CREATE PROCEDURE dorepeat()
BEGIN
set @id = 1;
set @maxid = 1573736;
drop table if exists IDSeq;
create temporary table IDSeq
(
	id int
);

WHILE @id < @maxid DO
	insert into IDSeq values(@id);
	set @id = @id + 1;
END WHILE;

select 
	s.id 
from 
	IDSeq s 
	left join tablename t on 
		s.id = t.id 
 where t.id is null;

drop table if exists IDSeq;

END#
delimiter ;
CALL dorepeat;

I also found this query elwhere, but I haven't tested it.

SELECT a.id+1 AS start, MIN(b.id) - 1 AS end
    FROM tablename AS a, tablename AS b
    WHERE a.id < b.id
    GROUP BY a.id
    HAVING start < MIN(b.id)

Solution 11 - Sql

TRY in MySQL

DELIMITER ||
DROP PROCEDURE IF EXISTS proc_missing ||
CREATE PROCEDURE proc_missing()
BEGIN 
SET @minID = (SELECT MIN(`id`) FROM `tbl_name` WHERE `user_id`=13);
SET @maxID = (SELECT MAX(`id`) FROM `tbl_name` WHERE `user_id`=13);
REPEAT 
	SET @tableID = (SELECT `id` FROM `tbl_name` WHERE `id` = @minID);
	IF (@tableID IS NULL) THEN
		INSERT INTO temp_missing SET `missing_id` = @tableID;
	END IF;
	SET @minID = @minID + 1;
UNTIL(@minID <= @maxID)
END REPEAT;
END ||
DELIMITER ;

Solution 12 - Sql

A few days ago, I was working on a production report and found some numbers missing. The missing numbers are very important, so I was asked to find a list of all missing numbers for investigation purposes. I posted a blog entry here, with a full demo, including a script to find missing numbers/IDs in a sample table.

The script suggested is quite long, so I won't include it here. Here are the basic steps used:

  1. Create one temp table and store all distinct Numbers.
  2. Find NextID which has something missing before it. Store into one TempTable.
  3. Create one temp table to store missing number details.
  4. Start to find the missing id using WHILE Loop.
  5. Select missing data from #MissingID temp table.

Solution 13 - Sql

Converting the SQL CTE (from Paul Svirin) to the Oracle version it looks like this (replace :YOURTABLE with the name of your table):

WITH Missing (missnum,maxid) as (
  SELECT 1 missnum, (select max(id) from :YOURTABLE) maxid from dual
  UNION ALL
  SELECT m.missnum + 1,m.maxid 
  FROM Missing m
  WHERE m.missnum < m.maxid
)
SELECT missnum
FROM Missing
LEFT OUTER JOIN :YOURTABLE tt on tt.id = Missing.missnum
WHERE tt.id is NULL

Solution 14 - Sql

Using @PaulSvirin's answer, I've expanded it with a UNION to show ALL the data in my table, including the missing records with NULLs.

WITH Missing(missnum, maxid) AS
          (SELECT (SELECT MIN(tmMIN.TETmeetingID)
                   FROM tblTETMeeting AS tmMIN)
                      AS missnum,
                  (SELECT MAX(tmMAX.TETmeetingID)
                   FROM tblTETMeeting AS tmMAX)
                      AS maxid
           UNION ALL
           SELECT missnum + 1, maxid
           FROM Missing
           WHERE missnum < maxid)
SELECT missnum AS TETmeetingID,
       tt.DateID,
       tt.WeekNo,
       tt.TETID
FROM Missing LEFT JOIN tblTETMeeting tt ON tt.TETmeetingID = Missing.missnum
WHERE tt.TETmeetingID IS NULL
UNION
SELECT tt.TETmeetingID,
       tt.DateID,
       tt.WeekNo,
       tt.TETID
FROM tblTETMeeting AS tt
OPTION ( MAXRECURSION 0 )

Work's great!

TETmeetingID	DateID	WeekNo	TETID
29	3063	21	1
30	null	null	null
31	null	null	null
32	null	null	null
33	null	null	null
34	3070	22	1
35	3073	23	1

Solution 15 - Sql

Easiest solution for me: Create a select that gives all ids up to max sequence value (ex:1000000), and filter:

with listids as (
Select Rownum idnumber From dual Connect By Rownum <= 1000000)

select * from listids
where idnumber not in (select id from table where id <=1000000)

Solution 16 - Sql

A modified version borrowing @Eric proposal. This is for SQL Server and holds in a temp table the start and end value for missing ranges. If the gap is just one value it puts NULL as end value for easier visualization.

It will produce an output like this

|StartId| EndId |
|-------|-------|
|     1	| 10182 |
| 10189	| NULL  |
| 10246	| 15000 |

And this is the script where myTable and id needs to be replaced by your table and identity column.

declare @id bigint
declare @endId bigint
declare @maxid bigint
declare @previousid bigint=0

set @id = 1
select @maxid = max(id) from myTable

create table #IDGaps
(
	startId bigint,
	endId bigint
)

while @id < @maxid
begin
	if NOT EXISTS(select id from myTable where id=@id)
	BEGIN
		SET @previousid=@id
		select top 1 @endId=id from myTable where id>@id

		IF @id=@endId-1
			insert into #IDGaps values(@id,null)
		ELSE
			insert into #IDGaps values(@id,@endId-1)

		SET @id=@endId
		
	END
	ELSE
		set @id = @id + 1
end

select * from #IDGaps

drop table #IDGaps 

Solution 17 - Sql

I have a large audit table and needed something that ran quickly - this worked well for me. It merges the top and bottom IDs for the missing ranges

select minQ.num,minId,maxId from 

(SELECT DISTINCT id +1 as minId, Row_Number() Over ( Order By id ) As Num
FROM tblAuditLoghistory
WHERE id + 1 NOT IN (SELECT DISTINCT id FROM tblAuditLogHistory)
AND id < (SELECT max(id) FROM tblAuditLoghistory)) Minq
 join


(SELECT DISTINCT id - 1 as maxId, Row_Number() Over ( Order By id ) As Num
FROM tblAuditLoghistory
WHERE id - 1 NOT IN (SELECT DISTINCT id FROM tblAuditLogHistory)
AND id > (SELECT min(id) FROM tblAuditLoghistory)) maxQ on minQ.num=maxQ.num

Solution 18 - Sql

This problem can be solved with only one query

select lft.id + 1 as missing_ids
from tbl as lft left outer join tbl as rght on lft.id + 1 = rght.id
where rght.id is null and lft.id between 1 and (Select max(id)-1 from tbl)

Tested on Mysql

Solution 19 - Sql

Try This Query. This single query is enough to get missing numbers:(Please replace TABLE_NAME to which table name you are using)

select sno as missing from(SELECT @row := @row + 1 as sno FROM 
(select 0 union all select 1 union all select 3 union all select 4 union all 
select 5 union all select 6 union all select 6 union all select 7 union all 
select 8 union all select 9) t,(select 0 union all select 1 union all select 3 
union all select 4 union all select 5 union all select 6 union all select 6 
union all select 7 union all select 8 union all select 9) t2,(select 0 
union all select 1 union all select 3 union all select 4 union all select 5 
union all select 6 union all select 6 union all select 7 union all select 8 
union all select 9) t3, (select 0 union all select 1 union all select 3 union 
all select 4 union all select 5 union all select 6 union all select 6 union all 
select 7 union all select 8 union all select 9) t4, 
(SELECT @row:=0) as b where @row<1000) as a where a.sno  not in 
  (select distinct b.no from 
(select b.*,if(@mn=0,@mn:=b.no,@mn) as min,(@mx:=b.no) as max from 
  (select ID as no from TABLE_NAME as a) as b,
        (select @mn:=0,@mx:=0) as x order by no) as b) and 
         a.sno between @mn and @mx;

Solution 20 - Sql

SELECT DISTINCT id -1
FROM users
WHERE id != 1 AND id - 1 NOT IN (SELECT DISTINCT id FROM users)

Explanation: ( id - 1 )..... checking for any previous id present in table

( id != 1 ).....neglecting when current id is 1 as its previous id will be 0 zero.

Solution 21 - Sql

This what i used to find the missing id of one table named as tablename

select a.id+1 missing_ID from tablename a where a.id+1 not in (select id from tablename b where b.id=a.id+1) and a.id!=(select id from tablename c order by id desc limit 1)

It will return the missing ids. If there are two(2) or more continuous missing ids, it will return only the first.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRoee AdlerView Question on Stackoverflow
Solution 1 - SqlNik BurnsView Answer on Stackoverflow
Solution 2 - SqlEricView Answer on Stackoverflow
Solution 3 - SqlPavlo SvirinView Answer on Stackoverflow
Solution 4 - SqlSevView Answer on Stackoverflow
Solution 5 - SqlPer LöfgrenView Answer on Stackoverflow
Solution 6 - SqlEvilTeachView Answer on Stackoverflow
Solution 7 - SqlsovanView Answer on Stackoverflow
Solution 8 - SqlKiran.BakwadView Answer on Stackoverflow
Solution 9 - SqlKardem KelmothView Answer on Stackoverflow
Solution 10 - SqlnatliView Answer on Stackoverflow
Solution 11 - SqldiEchoView Answer on Stackoverflow
Solution 12 - SqlAnveshView Answer on Stackoverflow
Solution 13 - SqlSven SowaView Answer on Stackoverflow
Solution 14 - SqlFandango68View Answer on Stackoverflow
Solution 15 - SqlÀlex RestaView Answer on Stackoverflow
Solution 16 - SqlguillemView Answer on Stackoverflow
Solution 17 - SqlJMcLemanView Answer on Stackoverflow
Solution 18 - SqlmrbadnewsView Answer on Stackoverflow
Solution 19 - SqlJustin VincentView Answer on Stackoverflow
Solution 20 - SqlpiyushView Answer on Stackoverflow
Solution 21 - SqlarionargoView Answer on Stackoverflow