SQL how to compare two tables for same data content?

SqlDatabase

Sql Problem Overview


I have two tables TableA and TableB which have the same format of columns, which means both tables have the columns

A B C D E F 

where A and B are the primary keys.

How do I write an SQL query to check if TableA and TableB (which have identical primary keys) contain exactly the same values in every column?

It means that these two tables have exactly the same data.

Sql Solutions


Solution 1 - Sql

You should be able to "MINUS" or "EXCEPT" depending on the flavor of SQL used by your DBMS.

select * from tableA
minus
select * from tableB

If the query returns no rows then the data is exactly the same.

Solution 2 - Sql

Using relational operators:

SELECT * FROM TableA
UNION 
SELECT * FROM TableB
EXCEPT 
SELECT * FROM TableA
INTERSECT
SELECT * FROM TableB;

Change EXCEPT to MINUS for Oracle.

Slightly picky point: the above relies on operator precedence, which according to the SQL Standard is implementation dependent, so YMMV. It works for SQL Server, for which the precedence is:

  1. Expressions in parentheses

  2. INTERSECT

  3. EXCEPT and UNION evaluated from left to right.

Solution 3 - Sql

dietbuddha has a nice answer. In cases where you don't have a MINUS or EXCEPT, one option is to do a union all between the tables, group by with all the columns and make sure there is two of everything:

SELECT col1, col2, col3
FROM
(SELECT * FROM tableA
UNION ALL  
SELECT * FROM tableB) data
GROUP BY col1, col2, col3
HAVING count(*)!=2

Solution 4 - Sql

SELECT c.ID
FROM clients c
WHERE EXISTS(SELECT c2.ID 
FROM clients2 c2
WHERE c2.ID = c.ID);

Will return all ID's that are the SAME in both tables. To get the differences change EXISTS to NOT EXISTS.

Solution 5 - Sql

Taking the script from onedaywhen, I modified it to also show which table each entry comes from.

DECLARE @table1 NVARCHAR(80)= 'table 1 name'
DECLARE @table2 NVARCHAR(80)= 'table 2 name'
DECLARE @sql NVARCHAR (1000)

SET @sql = 
'
SELECT ''' + @table1 + ''' AS table_name,* FROM
(
SELECT * FROM ' + @table1 + '
EXCEPT
SELECT * FROM ' + @table2 + '
) x

UNION 

SELECT ''' + @table2 + ''' AS table_name,* FROM 
(
SELECT * FROM ' + @table2 + '
EXCEPT
SELECT * FROM ' + @table1 + '
) y
'

EXEC sp_executesql @stmt = @sql

Solution 6 - Sql

Enhancement to dietbuddha's answer...

select * from
(
    select * from tableA
    minus
    select * from tableB
)
union all
select * from
(
    select * from tableB
    minus
    select * from tableA
)

Solution 7 - Sql

Source: Use NATURAL FULL JOIN to compare two tables in SQL by Lukas Eder

Clever approach of using NATURAL FULL JOIN to detect the same/different rows between two tables.

Example 1 - status flag:

SELECT t1.*, t2.*, CASE WHEN t1 IS NULL OR t2 IS NULL THEN 'Not equal' ELSE 'Equal' END
FROM t1
NATURAL FULL JOIN t2;

Example 2 - filtering rows

SELECT *
FROM (SELECT 't1' AS t1, t1.* FROM t1) t1 
NATURAL FULL JOIN (SELECT 't2' AS t2, t2.* FROM t2) t2 
WHERE t1 IS NULL OR t2 IS NULL -- show differences
--WHERE  t1 IS NOT NULL AND t2 IS NOT NULL    -- show the same

db<>fiddle demo

Solution 8 - Sql

just to complet, a proc stored using except method to compare 2 tables and give result in same table with 3 errors status, ADD, DEL, GAP table must have same PK, you declare the 2 tables and fields to compare of 1 or both table

Just use like this ps_TableGap 'tbl1','Tbl2','fld1,fld2,fld3','fld4'fld5'fld6' (optional)

/****** Object:  StoredProcedure [dbo].[ps_TableGap]    Script Date: 10/03/2013 16:03:44 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

-- =============================================
-- Author:		 Arnaud ALLAVENA
-- Create date: 03.10.2013
-- Description:	Compare tables
-- =============================================
create PROCEDURE [dbo].[ps_TableGap]
	-- Add the parameters for the stored procedure here
	@Tbl1 as varchar(100),@Tbl2 as varchar(100),@Fld1 as varchar(1000), @Fld2 as varchar(1000)= ''
AS
BEGIN
	-- SET NOCOUNT ON added to prevent extra result sets from
	-- interfering with SELECT statements.
	
	SET NOCOUNT ON;
--Variables
--@Tbl1 = table 1
--@Tbl2 = table 2
--@Fld1 = Fields to compare from table 1
--@Fld2 Fields to compare from table 2
Declare @SQL varchar(8000)= '' --SQL statements
Declare @nLoop int = 1 --loop counter
Declare @Pk varchar(1000)= '' --primary key(s) 
Declare @Pk1 varchar(1000)= '' --first field of primary key
declare @strTmp varchar(50) = '' --returns value in Pk determination
declare @FldTmp varchar (1000) = '' --temporarily fields for alias calculation

--If @Fld2 empty we take @Fld1
--fields rules: fields to be compare must be in same order and type - always returns Gap
If @Fld2 = '' Set @Fld2 = @Fld1

--Change @Fld2 with Alias prefix xxx become _xxx 
while charindex(',',@Fld2)>0
begin
	Set @FldTmp = @FldTmp + (select substring(@Fld2,1,charindex(',',@Fld2)-1) + ' as _' + substring(@Fld2,1,charindex(',',@Fld2)-1) + ',')
	Set @Fld2 = (select ltrim(right(@Fld2,len(@Fld2)-charindex(',',@Fld2))))
end
Set @FldTmp = @FldTmp + @Fld2 + ' as _' + @Fld2
Set @Fld2 = @FldTmp

--Determinate primary key jointure
--rule: same pk in both tables
Set @nLoop = 1
Set @SQL = 'Declare crsr cursor for select COLUMN_NAME from INFORMATION_SCHEMA.KEY_COLUMN_USAGE where TABLE_NAME = '''
 + @Tbl1 + ''' or TABLE_SCHEMA + ''.'' + TABLE_NAME = ''' + @Tbl1 +  ''' or TABLE_CATALOG + ''.'' + TABLE_SCHEMA + ''.'' + TABLE_NAME = ''' + @Tbl1 
 + ''' order by ORDINAL_POSITION'
exec(@SQL)
open crsr 
fetch next from crsr into @strTmp
while @@fetch_status = 0
begin 
	if @nLoop = 1 
	begin 
		Set @Pk = 's.' + @strTmp + ' = b._' + @strTmp
		Set @Pk1 = @strTmp
		set @nLoop = @nLoop + 1 
	end	
	Else
	Set @Pk = @Pk + ' and s.' + @strTmp + ' = b._' + @strTmp
fetch next from crsr into @strTmp 

end 
close crsr
deallocate crsr

--SQL statement build
set @SQL = 'select case when s.' + @Pk1 + ' is null then ''Del'' when b._' + @Pk1 + ' is null then ''Add'' else ''Gap'' end as TypErr, '''
set @SQL = @SQL + @Tbl1 +''' as Tbl1, s.*, ''' + @Tbl2 +''' as Tbl2 ,b.* from (Select ' + @Fld1 + ' from ' + @Tbl1
set @SQL = @SQL + ' EXCEPT SELECT ' + @Fld2 + ' from ' + @Tbl2 + ')s full join (Select ' + @Fld2 + ' from ' + @Tbl2 
set @SQL = @SQL + ' EXCEPT SELECT ' + @Fld1 + ' from ' + @Tbl1 +')b on '+ @Pk 

--Run SQL statement
Exec(@SQL)
END

Solution 9 - Sql

You can find differences of 2 tables using combination of insert all and full outer join in Oracle. In sql you can extract the differences via full outer join but it seems that insert all/first doesnt exist in sql! Hence, you have to use following query instead:

select * from A
full outer join B on
A.pk=B.pk
where A.field1!=B.field1
or A.field2!=B.field2 or A.field3!=B.field3 or A.field4!=B.field4 
--and A.Date==Date1

Although using 'OR' in where clause is not recommended and it usually yields in lower performance, you can still use above query if your tables are not massive. If there is any result for the above query, it is exactly the differences of 2 tables based on comparison of fields 1,2,3,4. For improving the query performance, you can filter it by date as well(check the commented part)

Solution 10 - Sql

    SELECT unnest(ARRAY[1,2,2,3,3]) 
    EXCEPT
    SELECT unnest(ARRAY[1,1,2,3,3])
UNION
    SELECT unnest(ARRAY[1,1,2,3,3])
    EXCEPT
    SELECT unnest(ARRAY[1,2,2,3,3])

Result is null, but sources are different!

But:

(
    SELECT unnest(ARRAY[1,2,2,3])
    EXCEPT ALL
    SELECT unnest(ARRAY[2,1,2,3])
)
UNION
(
    SELECT unnest(ARRAY[2,1,2,3])
    EXCEPT ALL
    SELECT unnest(ARRAY[1,2,2,3])
)

works.

Solution 11 - Sql

I had this same issue in SQL Server and wrote this T-SQL script to automate the process (actually this is the watered-down version, mine wrote all the diff to a single table for easy reporting).

Update 'MyTable' and 'MyOtherTable' to the names of the tables you wish to compare.

DECLARE @ColName varchar(100)
DECLARE @Table1 varchar(100) = 'MyTable'
DECLARE @Table2 varchar(100) = 'MyOtherTable'


IF (OBJECT_ID('tempdb..#col') IS NOT NULL) DROP TABLE #col
SELECT  IDENTITY(INT, 1, 1) RowNum , c.name
INTO    #col
FROM    SYS.Objects o 
        JOIN SYS.columns c on o.object_id = c.object_id
WHERE   o.name = @Table1 AND NOT c.Name IN ('List','Columns','YouWantToIgnore')

DECLARE @Counter INT = (SELECT MAX(RowNum) FROM #col)

    WHILE @Counter > 0

	    BEGIN
            SET @ColName = (SELECT name FROM #Col WHERE RowNum= @Counter)
                EXEC ('SELECT  t1.Identifier
                        ,t1.'+@ColName+' AS '+@Table1+@ColName+'
                        ,t2.'+@ColName+' AS '+@Table2+@ColName+'
                FROM    '+@Table1+' t1
                        LEFT JOIN '+@Table2+' t2 ON t1.Identifier = t2.Identifier 
                WHERE   t1.'+@ColName+' <> t2.'+@ColName)
            SET @Counter = @Counter - 1 
        END

Solution 12 - Sql

I wrote this to compare the results of a pretty nasty view I ported from Oracle to SQL Server. It creates a pair of temp tables, #DataVariances and #SchemaVariances, with differences in (you guessed it) the data in the tables and the schema of the tables themselves.

It requires both tables have a primary key, but you could drop it into tempdb with an identity column if the source tables don't have one.

declare @TableA_ThreePartName nvarchar(max) = ''
declare @TableB_ThreePartName nvarchar(max) = ''
declare @KeyName nvarchar(max) = ''

/***********************************************************************************************
	
	Script to compare two tables and return differneces in schema and data.

	Author: Devin Lamothe		2017-08-11

***********************************************************************************************/
set nocount on

-- Split three part name into database/schema/table
declare @Database_A nvarchar(max) = (
	select	left(@TableA_ThreePartName,charindex('.',@TableA_ThreePartName) - 1))
declare @Table_A nvarchar(max) = (
	select	right(@TableA_ThreePartName,len(@TableA_ThreePartName) - charindex('.',@TableA_ThreePartName,len(@Database_A) + 2)))
declare @Schema_A nvarchar(max) = (
	select	replace(replace(@TableA_ThreePartName,@Database_A + '.',''),'.' + @Table_A,''))

declare @Database_B nvarchar(max) = (
	select	left(@TableB_ThreePartName,charindex('.',@TableB_ThreePartName) - 1))
declare @Table_B nvarchar(max) = (
	select	right(@TableB_ThreePartName,len(@TableB_ThreePartName) - charindex('.',@TableB_ThreePartName,len(@Database_B) + 2)))
declare @Schema_B nvarchar(max) = (
	select	replace(replace(@TableB_ThreePartName,@Database_B + '.',''),'.' + @Table_B,''))

-- Get schema for both tables
declare @GetTableADetails nvarchar(max) = '
	use [' + @Database_A +']
		select 	COLUMN_NAME
			 ,	DATA_TYPE
		  from	INFORMATION_SCHEMA.COLUMNS
		 where	TABLE_NAME = ''' + @Table_A + '''
		   and	TABLE_SCHEMA = ''' + @Schema_A + '''
	'
create table #Table_A_Details (
	ColumnName nvarchar(max)
,	DataType nvarchar(max)
)
insert into #Table_A_Details
exec (@GetTableADetails)

declare @GetTableBDetails nvarchar(max) = '
	use [' + @Database_B +']
		select 	COLUMN_NAME
			 ,	DATA_TYPE
		  from	INFORMATION_SCHEMA.COLUMNS
		 where	TABLE_NAME = ''' + @Table_B + '''
		   and	TABLE_SCHEMA = ''' + @Schema_B + '''
	'
create table #Table_B_Details (
	ColumnName nvarchar(max)
,	DataType nvarchar(max)
)
insert into #Table_B_Details
exec (@GetTableBDetails)


-- Get differences in table schema
			select	ROW_NUMBER() over (order by
						a.ColumnName
					,	b.ColumnName) as RowKey
				 ,	a.ColumnName as A_ColumnName
				 ,	a.DataType as A_DataType
				 ,	b.ColumnName as B_ColumnName
				 ,	b.DataType as B_DataType
			  into	#FieldList
			  from	#Table_A_Details a
   full outer join	#Table_B_Details b
				on	a.ColumnName = b.ColumnName
			 where	a.ColumnName is null
				or	b.ColumnName is null
				or	a.DataType <> b.DataType

		drop table	#Table_A_Details
		drop table	#Table_B_Details

			select	coalesce(A_ColumnName,B_ColumnName) as ColumnName
				 ,	A_DataType
				 ,	B_DataType
			  into	#SchemaVariances
			  from	#FieldList

-- Get differences in table data
declare @LastColumn int = (select max(RowKey) from #FieldList)
declare @RowNumber int = 1
declare @ThisField nvarchar(max)
declare @TestSql nvarchar(max)



create table #DataVariances (
	TableKey			nvarchar(max)
,	FieldName			nvarchar(max)
,	TableA_Value		nvarchar(max)
,	TableB_Value		nvarchar(max)
)

delete from #FieldList where A_DataType in ('varbinary','image') or B_DataType in ('varbinary','image') 

while @RowNumber <= @LastColumn begin
	set @TestSql = '
		select	coalesce(a.[' + @KeyName + '],b.[' + @KeyName + ']) as TableKey
			 ,	''' + @ThisField + ''' as FieldName
			 ,	a.[' + @ThisField + '] as [TableA_Value]
			 ,	b.[' + @ThisField + '] as [TableB_Value]
		  from	[' + @Database_A + '].[' + @Schema_A + '].[' + @Table_A + '] a 
	inner join	[' + @Database_B + '].[' + @Schema_B + '].[' + @Table_B + '] b
			on	a.[' + @KeyName + '] = b.[' + @KeyName + ']
		 where	ltrim(rtrim(a.[' + @ThisField + '])) <> ltrim(rtrim(b.[' + @ThisField + ']))
			or (a.[' + @ThisField + '] is null and  b.[' + @ThisField + '] is not null)
			or (a.[' + @ThisField + '] is not null and  b.[' + @ThisField + '] is null)
'

insert into #DataVariances
exec (@TestSql)

set @RowNumber = @RowNumber + 1
set @ThisField = (select coalesce(A_ColumnName,B_ColumnName) from #FieldList a where RowKey = @RowNumber)

end

drop table #FieldList

print 'Query complete.  Select from #DataVariances to verify data integrity or #SchemaVariances to verify schemas match.  Data types varbinary and image are not checked.'

Solution 13 - Sql

SELECT * 
FROM TABLE A
WHERE NOT EXISTS (SELECT 'X' 
                  FROM  TABLE B 
                  WHERE B.KEYFIELD1 = A.KEYFIELD1 
                  AND   B.KEYFIELD2 = A.KEYFIELD2 
                  AND   B.KEYFIELD3 = A.KEYFIELD3)
;

'X' is any value.

Switch the tables to see the different discrepancies.

Make sure to join the key fields in your tables.

Or just use the MINUS operator with 2 select statements, however, MINUS can only work in Oracle.

Solution 14 - Sql

Most of the responses seem to ignore the issue raised by Kamil. (That is where the tables contain identical rows, but different ones are repeated in each table.) Unfortunately, I am not able to use his solution, because I am in Oracle. The best I've been able to come up with is:

SELECT * FROM
   (
   SELECT column1, column2, ..., COUNT(*) AS the_count
   FROM tableA
   GROUP BY column1, column2, ...
   MINUS
   SELECT column1, column2, ..., COUNT(*) AS the_count
   FROM tableB
   GROUP BY column1, column2, ...
   )
UNION ALL
   (
   SELECT column1, column2, ..., COUNT(*) AS the_count
   FROM tableB
   GROUP BY column1, column2, ...
   MINUS
   SELECT column1, column2, ..., COUNT(*) AS the_count
   FROM tableA
   GROUP BY column1, column2, ...
   )

Solution 15 - Sql

We can compare data from two tables of DB2 tables using the below simple query,

Step 1:- Select which all columns we need to compare from table (T1) of schema(S)

     SELECT T1.col1,T1.col3,T1.col5 from S.T1

Step 2:- Use 'Minus' keyword for comparing 2 tables.

Step 3:- Select which all columns we need to compare from table (T2) of schema(S)

     SELECT T2.col1,T2.col3,T2.col5 from S.T1

END result:

     SELECT T1.col1,T1.col3,T1.col5 from S.T1
     MINUS 
     SELECT T2.col1,T2.col3,T2.col5 from S.T1;

If the query returns no rows then the data is exactly the same.

Solution 16 - Sql

To compare T1(PK, A, B) and T2(PK, A, B).

First compare primary key sets to look for missing key values on either side:

SELECT T1.*, T2.* FROM T1 FULL OUTER JOIN T2 ON T1.PK=T2.PK WHERE T1.PK IS NULL OR T2.PK IS NULL;

Then list all value mismatch:

SELECT T1.PK, 'A' AS columnName, T1.A AS leftValue, T2.A AS rightValue FROM T1 JOIN T2 ON T1.PK=T2.PK WHERE COALESCE(T1.A,0) != COALESCE(T2.A,0)
UNION ALL
SELECT T1.PK, 'B' AS columnName, T1.B AS leftValue, T2.B AS rightValue FROM T1 JOIN T2 ON T1.PK=T2.PK WHERE COALESCE(T1.B,0) != COALESCE(T2.B,0)

A and B must be of same type. You can use INFORMATION SCHEMA to generate the SELECT. Don't forget the COALESCE to also include IS NULL results. You could also use FULL OUTER JOIN and COALESCE(T1.PK,0)=COALESCE(T2.PK,0).

For example for columns of type varchar:

SELECT concat('SELECT T1.PK, ''', COLUMN_NAME, ''' AS columnName, T1.', COLUMN_NAME, ' AS leftValue, T2.', COLUMN_NAME, ' AS rightValue FROM T1 JOIN T2 ON T1.PK=T2.PK WHERE COALESCE(T1.',COLUMN_NAME, ',0)!=COALESCE(T2.', COLUMN_NAME, ',0)')
FROM INFORMATION_SCHEMA.COLUMNS 
WHERE TABLE_NAME='T1' AND DATA_TYPE IN ('nvarchar','varchar');

Solution 17 - Sql

In SQL Server... Using the row counts and then comparing this to the row count of the intersect:

DECLARE @t1count int = (SELECT COUNT(*) FROM table1)

IF (@t1count = (SELECT COUNT(*) FROM table2))  
   IF (SELECT COUNT (*) FROM (SELECT * FROM table1 INTERSECT SELECT * FROM table2) AS dT) = @t1count 
     SELECT 'Equal' 
   ELSE SELECT 'Not equal'
ELSE
 SELECT 'Not equal'

I wrote it this way so that when the row counts of the tables aren't equal, then the intersect is completely skipped which will improve performance in those cases.

Solution 18 - Sql

Try This

SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'table1'
intersect
SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'table2';

Solution 19 - Sql

select count(a.A) from A a
inner join B b on b.A = a.A
and b.B = a.B
and b.C = a.C
and b.D = a.D
and b.E = a.E
and b.F = a.F
if the answer equals with the count of table A that means table A and B are have exactly same data , but if the answer is 0 then the Table A is not equals B .

Solution 20 - Sql

In MySQL, where "minus" is not supported, and taking performance into account, this is a fast

query:
SELECT 
t1.id, 
t1.id 
FROM t1 inner join t2 using (id) where concat(t1.C, t1.D, ...)<>concat(t2.C, t2.D, ...)

Solution 21 - Sql

An alternative, enhanced query based on answer by dietbuddha & IanMc. The query includes description to helpfully show where rows exist and are missing. (NB: for SQL Server)

(
	select 'InTableA_NoMatchInTableB' as Msg, * from tableA
	except
	select 'InTableA_NoMatchInTableB' , * from tableB
)
union all
(
	select 'InTableB_NoMatchInTableA' as Msg, * from tableB
	except
	select 'InTableB_NNoMatchInTableA' ,* from tableA
)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionnikkyView Question on Stackoverflow
Solution 1 - SqldietbuddhaView Answer on Stackoverflow
Solution 2 - SqlonedaywhenView Answer on Stackoverflow
Solution 3 - SqljzdView Answer on Stackoverflow
Solution 4 - SqlimizView Answer on Stackoverflow
Solution 5 - SqlRobert SieversView Answer on Stackoverflow
Solution 6 - SqlIanMcView Answer on Stackoverflow
Solution 7 - SqlLukasz SzozdaView Answer on Stackoverflow
Solution 8 - SqlArnaud ALLAVENAView Answer on Stackoverflow
Solution 9 - Sqluser3665906View Answer on Stackoverflow
Solution 10 - SqlKamil ValentaView Answer on Stackoverflow
Solution 11 - SqlCyndi BakerView Answer on Stackoverflow
Solution 12 - SqlDevin LamotheView Answer on Stackoverflow
Solution 13 - SqlHEXU55View Answer on Stackoverflow
Solution 14 - SqlMark W. BolmeView Answer on Stackoverflow
Solution 15 - SqlMadhushankar MJView Answer on Stackoverflow
Solution 16 - SqlLudovic AubertView Answer on Stackoverflow
Solution 17 - SqlZorkolotView Answer on Stackoverflow
Solution 18 - SqllostcoderView Answer on Stackoverflow
Solution 19 - SqlMohsen AghakaramiView Answer on Stackoverflow
Solution 20 - SqlJehad KeriakiView Answer on Stackoverflow
Solution 21 - SqlTerry CView Answer on Stackoverflow