Using merge..output to get mapping between source.id and target.id

Sql ServerSql Server-2008Merge

Sql Server Problem Overview


Very simplified, I have two tables Source and Target.

declare @Source table (SourceID int identity(1,2), SourceName varchar(50))
declare @Target table (TargetID int identity(2,2), TargetName varchar(50))

insert into @Source values ('Row 1'), ('Row 2')

I would like to move all rows from @Source to @Target and know the TargetID for each SourceID because there are also the tables SourceChild and TargetChild that needs to be copied as well and I need to add the new TargetID into TargetChild.TargetID FK column.

There are a couple of solutions to this.

  1. Use a while loop or cursors to insert one row (RBAR) to Target at a time and use scope_identity() to fill the FK of TargetChild.
  2. Add a temp column to @Target and insert SourceID. You can then join that column to fetch the TargetID for the FK in TargetChild.
  3. SET IDENTITY_INSERT OFF for @Target and handle assigning new values yourself. You get a range that you then use in TargetChild.TargetID.

I'm not all that fond of any of them. The one I used so far is cursors.

What I would really like to do is to use the output clause of the insert statement.

insert into @Target(TargetName)
output inserted.TargetID, S.SourceID
select SourceName
from @Source as S

But it is not possible

The multi-part identifier "S.SourceID" could not be bound.

But it is possible with a merge.

merge @Target as T
using @Source as S
on 0=1
when not matched then
  insert (TargetName) values (SourceName)
output inserted.TargetID, S.SourceID;

Result

TargetID    SourceID
----------- -----------
2           1
4           3

I want to know if you have used this? If you have any thoughts about the solution or see any problems with it? It works fine in simple scenarios but perhaps something ugly could happen when the query plan get really complicated due to a complicated source query. Worst scenario would be that the TargetID/SourceID pairs actually isn't a match.

MSDN has this to say about the from_table_name of the output clause.

> Is a column prefix that specifies a table included in the FROM clause of a DELETE, UPDATE, or MERGE statement that is used to specify the rows to update or delete.

For some reason they don't say "rows to insert, update or delete" only "rows to update or delete".

Any thoughts are welcome and totally different solutions to the original problem is much appreciated.

Sql Server Solutions


Solution 1 - Sql Server

In my opinion this is a great use of MERGE and output. I've used in several scenarios and haven't experienced any oddities to date. For example, here is test setup that clones a Folder and all Files (identity) within it into a newly created Folder (guid).

DECLARE @FolderIndex TABLE (FolderId UNIQUEIDENTIFIER PRIMARY KEY, FolderName varchar(25));
INSERT INTO @FolderIndex 
	(FolderId, FolderName)
	VALUES(newid(), 'OriginalFolder');
        
DECLARE @FileIndex TABLE (FileId int identity(1,1) PRIMARY KEY, FileName varchar(10));
INSERT INTO @FileIndex 
    (FileName)
    VALUES('test.txt');
        
DECLARE @FileFolder TABLE (FolderId UNIQUEIDENTIFIER, FileId int, PRIMARY KEY(FolderId, FileId));
INSERT INTO @FileFolder 
	(FolderId, FileId)
	SELECT	FolderId, 
			FileId
	FROM	@FolderIndex
	CROSS JOIN	@FileIndex;  -- just to illustrate
        
DECLARE @sFolder TABLE (FromFolderId UNIQUEIDENTIFIER, ToFolderId UNIQUEIDENTIFIER);
DECLARE	@sFile TABLE (FromFileId int, ToFileId int);
        
-- copy Folder Structure
MERGE @FolderIndex fi
USING	(	SELECT	1 [Dummy],
					FolderId, 
					FolderName
			FROM	@FolderIndex [fi]
			WHERE	FolderName = 'OriginalFolder'
		) d ON 	d.Dummy = 0
WHEN NOT MATCHED 
THEN INSERT 
	(FolderId, FolderName)
	VALUES (newid(), 'copy_'+FolderName)
OUTPUT	d.FolderId,
		INSERTED.FolderId
INTO	@sFolder (FromFolderId, toFolderId);

-- copy File structure
MERGE	@FileIndex fi
USING	(	SELECT	1 [Dummy],
					fi.FileId, 
					fi.[FileName]
			FROM	@FileIndex fi
			INNER
			JOIN	@FileFolder fm ON 
					fi.FileId = fm.FileId
			INNER
			JOIN	@FolderIndex fo ON 
					fm.FolderId = fo.FolderId
			WHERE	fo.FolderName = 'OriginalFolder'
		) d ON  d.Dummy = 0
WHEN NOT MATCHED 
THEN INSERT ([FileName])
	VALUES ([FileName])
OUTPUT	d.FileId,
		INSERTED.FileId
INTO	@sFile (FromFileId, toFileId);

-- link new files to Folders
INSERT INTO @FileFolder (FileId, FolderId)
	SELECT	sfi.toFileId, sfo.toFolderId
	FROM	@FileFolder fm
	INNER
	JOIN	@sFile sfi ON  
			fm.FileId = sfi.FromFileId
	INNER
	JOIN	@sFolder sfo ON 
			fm.FolderId = sfo.FromFolderId
-- return    
SELECT  * 
FROM    @FileIndex fi 
JOIN    @FileFolder ff ON  
        fi.FileId = ff.FileId 
JOIN    @FolderIndex fo ON  
        ff.FolderId = fo.FolderId

Solution 2 - Sql Server

I would like to add another example to add to @Nathan's example, as I found it somewhat confusing.

Mine uses real tables for the most part, and not temp tables.

I also got my inspiration from here: another example

-- Copy the FormSectionInstance
DECLARE @FormSectionInstanceTable TABLE(OldFormSectionInstanceId INT, NewFormSectionInstanceId INT)

;MERGE INTO [dbo].[FormSectionInstance]
USING
(
	SELECT
		fsi.FormSectionInstanceId [OldFormSectionInstanceId]
		, @NewFormHeaderId [NewFormHeaderId]
		, fsi.FormSectionId
		, fsi.IsClone
		, @UserId [NewCreatedByUserId]
		, GETDATE() NewCreatedDate
		, @UserId [NewUpdatedByUserId]
		, GETDATE() NewUpdatedDate
	FROM [dbo].[FormSectionInstance] fsi
	WHERE fsi.[FormHeaderId] = @FormHeaderId 
) tblSource ON 1=0 -- use always false condition
WHEN NOT MATCHED
THEN INSERT
( [FormHeaderId], FormSectionId, IsClone, CreatedByUserId, CreatedDate, UpdatedByUserId, UpdatedDate)
VALUES( [NewFormHeaderId], FormSectionId, IsClone, NewCreatedByUserId, NewCreatedDate, NewUpdatedByUserId, NewUpdatedDate)

OUTPUT tblSource.[OldFormSectionInstanceId], INSERTED.FormSectionInstanceId
INTO @FormSectionInstanceTable(OldFormSectionInstanceId, NewFormSectionInstanceId);


-- Copy the FormDetail
INSERT INTO [dbo].[FormDetail]
	(FormHeaderId, FormFieldId, FormSectionInstanceId, IsOther, Value, CreatedByUserId, CreatedDate, UpdatedByUserId, UpdatedDate)
SELECT
	@NewFormHeaderId, FormFieldId, fsit.NewFormSectionInstanceId, IsOther, Value, @UserId, CreatedDate, @UserId, UpdatedDate
FROM [dbo].[FormDetail] fd
INNER JOIN @FormSectionInstanceTable fsit ON fsit.OldFormSectionInstanceId = fd.FormSectionInstanceId
WHERE [FormHeaderId] = @FormHeaderId

Solution 3 - Sql Server

Here's a solution that doesn't use MERGE (which I've had problems with many times I try to avoid if possible). It relies on two memory tables (you could use temp tables if you want) with IDENTITY columns that get matched, and importantly, using ORDER BY when doing the INSERT, and WHERE conditions that match between the two INSERTs... the first one holds the source IDs and the second one holds the target IDs.

-- Setup...   We have a table that we need to know the old IDs and new IDs after copying.
-- We want to copy all of DocID=1
DECLARE @newDocID int = 99;
DECLARE @tbl table (RuleID int PRIMARY KEY NOT NULL IDENTITY(1, 1), DocID int, Val varchar(100));
INSERT INTO @tbl (DocID, Val) VALUES (1, 'RuleA-2'), (1, 'RuleA-1'), (2, 'RuleB-1'), (2, 'RuleB-2'), (3, 'RuleC-1'), (1, 'RuleA-3')

-- Create a break in IDENTITY values.. just to simulate more realistic data
INSERT INTO @tbl (Val) VALUES ('DeleteMe'), ('DeleteMe');
DELETE FROM @tbl WHERE Val = 'DeleteMe';
INSERT INTO @tbl (DocID, Val) VALUES (6, 'RuleE'), (7, 'RuleF');

SELECT * FROM @tbl t;

-- Declare TWO temp tables each with an IDENTITY - one will hold the RuleID of the items we are copying, other will hold the RuleID that we create
DECLARE @input table (RID int IDENTITY(1, 1), SourceRuleID int NOT NULL, Val varchar(100));
DECLARE @output table (RID int IDENTITY(1,1), TargetRuleID int NOT NULL, Val varchar(100));

-- Capture the IDs of the rows we will be copying by inserting them into the @input table
-- Important - we must specify the sort order - best thing is to use the IDENTITY of the source table (t.RuleID) that we are copying
INSERT INTO @input (SourceRuleID, Val) SELECT t.RuleID, t.Val FROM @tbl t WHERE t.DocID = 1 ORDER BY t.RuleID;

-- Copy the rows, and use the OUTPUT clause to capture the IDs of the inserted rows.
-- Important - we must use the same WHERE and ORDER BY clauses as above
INSERT INTO @tbl (DocID, Val)
OUTPUT Inserted.RuleID, Inserted.Val INTO @output(TargetRuleID, Val)
SELECT @newDocID, t.Val FROM @tbl t 
WHERE t.DocID = 1
ORDER BY t.RuleID;

-- Now @input and @output should have the same # of rows, and the order of both inserts was the same, so the IDENTITY columns (RID) can be matched
-- Use this as the map from old-to-new when you are copying sub-table rows
-- Technically, @input and @output don't even need the 'Val' columns, just RID and RuleID - they were included here to prove that the rules matched
SELECT i.*, o.* FROM @output o
INNER JOIN @input i ON i.RID = o.RID

-- Confirm the matching worked
SELECT * FROM @tbl t

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMikael ErikssonView Question on Stackoverflow
Solution 1 - Sql Servernathan_jrView Answer on Stackoverflow
Solution 2 - Sql ServerDragos DurlutView Answer on Stackoverflow
Solution 3 - Sql ServerMichael BrayView Answer on Stackoverflow