how to emulate "insert ignore" and "on duplicate key update" (sql merge) with postgresql?

Postgresql

Postgresql Problem Overview


Some SQL servers have a feature where INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.

What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?

Postgresql Solutions


Solution 1 - Postgresql

With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):

>INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT") > >9.5 brings support for "UPSERT" operations. INSERT is extended to accept an ON CONFLICT DO UPDATE/IGNORE clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.

...

>Further example of new syntax:

INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1) 
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;

Solution 2 - Postgresql

Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!


Building on Bill Karwin's answer, to spell out what a rule based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):

CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
  WHERE EXISTS(SELECT 1 FROM my_table 
                WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
  DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";

Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.

Solution 3 - Postgresql

For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:

INSERT INTO target_table (field_one, field_two, field_three ) 
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;

For those of us who have an earlier version, this right join will work instead:

INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table 
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;

Solution 4 - Postgresql

Try to do an UPDATE. If it doesn't modify any row that means it didn't exist, so do an insert. Obviously, you do this inside a transaction.

You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a loop for the very rare race condition in that thinking.

There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.

That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.

This works for single row, or few row, values. If you're dealing with large amounts of rows for example from a subquery, you're best of splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect of course - no need to write your main filter twice)

Solution 5 - Postgresql

To get the insert ignore logic you can do something like below. I found simply inserting from a select statement of literal values worked best, then you can mask out the duplicate keys with a NOT EXISTS clause. To get the update on duplicate logic I suspect a pl/pgsql loop would be necessary.

INSERT INTO manager.vin_manufacturer
(SELECT * FROM( VALUES
  ('935',' Citroën Brazil','Citroën'),
  ('ABC', 'Toyota', 'Toyota'),
  ('ZOM',' OM','OM')
  ) as tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
  WHERE NOT EXISTS (
    --ignore anything that has already been inserted
	SELECT 1 FROM manager.vin_manufacturer m where m.vin_manufacturer_id = tmp.vin_manufacturer_id)
)

Solution 6 - Postgresql

INSERT INTO mytable(col1,col2) 
    SELECT 'val1','val2' 
    WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1')

Solution 7 - Postgresql

As @hanmari mentioned in his comment. when inserting into a postgres tables, the on conflict (..) do nothing is the best code to use for not inserting duplicate data.:

query = "INSERT INTO db_table_name(column_name)
         VALUES(%s) ON CONFLICT (column_name) DO NOTHING;"

The ON CONFLICT line of code will allow the insert statement to still insert rows of data. The query and values code is an example of inserted date from a Excel into a postgres db table. I have constraints added to a postgres table I use to make sure the ID field is unique. Instead of running a delete on rows of data that is the same, I add a line of sql code that renumbers the ID column starting at 1. Example:

q = 'ALTER id_column serial RESTART WITH 1'

If my data has an ID field, I do not use this as the primary ID/serial ID, I create a ID column and I set it to serial. I hope this information is helpful to everyone. *I have no college degree in software development/coding. Everything I know in coding, I study on my own.

Solution 8 - Postgresql

Looks like PostgreSQL supports a schema object called a rule.

http://www.postgresql.org/docs/current/static/rules-update.html

You could create a rule ON INSERT for a given table, making it do NOTHING if a row exists with the given primary key value, or else making it do an UPDATE instead of the INSERT if a row exists with the given primary key value.

I haven't tried this myself, so I can't speak from experience or offer an example.

Solution 9 - Postgresql

This solution avoids using rules:

BEGIN
   INSERT INTO tableA (unique_column,c2,c3) VALUES (1,2,3);
EXCEPTION 
   WHEN unique_violation THEN
     UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;

but it has a performance drawback (see PostgreSQL.org):

> A block containing an EXCEPTION clause is significantly more expensive > to enter and exit than a block without one. Therefore, don't use > EXCEPTION without need.

Solution 10 - Postgresql

On bulk, you can always delete the row before the insert. A deletion of a row that doesn't exist doesn't cause an error, so its safely skipped.

Solution 11 - Postgresql

For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:

DO
$do$
BEGIN
PERFORM id
FROM whatever_table;

IF NOT FOUND THEN
-- INSERT stuff
END IF;
END
$do$;

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestiongpilotinoView Question on Stackoverflow
Solution 1 - PostgresqlwarrenView Answer on Stackoverflow
Solution 2 - PostgresqlEoghanMView Answer on Stackoverflow
Solution 3 - PostgresqlhanmariView Answer on Stackoverflow
Solution 4 - PostgresqlMagnus HaganderView Answer on Stackoverflow
Solution 5 - PostgresqlKeyoView Answer on Stackoverflow
Solution 6 - Postgresqluser2342158View Answer on Stackoverflow
Solution 7 - PostgresqlYankeeownzView Answer on Stackoverflow
Solution 8 - PostgresqlBill KarwinView Answer on Stackoverflow
Solution 9 - PostgresqlNumberFourView Answer on Stackoverflow
Solution 10 - PostgresqlDavid NoriegaView Answer on Stackoverflow
Solution 11 - Postgresqlanalytik_workView Answer on Stackoverflow