How to stop insertion of Duplicate documents in a mongodb collection


Mongodb Problem Overview


Suppose we have a MongoDB collection containing three documents:

db.collection.find()

 { _id:'...', user: 'A', title: 'Physics',   Bank: 'Bank_A' }
 { _id:'...', user: 'A', title: 'Chemistry', Bank: 'Bank_B' }
 { _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }

We have a new document:

 doc = { user: 'B', title: 'Chemistry', Bank:'Bank_A' }

If we call

 db.collection.insert(doc)

the duplicate document is inserted into the collection:

 { _id:'...', user: 'A', title: 'Physics',   Bank: 'Bank_A' }
 { _id:'...', user: 'A', title: 'Chemistry', Bank: 'Bank_B' }
 { _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }
 { _id:'...', user: 'B', title: 'Chemistry', Bank: 'Bank_A' }

How can this duplicate be prevented? Which fields should be indexed, or is there another approach?

Mongodb Solutions


Solution 1 - Mongodb

Don't use insert.

Use update with upsert: true. update looks for a document that matches your query and modifies the fields you specify. If you set upsert: true, it inserts a new document when no document matches the query.

db.collection.update(
   <query>,
   <update>,
   {
     upsert: <boolean>,
     multi: <boolean>,
     writeConcern: <document>
   }
)

So, for your example, you could use something like this:

db.collection.update(doc, doc, {upsert:true})
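The same idea can be sketched with updateOne and the $setOnInsert operator, assuming that every field of the document is part of its identity. The helper name buildInsertIfAbsent is hypothetical and only builds the arguments; the actual call is shown in a comment because it needs a live connection.

```javascript
// Hypothetical helper (not part of MongoDB): builds the three arguments for
// updateOne() so that `doc` is inserted only when no document matches it.
function buildInsertIfAbsent(doc) {
  return {
    filter: { ...doc },                    // match on every field of the doc
    update: { $setOnInsert: { ...doc } },  // applied only when the upsert inserts
    options: { upsert: true },             // insert when nothing matches
  };
}

const doc = { user: 'B', title: 'Chemistry', Bank: 'Bank_A' };
const args = buildInsertIfAbsent(doc);

// With a live connection (mongosh or a driver), the call would be:
// db.collection.updateOne(args.filter, args.update, args.options)
console.log(JSON.stringify(args));
```

Note that an upsert on its own does not protect against two concurrent writers inserting the same document at the same time; a unique index is the usual safeguard for that case.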

Solution 2 - Mongodb

You should use a compound index on the set of fields that uniquely identify a document within your MongoDB collection. For example, if you decide that the combination of user, title and Bank are your unique key you would issue the following command:

db.collection.createIndex( { user: 1, title: 1, Bank: 1 }, {unique:true} )

Please note that this should be done after you have removed previously stored duplicates, because creating a unique index fails if the collection already contains duplicate key values.
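One way to locate existing duplicates before creating the index is an aggregation that groups on the key fields. This is a sketch only: the helper name buildDuplicateFinder is hypothetical, and the key fields are assumed to be user, title and Bank as in the example above.

```javascript
// Hypothetical helper: builds an aggregation pipeline that groups documents
// by the given key fields and keeps only groups with more than one member.
function buildDuplicateFinder(keyFields) {
  const groupKey = {};
  for (const field of keyFields) {
    groupKey[field] = '$' + field;   // e.g. { user: '$user' }
  }
  return [
    { $group: { _id: groupKey, ids: { $push: '$_id' }, count: { $sum: 1 } } },
    { $match: { count: { $gt: 1 } } },
  ];
}

const pipeline = buildDuplicateFinder(['user', 'title', 'Bank']);

// With a live connection: db.collection.aggregate(pipeline)
// Each result lists the _id values that share one key; keep one of them
// and delete the rest before creating the unique index.
console.log(JSON.stringify(pipeline, null, 2));
```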

http://docs.mongodb.org/manual/tutorial/create-a-compound-index/

http://docs.mongodb.org/manual/tutorial/create-a-unique-index/
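Once a unique index like the one above exists, inserting a duplicate fails with a duplicate key error (code 11000), so the application has to handle that error. A minimal sketch follows; the helper names are hypothetical, and `collection` is assumed to be any object exposing insertOne(), as driver collection objects do.

```javascript
// MongoDB reports a unique-index violation with error code 11000 (E11000).
function isDuplicateKeyError(err) {
  return Boolean(err) && err.code === 11000;
}

// Hypothetical helper: resolves to true if the document was inserted and to
// false if the unique index rejected it as a duplicate.
async function insertUnlessDuplicate(collection, newDoc) {
  try {
    await collection.insertOne(newDoc);
    return true;
  } catch (err) {
    if (isDuplicateKeyError(err)) return false; // rejected by the unique index
    throw err;                                   // anything else is a real failure
  }
}
```

With this approach the database enforces uniqueness, so the check stays correct even when several clients insert at the same time.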

Solution 3 - Mongodb

This is an update to the answers above.

Use db.collection.updateOne() instead of db.collection.update(), and db.collection.createIndexes() instead of db.collection.ensureIndex().

Update: the methods update() and ensureIndex() have been deprecated since mongodb 2.*; for details, see the source at ./mongodb/lib/collection.js. For update(), the recommended methods are updateOne, updateMany, or bulkWrite. For ensureIndex(), the recommended method is createIndexes.

Solution 4 - Mongodb

This may be a bit slower than the other approaches, but it also works and can be used inside a loop:

db.collection.replaceOne(query, data, {upsert: true})

The query may be something like:

{ _id: '5f915390950f276680720b57' }

https://docs.mongodb.com/manual/reference/method/db.collection.replaceOne
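The loop mentioned above might look like the following sketch. The helper name upsertAll is hypothetical, and matching on a list of key fields (rather than on _id as in the query shown) is an assumption made for illustration. `collection` is assumed to expose replaceOne(filter, replacement, options), as driver collection objects do.

```javascript
// Hypothetical helper: upserts each document with replaceOne(), matching on
// the fields named in `keyFields`.
async function upsertAll(collection, docs, keyFields) {
  for (const item of docs) {
    const query = {};
    for (const field of keyFields) {
      query[field] = item[field];          // e.g. { user: 'B', title: 'Chemistry' }
    }
    // Replaces the matching document, or inserts `item` if none matches.
    await collection.replaceOne(query, item, { upsert: true });
  }
}

// Example call, given a connected collection:
// await upsertAll(collection, docs, ['user', 'title', 'Bank']);
```

Each iteration is a separate round trip to the server, which is one reason this can be slower than the other approaches.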

Solution 5 - Mongodb

What you are looking for is AddToSet rather than Push or Insert. Using the upsert flag doesn't seem to work for me.

For example, with the C# driver: var updateSet = Builders<T>.Update.AddToSet(collectionField, value);

Note that AddToSet seems to do a value comparison.
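In the shell, the corresponding update operator is $addToSet. It removes duplicates among the values of an array field inside a single document, so it only fits a schema where the repeated values live in an array. The sketch below assumes such a schema; the helper name buildAddToSet and the field name titles are hypothetical.

```javascript
// Hypothetical helper: builds an update that appends `value` to the array
// stored in `field` only if an equal value is not already present.
function buildAddToSet(field, value) {
  return { $addToSet: { [field]: value } };
}

const titleUpdate = buildAddToSet('titles', 'Chemistry');

// With a live connection: db.collection.updateOne({ user: 'B' }, titleUpdate)
console.log(JSON.stringify(titleUpdate));
```

This does not stop duplicate top-level documents from being inserted into the collection.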

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | shashank | View Question on Stackoverflow
Solution 1 - Mongodb | Vic | View Answer on Stackoverflow
Solution 2 - Mongodb | John Petrone | View Answer on Stackoverflow
Solution 3 - Mongodb | Creem | View Answer on Stackoverflow
Solution 4 - Mongodb | Banzy | View Answer on Stackoverflow
Solution 5 - Mongodb | John Doe | View Answer on Stackoverflow