mongodb sort order on _id

Mongodb

Mongodb Problem Overview


I wonder how mongodb compare the "_id" field when doing query like the following:

db.data.find({"_id":{$gt:ObjectId("502aa46c0674d23e3cee6152")}}).sort({"_id":1}).limit(10);

Is it purely based on timestamp portion of the id?

Mongodb Solutions


Solution 1 - Mongodb

To expand slightly on what Andre said:

Since the ObjectID timestamp is only to the second, two (or more) ObjectIDs could easily be created with the same value for the timestamp (the first 4 bytes). If these were created on the same machine (machine ID - the next 3 bytes), by the same process (PID - the next 2 bytes), then the only thing to differentiate them would be the "inc" field, the last 3 bytes at the end.

Update: Jan 2020

This answer continues to be popular so it is worth updating a little. The ObjectID spec has evolved since this answer was written 8 years ago and the 5 bytes after the timestamp are now simply random, which will greatly decrease the likelihood of any collisions. The last three bytes are still incremental, but initialised at a random value to start, again making collisions less likely. The ObjectID now contains less context (you can't easily tell where it was generated and by what process) but I would guess that the information was not being used in any meaningful way and has been deprecated in favor of better randomisation of the ID.

End Update

See here for the full spec:

https://docs.mongodb.com/manual/reference/method/ObjectId/#ObjectIDs-BSONObjectIDSpecification

That "inc" field is either an ever incrementing field (then you can reasonably expect the sort to be in the insert/create order) or a random value (then likely unique, but not ordered), assuming the spec is implemented correctly of course. Note that the ObjectIDs may be generated by the driver, or the application (or indeed manually) rather than by MongoDB itself, so unless you have full control over how they are generated, then any or all of the above may apply.

Solution 2 - Mongodb

In a way you are correct, if you sort by the _id you will sort by the insertion time. This does not mean that the only comparison is done on the timestamp portion. ObjectID's are a BSON object type in their own right, they can be directly compared with each other. As they start with a timestamp, it follows logically that those in the past will be less than those in the future.

You can find more detail in the documentation

Solution 3 - Mongodb

copy paste from Mongo specs https://docs.mongodb.com/manual/reference/bson-types/#objectid

The relationship between the order of ObjectId values and generation time is not strict within a single second. If multiple systems, or multiple processes or threads on a single system generate values, within a single second; ObjectId values do not represent a strict insertion order. Clock skew between clients can also result in non-strict ordering even for values, because client drivers generate ObjectId values, not the mongod process.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionSamView Question on Stackoverflow
Solution 1 - MongodbAdam ComerfordView Answer on Stackoverflow
Solution 2 - MongodbAndre de FrereView Answer on Stackoverflow
Solution 3 - MongodbandreyroView Answer on Stackoverflow