Mongo $in operator performance

Mongodb

Mongodb Problem Overview


Is it slow/poor form to use the $in operator in MongoDB with a large array of possibilities?

posts.find({
    author : {
        $in : ['friend1','friend2','friend3'....'friend40'] 
    }
})

App Engine, for example, won't let you use more than 30 because they translate directly to one query per item in the IN array, and so instead force you into using their method for handling fan out. While that's probably the most efficient method in Mongo too, the code for it is significantly more complex so I'd prefer to just use this generic method.

Will Mongo execute these $in queries efficiently for reasonable-sized datasets?

Mongodb Solutions


Solution 1 - Mongodb

It can be fairly efficient with small lists (hard to say what small is, but at least into the tens/hundreds) for $in. It does not work like app-engine since mongodb has actual btree indexes and isn't a column store like bigtable.

With $in it will skip around in the index to find the matching documents, or walk through the whole collection if there isn't an index to use.

Solution 2 - Mongodb

Assuming you have created index on the author field, from algorithmic point of view, the time complexity of $in operation is: $(N*log(M)), where N is the length of input array and M is the size of the collection.

The time complexity of $in operation will not change unless you change a database (Though I don't think any db can break O(N*log(M))).

However, from engineering point of view, if N goes to a big number, it is better to let your business logic server to simulate the $in operation, either by batch or one-by-one.

This is simply because: memory in database servers is way more valuable than the memory in business logic servers.

Solution 3 - Mongodb

If you build an index (ensureIndex) on the list element, it should be pretty quick.

Have you tried using explain()? Its a good, built-in way to profile your queries: http://www.mongodb.org/display/DOCS/Indexing+Advice+and+FAQ#IndexingAdviceandFAQ-Use%7B%7Bexplain%7D%7D.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionDerek DahmerView Question on Stackoverflow
Solution 1 - MongodbScott HernandezView Answer on Stackoverflow
Solution 2 - MongodbMingView Answer on Stackoverflow
Solution 3 - MongodbAlexView Answer on Stackoverflow