Find MongoDB records where array field is not empty

MongodbMongoose

Mongodb Problem Overview


All of my records have a field called "pictures". This field is an array of strings.

I now want the newest 10 records where this array IS NOT empty.

I've googled around, but strangely enough I haven't found much on this. I've read into the $where option, but I was wondering how slow that is to native functions, and if there is a better solution.

And even then, that does not work:

ME.find({$where: 'this.pictures.length > 0'}).sort('-created').limit(10).execFind()

Returns nothing. Leaving this.pictures without the length bit does work, but then it also returns empty records, of course.

Mongodb Solutions


Solution 1 - Mongodb

If you also have documents that don't have the key, you can use:

ME.find({ pictures: { $exists: true, $not: {$size: 0} } })

MongoDB doesn't use indexes if $size is involved, so here is a better solution:

ME.find({ pictures: { $exists: true, $ne: [] } })

If your property can have invalid values (like null boolean or others) , then you an add an additional check using $types as proposed in this answer:

With mongo >= 3.2:

ME.find({ pictures: { $exists: true, $type: 'array', $ne: [] } })

With mongo < 3.2:

ME.find({ pictures: { $exists: true, $type: 4, $ne: [] } })

Since the MongoDB 2.6 release, you can compare with the operator $gt, but this could lead to unexpected results (you can find a detailed explanation in this answer):

ME.find({ pictures: { $gt: [] } })

Solution 2 - Mongodb

After some more looking, especially in the mongodb documents, and puzzling bits together, this was the answer:

ME.find({pictures: {$exists: true, $not: {$size: 0}}})

Solution 3 - Mongodb

This might also work for you:

ME.find({'pictures.0': {$exists: true}});

Solution 4 - Mongodb

You care about two things when querying - accuracy and performance. With that in mind, I tested a few different approaches in MongoDB v3.0.14.

TL;DR db.doc.find({ nums: { $gt: -Infinity }}) is the fastest and most reliable (at least in the MongoDB version I tested).

EDIT: This no longer works in MongoDB v3.6! See the comments under this post for a potential solution.

Setup

I inserted 1k docs w/o a list field, 1k docs with an empty list, and 5 docs with a non-empty list.

for (var i = 0; i < 1000; i++) { db.doc.insert({}); }
for (var i = 0; i < 1000; i++) { db.doc.insert({ nums: [] }); }
for (var i = 0; i < 5; i++) { db.doc.insert({ nums: [1, 2, 3] }); }
db.doc.createIndex({ nums: 1 });

I recognize this isn't enough of a scale to take performance as seriously as I am in the tests below, but it's enough to present the correctness of various queries and behavior of chosen query plans.

Tests

db.doc.find({'nums': {'$exists': true}}) returns wrong results (for what we're trying to accomplish).

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': {'$exists': true}}).count()
1005

--

db.doc.find({'nums.0': {'$exists': true}}) returns correct results, but it's also slow using a full collection scan (notice COLLSCAN stage in the explanation).

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums.0': {'$exists': true}}).count()
5
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums.0': {'$exists': true}}).explain()
{
  "queryPlanner": {
    "plannerVersion": 1,
    "namespace": "test.doc",
    "indexFilterSet": false,
    "parsedQuery": {
      "nums.0": {
        "$exists": true
      }
    },
    "winningPlan": {
      "stage": "COLLSCAN",
      "filter": {
        "nums.0": {
          "$exists": true
        }
      },
      "direction": "forward"
    },
    "rejectedPlans": [ ]
  },
  "serverInfo": {
    "host": "MacBook-Pro",
    "port": 27017,
    "version": "3.0.14",
    "gitVersion": "08352afcca24bfc145240a0fac9d28b978ab77f3"
  },
  "ok": 1
}

--

db.doc.find({'nums': { $exists: true, $gt: { '$size': 0 }}}) returns wrong results. That's because of an invalid index scan advancing no documents. It will likely be accurate but slow without the index.

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $gt: { '$size': 0 }}}).count()
0
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $gt: { '$size': 0 }}}).explain('executionStats').executionStats.executionStages
{
  "stage": "KEEP_MUTATIONS",
  "nReturned": 0,
  "executionTimeMillisEstimate": 0,
  "works": 2,
  "advanced": 0,
  "needTime": 0,
  "needFetch": 0,
  "saveState": 0,
  "restoreState": 0,
  "isEOF": 1,
  "invalidates": 0,
  "inputStage": {
    "stage": "FETCH",
    "filter": {
      "$and": [
        {
          "nums": {
            "$gt": {
              "$size": 0
            }
          }
        },
        {
          "nums": {
            "$exists": true
          }
        }
      ]
    },
    "nReturned": 0,
    "executionTimeMillisEstimate": 0,
    "works": 1,
    "advanced": 0,
    "needTime": 0,
    "needFetch": 0,
    "saveState": 0,
    "restoreState": 0,
    "isEOF": 1,
    "invalidates": 0,
    "docsExamined": 0,
    "alreadyHasObj": 0,
    "inputStage": {
      "stage": "IXSCAN",
      "nReturned": 0,
      "executionTimeMillisEstimate": 0,
      "works": 1,
      "advanced": 0,
      "needTime": 0,
      "needFetch": 0,
      "saveState": 0,
      "restoreState": 0,
      "isEOF": 1,
      "invalidates": 0,
      "keyPattern": {
        "nums": 1
      },
      "indexName": "nums_1",
      "isMultiKey": true,
      "direction": "forward",
      "indexBounds": {
        "nums": [
          "({ $size: 0.0 }, [])"
        ]
      },
      "keysExamined": 0,
      "dupsTested": 0,
      "dupsDropped": 0,
      "seenInvalidated": 0,
      "matchTested": 0
    }
  }
}

--

db.doc.find({'nums': { $exists: true, $not: { '$size': 0 }}}) returns correct results, but the performance is bad. It technically does an index scan, but then it still advances all the docs and then has to filter through them).

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $not: { '$size': 0 }}}).count()
5
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $not: { '$size': 0 }}}).explain('executionStats').executionStats.executionStages
{
  "stage": "KEEP_MUTATIONS",
  "nReturned": 5,
  "executionTimeMillisEstimate": 0,
  "works": 2016,
  "advanced": 5,
  "needTime": 2010,
  "needFetch": 0,
  "saveState": 15,
  "restoreState": 15,
  "isEOF": 1,
  "invalidates": 0,
  "inputStage": {
    "stage": "FETCH",
    "filter": {
      "$and": [
        {
          "nums": {
            "$exists": true
          }
        },
        {
          "$not": {
            "nums": {
              "$size": 0
            }
          }
        }
      ]
    },
    "nReturned": 5,
    "executionTimeMillisEstimate": 0,
    "works": 2016,
    "advanced": 5,
    "needTime": 2010,
    "needFetch": 0,
    "saveState": 15,
    "restoreState": 15,
    "isEOF": 1,
    "invalidates": 0,
    "docsExamined": 2005,
    "alreadyHasObj": 0,
    "inputStage": {
      "stage": "IXSCAN",
      "nReturned": 2005,
      "executionTimeMillisEstimate": 0,
      "works": 2015,
      "advanced": 2005,
      "needTime": 10,
      "needFetch": 0,
      "saveState": 15,
      "restoreState": 15,
      "isEOF": 1,
      "invalidates": 0,
      "keyPattern": {
        "nums": 1
      },
      "indexName": "nums_1",
      "isMultiKey": true,
      "direction": "forward",
      "indexBounds": {
        "nums": [
          "[MinKey, MaxKey]"
        ]
      },
      "keysExamined": 2015,
      "dupsTested": 2015,
      "dupsDropped": 10,
      "seenInvalidated": 0,
      "matchTested": 0
    }
  }
}

--

db.doc.find({'nums': { $exists: true, $ne: [] }}) returns correct results and is slightly faster, but the performance is still not ideal. It uses IXSCAN which only advances docs with an existing list field, but then has to filter out the empty lists one by one.

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $ne: [] }}).count()
5
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $exists: true, $ne: [] }}).explain('executionStats').executionStats.executionStages
{
  "stage": "KEEP_MUTATIONS",
  "nReturned": 5,
  "executionTimeMillisEstimate": 0,
  "works": 1018,
  "advanced": 5,
  "needTime": 1011,
  "needFetch": 0,
  "saveState": 15,
  "restoreState": 15,
  "isEOF": 1,
  "invalidates": 0,
  "inputStage": {
    "stage": "FETCH",
    "filter": {
      "$and": [
        {
          "$not": {
            "nums": {
              "$eq": [ ]
            }
          }
        },
        {
          "nums": {
            "$exists": true
          }
        }
      ]
    },
    "nReturned": 5,
    "executionTimeMillisEstimate": 0,
    "works": 1017,
    "advanced": 5,
    "needTime": 1011,
    "needFetch": 0,
    "saveState": 15,
    "restoreState": 15,
    "isEOF": 1,
    "invalidates": 0,
    "docsExamined": 1005,
    "alreadyHasObj": 0,
    "inputStage": {
      "stage": "IXSCAN",
      "nReturned": 1005,
      "executionTimeMillisEstimate": 0,
      "works": 1016,
      "advanced": 1005,
      "needTime": 11,
      "needFetch": 0,
      "saveState": 15,
      "restoreState": 15,
      "isEOF": 1,
      "invalidates": 0,
      "keyPattern": {
        "nums": 1
      },
      "indexName": "nums_1",
      "isMultiKey": true,
      "direction": "forward",
      "indexBounds": {
        "nums": [
          "[MinKey, undefined)",
          "(undefined, [])",
          "([], MaxKey]"
        ]
      },
      "keysExamined": 1016,
      "dupsTested": 1015,
      "dupsDropped": 10,
      "seenInvalidated": 0,
      "matchTested": 0
    }
  }
}

--

db.doc.find({'nums': { $gt: [] }}) IS DANGEROUS BECAUSE DEPENDING ON THE INDEX USED IT MIGHT GIVE UNEXPECTED RESULTS. That's because of an invalid index scan which advances no documents.

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: [] }}).count()
0
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: [] }}).hint({ nums: 1 }).count()
0
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: [] }}).hint({ _id: 1 }).count()
5

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: [] }}).explain('executionStats').executionStats.executionStages
{
  "stage": "KEEP_MUTATIONS",
  "nReturned": 0,
  "executionTimeMillisEstimate": 0,
  "works": 1,
  "advanced": 0,
  "needTime": 0,
  "needFetch": 0,
  "saveState": 0,
  "restoreState": 0,
  "isEOF": 1,
  "invalidates": 0,
  "inputStage": {
    "stage": "FETCH",
    "filter": {
      "nums": {
        "$gt": [ ]
      }
    },
    "nReturned": 0,
    "executionTimeMillisEstimate": 0,
    "works": 1,
    "advanced": 0,
    "needTime": 0,
    "needFetch": 0,
    "saveState": 0,
    "restoreState": 0,
    "isEOF": 1,
    "invalidates": 0,
    "docsExamined": 0,
    "alreadyHasObj": 0,
    "inputStage": {
      "stage": "IXSCAN",
      "nReturned": 0,
      "executionTimeMillisEstimate": 0,
      "works": 1,
      "advanced": 0,
      "needTime": 0,
      "needFetch": 0,
      "saveState": 0,
      "restoreState": 0,
      "isEOF": 1,
      "invalidates": 0,
      "keyPattern": {
        "nums": 1
      },
      "indexName": "nums_1",
      "isMultiKey": true,
      "direction": "forward",
      "indexBounds": {
        "nums": [
          "([], BinData(0, ))"
        ]
      },
      "keysExamined": 0,
      "dupsTested": 0,
      "dupsDropped": 0,
      "seenInvalidated": 0,
      "matchTested": 0
    }
  }
}

--

db.doc.find({'nums.0’: { $gt: -Infinity }}) returns correct results, but has bad performance (uses a full collection scan).

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums.0': { $gt: -Infinity }}).count()
5
MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums.0': { $gt: -Infinity }}).explain('executionStats').executionStats.executionStages
{
  "stage": "COLLSCAN",
  "filter": {
    "nums.0": {
      "$gt": -Infinity
    }
  },
  "nReturned": 5,
  "executionTimeMillisEstimate": 0,
  "works": 2007,
  "advanced": 5,
  "needTime": 2001,
  "needFetch": 0,
  "saveState": 15,
  "restoreState": 15,
  "isEOF": 1,
  "invalidates": 0,
  "direction": "forward",
  "docsExamined": 2005
}

--

db.doc.find({'nums': { $gt: -Infinity }}) surprisingly, this works very well! It gives the right results and it's fast, advancing 5 docs from the index scan phase.

MacBook-Pro(mongod-3.0.14) test> db.doc.find({'nums': { $gt: -Infinity }}).explain('executionStats').executionStats.executionStages
{
  "stage": "FETCH",
  "nReturned": 5,
  "executionTimeMillisEstimate": 0,
  "works": 16,
  "advanced": 5,
  "needTime": 10,
  "needFetch": 0,
  "saveState": 0,
  "restoreState": 0,
  "isEOF": 1,
  "invalidates": 0,
  "docsExamined": 5,
  "alreadyHasObj": 0,
  "inputStage": {
    "stage": "IXSCAN",
    "nReturned": 5,
    "executionTimeMillisEstimate": 0,
    "works": 15,
    "advanced": 5,
    "needTime": 10,
    "needFetch": 0,
    "saveState": 0,
    "restoreState": 0,
    "isEOF": 1,
    "invalidates": 0,
    "keyPattern": {
      "nums": 1
    },
    "indexName": "nums_1",
    "isMultiKey": true,
    "direction": "forward",
    "indexBounds": {
      "nums": [
        "(-inf.0, inf.0]"
      ]
    },
    "keysExamined": 15,
    "dupsTested": 15,
    "dupsDropped": 10,
    "seenInvalidated": 0,
    "matchTested": 0
  }
}

Solution 5 - Mongodb

Starting with the 2.6 release, another way to do this is to compare the field to an empty array:

ME.find({pictures: {$gt: []}})

Testing it out in the shell:

> db.ME.insert([{pictures: [1,2,3]},
{pictures: []},
{pictures: ['']},
{pictures: [0]},
{pictures: 1},
{foobar: 1}
])

> db.ME.find({pictures: {$gt: []}})
{ "_id": ObjectId("54d4d9ff96340090b6c1c4a7"), "pictures": [ 1, 2, 3 ] }
{ "_id": ObjectId("54d4d9ff96340090b6c1c4a9"), "pictures": [ "" ] }
{ "_id": ObjectId("54d4d9ff96340090b6c1c4aa"), "pictures": [ 0 ] }

So it properly includes the docs where pictures has at least one array element, and excludes the docs where pictures is either an empty array, not an array, or missing.

Solution 6 - Mongodb

Retrieve all and only the documents where 'pictures' is an array and is not empty

ME.find({pictures: {$type: 'array', $ne: []}})

If using a MongoDb version prior to 3.2, use $type: 4 instead of $type: 'array'. Notice that this solution doesn't even use $size, so there's no problem with indexes ("Queries cannot use indexes for the $size portion of a query")

Other solutions, including these (accepted answer):

> ME.find({ pictures: { $exists: true, $not: {$size: 0} } }); > ME.find({ pictures: { $exists: true, $ne: [] } })

are wrong because they return documents even if, for example, 'pictures' is null, undefined, 0, etc.

Solution 7 - Mongodb

You can use any of the following to achieve this.
Both also take care of not returning a result for objects that don't have the requested key in them:

db.video.find({pictures: {$exists: true, $gt: {$size: 0}}})
db.video.find({comments: {$exists: true, $not: {$size: 0}}})

Solution 8 - Mongodb

Use the $elemMatch operator: according to the documentation

> The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria.

$elemMatches makes sure that the value is an array and that it is not empty. So the query would be something like

ME.find({ pictures: { $elemMatch: {$exists: true }}})

PS A variant of this code is found in MongoDB University's M121 course.

Solution 9 - Mongodb

db.find({ pictures: { $elemMatch: { $exists: true } } })

$elemMatch matches documents that contain an array field with at least one element that matches the specified query.

So you're matching all arrays with at least an element.

Solution 10 - Mongodb

You can also use the helper method Exists over the Mongo operator $exists

ME.find()
    .exists('pictures')
    .where('pictures').ne([])
    .sort('-created')
    .limit(10)
    .exec(function(err, results){
        ...
    });

Solution 11 - Mongodb

{ $where: "this.pictures.length > 1" }

use the $where and pass the this.field_name.length which return the size of array field and check it by comparing with number. if any array have any value than array size must be at least 1. so all the array field have length more than one, it means it have some data in that array

Solution 12 - Mongodb

This also works:

db.getCollection('collectionName').find({'arrayName': {$elemMatch:{}}})

Solution 13 - Mongodb

ME.find({pictures: {$exists: true}}) 

Simple as that, this worked for me.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionskeritView Question on Stackoverflow
Solution 1 - MongodbChris'View Answer on Stackoverflow
Solution 2 - MongodbskeritView Answer on Stackoverflow
Solution 3 - MongodbtenbatsuView Answer on Stackoverflow
Solution 4 - MongodbwojcikstefanView Answer on Stackoverflow
Solution 5 - MongodbJohnnyHKView Answer on Stackoverflow
Solution 6 - MongodbSC1000View Answer on Stackoverflow
Solution 7 - MongodbPaul ImisiView Answer on Stackoverflow
Solution 8 - MongodbAndres MorenoView Answer on Stackoverflow
Solution 9 - MongodbBananeView Answer on Stackoverflow
Solution 10 - MongodbEat at JoesView Answer on Stackoverflow
Solution 11 - MongodbPrabhat YadavView Answer on Stackoverflow
Solution 12 - MongodbDaveView Answer on Stackoverflow
Solution 13 - MongodbLuis FletesView Answer on Stackoverflow