MongoDB - paging

MongodbPaging

Mongodb Problem Overview


When using MongoDB, are there any special patterns for making e.g. a paged view? say a blog that lists the 10 latest posts where you can navigate backwards to older posts.

Or do one solve it with an index on e.g. blogpost.publishdate and just skip and limit the result?

Mongodb Solutions


Solution 1 - Mongodb

Using skip+limit is not a good way to do paging when performance is an issue, or with large collections; it will get slower and slower as you increase the page number. Using skip requires the server to walk though all the documents (or index values) from 0 to the offset (skip) value.

It is much better to use a range query (+ limit) where you pass in the last page's range value. For example if you are sorting by "publishdate" you would simple pass the last "publishdate" value as the criteria for the query to get the next page of data.

Solution 2 - Mongodb

  1. Range based paging is hard to implement if you need to sort items in many ways.
  2. Remember if the field value of the sort parameter is not unique , then Range based paging will become unrealiable.

Possible solution: try to simplify the desgin , thinking about if we can only sort by id or some unique value?

And if we can , then range based pageing can be used.

The common way is use sort() , skip() and limit() to implement paging what is described above.

Solution 3 - Mongodb

Thisis the solution I used when my collection grew too large to return in a single query. It takes advantage of the inherent ordering of the _id field and allows you to loop through a collection by specified batch size.

Here it is as an npm module, mongoose-paging, full code is below:

function promiseWhile(condition, action) {
  return new Promise(function(resolve, reject) {
    process.nextTick(function loop() {
      if(!condition()) {
        resolve();
      } else {
        action().then(loop).catch(reject);
      }
    });
  });
}

function findPaged(query, fields, options, iterator, cb) {
  var Model  = this,
    step     = options.step,
    cursor   = null,
    length   = null;

  promiseWhile(function() {
    return ( length===null || length > 0 );
  }, function() {
    return new Promise(function(resolve, reject) {
        
        if(cursor) query['_id'] = { $gt: cursor };

        Model.find(query, fields, options).sort({_id: 1}).limit(step).exec(function(err, items) {
          if(err) {
            reject(err);
          } else {
            length  = items.length;
            if(length > 0) {
              cursor  = items[length - 1]._id;
              iterator(items, function(err) {
                if(err) {
                  reject(err);
                } else {
                  resolve();
                }
              });
            } else {
              resolve();
            }
          }
        });
      });
  }).then(cb).catch(cb);

}

module.exports = function(schema) {
  schema.statics.findPaged = findPaged;
};

Attach it to your model like this:

MySchema.plugin(findPaged);

Then query like this:

MyModel.findPaged(
  // mongoose query object, leave blank for all
  {source: 'email'},
  // fields to return, leave blank for all
  ['subject', 'message'],
  // number of results per page
  {step: 100},
  // iterator to call on each set of results
  function(results, cb) {
    console.log(results);
    // this is called repeatedly while until there are no more results.
    // results is an array of maximum length 100 containing the
    // results of your query

    // if all goes well
    cb();

    // if your async stuff has an error
    cb(err);
  },
  // function to call when finished looping
  function(err) {
    throw err;
    // this is called once there are no more results (err is null),
    // or if there is an error (then err is set)
  }
);

Solution 4 - Mongodb

Range based paging is doable, but you need to be smart about how you min/max the query.

If you can afford to you should try caching the results of a query in a temporary file or collection. Thanks to TTL collections in MongoDB you can insert your results into two collections.

  1. Search+User+Parameters Query (TTL whatever)
  2. Results of query (TTL whatever + cleaning interval + 1)

Using both assures you will not get partial results when the TTL is near the current time. You can utilize a simple counter when you store the results to do a VERY simple range query at that point.

Solution 5 - Mongodb

Here is an example of retrieving a list of User documents order by CreatedDate (where pageIndex is zero-based) using the official C# driver.

public void List<User> GetUsers() 
{
  var connectionString = "<a connection string>";
  var client = new MongoClient(connectionString);
  var server = client.GetServer();
  var database = server.GetDatabase("<a database name>");

  var sortBy = SortBy<User>.Descending(u => u.CreatedDate);
  var collection = database.GetCollection<User>("Users");
  var cursor = collection.FindAll();
  cursor.SetSortOrder(sortBy);

  cursor.Skip = pageIndex * pageSize;
  cursor.Limit = pageSize;
  return cursor.ToList();
}

All the sorting and paging operations are done on server side. Although this is an example in C#, I guess the same can be applied to other language ports.

See http://docs.mongodb.org/ecosystem/tutorial/use-csharp-driver/#modifying-a-cursor-before-enumerating-it.

Solution 6 - Mongodb

	// file:ad-hoc.js
	// an example of using the less binary as pager in the bash shell
	//
	// call on the shell by:
	// mongo localhost:27017/mydb ad-hoc.js | less
	//
	// note ad-hoc.js must be in your current directory
	// replace the 27017 wit the port of your mongodb instance
	// replace the mydb with the name of the db you want to query
	//
	// create the connection obj
	conn = new Mongo();

	// set the db of the connection
	// replace the mydb with the name of the db you want to query
	db = conn.getDB("mydb");

	// replace the products with the name of the collection
	// populate my the products collection
	// this is just for demo purposes - you will probably have your data already
	for (var i=0;i<1000;i++ ) {
	db.products.insert(
		[
			{ _id: i, item: "lamp", qty: 50, type: "desk" },
		],
		{ ordered: true }
	)
	}


	// replace the products with the name of the collection
	cursor = db.products.find();

	// print the collection contents
	while ( cursor.hasNext() ) {
		printjson( cursor.next() );
	}
	// eof file: ad-hoc.js

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionRoger JohanssonView Question on Stackoverflow
Solution 1 - MongodbScott HernandezView Answer on Stackoverflow
Solution 2 - MongodbjackalopeView Answer on Stackoverflow
Solution 3 - Mongodbmz3View Answer on Stackoverflow
Solution 4 - MongodbwhardierView Answer on Stackoverflow
Solution 5 - MongodbAlex HoView Answer on Stackoverflow
Solution 6 - MongodbYordan GeorgievView Answer on Stackoverflow