Why Is MongoDB So Fast

Mongodb

Mongodb Problem Overview


I was showing my co-worker performance benchmarks of MongoDB vs SQL Server 2008, and while he believes MongoDB is faster, he doesn't understand how it's possible. His logic was that SQL has been around for decades and has some of the smartest people working on it, so how can MongoDB, a relatively new kid on the block, be so superior in performance? I wasn't able to provide a solid technical answer, and I was hoping you guys could assist.

Mongodb Solutions


Solution 1 - Mongodb

MongoDB is fast because it's web scale!

It's a fun video and well worth watching, and it does answer your question: most NoSQL engines like MongoDB are not as robust or as resilient to crashes and other outages. That durability is what they sacrifice to gain speed.

Solution 2 - Mongodb

MongoDB isn't a traditional relational database. It's a NoSQL, document-based store that provides weaker consistency guarantees than a SQL database has to.

Solution 3 - Mongodb

SQL has to do quite a lot; Mongo just has to drop bits onto disk (almost).

Solution 4 - Mongodb

As has been mentioned, MongoDB wasn't created to be, and shouldn't be used as, a SQL database. SQL (and other relational databases) store relational data; that is, data in table X can be set up to have direct relations to information in table Y. MongoDB doesn't have this ability, and can therefore drop a lot of overhead. This is why MongoDB is usually used to store lists, not relations.

Add in the fact that it isn't quite ACID compliant yet (though it has made large strides since it was first introduced) and that's the bulk of the speed difference.

Here are the differences outlined on the actual site between a full transactional model and their model.

In practice, the non-transactional model of MongoDB has the following implications:

  • No rollbacks. Your code must function without rollbacks. Check all programmatic conditions before performing the first database write operation. Order your write operations such that the most important operation occurs last.
  • Explicit locking. Your code may explicitly lock objects when performing operations. Thus, the application programmer has the capability to ensure "serializability" when required. Locking functionality will be available in late alpha / early beta release of MongoDB.
  • Database check on startup. Should the database abnormally terminate (rare), a database check procedure will automatically run on startup (similar to fsck).

Solution 5 - Mongodb

While the other answers are interesting I would add that one of the reasons MongoDB is "so fast", at least in benchmarks, is the write concern.

You can read more about the different write concerns here but basically you can define the level of "security" you want when writing data.

The default level used to be unacknowledged, which means the write operation is just triggered but the driver does not check if it performed successfully. It is faster, but way less reliable.

They changed it to acknowledged about one year ago, but I guess most of the benchmarks out there still use the 'unacknowledged' mode for better results.

If you want to see the difference in terms of performance, you can check this article (a bit old, but it still gives an idea).
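
The reliability side of the two modes can be sketched without a real server. This is a toy simulation in plain Python, not the MongoDB driver API: `FakeServer`, `write_unacknowledged`, and `write_acknowledged` are hypothetical names made up for illustration. It only shows what the client learns in each mode, not the timing.

```python
class DuplicateKeyError(Exception):
    """Raised by the toy server when an _id is already stored."""


class FakeServer:
    """Stand-in for a database server (hypothetical, not a MongoDB API)."""

    def __init__(self):
        self.docs = {}

    def insert(self, doc):
        if doc["_id"] in self.docs:
            raise DuplicateKeyError(doc["_id"])
        self.docs[doc["_id"]] = doc


def write_unacknowledged(server, doc):
    """Fire and forget: send the write and never look at the outcome."""
    try:
        server.insert(doc)
    except DuplicateKeyError:
        pass  # the client never learns that the write failed
    return None


def write_acknowledged(server, doc):
    """Wait for the server's reply and surface any error to the caller."""
    server.insert(doc)  # raises if the server rejects the write
    return {"acknowledged": True}


server = FakeServer()
write_acknowledged(server, {"_id": 1, "name": "first"})

# Same _id again: the unacknowledged write fails silently...
silent = write_unacknowledged(server, {"_id": 1, "name": "second"})

# ...while the acknowledged write reports the failure.
try:
    write_acknowledged(server, {"_id": 1, "name": "third"})
    reported = False
except DuplicateKeyError:
    reported = True
```

In a real driver the speed gain of the unacknowledged mode would come from not waiting for the server's reply; the price, as the sketch shows, is that a failed write goes unnoticed.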

Solution 6 - Mongodb

MongoDB is fast™ because:

  1. Not ACID: Availability is given preference over consistency.
  2. Asynchronous insert and update: MongoDB doesn't write the data to the DB as soon as the insert query is processed; it flushes the data after a certain amount of time. The same is true for updates.
  3. No join overhead: When they say MongoDB is a document database, what they mean is that it contains data that is self-sufficient, with all the information embedded like a real document.
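
The trade-off in point 2 can be sketched with a toy write buffer. This is not how MongoDB is implemented: `BufferedStore` is a hypothetical class, and it flushes after a fixed number of writes rather than after a time interval, purely to keep the example deterministic.

```python
class BufferedStore:
    """Toy store that accepts writes before they reach 'disk'."""

    def __init__(self, flush_every=3):
        self.flush_every = flush_every  # flush after this many buffered writes
        self.buffer = []                # writes held in memory only
        self.disk = []                  # writes that survived a flush

    def insert(self, doc):
        self.buffer.append(doc)         # fast path: no disk work per insert
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        self.disk.extend(self.buffer)
        self.buffer.clear()

    def crash(self):
        """Simulate a crash: anything not yet flushed is gone."""
        lost = len(self.buffer)
        self.buffer.clear()
        return lost


store = BufferedStore(flush_every=3)
for i in range(5):
    store.insert({"n": i})

lost = store.crash()  # the last two inserts were still in the buffer
```

Inserts return quickly because they only touch memory, but a crash between flushes loses whatever was still buffered.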

Solution 7 - Mongodb

MongoDB is faster because:

  1. No transactions;
  2. No relations between tables;

If you apply exactly the same logic on SQL Server, for example:

  1. Do not use SELECT with locks;
  2. No relations between tables;

then the gap in speed between SQL Server and MongoDB will not be so big. The one place where MongoDB will definitely be faster is writing and updating records, because SQL Server performs inserts and updates in a queue and inside a transaction, while in MongoDB it happens asynchronously. In my projects I could not find any big difference in speed between SQL Server and MongoDB, because the business logic was very similar between the two projects. The real speed gain with MongoDB comes on analytical projects with big data, or on big content management engines such as newspapers, online stores, and so on. Again, an unoptimized MongoDB and a well-optimized SQL Server can make these databases almost equal.

Solution 8 - Mongodb

Mongo's not ACID compliant, so it doesn't have to deal with nearly as much "cruft" to make sure that what you try to put into the DB can come back out again later.

If you don't mind losing some functionality and possibly losing data in exchange for speed, then Mongo's good. If you absolutely need to guarantee data integrity and/or have complex join requirements, then avoid Mongo-type systems like the plague.

Solution 9 - Mongodb

I will also add another difference, which is less about speed and more about conceptualization (although I believe it might help with speed, because there is less room for join problems): document-based storage is very similar to the object-oriented mindset.

Document-based storage might not be perfectly ACID, but I believe it is easier to get what you want from MongoDB by just fetching the whole document, rather than messing with all the joins of a SQL DB and risking some bad joins as well.

Apologies to any SQL die-hard fans.

Solution 10 - Mongodb

According to MongoDB's website, MongoDB is a document database with the scalability and flexibility that you want and with querying and indexing that you need.

Let's try to understand what this actually means. MongoDB is document-based, so it stores data in documents, which are field-value pair data structures like JSON, instead of in rows in a table as in traditional relational databases. It's therefore a NoSQL database and not a relational one.

Also, MongoDB has built-in scalability, making it very easy to distribute data across multiple machines as your apps get more and more users and start generating a ton of data. So whatever you do, MongoDB will make it very easy for you to grow.

Another big feature of MongoDB is its great flexibility. There is no need to define a document data schema before filling it with data, meaning that each document can have a different number and type of fields. And we can also change these fields all the time. All this is really in line with some real-world business situations, therefore it can become pretty useful.

MongoDB is also a very performant database system, thanks to features like embedded data models, indexing, sharding, flexible documents, native replication, and much more. It is also free to use, with its source published under the SSPL.

In summary, we can say that MongoDB is a great database system to build many types of modern, scalable, and flexible web applications. In fact, Mongo is probably the most used database with Node.js.

Now let's look a bit deeper at these documents, using a blog post as an example, and at how that same data could look as a row in a relational database like MySQL, or even in an Excel spreadsheet.

[Image: the blog post example as a document and as a relational table row]

MongoDB stores data in a format called BSON, which is similar to JSON. It looks basically the same as JSON, but it's typed, meaning that every value has a data type such as String, Boolean, Date, Double, Object, and more. So all MongoDB documents are actually typed, which is different from JSON.
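
The JSON half of that comparison can be checked with Python's standard library. The sketch below only shows that plain JSON has no date type; it does not use a BSON encoder, and the `post` document is a made-up example.

```python
import json
from datetime import datetime, timezone

# Hypothetical blog post with a real date value.
post = {"title": "Hello", "published": datetime(2020, 1, 1, tzinfo=timezone.utc)}

# Plain JSON cannot represent the date, so encoding fails.
try:
    json.dumps(post)
    json_has_date_type = True
except TypeError:
    json_has_date_type = False

# A common workaround is to store the date as a string,
# which comes back as a string, not as a date.
encoded = json.dumps({**post, "published": post["published"].isoformat()})
decoded = json.loads(encoded)
```

With a typed format, the value would keep its date type on the way back instead of turning into text that the application has to parse again.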

Just like JSON, these BSON documents have fields, and data is stored in key-value pairs. In a relational database, on the other hand, each field is called a column and the database arranges data in table structures, whereas our JSON-like data is much more flexible.

Take, for example, the tags field in the above picture, where we have an array, so basically multiple values for one field. In relational databases that's not really allowed: we cannot have multiple values in one field. We would have to find workarounds for this in a relational database, which could involve more work and more overall complication.

Another extremely important feature in MongoDB is the concept of embedded documents, which is something not present in relational databases. In our comments field here we have an array that contains three objects, one for each comment. Imagine we had a comments collection containing a bunch of comment documents; each of them could look exactly like this, with an author and the comment text. Instead of doing that, we include these comments right in the blog post document. In other words, we embed the comment documents in the post document. This is the process of embedding or de-normalizing, which basically means including related data in one single document.

In the above example the comments are related to the post, and so they are included in the same document. This makes a database more performant in some situations, because it can be easier to read all the data that we need at once.

The opposite of embedding or de-normalizing is normalizing, and that's how data is typically modeled in a relational database. In the above example it's not possible to embed data in a relational system; the solution is to create a whole new table for the comments and then join the tables through an ID field that links each comment to its post.
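
The two models can be put side by side in a small sketch. It uses a plain Python dict for the embedded document and the standard-library `sqlite3` module for the normalized version; the post title, authors, comment texts, and the table and column names are all made up for illustration.

```python
import sqlite3

# Embedded (de-normalized): one document holds the post and its comments.
post_document = {
    "_id": 1,
    "title": "Why is MongoDB so fast?",
    "tags": ["mongodb", "performance"],
    "comments": [
        {"author": "alice", "text": "Nice post"},
        {"author": "bob", "text": "Thanks"},
        {"author": "carol", "text": "Very helpful"},
    ],
}
embedded_comments = [c["text"] for c in post_document["comments"]]

# Normalized: posts and comments live in separate tables linked by post_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES posts(id),
        author TEXT,
        text TEXT
    );
    """
)
conn.execute("INSERT INTO posts VALUES (?, ?)", (1, post_document["title"]))
conn.executemany(
    "INSERT INTO comments (post_id, author, text) VALUES (?, ?, ?)",
    [(1, c["author"], c["text"]) for c in post_document["comments"]],
)

# Reading the post together with its comments now needs a join.
rows = conn.execute(
    """
    SELECT c.text
    FROM posts AS p
    JOIN comments AS c ON c.post_id = p.id
    WHERE p.id = ?
    ORDER BY c.id
    """,
    (1,),
).fetchall()
joined_comments = [text for (text,) in rows]
conn.close()
```

Both reads return the same comments. The embedded version gets them from a single document, while the normalized version has to combine two tables at query time.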

Two things about BSON documents you need to know:

First, the maximum size for each document is currently 16 MB.

Second, each document contains a unique ID, which acts as the primary key of that document. It's automatically generated with the ObjectId data type each time a new document is created, so you don't have to worry about it.

Solution 11 - Mongodb

MongoDB is a lot faster at inserts and updates because it does not check the schema or perform foreign key checks. For reading data by attributes and searching, however, it's not always faster, especially if you don't have index keys.

More about it here: https://www.youtube.com/watch?v=K8xsuFgCRkU
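
The point about index keys can be illustrated with a rough sketch in plain Python. The `users` list and both helper functions are hypothetical, and a dict stands in for the index; a real database index is typically a tree structure rather than a hash map, so this only shows the scan-versus-lookup idea.

```python
# Hypothetical collection of user documents.
users = [{"_id": i, "email": f"user{i}@example.com"} for i in range(1000)]


def find_by_scan(docs, email):
    """No index: examine documents one by one until a match is found."""
    examined = 0
    for doc in docs:
        examined += 1
        if doc["email"] == email:
            return doc, examined
    return None, examined


# A dict keyed by email plays the role of an index on that field.
email_index = {doc["email"]: doc for doc in users}


def find_by_index(index, email):
    """With an index: go straight to the matching document."""
    return index.get(email)


target = "user999@example.com"
scanned_doc, examined = find_by_scan(users, target)
indexed_doc = find_by_index(email_index, target)
```

Both calls find the same document, but the scan had to examine every document in the list to reach the last one, while the indexed lookup did not.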

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Justin | View Question on Stackoverflow
Solution 1 - Mongodb | Will | View Answer on Stackoverflow
Solution 2 - Mongodb | Yichaoz | View Answer on Stackoverflow
Solution 3 - Mongodb | Martin Beckett | View Answer on Stackoverflow
Solution 4 - Mongodb | Mike M. | View Answer on Stackoverflow
Solution 5 - Mongodb | maxdec | View Answer on Stackoverflow
Solution 6 - Mongodb | Amit Tripathi | View Answer on Stackoverflow
Solution 7 - Mongodb | Eugene Bosikov | View Answer on Stackoverflow
Solution 8 - Mongodb | Marc B | View Answer on Stackoverflow
Solution 9 - Mongodb | cody.tv.weber | View Answer on Stackoverflow
Solution 10 - Mongodb | Rafiq | View Answer on Stackoverflow
Solution 11 - Mongodb | Alimo | View Answer on Stackoverflow