When to use MongoDB

MysqlMongodbDatabase DesignRelational DatabaseDatabase

Mysql Problem Overview


I'm writing an application that doesn't necessarily need scaling abilities as it won't be collecting large amounts data at the beginning. (However, if I'm lucky, I could down the road potentially.)

I will be running my web server and database on the same box (for now).

That being said, I am looking for performance and efficiency.

The main part of my application will be loading blog articles. Using an RDBMS (MySQL) I will make 6 queries (2 of the queries being joins), just to load a single blog article page.

select blog
select blog_album
select blog_tags
select blog_notes
select blog_comments (join with users)
select blog_author_participants (join with users)

However, with MongoDB I can de-normalize and flatten 6 tables into just 2 tables/collections and minimizes my queries to potentially just one 1 query,

users
blogs
    ->blog_album
    ->blog_tags        
    ->blog_notes
    ->blog_comments
    ->blog_author_participants

Now, going with the MongoDB schema, there will be some data redundancy. However, hard drive space is cheaper than CPU/servers.

1.) Would this be a good scenario to use MongoDB?

2.) Do you only benefit in performance using MongoDB when scaling beyond a single server?

3.) Are there any durability risks using MongoDB? I hear that there is potential for loss of data while performing inserts - as insert are written to memory first, then to the database.

4.) Should this stop me from using MongoDB in production?

Mysql Solutions


Solution 1 - Mysql

You would use MongoDB when you have a use case that matches its strengths.

Do you need a schema-less document store? Nope, you have a stable schema.

Do you need automatic sharding? Nope, you don't have extraordinary data needs or budget for horizontally scaling hardware.

Do you need map/reduce data processing? Not for something like a blog.

So why are you even considering it?

Solution 2 - Mysql

> However, with MongoDB I can de-normalize and flatten 6 tables into just 2 tables/collections and minimizes my queries to potentially just one 1 query

But you can easily query MySQL for 6 tables worth of information related to a single blog post with a single properly crafted SQL statement.

> however hard drive space is cheaper than CPU/servers.

If performance and scaling is a priority then you are going to be concerned with having enough RAM to fit everything into main memory and enough CPU cores to run queries. An enterprise grade RAID 10 array is a requirement, don't get me wrong, but as soon as your database software (MongoDB or MySQL) needs to scan an index that can't fit into main memory you'll be in for a world of pain assuming a large active database. :)

I like MongoDB, but it's big strength in my mind is map/reduce and its document-orientation. You require neither of those features. MySQL is time-tested in large scale deployments and supports partitioning (but I would argue that your database would have to be in the order of 50-100 GB before you can realize substantial gain from partitioning vs a single (plus passive backup) server with tons (64 GB+) of RAM. I would also argue that if performance is truly a concern then MySQL would be preferable as you would have supreme control over your indexes.

That's not to say that MongoDB isn't high performance, but its place probably isn't serving blogs. Your concern with inserts is valid as well. MongoDB is not an ACID system. Google transactions in both systems and compare.

Solution 3 - Mysql

Here is a good explanation: http://mod.erni.st/nosql-if-only-it-was-that-easy/

The last paragraph summarizes it:

> What am I going to build my next app on? Probably Postgres. Will I use NoSQL? Maybe. I might also use Hadoop and Hive. I might keep everything in flat files. Maybe I’ll start hacking on Maglev. I’ll use whatever is best for the job. If I need reporting, I won’t be using any NoSQL. If I need caching, I’ll probably use Tokyo Tyrant. If I need ACIDity, I won’t use NoSQL. If I need a ton of counters, I’ll use Redis. If I need transactions, I’ll use Postgres. If I have a ton of a single type of documents, I’ll probably use Mongo. If I need to write 1 billion objects a day, I’d probably use Voldemort. If I need full text search, I’d probably use Solr. If I need full text search of volatile data, I’d probably use Sphinx.

Solution 4 - Mysql

NoSQL vs. RDBMS: Apples and Oranges?

I would advise you to read up a little on what NoSQL is and what it does before you decide whether you can use it. You can't take a normal database and turn it into a NoSQL thing just like that. The way you work with the data is completely different.

NoSQL definitely has its uses. But it's definitely not the answer for everything. The main advantage of NoSQL is the easily changeable data model.

Solution 5 - Mysql

Advantages of using mongodb ( as per Moshe Kaplan published in dzone article)

  1. Schema-less design
  2. Scalability in managing Tera bytes of data
  3. Rapid replicaSet with high availability feature
  4. Sharding enables linear and scale out growth w/o running out of budget
  5. Support high write load
  6. Use of Data locality for query processing

MongoDB meets Consistency & Partitioning requirements in CAP theory ( Consistency, Availability and Partitioning)

Related SE questions:

https://stackoverflow.com/questions/2117372/what-are-the-advantages-of-using-a-schema-free-database-like-mongodb-compared-to

https://stackoverflow.com/questions/5400163/when-to-redis-when-to-mongodb

Solution 6 - Mysql

I can't speak to the performance considerations, but for me, the first consideration of whether you want to use a SQL-DB vs MongoDB is the structure of the data you want to store.

MongoDB is "schema-less" in the sense that you don't need to know what "tables" and "columns" you want beforehand. It is very flexible. So, if you don't know what information you want to store in your "blogs" Collection for example, or if different blog posts may store different information, then MongoDB allows this flexibility. Whereas with SQL relational databases, you have to know your schema upfront.

But it sounds like you already know what information you want to store, in which case I might just stick with a SQL relational database. I don't think performance is the first consideration in your case - you're not building a real-time application where one or two milliseconds matter all that much.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestiondezView Question on Stackoverflow
Solution 1 - MysqlDan GrossmanView Answer on Stackoverflow
Solution 2 - MysqlMDaubsView Answer on Stackoverflow
Solution 3 - MysqlgusriddView Answer on Stackoverflow
Solution 4 - MysqlWolphView Answer on Stackoverflow
Solution 5 - MysqlRavindra babuView Answer on Stackoverflow
Solution 6 - MysqlJerry ChenView Answer on Stackoverflow