When should I use a NoSQL database instead of a relational database? Is it okay to use both on the same site?


Mongodb Problem Overview


What are the advantages of using NoSQL databases? I've read a lot about them lately, but I'm still unsure why I would want to implement one, and under what circumstances I would want to use one.

Mongodb Solutions


Solution 1 - Mongodb

Relational databases enforce ACID, so you get schema-based, transaction-oriented data stores. They are proven and suitable for 99% of real-world applications. You can do practically anything with a relational database.

But there are limits on speed and scaling when it comes to massive, high-availability data stores. For example, Google and Amazon have terabytes of data stored in big data centers. Querying and inserting do not perform well in these scenarios because of the blocking, schema-bound, transactional nature of an RDBMS. That's why they implemented their own databases (actually, key-value stores) for massive gains in performance and scalability.

NoSQL databases have been around for a long time - just the term is new. Some examples are graph, object, column, XML and document databases.

For your 2nd question: Is it okay to use both on the same site?

Why not? Both serve different purposes, right?

Solution 2 - Mongodb

NoSQL solutions are usually meant to solve problems for which relational databases are not well suited, are too expensive to use (like Oracle), or would require you to implement something that breaks the relational nature of your database anyway.

Advantages are usually specific to your usage, but unless you have some sort of problem modeling your data in an RDBMS, I see no reason why you would choose NoSQL.

I myself use MongoDB and Riak for specific problems where an RDBMS is not a viable solution; for everything else I use MySQL (or SQLite for testing).

If you need a NoSQL database, you usually know it. Possible reasons are:

  • The client wants 99.999% availability on a high-traffic site.
  • Your data makes no sense in SQL: you find yourself running multiple JOIN queries to access a single piece of information.
  • You are breaking the relational model: you have CLOBs that store denormalized data, and you generate external indexes to search that data.

If you don't need a NoSQL solution, keep in mind that these solutions weren't meant as replacements for an RDBMS but rather as alternatives where an RDBMS fails. More importantly, they are relatively new, so they still have a lot of bugs and missing features.

Oh, and regarding the second question: it is perfectly fine to use any technology in conjunction with another. Just to be complete, in my experience MongoDB and MySQL work fine together as long as they aren't on the same machine.
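As a rough illustration of using both kinds of store side by side, here is a minimal sketch. It assumes SQLite as a stand-in for the relational side and a plain dictionary as a stand-in for a schemaless store; the `accounts` table, `transfer`, and `log_activity` are hypothetical names invented for the example, not part of any real setup described above.

```python
import json
import sqlite3

# Relational side: data that needs transactions and constraints.
sql = sqlite3.connect(":memory:")
sql.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
sql.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
sql.commit()


def transfer(src: int, dst: int, amount: int) -> bool:
    """Move money atomically; both updates are rolled back if either fails."""
    try:
        with sql:  # commits on success, rolls back on an exception
            sql.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            sql.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


# Non-relational side (stand-in): free-form events with no fixed schema.
activity_store = {}


def log_activity(user_id: int, event: dict) -> None:
    """Append an arbitrary JSON-serializable event for a user."""
    activity_store.setdefault(user_id, []).append(json.dumps(event))


ok = transfer(1, 2, 60)        # succeeds
rejected = transfer(1, 2, 60)  # would overdraw account 1, so it is rolled back
log_activity(1, {"type": "transfer", "amount": 60})
log_activity(1, {"type": "login", "device": "mobile"})
balances = sql.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
```

The split here is only one plausible division of labor: money movements go where ACID guarantees are enforced, while loosely structured activity events go where no schema has to be agreed on first.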

Solution 3 - Mongodb

Martin Fowler has an excellent video which gives a good explanation of NoSQL databases. The link goes straight to his reasons to use them, but the whole video contains good information.

  1. You have large amounts of data - especially if it cannot all fit on one physical server, since NoSQL databases were designed to scale well.

  2. Object-relational impedance mismatch - your domain objects do not fit well in a relational database schema. NoSQL allows you to persist your data as documents (or graphs), which may map much more closely to your data model.
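To make the impedance mismatch concrete, here is a small sketch that stores the same object both ways. The `order` object, the table names, and the dictionary standing in for a document collection are all assumptions made for illustration.

```python
import json
import sqlite3

# Hypothetical domain object: an order with nested line items.
order = {
    "id": 1,
    "customer": "alice",
    "items": [
        {"sku": "A-100", "qty": 2},
        {"sku": "B-200", "qty": 1},
    ],
}

# Relational shape: the object is split across two tables
# and has to be reassembled with a JOIN.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")
conn.execute("INSERT INTO orders VALUES (?, ?)", (order["id"], order["customer"]))
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(order["id"], item["sku"], item["qty"]) for item in order["items"]],
)
rows = conn.execute(
    "SELECT o.customer, i.sku, i.qty "
    "FROM orders o JOIN order_items i ON i.order_id = o.id "
    "WHERE o.id = ? ORDER BY i.sku",
    (1,),
).fetchall()

# Document shape: the whole object is stored and loaded as one unit.
document_store = {}  # stand-in for a document database collection
document_store[order["id"]] = json.dumps(order)
loaded = json.loads(document_store[1])
```

In the relational version the application has to rebuild the nested structure from flat rows; in the document version `loaded` already has the shape of the original object.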

Solution 4 - Mongodb

NoSQL is a class of database systems in which data is organised as documents (MongoDB), key-value pairs (Memcached, Redis), or graph structures (Neo4j).

Here are some possible questions and answers for "When to go for NoSQL":

  1. Do you require a flexible schema or deal with tree-like data?
    Generally, in agile development we start designing a system without knowing all the requirements up front, so the database may need to accommodate frequent design changes later in development, for example while showcasing an MVP (minimum viable product). Or you are dealing with data whose schema is dynamic by nature, e.g. system logs; AWS CloudWatch logs are a very precise example.

  2. Is the data set vast?
    Yes, NoSQL databases are the better candidates for applications where the database needs to manage millions or even billions of records without compromising on performance.

  3. Trade-off between scaling and consistency
    Unlike an RDBMS, a NoSQL database may lose a small amount of data here and there (note: the probability is .x%), but it is easy to scale in terms of performance. Example: this may be good for storing who is online in an instant messaging app, tokens in a database, or website traffic stats.

  4. Performing geolocation operations: MongoDB has rich support for geo-querying and geolocation operations. I really loved this feature of MongoDB.

In a nutshell, MongoDB is a great fit for applications where you need to store dynamically structured data at large scale.
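For the geolocation point, the sketch below shows roughly what a "find places near this point" query computes. It is a brute-force illustration in plain Python, not how MongoDB implements it: a database would answer this from a geospatial index (MongoDB offers `2dsphere` indexes and operators such as `$near`) rather than scanning every record. The `places` data and the `near` helper are hypothetical.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Hypothetical documents with coordinates.
places = [
    {"name": "Big Ben", "lat": 51.5007, "lon": -0.1246},
    {"name": "Eiffel Tower", "lat": 48.8584, "lon": 2.2945},
    {"name": "Louvre", "lat": 48.8606, "lon": 2.3376},
]


def near(docs: list, lat: float, lon: float, max_km: float) -> list:
    """Return names of documents within max_km, nearest first."""
    hits = [(haversine_km(lat, lon, d["lat"], d["lon"]), d["name"]) for d in docs]
    return [name for dist, name in sorted(hits) if dist <= max_km]


# Places within 10 km of central Paris.
nearby = near(places, 48.8566, 2.3522, 10)
```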

Solution 5 - Mongodb

Some essential information is missing to answer the question: Which use cases must the database be able to cover? Do complex analyses have to be performed on existing data (OLAP), or does the application have to process many transactions (OLTP)? What is the data structure? And that is far from the end of the questions.

In my view, it is wrong to make technology decisions on the basis of bold buzzwords without knowing exactly what is behind them. NoSQL is often praised for its scalability. But you also have to know that horizontal scaling (across several nodes) has its price. You then have to deal with issues like eventual consistency and define how to resolve data conflicts if they cannot be resolved at the database level. However, this applies to all distributed database systems.

Developers' enthusiasm for the word "schemaless" in NoSQL is also very high at the beginning. The buzzword quickly loses its charm after technical analysis: it is true that no schema is required when writing, but a schema comes into play when reading. That is why the correct term should be "schema on read". It may be tempting to be able to write data at one's own discretion. But how do I deal with the situation where there is existing data and the new version of the application expects a different schema?
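A minimal sketch of what "schema on read" means in practice: the store accepts both document shapes, so the reading code has to know about every version ever written. The two user documents, the `schema_version` field, and `read_user` are hypothetical names chosen for the example.

```python
# Two generations of the same hypothetical "user" document.
old_doc = {"name": "Ada Lovelace"}  # written by version 1 of the application
new_doc = {"schema_version": 2, "first_name": "Grace", "last_name": "Hopper"}


def read_user(doc: dict) -> dict:
    """Normalize any stored version to the shape the current code expects."""
    if doc.get("schema_version", 1) == 1:
        # Version 1 stored a single "name" field; split it on the first space.
        first, _, last = doc["name"].partition(" ")
        return {"first_name": first, "last_name": last}
    return {"first_name": doc["first_name"], "last_name": doc["last_name"]}


users = [read_user(doc) for doc in (old_doc, new_doc)]
```

The alternative is a one-off migration that rewrites all old documents, which is essentially the schema change a relational database would have forced up front.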

The document model (as in MongoDB, for example) is not suitable for data models with many relationships between the data. Joins have to be done at the application level, which is additional effort, and why should I program things that the database should do?
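The application-level join mentioned above might look like the following sketch. The `authors` and `posts` collections are hypothetical, represented as lists of dictionaries the way a document store client might return them.

```python
# Hypothetical collections, as a document store might return them.
authors = [
    {"_id": 1, "name": "Ann"},
    {"_id": 2, "name": "Bob"},
]
posts = [
    {"_id": 10, "author_id": 1, "title": "Intro"},
    {"_id": 11, "author_id": 2, "title": "Scaling"},
    {"_id": 12, "author_id": 1, "title": "Indexes"},
]

# Build a lookup table once, then attach the author name to each post.
# In SQL this would be a single statement:
#   SELECT p.title, a.name FROM posts p JOIN authors a ON a.id = p.author_id
authors_by_id = {author["_id"]: author for author in authors}
joined = [
    {"title": post["title"], "author": authors_by_id[post["author_id"]]["name"]}
    for post in posts
]
```

Some document databases do offer server-side alternatives (MongoDB has a `$lookup` aggregation stage), so this is the fallback pattern rather than the only option, but the extra effort grows quickly as the number of relationships increases.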

If you argue that Google and Amazon developed their own databases because conventional RDBMSs could no longer handle the flood of data, the only answer is: you are not Google or Amazon. These companies are the spearhead, the roughly 0.01% of scenarios where traditional databases are no longer suitable; for the rest of the world they are.

Also not insignificant: SQL has been around for over 40 years, and millions of hours of development have gone into large systems such as Oracle or Microsoft SQL Server. Some newer databases have yet to reach that maturity. It is also sometimes easier to find an SQL admin than someone for MongoDB, which brings us to the question of maintenance and management: a subject that is not exactly sexy but is part of the technology decision.

Solution 6 - Mongodb

Handling A Large Number Of Read Write Operations

Look towards NoSQL databases when you need to scale fast. And when do you generally need to scale fast?

When your website has a large number of read-write operations and deals with a large amount of data, NoSQL databases fit best. Since nodes can be added on the fly, they can handle more concurrent traffic and larger amounts of data with minimal latency.
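One common way such systems spread data over nodes is hash-based partitioning (sharding). The sketch below is a simplified illustration under assumed names (`shard_for`, `put`, `get`, three in-memory dictionaries standing in for nodes), not the scheme of any particular database.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Three in-memory dictionaries stand in for three database nodes.
shards = [dict() for _ in range(3)]


def put(key: str, value) -> None:
    """Store the value on the node responsible for this key."""
    shards[shard_for(key, len(shards))][key] = value


def get(key: str):
    """Read the value back from the node responsible for this key."""
    return shards[shard_for(key, len(shards))].get(key)


put("alice", {"plan": "free"})
put("bob", {"plan": "pro"})
put("carol", {"plan": "pro"})
```

With plain modulo hashing, changing the number of shards moves most keys to a different node; systems that add nodes on the fly typically use a scheme such as consistent hashing to limit that reshuffling.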

Flexibility With Data Modeling

The second cue comes during the initial phases of development, when you are not sure about the data model or the database design and things are expected to change at a rapid pace. NoSQL databases offer more flexibility here.

Eventual Consistency Over Strong Consistency

It’s preferable to pick a NoSQL database when it’s OK to give up strong consistency and when you do not require transactions.

A good example of this is a social networking website like Twitter. When a celebrity's tweet blows up and everyone around the world is liking and retweeting it, does it matter if the like count is off by a bit for a short while?

The celebrity would definitely not care if instead of the actual 5 million 500 likes, the system shows the like count as 5 million 250 for a short while.

When a large application is deployed on hundreds of servers spread across the globe, the geographically distributed nodes take some time to reach a global consensus.

Until they reach a consensus, the value of the entity is inconsistent. The value of the entity eventually gets consistent after a short while. This is what Eventual Consistency is.

The inconsistency does not mean that there is any sort of data loss, though. It just means that the data takes a short while to travel across the globe, via the internet cables under the ocean, before the nodes reach a global consensus and become consistent.

We experience this behaviour all the time, especially on YouTube. Often you will see a video with 10 views and 15 likes. How is this even possible?

It’s not. The actual number of views is already higher than the number of likes. It’s just that the view count is inconsistent and takes a short while to get updated.
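The like-count behaviour described above can be simulated in a few lines. This is a toy model under assumed names (`Replica`, `like`, two replicas called `europe` and `asia`): each replica accepts likes locally and delivers them to its peers later, which is a simplification of how real systems replicate.

```python
from collections import deque


class Replica:
    """One node holding its own copy of a like counter."""

    def __init__(self) -> None:
        self.likes = 0
        self.inbox = deque()  # updates received from peers, not yet applied

    def apply_pending(self) -> None:
        """Apply every update that has arrived from other replicas."""
        while self.inbox:
            self.likes += self.inbox.popleft()


def like(local: Replica, peers: list) -> None:
    """Accept a like locally and queue it for later delivery to peers."""
    local.likes += 1
    for peer in peers:
        peer.inbox.append(1)


europe, asia = Replica(), Replica()
for _ in range(3):
    like(europe, [asia])
like(asia, [europe])

before = (europe.likes, asia.likes)  # replicas disagree for a while
europe.apply_pending()
asia.apply_pending()
after = (europe.likes, asia.likes)   # replicas converge on the same total
```

No like is lost at any point; a reader simply sees a different count depending on which replica answers before the pending updates have been applied.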

Running Data Analytics

NoSQL databases also fit best for data analytics use cases, where we have to deal with an influx of massive amounts of data.

Solution 7 - Mongodb

I came across this question while looking for convincing grounds to deviate from RDBMS design.

There is a great post by Julian Brown which sheds light on the constraints of distributed systems. The concept is called Brewer's CAP Theorem, which in summary goes:

> The three requirements of distributed systems are : Consistency, Availability and Partition tolerance (CAP in short). But you can only have two of them at a time.

And this is how I summarised it for myself:

> You better go for NoSQL if Consistency is what you are sacrificing.

Solution 8 - Mongodb

I have designed and implemented solutions with NoSQL databases, and here is my checklist for deciding between SQL and document-oriented NoSQL.

DON'Ts

SQL is not obsolete and remains a better tool in some cases. It's hard to justify using a document-oriented NoSQL database when you have any of the following:

  • Need OLAP/OLTP
  • It's a small project / simple DB structure
  • Need ad hoc queries
  • Can't avoid immediate consistency
  • Unclear requirements
  • Lack of experienced developers

DOs

If you don't have those conditions or can mitigate them, then here are two reasons why you may benefit from NoSQL:

  • Need to run at scale
  • Convenience of development (better integration with your tech stack, no need for an ORM, etc.)

More info

In my blog posts I explain the reasons in more detail:

Note: the above is applicable to document-oriented NoSQL only. There are other types of NoSQL, which require other considerations.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | smfoote | View Question on Stackoverflow |
| Solution 1 - Mongodb | RameshVel | View Answer on Stackoverflow |
| Solution 2 - Mongodb | Asaf | View Answer on Stackoverflow |
| Solution 3 - Mongodb | Despertar | View Answer on Stackoverflow |
| Solution 4 - Mongodb | Hrishikesh | View Answer on Stackoverflow |
| Solution 5 - Mongodb | Stefan Prugg | View Answer on Stackoverflow |
| Solution 6 - Mongodb | Sunny Sultan | View Answer on Stackoverflow |
| Solution 7 - Mongodb | Jermin Bazazian | View Answer on Stackoverflow |
| Solution 8 - Mongodb | Alex Klaus | View Answer on Stackoverflow |