Mongoose subdocuments vs nested schema
JavaScript, Node.js, MongoDB, Mongoose
Javascript Problem Overview
I'm curious as to the pros and cons of using subdocuments vs a deeper layer in my main schema:
var subDoc = new Schema({
  name: String
});
var mainDoc = new Schema({
  names: [subDoc]
});
or
var mainDoc = new Schema({
  names: [{
    name: String
  }]
});
I'm currently using subdocs everywhere but I am wondering primarily about performance or querying issues I might encounter.
Javascript Solutions
Solution 1 - Javascript
According to the docs, it's exactly the same.
However, using a Schema would add an _id field as well (as long as you don't have that disabled), and presumably uses some more resources for tracking subdocs.
> Alternate declaration syntax
>
> New in v3 If you don't need access to the sub-document schema instance, you may also declare sub-docs by simply passing an object literal [...]
Solution 2 - Javascript
If you have schemas that are re-used in various parts of your model, then it might be useful to define individual schemas for the child docs so you don't have to duplicate yourself.
Solution 3 - Javascript
Use embedded documents when they are static, or when there will be no more than a few hundred of them; beyond that, performance suffers. I ran into this issue a while ago. More recently, Asya Kamsky, who works as a solutions architect for MongoDB, wrote an article about using subdocuments.
I hope this helps anyone looking for a solution or a best practice.
Original post: http://askasya.com/post/largeembeddedarrays . Her Stack Overflow profile: https://stackoverflow.com/users/431012/asya-kamsky
> First of all, we have to consider why we would want to do such a thing. Normally, I would advise people to embed things that they always want to get back when they are fetching this document. The flip side of this is that you don't want to embed things in the document that you don't want to get back with it.
>
> If you embed activity I perform into the document, it'll work great at first because all of my activity is right there and with a single read you can get back everything you might want to show me: "you recently clicked on this and here are your last two comments" but what happens after six months go by and I don't care about things I did a long time ago and you don't want to show them to me unless I specifically go to look for some old activity?
>
> First, you'll end up returning bigger and bigger document and caring about smaller and smaller portion of it. But you can use projection to only return some of the array, the real pain is that the document on disk will get bigger and it will still all be read even if you're only going to return part of it to the end user, but since my activity is not going to stop as long as I'm active, the document will continue growing and growing.
>
> The most obvious problem with this is eventually you'll hit the 16MB document limit, but that's not at all what you should be concerned about. A document that continuously grows will incur higher and higher cost every time it has to get relocated on disk, and even if you take steps to mitigate the effects of fragmentation, your writes will overall be unnecessarily long, impacting overall performance of your entire application.
>
> There is one more thing that you can do that will completely kill your application's performance and that's to index this ever-increasing array. What that means is that every single time the document with this array is relocated, the number of index entries that need to be updated is directly proportional to the number of indexed values in that document, and the bigger the array, the larger that number will be.
>
> I don't want this to scare you from using arrays when they are a good fit for the data model - they are a powerful feature of the document database data model, but like all powerful tools, it needs to be used in the right circumstances and it should be used with care.
Solution 4 - Javascript
Basically, create a variable nestedDoc and use it in the parent schema as names: [nestedDoc]
Simple Version:
var nestedDoc = new Schema({
  name: String
});
var mainDoc = new Schema({
  names: [nestedDoc]
});
JSON Example
{
  "_id" : ObjectId("57c88bf5818e70007dc72e85"),
  "name" : "Corinthia Hotel Budapest",
  "stars" : 5,
  "description" : "The 5-star Corinthia Hotel Budapest on the Grand Boulevard offers free access to its Royal Spa",
  "photos" : [
    "/photos/hotel/corinthiahotelbudapest/1.jpg",
    "/photos/hotel/corinthiahotelbudapest/2.jpg"
  ],
  "currency" : "HUF",
  "rooms" : [
    {
      "type" : "Superior Double or Twin Room",
      "number" : 20,
      "description" : "These are some great rooms",
      "photos" : [
        "/photos/room/corinthiahotelbudapest/2.jpg",
        "/photos/room/corinthiahotelbudapest/5.jpg"
      ],
      "price" : 73000
    },
    {
      "type" : "Deluxe Double Room",
      "number" : 50,
      "description" : "These are amazing rooms",
      "photos" : [
        "/photos/room/corinthiahotelbudapest/4.jpg",
        "/photos/room/corinthiahotelbudapest/6.jpg"
      ],
      "price" : 92000
    },
    {
      "type" : "Executive Double Room",
      "number" : 25,
      "description" : "These are amazing rooms",
      "photos" : [
        "/photos/room/corinthiahotelbudapest/4.jpg",
        "/photos/room/corinthiahotelbudapest/6.jpg"
      ],
      "price" : 112000
    }
  ],
  "reviews" : [
    {
      "name" : "Tamas",
      "id" : "/user/tamas.json",
      "review" : "Great hotel",
      "rating" : 4
    }
  ],
  "services" : [
    "Room service",
    "Airport shuttle (surcharge)",
    "24-hour front desk",
    "Currency exchange",
    "Tour desk"
  ]
}
Solution 5 - Javascript
I think this is already covered by multiple posts on SO.
Just a few:
- https://stackoverflow.com/questions/5373198
- https://stackoverflow.com/questions/4662530
- https://stackoverflow.com/questions/3038703
The big key is that there is no single answer here, only a set of rather complex trade-offs.
Solution 6 - Javascript
There are some differences between the two:
- Using a nested schema is helpful for validation.
- A nested schema can be reused in other schemas.
- A nested schema adds an '_id' field to the subdocument unless you set "_id: false".