Is it possible to query a tree structure table in MySQL in a single query, to any depth?

MysqlSqlDatabase DesignData StructuresHierarchical Data

Mysql Problem Overview


I'm thinking the answer is no, but I'd love it it anybody had any insight into how to crawl a tree structure to any depth in SQL (MySQL), but with a single query

More specifically, given a tree structured table (id, data, data, parent_id), and one row in the table, is it possible to get all descendants (child/grandchild/etc), or for that matter all ancestors (parent/grandparent/etc) without knowing how far down or up it will go, using a single query?

Or is using some kind of recursion require, where I keep querying deeper until there are no new results?

Specifically, I'm using Ruby and Rails, but I'm guessing that's not very relevant.

Mysql Solutions


Solution 1 - Mysql

Yes, this is possible, it's a called a Modified Preorder Tree Traversal, as best described here

Joe Celko's Trees and Hierarchies in SQL for Smarties

A working example (in PHP) is provided here

http://www.sitepoint.com/article/hierarchical-data-database/2/

Solution 2 - Mysql

Here are several resources:

Basically, you'll need to do some sort of cursor in a stored procedure or query or build an adjacency table. I'd avoid recursion outside of the db: depending on how deep your tree is, that could get really slow/sketchy.

Solution 3 - Mysql

Daniel Beardsley's answer is not that bad a solution at all when the main questions you are asking are 'what are all my children' and 'what are all my parents'.

In response to Alex Weinstein, this method actually results in less updates to nodes on a parent movement than in the Celko technique. In Celko's technique, if a level 2 node on the far left moves to under a level 1 node on the far right, then pretty much every node in the tree needs updating, rather than just the node's children.

What I would say however is that Daniel possibly stores the path back to root the wrong way around.

I would store them so that the query would be

SELECT FROM table WHERE ancestors LIKE "1,2,6%"

This means that mysql can make use of an index on the 'ancestors' column, which it would not be able to do with a leading %.

Solution 4 - Mysql

I came across this problem before and had one wacky idea. You could store a field in each record that is concatenated string of it's direct ancestors' ids all the way back to the root.

Imagine you had records like this (indentation implies heirarchy and the numbers are id, ancestors.

  • 1, "1"
  • 2, "2,1"
    • 5, "5,2,1"
    • 6, "6,2,1"
      • 7, "7,6,2,1"
      • 11, "11,6,2,1"
  • 3, "3,1"
    • 8, "8,3,1"
    • 9, "9,3,1"
    • 10, "10,3,1"

Then to select the descendents of id:6, just do this

SELECT FROM table WHERE ancestors LIKE "%6,2,1"

Keeping the ancestors column up to date might be more trouble than it's worth to you, but it's feasible solution in any DB.

Solution 5 - Mysql

Celko's technique (nested sets) is pretty good. I also have used an adjacency table with fields "ancestor" and "descendant" and "distance" (e.g. direct children/parents have a distance of 1, grandchildren/grandparents have a distance of 2, etc).

This needs to be maintained, but is fairly easy to do for inserts: you use a transaction, then put the direct link (parent, child, distance=1) into the table, then INSERT IGNORE a SELECTion of existing parent&children by adding distances (I can pull up the SQL when I have a chance), which wants an index on each of the 3 fields for performance. Where this approach gets ugly is for deletions... you basically have to mark all the items that have been affected and then rebuild them. But an advantage of this is that it can handle arbitrary acyclic graphs, whereas the nested set model can only do straight hierarchies (e.g. each item except the root has one and only one parent).

Solution 6 - Mysql

SQL isn't a Turing Complete language, which means you're not going to be able to perform this sort of looping. You can do some very clever things with SQL and tree structures, but I can't think of a way to describe a row which has a certain id "in its hierarchy" for a hierarchy of arbitrary depth.

Your best bet is something along the lines of what @Dan suggested, which is to just work your way through the tree in some other, more capable language. You can actually generate a query string in a general-purpose language using a loop, where the query is just some convoluted series of joins (or sub-queries) which reflects the depth of the hierarchy you are looking for. That would be more efficient than looping and multiple queries.

Solution 7 - Mysql

This can definitely be done and it isn't that complicated for SQL. I've answered this question and provided a working example using mysql procedural code here:

https://stackoverflow.com/questions/12796113/mysql-how-to-find-leaves-in-specific-node/12797585#12797585

Booth: If you are satisfied, you should mark one of the answers as accepted.

Solution 8 - Mysql

I used the "With Emulator" routine described in https://stackoverflow.com/questions/27013093/recursive-query-emulation-in-mysql (provided by https://stackoverflow.com/users/1726419/yossico). So far, I've gotten very good results (performance wise), but I don't have an abundance of data or a large number of descendents to search through/for.

Solution 9 - Mysql

You're almost definitely going to want to employ some recursion for that. And if you're doing that, then it would be trivial (in fact easier) to get the entire tree rather than bits of it to a fixed depth.

In really rough pseudo-code you'll want something along these lines:

getChildren(parent){
    children = query(SELECT * FROM table WHERE parent_id = parent.id)
    return children
}

printTree(root){
    print root
    children = getChildren(root)
    for child in children {
        printTree(child)
    }
}

Although in practice you'd rarely want to do something like this. It will be rather inefficient since it's making one request for every row in the table, so it'll only be sensible for either small tables, or trees that aren't nested too deeply. To be honest, in either case you probably want to limit the depth.

However, given the popularity of these kinds of data structure, there may very well be some MySQL stuff to help you with this, specifically to cut down on the numbers of queries you need to make.

Edit: Having thought about it, it makes very little sense to make all these queries. If you're reading the entire table anyway, then you can just slurp the whole thing into RAM - assuming it's small enough!

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionCameron BoothView Question on Stackoverflow
Solution 1 - MysqlDave CheneyView Answer on Stackoverflow
Solution 2 - MysqlAaron JensenView Answer on Stackoverflow
Solution 3 - MysqlKrayView Answer on Stackoverflow
Solution 4 - MysqlDaniel BeardsleyView Answer on Stackoverflow
Solution 5 - MysqlJason SView Answer on Stackoverflow
Solution 6 - MysqlDaniel SpiewakView Answer on Stackoverflow
Solution 7 - Mysqlguru_floridaView Answer on Stackoverflow
Solution 8 - MysqlBig_Al_TxView Answer on Stackoverflow
Solution 9 - MysqlDanView Answer on Stackoverflow