Finding all disconnected subgraphs in a graph

JavaAlgorithmGraphDisconnectedSubgraph

Java Problem Overview


I have a graph which contains an unknown number of disconnected subgraphs. What's a good algorithm (or Java library) to find them all?

Java Solutions


Solution 1 - Java

I think what you are looking for is generally called a Flood Fill. It is up to you whether you traverse the graph through a BFS or a DFS.

Basically you take an unlabeled (AKA uncoloured) node and assign a new label to it. You assign the same label to all nodes adjacent to that one, and so on to all nodes that are reachable from that node.

When no more reachable nodes can be labeled, you start over by picking another unlabeled node. Notice that the fact that this new node is unlabeled implies that it is not reachable from our earlier node and is thus in a different disconnected component.

When there are no more unlabeled nodes, the number of distinct labels you had to use is the number of components of the graph. The label for each node tells you which node is part of which component.

Solution 2 - Java

Not a Java implementation but perhaps it will be useful for someone, here is how to do it in Python:

import networkx as nx
g = nx.Graph()
# add nodes/edges to graph

d = list(nx.connected_components(g)) 
# d contains disconnected subgraphs
# d[0] contains the biggest subgraph

More information here.

Solution 3 - Java

There are many facets of this question that are not fully explained, so I'm going to give a somewhat exhaustive answer. My tendency to post walls-of-text notwithstanding. :/ Note also that I'm assuming this is a homework question or self-education question, so I'm not going to give a straight-up answer.

The two basic algorithms for detecting connectivity of a graph is Depth First Search and Breadth First Search. Those are really the 2 starting points you want to look at. Both will get you to the solution, but in different ways, and its hard to argue which is "better" without considering some pretty in-depth aspects of the problem. But let's move on.

As I mentioned before, you left out some important details, and I'll touch upon some possibilities here.

Is your graph directed or undirected? and do you consider connectivity in the "strong" sense (in which case, see oggy's answer), or connectivity in the "weak" sense? Depending on your answer, you will have to approach your algorithm in a subtly different way. Note that, for an undirected graph, weak and strong connectivity are equivalent, so that's nice. But you'll have to keep the structure of the graph in mind regardless, while implementing or finding an algorithm.

Furthermore, there is the question of what you mean by "finding the subgraphs" (paraphrase). Usually graph connectivity is a decision problem -- simply "there is one connected graph" or "there are two or more sub-graphs (aka, it's disconnected)". Having an algorithm for that requires the least amount of bookwork, which is nice. :) The next step up would be the count of graphs, literally the number of them, and that bookwork is also not so bad. Penultimately, you may want a list of nodes in each sub-graph. Lastly, you may want to literally copy out the sub-graphs, edges and all (so the return type would be a list of graphs, I suppose, with an implied invariant that each graph is connected). None of these different result-types would require a different algorithm, but will certainly require a different approach to the bookwork.

All of this may seem like a ridiculous amount of overkill for what is a pretty basic question, but I thought I'd just highlight all of the aspects involved in even such a simple graph question. As a sort of cliffhanger, note how none of this even yet touches upon runtime or memory usage! :)

  • Agor

Solution 4 - Java

http://jgrapht.sourceforge.net/">JGraphT</a> is a nice open source graphing library licensed under the LGPL license. I have used it in the past to deal with graphs and detecting cycles within the graphs. It is also fairly easy to use, and you can use http://www.jgraph.com/">JGraph</a> to visualize the graphs.

Solution 5 - Java

I'm assuming you want to find all the (strongly) connected components? For that you can use Tarjan's algorithm (a variant of DFS)

Solution 6 - Java

What about a breadth first search to find all connected nodes? Once you have the list of connected nodes, you subtract this list from the list of all nodes. You end up with a list of disconnected nodes

Solution 7 - Java

I probably should have found a standard algorithm (Wikipedia has some suggestions), but I came up with this on my own and it worked well for my purposes. My C# implementation takes a couple seconds to process a graph with 40,000 nodes and 44,000 edges to find 160 subgraphs and 20,000 unconnected nodes. I used a HashSet to store each subgraph group, so testing group membership is approximately O(1), and the overall algorithm becomes O(E*C) where E is the number of edges and C is the number of connected components in the graph. Wikipedia mentions an O(N) algorithm, linear in the number of nodes, so I'm sure mine is not maximally efficient, but it was plenty fast for my application. (Edit: I'm also glossing over the cost of merging groups, so don't put too much weight on my complexity analysis.)

Logic:

For each edge A->B
  If A is in a group and B is not, add B to group A
  If A is not in a group and B is, add A to group B
  If A and B are in different groups, merge the groups
  If neither A or B is in a group, create new group containing A and B

Pseudocode:

graph = {nodes, edges}
groups = {}
foreach edge A->B in graph.edges:
	groupA = findgroup(A)
	groupB = findgroup(B)
	if (groupA && !groupB)
		groupA.add(B)
	elif (!groupA && groupB)
		groupB.add(A)
	elif (groupA && groupB)
		if (groupA != groupB)
			groupA.union(groupB)
			groups.delete(groupB)
	else
		groups.add({A,B})
findgroup(node):
	for group in groups:
		if group.contains(node):
			return group
	return null

Note that this won't capture any nodes that aren't involved in edges. For my particular purposes this was fine. If you want to get all the single-node groups you can do a final pass:

foreach node in graph.nodes:
    group = findgroup(node)
    if !group:
        groups.add({node})

Solution 8 - Java

I ran into a similar problem where I wanted all the weakly connected subgraphs of a directed graph. I blogged about it here. I used the JUNG API and compare two approaches. My first approach could be used as a template to solve your problem.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionOllie GlassView Question on Stackoverflow
Solution 1 - JavaMAKView Answer on Stackoverflow
Solution 2 - JavaDatageekView Answer on Stackoverflow
Solution 3 - JavaagorenstView Answer on Stackoverflow
Solution 4 - JavaaperkinsView Answer on Stackoverflow
Solution 5 - JavaoggyView Answer on Stackoverflow
Solution 6 - JavaMedicineManView Answer on Stackoverflow
Solution 7 - JavayoyoView Answer on Stackoverflow
Solution 8 - JavaharschwareView Answer on Stackoverflow