# Search for isolated components in graph

Posted on

#### Question :

What will be the best algorithm to find isolated components in a graph? That is, components that do not lose information.

It is difficult to understand what is being asked, since both your question and some of the sources I have consulted use different names for the same concept, and vice versa. I will define some terms, some in Portuguese and others in English, according to what I could understand:

strongly connected component (“strongly connected components”): sets of vertices in which each has one path to the other. In your question, would be the 4 sets delineated ( `{a,b,e}` , `{c,d}` , `{f,g}` and `{h}` ).

• sink sink ): In the case of vertices, they are vertices that have no edges coming out of it, at most entering; in the case of a set of vertices, sets in which no vertex has edges coming out. In your case, the `{h}` set is a sink. The vertex `h` , on the other hand, has an entering edge – the one that comes out of itself.

• Isolated Set : A set of vertices that is both a font and a sink. In other words, sets that do not have edges going in or out – just between their own vertices. In your case, there are no non-trivial isolated sets (i.e. the empty set, and the set of all vertices).

If you already have a list of strongly connected components , and you want to know which of these are sinks, do as the @utluiz suggested in the comments: check all the vertices of a component if all the edges leaving only vertices in the component itself. If this is true, this component is a sink. If some vertex in the component leads to another outside the component, then the component is not a sink.

If your problem on the other hand is to find these strongly connected components , then it is more complicated. I will transcribe here (freely translating) the algorithm proposed in “Introduction to Algorithms” , but without going into too much detail because the content is quite extensive.

``````STRONGLY-CONNECTED-COMPONENTS(G)

1. Chame DFS(G) para encontrar os tempos de término f[u] de cada vértice u
2. Faça a tranposição de G (Gt)
3. Chame DFS(Gt), mas no loop principal itere na ordem decrescente de f[u]
(tal como computado no passo 1)
4. Os vértices de cada árvore na floresta do passo 3 são um strongly connected component``````

If you already have the end times `f[u]` (in your example figure, it’s the numbers inside each vertex, after the bar: `d[u]/f[u]` ), just call `DFS` in the transposed graph. I will not put the transposition algorithm here, as it should be trivial (just reverse the direction of all edges).

`DFS` is the in-depth search algorithm . `d[u]` and `f[u]` are byproducts of this algorithm, showing the order in which each vertex was visited ( `d[u]` is the moment when the vertex was found for the first time, and `f[u]` the moment when all its edges out were visited). I will not transcribe the algorithm of the same book, because the link above gives a more understandable implementation in C ++. I will only adapt it to include `f[u]` , since the original only includes `d[u]` (there called `pre` ).

``````/* Vamos supor que nossos digrafos têm no máximo maxV vértices. */
static int conta, d[maxV], f[maxV];

/* A função DIGRAPHdfs visita todos os vértices e todos os arcos do digrafo G.
A função atribui um número de ordem d[x] a cada vértice x:  o k-ésimo vértice visitado
recebe número de ordem k.  (Código inspirado no programa 18.3 de Sedgewick.) */
void DIGRAPHdfs( Digraph G)
{
Vertex v;
conta = 0;
for (v = 0; v < G->V; v++)
d[v] = -1;
for (v = 0; v < G->V; v++)
if (d[v] == -1)
dfsR( G, v);
}

/* A função dfsR supõe que o digrafo G é representado por uma matriz de adjacência.
(Inspirado no programas 18.1 de Sedgewick.) */
void dfsR( Digraph G, Vertex v)
{
Vertex w;
d[v] = conta++;
for (w = 0; w < G->V; w++)
if (G->adj[v][w] != 0 && d[w] == -1)
dfsR( G, w);
f[v] = conta++;
}``````

``````/* A função dfsR supõe que o digrafo G é representado por listas de adjacência.
(Inspirado no programas 18.2 de Sedgewick.) */
void dfsR( Digraph G, Vertex v)
{
If step 4 of the main algorithm is not clear, the “forest” you are interested in is formed by all vertices visited in `DIGRAPHdfs` . That is, if the first vertex has `d[u] = 0` and `f[u] = N` , all vertices that have `d[u] > 0 e f[u] < N` are part of this tree. The next tree would be the vertex with `d[u] = N+1` and `f[u] = M` , and so on until the vertices end.
Note: According to the above source, the above algorithm is efficient (linear time in `V + A` ) and although it does not seem correct, the proof is in 4 pages of theorems that I will not transcribe here .