Search for isolated components in graph

Posted on

Question :

What will be the best algorithm to find isolated components in a graph? That is, components that do not lose information.



Answer :

It is difficult to understand what is being asked, since both your question and some of the sources I have consulted use different names for the same concept, and vice versa. I will define some terms, some in Portuguese and others in English, according to what I could understand:

    strongly connected component (“strongly connected components”): sets of vertices in which each has one path to the other. In your question, would be the 4 sets delineated ( {a,b,e} , {c,d} , {f,g} and {h} ).

  • sink sink ): In the case of vertices, they are vertices that have no edges coming out of it, at most entering; in the case of a set of vertices, sets in which no vertex has edges coming out. In your case, the {h} set is a sink. The vertex h , on the other hand, has an entering edge – the one that comes out of itself.

  • Isolated Set : A set of vertices that is both a font and a sink. In other words, sets that do not have edges going in or out – just between their own vertices. In your case, there are no non-trivial isolated sets (i.e. the empty set, and the set of all vertices).

If you already have a list of strongly connected components , and you want to know which of these are sinks, do as the @utluiz suggested in the comments: check all the vertices of a component if all the edges leaving only vertices in the component itself. If this is true, this component is a sink. If some vertex in the component leads to another outside the component, then the component is not a sink.

If your problem on the other hand is to find these strongly connected components , then it is more complicated. I will transcribe here (freely translating) the algorithm proposed in “Introduction to Algorithms” , but without going into too much detail because the content is quite extensive.


1. Chame DFS(G) para encontrar os tempos de término f[u] de cada vértice u
2. Faça a tranposição de G (Gt)
3. Chame DFS(Gt), mas no loop principal itere na ordem decrescente de f[u]
                                           (tal como computado no passo 1)
4. Os vértices de cada árvore na floresta do passo 3 são um strongly connected component

If you already have the end times f[u] (in your example figure, it’s the numbers inside each vertex, after the bar: d[u]/f[u] ), just call DFS in the transposed graph. I will not put the transposition algorithm here, as it should be trivial (just reverse the direction of all edges).

DFS is the in-depth search algorithm . d[u] and f[u] are byproducts of this algorithm, showing the order in which each vertex was visited ( d[u] is the moment when the vertex was found for the first time, and f[u] the moment when all its edges out were visited). I will not transcribe the algorithm of the same book, because the link above gives a more understandable implementation in C ++. I will only adapt it to include f[u] , since the original only includes d[u] (there called pre ).

/* Vamos supor que nossos digrafos têm no máximo maxV vértices. */
static int conta, d[maxV], f[maxV];

/* A função DIGRAPHdfs visita todos os vértices e todos os arcos do digrafo G.
   A função atribui um número de ordem d[x] a cada vértice x:  o k-ésimo vértice visitado
   recebe número de ordem k.  (Código inspirado no programa 18.3 de Sedgewick.) */
void DIGRAPHdfs( Digraph G) 
   Vertex v;
   conta = 0;
   for (v = 0; v < G->V; v++) 
      d[v] = -1;
   for (v = 0; v < G->V; v++)
      if (d[v] == -1) 
         dfsR( G, v);

/* A função dfsR supõe que o digrafo G é representado por uma matriz de adjacência.
  (Inspirado no programas 18.1 de Sedgewick.) */
void dfsR( Digraph G, Vertex v) 
   Vertex w;
   d[v] = conta++; 
   for (w = 0; w < G->V; w++)
      if (G->adj[v][w] != 0 && d[w] == -1)
         dfsR( G, w); 
   f[v] = conta++;

Alternative for graphs represented by an adjacency list (also adapted):

/* A função dfsR supõe que o digrafo G é representado por listas de adjacência.
  (Inspirado no programas 18.2 de Sedgewick.) */
void dfsR( Digraph G, Vertex v) 
   link a; 
   d[v] = conta++; 
   for (a = G->adj[v]; a != NULL; a = a->next)
      if (d[a->w] == -1) 
         dfsR( G, a->w); 
   f[v] = conta++; 

If step 4 of the main algorithm is not clear, the “forest” you are interested in is formed by all vertices visited in DIGRAPHdfs . That is, if the first vertex has d[u] = 0 and f[u] = N , all vertices that have d[u] > 0 e f[u] < N are part of this tree. The next tree would be the vertex with d[u] = N+1 and f[u] = M , and so on until the vertices end.

Note: According to the above source, the above algorithm is efficient (linear time in V + A ) and although it does not seem correct, the proof is in 4 pages of theorems that I will not transcribe here .


Leave a Reply

Your email address will not be published. Required fields are marked *