While practicing some competitive programming, I came across an interesting subsection of Graph Theory - Strongly Connected Components (SCC).
In a directed graph, strongly connected components are nodes that have edges such that each node in the SCC is able to have a path to every other node in the SCC. To help you find these SCCs one can use something called Tarjan's Algorithm!
Tarjan's algorithm is an algorithm that can find SCCs in a directed graph with one DFS pass O(V+E) time and using just a stack O(V) space, where V is the number of nodes and E is the number of edges.
The way it works is that it keeps track of two lists discovery
and low_link
that are the length of the number of nodes. Tarjan's algorithm works correctly because it makes use of a stack to remember what nodes are counted in the SCC. Without this it is possible to run the DFS on the graph in a order such that you locate less SCCs than what actually is on the graph.
The stack is used to keep track of nodes that are already included in the SCC so far so if you encounter a node that is on the stack, it must be part of the current SCC.
A high level overview of the algorithm is as follows:
Traverse through all the nodes with a for loop, running DFS on the node if it is not visited.
As you visit nodes mark the node as visited, and have a counter that marks each node with ids and sets that node's discovery
and low_link
equal to the node id. Also, push the node onto the stack and remember that it is on the stack.
This algorithm uses the idea that a node's discovery time and low_link value is equal to the order that it got visited as this order is how the algorithm finds the lowest common ancestor.
Loop through all of the node's neighbors if and run DFS on the unvisited nodes.
After coming back from the recursion, set the low_link
value of the current node equal to the min between itself and the neighbor.
Also, check if the neighbor is on the stack and if it is, it means it is part of the SCC. Set the low_link
value to the min between itself and the discovery time of the neighbor.
After checking all neighbors, check if the current node's discovery
value is equal to the low_link
value as this means its the root node to the SCC. Pop of all the nodes up to and including that current node.
At the end, your low_link
list should have all of the SCCs!
Theoretical Properties
Apparently, this algorithm works off the fact that SCCs are maximal in a way, meaning if you find an SCC, it is guarenteed that there won't be another SCC within it. Another useful property of the algorithm is that no SCC will be identified before any of its successors.
Implementation
Here is my implementation of Tarjan's Algorithm!
class TarjansGraph {
public:
inline static int UNVISITED = -1;
vector<int> *adj_list;
int n;
TarjansGraph(int size) {
this -> n = size;
adj_list = new vector<int>[size];
}
void addEdge(int u, int v) {
adj_list[u].push_back(v);
}
void dfs(int at, vector<int> &ids, vector<int> &low_list, vector<bool> &onStack, stack<int> &myStack) {
ids[at] = at;
low_list[at] = at;
onStack[at] = true;
myStack.push(at);
for(auto iter : adj_list[at]) {
if(ids[iter] == UNVISITED){
dfs(iter, ids, low_list, onStack, myStack);
low_list[at] = min(low_list[at], low_list[iter]);
}
else if (onStack[iter]) {
low_list[at] = min(low_list[at], ids[iter]);
}
}
if (ids[at] == low_list[at]) {
while(myStack.top() != at){
int stack_node = myStack.top();
cout << stack_node << " ";
onStack[stack_node] = false;
myStack.pop();
}
int at_stack_node = myStack.top();
cout << at_stack_node << endl;
onStack[at_stack_node] = false;
myStack.pop();
}
}
void findSCCs() {
vector<int> ids(this -> n, TarjansGraph::UNVISITED);
vector<int> low_list(this -> n, 0);
vector<bool> onStack(this -> n, false);
stack<int> myStack;
for(int i = 0; i < this -> n; i++){
if (ids[i] == TarjansGraph::UNVISITED) {
dfs(i, ids, low_list, onStack, myStack);
}
}
cout << "\n\n\n LOW LIST VALUES: ";
for (int i = 0; i < this -> n; i++){
cout << low_list[i] << " ";
}
cout << endl;
}
};
Top comments (0)