Data Structures : Graph Traversal (DFS)

#java #algorithms #beginners #programming

Depth First Search (DFS) is a graph traversal technique which is used to visit the nodes in a graph.

Start from one node and visit it's adjacent node and then it's adjacent node, keep doing this until you reach the last node and then start backtracking.
While backtracking visit the rest of the adjacent nodes in the same way.

Let's understand this with an example:

Consider this graph and Node 1 as the starting point.
Now as we can see node 1 has two adjacent nodes, 2 & 3 and we can visit any one we want. In this example we'll start with Node 2.
Again node 2 has two adjacent nodes and we'll pick node 5.
This is an important part of the algorithm because there are no adjacent nodes for 5 which means we can't move forward. At this point we start backtracking and move back to node 2.
We notice that node 2 has another node which we didn't visit yet so we'll do just that, we'll visit Node 6.
Again node 6 doesn't have any adjacent node so we'll backtrack to node 2 and since we have visited all the adjacent nodes of 2 we'll backtrack to node 1.
Using the same steps we'll visit all the remaining nodes i.e. 3, 4, 8 & 7.
Now you might ask that there is an edge between node 7 and 3 so do we visit it again? The answer is No because we've already visited node 3 on our way to node 7 and the same logic applies when we backtrack to node 3. We'll see how this is taken care of in the code.
So we've visited all the nodes in the graph and the path is {1, 2, 5, 6, 3, 4, 8, 7}.

Now it's time to code it up but before that I should explain the situation with already visited nodes. Since we have visited the particular node before we don't have to do that again. Now you might be wondering how do we keep track of which node we have visited and which one we haven't and the simple answer is an Array.

We simply create an array of length equal to the number of vertices in the graph and fill it with 0 or false depending on the array type you want. Now as we visit a node we simply mark that index in the array as visited by setting it as 1 or true.

In the next dfs call we will first check if the node we are on is visited or not by simply checking the value at that index in the array. We will look at the code to better understand this.

Steps to sum up the code:

Visited array and a list are created.
Call the dfs on starting node.
Set vis[node] = true and list.add(node).
For each loop on adjacency list and find the adjacent node.
Check if the adjacent node is visited, if not call dfs on it.
The path is stored in list.

Space Complexity: O(N) + O(N) + O(N) Adjacency list, Array, Auxiliary space.

Time Complexity: O(N) + (2*E) Recursion call for each node + Summation of degrees.

That's it for the DFS traversal. I hope this was helpful.
Thanks for reading.

Top comments (3)

Vincent A. Cicirello • Oct 13 '22

I'm assuming the N in your time and space complexities is number of vertexes (usually V is used for that).

With that assumption in mind, the space complexity is just O(N). You don't need the + to add multiple O(N) together. O(N) + O(N) = O(N). And likewise in the time complexity you can drop the 2*. It is just O(N + E).

Gaurav • Oct 14 '22

Yes you're correct about the space complexity, I did that just for easier understanding and adding all of them will give nearly O(N) but in time complexity we have to multiply edges by 2 which is summation of edges in case of an Undirected graph, 2 can be dropped in case of a directed graph.

Vincent A. Cicirello • Oct 14 '22

No. The 2 can be dropped in the undirected case as well. It doesn't matter that the dfs examines each edge twice since the adjacency lists include it twice. Constant coefficients can always be dropped in Big-O.