Tamerlan Gudabayev

Posted on May 23, 2021 • Edited on Oct 20, 2021

Data Structures: Graphs

#beginners #programming #computerscience #algorithms

Introduction

Graphs.

Everyone talks about them.

They always ask them in interviews.

But you don't know them.

It doesn't have to be this way, that's why in this week we are tackling the graph data structure.

Learning Outcomes

Today, you will learn:

What is a graph.
Where are graphs used.
Types of Graphs
How to implement a graph
How to traverse a graph

What is a graph?
- Essential Terminology
Types of Graphs
- Directed Graph
- Simple Graph
Weighted Graphs
Applications
Types of Implementations
Code Implementation
Traversing a Graph
- Breadth First Search
- Depth First Search
Conclusion
Additional References

What is a graph?

A graph is a non-linear data structure that helps us describe entities and it's relationships to other entities.

For example, you may use the graph data structure to represent friends in a social media website.

Graphs are very useful because they can describe a great deal of problems we have in the real world.

But before we move on, we have to break down the graph even more.

Essential Terminology

A graph (V,E) is made out of two parts:

Collection of Vertices (V): Your basic node (i.e user, place, event, etc...)
Collection of Edges (E): They represent the relationships between nodes. In mathematical terms, they can be represented by coordinates of two nodes (u,v).

We can represent the graph above by:

V = {0, 1, 2, 3, 4}
E = {
     (0,1), (0,4), (1,0), 
         (1,2), (1,3), (1,4),
     (2,1), (2,3),
     (3,1), (3,2), (3,4),
     (4,0), (4,1), (4,3),
    }
G = {V, E} // A graph is a collection of vertices and edges

Types of Graphs

While researching, I've come across many different types of graphs, but to save you time I've narrowed it down to two basic types:

Directed Graph
Simple Graph

Directed Graphs

A graph where all edges are directed from one vertex to another vertex is called a directed graph.

The connections are not bidirectional, meaning that if we have a connection from A to B, that doesn't mean we have a connection from B to A.

But you might ask "Why will we use something like this?"

Well, there are many use cases for directed graphs, here are a few:

Twitter: When you follow someone, that's a direct connection, the other person doesn't need to follow you.
Flights: You might be able to fly from Paris to London, but there's no guarantee that you can fly from London to Paris.
Road Networks: Some roads are one-way, others are two-way, you get the gist.

Simple Graphs

Graphs that are always two-way connections are called simple graphs. Our first example (at the beginning of the post) was a simple graph (also called undirected graphs).

PS. Through out this post, whenever I say graphs, I'm gonna mean simple graphs.

Weighted Graphs

So far we assume that every edge is equal to each other. Unfortunately, life doesn't work that way, the distance between France and Germany is shorter than France and the USA. This is why we add a weight to each edge. A graph with weighted edges is called a weighted graph.

Applications

Graphs are very useful, they can represent many real-life models.

Here are a few notable use cases:

Social Graphs: This helps showcase the relationship between you and the people, places and thing you interact with online.
Knowledge Graphs: This is notably used at Google, to connect data to each other. For example, the painting "Mona Lisa" is associated to "Leonardo Davinci", in which he is associated to his other painting, and the rest of his information.
Recommendation Engines: Graphs helps us build recommendation engines, for example in Netflix, if you like "The Witcher", you will be recommended other shows by people who also liked "The Witcher". Everything is connected to each other.
Path Optimization Algorithms: What's the best way to get from Tokyo to London, or from your local grocery store to your nearest home. These types of questions can be modelled and solved using graphs. This is already being used in most map applications such as Google Maps.

Types of Implementations

Now that we have the theory covered, let us move into more practical stuff.

There are three main ways to implement graphs:

Adjacency Matrix
Adjacency Lists
Adjacency Sets

Let us cover each implementation with it's pros and cons.

Adjacency Matrix

In adjacency matrix, we represent the graph in a two-dimensional array of V x V vertices.

If any element of graph[i][j] is 1, this means that there is an edge connecting vertex i with vertex j.

This approach if very fast for look-ups. Because we can instantly get the value of graph[i][j]

But it's also very heavy on memory because we have to reserve a space in memory for every possible edge (V x V).

Adjacency List

Adjacency list represents graphs as an array of linked lists.

The index of the array represent the vertex and each element in the linked list represents the vertices it connects to, making an edge.

This approach saves us a lot of space, but it's slower than adjacency matrix on hookups.

Adjacency Set

This is a new alternative to adjacency lists, instead of having an array of linked lists, have it be an array of sets.

But why?

Well the reason is that sets are ordered and do not allow for duplicates.

We can use binary search, making our search times O(log n).

Code Implementation

For this example, we are gonna use the adjacency list implementation because it's the most common.

We will also use python because I like it.

# A class to represent the adjacency list of the node
class AdjNode:
    def __init__(self, data):
        self.vertex = data
        self.next = None

# A class to represent a graph. A graph
# is the list of the adjacency lists.
# Size of the array will be the no. of the
# vertices "V"
class Graph:
    def __init__(self, vertices):
        self.V = vertices
        self.graph = [None] * self.V

    # Function to add an edge in an undirected graph
    def add_edge(self, src, dest):
        # Adding the node to the source node
        node = AdjNode(dest)
        node.next = self.graph[src]
        self.graph[src] = node

        # Adding the source node to the destination as
        # it is the undirected graph
        node = AdjNode(src)
        node.next = self.graph[dest]
        self.graph[dest] = node

    # Function to print the graph
    def print_graph(self):
        for i in range(self.V):
            print("Adjacency list of vertex {}\n head".format(i), end="")
            temp = self.graph[i]
            while temp:
                print(" -> {}".format(temp.vertex), end="")
                temp = temp.next
            print(" \n")

# Driver program to the above graph class
if __name__ == "__main__":
    V = 5
    graph = Graph(V)
    graph.add_edge(0, 1)
    graph.add_edge(0, 4)
    graph.add_edge(1, 2)
    graph.add_edge(1, 3)
    graph.add_edge(1, 4)
    graph.add_edge(2, 3)
    graph.add_edge(3, 4)

    graph.print_graph()

Traversing a Graph

Most common interview questions that include graph will include some sort of traversal of it's very useful to know how to traverse one.

There are two main algorithms:

Breadth First Search
Depth First Search

Breadth First Search

Breadth-First Search (BFS) is a vertex-based technique for finding the shortest path. In a nutshell it everything in the topmost layer before moving down.

Here's the code:

def BFS(self, s):

    # Mark all the vertices as not visited
    visited = [False] * (max(self.graph) + 1)

    # Create a queue for BFS
    queue = []

    # Mark the source node as
    # visited and enqueue it
    queue.append(s)
    visited[s] = True

    while queue:

        # Dequeue a vertex from
        # queue and print it
        s = queue.pop(0)
        print (s, end = " ")

        # Get all adjacent vertices of the
        # dequeued vertex s. If a adjacent
        # has not been visited, then mark it
        # visited and enqueue it
        for i in self.graph[s]:
            if visited[i] == False:
                queue.append(i)
                visited[i] = True

Depth First Search

Depth-First Search (DFS) is an edge-based technique to find the shortest path. It uses the Stack data structure, performs two stages, first visited vertices are pushed into the stack and second if there is no vertices then visited vertices are popped.

        # A function used by DFS
    def DFSUtil(self, v, visited):

        # Mark the current node as visited
        # and print it
        visited.add(v)
        print(v, end=' ')

        # Recur for all the vertices
        # adjacent to this vertex
        for neighbour in self.graph[v]:
            if neighbour not in visited:
                self.DFSUtil(neighbour, visited)

    # The function to do DFS traversal. It uses
    # recursive DFSUtil()
    def DFS(self, v):

        # Create a set to store visited vertices
        visited = set()

        # Call the recursive helper function
        # to print DFS traversal
        self.DFSUtil(v, visited)

Conclusion

We covered a lot today, and I hope you got something new out of it. This is the ninth post in our data structures series, time flies by. Anyways, that we have covered most of the main data structures, it's time to move onto algorithms. So stay tuned, and I hope you have a nice day.

DEV Community