I took this course, Algorithms on Graphs by University of California San Diego & National Research University Higher School of Economics. It had wonderful explanations of basics of graphs.
Q.What is a Graph?
A Graph is a non-linear data structure consisting of nodes and edges. The nodes are sometimes also referred to as vertices and the edges are lines or arcs that connect any two nodes in the graph.
A graph data structure (V, E) consists of:
- A collection of vertices (V) or nodes.
- A collection of edges (E) or paths.
Use cases for Graph data structure:
- use a graph to find out whether two places on a road-map are connected and what is the shortest distance between them
- simulating electrical circuits to find out current flows and voltage drops at various points in the circuit.
- algorithms used in social media applications
Graph Terminology Used:
- Path: It is the sequence of vertices in which each pair of successive vertices is connected by an edge.
- Cycle: It is a path that starts and ends on the same vertex.
- Simple Path: It is a path that does not cross itself that is, no vertex is repeated (except the first and the last). Also, simple paths do not contain cycles.
- Length of a Path: It is the number of edges in the path. Sometimes it’s the sum of weights of the edges also in case of weighted graphs.
- Degree of a Vertex: It is the number of edges that are incident to the vertex.
- Indegree of a Vertex: The indegree of a vertex is the number of edges for which it is head i.e. the numbers edges coming to it.
- Outdegree of a vertex: The out degree of a vertex is the number of edges for which it is the tail i.e. the number of edges going out of it. A node whose outdegree is 0 is called a sink node.
- Adjacent vertices: If (Vi, Vj) is an edge in G, then we say that Vi and Vj are adjacent and the edge (Vi, Vj) is incident on Vi and Vj.
different types of graphs:
1. Undirected Graph:
A graph is an undirected graph if the pairs of vertices that make up the edges are unordered pairs.
i.e. an edge(Vi, Vj ) is the same as (Vj, Vi). The graph G1 shown above is an undirected graph.
2.Directed Graph:
In a directed graph, each edge is represented by a pair of ordered vertices.
i.e. an edge has a specific direction. In such a case, edge (Vi, Vj) ≠ (Vj, Vi)
3.Complete Graph:
If an undirected graph has n vertices, the maximum number of edges it can have is nC2 = n (n-1) / 2. If an undirected graph G has ‘n’ vertices and nC2 edges, it is called a complete graph.
If the graph G is directed and has n vertices, G is complete if it has n(n-1) edges.
4.Multigraph:
A multigraph is a graph in which the set of edges may have multiple occurrences of the same edge.
5.Cycle in graph?
Cycle: A cycle is a path whose first and last vertices are the same. For example, nodes abcd are cyclic.
Acyclic: A graph with no cycles is called an acyclic graph. A directed acyclic graph is called dag. For example, in second example there is no cycle
6.Connected Graph:
Two vertices Vi and Vj are said to be connected if there is a path in G from Vi to Vj.
1.Strongly Connected Graph: A directed graph G is said to be strongly connected if for every pair of distinct vertices Vi, Vj, there is a directed path from Vi to Vj and also from
Vj to Vi.
2.Weakly Connected Graph : A directed graph G is said to be weakly connected there exists at least one set of distinct vertices Vi, Vj, such that there is a directed path from Vi to Vj but no path from Vj to Vi.
7.Subgraph:
A subgraph of G is a graph G ′such that , V(G′) ⊆ V(G) and E(G′) ⊆ E(G)
8.Weighted Graph or Network:
A number (weight) may be associated with each edge of a graph. Such a graph is called a weighted graph or a network.
Example The number may represent the distance, cost, etc.
Ways to Represent graphs:
A.Edge List:
To list an edge, we have an array of two vertex numbers containing the vertex numbers of the vertices that the edges are incident on.
Consider the following graph example,
B. Adjacency Matrix:
The adjacency matrix A of a graph G is a two dimensional array of n × n elements where n is the numbers of vertices. The entries of A are defined as
- A[i][j] = 1, if there exist an edge between vertices i and j in G.
- A[i][j] = 0, if no edge exists between i and j.
- Undirected graph: The degree of vertex i is the number of 1’s in its row (or the sum of row i )
-
Directed graph
- (Indegree) : Total number of 1’s in column i.
- (Outdegree) : Number of 1’s in row i.
Code to implement Adjacency Matrix,
#include <iostream> #include <cstdlib> using namespace std; class Graph { protected: int value; int **graphm; public: Graph(int value) { this->value = value; graphm = new int*[value]; int k, j; for (k = 0; k < value; k++) { graphm[k] = new int[value]; for (j = 0; j < value; j++) { graphm[k][j] = 0; } } } void newEdge(int head, int end) { if (head > value || end > value || head < 0 || end < 0) { cout << "Invalid edge!\n"; } else { graphm[head - 1][end - 1] = 1; graphm[end - 1][head - 1] = 1; } } void display() { int i, p; for (i = 0; i < value; i++) { cout<<i+1<<"Row: \t"; for (p = 0; p < value; p++) { cout <<p+1<<"Column:" << graphm[i][p] << " "; } cout << endl; } } }; int main() { int vertex, numberOfEdges, i, head, end; cout << "Enter number of nodes: "; cin >> vertex; numberOfEdges = vertex * (vertex - 1); Graph g1(vertex); for (int i = 0; i < numberOfEdges; i++) { cout << "Enter edge ex.1 2 (-1 -1 to exit): \n"; cin >> head >> end; if ((head == -1) && (end == -1)) { break; } g1.newEdge(head, end); } cout<<"\n"; g1.display(); cout << endl; return 0; }
Consider the following graph example,
It uses arrays and hence is memory inefficient.
C.ADJACENCY LIST:
An adjacency list represents a graph as an array of linked lists.
The index of the array represents a vertex and each element in its linked list represents the other vertices that form an edge with the vertex.
- From the adjacency list we can calculate the outdegree of any vertex i. The total number of nodes in list ‘i’ is its outdegree.
- To obtain the indegree, we can construct an inverse adjacency list. The inverse adjacency list.
#include <iostream> #include <cstdlib> #include <list> #include <vector> using namespace std; int main() { int vertices, edges, v1, v2, weight; cout<<"Enter the Number of Vertices -\n"; cin>>vertices; edges = vertices * (vertices - 1); vector< list< pair<int, int> > > adjacencyList(vertices + 1); cout <<"Enter edges ex.V1 V2 W (-1 -1 -1 to exit): \n"; for (int i = 0; i < edges; ++i) { cin>>v1>>v2>>weight; if ((v1 == -1) && (v2 == -1) && (weight ==-1)) { break; } // Adding Edges to the Graph adjacencyList[v1].push_back(make_pair(v2, weight)); adjacencyList[v2].push_back(make_pair(v1, weight)); } cout<<"\nThe Adjacency List-\n"; // Printing Adjacency List for (int i = 1; i < adjacencyList.size(); ++i) { cout<<"AdjacencyList at "<< i<<"\t"; list< pair<int, int> >::iterator itr = adjacencyList[i].begin(); while (itr != adjacencyList[i].end()) { cout<<" -> "<<(*itr).first<<"("<<(*itr).second<<")"; ++itr; } cout<<"\n"; } return 0; }
Let us consider this example,
For more information,
- https://runestone.academy/runestone/books/published/cppds/Graphs/AnAdjacencyMatrix.html
- https://medium.com/the-programming-club-iit-indore/graphs-and-trees-using-c-stl-322e5779eef9
GRAPH TRAVERSALS:
1.DEPTH FIRST SEARCH:
Starting from vertex ‘v’, we follow a path in the graph as deeply as we can go marking the vertices in the path as ‘visited’. When there are no adjacent vertices that are not visited, we proceed backwards (back track) to a vertex in the path which has an ‘unvisited’ adjacent vertex and proceed from this vertex. The process continues till all vertices have been visited.
Algorithm: Depth-first search (Graph G, Souce_Vertex S)
- Create a stack STK to store the vertices.
- Push the source vertex S in the stack ‘STK’.
- While the stack STK is not empty
- Pop the vertex U from the top of the stack. i.e Vertex U = STK.top(), STK.pop()
- If the vertex U is not visited
- Explore the vertex U and mark U as visited.
- For every vertex V adjacent to vertex U
- Push the vertex V in the stack STK
The function for implementing it would be,
void Graph::depth_first_search(int start_vertex){ int current_v = start_vertex; this->discovered[current_v] = true; EdgeNode *p = this->edges[current_v]; process_vertex_early(current_v); while(p != NULL){ int neighbor_v = p->key; if(!this->discovered[neighbor_v]){ //this->discovered[neighbor_v] = true; this->parent[neighbor_v] = current_v; process_edge(current_v, neighbor_v); depth_first_search(neighbor_v); } //2nd case is needed because 1st case isn't enough to visit all edges (because 1st case sets all neighbors //to discovered which excludes some visits to edges later) else if(((!this->processed[neighbor_v]) && (this->parent[current_v] != neighbor_v)) || (this->directed)){ process_edge(current_v, neighbor_v); } p = p->next; } this->processed[current_v] = true; }
We can implement DFS using various other data structures for storing graph and for searching in it.
Applications of DFS:
- Minimum Spanning Tree
- Topological sorting
- To solve puzzle/maze which has one solution
For more information,
- https://brilliant.org/wiki/depth-first-search-dfs/
- https://algotree.org/algorithms/tree_graph_traversal/depth_first_search/
2.BREADTH FIRST SEARCH:
BFS is an algorithm for traversing an unweighted Graph or a Tree. BFS starts with the root node and explores each adjacent node before exploring node(s) at the next level.
BFS makes use of the adjacency list data structure to explore the nodes adjacent to the visited (current) node.
Algorithm: Breadth-first search (Graph G, Souce_Vertex S)
- Create a queue Q to store the vertices.
- Push the source vertex S in the queue Q.
- Mark S as visited.
- While the queue Q is not empty
- Remove vertex U from the front of the queue. i.e Vertex U = Q.front(), Q.pop()
- For every vertex V adjacent to the vertex U
- If the vertex V is not visited
- Explore the vertex V and mark V as visited.
- Push the vertex V in the queue Q.
The function for implementing it would be,
void Graph::breadth_first_search(int start_vertex){ initialize_search(); queue<int> q;//queue of vertices still left to visit int current_v; //current vertex EdgeNode *p; //temporary pointer q.push(start_vertex); this->discovered[start_vertex] = true; while(!q.empty()){ current_v = q.front(); q.pop(); process_vertex_early(current_v); p = this->edges[current_v]; while(p != NULL){ int neighbor_v = p->key; if(!this->discovered[neighbor_v]){ //add to Q and set to discovered q.push(neighbor_v); this->discovered[neighbor_v] = true; this->parent[neighbor_v] = current_v; } if(!this->processed[neighbor_v] || this->directed){ process_edge(current_v, neighbor_v); } p = p->next; } this->processed[current_v] = true; } initialize_search(); }
We can implement BFS using various other data structures for storing graph and for searching in it.
Application of BFS :
- Minimum Spanning Tree or shortest path
- Peer to peer networking
- Crawler for search engine
- Social network websites - to find people with a given distance k from a person using BFS till k levels.
For more information,
- https://algotree.org/algorithms/tree_graph_traversal/breadth_first_search/
- https://brilliant.org/wiki/breadth-first-search-bfs/
Difference between DFS and BFS
For more information,
- https://stackoverflow.com/questions/687731/breadth-first-vs-depth-first
- https://visualgo.net/en/mst?slide=1
SPANNING TREE:
If G is a graph containing n vertices, then the minimum number of edges needed to connect the n vertices = n-1. All connected graphs with n-1 edges are trees. These are called spanning trees.
Definition of Spanning Tree: Let G = (V, E) be an undirected connected graph. A subgraph t = (V, E′) of G is a spanning tree of G iff t is a tree.
A minimum spanning tree is a spanning tree with the lowest number of edges.
MINIMUM COST SPANNING TREE: The spanning tree having the minimum sum of weights of edges is called minimum cost spanning tree.
These weights may represent the lengths, distances, cost, etc.
Such trees are widely used in practical applications such as network of road lines between cities, etc.
For example.
Algorithms to find minimum spanning tree:
1. PRIM’S ALGORITHM:
A few key points to remember about Prim algorithm,
- Prim’s algorithm is a greedy algorithm.
- It finds a minimum spanning tree for a weighted undirected graph.
- This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized.
- The algorithm operates by building this tree one vertex at a time, from an arbitrary starting vertex, at each step adding the cheapest possible connection from the tree to another vertex.
Let's consider this graph,
All the steps to find Minimum spanning tree using prim's algorithm,
The end result of minimum spanning tree should look like,
void find_MST(int V) { int parent[V], key[V]; bool visited[V]; // Initialize all the arrays for (int i = 0; i< V; i++) { key[i] = 999; // 99 represents an Infinite value visited[i] = false; parent[i]=-1; } key[0] = 0; // Include first vertex in MST by setting its key vaue to 0. parent[0] = -1; // First node is always root of MST // The MST will have maximum V-1 vertices for (int x = 0; x < V - 1; x++) { // Finding the minimum key vertex from the //set of vertices not yet included in MST int u = min_Key(key, visited,V); visited[u] = true; // Add the minimum key vertex to the MST // Update key and parent arrays for (int v = 0; v < V; v++) { // cost[u][v] is non zero only for adjacent vertices of u // visited[v] is false for vertices not yet included in MST // key[] gets updated only if cost[u][v] is smaller than key[v] if (cost[u][v]!=0 && visited[v] == false && cost[u][v] < key[v]) { parent[v] = u; key[v] = cost[u][v]; } } } // print the final MST print_MST(parent,V); }
2.KRUSKAL’S ALGORITHM:
A few key points to remember about Kruskal algorithm,
- Kruskal’s algorithm for finding the Minimum Spanning Tree(MST), which finds an edge of the least possible weight that connects any two trees in the forest
- It is a greedy algorithm.
- It finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized.
- If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).
- Number of edges in MST: V-1 (V – no of vertices in Graph).
Let's consider this graph,
All the steps to find Minimum spanning tree using kruskal's algorithm,
The end result of minimum spanning tree should look like,
void kruskalMST(int V) { int mincost = 0; // Cost of min MST. // Initialize sets of disjoint sets. for (int i = 0; i < V; i++) parent[i] = i; // Include minimum weight edges one by one int edge_count = 0; while (edge_count < V - 1) { int min = __INT_MAX__, a = -1, b = -1; for (int i = 0; i < V; i++) { for (int j = 0; j < V; j++) { if (find(i) != find(j) && cost[i][j] < min) { min = cost[i][j]; a = i; b = j; } } } union1(a, b); // printf("Edge %d:(%d, %d) cost:%d \n",edge_count++, a, b, min); cout<<"Edge"<<edge_count++<<"("<<a<<","<<b<<")"<<"\t cost:"<<min<<"\n"; mincost += min; } cout<<"\n Minimum cost="<<mincost<<"\n"; }
Difference between Kruskal's and Prim's Algorithm
Applications of finding minimum spanning tree:
- Network design
- Traveling salesman problem
SHORTEST PATH
A person wishing to travel from city A to city B could require the following information.
- Is there a path from A to B?
- If there is more than one path from A to B, which is the shortest?
DIJKSTRA IS A “SINGLE SOURCE ALL DESTINATIONS” ALGORITHM
- Dijkstra’s algorithm finds the shortest path in a weighted graph containing only positive edge weights from a single source.
- It uses a priority based dictionary or a queue to select a node / vertex nearest to the source that has not been edge relaxed.
- Since Dijkstra’s algorithm cannot handle negative edge weights, Bellman-Ford algorithm is used for finding the shortest path in a graph containing negative edge weights
Algorithm: Dijkstra’s Shortest Path
- Start with the empty Shortest Path Tree (SPT).
- Maintain a set SPT[] to keep track to vertices included in SPT.
- Assign a distance value to all the vertices, (say distance []) and initialize all the distances with +∞ (Infinity) except the source vertex. This will be used to keep track of the distance of vertices from the source vertex. The distance of source vertex to source vertex will be 0.
- Repeat the following steps until all vertices are processed.
- Pick the vertex u which is not in SPT[] and has minimum distance. Here we will loop through the vertices and find the vertex with minimum distance.
- Add vertex u to SPT[].
- Loop over all the adjacent vertices
- For adjacent vertex v, if v is not in SPT[] and distance[v] > distance[u] + edge u-v weight then update distance[v] = distance[u] + edge u-v weight
Let's consider the graph,
Then Dijkstra algorithm using a matrix results,
void dijkstra(int graph[10][10], int src, int V) { int dist[V]; bool sptSet[V]; for (int i = 0; i < V; i++) dist[i] = INT_MAX, sptSet[i] = false; dist[src] = 0; for (int count = 0; count < V-1; count++) { int u = minDistance(dist, sptSet, V); sptSet[u] = true; for (int v = 0; v < V; v++) if (!sptSet[v] && graph[u][v] && dist[u] != INT_MAX && dist[u]+graph[u][v] < dist[v]) dist[v] = dist[u] + graph[u][v]; } printSolution(dist, V); }
2.Limitation of Dijkstra algorithm:
Dijkstra’s algorithm relies on the fact that the shortest path from s to t goes only through vertices that are closer to s. This is no longer the case for graphs with negative edges, One of the use cases of this is currency cycle with a negative cycle.
3.Bellman-Ford Algorithm:
Bellman-Ford algorithm finds the shortest path (in terms of distance/cost) from a single source in a directed, weighted graph containing positive and negative edge weights.
Algorithm for Bellman Ford,
- Initialize the distance from the source node S to all other nodes as infinite (999999999) and to itself as 0.
- For every node in the graph do
- For every edge E in the EdgeList do
- Node_u = E.first, Node_v = E.second
- Weight_u_v = EdgeWeight ( Node_u, Node_v )
- If ( Distance [v] > Distance [u] + Weight_u_v )
- Distance [v] = Distance [u] + Weight_u_v
In a graph with only positive edge weights, Dijkstra’s algorithm with a priority queue / set implementation runs faster in O ((E+V) log V) than Bellman-Ford O (E.V).
TOPOLOGICAL SORT:
AOV network: A directed graph in which the vertices represent activities or tasks and the edges represent the precedence is called an AOV network. There should be no cycles in an AOV network i.e there should be atleast one activity which does not have a predecessor(start activity).
Topological sort: If we have an AOV network, then we would like to find out whether the project is feasible or not and if it is feasible, then in what order should the activities be performed so that the project can be completed. The process of converting a set of precedences represented by an AOV network into a linear list in which no later activity precedes an earlier one is called Topological Sorting. This method identifies the order of the activities.
Let us consider this graph,
All the steps to perform for topological sort,
The topological order is :V1, V3, V2, V4, V5
topologicalSort() { stack<int> Stack; // Mark all the vertices as not visited bool* visited = new bool[V]; for (int i = 0; i < V; i++) visited[i] = false; // Call the recursive helper function to store Topological // Sort starting from all vertices one by one for (int i = 0; i < V; i++) if (visited[i] == false) topologicalSortUtil(i, visited, Stack); // Print contents of stack while (Stack.empty() == false) { cout << Stack.top() << " "; Stack.pop(); } }
Applications of Topological sort:
- Scheduling jobs from given dependencies among Jobs. F
- Instruction Scheduling
- Determining the order of compilation tasks to perform in makefiles, data serializations and resolving symbol dependencies in linkers.
References
- Research papers of Daniel Kane
- Research papers from Princeton
- Research papers from the University of Washington
- Reference from AlgoTree, Khan Academy and Stackoverflow
- GitHub repository for code
GitHub repository for code, https://github.com/kakabisht/Data-structures
I have used a lot of images from a lot of sources, and i don't own images used. I just found them online, and used them to explain concepts. As, images are far better for understanding concepts.
Top comments (0)