1 month

# Graph Data Structure And Algorithms

Graph theory may be said to have its beginning in 1736 when EULER considered the (general case of the) Königsberg bridge problem: Does there exist a walk crossing each of the seven bridges of Königsberg exactly once? The Königsberg Bridge Problem was solved by the Swiss mathematician Leonhard Euler (pronounced ‘Oiler ’) in 1736. (Solutio Problematis ad geometriam situs pertinentis, Commentarii Academiae Scientiarum Imperialis Petropolitanae 8 (1736), pp. 128-140.) It took 200 years before the first book on graph theory was written. This was “Theorie der endlichen und unendlichen Graphen” ( Teubner, Leipzig, 1936) by KÖNIG in 1936. Since then graph theory has developed into an extensive and popular branch of mathematics, which has been applied to many problems in mathematics, computer science, and other scientific and not-so-scientific areas. For the history of early graph theory, see N.L. BIGGS, R.J. LLOYD, AND R.J. WILSON, “Graph Theory 1736 – 1936”, Clarendon Press, 1986. There are no standard notations for graph theoretical objects. This is natural because the names one uses for the objects reflect the applications. Thus, for instance, if we consider a communications network (say, for email) as a graph, then the computers taking part in this network, are called nodes rather than vertices or points. On the other hand, other names are used for molecular structures in chemistry, flow charts in programming, human relations in social sciences, and so on.

What is a graph?

A graph is a pictorial representation of a set of objects where some pairs of objects are connected by links. The interconnected objects are represented by points termed as vertices, and the links that connect the vertices are called edges.

A graph G = (V, E) consists of a set V of vertices (also called nodes) and a set E of edges.

If an edge connects to a vertex we say the edge is incident to the vertex and say the vertex is an endpoint of the edge.

If an edge has only one endpoint then it is called a loop edge.

If two or more edges have the same endpoints then they are called multiple or parallel edges.

Two vertices that are joined by an edge are called adjacent vertices.

A pendant vertex is a vertex that is connected to exactly one other vertex by a single edge.

A directed graph is a graph in which the edges may only be traversed in one direction. Edges in a simple directed graph may be specified by an ordered pair (vi , vj) of the two vertices that the edge connects. We say that vi is adjacent to vj and vj is adjacent from vi .

The degree of a vertex is the number of edges incident to the vertex and is denoted deg(v).

In a directed graph, the in-degree of a vertex is the number of edges incident to the vertex and the out-degree of a vertex is the number of edges incident from the vertex.

A graph H is a subgraph of a graph G if all vertices and edges in H are also in G.

A connected component of G is a connected subgraph H of G such that no other connected subgraph of G contains H.

A graph is called Eulerian if it contains an Eulerian circuit.

A tree is a connected, simple graph that has no cycles. Vertices of degree 1 in a tree are called the leaves of the tree.

A weighted graph is a graph G = (V, E) along with a function w : E → R that associates a numerical weight to each edge. If G is a weighted graph, then T is a minimal spanning tree of G if it is a spanning tree and no other spanning tree of G has a smaller total weight.

A graph is called bipartite if its set of nodes can be partitioned into two disjoint sets S1 and S2 so that every edge in the graph has one endpoint in S1 and one endpoint in S2.

The complete bipartite graph on n, m nodes, denoted Kn,m, is the simple bipartite graph with nodes S1 = {a1, . . . , an} and S2 = {b1, . . . , bm} and with edges connecting each node in S1 to every node in S2.

Simple graphs G and H are called isomorphic if there is a bijection f from the nodes of G to the nodes of H such that {v,w} is an edge in G if and only if {f (v), f (w)} is an edge of H. The function f is called an isomorphism.

A simple, connected graph is called planar if there is a way to draw it on a plane so that no edges cross. Such a drawing is called an embedding of the graph in the plane.

For a planar graph G embedded in the plane, a face of the graph is a region of the plane created by the drawing. The area of the plane outside the graph is also a face, called the unbounded face.

## What are graph algorithms?

Graph algorithms, having a non-linear data structure of edges and nodes,  are a set of instructions that traverse (visits nodes of) a graph and find specific nodes, paths, or a path between two nodes. The edges are arcs or lines that connect any two nodes in a graph.

These data structures have a lot of practical applications. For instance, Facebook’s Graph API is a perfect example of the application of graphs to real-life problems. Everything is a vertice or an edge on the Graph API. A vertice is anything that has some characteristic properties and can store data. The vertices of Facebook Graph API are Pages, Places, Users, Events, Comments, etc. On the other hand, every connection is an edge. Examples of Graph API edges are a user posting a comment, photo, video, etc.

The common graph representations are:  * Adjacency matrix, * Adjacency list, * Hash table of hash tables

Each node can have an edge with every other N-1 node in an undirected graph. Thus, the total number of edges possible will be N*(N-1), but here, in the undirected graph, the two edges between two nodes will be the same. So we need to remove one of them. Therefore the total number of edges possible is N * (N - 1)/2.

You can manage the graph data structures using the common operations mentioned below.

• contains - It checks if a graph has a certain value.
• addNode - It adds vertices to the graphs.
• removeNode - It removes the vertices from the graphs.
• hasEdge - It checks if a path or a connection exists between any two vertices in a graph.
• addEdge - It adds paths or links between vertices in graphs.
• removeEdge - It removes the paths or connections between vertices in a graph.

The following are some of the most common graph algorithms:

### 1. Breadth-First Search (BFS):

It is a graph traversal algorithm that traverses the graph from the nearest node (root node) and explores all unexplored (neighboring) nodes. You can consider any node in the graph as a root node when using BFS for traversal.

You can think of BFS as a recursive algorithm that searches all the vertices of a graph or tree data structure. It puts every vertex of the graph into the following categories.

• Visited
• Non-visited

The breadth-first search uses Queue data structure to find the shortest path in a given graph and makes sure that every node is visited not more than once.

import collections
def bfs(graph, root):
visited, queue = set(), collections.deque([root])
while queue:
# Dequeuing a vertex from queue
vertex = queue.popleft()
print(str(vertex) + " ", end="")
# If not visited, marking it as visited, and
# enqueuing it
for neighbor in graph[vertex]:
if neighbor not in visited:
queue.append(neighbor)

#### Applications of the BFS algorithm

• BFS may be used to discover the locations nearest to a specific origin point.
• The BFS algorithm is used in peer-to-peer networks as a search technique to discover all neighboring nodes. uTorrent, BitTorrent, and other similar torrent clients use this method to look for “peers” and “seeds” in the network.
• BFS is used in web crawlers to build web page indexes. It starts at the source page and works its way through all of the links connected with it. Every web page is treated as a node in the network graph.

2. Depth First Search (DFS):

The depth-first search (DFS) algorithm is a graph-traversing algorithm that works its way down the graph data structure beginning from the root node through to the adjacent nodes. The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It involves exhausting searches of all nodes by moving ahead if possible until there are no more nodes to explore in the current path, at which time it begins backtracking. The algorithm will follow the current route until all of the unvisited nodes have been visited, at which point a new path will be chosen.

The Depth-first search algorithm uses Stack data structure. It traverses from an arbitrary node, marks the node, and moves to the adjacent unmarked node.

def dfs(graph, start, visited=None):
if visited is None:
visited = set()
print(start)
for next in graph[start] - visited:
dfs(graph, next, visited)

return visited
graph = {'0': set(['1', '2']),
'1': set(['0', '3', '4']),  '2': set(['0']),  '3': set(['1']), '4': set(['2', '3'])}
dfs(graph, '0')

#### Applications of DFS algorithm

• Depth-first search is employed in all of these situations: in scheduling problems, cycle detection in graphs, topological sorting, and finding solutions for puzzles that have only one solution, e.g., sudoku and mazes.
• The depth-first search method is used in network analysis, for example, to test if a graph is bipartite. It is frequently employed as a subroutine in the Ford-Fulkerson algorithm and other network flow algorithms.
• The depth-first search is frequently employed as a subroutine in more complicated algorithms. For instance, Hopcroft–Karp incorporates DFS to assist in discovering a matching in a graph.
• DFS algorithm is used in scheduling, finding spanning trees, and mapping routes.

3. Dijkstra Algorithm:

Dijkstra algorithm is a shortest path algorithm that computes the shortest path between the source node (given node) and all other neighboring nodes in a given graph. It uses the edges’ weights to find a path that minimizes the weight (total distance) between the source node and all other nodes. The most commonly used data structure for the Dijkstra algorithm is Minimum Priority Queue. It is because the operations of this algorithm match with the specialty of a minimum priority queue. The minimum priority queue is a data structure that manages a list of values (keys) and prioritizes the elements with minimum value. It supports the operations like getMin(), extractMin(), insert(element) etc.

#### Applications of Dijkstra’s algorithm

• Dijkstra’s algorithm is used in network routing protocols, such as RIP, OSPF, and BGP, to calculate the best route between two nodes.
• It is used in algorithms for solving the shortest path problem, such as the A* algorithm.
• The Bellman-Ford algorithm uses Dijkstra’s algorithm to find the shortest path from a source node to all other nodes in a graph.
• Dijkstra’s algorithm is used in many artificial intelligence applications, such as game playing and search engines.

4. Bellman-Ford Algorithm:

Like Dijkstra’s algorithm but slower, it is also a single-source shortest path algorithm. It computes the shortest distance from a single vertex to all other vertices in a weighted graph. Bellman ford’s algorithm guarantees the correct answer even if the weighted graph has negatively weighted edges. However, the Dijkstra algorithm can not guarantee an accurate result in the case of negative edge weights.

class Graph:
def __init__(self, vertices):
self.V = vertices # Vertices in the graph
self.graph = [] # Array of edges
def add_edge(self, s, d, w):
self.graph.append([s, d, w])
# Printing the solution
def print_solution(self, dist):
print("Vertex Distance from Source")
for i in range(self.V):
print("{0}\t\t{1}".format(i, dist[i]))
def bellman_ford(self, src):
# Filling the distance array and predecessor array
dist = [float("Inf")] * self.V
# Marking the source vertex
dist[src] = 0
# Relaxing edges |V| - 1 times
for _ in range(self.V - 1):
for s, d, w in self.graph:
if dist[s] != float("Inf") and dist[s] + w < dist[d]:
dist[d] = dist[s] + w
# Step 3: Detecting negative cycle in a graph
for s, d, w in self.graph:
if dist[s] != float("Inf") and dist[s] + w < dist[d]:
print("Graph contains negative weight cycle")
return
# No negative weight cycle found!

# Printing the distance and predecessor array
self.print_solution(dist)
g = Graph(5)
g.bellman_ford(0)

#### Applications of Bellman-Ford

• It is used in distance-vector routing protocols, e.g., in Routing Information Protocols (RIPs).
• It is used in the Maximum Weighted Matching problem, a problem in graph theory.
• It is used in the All-Pairs Shortest Paths problem.
• In chemical reactions computing the smallest possible heat loss/gain.
• Identifying the currency conversion method that’d be most efficient

5. Floyd Warshall Algorithm:

Other names for the Floyd Warshall algorithm are the Roy-Warshall algorithm and the Roy-Floyd algorithm. The Floyd–Warshall algorithm is a graph theory algorithm used to find the shortest path between all pairs of vertices in a graph. The algorithm works by constructing a table of shortest paths from each vertex to every other vertex in the graph. Like the Dijkstra’s and Bellman-Ford algorithms, it calculates the shortest path in a graph. However, unlike the two algorithms that use only one source to compute the shortest distance, the Floyd–Warshall algorithm calculates the shortest distances for all pairs of vertices in a graph.

vertices = 4
INF = 999
# implementing the floyd-warshall algorithm
def floyd_warshall(Graph):
distance = list(map(lambda a: list(map(lambda b: b, a)), Graph))
# Adding the vertices individually
for k in range(vertices):
for a in range(vertices):
for b in range(vertices):
distance[a][b] = min(distance[a][b], distance[a][k] + distance[k][b])
solution(distance)
# Printing the desired solution
def solution(distance):
for a in range(vertices):
for b in range(vertices):
if(distance[a][b] == INF):
print("INF", end=" ")
else:
print(distance[a][b], end=" ")
print(" ")
Graph = [[0, 3, INF, 5], [2, 0, INF, 4], [INF, 1, 0, INF], [INF, INF, 2, 0]]
floyd_warshall(Graph)

#### Applications of Floyd–Warshall

• Computer science - Floyd–Warshall can be used to find the best path between two vertices in a graph.
• Networks - Is used to find the shortest path between two points in a network.
• Optimal routing - Finds the path with the maximum flow between two vertices.
• Pathfinder networks - Can be used for fast computations of Pathfinder networks.

6. Greedy algorithm

A greedy algorithm is an algorithm that always chooses the best possible option at each step without considering future steps. The algorithm typically starts with a small solution and then improves it by making local changes that do not affect the global optimum.

#### Applications of greedy algorithms

• Optimizing search results by ordering them according to some heuristic.
• Scheduling tasks for a limited number of machines so that the most important tasks are completed first.
• Choosing a subset of items from a larger set so that the resulting set has the largest possible value.
• Packing items into a container in a way that minimizes transportation costs.

7. Prim’s Algorithm

Prim’s algorithm is a graph theory algorithm used to find the shortest path between a given source vertex and all other vertices in a graph. The algorithm works by constructing a table of distances from the source vertex to all other vertices in the graph. The shortest path from the source vertex to any other vertex is then found by looking at the minimum distance in the table.

• Spanning Tree - It is the subgraph of an undirected connected graph.
• Minimum Spanning Tree - It is the spanning tree with the minimum sum of the weights of the edges.

#### Applications of Prim’s algorithm

• In scenarios where finding the shortest path would lower costs, e.g., landing cables, LAN networks, electric grid, natural gas pipes, drinking water pipe network, etc.
• In single-link clusters to find the pair of elements closest to each other.

8. Kruskal’s Algorithm

It finds the minimum spanning tree for a connected weighted graph. Its main objective is to find the subset of the edges through which we can traverse every graph’s vertex. This algorithm uses the greedy approach to find the optimum solution at every stage rather than focusing on a global optimum.

Kruskal’s algorithm starts from the edges having the lowest weights and keeps adding the edges until you achieve the goal. Below are the steps to implement this algorithm.

• Sort all edges in the ascending order of weight (low to high).
• Take the lowest weight edge and add it to the spanning tree. Reject the edge that creates a cycle when added.
• Keep adding the edges until we cover all vertices and make a minimum spanning tree.

#### Applications of Kruskal

• Designing rail and road networks to connect several cities
• Placing microwave towers
• Designing irrigation channels
• Designing fiber-optic grids

9. Boyer-Moore algorithm:

The Boyer-Moore algorithm is a string search algorithm that is used to find a substring in a larger string. The algorithm works by trying all possible matches for the substring and then checking whether each match is a prefix of the larger string. If the match is not a prefix, then the algorithm moves on to the next possible match.

#### Applications of Boyer-Moore

• String searching, such as finding occurrences of a word in a text document.
• Text editing, such as spell-checking and correcting typos.
• Pattern recognition, such as recognizing characters in an image.

10.  Topological Sort Algorithm

The topological sort algorithm is a graph theory algorithm used to find the order in which the vertices of a graph should be visited. The algorithm works by starting with the first vertex in the graph and visiting each vertex in order until all the vertices have been visited.

#### Applications of topological sorting

• Computer science - Topological sorting can be used to solve the DAG problem, also known as the Directed Acyclic Graph Problem.
• Networks - Can be used to create a network flow diagram.
• Operations research - Is used to solve Network Flow Problems.

11.  Labeled graph

A labeled graph is a graph with a set of labels or tags associated with each vertex and edge in the graph. The purpose of labeling a graph is to make it easier to identify the vertices and edges in the graph.

#### Applications of labeled graphs

• Computer science - Labeled graphs can be used to represent data structures, such as arrays, lists, and trees.
• Networks - Can be used to represent networks, such as social networks and transportation networks.
• Mathematics - Are used to represent mathematical structures, such as groups and rings.

12. A* algorithm

The A* algorithm finds the shortest path between two nodes in a graph. It is similar to Dijkstra’s algorithm, but it is a more sophisticated algorithm that considers the cost of each edge in the graph. The cost of an edge is usually determined by its length or by some other measure of distance, such as time or money.

#### Applications of A* algorithm

• It is commonly used in web-based maps and games to find the shortest path at the highest possible efficiency.
• A* is used in many artificial intelligence applications, such as search engines.
• It is used in other algorithms such as the Bellman-Ford algorithm to solve the shortest path problem.
• The A* algorithm is used in network routing protocols, such as RIP, OSPF, and BGP, to calculate the best route between two nodes.

13. Shortest path algorithm

The shortest path algorithm is a graph search algorithm that calculates the shortest path between two nodes in a graph. It is similar to the A* algorithm, but it is a simpler algorithm that does not take into account the cost of each edge in the graph.

#### Applications of shortest path algorithm

• Computer games - Finding the shortest/best route from one point to another is a common task in video games.
• Maps - Finding the shortest and/or most affordable route for a car from one city to another.
• Satellite navigation systems - to show drivers the fastest path they can drive from one point in a city to the other.

14. Minimum spanning tree algorithm

The minimum spanning tree algorithm entails connecting all vertices with the fewest total edge weights. In other words, it is a spanning tree that has the lowest possible sum of edge weights.

#### Applications of minimum spanning tree algorithm

• Network designs - cable and telephone networks use minimum spanning tree algorithms to create a list of possible connections with the lowest cost.
• Find approximate solutions to complex mathematical problems, e.g., the Traveling Salesman Problem.
• Real-time tracking and verification of human faces.
• Use to avoid network cycles in computer science protocols.

15.  Maximum flow algorithm

The maximum flow algorithm is a graph theory algorithm used to find the maximum possible flow between two nodes in a graph. It is similar to the Ford–Fulkerson algorithm, but it is a more sophisticated algorithm that considers each edge’s capacity in the graph.

#### Applications of maximum flow algorithm

• Network routing - The maximum flow algorithm can be used to calculate the maximum possible traffic that can flow through a network.
• Computer science - Can be used to find a solution to the Maximum Flow Problem, also known as the Max Flow/Min Cut problem.
• Operations research - Is used to solve Network Flow Problems.

16. Johnson’s Algorithm

Johnson’s algorithm finds the shortest paths between every pair of vertices in an edge-weighted directed graph. Edge weights can have negative numbers, but negative-weight cycles are not allowed. Johnson’s algorithm aims to reweigh all edges, make them positive, then run Dijkstra’s algorithm on each vertex.

#### Applications of Johnson’s algorithm

• Computer science - Johnson’s algorithm can be used to find a solution to the Shortest Path Problem.
• Networks - Finding all the shortest paths between nodes in a network.
• Operations research - Used to solve Network Flow Problems.

17.  Kosaraju’s Algorithm

Kosaraju’s algorithm is similar to Tarjan’s algorithm in that both are used to find strongly connected components of a graph. While Tarjan’s and Kosaraju’s algorithms have a linear time complexity, their SCC computing methodologies differ. Tarjan’s technique partitions the graph using nodes’ timestamps in a DFS, but Kosaraju’s method partitions the graph using two DFSs and is comparable to the method for discovering topological sorting.

#### Applications of Kosaraju’s algorithm

• Computer science - Kosaraju’s algorithm can be used to find a solution to the Strongly Connected Components Problem, also known as the Connected Components Problem.
• Used to obtain the Dulmage–Mendelsohn decomposition, a bipartite graph edges classification method.
• Used in social networks to discover groups of strongly connected people and make recommendations based on shared interests.

18. Connected components algorithm

The connected components algorithm is a graph theory algorithm used to find the connected components of a graph. It is a simple algorithm that takes into account the edges and vertices of a graph.

#### Applications of connected components algorithm

• Computer science - The connected components algorithm can be used to find a solution to the Connected Components Problem, also known as the Graph Isomorphism Problem.
• Networks - Can be used to find all the isolated nodes in a network.
• VLSI design - The algorithm can be used to find all the connections in a chip layout.
• Electronics - Can be used to find all the circuits in an electronic schematic.

19.  Fleury’s Algorithm

Fleury’s algorithm displays the Euler circuit or Euler path from a graph. The algorithm starts at one edge and moves adjacent vertices by removing previous ones. The graph gets less complicated in each step towards finding the Euler or circuit path.

#### Applications of Fleury’s algorithm

• Computer science - Fleury’s algorithm can be used to find a solution to the Euler Circuit Problem, also known as the Euler Path Problem.
• Networks - Can be used to find all the circuits in a network.

20.  Tarjan’s algorithm

Tarjan’s algorithm is a graph theory algorithm used to find the strongly connected components of a graph. A graph is considered strongly connected if each vertex can be reached from every other vertex.

#### Applications of Tarjan’s algorithm

• Computer science - Tarjan’s algorithm can be used to find a solution to the Strongly Connected Components Problem, also known as the Connected Components Problem.
• Used to obtain the Dulmage–Mendelsohn decomposition, a bipartite graph edges classification method.
• Used in social networks to discover groups of strongly connected people and make recommendations based on shared interests.

## Conclusion:

We can use graphs to solve many real-life problems. For instance, they are used in social networking sites like Facebook, Linkedin, etc. Each person can be denoted with a vertex on Facebook. Likewise, each node can contain information like name, gender, personID, etc. Further, we discussed some well-known graph algorithms like:

• Breadth-First Search - Computes the shortest path in a given graph using a Queue data structure.
• Depth-First Search - Uses a Stack data structure to find the shortest path in a given graph.
• Dijkstra Algorithm - Uses a Minimum priority queue data structure to find the shortest path between the source node and other given nodes in a graph.
• Bellman-Ford Algorithm - Single source shortest path algorithm like Dijkstra’s algorithm.
• Floyd Warshall Algorithm - Solves the All Pairs shortest path problem.
• Prim’s Algorithm - It finds the minimum spanning tree for an undirected weighted graph.
• Kruskal’s Algorithm - It finds the minimum spanning tree for a connected weighted graph.
• Topological Sort Algorithm - Represents a linear ordering of vertices in a directed acyclic graph.
• Johnson’s Algorithm - Finds the shortest path between every pair of vertices where weights can also be negative.
• Kosaraju’s Algorithm - Finds the strongly connected components in a graph.

#### Responses(0)  