A graph models a set of connections. Graphs are made up of nodes and edges. A node can be directly connected to many other nodes. Those nodes are called its neighbors.

A tree is a special type of graph, where no edges ever point back.

BFS can help answer two types of questions:

  1. Is there a path from node A to node B?

  2. What is the shortest path from node A to node B?

Even though we are talking about BFS of Graph here, the traversal technique applies to Trees as well. We will try to understand Tree traversal first and build on top of that strategy to understand Graph traversal. There are only two classic ways to traverse a Binary Tree iteratively - Breadth First and Depth First. BFS uses a Queue while DFS uses a Stack. We will get into the details of why it is that way in a minute.

In BFS or Breadth First Search traversal, the aim is to visit or process your neighbouring nodes first followed by it’s neighbouring nodes. To say it in another way, visit your friends first followed by FOF’s (friends friend).

  (1)
  / \
(2) (3)
      \
      (4)

Consider the tree above with root node (1), when we do BFS traversal, the aim is to visit the immediate neighbours first, So we start from (1) which is the root node, followed by it’s two immediate neighbours (2) and (3) followed by (4), which is (3)’s neighbour. Now the question is whether to use Queue or stack for this?

When we do a graph or tree traversal, what we are essentially doing is, visit a node, collect all it’s adjacent nodes and put it into a bag for later processing.

1. Add root node to bag.
2. Pick the next node from bag for processing.
3. Process the node.
3. Add all it's childerns to the bag.
4. Continue to #2.

If you think about our requirement for BFS, our aim is to process (2) and (3) ahead of (4). Basically we want to process the nodes in the same order that we encountered them… which means, First Encountered node must be Processed First. We need a FIFO datastructure for the bag which is Queue.

If you can understand this, then you can easily understand why DFS requires a Stack. In DFS we want to process the last encountered node first. So the processing order will be (4) followed by (3) followed by (2) followed by (1).

Coming back to the Graph from where we started, the only difference between Graph and Tree traversal is that, Graph traversal requires an additional datastructure to keep track of visited nodes. Otherwise we will end up in infinite loops. Usually a hashtable will be used to mark a node as visited to avoid duplicate processing and cycles. We will understand this more when we look at the actual code.

Now we will look at some code for BFS of Tree. BFS is also known as Level Order Traversal. As the name indicates, in this traversal, the nodes in a Level are processed first before moving to the next level.

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        
def BFS(root):
    if root is None:
        return
    q = []
    q.append(root)
    
    while q:
        nodeToBeProcessed = q.pop(0)
        print(nodeToBeProcessed.data)
        if nodeToBeProcessed.left:
            q.append(nodeToBeProcessed.left)
        if nodeToBeProcessed.right:    
            q.append(nodeToBeProcessed.right)

#Main
root = Node(1)
root.left=Node(2)
root.right=Node(3)
root.right.left=Node(4)

BFS(root)

You may tryout this code and visualize the execution here.

The output of executing this code is given below.

1
2
3
4

You may have noticed that it is not really printing the output in level order. Ideal Level order traversal output is this one.

1
2,3
4

Let’s try to modify the above code to print the output in level order. In order to do that, we need to keep track of the level associated with each node. Which means, instead of just putting the node into the queue, we will have to put the node and it’s level into the queue.

class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
        
def BFS(root):
    if root is None:
        return
    q = []
    level = 0
    q.append((root, level))
    previousLevel = 0
    
    while q:
        nodeToBeProcessed, poppedLevel = q.pop(0)
        
        #Logic to print a newline after each level
        if poppedLevel != previousLevel:
            print("")
            previousLevel = poppedLevel
            
        print(nodeToBeProcessed.data, end =",")
        if nodeToBeProcessed.left:
            q.append((nodeToBeProcessed.left, poppedLevel+1))
        if nodeToBeProcessed.right:    
            q.append((nodeToBeProcessed.right, poppedLevel+1))

#Main
root = Node(1)
root.left=Node(2)
root.right=Node(3)
root.right.left=Node(4)

BFS(root)

The above code produces an output of this form. To keep things simple let’s ignore the extra “,” at the end for now.

1,
2,3,
4,

You may tryout this code and visualize the execution here.

I know even though we started with Graph we took a Detour and reached Tree. Now we will get back to the Graph. Let’s start with Graph representation. A Graph consists of Nodes and connections to neighbouring Nodes.

 (Bob)     (Mary)
      ^     ^
       \   /  
        \ /
       (You)
         |
         |
         \/
       (Alice)

In the example Graph above, you are connected to Bob, Mary and Alice. How do you represent You -> Bob? That’s exactly what a hashtable or dictionary is for. Remember a hashtable or dictionary allows you to map a key to a value.

Here is how you write the above Graph in python :

graph = {}
graph["You"] = ["Alice", "Bob", "Mary"]

The key “You” is mapped to the value which is an Array containing all neighbours of “You”.

We will introduce BFS in Graph using an interesting problem. The problem is to find out whether there is any Mango seller in your friends circle. If one of your friend is a Mango seller his name should be printed. If there are no Mango sellers in your immediate friends circle, then you can expand the search to Friends of friends (FOF’s). Basically, you have a Graph with you and your friends and friends friends as nodes and you have to traverse the Graph in some order and find a Mango seller. Now, the question is, in what order do you traverse the Nodes? Our aim is to find the most immediate neighbour/friend who is a Mango seller. So do we use BFS or DFS?

In BFS, we visit the most immediate neighbour Node first. So, if we start BFS from the Node “You” (the root node), we are sure to meet the most immediate Mango seller friend of “You”.

I didn’t mention how we are going to identify a Mango seller. For the purpose of this exercise, we can assume that if the person’s name starts with M, then he/she is a mango seller. Below, is the algorithm that we will use for BFS on the friends Graph.

1. Create a Queue and seed it with all your immediate friends.
2. Pop a person off the queue.
3. Check if he is a mango seller (Name starts with M).
4. YES - You are done. Print the name and exit.
5. NO - Add all his friends to the queue.
6. Loop back to Step 2.
def find_mango_seller(graph):
    q = []
    #Add all your neighbours to the search Queue
    q.extend(graph["You"])
    while len(q) != 0:
        person = q.pop(0)
        if person_is_mango_seller(person):
            print("Found a Mango seller", person)
            break
        else:
            # Add all his friends to the q
            q.extend(graph.get(person, []))

def person_is_mango_seller(person):
    return person[0] == 'M'
    
#Main
graph = {}
graph["You"] = ["Alice", "Bob", "Mary"]
find_mango_seller(graph)

You may try out the code here to get a visual idea of the execution steps.

BFS on Adjacency matrix using Java

Now we will try BFS on an adjacency matrix Graph using Java. The Graph used in this example is taken from Geeks For Geeks.

package graph;

import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Set;

// https://www.geeksforgeeks.org/breadth-first-search-or-bfs-for-a-graph/
public class BFSOnAdjacencyMatrix {
    public static void main(String[] args) {
        int[][] G = new int[][]{
                {0, 1, 1, 0},
                {0, 0, 1, 0},
                {1, 0, 0, 1},
                {0, 0, 0, 1}
        };

        BFS(G, 2);
    }

    private static void BFS(int[][] G, int startNode) {
        Deque<Integer> queue = new LinkedList<>();
        Set<Integer> visitedNodes = new HashSet<>();
        queue.add(startNode);

        while(!queue.isEmpty()) {
            Integer node = queue.poll();
            if (visitedNodes.contains(node)) {
                continue;
            }

            visitCurrentNode(node, visitedNodes);

            addNeighboursToQueue(G, queue, visitedNodes, node);
        }
    }

    private static void visitCurrentNode(Integer node, Set<Integer> visitedNodes) {
        System.out.println("visited "+ node);
        visitedNodes.add(node);
    }

    private static void addNeighboursToQueue(int[][] G,
                                             Deque<Integer> queue,
                                             Set<Integer> visitedNodes,
                                             Integer node) {
        // Since the Graph also contains a node 0, it's a 1:1 mapping.
        int nodeNdx = node;
        for(int i=0;i<G[nodeNdx].length;i++) {
            if (G[nodeNdx][i] == 1 && !visitedNodes.contains(i)) {
                queue.add(i);
            }
        }
    }
}

Next step

Attempt a BFS problem on Hacker Rank. If you need some help with the problem checkout this youtube video by Gayle McDowell.

Inspired by this blog.