Depth First Search

depth-first-search: The Ultimate Cheatsheet

1. Detailed Explanation

Depth-First Search (DFS) is a graph traversal and search algorithm that explores a graph as deeply as possible along each branch before backtracking. It starts at a root node (or an arbitrary node in an unrooted graph) and explores as far as possible along each branch before backtracking. This contrasts with Breadth-First Search (BFS), which explores all neighbors at the current depth before moving to the next depth.

Why is it important? DFS is crucial for solving various problems involving graphs and trees, including:

Finding paths: Determining if a path exists between two nodes.
Topological sorting: Ordering nodes in a directed acyclic graph (DAG).
Detecting cycles: Identifying cycles in a graph.
Connected components: Finding all connected components in an undirected graph.
Spanning trees: Constructing a spanning tree for a graph.
Solving puzzles: Like mazes or Sudoku.

Core concepts:

Graph: A collection of nodes (vertices) and edges connecting them.
Tree: A hierarchical graph with a single root node.
Node/Vertex: A data point in the graph.
Edge: A connection between two nodes.
Visited/Unvisited: A marker indicating whether a node has been explored.
Stack: DFS inherently uses a stack (either explicitly or implicitly through recursion) to manage the order of node exploration.

2. When to Use depth-first-search (and When Not To)

Use DFS when:

You need to explore a graph deeply along a single branch before exploring other branches.
You’re dealing with problems that require finding paths, cycles, or topological orderings.
The depth of the graph is relatively small compared to its breadth.

Don’t use DFS when:

You need to find the shortest path between two nodes (BFS is generally better for this).
The graph is extremely wide and shallow (BFS is more efficient in these cases).
Memory is a critical constraint and the graph’s depth is very large (recursion can lead to stack overflow).

3. Core Algorithm / Data Structure Template

The basic DFS algorithm can be implemented recursively or iteratively using a stack. Here’s a recursive pseudocode template:

function DFS(node):
  mark node as visited
  for each neighbor of node:
    if neighbor is not visited:
      DFS(neighbor)

The iterative version uses a stack:

function DFS_iterative(start_node):
  stack = [start_node]
  while stack is not empty:
    node = pop(stack)
    if node is not visited:
      mark node as visited
      push all unvisited neighbors of node onto stack

4. Code Implementations

Python

def dfs(graph, start, visited=None):
    """
    Performs Depth-First Search on a graph.

    Args:
        graph: A dictionary representing the graph where keys are nodes and values are lists of their neighbors.
        start: The starting node for the search.
        visited: A set to keep track of visited nodes (optional, for recursive calls).

    Returns:
        A list of visited nodes in DFS order.
    """
    if visited is None:
        visited = set()
    visited.add(start)
    print(f"Visited: {start}") #For demonstration purposes

    for neighbor in graph.get(start, []):  # Handle cases where a node has no neighbors
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited

# Example graph
graph = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F'],
    'D': [],
    'E': ['F'],
    'F': []
}

dfs(graph, 'A')

Java

import java.util.*;

public class DFS {

    public static void dfs(Map<Character, List<Character>> graph, char start, Set<Character> visited) {
        visited.add(start);
        System.out.print(start + " "); //For demonstration

        for (char neighbor : graph.getOrDefault(start, new ArrayList<>())) {
            if (!visited.contains(neighbor)) {
                dfs(graph, neighbor, visited);
            }
        }
    }

    public static void main(String[] args) {
        Map<Character, List<Character>> graph = new HashMap<>();
        graph.put('A', Arrays.asList('B', 'C'));
        graph.put('B', Arrays.asList('D', 'E'));
        graph.put('C', Arrays.asList('F'));
        graph.put('D', new ArrayList<>());
        graph.put('E', Arrays.asList('F'));
        graph.put('F', new ArrayList<>());

        Set<Character> visited = new HashSet<>();
        dfs(graph, 'A', visited);
    }
}

C++

#include <iostream>
#include <vector>
#include <map>
#include <set>

using namespace std;

void dfs(map<char, vector<char>>& graph, char start, set<char>& visited) {
    visited.insert(start);
    cout << start << " "; //For demonstration

    for (char neighbor : graph[start]) {
        if (visited.find(neighbor) == visited.end()) {
            dfs(graph, neighbor, visited);
        }
    }
}

int main() {
    map<char, vector<char>> graph;
    graph['A'] = {'B', 'C'};
    graph['B'] = {'D', 'E'};
    graph['C'] = {'F'};
    graph['D'] = {};
    graph['E'] = {'F'};
    graph['F'] = {};

    set<char> visited;
    dfs(graph, 'A', visited);
    cout << endl;
    return 0;
}

5. Complexity Analysis

Operation	Time Complexity	Space Complexity
DFS (recursive)	O(V + E)	O(V) (recursive stack)
DFS (iterative)	O(V + E)	O(V) (explicit stack)

Where:

V = number of vertices (nodes)
E = number of edges

Best, Average, and Worst Case: The time and space complexity remain the same in best, average, and worst cases because DFS visits every node and edge once regardless of the graph’s structure. The worst case occurs when the graph is a complete graph or a very deep tree.

6. Pro Tips, Tricks, and Common Pitfalls

Pro Tips:

Iterative vs. Recursive: The iterative approach using a stack avoids potential stack overflow errors for very deep graphs.
Cycle Detection: Maintain a “visited” and a “recursion stack” to detect cycles in a graph. Nodes in the recursion stack are currently being processed.
Optimization: For large graphs, consider using adjacency lists instead of adjacency matrices for better space efficiency.

Common Pitfalls:

Stack Overflow (Recursive): Recursive DFS can lead to stack overflow errors for very deep graphs. Use iterative DFS to mitigate this.
Missing Base Case (Recursive): Ensure your recursive function has a proper base case to stop the recursion.
Incorrectly Handling Visited Nodes: Failing to mark nodes as visited can lead to infinite loops in graphs with cycles.
Forgetting to handle disconnected components: If your graph is not connected, you may need to start DFS from multiple nodes to visit all nodes.

7. Classic Problem Examples

Example: Word Search

Description: Given an m x n grid of characters board and a string word, return true if word exists in the grid. The word can be constructed from letters of sequentially adjacent cells, where adjacent cells are horizontally or vertically neighboring. The same letter cell may not be used more than once.

High-level Approach using DFS:

Iterate through the board: Start at each cell as a potential starting point for the word.
Recursive DFS: For each starting cell, perform a depth-first search to explore adjacent cells.
Check for word match: As you traverse the board using DFS, build the current word. If it matches the target word, return true.
Backtracking: If a path doesn’t lead to the target word, backtrack and explore other paths.
Mark visited cells: Mark cells as visited to avoid revisiting them.

The DFS approach efficiently explores all possible paths from each starting cell to check for the word. The backtracking ensures that all possible paths are considered without getting stuck in infinite loops. You would use a helper function that recursively explores the neighbors, checking for valid moves and the word match.