Shortest Path

shortest-path: The Ultimate Cheatsheet

1. Detailed Explanation

What is shortest-path? Shortest-path algorithms aim to find the path with the minimum weight (cost, distance, time, etc.) between two vertices (nodes) in a graph. The weight can be associated with edges or vertices, depending on the problem.
Why is it important and what kind of problems does it solve? Shortest-path algorithms are fundamental in many applications, including:
- Navigation: Finding the shortest route between two locations.
- Network Routing: Determining the most efficient path for data packets to travel in a network.
- Resource Allocation: Optimizing the allocation of resources by finding the least costly path.
- Game AI: Pathfinding for game characters.
- Social Networks: Finding connections between people.
Core concepts, underlying principles, and key terminology:
- Graph: A collection of vertices (nodes) and edges connecting them.
- Weighted Graph: A graph where each edge has a weight associated with it.
- Directed Graph: A graph where edges have a direction (e.g., one-way street).
- Undirected Graph: A graph where edges have no direction (e.g., two-way street).
- Path: A sequence of vertices connected by edges.
- Shortest Path: The path with the minimum total weight between two given vertices.
- Source Vertex: The starting vertex for finding the shortest path.
- Destination Vertex: The ending vertex for finding the shortest path.
- Relaxation: The process of updating the estimated shortest distance to a vertex if a shorter path is found.
- Negative Cycle: A cycle in a graph where the sum of the edge weights is negative. The presence of negative cycles means that a shortest path is not well-defined, as one can traverse the cycle infinitely many times to reduce the path weight.
- Single-Source Shortest Path (SSSP): Finding the shortest paths from a single source vertex to all other vertices in the graph.
- All-Pairs Shortest Path: Finding the shortest paths between all pairs of vertices in the graph.

2. When to Use shortest-path (and When Not To)

Identify problem patterns that suggest shortest-path is a good fit:
- Problems that explicitly ask for the “shortest,” “fastest,” “cheapest,” or “most efficient” path between two points.
- Problems that involve finding the minimum cost to reach a destination, where the cost can be represented as edge weights in a graph.
- Problems that require traversing a graph-like structure (e.g., a grid, a network) with varying costs associated with each step.
Discuss scenarios where a different data structure or algorithm would be more appropriate:
- Simple Tree Traversal: If the graph is a tree, a simple depth-first search (DFS) or breadth-first search (BFS) may be sufficient if edge weights are uniform (or irrelevant).
- Unweighted Graph: If all edges have the same weight, BFS is often the most efficient way to find the shortest path because it explores vertices in order of their distance from the source.
- No Path Required: If the goal is simply to determine if a path exists, rather than finding the shortest one, BFS or DFS are sufficient.
- Topological Sort: If the graph is a Directed Acyclic Graph (DAG) and you want to process vertices in a specific order based on dependencies, topological sort is more relevant.
- Minimum Spanning Tree: If the goal is to connect all vertices in a graph with the minimum total edge weight, without necessarily finding the shortest path between any two specific vertices, algorithms like Kruskal’s or Prim’s are more appropriate.

3. Core Algorithm / Data Structure Template

General Shortest-Path Algorithm Template (Dijkstra’s Algorithm):

Initialization:
- Create a distance array dist to store the shortest distance from the source vertex to each vertex. Initialize all elements to infinity (or a very large number) except for the source vertex, which is initialized to 0.
- Create a priority queue (min-heap) pq to store vertices, prioritized by their current shortest distance from the source.
- Add the source vertex to the priority queue.
Iteration:
- While the priority queue is not empty:
  - Extract the vertex u with the minimum distance from the priority queue.
  - For each neighbor v of u:
    - Calculate the distance from the source to v through u: new_dist = dist[u] + weight(u, v).
    - If new_dist is less than the current shortest distance to v (dist[v]):
      - Update dist[v] to new_dist.
      - Add v to the priority queue (or update its priority if it’s already in the queue, depending on the priority queue implementation).
Result:
- The dist array now contains the shortest distances from the source vertex to all other vertices.

Note: Dijkstra’s algorithm only works for graphs with non-negative edge weights. For graphs with negative edge weights, use the Bellman-Ford algorithm.

4. Code Implementations (Python, Java, C++)

Python

import heapq

def dijkstra(graph, start):
    """
    Finds the shortest paths from a start node to all other nodes in a graph using Dijkstra's algorithm.

    Args:
        graph: A dictionary representing the graph, where keys are nodes and values are lists of (neighbor, weight) tuples.
        start: The starting node.

    Returns:
        A dictionary mapping each node to its shortest distance from the start node.
    """

    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    pq = [(0, start)]  # (distance, node)

    while pq:
        dist, u = heapq.heappop(pq)

        if dist > distances[u]:
            continue  # Optimization: Skip if we've already processed a shorter path to u

        for v, weight in graph[u]:
            new_dist = dist + weight
            if new_dist < distances[v]:
                distances[v] = new_dist
                heapq.heappush(pq, (new_dist, v))

    return distances

# Example usage:
graph = {
    'A': [('B', 5), ('C', 2)],
    'B': [('D', 1), ('E', 4)],
    'C': [('B', 8), ('F', 7)],
    'D': [('E', 3)],
    'E': [('F', 6)],
    'F': []
}

start_node = 'A'
shortest_paths = dijkstra(graph, start_node)
print(f"Shortest paths from {start_node}: {shortest_paths}")

Java

import java.util.*;

public class Dijkstra {

    public static Map<String, Integer> dijkstra(Map<String, List<Pair<String, Integer>>> graph, String start) {
        Map<String, Integer> distances = new HashMap<>();
        for (String node : graph.keySet()) {
            distances.put(node, Integer.MAX_VALUE);
        }
        distances.put(start, 0);

        PriorityQueue<Pair<Integer, String>> pq = new PriorityQueue<>(Comparator.comparingInt(Pair::getKey));
        pq.offer(new Pair<>(0, start));

        while (!pq.isEmpty()) {
            Pair<Integer, String> current = pq.poll();
            int dist = current.getKey();
            String u = current.getValue();

            if (dist > distances.get(u)) {
                continue; // Optimization
            }

            for (Pair<String, Integer> neighbor : graph.get(u)) {
                String v = neighbor.getKey();
                int weight = neighbor.getValue();
                int newDist = dist + weight;

                if (newDist < distances.get(v)) {
                    distances.put(v, newDist);
                    pq.offer(new Pair<>(newDist, v));
                }
            }
        }

        return distances;
    }

    public static void main(String[] args) {
        Map<String, List<Pair<String, Integer>>> graph = new HashMap<>();
        graph.put("A", Arrays.asList(new Pair<>("B", 5), new Pair<>("C", 2)));
        graph.put("B", Arrays.asList(new Pair<>("D", 1), new Pair<>("E", 4)));
        graph.put("C", Arrays.asList(new Pair<>("B", 8), new Pair<>("F", 7)));
        graph.put("D", Arrays.asList(new Pair<>("E", 3)));
        graph.put("E", Arrays.asList(new Pair<>("F", 6)));
        graph.put("F", new ArrayList<>());

        String startNode = "A";
        Map<String, Integer> shortestPaths = dijkstra(graph, startNode);
        System.out.println("Shortest paths from " + startNode + ": " + shortestPaths);
    }

    // Helper class for storing pairs
    static class Pair<K, V> {
        private final K key;
        private final V value;

        public Pair(K key, V value) {
            this.key = key;
            this.value = value;
        }

        public K getKey() {
            return key;
        }

        public V getValue() {
            return value;
        }
    }
}

C++

#include <iostream>
#include <vector>
#include <queue>
#include <limits>
#include <map>

using namespace std;

map<char, int> dijkstra(map<char, vector<pair<char, int>>>& graph, char start) {
    map<char, int> distances;
    for (auto const& [node, neighbors] : graph) {
        distances[node] = numeric_limits<int>::max();
    }
    distances[start] = 0;

    priority_queue<pair<int, char>, vector<pair<int, char>>, greater<pair<int, char>>> pq;
    pq.push({0, start});

    while (!pq.empty()) {
        int dist = pq.top().first;
        char u = pq.top().second;
        pq.pop();

        if (dist > distances[u]) {
            continue; //Optimization
        }

        for (auto& neighbor : graph[u]) {
            char v = neighbor.first;
            int weight = neighbor.second;
            int new_dist = dist + weight;

            if (new_dist < distances[v]) {
                distances[v] = new_dist;
                pq.push({new_dist, v});
            }
        }
    }

    return distances;
}

int main() {
    map<char, vector<pair<char, int>>> graph;
    graph['A'] = {{'B', 5}, {'C', 2}};
    graph['B'] = {{'D', 1}, {'E', 4}};
    graph['C'] = {{'B', 8}, {'F', 7}};
    graph['D'] = {{'E', 3}};
    graph['E'] = {{'F', 6}};
    graph['F'] = {};

    char start_node = 'A';
    map<char, int> shortest_paths = dijkstra(graph, start_node);

    cout << "Shortest paths from " << start_node << ":\n";
    for (auto const& [node, distance] : shortest_paths) {
        cout << node << ": " << distance << endl;
    }

    return 0;
}

5. Complexity Analysis

Dijkstra’s Algorithm

Operation	Time Complexity	Space Complexity	Best Case	Average Case	Worst Case
Initialization	O(V)	O(V)	O(V)	O(V)	O(V)
Build Priority Queue	O(V log V)	O(V)	O(1) (if source is already the closest)	O(V log V)	O(V log V)
Extract Min	O(log V)	O(1)	O(log 1) = O(1) (first extraction)	O(log V)	O(log V)
Relaxation	O(E log V)	O(1)	O(1) (no updates)	O(E log V)	O(E log V)
Total	O(E log V)	O(V)	O(V + E) (E close to 0)	O(E log V)	O(E log V)

V: Number of vertices
E: Number of edges

Note: The time complexity can be improved to O(E + V log V) if a Fibonacci heap is used as the priority queue, but this is rarely used in practice due to its high constant factors. The space complexity is primarily determined by the distance array and the priority queue, both of which store information for each vertex.

6. Pro Tips, Tricks, and Common Pitfalls

Use Appropriate Data Structures: Choose the right data structure for the priority queue. A binary heap (like heapq in Python or PriorityQueue in Java/C++) is a good default choice.
Handle Disconnected Graphs: If the graph is not connected, some vertices may remain with infinite distance after running Dijkstra’s. Check for this and handle it appropriately (e.g., return -1 or a special “no path” value).
Dijkstra’s Doesn’t Work with Negative Edge Weights: Use Bellman-Ford or other algorithms designed for negative weights.
Bellman-Ford Detects Negative Cycles: Bellman-Ford can be used to detect negative cycles. After running Bellman-Ford for V-1 iterations, perform one more iteration. If any distance is updated, it indicates the presence of a negative cycle.
Optimization: The line if dist > distances[u]: continue in the Dijkstra implementations is a crucial optimization. Without it, the algorithm can become significantly slower, especially in dense graphs. This check avoids processing vertices that have already been processed with a shorter distance.
Integer Overflow: When calculating new_dist, be mindful of potential integer overflow, especially when dealing with large edge weights. Consider using a larger data type (e.g., long in Java/C++ or int in Python) or using appropriate overflow checks.

7. Classic Problem Examples

Example: Evaluate Division

High-Level Approach:

Graph Representation: Represent the equations as a directed weighted graph. Each variable (e.g., “a”, “b”, “c”) is a node. If equations[i] = ["a", "b"] and values[i] = 2.0, create a directed edge from “a” to “b” with weight 2.0, and a directed edge from “b” to “a” with weight 1/2.0.
Shortest Path Algorithm (Modified): For each query ["Cj", "Dj"], find the shortest path (product of edge weights) from Cj to Dj. You can use a modified version of Dijkstra’s or BFS. Instead of summing edge weights, you multiply them.
Handle No Path: If there is no path from Cj to Dj, return -1.0.
Handle Undefined Variables: If either Cj or Dj is not present in the graph (i.e., not a variable in any equation), return -1.0.

import collections

def calcEquation(equations, values, queries):
    graph = collections.defaultdict(dict)

    for (a, b), val in zip(equations, values):
        graph[a][b] = val
        graph[b][a] = 1 / val

    def bfs(start, end):
        if start not in graph or end not in graph:
            return -1.0

        q = collections.deque([(start, 1.0)])
        visited = set()

        while q:
            node, product = q.popleft()
            if node == end:
                return product

            visited.add(node)
            for neighbor, val in graph[node].items():
                if neighbor not in visited:
                    q.append((neighbor, product * val))

        return -1.0

    results = [bfs(q[0], q[1]) for q in queries]
    return results