Conquering LeetCode 417: A Deep Dive into Pacific Atlantic Water Flow with DFS

Welcome, fellow coding adventurers! Today, we're setting our compass for a fascinating journey into LeetCode problem 417: "Pacific Atlantic Water Flow." This problem is a brilliant test of your graph traversal skills, specifically your mastery of Depth-First Search (DFS) or Breadth-First Search (BFS). It's a common pattern in technical interviews, making it a crucial concept to grasp.

Imagine a continent represented by a grid of varying heights. On one side, the mighty Pacific Ocean laps at its shores. On the other, the vast Atlantic Ocean stretches out. Our challenge? To find all the cells on this continent from which water can flow to both oceans.

Let's dive in and explore the terrain!

Unpacking the Problem Statement

We are given an m x n integer matrix, heights, where heights[r][c] represents the height of the cell at coordinate (r, c).

Here's how the oceans are defined:

  • Pacific Ocean: Touches the top (r=0) and left (c=0) edges of the matrix.
  • Atlantic Ocean: Touches the bottom (r=m-1) and right (c=n-1) edges of the matrix.

Water can flow from a cell to an adjacent cell (up, down, left, right) if the adjacent cell's height is less than or equal to the current cell's height. Think of it like a topographic map – water always flows downhill or stays level.

Our ultimate goal is to return a list of [r, c] coordinates for all cells from which water can reach both the Pacific and Atlantic oceans.

Why This Problem is Tricky

A common initial thought might be to iterate through every cell, perform a DFS/BFS from that cell, and check if it can reach both oceans. While conceptually sound, this approach would be highly inefficient. If the grid is m x n, and each DFS/BFS takes O(m*n) time, the total complexity would be O((m*n)^2), which is too slow for larger grids. We need a smarter strategy!

The Eureka Moment: Flowing Upstream

The key insight to efficiently solve this problem lies in reversing the perspective of water flow. Instead of trying to determine where water can flow to from a source cell, let's determine which cells can be reached from the oceans.

Think about it: if water can flow from cell A to cell B (meaning height[A] >= height[B]), then conversely, we can say that cell A is "reachable" from cell B if we imagine water flowing uphill.

This leads us to our core strategy:

  1. Identify all cells reachable by the Pacific Ocean: Start a traversal (DFS or BFS) from all cells bordering the Pacific Ocean. During this traversal, mark all cells from which water could have originated and flowed into the Pacific.
  2. Identify all cells reachable by the Atlantic Ocean: Do the same for the Atlantic Ocean, starting from its borders and marking all cells that could flow into the Atlantic.
  3. Find the Intersection: The cells that are marked as reachable by both the Pacific and Atlantic oceans are our answer!

Algorithm Breakdown: DFS to the Rescue

We'll use Depth-First Search (DFS) for this traversal, but BFS would work equally well.

Step-by-Step Implementation

  1. Initialization:

    • Get the dimensions m (rows) and n (columns) of the heights matrix.
    • Create two boolean matrices, can_reach_pacific and can_reach_atlantic, both of size m x n, initialized to False. These will store whether a cell can reach each respective ocean.
    • Define directions for DFS: [(0, 1), (0, -1), (1, 0), (-1, 0)] (right, left, down, up).
  2. DFS Function:

    • dfs(r, c, visited_matrix, prev_height):
      • Base Cases:
        • If (r, c) is out of bounds, return.
        • If (r, c) has already been visited (marked True in visited_matrix), return.
        • If heights[r][c] is less than prev_height, return (water cannot flow uphill from a lower height, which is the reverse logic).
      • Mark Current Cell: Set visited_matrix[r][c] = True.
      • Recurse: For each (dr, dc) in directions:
        • Calculate new_r = r + dr, new_c = c + dc.
        • Call dfs(new_r, new_c, visited_matrix, heights[r][c]). Note: heights[r][c] becomes the prev_height for the next call, ensuring the "flow upstream" condition (current_height >= next_cell_height) is maintained.
  3. Populate can_reach_pacific:

    • Iterate through the first row (r=0) and the first column (c=0). These are the direct borders of the Pacific.
    • For each cell (0, c) and (r, 0): call dfs(r, c, can_reach_pacific, -1). We use -1 as prev_height to ensure water can always flow "into" the ocean from any bordering cell.
  4. Populate can_reach_atlantic:

    • Iterate through the last row (r=m-1) and the last column (c=n-1). These are the direct borders of the Atlantic.
    • For each cell (m-1, c) and (r, n-1): call dfs(r, c, can_reach_atlantic, -1).
  5. Collect Results:

    • Initialize an empty list result.
    • Iterate through the entire heights matrix from r=0 to m-1 and c=0 to n-1.
    • If can_reach_pacific[r][c] is True AND can_reach_atlantic[r][c] is True, add [r, c] to the result list.
  6. Return result.

Python Code Implementation

class Solution:
    def pacificAtlantic(self, heights: list[list[int]]) -> list[list[int]]:
        m = len(heights)
        n = len(heights[0])

        # Two boolean matrices to track reachability from Pacific and Atlantic
        can_reach_pacific = [[False] * n for _ in range(m)]
        can_reach_atlantic = [[False] * n for _ in range(m)]

        # Directions for DFS: right, left, down, up
        directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]

        def dfs(r, c, visited_matrix, prev_height):
            # Base cases for DFS:
            # 1. Out of bounds
            # 2. Already visited
            # 3. Current cell height is less than previous cell's height (water can't flow "upstream" if current is lower)
            if not (0 <= r < m and 0 <= c < n) or visited_matrix[r][c] or heights[r][c] < prev_height:
                return

            # Mark current cell as reachable
            visited_matrix[r][c] = True

            # Explore neighbors
            for dr, dc in directions:
                new_r, new_c = r + dr, c + dc
                # The 'prev_height' for the next recursive call is the current cell's height
                # This ensures we only "flow upstream" (or stay level)
                dfs(new_r, new_c, visited_matrix, heights[r][c])

        # 1. Start DFS from cells bordering the Pacific Ocean
        # Top row (r=0)
        for c in range(n):
            dfs(0, c, can_reach_pacific, -1) # -1 ensures any height is valid to start from the ocean

        # Left column (c=0), excluding (0,0) which is covered by the top row loop
        for r in range(1, m):
            dfs(r, 0, can_reach_pacific, -1)

        # 2. Start DFS from cells bordering the Atlantic Ocean
        # Bottom row (r=m-1)
        for c in range(n):
            dfs(m - 1, c, can_reach_atlantic, -1)

        # Right column (c=n-1), excluding (m-1,n-1) which is covered by the bottom row loop
        for r in range(m - 1): # Note: iterate up to m-2 to avoid (m-1, n-1) duplicate if m-1 >= 0
            dfs(r, n - 1, can_reach_atlantic, -1)


        # 3. Collect the result: cells reachable by both oceans
        result = []
        for r in range(m):
            for c in range(n):
                if can_reach_pacific[r][c] and can_reach_atlantic[r][c]:
                    result.append([r, c])

        return result

Complexity Analysis

  • Time Complexity: O(m * n)

    • We perform two independent DFS traversals. Each DFS visits every cell at most once. For each cell, we perform constant work (checking neighbors).
    • The final iteration to collect results also takes O(m * n).
    • Therefore, the total time complexity is proportional to the number of cells in the grid.
  • Space Complexity: O(m * n)

    • We use two boolean matrices (can_reach_pacific and can_reach_atlantic), each of size m x n.
    • The recursion stack for DFS in the worst case could go O(m * n) deep (if the grid is a long, winding path).
    • Thus, the total space complexity is O(m * n).

Why This Approach Shines

This "flow upstream" or "reverse traversal" technique is incredibly powerful for problems where you need to find reachability from multiple sources. Instead of (m*n) separate traversals, you consolidate it into a fixed number of traversals (in this case, two: one for Pacific, one for Atlantic). It dramatically reduces the time complexity and is a common optimization strategy in graph problems.

Conclusion

LeetCode 417, "Pacific Atlantic Water Flow," is a fantastic problem that reinforces the power of strategic thinking in algorithm design. By transforming the problem from a "flow downhill" to a "flow upstream" perspective, we unlock an efficient DFS/BFS solution. This problem is more than just coding; it's about understanding how to model real-world scenarios (like water flow) into graph problems and applying optimal traversal techniques.

Keep practicing, keep exploring different graph traversal variations, and you'll be well-prepared to conquer similar challenges in your next coding interview! Happy coding!