Paul Ngugi

Posted on

# Finding the Closest Pair of Points Using Divide-and-Conquer

This section presents efficient algorithms for finding the closest pair of points using divide-and-conquer. Given a set of points, the closest-pair problem is to find the two points that are nearest to each other. As shown in Figure below, a line is drawn to connect the two nearest points in the closest-pair animation.

Case Study: Finding the Closest Pair, presented a brute-force algorithm for finding the closest pair of points. The algorithm computes the distances between all pairs of points and finds the one with the minimum distance. Clearly, the algorithm takes O(n^2) time. Can we design a more efficient algorithm?

We will use an approach called divide-and-conquer to solve this problem. The approach divides the problem into subproblems, solves the subproblems, then combines the solutions of the subproblems to obtain the solution for the entire problem. Unlike the dynamic programming approach, the subproblems in the divide-and-conquer approach don’t overlap. A subproblem is like the original problem with a smaller size, so you can apply recursion to solve the problem. In fact, all the solutions for recursive problems follow the divide-and-conquer approach.

The steps below describes how to solve the closest pair problem using the divide-and-conquer approach.

• Step 1: Sort the points in increasing order of x-coordinates. For the points with the same x-coordinates, sort on y-coordinates. This results in a sorted list S of points.
• Step 2: Divide S into two subsets, S1 and S2, of equal size using the midpoint in the sorted list. Let the midpoint be in S1. Recursively find the closest pair in S1 and S2. Let d1 and d2 denote the distance of the closest pairs in the two subsets, respectively.
• Step 3: Find the closest pair between a point in S1 and a point in S2 and denote their distance as d3. The closest pair is the one with the distance min(d1, d2, d3).

Selection sort takes O(n^2) time. Step 1 can be done in O(n log n) time. Step 3 can be done in O(n) time. Let d = min(d1, d2). We already know that the closest pair distance cannot be larger than d. For a point in S1 and a point in S2 to form the closest pair in S, the left point must be in stripL and the right point in stripR, as illustrated in Figure below (a).

For a point p in stripL, you need only consider a right point within the d X 2d rectangle, as shown in below (b). Any right point outside the rectangle cannot form the closest pair with p. Since the closest-pair distance in S2 is greater than or equal to d, there can be at most six
points in the rectangle. Thus, for each point in stripL, at most six points in stripR need to be considered.

For each point p in stripL, how do you locate the points in the corresponding d X 2d rectangle area in stripR? This can be done efficiently if the points in stripL and stripR are sorted in increasing order of their y-coordinates. Let pointsOrderedOnY be the list of the points sorted in increasing order of y-coordinates. pointsOrderedOnY can be obtained beforehand in the algorithm. stripL and stripR can be obtained from pointsOrderedOnY in Step 3 as shown in code below.

```for each point p in pointsOrderedOnY if (p is in S1 and mid.x – p.x <= d) append p to stripL; else if (p is in S2 and p.x - mid.x <= d) append p to stripR;```

Let the points in stripL and stripR be {p0, p1, ... , pk} and {q0, q1, ... , qt}, as shown in Figure above (c). The closest pair between a point in stripL and a point in stripR can be found using the algorithm described in code below.

``````d = min(d1, d2);
r = 0; // r is the index of a point in stripR
for (each point p in stripL) {
// Skip the points in stripR below p.y - d
while (r < stripR.length && q[r].y <= p.y - d)
r++;

let r1 = r;
while (r1 < stripR.length && |q[r1].y – p.y| <= d) {
// Check if (p, q[r1]) is a possible closest pair
if (distance(p, q[r1]) < d) {
d = distance(p, q[r1]);
(p, q[r1]) is now the current closest pair;
}

r1 = r1 + 1;
}
}
``````

The points in stripL are considered from p0, p1, ... , pk in this order. For a point p in stripL, skip the points in stripR that are below p.y – d (lines 5–6). Once a point is skipped, it will no longer be considered. The while loop (lines 9–17) checks whether (p, q[r1]) is a possible closest pair. There are at most six such q[r1] pairs, because the distance between two points in stripR cannot be less than d. So the complexity for finding the closest pair in Step 3 is O(n).

Note that Step 1 in steps above is performed only once to presort the points. Assume that all the points are presorted. Let T(n) denote the time complexity for this algorithm. Thus,

Therefore, the closest pair of points can be found in O(n log n) time.