Binary Search: O(log n) Search on Sorted Data

Binary search finds a target in a sorted collection by repeatedly halving the search range: look at the middle element, and because the data is ordered, you know whether the target lies in the left half or the right half — so you throw the other half away. Each comparison eliminates half of what remains, giving O(log n) time. It is the single most important algorithm in this section.

The prerequisite: the data must be sorted

Binary search relies on order. If a[mid] is less than the target, every element to the left is also less, so the target (if present) must be on the right. That deduction only holds when the array is sorted. On unsorted data, use linear search instead.

The canonical implementation

We track an inclusive range [lo, hi]. While the range is non-empty (lo <= hi), test the midpoint and shrink toward the target.

int binarySearch(const std::vector<int>& a, int target) {
    int lo = 0, hi = (int)a.size() - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // overflow-safe midpoint
        if (a[mid] == target) return mid;
        else if (a[mid] < target) lo = mid + 1;  // target in right half
        else hi = mid - 1;                        // target in left half
    }
    return -1;   // not found
}

int binarySearch(int[] a, int target) {
    int lo = 0, hi = a.length - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   // overflow-safe midpoint
        if (a[mid] == target) return mid;
        else if (a[mid] < target) lo = mid + 1;  // target in right half
        else hi = mid - 1;                        // target in left half
    }
    return -1;   // not found
}

function binarySearch(a, target) {
  let lo = 0, hi = a.length - 1;
  while (lo <= hi) {
    const mid = lo + Math.floor((hi - lo) / 2); // overflow-safe midpoint
    if (a[mid] === target) return mid;
    else if (a[mid] < target) lo = mid + 1;  // target in right half
    else hi = mid - 1;                        // target in left half
  }
  return -1;  // not found
}

def binary_search(a, target):
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2      # overflow-safe midpoint
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1               # target in right half
        else:
            hi = mid - 1               # target in left half
    return -1                          # not found

Why `lo + (hi - lo) / 2` and not `(lo + hi) / 2`

In fixed-width integer languages (C++, Java), lo + hi can overflow when both are near the maximum int, producing a negative midpoint and a wild crash or wrong answer. Computing lo + (hi - lo) / 2 gives the identical value while keeping every intermediate inside the valid range. It’s the same answer, computed safely — make it a habit.

For beginners: Integer division floors the result, so mid always lands at or left of the true centre. That’s fine — the loop still converges.

Why it is correct

Two ideas guarantee correctness:

Invariant: if the target is in the array, it is always within [lo, hi]. Every branch discards only elements that cannot equal the target (everything < target on a wrong-left, everything > target on a wrong-right).
Progress: each iteration sets lo = mid + 1 or hi = mid - 1, so the range strictly shrinks. It can never get stuck, so the loop always terminates — either returning a match or collapsing to an empty range (lo > hi).

Complexity

Time: O(log n) — the range halves each step, so it takes about log₂(n) comparisons. One billion elements need only ~30.
Space: O(1) for this iterative version. A recursive version is also O(log n) time but uses O(log n) stack space.

n            comparisons (~log2 n)
16           4
1,000        10
1,000,000    20
1,000,000,000 30

Common pitfalls

Forgetting the data is sorted. Binary search on unsorted data returns garbage. Sort first, or use linear search.
Wrong loop condition. With an inclusive [lo, hi] range you need lo <= hi. Using lo < hi skips checking the final single-element range and can miss matches.
Updating bounds to mid instead of mid ± 1. With the inclusive style, re-including mid (e.g. hi = mid) can loop forever, because the range stops shrinking.
Off-by-one on hi. Initialize hi = n - 1 (last valid index), not n, for the inclusive form above.

Going deeper: Binary search is far more than “find an element”. Tweaking the comparison turns it into lower/upper bound and first/last occurrence, and applying it to a range of candidate answers gives the powerful binary search on the answer. It’s also the lens behind the whole binary search pattern.

Next: handle duplicates and bounds with binary search variations, or practice on the searching exercises.