Binary Search: O(log n) Search on Sorted Data
Binary search finds a target in a sorted collection by repeatedly halving the search range: look at the middle element, and because the data is ordered, you know whether the target lies in the left half or the right half — so you throw the other half away. Each comparison eliminates half of what remains, giving O(log n) time. It is the single most important algorithm in this section.
The prerequisite: the data must be sorted
Binary search relies on order. If a[mid] is less than the target, every element to the left is also less, so the target (if present) must be on the right. That deduction only holds when the array is sorted. On unsorted data, use linear search instead.
The canonical implementation
We track an inclusive range [lo, hi]. While the range is non-empty (lo <= hi), test the midpoint and shrink toward the target.
int binarySearch(const std::vector<int>& a, int target) {
int lo = 0, hi = (int)a.size() - 1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2; // overflow-safe midpoint
if (a[mid] == target) return mid;
else if (a[mid] < target) lo = mid + 1; // target in right half
else hi = mid - 1; // target in left half
}
return -1; // not found
}
int binarySearch(int[] a, int target) {
int lo = 0, hi = a.length - 1;
while (lo <= hi) {
int mid = lo + (hi - lo) / 2; // overflow-safe midpoint
if (a[mid] == target) return mid;
else if (a[mid] < target) lo = mid + 1; // target in right half
else hi = mid - 1; // target in left half
}
return -1; // not found
}
function binarySearch(a, target) {
let lo = 0, hi = a.length - 1;
while (lo <= hi) {
const mid = lo + Math.floor((hi - lo) / 2); // overflow-safe midpoint
if (a[mid] === target) return mid;
else if (a[mid] < target) lo = mid + 1; // target in right half
else hi = mid - 1; // target in left half
}
return -1; // not found
}
def binary_search(a, target):
lo, hi = 0, len(a) - 1
while lo <= hi:
mid = lo + (hi - lo) // 2 # overflow-safe midpoint
if a[mid] == target:
return mid
elif a[mid] < target:
lo = mid + 1 # target in right half
else:
hi = mid - 1 # target in left half
return -1 # not found
Why lo + (hi - lo) / 2 and not (lo + hi) / 2
In fixed-width integer languages (C++, Java), lo + hi can overflow when both are near the maximum int, producing a negative midpoint and a wild crash or wrong answer. Computing lo + (hi - lo) / 2 gives the identical value while keeping every intermediate inside the valid range. It’s the same answer, computed safely — make it a habit.
For beginners: Integer division floors the result, so
midalways lands at or left of the true centre. That’s fine — the loop still converges.
Why it is correct
Two ideas guarantee correctness:
- Invariant: if the target is in the array, it is always within
[lo, hi]. Every branch discards only elements that cannot equal the target (everything< targeton a wrong-left, everything> targeton a wrong-right). - Progress: each iteration sets
lo = mid + 1orhi = mid - 1, so the range strictly shrinks. It can never get stuck, so the loop always terminates — either returning a match or collapsing to an empty range (lo > hi).
Complexity
- Time: O(log n) — the range halves each step, so it takes about
log₂(n)comparisons. One billion elements need only ~30. - Space: O(1) for this iterative version. A recursive version is also O(log n) time but uses O(log n) stack space.
n comparisons (~log2 n)
16 4
1,000 10
1,000,000 20
1,000,000,000 30
Common pitfalls
- Forgetting the data is sorted. Binary search on unsorted data returns garbage. Sort first, or use linear search.
- Wrong loop condition. With an inclusive
[lo, hi]range you needlo <= hi. Usinglo < hiskips checking the final single-element range and can miss matches. - Updating bounds to
midinstead ofmid ± 1. With the inclusive style, re-includingmid(e.g.hi = mid) can loop forever, because the range stops shrinking. - Off-by-one on
hi. Initializehi = n - 1(last valid index), notn, for the inclusive form above.
Going deeper: Binary search is far more than “find an element”. Tweaking the comparison turns it into lower/upper bound and first/last occurrence, and applying it to a range of candidate answers gives the powerful binary search on the answer. It’s also the lens behind the whole binary search pattern.
Next: handle duplicates and bounds with binary search variations, or practice on the searching exercises.