Merge Sort Explained: Divide & Conquer, Code & Complexity
Merge sort is a divide-and-conquer algorithm: it splits the array into halves, recursively sorts each half, then merges the two sorted halves into one sorted whole. Its great strength is a guaranteed O(n log n) running time in every case plus stability — the price is O(n) extra memory for the merge step.
The idea
- Divide: split the array at the middle into left and right halves.
- Conquer: recursively merge-sort each half. A subarray of length 0 or 1 is already sorted (the base case).
- Combine: merge the two sorted halves by repeatedly taking the smaller front element of the two.
The merge is the heart of it. Given [1, 4] and [2, 3], you compare fronts: 1 < 2 → take 1; 4 vs 2 → take 2; 4 vs 3 → take 3; left has 4 → take 4. Result [1, 2, 3, 4], done in a single linear pass.
Why it is O(n log n)
The recursion halves the array each level, so there are log n levels. Each level merges a total of n elements. n work per level × log n levels = O(n log n) — and this holds for best, average, and worst case, because the split is always even regardless of the data.
Implementation
A clear top-down version. merge combines two sorted halves into a scratch buffer, then copies back.
void merge(std::vector<int>& a, int lo, int mid, int hi) {
std::vector<int> tmp;
tmp.reserve(hi - lo + 1);
int i = lo, j = mid + 1;
while (i <= mid && j <= hi)
tmp.push_back(a[i] <= a[j] ? a[i++] : a[j++]); // <= keeps it stable
while (i <= mid) tmp.push_back(a[i++]);
while (j <= hi) tmp.push_back(a[j++]);
for (int k = 0; k < (int)tmp.size(); k++) a[lo + k] = tmp[k];
}
void mergeSort(std::vector<int>& a, int lo, int hi) {
if (lo >= hi) return; // 0 or 1 element
int mid = lo + (hi - lo) / 2;
mergeSort(a, lo, mid);
mergeSort(a, mid + 1, hi);
merge(a, lo, mid, hi);
}
void merge(int[] a, int lo, int mid, int hi) {
int[] tmp = new int[hi - lo + 1];
int i = lo, j = mid + 1, k = 0;
while (i <= mid && j <= hi)
tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++]; // <= keeps it stable
while (i <= mid) tmp[k++] = a[i++];
while (j <= hi) tmp[k++] = a[j++];
for (int t = 0; t < tmp.length; t++) a[lo + t] = tmp[t];
}
void mergeSort(int[] a, int lo, int hi) {
if (lo >= hi) return; // 0 or 1 element
int mid = lo + (hi - lo) / 2;
mergeSort(a, lo, mid);
mergeSort(a, mid + 1, hi);
merge(a, lo, mid, hi);
}
function mergeSort(a, lo = 0, hi = a.length - 1) {
if (lo >= hi) return; // 0 or 1 element
const mid = lo + ((hi - lo) >> 1);
mergeSort(a, lo, mid);
mergeSort(a, mid + 1, hi);
// merge
const tmp = [];
let i = lo, j = mid + 1;
while (i <= mid && j <= hi)
tmp.push(a[i] <= a[j] ? a[i++] : a[j++]); // <= keeps it stable
while (i <= mid) tmp.push(a[i++]);
while (j <= hi) tmp.push(a[j++]);
for (let k = 0; k < tmp.length; k++) a[lo + k] = tmp[k];
}
def merge_sort(a, lo=0, hi=None):
if hi is None:
hi = len(a) - 1
if lo >= hi:
return # 0 or 1 element
mid = (lo + hi) // 2
merge_sort(a, lo, mid)
merge_sort(a, mid + 1, hi)
# merge
tmp = []
i, j = lo, mid + 1
while i <= mid and j <= hi:
if a[i] <= a[j]: # <= keeps it stable
tmp.append(a[i]); i += 1
else:
tmp.append(a[j]); j += 1
tmp.extend(a[i:mid + 1])
tmp.extend(a[j:hi + 1])
a[lo:hi + 1] = tmp
Complexity
| Case | Time |
|---|---|
| Best | O(n log n) |
| Average | O(n log n) |
| Worst | O(n log n) |
- Space: O(n) for the merge buffer (plus
O(log n)recursion stack). This is the main downside versus in-place sorts. - Stable: yes — using
<=when picking from the left half preserves the order of equal elements.
Gotcha: Changing the merge comparison from
<=to<quietly breaks stability — when elements are equal it would take from the right half first, reversing their original order. Keep<=.
When to use it
- When you need a guaranteed
O(n log n)(no bad-pivot worst case like quick sort) and stability — e.g. sorting objects by a secondary key. Java’sArrays.sortfor objects and Python’ssortedboth use a merge-sort variant (Timsort) for exactly these reasons. - For linked lists and external sorting (data too big for RAM), where its sequential, merge-friendly access pattern beats quick sort.
Going deeper: A bottom-up (iterative) merge sort avoids recursion; for linked lists, merge sort can even be done with
O(1)extra space by relinking nodes.
Next: Quick Sort.