Merge Sort Explained: Divide & Conquer, Code & Complexity

Merge sort is a divide-and-conquer algorithm: it splits the array into halves, recursively sorts each half, then merges the two sorted halves into one sorted whole. Its great strength is a guaranteed O(n log n) running time in every case plus stability — the price is O(n) extra memory for the merge step.

The idea

Divide: split the array at the middle into left and right halves.
Conquer: recursively merge-sort each half. A subarray of length 0 or 1 is already sorted (the base case).
Combine: merge the two sorted halves by repeatedly taking the smaller front element of the two.

The merge is the heart of it. Given [1, 4] and [2, 3], you compare fronts: 1 < 2 → take 1; 4 vs 2 → take 2; 4 vs 3 → take 3; left has 4 → take 4. Result [1, 2, 3, 4], done in a single linear pass.

Why it is O(n log n)

The recursion halves the array each level, so there are log n levels. Each level merges a total of n elements. n work per level × log n levels = O(n log n) — and this holds for best, average, and worst case, because the split is always even regardless of the data.

Implementation

A clear top-down version. merge combines two sorted halves into a scratch buffer, then copies back.

void merge(std::vector<int>& a, int lo, int mid, int hi) {
    std::vector<int> tmp;
    tmp.reserve(hi - lo + 1);
    int i = lo, j = mid + 1;
    while (i <= mid && j <= hi)
        tmp.push_back(a[i] <= a[j] ? a[i++] : a[j++]); // <= keeps it stable
    while (i <= mid) tmp.push_back(a[i++]);
    while (j <= hi)  tmp.push_back(a[j++]);
    for (int k = 0; k < (int)tmp.size(); k++) a[lo + k] = tmp[k];
}

void mergeSort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;            // 0 or 1 element
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);
    mergeSort(a, mid + 1, hi);
    merge(a, lo, mid, hi);
}

void merge(int[] a, int lo, int mid, int hi) {
    int[] tmp = new int[hi - lo + 1];
    int i = lo, j = mid + 1, k = 0;
    while (i <= mid && j <= hi)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++]; // <= keeps it stable
    while (i <= mid) tmp[k++] = a[i++];
    while (j <= hi)  tmp[k++] = a[j++];
    for (int t = 0; t < tmp.length; t++) a[lo + t] = tmp[t];
}

void mergeSort(int[] a, int lo, int hi) {
    if (lo >= hi) return;            // 0 or 1 element
    int mid = lo + (hi - lo) / 2;
    mergeSort(a, lo, mid);
    mergeSort(a, mid + 1, hi);
    merge(a, lo, mid, hi);
}

function mergeSort(a, lo = 0, hi = a.length - 1) {
  if (lo >= hi) return;             // 0 or 1 element
  const mid = lo + ((hi - lo) >> 1);
  mergeSort(a, lo, mid);
  mergeSort(a, mid + 1, hi);
  // merge
  const tmp = [];
  let i = lo, j = mid + 1;
  while (i <= mid && j <= hi)
    tmp.push(a[i] <= a[j] ? a[i++] : a[j++]); // <= keeps it stable
  while (i <= mid) tmp.push(a[i++]);
  while (j <= hi)  tmp.push(a[j++]);
  for (let k = 0; k < tmp.length; k++) a[lo + k] = tmp[k];
}

def merge_sort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return                        # 0 or 1 element
    mid = (lo + hi) // 2
    merge_sort(a, lo, mid)
    merge_sort(a, mid + 1, hi)
    # merge
    tmp = []
    i, j = lo, mid + 1
    while i <= mid and j <= hi:
        if a[i] <= a[j]:              # <= keeps it stable
            tmp.append(a[i]); i += 1
        else:
            tmp.append(a[j]); j += 1
    tmp.extend(a[i:mid + 1])
    tmp.extend(a[j:hi + 1])
    a[lo:hi + 1] = tmp

Complexity

Case	Time
Best	O(n log n)
Average	O(n log n)
Worst	O(n log n)

Space: O(n) for the merge buffer (plus O(log n) recursion stack). This is the main downside versus in-place sorts.
Stable: yes — using <= when picking from the left half preserves the order of equal elements.

Gotcha: Changing the merge comparison from <= to < quietly breaks stability — when elements are equal it would take from the right half first, reversing their original order. Keep <=.

When to use it

When you need a guaranteed O(n log n) (no bad-pivot worst case like quick sort) and stability — e.g. sorting objects by a secondary key. Java’s Arrays.sort for objects and Python’s sorted both use a merge-sort variant (Timsort) for exactly these reasons.
For linked lists and external sorting (data too big for RAM), where its sequential, merge-friendly access pattern beats quick sort.

Going deeper: A bottom-up (iterative) merge sort avoids recursion; for linked lists, merge sort can even be done with O(1) extra space by relinking nodes.

Next: Quick Sort.