Collision Resolution: Chaining vs Open Addressing

Collision resolution is how a hash table stores two keys that hash to the same bucket. Because collisions are unavoidable, every real hash table needs a strategy. The two classics are separate chaining (each bucket holds a list of entries) and open addressing (collided entries spill into other buckets). Both keep operations O(1) on average as long as the load factor stays low.

Separate chaining

Each bucket points to a small list of all entries that hashed there. To find a key, jump to its bucket and scan that short list.

#include <vector>
#include <list>
#include <utility>

struct ChainedMap {
    std::vector<std::list<std::pair<int,int>>> buckets;
    ChainedMap(int n) : buckets(n) {}
    int idx(int key) { return ((key % (int)buckets.size()) + buckets.size()) % buckets.size(); }

    void put(int key, int val) {
        auto& chain = buckets[idx(key)];
        for (auto& kv : chain) if (kv.first == key) { kv.second = val; return; }
        chain.push_back({key, val}); // collision? append to the chain
    }
    bool get(int key, int& out) {
        for (auto& kv : buckets[idx(key)]) if (kv.first == key) { out = kv.second; return true; }
        return false;
    }
};

import java.util.*;

class ChainedMap {
    List<List<int[]>> buckets;
    ChainedMap(int n) {
        buckets = new ArrayList<>();
        for (int i = 0; i < n; i++) buckets.add(new LinkedList<>());
    }
    int idx(int key) { return Math.floorMod(key, buckets.size()); }

    void put(int key, int val) {
        List<int[]> chain = buckets.get(idx(key));
        for (int[] kv : chain) if (kv[0] == key) { kv[1] = val; return; }
        chain.add(new int[]{key, val}); // collision? append to the chain
    }
    Integer get(int key) {
        for (int[] kv : buckets.get(idx(key))) if (kv[0] == key) return kv[1];
        return null;
    }
}

class ChainedMap {
  constructor(n) {
    this.buckets = Array.from({ length: n }, () => []);
  }
  idx(key) { return ((key % this.buckets.length) + this.buckets.length) % this.buckets.length; }

  put(key, val) {
    const chain = this.buckets[this.idx(key)];
    for (const kv of chain) if (kv[0] === key) { kv[1] = val; return; }
    chain.push([key, val]); // collision? append to the chain
  }
  get(key) {
    for (const kv of this.buckets[this.idx(key)]) if (kv[0] === key) return kv[1];
    return undefined;
  }
}

class ChainedMap:
    def __init__(self, n):
        self.buckets = [[] for _ in range(n)]

    def idx(self, key):
        return key % len(self.buckets)

    def put(self, key, val):
        chain = self.buckets[self.idx(key)]
        for kv in chain:
            if kv[0] == key:
                kv[1] = val
                return
        chain.append([key, val])  # collision? append to the chain

    def get(self, key):
        for kv in self.buckets[self.idx(key)]:
            if kv[0] == key:
                return kv[1]
        return None

With a good hash and load factor α, each chain has length ~α, so operations are O(1 + α) — effectively O(1). Chaining is simple, degrades gracefully, and is what Java’s HashMap and C++‘s std::unordered_map use. (Java even upgrades a long chain into a balanced tree, capping the worst case at O(log n).)

Open addressing (linear probing)

Here every entry lives directly in the bucket array — no side lists. On a collision, you probe the next slots until you find an empty one. Linear probing just checks index, index+1, index+2, … (wrapping around).

#include <vector>

struct LinearProbeMap {
    std::vector<int> keys, vals, used; // used: 0 empty, 1 occupied
    LinearProbeMap(int n) : keys(n), vals(n), used(n, 0) {}

    void put(int key, int val) {
        int n = keys.size(), i = ((key % n) + n) % n;
        while (used[i] && keys[i] != key) i = (i + 1) % n; // probe forward
        keys[i] = key; vals[i] = val; used[i] = 1;
    }
    bool get(int key, int& out) {
        int n = keys.size(), i = ((key % n) + n) % n, steps = 0;
        while (used[i] && steps++ < n) {
            if (keys[i] == key) { out = vals[i]; return true; }
            i = (i + 1) % n;
        }
        return false;
    }
};

class LinearProbeMap {
    int[] keys, vals; boolean[] used;
    LinearProbeMap(int n) { keys = new int[n]; vals = new int[n]; used = new boolean[n]; }

    void put(int key, int val) {
        int n = keys.length, i = Math.floorMod(key, n);
        while (used[i] && keys[i] != key) i = (i + 1) % n; // probe forward
        keys[i] = key; vals[i] = val; used[i] = true;
    }
    Integer get(int key) {
        int n = keys.length, i = Math.floorMod(key, n), steps = 0;
        while (used[i] && steps++ < n) {
            if (keys[i] == key) return vals[i];
            i = (i + 1) % n;
        }
        return null;
    }
}

class LinearProbeMap {
  constructor(n) {
    this.keys = new Array(n);
    this.vals = new Array(n);
    this.used = new Array(n).fill(false);
  }
  put(key, val) {
    const n = this.used.length;
    let i = ((key % n) + n) % n;
    while (this.used[i] && this.keys[i] !== key) i = (i + 1) % n; // probe forward
    this.keys[i] = key; this.vals[i] = val; this.used[i] = true;
  }
  get(key) {
    const n = this.used.length;
    let i = ((key % n) + n) % n, steps = 0;
    while (this.used[i] && steps++ < n) {
      if (this.keys[i] === key) return this.vals[i];
      i = (i + 1) % n;
    }
    return undefined;
  }
}

class LinearProbeMap:
    def __init__(self, n):
        self.keys = [None] * n
        self.vals = [None] * n

    def put(self, key, val):
        n = len(self.keys)
        i = key % n
        while self.keys[i] is not None and self.keys[i] != key:
            i = (i + 1) % n  # probe forward
        self.keys[i] = key
        self.vals[i] = val

    def get(self, key):
        n = len(self.keys)
        i = key % n
        steps = 0
        while self.keys[i] is not None and steps < n:
            if self.keys[i] == key:
                return self.vals[i]
            i = (i + 1) % n
            steps += 1
        return None

Open addressing is cache-friendly (everything is in one contiguous array, no pointer chasing) and uses less memory per entry. Its weakness is clustering: runs of occupied slots grow and slow down probes, and it must stay well below full (resize around α ≈ 0.7). Python’s dict and set use open addressing.

Deletion gotcha: In open addressing you cannot just blank a slot — that would break the probe chain for keys stored after it. Use a tombstone marker (a “deleted” flag) so lookups keep probing past it.

Chaining vs open addressing

Aspect	Separate chaining	Open addressing
Where collisions go	Side list per bucket	Other slots in the array
Memory	Extra per-node pointers	Compact array
Cache behavior	Worse (pointer chasing)	Better (contiguous)
Tolerates high load?	Yes (degrades gently)	No (must keep α low)
Deletion	Easy (remove from list)	Needs tombstones
Used by	Java HashMap, C++ unordered_map	Python dict, open-addressed sets

Complexity summary

Both strategies are O(1) average for insert, lookup, and delete, and O(n) worst case when everything collides. Keeping a good hash function and a low load factor is what makes the average case the case you actually get.

For beginners: You rarely implement these yourself — you use the built-in map. But knowing why a hash table can occasionally be slow (bad hash, high load, pathological keys) helps you debug performance and answer interview questions.

Next: put theory to work with the built-in maps in four languages.