Skip to content
DSA hashing 8 min read

Collision Resolution: Chaining vs Open Addressing

Collision resolution is how a hash table stores two keys that hash to the same bucket. Because collisions are unavoidable, every real hash table needs a strategy. The two classics are separate chaining (each bucket holds a list of entries) and open addressing (collided entries spill into other buckets). Both keep operations O(1) on average as long as the load factor stays low.

Separate chaining

Each bucket points to a small list of all entries that hashed there. To find a key, jump to its bucket and scan that short list.

#include <vector>
#include <list>
#include <utility>

struct ChainedMap {
    std::vector<std::list<std::pair<int,int>>> buckets;
    ChainedMap(int n) : buckets(n) {}
    int idx(int key) { return ((key % (int)buckets.size()) + buckets.size()) % buckets.size(); }

    void put(int key, int val) {
        auto& chain = buckets[idx(key)];
        for (auto& kv : chain) if (kv.first == key) { kv.second = val; return; }
        chain.push_back({key, val}); // collision? append to the chain
    }
    bool get(int key, int& out) {
        for (auto& kv : buckets[idx(key)]) if (kv.first == key) { out = kv.second; return true; }
        return false;
    }
};

With a good hash and load factor α, each chain has length ~α, so operations are O(1 + α) — effectively O(1). Chaining is simple, degrades gracefully, and is what Java’s HashMap and C++‘s std::unordered_map use. (Java even upgrades a long chain into a balanced tree, capping the worst case at O(log n).)

Open addressing (linear probing)

Here every entry lives directly in the bucket array — no side lists. On a collision, you probe the next slots until you find an empty one. Linear probing just checks index, index+1, index+2, … (wrapping around).

#include <vector>

struct LinearProbeMap {
    std::vector<int> keys, vals, used; // used: 0 empty, 1 occupied
    LinearProbeMap(int n) : keys(n), vals(n), used(n, 0) {}

    void put(int key, int val) {
        int n = keys.size(), i = ((key % n) + n) % n;
        while (used[i] && keys[i] != key) i = (i + 1) % n; // probe forward
        keys[i] = key; vals[i] = val; used[i] = 1;
    }
    bool get(int key, int& out) {
        int n = keys.size(), i = ((key % n) + n) % n, steps = 0;
        while (used[i] && steps++ < n) {
            if (keys[i] == key) { out = vals[i]; return true; }
            i = (i + 1) % n;
        }
        return false;
    }
};

Open addressing is cache-friendly (everything is in one contiguous array, no pointer chasing) and uses less memory per entry. Its weakness is clustering: runs of occupied slots grow and slow down probes, and it must stay well below full (resize around α ≈ 0.7). Python’s dict and set use open addressing.

Deletion gotcha: In open addressing you cannot just blank a slot — that would break the probe chain for keys stored after it. Use a tombstone marker (a “deleted” flag) so lookups keep probing past it.

Chaining vs open addressing

AspectSeparate chainingOpen addressing
Where collisions goSide list per bucketOther slots in the array
MemoryExtra per-node pointersCompact array
Cache behaviorWorse (pointer chasing)Better (contiguous)
Tolerates high load?Yes (degrades gently)No (must keep α low)
DeletionEasy (remove from list)Needs tombstones
Used byJava HashMap, C++ unordered_mapPython dict, open-addressed sets

Complexity summary

Both strategies are O(1) average for insert, lookup, and delete, and O(n) worst case when everything collides. Keeping a good hash function and a low load factor is what makes the average case the case you actually get.

For beginners: You rarely implement these yourself — you use the built-in map. But knowing why a hash table can occasionally be slow (bad hash, high load, pathological keys) helps you debug performance and answer interview questions.

Next: put theory to work with the built-in maps in four languages.

Last updated June 25, 2026
Was this helpful?