Skip to content
DSA hashing 6 min read

Hash Sets: Fast Membership and Deduplication

A hash set is a hash table that stores only keys, no values — a collection of unique elements with O(1) average add, remove, and “is this in the set?” membership tests. It is the go-to tool for two everyday jobs: checking whether you have seen something before, and removing duplicates. The built-ins are C++ std::unordered_set, Java HashSet, JavaScript Set, and Python set.

The core operations

#include <unordered_set>

std::unordered_set<int> s;
s.insert(7);            // add — O(1) average
bool has = s.count(7);  // contains? — O(1) average (1 if present)
s.erase(7);             // remove — O(1) average
size_t n = s.size();

Deduplicate a list

Building a set from a collection drops duplicates instantly. Remember: the result is unordered, so do not rely on element order.

#include <unordered_set>
#include <vector>

std::vector<int> dedup(const std::vector<int>& a) {
    std::unordered_set<int> seen(a.begin(), a.end());
    return std::vector<int>(seen.begin(), seen.end()); // order not guaranteed
}

”Have I seen this before?” — detect duplicates

A set turns the O(n²) nested-loop duplicate check (see Big-O notation) into a single O(n) pass: add as you go, and the first time add/insert fails — or the value is already present — you found a repeat.

#include <unordered_set>
#include <vector>

bool hasDuplicate(const std::vector<int>& a) {
    std::unordered_set<int> seen;
    for (int x : a) {
        if (seen.count(x)) return true; // seen before
        seen.insert(x);
    }
    return false;
}

Set algebra: union, intersection, difference

Sets shine for comparing collections — “what’s in both?”, “what’s in A but not B?”.

#include <unordered_set>
#include <vector>

// Intersection: elements present in both sets.
std::vector<int> intersect(const std::unordered_set<int>& a,
                           const std::unordered_set<int>& b) {
    std::vector<int> out;
    const auto& small = a.size() < b.size() ? a : b;
    const auto& big   = a.size() < b.size() ? b : a;
    for (int x : small) if (big.count(x)) out.push_back(x);
    return out;
}

For beginners: A set is just a map whose values you ignore. If you ever need to attach data to each unique key (a count, a position, a name), switch from a set to a hash map.

Going deeper: Need the elements in sorted order or range queries? Use an ordered set — C++ std::set / Java TreeSet (balanced trees, O(log n)). JS and Python have no built-in sorted set; sort on demand or use a library. A hash set trades ordering for speed.

Complexity at a glance

OperationAverageWorst case
add / remove / containsO(1)O(n)
build set from n itemsO(n)O(n²)
union / intersection / differenceO(n)O(n²)

Next: put maps and sets to work on real problems — hashing applications & patterns.

Last updated June 25, 2026
Was this helpful?