Union-Find (Disjoint Set Union) Explained
Union-Find, also called Disjoint Set Union (DSU), is a data structure that keeps a collection of elements partitioned into non-overlapping sets and answers one question fast: are these two elements in the same set? It supports two operations — find (which set does x belong to?) and union (merge the two sets containing x and y). With two optimizations it runs in near-constant time per operation, which makes it the go-to tool for connectivity problems.
The core idea: parent pointers
Each element points to a parent. Follow parents upward and you reach a root — the representative of the whole set. Two elements are in the same set exactly when they share a root. Initially every element is its own parent (its own one-element set).
struct DSU {
std::vector<int> parent;
DSU(int n) : parent(n) {
for (int i = 0; i < n; i++) parent[i] = i; // each its own root
}
int find(int x) { // walk up to the root
while (parent[x] != x) x = parent[x];
return x;
}
};
class DSU {
int[] parent;
DSU(int n) {
parent = new int[n];
for (int i = 0; i < n; i++) parent[i] = i; // each its own root
}
int find(int x) { // walk up to the root
while (parent[x] != x) x = parent[x];
return x;
}
}
class DSU {
constructor(n) {
this.parent = Array.from({ length: n }, (_, i) => i); // each its own root
}
find(x) { // walk up to the root
while (this.parent[x] !== x) x = this.parent[x];
return x;
}
}
class DSU:
def __init__(self, n):
self.parent = list(range(n)) # each its own root
def find(self, x): # walk up to the root
while self.parent[x] != x:
x = self.parent[x]
return x
Without optimizations, the parent chains can grow into a tall line, making find O(n). Two tricks fix that.
Optimization 1: union by rank
When merging two trees, attach the shorter tree under the taller one so the result stays shallow. We track each root’s rank (an upper bound on its height). This keeps trees logarithmic in height on their own.
Optimization 2: path compression
During find, repoint every node we pass straight to the root. Future lookups then take one hop. Combined with union by rank, the amortized cost per operation is O(α(n)), where α is the inverse Ackermann function — effectively constant (≤ 4 for any input you’ll ever see).
struct DSU {
std::vector<int> parent, rank_;
DSU(int n) : parent(n), rank_(n, 0) {
for (int i = 0; i < n; i++) parent[i] = i;
}
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]); // path compression
return parent[x];
}
bool unite(int a, int b) {
int ra = find(a), rb = find(b);
if (ra == rb) return false; // already connected
if (rank_[ra] < rank_[rb]) std::swap(ra, rb);
parent[rb] = ra; // union by rank
if (rank_[ra] == rank_[rb]) rank_[ra]++;
return true;
}
};
class DSU {
int[] parent, rank;
DSU(int n) {
parent = new int[n];
rank = new int[n];
for (int i = 0; i < n; i++) parent[i] = i;
}
int find(int x) {
if (parent[x] != x) parent[x] = find(parent[x]); // path compression
return parent[x];
}
boolean union(int a, int b) {
int ra = find(a), rb = find(b);
if (ra == rb) return false; // already connected
if (rank[ra] < rank[rb]) { int t = ra; ra = rb; rb = t; }
parent[rb] = ra; // union by rank
if (rank[ra] == rank[rb]) rank[ra]++;
return true;
}
}
class DSU {
constructor(n) {
this.parent = Array.from({ length: n }, (_, i) => i);
this.rank = new Array(n).fill(0);
}
find(x) {
if (this.parent[x] !== x) this.parent[x] = this.find(this.parent[x]); // path compression
return this.parent[x];
}
union(a, b) {
let ra = this.find(a), rb = this.find(b);
if (ra === rb) return false; // already connected
if (this.rank[ra] < this.rank[rb]) [ra, rb] = [rb, ra];
this.parent[rb] = ra; // union by rank
if (this.rank[ra] === this.rank[rb]) this.rank[ra]++;
return true;
}
}
class DSU:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression
return self.parent[x]
def union(self, a, b):
ra, rb = self.find(a), self.find(b)
if ra == rb:
return False # already connected
if self.rank[ra] < self.rank[rb]:
ra, rb = rb, ra
self.parent[rb] = ra # union by rank
if self.rank[ra] == self.rank[rb]:
self.rank[ra] += 1
return True
For beginners:
unionreturnsfalsewhen the two elements were already in the same set. That return value is handy — for example, it tells Kruskal’s MST that an edge would form a cycle.
Complexity
| Operation | Time |
|---|---|
find | O(α(n)) amortized |
union | O(α(n)) amortized |
| Build (n singletons) | O(n) |
| Space | O(n) |
Common pitfalls
- Skipping
findbefore comparing. Compare roots, not the raw elements —parent[x]may not be the root. - Uniting without checking roots. Always
findboth before linking, or you can attach a root to itself or create a cycle. - Recursion depth. Recursive path compression is clean, but on adversarial input the first call can recurse deeply; an iterative two-pass
findavoids stack overflow in languages with small stacks.
When to use it
Reach for Union-Find whenever you merge groups and test connectivity: counting connected components in a graph, detecting cycles while building a minimum spanning tree, grouping accounts, or the classic “number of provinces” problem. If you need to split sets again, DSU is the wrong tool — it only merges.
Ready to practice? Try the advanced exercises.