Hashing Applications & Patterns: Counting, Two-Sum, Grouping
Once you have a hash map and a hash set, the same handful of patterns solve a huge share of coding problems. The unifying idea: store something you have seen so a later step can answer a question in O(1) instead of re-scanning. This page covers the three you will use most — frequency counting, the complement (two-sum) lookup, and grouping by a key.
Pattern 1: Frequency counting
Tally how often each value appears in one O(n) pass. This underlies “most frequent element”, “first non-repeating”, “is anagram”, and many more.
#include <unordered_map>
#include <vector>
// Return the element that appears most often.
int mostFrequent(const std::vector<int>& a) {
std::unordered_map<int,int> freq;
int best = a[0], bestCount = 0;
for (int x : a) {
int c = ++freq[x];
if (c > bestCount) { bestCount = c; best = x; }
}
return best;
}
import java.util.*;
int mostFrequent(int[] a) {
Map<Integer,Integer> freq = new HashMap<>();
int best = a[0], bestCount = 0;
for (int x : a) {
int c = freq.merge(x, 1, Integer::sum);
if (c > bestCount) { bestCount = c; best = x; }
}
return best;
}
function mostFrequent(a) {
const freq = new Map();
let best = a[0], bestCount = 0;
for (const x of a) {
const c = (freq.get(x) ?? 0) + 1;
freq.set(x, c);
if (c > bestCount) { bestCount = c; best = x; }
}
return best;
}
from collections import Counter
def most_frequent(a):
freq = Counter(a)
return max(freq, key=freq.get)
Complexity: O(n) time, O(k) space for k distinct values.
Pattern 2: Complement lookup (two-sum)
Instead of checking every pair (O(n²)), remember each value you have passed in a map. For each new value x, ask the map whether the thing it needs — the complement target - x — was already seen. One pass, O(n).
#include <unordered_map>
#include <vector>
std::vector<int> twoSum(const std::vector<int>& nums, int target) {
std::unordered_map<int,int> seen; // value -> index
for (int i = 0; i < (int)nums.size(); i++) {
int need = target - nums[i];
if (seen.count(need)) return {seen[need], i};
seen[nums[i]] = i;
}
return {};
}
import java.util.*;
int[] twoSum(int[] nums, int target) {
Map<Integer,Integer> seen = new HashMap<>(); // value -> index
for (int i = 0; i < nums.length; i++) {
int need = target - nums[i];
if (seen.containsKey(need)) return new int[]{seen.get(need), i};
seen.put(nums[i], i);
}
return new int[]{};
}
function twoSum(nums, target) {
const seen = new Map(); // value -> index
for (let i = 0; i < nums.length; i++) {
const need = target - nums[i];
if (seen.has(need)) return [seen.get(need), i];
seen.set(nums[i], i);
}
return [];
}
def two_sum(nums, target):
seen = {} # value -> index
for i, x in enumerate(nums):
need = target - x
if need in seen:
return [seen[need], i]
seen[x] = i
return []
Complexity: O(n) time, O(n) space. The same “look up the complement” idea powers pair/triple-sum problems and “does a subarray exist with sum k” (using prefix sums — see prefix sums).
Pitfall: Insert
xafter checking for its complement, or a single element could match itself.
Pattern 3: Grouping by a key
Bucket items that share some computed property by using that property as the map key. The classic is group anagrams: words that share a sorted-letter signature belong together.
#include <unordered_map>
#include <vector>
#include <string>
#include <algorithm>
std::vector<std::vector<std::string>> groupAnagrams(const std::vector<std::string>& words) {
std::unordered_map<std::string, std::vector<std::string>> groups;
for (const auto& w : words) {
std::string key = w;
std::sort(key.begin(), key.end()); // signature: sorted letters
groups[key].push_back(w);
}
std::vector<std::vector<std::string>> out;
for (auto& [k, v] : groups) out.push_back(v);
return out;
}
import java.util.*;
List<List<String>> groupAnagrams(String[] words) {
Map<String, List<String>> groups = new HashMap<>();
for (String w : words) {
char[] c = w.toCharArray();
Arrays.sort(c);
String key = new String(c); // signature: sorted letters
groups.computeIfAbsent(key, k -> new ArrayList<>()).add(w);
}
return new ArrayList<>(groups.values());
}
function groupAnagrams(words) {
const groups = new Map();
for (const w of words) {
const key = [...w].sort().join(""); // signature: sorted letters
if (!groups.has(key)) groups.set(key, []);
groups.get(key).push(w);
}
return [...groups.values()];
}
from collections import defaultdict
def group_anagrams(words):
groups = defaultdict(list)
for w in words:
key = "".join(sorted(w)) # signature: sorted letters
groups[key].append(w)
return list(groups.values())
Complexity: O(n · k log k) for n words of length up to k (the per-word sort dominates).
Recognizing the pattern
Ask yourself these and reach for a hash structure when the answer is yes:
- “Have I seen this before?” → hash set.
- “How many times does each thing appear?” → frequency map.
- “Is there another element that completes a target?” → complement lookup.
- “Group items that share a property?” → map from property → list.
For beginners: Hashing is the most common way to remove a nested loop. Whenever you catch yourself writing “for each element, scan the rest”, ask whether a map of what you have already seen would let you answer in one pass.
Going deeper: Hash lookups are O(1) average but have constant-factor overhead and poor cache locality versus arrays. For small, dense integer keys, a plain array used as a direct-address table (or a fixed
int[26]for lowercase letters) is faster than a hash map. Profile before assuming a map is fastest.
Ready to practice? Head to the hashing exercises.