Tries (Prefix Trees): Introduction
A trie (pronounced “try”, also called a prefix tree) is a tree-shaped data structure that stores a set of strings by sharing their common prefixes. Each edge represents one character, and the path from the root down to a node spells out a prefix. Tries make prefix queries — “which words start with app?” — fast and natural, which is why they power autocomplete, spell-checkers, and dictionary lookups.
What a trie looks like
Instead of storing whole words in separate slots, a trie breaks each word into characters and lays them along a path. Words that share a beginning share the same nodes, so cat, car, and card all reuse the c → a path:
(root)
|
c
|
a
/ \
t r cat, car
|
d card
Each node marks whether a complete word ends there (an end-of-word flag). That flag is how we tell that car is a stored word even though the path continues on to card.
A minimal trie node
The core building block is a node holding a map from a character to a child node, plus a boolean saying “a word ends here.” Here is the shape of a single node:
struct TrieNode {
std::unordered_map<char, TrieNode*> children;
bool isEnd = false;
};
class TrieNode {
Map<Character, TrieNode> children = new HashMap<>();
boolean isEnd = false;
}
class TrieNode {
constructor() {
this.children = new Map(); // char -> TrieNode
this.isEnd = false;
}
}
class TrieNode:
def __init__(self):
self.children = {} # char -> TrieNode
self.is_end = False
For beginners: A trie is just a tree where you “spell out” a word as you walk down from the root. Don’t worry about the code yet — the next page builds a full trie step by step in trie implementation.
Why use a trie? Complexity vs a hash set
The natural comparison is a hash set of strings. Both can answer “is this exact word present?”, but they differ sharply on cost and capability. Let L be the length of the word and n the number of stored words.
Operation Trie Hash set
----------------------- ----------- ------------------------
Insert a word O(L) O(L) average (must hash)
Search exact word O(L) O(L) average, O(n·L) worst
Starts-with prefix? O(L) O(n·L) — must scan all
All words with prefix O(L + k)* O(n·L) — must scan all
* where k is the size of the matched subtree.
The decisive win is prefix queries. A hash set can only answer “exact match”; to find every word starting with app, it must examine every stored word. A trie walks straight to the app node in O(L) and then collects the subtree. Notice that even exact lookups are worst-case O(L) in a trie with no hashing and no collisions — the cost depends only on the word length, never on how many words you have stored.
Going deeper: Hashing a string is itself
O(L), so a hash set is not magically faster for exact lookups. Tries also enable ordered traversal (you can emit all words in sorted order by visiting children alphabetically), which a hash set cannot do.
The trade-offs
- Memory. A trie can use more memory than a hash set because every distinct character along a path needs a node and a child map. The savings from shared prefixes help, but a fan-out of 26+ children per node adds up.
- Cache behavior. Following pointers through scattered nodes is less cache-friendly than a compact hash table.
- Best fit. Tries shine when you have many strings over a small alphabet and you need prefix-based operations.
When to reach for a trie
Use a trie when you need any of:
- Autocomplete / type-ahead suggestions for a prefix.
- Spell-check and dictionary membership over a fixed alphabet.
- Prefix matching such as IP routing tables or word-search puzzles.
- Longest-prefix lookups.
If you only ever need exact-match membership and never prefixes, a plain hash set is simpler and lighter.
Next, build one from scratch in implementing a trie, then learn the core trie operations and real trie applications. When you are ready, try the trie exercises.