Skip to content
DSA arrays 8 min read

Common String Manipulation Techniques in DSA

Most string problems are built from a handful of manipulation primitives: splitting text into pieces, joining pieces back together, slicing out a substring, and counting how often each character appears. This page shows each one idiomatically in all four languages and notes its cost so you can compose them without surprises. If you haven’t read about string mutability yet, start there — it explains why we join instead of += in a loop.

Split: text into parts

Splitting breaks a string into a list of substrings on a delimiter. It costs O(n) in the length of the string.

#include <string>
#include <vector>
#include <sstream>
std::vector<std::string> split(const std::string& s, char delim) {
    std::vector<std::string> out;
    std::stringstream ss(s);
    std::string part;
    while (std::getline(ss, part, delim)) out.push_back(part);
    return out;
}

Join: parts into text

Joining is the inverse: glue a list of strings together with a separator. Build with join (not repeated +=) to stay O(n).

#include <string>
#include <vector>
std::string join(const std::vector<std::string>& parts, const std::string& sep) {
    std::string out;
    for (size_t i = 0; i < parts.size(); i++) {
        if (i) out += sep;
        out += parts[i];
    }
    return out;
}

Substring: slice a range

A substring copies a contiguous range of characters, costing O(m) for length m. Conventionally the start is inclusive and the end is exclusive.

#include <string>
std::string s = "abcdef";
std::string sub = s.substr(1, 3); // start=1, length=3 -> "bcd"

Pitfall: C++ substr(pos, len) takes a start and a length, while Java substring, JS slice, and Python slicing take a start and an end index. Mixing them up is a common off-by-one bug.

Character frequency

Counting character occurrences underpins anagram checks, “first unique character,” and many hashing problems. Use a hash map (or a fixed-size array of 26/128/256 counts for known alphabets). It’s O(n) time.

#include <string>
#include <unordered_map>
std::unordered_map<char, int> charFreq(const std::string& s) {
    std::unordered_map<char, int> freq;
    for (char c : s) freq[c]++;
    return freq;
}

Tip: When the input is restricted to lowercase English letters, an int[26] array indexed by c - 'a' is faster and lighter than a hash map — same O(n) time, smaller constants and no hashing.

Other handy operations

  • Case conversiontoupper/tolower, toUpperCase(), .upper()/.lower(): O(n).
  • Trim whitespacestrip() / trim(): O(n).
  • Replace / contains / startsWith — convenience methods, typically O(n).
  • Reverse — see the two-pointer technique in array rotation & reversal; on immutable strings, reverse a char array/list then join.

Complexity recap

OperationComplexity
Split / joinO(n)
Substring (length m)O(m)
Character frequencyO(n) time, O(k) space (k = alphabet size)
Build via join vs += loopO(n) vs O(n²)

With these primitives you can tackle most string challenges. Put them to work on the string exercises.

Last updated June 25, 2026
Was this helpful?