How long can a string go without repeating itself? This classic problem challenges you to find the longest substring in a given string that contains no duplicate characters1. It’s a favorite in technical interviews because it rewards both intuition and algorithmic precision.
Given a string
s
, find the length of the longest substring ins
that contains no repeating characters. In other words, we want the longest contiguous sequence of characters where each character appears only once.
For example, consider the string "abcabcbb"
:
The substrings without repeating characters include
"abc"
,"bca"
,"cab"
, etc. The longest of these has length3
(such as"abc"
).If the input is
"bbbbb"
, every substring oflength > 1
has a repeat, so the longest substring without repeats is just"b"
withlength 1
.For
"pwwkew"
, the answer is3
, with a longest substring being"wke"
or"kew"
(note that"pwke"
is4
characters but is not a contiguous substring, so it doesn’t count).
These examples illustrate the problem: we need to efficiently find the maximum length of a unique-character substring within the given string.
Sliding window for efficient scanning
At first glance, one might consider checking all possible substrings, but that brute-force approach would be extremely inefficient (there are O(n²)
substrings and checking each for uniqueness would make it O(n³)
overall!). Instead, a smarter strategy is needed.
The sliding window technique2 provides an optimal solution by using two indices (or pointers) to maintain a window of characters that has no duplicates:
We expand the window by moving a right pointer through the string, adding characters to the window.
If adding a new character causes a repeat (i.e., the character is already in the current window), we "slide" the left pointer forward to shrink the window and remove the duplicate. Specifically, we move the left pointer to one position past the previous occurrence of that repeated character.
We keep track of the length of the window as we slide through the string, updating the maximum length whenever we find a larger window.
This approach ensures each character is processed at most twice (once when the right pointer visits it, and possibly once when the left pointer passes it), resulting in linear time complexity overall.
How the sliding window works
Use two pointers:
left
(start of the window) andright
(end of the window, as we iterate through the string). Initialize both to0
at the start.Use a data structure (like a hash map or array) to keep track of the last seen index of each character in the current window.
Iterate
right
from0
ton-1
(where n is the length of the string):If the character at
s[right]
has been seen before and its last seen index is ≥left
(meaning it's inside the current window), it means adding this character would create a duplicate in the window. In this case, updateleft
tolast_index_of(s[right]) + 1
. This effectively drops all characters up to and including the previous occurrence ofs[right]
from the window.Update or record the last seen index of
s[right]
to the current indexright
.Calculate the length of the current window as
right - left + 1
and update the maximum length if this is larger than the previous max.
The maximum length recorded by the end of the iteration is the length of the longest substring without repeating characters.
Using this strategy, we avoid re-checking substrings from scratch. The window dynamically adjusts to always maintain the invariant of "no duplicate characters".
To solidify this understanding, let's walk through a quick example with the sliding window approach before diving into code.
Example Walkthrough
Consider s = "abcabcbb"
. We’ll trace the window movement and track important variables:
We use a hash map (or array) to store the index of the last occurrence of each character. As we scan:
At indices
0,1,2,
we only encounter new charactersa, b, c
, so the window expands.At index
3
, we seean
again which was last seen at index0
(inside the current window starting at0
). We move the window start to index0+1 = 1
to drop the earliera
. The new window becomes"bca"
.Index
4
seesb
(last seen at1
, which is now the old window start), so move start to2
. Window becomes"cab"
.Index
5
seesc
(last seen at2
), move start to3
. Window becomes"abc"
.Index
6
seesb
(last seen at4
, which is ≥ current start3
), so move start to5
. Window becomes"cb"
.Index
7
seesb
(last seen at 6), move start to7
. Window becomes"b"
.
Throughout, the window length was tracked, and the largest length seen was 3
. Thus, the answer for "abcabcbb"
is 3
. This sliding window method runs in O(n)
time, far better than brute force.
Implementation
Now, let's implement this solution in three different programming languages – C++, Rust, and Python – using idiomatic and efficient constructs in each.
Solution in C++
Below is a C++ implementation of the longest substring without repeating characters using the sliding window approach. We use an array of size 256
(assuming the extended ASCII character set) to store the last seen index of each character. The logic updates the window start (left
) whenever a repeat is detected.
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
using namespace std;
int lengthOfLongestSubstring(const string& s) {
// Array to store last indices of each possible character, initialized to -1.
vector<int> lastIndex(256, -1);
int maxLen = 0;
int left = 0; // left pointer for the sliding window
for (int i = 0; i < s.size(); ++i) {
char ch = s[i];
// If this character was seen after or at the current left, move the window start
if (lastIndex[ch] >= left) {
left = lastIndex[ch] + 1;
}
// Update last seen index of the character to the current position
lastIndex[ch] = i;
// Calculate window length (i - left + 1) and update max length if needed
int currentLen = i - left + 1;
if (currentLen > maxLen) {
maxLen = currentLen;
}
}
return maxLen;
}
int main() {
string s1 = "abcabcbb";
string s2 = "bbbbb";
string s3 = "pwwkew";
cout << "\"" << s1 << "\" -> " << lengthOfLongestSubstring(s1) << endl;
cout << "\"" << s2 << "\" -> " << lengthOfLongestSubstring(s2) << endl;
cout << "\"" << s3 << "\" -> " << lengthOfLongestSubstring(s3) << endl;
return 0;
}
Explanation
We use
lastIndex[256]
to keep track of where each character was last seen. All values start at-1
(meaning no index).left
is the starting index of our current window (substring with unique chars).maxLen
keeps track of the longest length found.We iterate
i
from 0 tos.size()-1
(theright
pointer of the window):If
s[i]
was seen before at an indexlastIndex[s[i]]
that is at or after the currentleft
, it meanss[i]
is a duplicate in the current window. We then advanceleft
to one position past the last occurrence ofs[i]
(i.e.,left = lastIndex[s[i]] + 1
). This effectively removes the previous occurrence ofs[i]
and all characters before it from the window.Update
lastIndex[s[i]] = i
to mark the current index as the last seen position of characters[i]
.Calculate the length of the current window as
i - left + 1
. If it's greater thanmaxLen
, updatemaxLen
.
By the end of the loop,
maxLen
holds the length of the longest substring without repetition.
Using the earlier example "abcabcbb"
with this code:
Initially,
left = 0
,maxLen = 0
.i = 0
:ch = 'a'
, not seen before (lastIndex['a'] = -1
), so window[0..0]
is"a"
.maxLen
becomes1
.i = 1
:ch = 'b'
, not seen in window, window[0..1]
is"ab"
, length2
, updatemaxLen = 2
.i = 2
:ch = 'c'
, window[0..2]
is"abc"
, length 3, updatemaxLen = 3
.i = 3
:ch = 'a'
, seen before at index 0 which is ≥left
(0), so moveleft
to 0+1 = 1. Now window is[1..3]
("bca"
), length stays3
,maxLen
remains3
.i = 4
:ch = 'b'
, seen at index1
(≥ currentleft 1
), moveleft
to2
. Window[2..4]
="cab"
, length3
.i = 5
:ch = 'c'
, seen at index2
(≥left 2
), moveleft
to3
. Window[3..5]
="abc"
, length3
.i = 6
:ch = 'b'
, seen at index4
(≥left 3
), moveleft
to5
. Window[5..6]
="cb"
, length 2.i = 7
:ch = 'b'
, seen at index6
(≥left 5
), moveleft
to7
. Window[7..7]
="b"
, length1
.The largest window length encountered was
3
.
Time and space complexity
The C++ solution runs in O(n)
time, where n is the length of the string. Each character is processed once; the left
pointer only moves forward and never exceeds n. The use of the array for lastIndex
makes character lookup and updates O(1)
each.
Space complexity is O(m)
where m is the size of the character set. We used an array of size 256
for simplicity (assuming ASCII), which is effectively O(1)
space. The additional space usage doesn’t grow with the input string length.
Solution in Rust
Now let's solve the same problem in Rust. The idiomatic approach in Rust also utilizes a sliding window. We can use a HashMap<char, usize>
to store the last seen positions of characters. Rust’s standard library provides iterators that we can use to get indices and characters easily.
use std::collections::HashMap;
fn length_of_longest_substring(s: &str) -> usize {
let mut last_seen: HashMap<char, usize> = HashMap::new();
let mut left: usize = 0;
let mut max_len: usize = 0;
for (i, ch) in s.chars().enumerate() {
if let Some(&prev_index) = last_seen.get(&ch) {
// If character seen and within current window, move window start
if prev_index >= left {
left = prev_index + 1;
}
}
// Update last seen index of the character
last_seen.insert(ch, i);
// Update max length if needed
let current_len = i - left + 1;
if current_len > max_len {
max_len = current_len;
}
}
max_len
}
fn main() {
let tests = ["abcabcbb", "bbbbb", "pwwkew"];
for &s in &tests {
println!("\"{}\" -> {}", s, length_of_longest_substring(s));
}
}
Explanation
We use a
HashMap<char, usize>
calledlast_seen
to record the index of the last occurrence of each character.left
is the starting index of the current unique substring window, andmax_len
tracks the longest length found.We iterate through the string with
s.chars().enumerate()
, which gives us each characterch
and its indexi
(0-based).If
ch
exists inlast_seen
and the stored indexprev_index
is ≥left
, then the characterch
is repeating in the current window. We moveleft
toprev_index + 1
to start a new window after the previous occurrence ofch
.We then update
last_seen[ch] = i
to the current index.Compute the length of the current window as
i - left + 1
and updatemax_len
if this is larger.
Finally, return
max_len
.
The behavior is identical to the C++ logic, but let's do a quick trace with a sample input in Rust to ensure clarity. Consider s = "pwwkew"
:
Start:
left = 0
,max_len = 0
,last_seen
empty.i = 0
,ch = 'p'
: not seen before, add to map ({'p':0}
), window"p"
length1
,max_len = 1
.i = 1
,ch = 'w'
: not seen, add ({'p':0,'w':1}
), window"pw"
length2
,max_len = 2
.i = 2
,ch = 'w'
: seen at index1
which is ≥left
(0)
, so moveleft
to1+1 = 2
. Update'w'
last seen to2
. Now window starts at index2
, current window"w"
(the second'w'
alone), length1
,max_len
stays2
.i = 3
,ch = 'k'
: not seen, add (...,'k':3
), window"wk"
(from index2
to3
), length 2,max_len = 2
.i = 4
,ch = 'e'
: not seen, add (...,'e':4
), window"wke"
, length 3,max_len
becomes3
.i = 5
,ch = 'w'
: seen at index2
which is ≥left
(2)
, moveleft
to2+1 = 3
. Update'w'
last seen to5
. Now window starts at3
, current window"kew"
(indices3..5
), length3
,max_len
stays3
.End of string. The longest length found is
3
.
Just like in other languages, we maintained the window boundaries and used the map to efficiently skip over repeats.
Time and space complexity
Time complexity is O(n)
for iterating through the string of length n. Each character leads to at most one HashMap lookup and possibly moving the left
pointer forward. The HashMap operations are average O(1)
. Overall linear time.
Space complexity is O(m)
where m is the number of unique characters in the string (at most the size of the character set). In the worst case (all characters are unique), last_seen
will store every character. This is at most O(n)
extra space in the worst case (if n < size
of character set, otherwise O(m)
which is typically O(1)
for fixed-size char sets).
Note: In Rust, if we know the input is limited to ASCII characters, we could optimize space by using an array of length
128
or256
(similar to the C++ approach) instead of a HashMap. However, the HashMap approach is more general and idiomatic for handling arbitrary Unicode characters in the string.
Solution in Python
In Python, we can implement the sliding window approach concisely using a dictionary to store the last seen indices of characters. Python’s high-level data structures make the code quite straightforward.
def length_of_longest_substring(s: str) -> int:
last_index = {} # dictionary to store last seen index of each character
max_len = 0
left = 0 # start index of current window
for i, ch in enumerate(s):
if ch in last_index and last_index[ch] >= left:
# Character repeated in current window, move left just past the last index of ch
left = last_index[ch] + 1
# Update last seen index of ch
last_index[ch] = i
# Calculate current window length and update max length if needed
current_len = i - left + 1
if current_len > max_len:
max_len = current_len
return max_len
# Examples:
print(length_of_longest_substring("abcabcbb")) # Output: 3
print(length_of_longest_substring("bbbbb")) # Output: 1
print(length_of_longest_substring("pwwkew")) # Output: 3
Explanation
We use a dictionary
last_index
to keep track of the last index at which each character was seen.left
is the starting index of the current substring window with unique characters, andmax_len
records the longest length found.We loop over the string with
enumerate
to get both indexi
and characterch
:If
ch
is already inlast_index
and its stored index>= left
, it means adding thisch
would cause a duplicate in the current window. So, we updateleft
tolast_index[ch] + 1
to move past the previous occurrence ofch
in the window.Update
last_index[ch] = i
to record thatch
was last seen at indexi
.Compute the length of the current window as
i - left + 1
and updatemax_len
if this length is greater than the currentmax_len
.
Return
max_len
after iterating through the string.
To illustrate the process with a quick example, let's trace the code with s = "abba"
:
Start:
left = 0
,max_len = 0
,last_index = {}
.i = 0
,ch = 'a'
: not inlast_index
, so no need to moveleft
. Setlast_index['a'] = 0
. Window is"a"
(from index0
to0
), length1
, updatemax_len = 1
.i = 1
,ch = 'b'
: not seen before, setlast_index['b'] = 1
. Window"ab"
(0 to 1), length 2, updatemax_len = 2
.i = 2
,ch = 'b'
:'b'
is inlast_index
(last_index['b'] = 1
) and1
≥left
(0), so this'b'
is a repeat in the window. Moveleft
to1 + 1 = 2
(start after the previous'b'
). Updatelast_index['b'] = 2
. Now window is"b"
(from index2
to2
, since we essentially restarted after the first 'b'), length1
,max_len
stays2
.i = 3
,ch = 'a'
:'a'
is inlast_index
(last_index['a'] = 0
) but0 < left (2)
, meaning the previous'a'
is not in the current window (current window starts at2
). So we don't moveleft
. Updatelast_index['a'] = 3
. Window is"ba"
(from index2
to3
), length2
,max_len
remains2
.End of string. Longest length found is
2
, corresponding to substrings like"ab"
or"ba"
.
The Python code follows the same logic as the C++ and Rust solutions, leveraging Python's dictionary for quick lookups and updates.
Time and space complexity
Time complexity is O(n)
, where n is the length of the string. We traverse the string once. Dictionary operations for insertions and lookups are average-case O(1)
.
Space complexity is O(m)
where m is the number of distinct characters in the string. In the worst case (e.g., all characters in the string are unique), the dictionary will have an entry for each character, so space usage is O(n)
. Typically, m would be bounded by the character set (e.g., 26
letters or 128
ASCII characters), making it effectively O(1)
additional space.
Invite a friend & score a $50 voucher! 🎉
Together, we’ve built an incredible community, and now… it’s time to grow it even bigger!
Conclusion
In this article, we explored the "Longest Substring Without Repeating Characters" problem and demonstrated how a clever sliding window approach can drastically improve performance over naive methods. By maintaining a window of unique characters and adjusting its boundaries when a repeat is encountered, we achieve a linear time solution. We implemented this solution in three different languages (C++, Rust, and Python), showcasing the idiomatic practices and strengths of each.
Through the step-by-step walkthrough and trace table, we saw how the algorithm operates on a sample input, reinforcing understanding. The key takeaway is recognizing patterns like the sliding window for problems involving substrings or subarrays with certain conditions.
Further Exploration
This problem can be extended or varied in many ways. For instance, one might be asked to actually return the longest substring itself (not just the length), or to handle streams of characters in real-time. Another variation could involve finding longest substrings that meet different criteria (like at most k repeating characters, exactly k distinct characters, etc.). Understanding this sliding window technique provides a strong foundation to tackle such challenges.
Keep practicing by applying this approach to other problems, and you'll further solidify the concept and your ability to optimize string-handling algorithms.
Happy coding!