The “Find Words That Can Be Formed by Characters” problem is an interesting coding challenge that tests your knowledge of string manipulation and data structure usage in Python. This problem is especially relevant for applications that require text analysis, like natural language processing and search engines. In this in-depth article, we will explore the problem statement, multiple approaches to solving it, and evaluate their complexities.
Problem Description
The problem statement from Leetcode is as follows:
You are given an array of strings words and a string chars. A string is good if it can be formed by characters from chars (each character can only be used once). Return the sum of the lengths of all good strings in words.
Example:
Input: words = ["cat","bt","hat","tree"], chars = "atach"
Output: 6
Explanation:
The strings that can be formed are “cat” and “hat” so the answer is 3 + 3 = 6.
Approach 1: Brute-Force Solution
The naive approach is to check each word in words against the string chars. For each word, we create a fresh copy of chars and try to form the word by removing characters from the copy as they are used. If every character of the word is found, we add its length to the total.
Here’s the Python code for the brute-force approach:
def countCharacters(words, chars):
    total_length = 0
    for word in words:
        temp_chars = list(chars)  # fresh pool of available characters
        for c in word:
            if c in temp_chars:
                temp_chars.remove(c)  # consume one occurrence
            else:
                break  # character unavailable: word cannot be formed
        else:
            total_length += len(word)  # loop finished without break
    return total_length
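As a quick sanity check, here is the brute-force version run against the example from the problem statement, plus an extra test case of my own with repeated letters:

```python
def countCharacters(words, chars):
    total_length = 0
    for word in words:
        temp_chars = list(chars)  # fresh pool of available characters
        for c in word:
            if c in temp_chars:
                temp_chars.remove(c)  # consume one occurrence
            else:
                break  # character unavailable: word cannot be formed
        else:
            total_length += len(word)  # loop finished without break
    return total_length

print(countCharacters(["cat", "bt", "hat", "tree"], "atach"))  # 6
print(countCharacters(["aa", "a"], "a"))  # 1: "aa" needs two a's, only one exists
```

Note the for/else construct: the else branch runs only when the inner loop completes without hitting break, which is exactly the "all characters were found" case.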
Time Complexity:
The time complexity is O(n × m × k), where n is the number of words, m is the average word length, and k is the length of chars: for every character of every word, the membership test and removal on the list copy each scan up to k elements.
Space Complexity:
The space complexity is O(k) for the temporary copy of chars.
Approach 2: Using Hash Maps
To improve upon the naive solution, we can use a hash map to store the frequency of each character in chars. This way, we can look up a character's available count in constant time instead of scanning through the entire chars string.
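Before looking at the code, it helps to see what collections.Counter actually produces; a minimal illustration:

```python
from collections import Counter

chars_count = Counter("atach")  # dict-like map of character -> frequency
print(chars_count["a"])  # 2: 'a' appears twice in "atach"
print(chars_count["z"])  # 0: missing keys default to zero, no KeyError
```

That zero-default behavior for missing keys is what makes Counter convenient here: we never need to special-case characters that are absent from chars.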
Here’s the Python code for this approach:
from collections import Counter

def countCharacters(words, chars):
    total_length = 0
    chars_count = Counter(chars)  # frequency of each character in chars
    for word in words:
        word_count = Counter(word)  # frequency of each character in the word
        for c, count in word_count.items():
            if chars_count.get(c, 0) < count:
                break  # chars does not have enough of character c
        else:
            total_length += len(word)  # every requirement was met
    return total_length
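As an aside, Counter also supports multiset subtraction, which allows a more compact (though not necessarily faster) variant of the same idea: Counter(word) - chars_count keeps only positive counts, so the result is empty exactly when chars covers every character of the word. A sketch:

```python
from collections import Counter

def countCharacters(words, chars):
    chars_count = Counter(chars)
    # Counter subtraction drops non-positive counts, so an empty
    # result means chars can supply every character of the word.
    return sum(len(word) for word in words
               if not (Counter(word) - chars_count))

print(countCharacters(["cat", "bt", "hat", "tree"], "atach"))  # 6
```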
Time Complexity:
The time complexity drops to O(k + n × m): the frequency map of chars is built once, and each character lookup is O(1) instead of an O(k) scan through a list.
Space Complexity:
The space complexity is O(m + k) for the two frequency maps, where k is the length of chars.
Approach 3: Optimized Hash Maps
We can streamline the hash map solution by skipping the per-word frequency count entirely. For each word, we take a copy of the chars frequency map and decrement its counts as characters are consumed; the moment a needed character's count runs out, we stop checking that word.
Here’s the Python code for this optimized approach:
from collections import Counter

def countCharacters(words, chars):
    total_length = 0
    chars_count = Counter(chars)
    for word in words:
        temp_count = chars_count.copy()  # fresh budget for this word
        for c in word:
            if temp_count[c] > 0:   # Counter returns 0 for missing keys
                temp_count[c] -= 1  # consume one occurrence
            else:
                break  # ran out of this character
        else:
            total_length += len(word)
    return total_length
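A quick check that the optimized version produces the expected answers, including an early-exit case where the very first character is missing:

```python
from collections import Counter

def countCharacters(words, chars):
    total_length = 0
    chars_count = Counter(chars)
    for word in words:
        temp_count = chars_count.copy()  # fresh budget for this word
        for c in word:
            if temp_count[c] > 0:   # Counter returns 0 for missing keys
                temp_count[c] -= 1  # consume one occurrence
            else:
                break  # ran out of this character
        else:
            total_length += len(word)
    return total_length

print(countCharacters(["cat", "bt", "hat", "tree"], "atach"))  # 6
print(countCharacters(["xyz"], "atach"))  # 0: fails on the first character
```

Because the inner loop breaks at the first unavailable character, words that are obviously unformable cost almost nothing, unlike Approach 2, which always builds a full Counter for the word first.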
Time Complexity:
The time complexity is O(k + n × m), as in Approach 2, but this version avoids building a Counter for every word and exits as soon as a character runs out, trimming the constant factors further.
Space Complexity:
The space complexity is O(k) for the per-word copy of the chars frequency map.
Conclusion
The “Find Words That Can Be Formed by Characters” problem offers a great opportunity to explore different ways to solve text-manipulation problems in Python. While the naive approach is the simplest to understand and implement, the hash map approaches perform better: replacing O(k) list scans with O(1) frequency-map lookups brings the running time from O(n × m × k) down to O(k + n × m). The optimized variant goes one step further by skipping the per-word Counter and failing fast, showing how small optimizations can add up in practice.