Substrings and string matching are essential topics in computer science and software engineering, often appearing in various forms across different applications, from search engines to bioinformatics. The problem titled “Number of Strings That Appear as Substrings in Word” from Leetcode offers a unique twist on these classical themes. This problem can be solved through multiple approaches, each offering its own advantages and disadvantages. In this extensive guide, we’ll discuss the problem statement, various solutions, their time complexities.
Problem Statement
Given an array of strings patterns
and a string word
, return the number of strings in patterns
that exist as a substring in word
.
A substring is a contiguous sequence of characters within a string.
The objective is to find the count of strings in the given patterns
array that appear as substrings in the word word
.
The Basic Approach: Brute-force Substring Matching
The simplest way to approach this problem is by checking each string in patterns
to see if it appears as a substring in word
. Python’s in
keyword makes this extremely straightforward.
Python Code
def numOfStrings(patterns, word):
count = 0
for pattern in patterns:
if pattern in word:
count += 1
return count
Time Complexity
The time complexity for this approach is O(n×m), where n is the length of the word
, and m is the total length of all strings in the patterns
array. This could be quite inefficient for very long words or a large number of patterns.
Optimized Approach: Using Trie Data Structure
A Trie (pronounced “try”) is a tree-like data structure that is used to store a dynamic set of strings. Tries are particularly useful for implementations that require fast lookups, like this one.
Python Code
class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
def buildTrie(patterns):
root = TrieNode()
for pattern in patterns:
node = root
for ch in pattern:
if ch not in node.children:
node.children[ch] = TrieNode()
node = node.children[ch]
node.is_end = True
return root
def searchTrie(root, word):
count = 0
for i in range(len(word)):
node = root
for j in range(i, len(word)):
if word[j] in node.children:
node = node.children[word[j]]
if node.is_end:
count += 1
node.is_end = False # To avoid counting duplicates
else:
break
return count
def numOfStrings(patterns, word):
root = buildTrie(patterns)
return searchTrie(root, word)
Time Complexity
The time complexity for building the Trie is O(m), where m is the total length of all strings in the patterns
array. The time complexity for the search operation is O(n), where n is the length of the word
. So, the overall time complexity would be O(n+m).
Conclusion
The “Number of Strings That Appear as Substrings in Word” problem is an intriguing problem that allows us to explore multiple techniques, such as brute-force string matching and Trie data structures. Each approach offers its own benefits, either in simplicity or in efficiency.