In this article, we will dissect the Unique Email Addresses problem, discuss the thought process behind solving it, and walk through the Python code to implement the solution.
Problem Statement
Every email consists of a local name and a domain name, separated by the “@” sign.
For example, in alice@leetcode.com
, alice
is the local name, and leetcode.com
is the domain name.
An email is considered unique if:
- All the characters except the dots (“.”) in the local name are considered, and
- All the characters in the domain name (from the “@” to the end) are considered.
However, there’s a special rule for the local name:
- If the local name has a plus (“+”) sign, ignore everything from the first plus sign onwards.
Given a list of emails, return the number of unique emails.
Problem Insights
- Dots in the local name don’t matter.
- Characters after the ‘+’ in the local name don’t matter.
- The domain name is always considered in its entirety.
Algorithm to Solve the Problem
- Split the email address into the local and domain parts.
- In the local part:
- Remove everything after the “+” sign.
- Remove all dots.
- Concatenate the processed local name and the domain name.
- Store the resulting email in a set (because sets store only unique elements).
- Return the size of the set.
Python Code Solution
Here’s how you can implement the solution in Python:
def numUniqueEmails(emails):
unique_emails = set() # To store unique emails
for email in emails:
local, domain = email.split('@') # Splitting into local and domain parts
# If there's a '+' in the local part, remove it and everything after it
if '+' in local:
local = local[:local.index('+')]
# Remove dots from the local part
local = local.replace('.', '')
# Add the cleaned-up email to the set
unique_emails.add(local + '@' + domain)
return len(unique_emails)
Complexity Analysis
- Time Complexity: The function iterates through each email once, and string operations like split, index, and replace have an average time complexity of O(N) for strings of length N. Hence, the overall time complexity is O(N), where N is the total number of characters in all emails combined.
- Space Complexity: The space used is proportional to the number of unique emails, which, in the worst case, is the same as the total number of emails given. Therefore, the space complexity is O(M), where M is the number of emails.
Testing the Solution
To ensure the function works:
emails = ["test.email+spam@leetcode.com","test.e.mail+bob@leetcode.com","testemail+david@leet.abc.com"]
print(numUniqueEmails(emails)) # Expected output: 2
Conclusion
The “Unique Email Addresses” problem on Leetcode is a wonderful demonstration of how string manipulation skills can be used to solve practical problems. While the problem seems complex at first glance, breaking it down step-by-step reveals its simplicity.