HashSet is a type of data structure that is used for storing unique elements in an unordered manner. A HashSet supports the
contains operations, all of which have an average time complexity of O(1) under the assumption that the number of keys are uniformly distributed.
In this article, we will focus on solving the LeetCode problem 705 – “Design HashSet,” using Python. In this problem, we need to design our own HashSet without using any built-in hash table libraries. We will discuss different solutions ranging from straightforward to optimized approaches, helping you understand the concept thoroughly.
To be more precise, the problem asks us to implement the
void add(key): Inserts the value
keyinto the HashSet.
bool contains(key): Returns whether the value
keyexists in the HashSet or not.
void remove(key): Removes the value
keyin the HashSet. If
keydoes not exist in the HashSet, do nothing.
Method 1: Using an Array
Let’s first start with a simple solution using a list in Python.
class MyHashSet: def __init__(self): self.set =  def add(self, key: int) -> None: if not self.contains(key): self.set.append(key) def remove(self, key: int) -> None: if self.contains(key): self.set.remove(key) def contains(self, key: int) -> bool: return key in self.set
In this simple solution, we use an array (
self.set) to store the elements. While this solution works, it has a high time complexity due to searching in an array, which takes O(n) time. This would be inefficient when dealing with a large number of elements.
Method 2: Using a Boolean Array
To optimize our solution, we can use a boolean array. This solution assumes that the keys are non-negative and fit into the memory.
class MyHashSet: def __init__(self): self.set = [False]*1000001 # Initialize a boolean array def add(self, key: int) -> None: self.set[key] = True def remove(self, key: int) -> None: self.set[key] = False def contains(self, key: int) -> bool: return self.set[key]
In this solution, the index of the array represents the key, and the value (True/False) indicates whether the key is present. The add, remove, and contains operations all have O(1) time complexity, making this a more efficient solution.
Method 3: Using a Bucket and LinkedList
While the previous method is efficient, it assumes that keys are non-negative and have an upper bound. To make our HashSet more flexible, we can implement it using an array of linked lists, often called “buckets.”
In this method, we first map the key to a specific bucket using a hash function. If a collision happens (two different keys map to the same bucket), we use a linked list to store all the keys.
Let’s implement this in Python:
class Node: def __init__(self, value, nextNode=None): self.value = value self.next = nextNode class Bucket: def __init__(self): # A dummy head node self.head = Node(0) def insert(self, newValue): # Check the key is already existed or not if not self.exists(newValue): newNode = Node(newValue, self.head.next) self.head.next = newNode # Else: do not insert the new value since it's already existed def delete(self, value): prev = self.head curr = self.head.next while curr is not None: if curr.value == value: # Delete the current node prev.next = curr.next return prev = curr curr = curr.next def exists(self, value): curr = self.head.next while curr is not None: if curr.value == value: return True curr = curr.next return False class MyHashSet: def __init__(self): self.size = 1997 # size of the bucket list self.buckets = [Bucket() for i in range(self.size)] def add(self, key: int) -> None: self.buckets[key % self.size].insert(key) def remove(self, key: int) -> None: self.buckets[key % self.size].delete(key) def contains(self, key: int) -> bool: return self.buckets[key % self.size].exists(key)
In this approach, we implement a singly linked list for each bucket. We chose the size of the bucket list to be a prime number (1997) to reduce the chance of mapping different keys to the same bucket, which can lead to a performance degradation. This approach provides a balance between the time and space complexity and makes fewer assumptions about the keys.
Designing a HashSet data structure is a great exercise to understand hashing, a crucial concept in computer science. In this article, we have explored different ways to implement a HashSet to solve the LeetCode problem “Design HashSet.”
We started with a simple approach using an array, then improved it using a boolean array, and finally implemented an optimized approach using a bucket of linked lists, suitable for a wider range of applications.