A HashMap, or a Hash Table, is a data structure that implements an associative array abstract data type, a structure that can map keys to values. A HashMap uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
In this article, we’ll delve deep into the LeetCode problem 706, “Design HashMap,” which requires us to design a HashMap without using any built-in hash table libraries. To solve this problem, we will explore various approaches, from simple to optimized ones.
Problem Statement
We are required to design a HashMap that can perform the following operations:
put(key, value)
: Inserts a (key, value) pair into the HashMap. If the key already exists in the HashMap, update the corresponding value.get(key)
: Returns the value to which the specified key is mapped, or -1 if this map contains no mapping for the key.remove(key)
: Removes the mapping for the value key if this map contains the mapping for the key.
Let’s solve this problem in Python step-by-step.
Method 1: Using an Array
The simplest way to implement a HashMap is by using a list. We will represent each entry with another list of two elements, where the first element is the key and the second element is the value.
class MyHashMap:
def __init__(self):
self.size = 1000000
self.buckets = [[] for _ in range(self.size)]
def put(self, key, value):
for i, kv in enumerate(self.buckets[key % self.size]):
if kv[0] == key:
self.buckets[key % self.size][i] = (key, value)
return
self.buckets[key % self.size].append((key, value))
def get(self, key):
for k, v in self.buckets[key % self.size]:
if k == key:
return v
return -1
def remove(self, key):
for i, kv in enumerate(self.buckets[key % self.size]):
if kv[0] == key:
del self.buckets[key % self.size][i]
return
This solution is quite simple but inefficient because the get
, put
, and remove
operations have a time complexity of O(n) due to the linear search in a bucket.
Method 2: Using a Bucket and LinkedList
In order to optimize our solution, we can use an array of linked lists (known as buckets). The idea here is to assign a bucket to each key using a hash function and then use a linked list to handle any collisions (when different keys get hashed to the same bucket).
class ListNode:
def __init__(self, key, value):
self.pair = [key, value]
self.next = None
class MyHashMap:
def __init__(self):
self.size = 1000
self.buckets = [None]*self.size
def put(self, key, value):
index = key % self.size
if self.buckets[index] == None:
self.buckets[index] = ListNode(key, value)
else:
cur_node = self.buckets[index]
while True:
if cur_node.pair[0] == key:
cur_node.pair[1] = value
return
if cur_node.next == None:
break
cur_node = cur_node.next
cur_node.next = ListNode(key, value)
def get(self, key):
index = key % self.size
cur_node = self.buckets[index]
while cur_node:
if cur_node.pair[0] == key:
return cur_node.pair[1]
else:
cur_node = cur_node.next
return -1
def remove(self, key):
index = key % self.size
cur_node = prev_node = self.buckets[index]
if not cur_node:
return
if cur_node.pair[0] == key:
self.buckets[index] = cur_node.next
else:
cur_node = cur_node.next
while cur_node:
if cur_node.pair[0] == key:
prev_node.next = cur_node.next
break
else:
cur_node, prev_node = cur_node.next, prev_node.next
In this solution, we’ve used a singly linked list for each bucket, where the size of the bucket list is chosen to be a prime number (1000) to reduce the chance of mapping different keys to the same bucket. This approach provides a balance between the time and space complexity.
Method 3: Using a Boolean Array
This solution assumes that the keys are non-negative and fit into the memory. We can simply use an array where the index of the array represents the key and the value at that index represents the value corresponding to that key.
class MyHashMap:
def __init__(self):
self.size = 1000000
self.map = [-1] * self.size
def put(self, key, value):
self.map[key] = value
def get(self, key):
return self.map[key]
def remove(self, key):
self.map[key] = -1
In this solution, all operations get
, put
, and remove
have a time complexity of O(1), which makes this approach more efficient than the previous one. However, this method has a limitation that it can only handle non-negative keys and the maximum size is fixed.
Conclusion
In this article, we walked you through different approaches to design a HashMap from scratch in Python, based on the LeetCode problem “Design HashMap.” We began with a simple method using a list, moved on to a slightly complex approach using a bucket and LinkedList, and then discussed an optimized approach using a boolean array.