While we believe that this content benefits our community, we have not yet thoroughly reviewed it. If you have any suggestions for improvements, please let us know by clicking the “report an issue“ button at the bottom of the tutorial.
A Hash Table in C/C++ (Associative array) is a data structure that maps keys to values. This uses a hash function to compute indexes for a key.
Based on the Hash Table index, we can store the value at the appropriate location.
If two different keys get the same index, we need to use other data structures (buckets) to account for these collisions.
The whole benefit of using a Hash Table is due to it’s very fast access time. While there can be a collision, if we choose a very good hash function, this chance is almost zero.
So, on average, the time complexity is a constant O(1) access time. This is called Amortized Time Complexity.
The C++ STL (Standard Template Library) has the std::unordered_map()
data structure which implements all these hash table functions.
However, knowing how to construct a hash table from scratch is a crucial skill, and that is indeed what we aim to show you.
Let us understand more about the implementation details.
Any Hash Table implementation has the following three components:
The first step is to choose a reasonably good hash function that has a low chance of collision. But, since this is for illustration, I will be doing the opposite! Reverse Psychology, eh?
We will be working only with strings (or character arrays in C) in this article.
I’ll be using a very simple hash function, that simply sums the ASCII value of the string. I am using this to show you how we will handle collision cases.
#define CAPACITY 50000 // Size of the Hash Table
unsigned long hash_function(char* str) {
unsigned long i = 0;
for (int j=0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
You can test this for different strings and check if they collide or not. For example, the strings “Hel” and “Cau” will collide, since they have the same ASCII value.
NOTE: We must return a number within the capacity of the hash table. Otherwise, we may access an unbound memory location, leading to an error.
A Hash table is an array of items, which themselves are a {key: value} pair.
Let’s define our item structure first.
typedef struct Ht_item Ht_item;
// Define the Hash Table Item here
struct Ht_item {
char* key;
char* value;
};
Now, the Hash table has an array of pointers which themselves point to Ht_item
, so it is a double-pointer.
Other than that, we will also keep track of the number of elements in the Hash table using count
, and store the size of the table in size
.
typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
// Contains an array of pointers
// to items
Ht_item** items;
int size;
int count;
};
We need functions to create a new Hash table into memory and also create its items.
Let’s create the item first. This is very simple since we only need to allocate memory for its key and value and return a pointer to the item.
Ht_item* create_item(char* key, char* value) {
// Creates a pointer to a new hash table item
Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
item->key = (char*) malloc (strlen(key) + 1);
item->value = (char*) malloc (strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
Now, let’s write the code for creating the table. This allocates memory for the wrapper structure HashTable
, and sets all it’s items to NULL
(Since they aren’t used).
HashTable* create_table(int size) {
// Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
return table;
}
Now, we are almost done with this part. As a C/C++ programmer, it is your responsibility to free up memory that you’ve allocated on the heap using malloc()
, calloc()
.
So let’s write functions which free up a table item, and the whole table too.
void free_item(Ht_item* item) {
// Frees an item
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table) {
// Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free(table->items);
free(table);
}
We’ve now completed our groundwork to building a functional Hash Table. Let’s now start writing our insert()
, search()
and delete()
methods.
We will create a function ht_insert()
that performs this task for us.
This takes in a HashTable
pointer, a key
and a value
as parameters.
void ht_insert(HashTable* table, char* key, char* value);
Now, there are certain steps involved in the insert function.
index
I will explain how we will handle collisions after we create the initial model, so let’s wait a bit for that.
The first step is simple. We directly call create_item(key, value)
.
int index = hash_function(key);
The second and third steps use hash_function(key)
to get the index. If we are inserting the key for the first time, the item must be a NULL
. Otherwise, the exact key : value pair already exists, or it is a collision.
In this case, we’ll define another function handle_collision()
, which as the name suggests, will handle that potential collision for us.
// Create the item
Ht_item* item = create_item(key, value);
// Compute the index
int index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
// Key does not exist.
if (table->count == table->size) {
// Hash Table Full
printf("Insert Error: Hash Table is full\n");
free_item(item);
return;
}
// Insert directly
table->items[index] = item;
table->count++;
}
Let’s consider the first scenario where the key : value pair already exists (i.e the same item was already inserted before). In this case, we must update the item value only to the new one.
if (current_item == NULL) {
....
....
}
else {
// Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
// Scenario 2: Collision
// We will handle case this a bit later
handle_collision(table, item);
return;
}
}
Okay, so our insert function (without collisions) now looks a bit like this:
void handle_collision(HashTable* table, Ht_item* item) {
}
void ht_insert(HashTable* table, char* key, char* value) {
// Create the item
Ht_item* item = create_item(key, value);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
// Key does not exist.
if (table->count == table->size) {
// Hash Table Full
printf("Insert Error: Hash Table is full\n");
return;
}
// Insert directly
table->items[index] = item;
table->count++;
}
else {
// Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
// Scenario 2: Collision
// We will handle case this a bit later
handle_collision(table, item);
return;
}
}
}
If we want to check if the insertion was done correctly, we also need to define a search function, that checks if the key exists or not, and returns the corresponding value if it does.
char* ht_search(HastTable* table, char* key);
The logic is very simple. It simply moves to non-NULL items and compares the key. Otherwise, we will return NULL
char* ht_search(HashTable* table, char* key) {
// Searches the key in the hashtable
// and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
// Ensure that we move to a non NULL item
if (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
}
return NULL;
}
Let’s check if what we’ve written was correct, by writing a main()
driver program.
We’ll add one more function print_table()
that prints the Hash Table, for illustration.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 // Size of the Hash Table
unsigned long hash_function(char* str) {
unsigned long i = 0;
for (int j=0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
// Define the Hash Table Item here
struct Ht_item {
char* key;
char* value;
};
typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
// Contains an array of pointers
// to items
Ht_item** items;
int size;
int count;
};
Ht_item* create_item(char* key, char* value) {
// Creates a pointer to a new hash table item
Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
item->key = (char*) malloc (strlen(key) + 1);
item->value = (char*) malloc (strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
HashTable* create_table(int size) {
// Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
return table;
}
void free_item(Ht_item* item) {
// Frees an item
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table) {
// Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free(table->items);
free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
}
void ht_insert(HashTable* table, char* key, char* value) {
// Create the item
Ht_item* item = create_item(key, value);
// Compute the index
unsigned long index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
// Key does not exist.
if (table->count == table->size) {
// Hash Table Full
printf("Insert Error: Hash Table is full\n");
// Remove the create item
free_item(item);
return;
}
// Insert directly
table->items[index] = item;
table->count++;
}
else {
// Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
// Scenario 2: Collision
// We will handle case this a bit later
handle_collision(table, index, item);
return;
}
}
}
char* ht_search(HashTable* table, char* key) {
// Searches the key in the hashtable
// and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
// Ensure that we move to a non NULL item
if (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
}
return NULL;
}
void print_search(HashTable* table, char* key) {
char* val;
if ((val = ht_search(table, key)) == NULL) {
printf("Key:%s does not exist\n", key);
return;
}
else {
printf("Key:%s, Value:%s\n", key, val);
}
}
void print_table(HashTable* table) {
printf("\nHash Table\n-------------------\n");
for (int i=0; i<table->size; i++) {
if (table->items[i]) {
printf("Index:%d, Key:%s, Value:%s\n", i, table->items[i]->key, table->items[i]->value);
}
}
printf("-------------------\n\n");
}
int main() {
HashTable* ht = create_table(CAPACITY);
ht_insert(ht, "1", "First address");
ht_insert(ht, "2", "Second address");
print_search(ht, "1");
print_search(ht, "2");
print_search(ht, "3");
print_table(ht);
free_table(ht);
return 0;
}
Output
Key:1, Value:First address
Key:2, Value:Second address
Key:3 does not exist
Hash Table
-------------------
Index:49, Key:1, Value:First address
Index:50, Key:2, Value:Second address
-------------------
Great! This seems to work as expected, so now let’s move on to handling collisions.
There are different ways through which a collision can be resolved. We will look at a method called Separate Chaining, which aims to create independent chains for all items that have the same hash index.
We will create these chains using Linked Lists.
Whenever there is a collision, we will add further items that collide on the same index on an Overflow Bucket List. Thus, we will not have to delete any existing records on our hash table.
Due to linked lists having O(n) time complexity for insertion, searching and deletion, in case of a collision, we will have a worst-case access time of O(n) as well. The advantage of this method is that it is a good choice if your hash table has a low capacity.
With that covered, let’s start implementing our good old Linked List!
typedef struct LinkedList LinkedList;
// Define the Linkedlist here
struct LinkedList {
Ht_item* item;
LinkedList* next;
};
LinkedList* allocate_list () {
// Allocates memory for a Linkedlist pointer
LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
return list;
}
LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
// Inserts the item onto the Linked List
if (!list) {
LinkedList* head = allocate_list();
head->item = item;
head->next = NULL;
list = head;
return list;
}
else if (list->next == NULL) {
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
list->next = node;
return list;
}
LinkedList* temp = list;
while (temp->next->next) {
temp = temp->next;
}
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
temp->next = node;
return list;
}
Ht_item* linkedlist_remove(LinkedList* list) {
// Removes the head from the linked list
// and returns the item of the popped element
if (!list)
return NULL;
if (!list->next)
return NULL;
LinkedList* node = list->next;
LinkedList* temp = list;
temp->next = NULL;
list = node;
Ht_item* it = NULL;
memcpy(temp->item, it, sizeof(Ht_item));
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
return it;
}
void free_linkedlist(LinkedList* list) {
LinkedList* temp = list;
while (list) {
temp = list;
list = list->next;
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
}
}
Now, we need to add these “Overflow Bucket” lists to our Hash Table. We want every item to have one such chain, so for the whole table, it is an array of LinkedList pointers.
typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
// Contains an array of pointers
// to items
Ht_item** items;
LinkedList** overflow_buckets;
int size;
int count;
};
Now that we’ve defined our overflow_buckets
, we will add functions to create and delete them. We also need to account for them in our old create_table()
and free_table()
functions.
LinkedList** create_overflow_buckets(HashTable* table) {
// Create the overflow buckets; an array of linkedlists
LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
for (int i=0; i<table->size; i++)
buckets[i] = NULL;
return buckets;
}
void free_overflow_buckets(HashTable* table) {
// Free all the overflow bucket lists
LinkedList** buckets = table->overflow_buckets;
for (int i=0; i<table->size; i++)
free_linkedlist(buckets[i]);
free(buckets);
}
HashTable* create_table(int size) {
// Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
table->overflow_buckets = create_overflow_buckets(table);
return table;
}
void free_table(HashTable* table) {
// Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
// Free the overflow bucket linked linkedlist and it's items
free_overflow_buckets(table);
free(table->items);
free(table);
}
Phew! We still have work to do. Let’s now go to the handle_collision()
function.
There are two scenarios here. If the overflow bucket list for the item does not exist, we need to create one such list and add the item to it.
Otherwise, we can simply insert the item to the list
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
LinkedList* head = table->overflow_buckets[index];
if (head == NULL) {
// We need to create the list
head = allocate_list();
head->item = item;
table->overflow_buckets[index] = head;
return;
}
else {
// Insert to the list
table->overflow_buckets[index] = linkedlist_insert(head, item);
return;
}
}
So we are done with insertion, but now, we need to update our search function as well, since we may need to look at the overflow buckets as well.
char* ht_search(HashTable* table, char* key) {
// Searches the key in the hashtable
// and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
// Ensure that we move to items which are not NULL
while (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
if (head == NULL)
return NULL;
item = head->item;
head = head->next;
}
return NULL;
}
Finally, we have accounted for collisions in insert()
and search()
!
To remind you of the current status of the code, I’ll post the complete program until now.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 // Size of the Hash Table
unsigned long hash_function(char* str) {
unsigned long i = 0;
for (int j=0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
// Define the Hash Table Item here
struct Ht_item {
char* key;
char* value;
};
typedef struct LinkedList LinkedList;
// Define the Linkedlist here
struct LinkedList {
Ht_item* item;
LinkedList* next;
};
typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
// Contains an array of pointers
// to items
Ht_item** items;
LinkedList** overflow_buckets;
int size;
int count;
};
static LinkedList* allocate_list () {
// Allocates memory for a Linkedlist pointer
LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
return list;
}
static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
// Inserts the item onto the Linked List
if (!list) {
LinkedList* head = allocate_list();
head->item = item;
head->next = NULL;
list = head;
return list;
}
else if (list->next == NULL) {
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
list->next = node;
return list;
}
LinkedList* temp = list;
while (temp->next->next) {
temp = temp->next;
}
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
temp->next = node;
return list;
}
static Ht_item* linkedlist_remove(LinkedList* list) {
// Removes the head from the linked list
// and returns the item of the popped element
if (!list)
return NULL;
if (!list->next)
return NULL;
LinkedList* node = list->next;
LinkedList* temp = list;
temp->next = NULL;
list = node;
Ht_item* it = NULL;
memcpy(temp->item, it, sizeof(Ht_item));
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
return it;
}
static void free_linkedlist(LinkedList* list) {
LinkedList* temp = list;
while (list) {
temp = list;
list = list->next;
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
}
}
static LinkedList** create_overflow_buckets(HashTable* table) {
// Create the overflow buckets; an array of linkedlists
LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
for (int i=0; i<table->size; i++)
buckets[i] = NULL;
return buckets;
}
static void free_overflow_buckets(HashTable* table) {
// Free all the overflow bucket lists
LinkedList** buckets = table->overflow_buckets;
for (int i=0; i<table->size; i++)
free_linkedlist(buckets[i]);
free(buckets);
}
Ht_item* create_item(char* key, char* value) {
// Creates a pointer to a new hash table item
Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
item->key = (char*) malloc (strlen(key) + 1);
item->value = (char*) malloc (strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
HashTable* create_table(int size) {
// Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
table->overflow_buckets = create_overflow_buckets(table);
return table;
}
void free_item(Ht_item* item) {
// Frees an item
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table) {
// Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free_overflow_buckets(table);
free(table->items);
free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
LinkedList* head = table->overflow_buckets[index];
if (head == NULL) {
// We need to create the list
head = allocate_list();
head->item = item;
table->overflow_buckets[index] = head;
return;
}
else {
// Insert to the list
table->overflow_buckets[index] = linkedlist_insert(head, item);
return;
}
}
void ht_insert(HashTable* table, char* key, char* value) {
// Create the item
Ht_item* item = create_item(key, value);
// Compute the index
unsigned long index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
// Key does not exist.
if (table->count == table->size) {
// Hash Table Full
printf("Insert Error: Hash Table is full\n");
// Remove the create item
free_item(item);
return;
}
// Insert directly
table->items[index] = item;
table->count++;
}
else {
// Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
// Scenario 2: Collision
handle_collision(table, index, item);
return;
}
}
}
char* ht_search(HashTable* table, char* key) {
// Searches the key in the hashtable
// and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
// Ensure that we move to items which are not NULL
while (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
if (head == NULL)
return NULL;
item = head->item;
head = head->next;
}
return NULL;
}
void print_search(HashTable* table, char* key) {
char* val;
if ((val = ht_search(table, key)) == NULL) {
printf("%s does not exist\n", key);
return;
}
else {
printf("Key:%s, Value:%s\n", key, val);
}
}
void print_table(HashTable* table) {
printf("\n-------------------\n");
for (int i=0; i<table->size; i++) {
if (table->items[i]) {
printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value);
if (table->overflow_buckets[i]) {
printf(" => Overflow Bucket => ");
LinkedList* head = table->overflow_buckets[i];
while (head) {
printf("Key:%s, Value:%s ", head->item->key, head->item->value);
head = head->next;
}
}
printf("\n");
}
}
printf("-------------------\n");
}
int main() {
HashTable* ht = create_table(CAPACITY);
ht_insert(ht, "1", "First address");
ht_insert(ht, "2", "Second address");
ht_insert(ht, "Hel", "Third address");
ht_insert(ht, "Cau", "Fourth address");
print_search(ht, "1");
print_search(ht, "2");
print_search(ht, "3");
print_search(ht, "Hel");
print_search(ht, "Cau");
print_table(ht);
free_table(ht);
return 0;
}
Let’s now finally look at the delete function:
void ht_delete(HashTable* table, char* key);
Again, the method is similar to insertion.
I am not providing you with too much detail since this only involves updating the head items and freeing up memory. My suggestion would be to try and implement this yourself.
I will provide you with a working version for comparison.
void ht_delete(HashTable* table, char* key) {
// Deletes an item from the table
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
if (item == NULL) {
// Does not exist. Return
return;
}
else {
if (head == NULL && strcmp(item->key, key) == 0) {
// No collision chain. Remove the item
// and set table index to NULL
table->items[index] = NULL;
free_item(item);
table->count--;
return;
}
else if (head != NULL) {
// Collision Chain exists
if (strcmp(item->key, key) == 0) {
// Remove this item and set the head of the list
// as the new item
free_item(item);
LinkedList* node = head;
head = head->next;
node->next = NULL;
table->items[index] = create_item(node->item->key, node->item->value);
free_linkedlist(node);
table->overflow_buckets[index] = head;
return;
}
LinkedList* curr = head;
LinkedList* prev = NULL;
while (curr) {
if (strcmp(curr->item->key, key) == 0) {
if (prev == NULL) {
// First element of the chain. Remove the chain
free_linkedlist(head);
table->overflow_buckets[index] = NULL;
return;
}
else {
// This is somewhere in the chain
prev->next = curr->next;
curr->next = NULL;
free_linkedlist(curr);
table->overflow_buckets[index] = head;
return;
}
}
curr = curr->next;
prev = curr;
}
}
}
}
Now, finally, I will show the entire program for the Hash Table.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CAPACITY 50000 // Size of the Hash Table
unsigned long hash_function(char* str) {
unsigned long i = 0;
for (int j=0; str[j]; j++)
i += str[j];
return i % CAPACITY;
}
typedef struct Ht_item Ht_item;
// Define the Hash Table Item here
struct Ht_item {
char* key;
char* value;
};
typedef struct LinkedList LinkedList;
// Define the Linkedlist here
struct LinkedList {
Ht_item* item;
LinkedList* next;
};
typedef struct HashTable HashTable;
// Define the Hash Table here
struct HashTable {
// Contains an array of pointers
// to items
Ht_item** items;
LinkedList** overflow_buckets;
int size;
int count;
};
static LinkedList* allocate_list () {
// Allocates memory for a Linkedlist pointer
LinkedList* list = (LinkedList*) malloc (sizeof(LinkedList));
return list;
}
static LinkedList* linkedlist_insert(LinkedList* list, Ht_item* item) {
// Inserts the item onto the Linked List
if (!list) {
LinkedList* head = allocate_list();
head->item = item;
head->next = NULL;
list = head;
return list;
}
else if (list->next == NULL) {
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
list->next = node;
return list;
}
LinkedList* temp = list;
while (temp->next->next) {
temp = temp->next;
}
LinkedList* node = allocate_list();
node->item = item;
node->next = NULL;
temp->next = node;
return list;
}
static Ht_item* linkedlist_remove(LinkedList* list) {
// Removes the head from the linked list
// and returns the item of the popped element
if (!list)
return NULL;
if (!list->next)
return NULL;
LinkedList* node = list->next;
LinkedList* temp = list;
temp->next = NULL;
list = node;
Ht_item* it = NULL;
memcpy(temp->item, it, sizeof(Ht_item));
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
return it;
}
static void free_linkedlist(LinkedList* list) {
LinkedList* temp = list;
while (list) {
temp = list;
list = list->next;
free(temp->item->key);
free(temp->item->value);
free(temp->item);
free(temp);
}
}
static LinkedList** create_overflow_buckets(HashTable* table) {
// Create the overflow buckets; an array of linkedlists
LinkedList** buckets = (LinkedList**) calloc (table->size, sizeof(LinkedList*));
for (int i=0; i<table->size; i++)
buckets[i] = NULL;
return buckets;
}
static void free_overflow_buckets(HashTable* table) {
// Free all the overflow bucket lists
LinkedList** buckets = table->overflow_buckets;
for (int i=0; i<table->size; i++)
free_linkedlist(buckets[i]);
free(buckets);
}
Ht_item* create_item(char* key, char* value) {
// Creates a pointer to a new hash table item
Ht_item* item = (Ht_item*) malloc (sizeof(Ht_item));
item->key = (char*) malloc (strlen(key) + 1);
item->value = (char*) malloc (strlen(value) + 1);
strcpy(item->key, key);
strcpy(item->value, value);
return item;
}
HashTable* create_table(int size) {
// Creates a new HashTable
HashTable* table = (HashTable*) malloc (sizeof(HashTable));
table->size = size;
table->count = 0;
table->items = (Ht_item**) calloc (table->size, sizeof(Ht_item*));
for (int i=0; i<table->size; i++)
table->items[i] = NULL;
table->overflow_buckets = create_overflow_buckets(table);
return table;
}
void free_item(Ht_item* item) {
// Frees an item
free(item->key);
free(item->value);
free(item);
}
void free_table(HashTable* table) {
// Frees the table
for (int i=0; i<table->size; i++) {
Ht_item* item = table->items[i];
if (item != NULL)
free_item(item);
}
free_overflow_buckets(table);
free(table->items);
free(table);
}
void handle_collision(HashTable* table, unsigned long index, Ht_item* item) {
LinkedList* head = table->overflow_buckets[index];
if (head == NULL) {
// We need to create the list
head = allocate_list();
head->item = item;
table->overflow_buckets[index] = head;
return;
}
else {
// Insert to the list
table->overflow_buckets[index] = linkedlist_insert(head, item);
return;
}
}
void ht_insert(HashTable* table, char* key, char* value) {
// Create the item
Ht_item* item = create_item(key, value);
// Compute the index
unsigned long index = hash_function(key);
Ht_item* current_item = table->items[index];
if (current_item == NULL) {
// Key does not exist.
if (table->count == table->size) {
// Hash Table Full
printf("Insert Error: Hash Table is full\n");
// Remove the create item
free_item(item);
return;
}
// Insert directly
table->items[index] = item;
table->count++;
}
else {
// Scenario 1: We only need to update value
if (strcmp(current_item->key, key) == 0) {
strcpy(table->items[index]->value, value);
return;
}
else {
// Scenario 2: Collision
handle_collision(table, index, item);
return;
}
}
}
char* ht_search(HashTable* table, char* key) {
// Searches the key in the hashtable
// and returns NULL if it doesn't exist
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
// Ensure that we move to items which are not NULL
while (item != NULL) {
if (strcmp(item->key, key) == 0)
return item->value;
if (head == NULL)
return NULL;
item = head->item;
head = head->next;
}
return NULL;
}
void ht_delete(HashTable* table, char* key) {
// Deletes an item from the table
int index = hash_function(key);
Ht_item* item = table->items[index];
LinkedList* head = table->overflow_buckets[index];
if (item == NULL) {
// Does not exist. Return
return;
}
else {
if (head == NULL && strcmp(item->key, key) == 0) {
// No collision chain. Remove the item
// and set table index to NULL
table->items[index] = NULL;
free_item(item);
table->count--;
return;
}
else if (head != NULL) {
// Collision Chain exists
if (strcmp(item->key, key) == 0) {
// Remove this item and set the head of the list
// as the new item
free_item(item);
LinkedList* node = head;
head = head->next;
node->next = NULL;
table->items[index] = create_item(node->item->key, node->item->value);
free_linkedlist(node);
table->overflow_buckets[index] = head;
return;
}
LinkedList* curr = head;
LinkedList* prev = NULL;
while (curr) {
if (strcmp(curr->item->key, key) == 0) {
if (prev == NULL) {
// First element of the chain. Remove the chain
free_linkedlist(head);
table->overflow_buckets[index] = NULL;
return;
}
else {
// This is somewhere in the chain
prev->next = curr->next;
curr->next = NULL;
free_linkedlist(curr);
table->overflow_buckets[index] = head;
return;
}
}
curr = curr->next;
prev = curr;
}
}
}
}
void print_search(HashTable* table, char* key) {
char* val;
if ((val = ht_search(table, key)) == NULL) {
printf("%s does not exist\n", key);
return;
}
else {
printf("Key:%s, Value:%s\n", key, val);
}
}
void print_table(HashTable* table) {
printf("\n-------------------\n");
for (int i=0; i<table->size; i++) {
if (table->items[i]) {
printf("Index:%d, Key:%s, Value:%s", i, table->items[i]->key, table->items[i]->value);
if (table->overflow_buckets[i]) {
printf(" => Overflow Bucket => ");
LinkedList* head = table->overflow_buckets[i];
while (head) {
printf("Key:%s, Value:%s ", head->item->key, head->item->value);
head = head->next;
}
}
printf("\n");
}
}
printf("-------------------\n");
}
int main() {
HashTable* ht = create_table(CAPACITY);
ht_insert(ht, "1", "First address");
ht_insert(ht, "2", "Second address");
ht_insert(ht, "Hel", "Third address");
ht_insert(ht, "Cau", "Fourth address");
print_search(ht, "1");
print_search(ht, "2");
print_search(ht, "3");
print_search(ht, "Hel");
print_search(ht, "Cau"); // Collision!
print_table(ht);
ht_delete(ht, "1");
ht_delete(ht, "Cau");
print_table(ht);
free_table(ht);
return 0;
}
Output
Key:1, Value:First address
Key:2, Value:Second address
3 does not exist
Key:Hel, Value:Third address
Key:Cau, Value:Fourth address
-------------------
Index:49, Key:1, Value:First address
Index:50, Key:2, Value:Second address
Index:281, Key:Hel, Value:Third address => Overflow Bucket => Key:Cau, Value:Fourth address
-------------------
-------------------
Index:50, Key:2, Value:Second address
Index:281, Key:Hel, Value:Third address
-------------------
Hopefully, you’ve understood how a Hash Table can be implemented from scratch in C/C++ and possibly implemented one yourself!
I would also suggest you try other collision handling algorithms and different hash functions on this Hash Table and test their performance.
You can get the code through a Github Gist that I’ve uploaded. Feel free to ask any questions or provide suggestions in the comment section below!
Join our DigitalOcean community of over a million developers for free! Get help and share knowledge in our Questions & Answers section, find tutorials and tools that will help you grow as a developer and scale your project or business, and subscribe to topics of interest.
Sign up
Create table as a redundant loop. You allocated memory with malloc which makes all inner cells NULL.
- Moshe
Thank you for putting this article together. Conceptually I can follow and appreciate your work. I think their is some issue with the ht_insert and linkedlist_insert functions. At the time of writing I am unable to pin point the issue. Maybe you fixed it. If so I suggest revising the article. Regardless thank you for sharing. Regards Mahendra Gunawardena
- Mahendra Gunawardena
Ht_item* it = NULL; memcpy(temp->item, it, sizeof(Ht_item)); Doesn’t your compiler warn about this? you are copying NULL to temp->item. shouldn’t this be: memcpy( it, temp->item, sizeof(Ht_item)); Because in the current implementation you are freeing NULL this should throw an error and can lead to an memory leak.
- Rick Nijhuis
Could you post how to alter the same code for phone book application that compiles in dev c++
- Anon
Since the linked list is not ordered, you can simplify the linkedlist_insert function by inserting at the head instead of the tail. This simplifies the function and makes insertion time O(1).
- Justin Bode
In the function ht_insert(), we should not call strcpy directly for the value, first we need to free the existing value and allocate new memory for the new value or we must call realloc. so below code will lead overflow. // Scenario 1: We only need to update value if (strcmp(current_item->key, key) == 0) { strcpy(table->items[index]->value, value); return; }
- Dileep
HI, i would like to add another thing I found. In the insert element into table function, I don’t think we need to check if the table is full, since a table using separate chaining can never be full. Once there is a collision, we add it to the chain, as long as we have enough memory. I feel like we need to check the memory allocation in this instead of the table index. thx!
- another
This is the best tutorial for implementing hash table! Thank you so much ! Have a littble doubl in the delete table function. if (strcmp(curr->item->key, key) == 0) { if (prev == NULL) { // First element of the chain. Remove the chain free_linkedlist(head); table->overflow_buckets[index] = NULL; return; } Maybe if the matched element found on the first element of the chain, table->overflow_buckets[index] = NULL; We shouldn’t set it to null, since the chain has another nodes as well. we just need to take the heed node off the chain, then set the chain back to table ? thank you
- unknow
else if (head != NULL) { // Collision Chain exists if (strcmp(item->key, key) == 0) { // Remove this item and set the head of the list // as the new item free_item(item); LinkedList* node = head; head = head->next; node->next = NULL; table->items[index] = create_item(node->item->key, node->item->value); free_linkedlist(node); table->overflow_buckets[index] = head; return; } LinkedList* curr = head; LinkedList* prev = NULL; while (curr) { if (strcmp(curr->item->key, key) == 0) { if (prev == NULL) { // First element of the chain. Remove the chain free_linkedlist(head); table->overflow_buckets[index] = NULL; return; } else { // This is somewhere in the chain prev->next = curr->next; curr->next = NULL; free_linkedlist(curr); table->overflow_buckets[index] = head; return; } } curr = curr->next; prev = curr; } } } } HI, I am quite not sure I understand the logic part of deleting element from table here. If I understand correctly, there are 2 possibilities if the collision chain exist. The element is on the table index, or it is on the chain, if it is on the table, we will delete it on the table, then move the first node of chain to the table, and the rest of the original chain will become the new chain attached to it. what confuses me is the second case; if it is on the chain, I saw your code check if it is on the head node or somewhere else in the chain. Why you remove the whole chain if there is one element matched on the chain. Aren’t we supposed to just detele the matched node and put the rest of chain back to it ? We are deleting the matched string, not the matched hashed function index. Thank you!
- unknown
if i want the user to input the key and value, how to do it? can you please help me?
- irene