Skip to content
- Tutorials
- Python Tutorial
- Taking Input in Python
- Python Operators
- Python Data Types
- Python Loops and Control Flow
- Python Functions
- Python OOPS Concept
- Python Data Structures
- Python Exception Handling
- Python File Handling
- Python Exercises
- Java
- Java Programming Language
- Java Collections
- Java 8 Tutorial
- Java Programs
- Java Interview Questions
- Java Exercises
- Java Quiz
- Java Projects
- Advance Java
- Programming Languages
- System Design
- Interview Corner
- Computer Science Subjects
- Operating System
- DBMS
-
Last Updated : 07 Aug, 2024
Summarize
Comments
Improve
Hashing is a technique used in data structures that efficiently stores and retrieves data in a way that allows for quick access. It involves mapping data to a specific index in a hash table using a hash function that enables fast retrieval of information based on its key. This method is commonly used in databases, caching systems, and various programming applications to optimize search and retrieval operations. The great thing about hashing is, we can achieve all three operations (search, insert and delete) in O(1) time on average.
Hashing in Data Structure
Table of Content
- What is Hashing in Data Structure?
- Hash Table in Data Structure
- Hash Function
- What is a Hash Collision?
- Collision Resolution Techniques
- Applications of Hashing
- Basics of Hashing in Data Structure
- Easy Problem on Hashing
- Medium Problem on Hashing
- Hard Problem on Hashing
What is Hashing in Data Structure?
Hashing is a technique used in data structures to store and retrieve data efficiently. It involves using a hash function to map data items to a fixed-size array which is called a hash table. Below are basic terminologies in hashing.
- Hash Function: You provide your data items into the hash function.
- Hash Code: The hash function crunches the data and give a unique hash code. This hash code is typically integer value that can be used an index.
- Hash Table: The hash code then points you to a specific location within the hash table.
Hash Table in Data Structure
A hash table is also referred as a hash map (key value pairs) or a hash set (only keys). It uses a hash function to map keys to a fixed-size array, called a hash table. This allows in faster search, insertion, and deletion operations.
Hash Function
The hash function is a function that takes a key and returns an index into the hash table. The goal of a hash function is to distribute keys evenly across the hash table, minimizing collisions (when two keys map to the same index).
Common hash functions include:
- Division Method: Key % Hash Table Size
- Multiplication Method: (Key * Constant) % Hash Table Size
- Universal Hashing: A family of hash functions designed to minimize collisions
What is a Hash Collision?
A hash collision occurs when two different keys map to the same index in a hash table. This can happen even with a good hash function, especially if the hash table is full or the keys are similar.
See AlsoHash TablesHash Tables Interview Questions & Tips for Senior EngineersHow can you compare arrays and hash tables?Why choose a hash table for your algorithm?Causes of Hash Collisions:
- Poor Hash Function: A hash function that does not distribute keys evenly across the hash table can lead to more collisions.
- High Load Factor: A high load factor (ratio of keys to hash table size) increases the probability of collisions.
- Similar Keys: Keys that are similar in value or structure are more likely to collide.
Collision Resolution Techniques
There are two types of collision resolution techniques:
- Open Addressing:
- Linear Probing: Search for an empty slot sequentially
- Quadratic Probing: Search for an empty slot using a quadratic function
- Closed Addressing:
- Chaining: Store colliding keys in a linked list or binary search tree at each index
- Cuckoo Hashing: Use multiple hash functions to distribute keys
Applications of Hashing
Hash tables are used wherever we have a combinations of search, insert and/or delete operations.
- Dictionaries : To implement a dictionary so that we can quickly search a word
- Databases: Hashing is used in database indexing. There are two popular ways to implement indexing, search trees (B or B+ Tree) and hashing.
- Cryptography : When we create a password on a website, they typically store it after applying a hash function rather than plain text
- Caching: Storing frequently accessed data for faster retrieval. For example browser caches, we can use URL as keys and find the local storage of the URL.
- Symbol Tables: Mapping identifiers to their values in programming languages
- Network Routing: Determining the best path for data packets
- Associative Arrays: Associative arrays are nothing but hash tables only. Commonly SQL library functions allow you retrieve data as associative arrays so that the retrieved data in RAM can be quickly searched for a key.
Please refer Applications of Hashing for more detailed list.
Basics of Hashing in Data Structure
- Introduction to Hashing – Data Structure and Algorithm Tutorials
- What is Hashing?
- Index Mapping (or Trivial Hashing)
- Separate Chaining for Collision Handling
- Open Addressing for Collision Handling
- Double Hashing
- Load Factor and Rehashing
Easy Problem on Hashing
- Find whether an array is subset of another array
- Union and Intersection of two linked lists
- Given an array A[] and a number x, check for pair in A[] with sum as x
- Maximum distance between two occurrences of same element in array
- Count maximum points on same line
- Most frequent element in an array
- Find the only repetitive element between 1 to n-1
- How to check if two given sets are disjoint?
- Non-overlapping sum of two sets
- Check if two arrays are equal or not
- Find missing elements of a range
- Minimum number of subsets with distinct elements
- Remove minimum number of elements such that no common element exist in both array
- Find pairs with given sum such that elements of pair are in different rows
- Count pairs with given sum
- Count quadruples from four sorted arrays whose sum is equal to a given value x
- Sort elements by frequency
- Find all pairs (a, b) in an array such that a % b = k
- Group words with same set of characters
- k-th distinct (or non-repeating) element in an array.
Medium Problem on Hashing
- Find Itinerary from a given list of tickets
- Find number of Employees Under every Employee
- Longest subarray with sum divisible by k
- Find the largest subarray with 0 sum
- Longest Increasing consecutive subsequence
- Count distinct elements in every window of size k
- Design a data structure that supports insert, delete, search and getRandom in constant time
- Find subarray with given sum | Set 2 (Handles Negative Numbers)
- Implementing our Own Hash Table with Separate Chaining in Java
- Implementing own Hash Table with Open Addressing Linear Probing in C++
- Minimum insertions to form a palindrome with permutations allowed
- Maximum possible difference of two subsets of an array
- Sorting using trivial hash function
- Smallest subarray with k distinct numbers
Hard Problem on Hashing
- Clone a Binary Tree with Random Pointers
- Largest subarray with equal number of 0s and 1s
- All unique triplets that sum up to a given value
- Palindrome Substring Queries
- Range Queries for Frequencies of array elements
- Elements to be added so that all elements of a range are present in array
- Cuckoo Hashing – Worst case O(1) Lookup!
- Count subarrays having total distinct elements same as original array
- Maximum array from two given arrays keeping order same
- Find Sum of all unique sub-array sum for a given array.
- Recaman’s sequence
- Length of longest strict bitonic subsequence
- Find All Duplicate Subtrees
- Find if there is a rectangle in binary matrix with corners as 1
Quick Links :
- ‘Practice Problems’ on Hashing
- Top 20 Hashing Technique based Interview Questions
- ‘Quizzes’ on Hashing
- ‘Videos’ on Hashing
Recommended:
Please Login to comment...
Similar Reads
Introduction to Universal Hashing in Data Structure
Universal hashing is a technique used in computer science and information theory for designing hash functions. It is a family of hash functions that can be efficiently computed by using a randomly selected hash function from a set of hash functions. The goal of universal hashing is to minimize the chance of collisions between distinct keys, which c
5 min read
Static Data Structure vs Dynamic Data Structure
Data structure is a way of storing and organizing data efficiently such that the required operations on them can be performed be efficient with respect to time as well as memory. Simply, Data Structure are used to reduce complexity (mostly the time complexity) of the code. Data structures can be two types : 1. Static Data Structure 2. Dynamic Data
4 min read
Encryption vs Encoding vs Hashing
Pre-Requisite: Encryption, Encoding, Hashing. Encryption, Encoding, and Hahsing are similar kinds of things and have little difference between them. They all are used to change the format of the data or data transformation for different purposes. We will discuss them separately. Let us first discuss the definition of all these three processes and t
4 min read
Practice Problems on Hashing
In this article, we will discuss the types of questions based on hashing. Before understanding this, you should have idea about hashing, hash function, open addressing and chaining techniques (see: Introduction, Separate chaining, Open addressing). These are some key points in hashing: The purpose of hashing is to achieve search, insert and delete
8 min read
Coalesced hashing
Coalesced hashing is a collision avoidance technique when there is a fixed sized data. It is a combination of both Separate chaining and Open addressing. It uses the concept of Open Addressing(linear probing) to find first empty place for colliding element from the bottom of the hash table and the concept of Separate Chaining to link the colliding
5 min read
Find the smallest positive number missing from an unsorted array : Hashing Implementation
Given an unsorted array with both positive and negative elements including 0. The task is to find the smallest positive number missing from the array in O(N) time. Examples: Input: arr[] = {-5, 2, 0, -1, -10, 15} Output: 1 Input: arr[] = {0, 1, 2, 3, 4, 5} Output: 6 Input: arr[] = {1, 1, 1, 0, -1, -2} Output: 2 We can use hashing to solve this prob
5 min read
Rearrange characters in a string such that no two adjacent are same using hashing
Given a string str with repeated characters, the task is to rearrange the characters in a string such that no two adjacent characters are the same. If it is possible then print Yes else print No. Examples: Input: str = "geeksforgeeks" Output: Yes "egeksforegeks" is one such arrangement. Input: str = "bbbbb" Output: No Approach: The idea is to store
5 min read
Extendible Hashing (Dynamic approach to DBMS)
Extendible Hashing is a dynamic hashing method wherein directories, and buckets are used to hash data. It is an aggressively flexible method in which the hash function also experiences dynamic changes. Main features of Extendible Hashing: The main features in this hashing technique are: Directories: The directories store addresses of the buckets in
7 min read
Area of the largest square that can be formed from the given length sticks using Hashing
Given an array arr[] of N integers representing the heights of the sticks. The task is to find the area of the largest square that can be formed using these sticks and the count of such squares. Note that a single side of the square can only use a single stick.Examples: Input: arr[] = {5, 3, 2, 3, 6, 3, 3} Output: Area = 9 Count = 1 Side of the squ
6 min read
Full domain Hashing with variable Hash size in Python
A cryptographic hash function is a special class of hash function that has certain properties which make it suitable for use in cryptography. It is a mathematical algorithm that maps data of arbitrary size to a bit string of a fixed size (a hash function) which is designed to also be a one-way function, that is, a function which is infeasible to in
5 min read
Folding Method in Hashing
Folding Method in Hashing: It breaks up a key value into precise segments that are added to form a hash value, and look at another technique is to apply a multiplicative hash function to each segment individually before adding. Some folding methods go one step further and reverse every other piece before the addition. This folding method is indepen
2 min read
What is the difference between Hashing and Hash Tables?
What is Hashing? Hashing refers to the process of generating a fixed-size output from an input of variable size using the mathematical formulas known as hash functions. This technique determines an index or location for the storage of an item in a data structure. It might not be strictly related to key-value pairs only if you are manipulating the d
2 min read
Hashing meaning in DSA
Hashing is defined as a data distribution technique that transforms given key into a different value using hash function for faster access to data. Characteristics of Hashing:Hashing maps the data object to exactly one memory bucket.It allows uniform distribution of keys across the memory.Uses different functions to perform hashing such as mid squa
2 min read
Address Calculation Sort using Hashing
In this sorting algorithm, Hash Function f is used with the property of Order Preserving Function which states that if [Tex]x <= y, f(x) <= f(y) [/Tex]. Hash Function: f(x) = floor( (x/maximum) * SIZE )where maximum => maximum value in the array, SIZE => size of the address table (10 in our case), floor => floor functionThis algorith
14 min read
Longest Palindromic Substring using hashing in O(nlogn)
Given a string S, The task is to find the longest substring which is a palindrome using hashing in O(N log N) time. Input: S: ”forgeeksskeegfor”, Output: “geeksskeeg” Input: S: ”Geeks”, Output: “ee” Hashing to Solve the Problem:The hashing approach to solving the longest palindromic substring problem uses a hash table to store the characters of the
11 min read
Convert an Array to reduced form using Hashing
Given an array with N distinct elements, convert the given array to a form where all elements are in the range from 0 to N-1. The order of elements is the same, i.e., 0 is placed in the place of the smallest element, 1 is placed for the second smallest element, ... N-1 is placed for the largest element. Examples: Input: arr[] = {10, 40, 20}Output:
15+ min read
Minimax Algorithm in Game Theory | Set 5 (Zobrist Hashing)
Previous posts on this topic : Minimax Algorithm in Game Theory, Evaluation Function in Game Theory, Tic-Tac-Toe AI – Finding optimal move, Alpha-Beta Pruning.Zobrist Hashing is a hashing function that is widely used in 2 player board games. It is the most common hashing function used in transposition table. Transposition tables basically store the
12 min read
Index Mapping (or Trivial Hashing) with negatives allowed
Index Mapping (also known as Trivial Hashing) is a simple form of hashing where the data is directly mapped to an index in a hash table. The hash function used in this method is typically the identity function, which maps the input data to itself. In this case, the key of the data is used as the index in the hash table, and the value is stored at t
7 min read
Implement Phone Directory using Hashing
Hashing is a technique that uses fewer key comparisons and searches the element in O(n) time in the worst case and in O(1) time in the average case. The task is to implement all functions of phone directory: create_record display_record delete_record search_record update_record Following data will be taken from the client: ID, Name, Telephone numbe
15+ min read
What is Hashing?
Hashing refers to the process of generating a fixed-size output from an input of variable size using the mathematical formulas known as hash functions. This technique determines an index or location for the storage of an item in a data structure. Need for Hash data structureThe amount of data on the internet is growing exponentially every day, maki
3 min read
Mid-Square hashing
Mid-Square hashing is a hashing technique in which unique keys are generated. In this technique, a seed value is taken and it is squared. Then, some digits from the middle are extracted. These extracted digits form a number which is taken as the new seed. This technique can generate keys with high randomness if a big enough seed value is taken. How
7 min read
Double Hashing
Double hashing is a collision resolution technique used in hash tables. It works by using two hash functions to compute two different hash values for a given key. The first hash function is used to compute the initial hash value, and the second hash function is used to compute the step size for the probing sequence. Double hashing has the ability t
15+ min read
Implement Secure Hashing Algorithm - 512 ( SHA-512 ) as Functional Programming Paradigm
Given a string S of length N, the task is to find the SHA-512 Hash Value of the given string S. Examples: Input: S = "GeeksforGeeks"Output: acc10c4e0b38617f59e88e49215e2e894afaee5ec948c2af6f44039f03c9fe47a9210e01d5cd926c142bdc9179c2ad30f927a8faf69421ff60a5eaddcf8cb9c Input: S = "hello world"Output:309ecc489c12d6eb4cc40f50c902f2b4d0ed77ee511a7c7a9bc
15 min read
Crossword Puzzle Of The Week #35 (For Hashing)
In this issue of Crossword Puzzle of the Week, we will dive into the topic of Hashing data structure. The solution to the crossword puzzle is provided at the end. HINTS: Down: 1. The idea behind separate _______ is to implement the array as a linked list called a chain.3. The situation where the newly inserted key maps to an already occupied, and i
2 min read
Cuckoo Hashing in Python
Cuckoo Hashing derived its name from the cuckoo bird, which lays its eggs in the nests of other birds, replacing their eggs with its own. Cuckoo Hashing works in a similar manner which involves moving the values to different location whenever there is a collision in the hash table. In this article, we will learn how to implement Cuckoo Hashing in P
5 min read
Double Hashing in Python
Double hashing is a collision resolution technique used in hash tables. It works by using two hash functions to compute two different hash values for a given key. The first hash function is used to compute the initial hash value, and the second hash function is used to compute the step size for the probing sequence. In this article, we'll explore w
4 min read
Top 20 Hashing Technique based Interview Questions
Find whether an array is subset of another arrayUnion and Intersection of two Linked ListsFind a pair with given sumFind Itinerary from a given list of ticketsFind four elements a, b, c and d in an array such that a+b = c+dFind the largest subarray with 0 sumCount distinct elements in every window of size kFind smallest range containing elements fr
1 min read
Hashing in Competitive Programming
Hashing is a fundamental technique in competitive programming that is used to efficiently manipulate and process large amounts of data. Data Structures like Hash Maps and Hash Sets use hashing techniques to provide faster insertion, deletion and retrieval of values. Table of Content What is Hashing?Why use Hashing in Competitive Programming?Advanta
15+ min read
Quadratic Probing in Hashing
Hashing is an improvement technique over the Direct Access Table. The idea is to use a hash function that converts a given phone number or any other key to a smaller number and uses the small number as the index in a table called a hash table. Hash Function: A function that converts a given big number to a small practical integer value. The mapped
15 min read
Separate Chaining Collision Handling Technique in Hashing
Separate Chaining is a collision handling technique. Separate chaining is one of the most popular and commonly used techniques in order to handle collisions. In this article, we will discuss about what is Separate Chain collision handling technique, its advantages, disadvantages, etc. What is Collision? Since a hash function gets us a small number
4 min read
We use cookies to ensure you have the best browsing experience on our website. By using our site, you acknowledge that you have read and understood our Cookie Policy & Privacy Policy
- Python Tutorial
'); $('.spinner-loading-overlay').show(); jQuery.ajax({ url: writeApiUrl + 'create-improvement-post/?v=1', type: "POST", contentType: 'application/json; charset=utf-8', dataType: 'json', xhrFields: { withCredentials: true }, data: JSON.stringify({ gfg_id: post_id, check: true }), success:function(result) { jQuery.ajax({ url: writeApiUrl + 'suggestions/auth/' + `${post_id}/`, type: "GET", dataType: 'json', xhrFields: { withCredentials: true }, success: function (result) { $('.spinner-loading-overlay:eq(0)').remove(); var commentArray = result; if(commentArray === null || commentArray.length === 0) { // when no reason is availaible then user will redirected directly make the improvment. // call to api create-improvement-post $('body').append('
'); $('.spinner-loading-overlay').show(); jQuery.ajax({ url: writeApiUrl + 'create-improvement-post/?v=1', type: "POST", contentType: 'application/json; charset=utf-8', dataType: 'json', xhrFields: { withCredentials: true }, data: JSON.stringify({ gfg_id: post_id, }), success:function(result) { $('.spinner-loading-overlay:eq(0)').remove(); $('.improve-modal--overlay').hide(); $('.unlocked-status--improve-modal-content').css("display","none"); $('.create-improvement-redirection-to-write').attr('href',writeUrl + 'improve-post/' + `${result.id}` + '/', '_blank'); $('.create-improvement-redirection-to-write')[0].click(); }, error:function(e) { $('.spinner-loading-overlay:eq(0)').remove(); var result = e.responseJSON; if(result.detail.non_field_errors.length){ $('.improve-modal--improve-content .improve-modal--improve-content-modified').text(`${result.detail.non_field_errors}.`); jQuery('.improve-modal--overlay').show(); jQuery('.improve-modal--improvement').show(); $('.locked-status--impove-modal').css("display","block"); $('.unlocked-status--improve-modal-content').css("display","none"); $('.improve-modal--improvement').attr("status","locked"); $('.improvement-reason-modal').hide(); } }, }); return; } var improvement_reason_html = ""; for(var comment of commentArray) { // loop creating improvement reason list markup var comment_id = comment['id']; var comment_text = comment['suggestion']; improvement_reason_html += `
${comment_text}
`; } $('.improvement-reasons_wrapper').html(improvement_reason_html); $('.improvement-bottom-btn').html("Create Improvement"); $('.improve-modal--improvement').hide(); $('.improvement-reason-modal').show(); }, error: function(e){ $('.spinner-loading-overlay:eq(0)').remove(); // stop loader when ajax failed; }, }); }, error:function(e) { $('.spinner-loading-overlay:eq(0)').remove(); var result = e.responseJSON; if(result.detail.non_field_errors.length){ $('.improve-modal--improve-content .improve-modal--improve-content-modified').text(`${result.detail.non_field_errors}.`); jQuery('.improve-modal--overlay').show(); jQuery('.improve-modal--improvement').show(); $('.locked-status--impove-modal').css("display","block"); $('.unlocked-status--improve-modal-content').css("display","none"); $('.improve-modal--improvement').attr("status","locked"); $('.improvement-reason-modal').hide(); } }, }); } else { if(loginData && !loginData.isLoggedIn) { $('.improve-modal--overlay').hide(); if ($('.header-main__wrapper').find('.header-main__signup.login-modal-btn').length) { $('.header-main__wrapper').find('.header-main__signup.login-modal-btn').click(); } return; } } }); $('.left-arrow-icon_wrapper').on('click',function(){ if($('.improve-modal--suggestion').is(":visible")) $('.improve-modal--suggestion').hide(); else{ $('.improvement-reason-modal').hide(); } $('.improve-modal--improvement').show(); }); function loadScript(src, callback) { var script = document.createElement('script'); script.src = src; script.onload = callback; document.head.appendChild(script); } function suggestionCall() { var suggest_val = $.trim($("#suggestion-section-textarea").val()); var array_String= suggest_val.split(" ") var gCaptchaToken = $("#g-recaptcha-response-suggestion-form").val(); var error_msg = false; if(suggest_val != "" && array_String.length >=4){ if(suggest_val.length <= 2000){ var payload = { "gfg_post_id" : `${post_id}`, "suggestion" : `
${suggest_val}
`, } if(!loginData || !loginData.isLoggedIn) // User is not logged in payload["g-recaptcha-token"] = gCaptchaToken jQuery.ajax({ type:'post', url: "https://apiwrite.geeksforgeeks.org/suggestions/auth/create/", xhrFields: { withCredentials: true }, crossDomain: true, contentType:'application/json', data: JSON.stringify(payload), success:function(data) { jQuery('.spinner-loading-overlay:eq(0)').remove(); jQuery('#suggestion-section-textarea').val(""); jQuery('.suggest-bottom-btn').css("display","none"); // Update the modal content const modalSection = document.querySelector('.suggestion-modal-section'); modalSection.innerHTML = `
Thank You!
Your suggestions are valuable to us.
You can now also contribute to the GeeksforGeeks community by creating improvement and help your fellow geeks.
`; }, error:function(data) { jQuery('.spinner-loading-overlay:eq(0)').remove(); jQuery('#suggestion-modal-alert').html("Something went wrong."); jQuery('#suggestion-modal-alert').show(); error_msg = true; } }); } else{ jQuery('.spinner-loading-overlay:eq(0)').remove(); jQuery('#suggestion-modal-alert').html("Minimum 5 Words and Maximum Character limit is 2000."); jQuery('#suggestion-modal-alert').show(); jQuery('#suggestion-section-textarea').focus(); error_msg = true; } } else{ jQuery('.spinner-loading-overlay:eq(0)').remove(); jQuery('#suggestion-modal-alert').html("Enter atleast four words !"); jQuery('#suggestion-modal-alert').show(); jQuery('#suggestion-section-textarea').focus(); error_msg = true; } if(error_msg){ setTimeout(() => { jQuery('#suggestion-section-textarea').focus(); jQuery('#suggestion-modal-alert').hide(); }, 3000); } } document.querySelector('.suggest-bottom-btn').addEventListener('click', function(){ jQuery('body').append('
'); jQuery('.spinner-loading-overlay').show(); if(loginData && loginData.isLoggedIn) { suggestionCall(); return; } // load the captcha script and set the token loadScript('https://www.google.com/recaptcha/api.js?render=6LdMFNUZAAAAAIuRtzg0piOT-qXCbDF-iQiUi9KY',[], function() { setGoogleRecaptcha(); }); }); $('.improvement-bottom-btn.create-improvement-btn').click(function() { //create improvement button is clicked $('body').append('
'); $('.spinner-loading-overlay').show(); // send this option via create-improvement-post api jQuery.ajax({ url: writeApiUrl + 'create-improvement-post/?v=1', type: "POST", contentType: 'application/json; charset=utf-8', dataType: 'json', xhrFields: { withCredentials: true }, data: JSON.stringify({ gfg_id: post_id }), success:function(result) { $('.spinner-loading-overlay:eq(0)').remove(); $('.improve-modal--overlay').hide(); $('.improvement-reason-modal').hide(); $('.create-improvement-redirection-to-write').attr('href',writeUrl + 'improve-post/' + `${result.id}` + '/', '_blank'); $('.create-improvement-redirection-to-write')[0].click(); }, error:function(e) { $('.spinner-loading-overlay:eq(0)').remove(); var result = e.responseJSON; if(result.detail.non_field_errors.length){ $('.improve-modal--improve-content .improve-modal--improve-content-modified').text(`${result.detail.non_field_errors}.`); jQuery('.improve-modal--overlay').show(); jQuery('.improve-modal--improvement').show(); $('.locked-status--impove-modal').css("display","block"); $('.unlocked-status--improve-modal-content').css("display","none"); $('.improve-modal--improvement').attr("status","locked"); $('.improvement-reason-modal').hide(); } }, }); });