Working of HashMap (Internals)
You already know that HashMap gives you O(1) average-time get and put. But how does it achieve that? Understanding the internals makes you a better developer — you’ll avoid performance traps, tune it correctly, and ace interviews.
The Core Idea: Hashing
A HashMap stores key-value pairs by converting each key into an integer index — a hash code — and using that index to locate a slot (called a bucket) in an internal array. This is what makes lookups near-instant: instead of scanning every entry, Java computes the slot directly.
import java.util.HashMap;
public class HashMapDemo {
public static void main(String[] args) {
HashMap<String, Integer> scores = new HashMap<>();
scores.put("Alice", 95);
scores.put("Bob", 87);
scores.put("Carol", 92);
System.out.println(scores.get("Alice")); // 95
}
}
Output:
95
When you call put("Alice", 95), Java:
- Calls
"Alice".hashCode()to get an integer. - Spreads that integer across the available buckets.
- Stores the key-value pair in the resulting bucket.
The Internal Array of Buckets
HashMap internally holds a field called table — an array of Node<K,V> objects. Each element of that array is one bucket. The default initial capacity is 16 buckets.
// Simplified conceptual view (not the exact source)
Node<K,V>[] table; // the bucket array
int threshold; // resize when size > threshold
float loadFactor; // default 0.75
int size; // number of key-value pairs stored
Each Node holds:
| Field | Purpose |
|---|---|
hash | the pre-computed hash of the key |
key | the actual key object |
value | the associated value |
next | pointer to the next node in the same bucket (chaining) |
Step-by-Step: How put() Works
HashMap<String, Integer> map = new HashMap<>();
map.put("Java", 1);
1. Compute the Hash
Java calls key.hashCode() and then applies a secondary spread function to reduce clustering:
// Simplified version of HashMap's internal hash()
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
XOR-ing the high 16 bits into the low 16 bits ensures that keys whose hashCode() values differ only in the upper bits still land in different buckets.
2. Compute the Bucket Index
int index = (capacity - 1) & hash; // equivalent to hash % capacity (when capacity is a power of two)
Because the capacity is always a power of two, the & mask is a fast bitwise operation rather than a slower modulus.
3. Insert the Node
- If
table[index]is empty, a newNodeis placed there directly. - If
table[index]is occupied (a collision), the new node is appended to the linked list at that bucket — this is called separate chaining.
Note: Two different keys landing in the same bucket is called a hash collision. Collisions are normal and handled gracefully; they only affect performance, not correctness.
Handling Collisions: Chaining
When multiple keys hash to the same bucket index, HashMap chains them in a linked list within that bucket. During get(), Java walks the chain comparing each node’s key with equals() until it finds the right one.
import java.util.HashMap;
public class CollisionDemo {
public static void main(String[] args) {
// "Aa" and "BB" have the same hashCode() in Java
HashMap<String, Integer> map = new HashMap<>();
map.put("Aa", 1);
map.put("BB", 2);
System.out.println(map.get("Aa")); // 1
System.out.println(map.get("BB")); // 2
}
}
Output:
1
2
Both keys coexist correctly even though they collide — equals() distinguishes them.
Java 8+: Treeification
In Java 8, a crucial optimization was added: when a single bucket’s chain grows beyond 8 nodes, that linked list is automatically converted into a red-black tree (a balanced binary search tree). This changes worst-case lookup in that bucket from O(n) to O(log n).
The threshold constants are:
| Constant | Value | Meaning |
|---|---|---|
TREEIFY_THRESHOLD | 8 | Convert chain → tree after 8 nodes |
UNTREEIFY_THRESHOLD | 6 | Convert tree → chain after resize reduces to 6 |
MIN_TREEIFY_CAPACITY | 64 | Only treeify if the table has at least 64 buckets |
Tip: Treeification is a safety net for pathological cases (e.g., deliberately crafted keys with the same hash). With a good
hashCode()implementation, you’ll rarely trigger it.
Load Factor and Resizing
The load factor (default 0.75) controls how full the bucket array can get before it is resized. The threshold at which a resize triggers is:
threshold = capacity × loadFactor
With the defaults: threshold = 16 × 0.75 = 12. Once you insert a 13th entry, HashMap doubles the array to 32 buckets and rehashes all existing entries into the new positions.
// Specifying initial capacity and load factor
HashMap<String, Integer> map = new HashMap<>(32, 0.75f);
Warning: Resizing is expensive — it allocates a new array and rehashes every key. If you know your map will hold ~1000 entries, initialize with
new HashMap<>(1400)(roughlyexpected / loadFactor) to avoid multiple resizes.
Resizing Efficiency in Java 8+
Because capacity is always a power of two, Java 8 avoids recomputing the full hash during resize. Each key either stays in its current bucket index or moves to oldIndex + oldCapacity. This is determined by checking one extra bit of the hash, making rehashing very efficient.
Step-by-Step: How get() Works
Integer value = map.get("Java");
- Compute
hash("Java"). - Compute
index = (capacity - 1) & hash. - Walk the chain (or tree) at
table[index], comparing each node withhash == nodeHash && key.equals(nodeKey). - Return the matching node’s value, or
nullif not found.
Note:
get()uses bothhashCode()andequals(). This is why the contract “ifa.equals(b)thena.hashCode() == b.hashCode()” is critical. Violate it, andget()silently returnsnullfor keys that are logically equal.
null Keys and null Values
HashMap explicitly handles a null key by storing it at bucket index 0:
HashMap<String, String> map = new HashMap<>();
map.put(null, "nullValue");
map.put("key", null);
System.out.println(map.get(null)); // nullValue
System.out.println(map.get("key")); // null
Output:
nullValue
null
Hashtable does not allow null keys or values — another key difference covered in HashMap vs Hashtable.
Under the Hood
Memory Layout
Each Node<K,V> is a separate heap object. With 8 bytes of object header, 4 fields (hash, key ref, value ref, next ref), a single node costs roughly 32–40 bytes on a 64-bit JVM with compressed oops. For a million-entry map that’s around 40 MB just for nodes — worth keeping in mind when sizing caches.
Hash Quality Matters
The time complexity guarantee of O(1) average is conditional on a good hash function. Java’s String.hashCode() is high quality. For custom objects, follow the rule: always override hashCode() when you override equals(). A poor hash (e.g., always returning a constant) degrades the map to O(n) even before Java 8 treeification kicks in.
// Good custom hashCode (using Objects.hash for convenience)
import java.util.Objects;
public class Point {
int x, y;
Point(int x, int y) { this.x = x; this.y = y; }
@Override
public boolean equals(Object o) {
if (!(o instanceof Point p)) return false;
return x == p.x && y == p.y;
}
@Override
public int hashCode() {
return Objects.hash(x, y);
}
}
Thread Safety
HashMap is not thread-safe. Concurrent writes can corrupt the internal structure (in older JDKs, even cause infinite loops during resize). Use ConcurrentHashMap for multi-threaded access, or wrap with Collections.synchronizedMap() if you need to retrofit existing code.
Iteration Order
HashMap does not guarantee iteration order. If you need insertion order, use LinkedHashMap. For sorted keys, use TreeMap.
Key Takeaways
put()computes a hash, finds a bucket via bitwise AND, and chains nodes on collision.- Java 8+ converts long chains to red-black trees for O(log n) worst-case bucket access.
- The default load factor of 0.75 balances memory and speed well; resize pre-allocate when size is known.
- Always implement both
hashCode()andequals()consistently for custom keys. HashMapis not thread-safe — useConcurrentHashMapin concurrent code.
Related Topics
- HashMap — complete API guide and usage examples for HashMap
- LinkedHashMap — HashMap that preserves insertion order
- TreeMap — sorted map backed by a red-black tree
- HashMap vs Hashtable — key differences including null handling and thread safety
- HashSet — Set backed by a HashMap internally; same hashing concepts apply
- Concurrent Collections — thread-safe alternatives including ConcurrentHashMap