[The underlying implementation principle of HashMap]

1. Implementation based on an ArrayList collection

import java.util.ArrayList;
import java.util.List;

public class ArraylistHashMap<K, V> {
    private List<Entry<K, V>> entries = new ArrayList<>();

    static class Entry<K, V> {
        K k;
        V v;

        public Entry(K k, V v) {
            this.k = k;
            this.v = v;
        }
    }

    public void put(K k, V v) {
        entries.add(new Entry<>(k, v));
    }

    public V get(K k) {
        // Linear scan over every entry
        for (Entry<K, V> entry : entries) {
            if (entry.k.equals(k)) {
                return entry.v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ArraylistHashMap<String, String> arraylistHashMap = new ArraylistHashMap<>();
        arraylistHashMap.put("demo", "123");
        arraylistHashMap.put("123", "demo");
        System.out.println(arraylistHashMap.get("demo"));
    }
}

Looking up a value by key requires scanning the whole list, so the time complexity is O(n) — very inefficient.

2. Implementation based on array + linked list (JDK 1.7)

In JDK 1.7, the bottom layer of HashMap is implemented with an array of buckets plus linked lists.
Entries stored in the same linked list may share the same hashCode value even though their keys differ (for example, the String "a" and the Integer 97 both hash to 97).

import java.util.Objects;

public class ExtHashMap<K, V> {
    private Entry[] objects = new Entry[10000];

    static class Entry<K, V> {
        public K k;
        public V v;
        Entry<K, V> next;

        public Entry(K k, V v) {
            this.k = k;
            this.v = v;
        }
    }

    public void put(K k, V v) {
        // A null key always goes to bucket 0; mask off the sign bit so a
        // negative hashCode cannot produce a negative index
        int index = k == null ? 0 : (k.hashCode() & 0x7fffffff) % objects.length;
        Entry<K, V> oldEntry = objects[index];
        // Determine whether the bucket is empty
        if (oldEntry == null) {
            objects[index] = new Entry<>(k, v);
        } else {
            // Hash collision: append the new entry at the end of the linked list
            while (oldEntry.next != null) {
                oldEntry = oldEntry.next;
            }
            oldEntry.next = new Entry<>(k, v);
        }
    }

    public V get(K k) {
        int index = k == null ? 0 : (k.hashCode() & 0x7fffffff) % objects.length;
        for (Entry<K, V> oldEntry = objects[index]; oldEntry != null; oldEntry = oldEntry.next) {
            // Objects.equals also handles the null key correctly
            if (Objects.equals(oldEntry.k, k)) {
                return oldEntry.v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ExtHashMap<Object, String> hashMap = new ExtHashMap<>();
        hashMap.put("a", "a");        // "a".hashCode() == 97
        hashMap.put(97, "97");        // collides with "a" in the same bucket
        hashMap.put(null, "nulldemo");
        System.out.println(hashMap.get("a"));
        System.out.println(hashMap.get(97));
        System.out.println(hashMap.get(null));
    }
}


3. Does the bottom layer of HashMap store entries in order?

No — storage is unordered (hash-based).
Traversal walks the bucket array from index 0 and then follows each linked list, so the iteration order is not guaranteed to match the insertion order.
If you need to preserve insertion order, use LinkedHashMap, which maintains a doubly linked list on top of the hash table.
HashMap itself only uses singly linked lists inside its buckets.
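A small sketch of the difference (the class and method names below are my own, for illustration only):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OrderDemo {
    // Inserts three keys and returns the map's iteration order
    static List<String> iterationOrder(Map<String, Integer> map) {
        map.put("one", 1);
        map.put("two", 2);
        map.put("three", 3);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // HashMap iterates in bucket order, which need not match insertion order
        System.out.println(iterationOrder(new HashMap<>()));
        // LinkedHashMap's doubly linked list preserves insertion order
        System.out.println(iterationOrder(new LinkedHashMap<>()));
    }
}
```

Only the LinkedHashMap result is guaranteed to be one, two, three; the HashMap order is an implementation detail.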

4. LinkedHashMap implements a cache eviction framework

Common cache eviction algorithms:

LRU (Least Recently Used)
LFU (Least Frequently Used)
ARC (Adaptive Replacement Cache)
FIFO (First In, First Out)
MRU (Most Recently Used)

LinkedHashMap is a subclass of HashMap that additionally maintains a doubly linked list running through all of its entries; every key-value pair sits both in the hash table and in this linked list. The list can maintain either of two orders:

Insertion order: entries are iterated in the order they were added; updating an existing key does not change its position.
Access order: after a get or put, the affected entry is moved to the tail of the list, so the tail holds the most recently accessed entry and the head holds the one untouched for the longest time.

The accessOrder constructor parameter selects the mode: true for access order, false (the default) for insertion order.

import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    /** maximum capacity of the cache */
    private int capacity;

    public LruCache(int capacity) {
        // accessOrder = true: iteration follows access order (LRU)
        super(capacity, 0.75f, true);
        this.capacity = capacity;
    }

    /**
     * Evict the eldest entry once the capacity is exceeded
     */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> lruCache = new LruCache<>(3);
        lruCache.put("a", "a");
        lruCache.put("b", "b");
        lruCache.put("c", "c");
        // capacity is 3, so this put evicts the eldest entry "a"
        lruCache.put("e", "demo");
        lruCache.forEach((k, v) -> {
            System.out.println(k + "," + v);
        });
    }
}

5. How HashMap reduces the probability of hash conflicts

static final int hash(Object key) {
    int h;
    // XOR the high 16 bits into the low 16 bits so that high-order bits
    // also influence the bucket index
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

// index calculation in putVal/getNode:
// if ((p = tab[i = (n - 1) & hash]) == null) ...

1. Ensuring no array index out of bounds
In HashMap the array length is always a power of two, so its binary form is a 1 followed by all 0s (e.g. 16 = 10000), and length - 1 is a 0 followed by all 1s (e.g. 15 = 01111). ANDing the hash with length - 1 can therefore never produce a value greater than or equal to the array length, so the index can never go out of bounds. For example, a hash of 8 (binary 1000) and a hash of 9 (binary 1001), each ANDed with 1111, give 1000 and 1001 respectively — adjacent hashes land in different slots, which keeps the distribution uniform.
2. Mixing in the high bits
Because the index mask only uses the low bits of the hash, the hash() method above XORs the high 16 bits of hashCode() into the low 16 bits; keys whose hash codes differ only in the high bits would otherwise always collide.
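The two mechanisms can be sketched together; IndexDemo and indexFor are names of my own, while hash() mirrors the JDK 8 source shown above:

```java
public class IndexDemo {
    // JDK 8's hash(): XOR the high 16 bits into the low 16 bits so that
    // high-order bits also influence the bucket index
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // For a power-of-two length n, (n - 1) & hash selects a bucket in [0, n - 1]
    // and is equivalent to hash % n, but without division or sign issues
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(8, 16)); // 1000 & 1111 = 1000 -> bucket 8
        System.out.println(indexFor(9, 16)); // 1001 & 1111 = 1001 -> bucket 9
    }
}
```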

6. Interpretation of HashMap source code

6.1 The role of modCount

java.util.HashMap is not thread-safe: if another thread structurally modifies the map while an iterator is in use, a ConcurrentModificationException is thrown. This is the so-called fail-fast strategy, implemented in the source through the modCount field. As the name implies, modCount counts modifications: every structural change to the HashMap increments it. When an iterator is created, modCount is copied into the iterator's expectedModCount; during iteration the two are compared, and if they differ, some other code has modified the map in the meantime.
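A minimal sketch of the fail-fast behavior (class and method names are my own):

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    // Returns true if modifying the map during iteration trips the fail-fast check
    static boolean triggersFailFast() {
        Map<String, String> map = new HashMap<>();
        map.put("a", "1");
        map.put("b", "2");
        try {
            for (String key : map.keySet()) {
                // Structural modification: modCount++ invalidates the iterator,
                // so the next call to next() throws
                map.put("c", "3");
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(triggersFailFast()); // prints true
    }
}
```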

6.2 The JDK 1.7 expansion infinite-loop problem

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            // Head insertion: each node is prepended to the new bucket,
            // which reverses the order of the list
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}

Because head insertion reverses each list, two threads resizing the same map concurrently can leave nodes linked into a cycle; a later get() on that bucket then loops forever, pinning a CPU core. JDK 8 avoids this by preserving node order when splitting lists during a resize.

6.3 The underlying principle of HashMap resizing in JDK 1.8

During a resize, each linked list is split into two lists: the "low" list stays at the original index j, and the "high" list moves to index j + oldCap (the old capacity).
The condition if ((e.hash & oldCap) == 0) decides which list a node joins: because oldCap is a power of two (note: oldCap itself, not oldCap - 1), e.hash & oldCap is either 0 or oldCap — it tests exactly the one bit that becomes part of the index after the capacity doubles.
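A small sketch of the split condition (SplitDemo and newIndex are names of my own):

```java
public class SplitDemo {
    // During a JDK 8-style resize from oldCap to 2 * oldCap, (hash & oldCap)
    // decides whether a node stays at index j or moves to j + oldCap
    static int newIndex(int hash, int oldCap) {
        int j = hash & (oldCap - 1);           // index in the old table
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        System.out.println(newIndex(5, 16));   // 5  -> low list, stays at 5
        System.out.println(newIndex(21, 16));  // 21 -> high list, 5 + 16
    }
}
```

Note that newIndex(hash, 16) always equals hash & 31, i.e. the index in the doubled table, which is why no full rehash is needed.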

6.4 Why is the HashMap load factor 0.75 rather than 1 or 0.5?

Background: the load factor balances the probability of hash conflicts against query efficiency and space usage by triggering expansion in advance:

  1. A larger load factor gives higher space utilization but a higher probability of conflicts;
  2. A smaller load factor gives a lower probability of conflicts but wastes space.

0.75 is the balance point between space and time. Statistically, the choice is supported by the Poisson distribution (a discrete probability distribution common in statistics and probability theory): at a load factor of 0.75, the expected number of entries per bucket stays very small.

6.5 How to store 10,000 keys in a HashMap most efficiently

Refer to the Alibaba Java Development Manual: set the initial capacity to expectedSize / loadFactor + 1, so that no resize happens while the expected entries are inserted.
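A sketch of that sizing rule (CapacityDemo is my own name; the formula expectedSize / loadFactor + 1 is the one the manual recommends):

```java
import java.util.HashMap;
import java.util.Map;

public class CapacityDemo {
    // Recommended initial capacity: expected element count / load factor + 1
    static int initialCapacity(int expected) {
        return (int) (expected / 0.75f) + 1;
    }

    public static void main(String[] args) {
        // 10000 / 0.75 + 1 = 13334; HashMap rounds this up to 16384 internally,
        // whose threshold 16384 * 0.75 = 12288 > 10000, so no resize occurs
        Map<String, String> map = new HashMap<>(initialCapacity(10000));
        for (int i = 0; i < 10000; i++) {
            map.put("key" + i, "value" + i);
        }
        System.out.println(map.size());
    }
}
```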

6.6 Why the JDK does not officially treat the JDK 7 resize infinite loop as a bug

HashMap is documented as not thread-safe, so calling it from multiple threads without external synchronization is considered a misuse rather than a defect; reports of the resize loop were answered with the advice to use ConcurrentHashMap (or Collections.synchronizedMap) instead.

7. Interpretation of ConcurrentHashMap source code

7.1 JDK 1.7 ConcurrentHashMap source code interpretation

The traditional Hashtable guarantees thread safety by declaring its methods synchronized, which locks the entire table.
Only one thread at a time can execute put or get, which is very inefficient, although it does guarantee thread safety.

The JDK does not recommend using Hashtable (or an unsynchronized HashMap) in multi-threaded situations; it recommends ConcurrentHashMap, which segments the map and is therefore far more efficient.

ConcurrentHashMap splits one large map into n smaller hash tables (Segments) — 16 by default.
Each Segment has its own independent HashEntry<K, V>[] table.

7.1.1 Simple implementation of ConcurrentHashMap

import java.util.Hashtable;

public class MyConcurrentHashMap<K, V> {
    private Hashtable<K, V>[] segments;

    @SuppressWarnings("unchecked")
    public MyConcurrentHashMap() {
        segments = new Hashtable[16];
        // Eagerly create every segment so that put/get never race on lazy initialization
        for (int i = 0; i < segments.length; i++) {
            segments[i] = new Hashtable<>();
        }
    }

    public void put(K k, V v) {
        // The mask keeps the index in [0, 15] even for negative hash codes
        int segmentIndex = k.hashCode() & (segments.length - 1);
        segments[segmentIndex].put(k, v);
    }

    public V get(K k) {
        int segmentIndex = k.hashCode() & (segments.length - 1);
        return segments[segmentIndex].get(k);
    }

    public static void main(String[] args) {
        MyConcurrentHashMap<String, String> concurrentHashMap = new MyConcurrentHashMap<>();
        for (int i = 0; i < 10; i++) {
            concurrentHashMap.put(i + "", i + "");
        }
        for (int i = 0; i < 10; i++) {
            System.out.println(concurrentHashMap.get(i + ""));
        }
    }
}


7.1.2 Analysis of core parameters

##1. No-argument constructor parameters:
initialCapacity: 16
loadFactor: 0.75 (applied to each segment's HashEntry<K,V>[] table)
concurrencyLevel: the concurrency level, 16 by default
##2. The concurrency level cannot exceed MAX_SEGMENTS (1 << 16):
if (concurrencyLevel > MAX_SEGMENTS)
    concurrencyLevel = MAX_SEGMENTS;
##3. ssize records the size of the Segment array (the smallest power of two >= concurrencyLevel); sshift counts the left shifts needed to reach it:
        int sshift = 0;
        int ssize = 1;
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
##4. segmentShift and segmentMask: ANDing with ssize - 1 spreads keys evenly across the segments:
        this.segmentShift = 32 - sshift;
        this.segmentMask = ssize - 1;
##5. Initialize Segment 0 and place it at subscript 0:
Segment<K,V> s0 =
            new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                             (HashEntry<K,V>[])new HashEntry[cap]);
##6. Publish s0 into the Segment array with an ordered write:
 UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
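The sshift/ssize computation above can be checked directly; SegmentParamsDemo and params are names of my own, but the loop body is the one from the constructor:

```java
import java.util.Arrays;

public class SegmentParamsDemo {
    // Returns {segmentShift, segmentMask} for a given concurrency level
    static int[] params(int concurrencyLevel) {
        int sshift = 0;
        int ssize = 1;
        // ssize: smallest power of two >= concurrencyLevel; sshift: its log2
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        return new int[]{32 - sshift, ssize - 1};
    }

    public static void main(String[] args) {
        // Default level 16 -> segmentShift = 28, segmentMask = 15
        System.out.println(Arrays.toString(params(16)));
    }
}
```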

A brief analysis of the put method's underlying implementation:

        Segment<K,V> s;
        if (value == null)
            throw new NullPointerException(); // null values are not allowed
        // Calculate the subscript of the Segment array where the key is stored
        // (with defaults, segmentShift = 28 and segmentMask = 15)
        int hash = hash(key);
        int j = (hash >>> segmentShift) & segmentMask;
        // Read Segment[j] with Unsafe; if it does not exist yet, create it
        if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
             (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
            s = ensureSegment(j);
        // The segment's put method uses its ReentrantLock to ensure thread safety
        return s.put(key, hash, value, false);


In-depth analysis of ensureSegment:

    private Segment<K,V> ensureSegment(int k) {
        final Segment<K,V>[] ss = this.segments;
        long u = (k << SSHIFT) + SBASE; // raw offset
        Segment<K,V> seg;
        // Volatile read of the Segment straight from main memory; null if absent
        if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
            // Prototype pattern: copy the parameters of segment 0 for the new Segment
            Segment<K,V> proto = ss[0]; // use segment 0 as prototype
            int cap = proto.table.length;
            float lf = proto.loadFactor;
            int threshold = (int)(cap * lf);
            HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
            if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                == null) { // recheck
                // Create the new Segment object
                Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);
                while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                       == null) {
                    // Publish it with CAS; another thread may win the race
                    if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
                        break;
                }
            }
        }
        return seg;
    }

   final V put(K key, int hash, V value, boolean onlyIfAbsent) {
        // Try to acquire the lock; if that fails, spin in scanAndLockForPut
        HashEntry<K,V> node = tryLock() ? null :
            scanAndLockForPut(key, hash, value);
        V oldValue;
        try {
            HashEntry<K,V>[] tab = table;
            // Calculate the index of the bucket where the key is stored
            int index = (tab.length - 1) & hash;
            HashEntry<K,V> first = entryAt(tab, index);
            for (HashEntry<K,V> e = first;;) {
                if (e != null) {
                    K k;
                    // Walk the list; if the key already exists, replace its value
                    if ((k = e.key) == key ||
                        (e.hash == hash && key.equals(k))) {
                        oldValue = e.value;
                        if (!onlyIfAbsent) {
                            e.value = value;
                            ++modCount;
                        }
                        break;
                    }
                    e = e.next;
                }
                else {
                    // Create a new node and insert it at the head of the list
                    if (node != null)
                        node.setNext(first);
                    else
                        node = new HashEntry<K,V>(hash, key, value, first);
                    int c = count + 1;
                    // Rehash (expand) in advance if the threshold is reached
                    if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                        rehash(node);
                    else
                        setEntryAt(tab, index, node);
                    ++modCount;
                    count = c;
                    oldValue = null;
                    break;
                }
            }
        } finally {
            unlock(); // release the lock
        }
        return oldValue;
    }

7.1.3 Core APIs

getObjectVolatile is similar to getObject, but adds volatile load semantics: it forces the value to be read from main memory. Similar methods include getIntVolatile, getDoubleVolatile, and so on. They are intended for reads that need volatile semantics; for a field that is already declared volatile, a plain getObject read behaves the same.

The tryLock() method has a return value: it attempts to acquire the lock, returning true on success and false on failure (i.e. the lock is already held by another thread). Either way it returns immediately — it never blocks waiting for the lock.
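A small sketch of that non-blocking behavior using ReentrantLock (class and method names are my own):

```java
import java.util.concurrent.locks.ReentrantLock;

public class TryLockDemo {
    // Returns what tryLock() reports from another thread while we hold the lock
    static boolean tryFromOtherThread() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock(); // held by the current thread
        final boolean[] acquired = new boolean[1];
        Thread t = new Thread(() -> acquired[0] = lock.tryLock());
        t.start();
        try {
            // join() returns promptly: tryLock() did not wait for the lock
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        lock.unlock();
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println(tryFromOtherThread()); // prints false
    }
}
```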

7.2 Interpretation of JDK1.8 ConcurrentHashMap source code

ConcurrentHashMap in JDK 1.8 removes the Segment design and instead uses CAS + synchronized to guarantee thread safety, shrinking the lock granularity down to a single bucket (array index). This is finer-grained, and generally more efficient, than the segment locking of the JDK 1.7 ConcurrentHashMap.
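A highly simplified sketch of per-bucket CAS + synchronized locking — BinLockMap is my own class for illustration, not the JDK source, and it omits resizing, treeification, and size counting:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class BinLockMap<K, V> {
    static class Node<K, V> {
        final K k;
        volatile V v;
        volatile Node<K, V> next;
        Node(K k, V v) { this.k = k; this.v = v; }
    }

    private final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

    public void put(K k, V v) {
        int i = (table.length() - 1) & k.hashCode();
        for (;;) {
            Node<K, V> head = table.get(i);
            if (head == null) {
                // Empty bucket: install the first node with CAS, no lock needed
                if (table.compareAndSet(i, null, new Node<>(k, v))) return;
            } else {
                // Non-empty bucket: lock only this bucket's head node
                synchronized (head) {
                    if (table.get(i) != head) continue; // head changed, retry
                    for (Node<K, V> e = head;; e = e.next) {
                        if (e.k.equals(k)) { e.v = v; return; }
                        if (e.next == null) { e.next = new Node<>(k, v); return; }
                    }
                }
            }
        }
    }

    public V get(K k) {
        int i = (table.length() - 1) & k.hashCode();
        for (Node<K, V> e = table.get(i); e != null; e = e.next) {
            if (e.k.equals(k)) return e.v;
        }
        return null;
    }
}
```

Two threads writing to different buckets never contend, which is the point of moving the lock from a segment down to a single index.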

7.2.1 Source code interpretation

sizeCtl: defaults to 0 and controls initialization and resizing of the table:
-1 means the table is being initialized
-N means that N-1 threads are currently resizing
Otherwise:
1. If the table has not been initialized, it holds the initial table size to use (or 0 for the default).
2. After initialization, it holds the next resize threshold: 0.75 times the table size by default, computed with the shift expression n - (n >>> 2).

CounterCell objects record per-thread increments of the element count; size() sums baseCount and all CounterCells, which avoids contention on a single counter.
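The shift-based threshold formula mentioned above can be checked directly (ThresholdDemo is my own name):

```java
public class ThresholdDemo {
    // n - (n >>> 2) = n - n/4 = 0.75 * n, computed with a shift instead of
    // floating-point multiplication
    static int threshold(int n) {
        return n - (n >>> 2);
    }

    public static void main(String[] args) {
        System.out.println(threshold(16)); // 12
        System.out.println(threshold(32)); // 24
    }
}
```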

8. ArrayList source code interpretation

The bottom layer of ArrayList is based on an array.

 System.arraycopy(elementData, index + 1, elementData, index, numMoved);

Object src: the source array
int srcPos: the starting position in the source array
Object dest: the destination array
int destPos: the starting position in the destination array
int length: the number of elements to copy
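A small sketch of how arraycopy removes an element by shifting the tail left (ArraycopyDemo and removeAt are names of my own):

```java
import java.util.Arrays;

public class ArraycopyDemo {
    // Remove the element at the given index by shifting the tail one slot left
    static int[] removeAt(int[] a, int index) {
        int numMoved = a.length - index - 1;
        System.arraycopy(a, index + 1, a, index, numMoved);
        return Arrays.copyOf(a, a.length - 1); // drop the now-duplicated last slot
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(removeAt(new int[]{1, 2, 3, 4, 5}, 1)));
        // prints [1, 3, 4, 5]
    }
}
```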

import java.util.Arrays;

public class DemoArraylist<T> {
    /** stores the data elements */
    private Object[] elementData;
    /** number of stored elements */
    private int size = 0;
    /** default capacity is 10 */
    private static final int DEFAULT_CAPACITY = 10;

    public void add(T t) {
        if (elementData == null) {
            elementData = new Object[DEFAULT_CAPACITY];
        }
        // Determine whether expansion is required
        if ((size + 1) - elementData.length > 0) {
            int oldCapacity = elementData.length;
            // New capacity is 1.5x the old one, e.g. 10 -> 15
            int newCapacity = oldCapacity + (oldCapacity >> 1);
            // Expand by copying into a larger array
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
        elementData[size++] = t;
    }

    @SuppressWarnings("unchecked")
    public T get(int index) {
        return (T) elementData[index];
    }

    public boolean remove(T value) {
        for (int i = 0; i < size; i++) {
            @SuppressWarnings("unchecked")
            T oldValue = (T) elementData[i];
            if (oldValue.equals(value)) {
                int numMoved = size - i - 1;
                if (numMoved > 0)
                    System.arraycopy(elementData, i + 1, elementData, i, numMoved);
                elementData[--size] = null; // clear the slot so GC can reclaim it
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        DemoArraylist<String> arraylist = new DemoArraylist<>();
        for (int i = 0; i < 100; i++) {
            arraylist.add("demo" + i);
        }
        for (int i = 0; i < arraylist.size; i++) {
            System.out.println(arraylist.get(i));
        }
    }
}

9. HashMap source code comments

Tags: Java JavaSE HashMap

Posted by Fribbles on Mon, 25 Jul 2022 22:19:37 +0530