1. Implementation based on an ArrayList collection
```java
import java.util.ArrayList;
import java.util.List;

public class ArraylistHashMap<K, V> {

    // Every entry is simply appended to one flat list.
    private List<Entry> entries = new ArrayList<>();

    // Inner class reuses the outer K and V type parameters.
    class Entry {
        K k;
        V v;

        public Entry(K k, V v) {
            this.k = k;
            this.v = v;
        }
    }

    public void put(K k, V v) {
        entries.add(new Entry(k, v));
    }

    public V get(K k) {
        // Linear scan: compare the requested key against every entry.
        for (Entry entry : entries) {
            if (entry.k.equals(k)) {
                return entry.v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ArraylistHashMap<String, String> arraylistHashMap = new ArraylistHashMap<>();
        arraylistHashMap.put("demo", "123");
        arraylistHashMap.put("123", "demo");
        System.out.println(arraylistHashMap.get("123")); // demo
    }
}
```
Looking up a value by key requires a linear scan of the list, so the time complexity is O(n) — very inefficient.
2. Implementation based on array + linked list (JDK 1.7)
HashMap in JDK 1.7 is implemented internally as an array of buckets, where each bucket holds a singly linked list.
Entries stored in the same linked list may share the same bucket index (their hash codes collide), while their keys and values differ.
```java
public class ExtHashMap<K, V> {

    // Bucket array; each slot holds the head of a singly linked list.
    private Entry[] objects = new Entry[10000];

    static class Entry<K, V> {
        public K k;
        public V v;
        Entry<K, V> next;

        public Entry(K k, V v) {
            this.k = k;
            this.v = v;
        }
    }

    public void put(K k, V v) {
        // Null keys always map to bucket 0; Math.abs guards against
        // negative hash codes.
        int index = k == null ? 0 : Math.abs(k.hashCode()) % objects.length;
        Entry<K, V> oldEntry = objects[index];
        Entry<K, V> newEntry = new Entry<>(k, v);
        // On a hash collision the new entry is linked in front of the
        // existing chain (head insertion), so no node is lost.
        newEntry.next = oldEntry;
        objects[index] = newEntry;
    }

    @SuppressWarnings("unchecked")
    public V get(K k) {
        int index = k == null ? 0 : Math.abs(k.hashCode()) % objects.length;
        for (Entry<K, V> e = objects[index]; e != null; e = e.next) {
            // Null-safe key comparison.
            if (e.k == null ? k == null : e.k.equals(k)) {
                return e.v;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        ExtHashMap<Object, String> hashMap = new ExtHashMap<>();
        // "a".hashCode() and Integer.valueOf(97).hashCode() are both 97,
        // so these two keys collide in the same bucket.
        hashMap.put("a", "a");
        hashMap.put(97, "97");
        hashMap.put(null, "nulldemo");
        System.out.println(hashMap.get(97)); // 97
    }
}
```
3. Does HashMap store entries in order?
No. Storage is unordered (hash-based).
Iteration starts at bucket 0 of the array and walks each linked list in turn, so the iteration order is not guaranteed to match the insertion order.
If you need the entries kept in insertion order, use LinkedHashMap, which maintains an additional doubly linked list across all entries; a plain HashMap only chains entries inside each bucket with a singly linked list.
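A quick way to see the difference (a minimal demo class; the HashMap line's exact output depends on hashing and is not guaranteed):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class OrderDemo {
    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();
        Map<String, Integer> linked = new LinkedHashMap<>();
        for (String k : new String[]{"banana", "apple", "cherry"}) {
            hash.put(k, k.length());
            linked.put(k, k.length());
        }
        // Bucket order, not insertion order; may differ across keys/JDKs.
        System.out.println(hash);
        // Always insertion order: {banana=6, apple=5, cherry=6}
        System.out.println(linked);
    }
}
```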
4. Implementing a cache eviction framework with LinkedHashMap
Common cache eviction algorithms:
- LRU (Least Recently Used)
- LFU (Least Frequently Used)
- ARC (Adaptive Replacement Cache)
- FIFO (First In, First Out)
- MRU (Most Recently Used)
LinkedHashMap is a subclass of HashMap that additionally maintains a doubly linked list to preserve the order of key-value pairs: every entry sits both in the hash table and in the doubly linked list. It supports two orderings, insertion order and access order.
- Insertion order: entries are iterated in the order they were first added; updating the value of an existing key does not change its position.
- Access order: after a get/put, the touched entry is moved to the tail of the linked list, so the tail holds the most recently accessed entry and the head holds the entry that has gone unaccessed the longest.
The constructor parameter accessOrder selects the mode: true gives access order, false (the default) gives insertion order.
```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {

    /** Maximum number of entries kept in the cache. */
    private final int capacity;

    public LruCache(int capacity) {
        // accessOrder = true puts the map in access order (LRU order).
        super(capacity, 0.75f, true);
        this.capacity = capacity;
    }

    /**
     * Evict the eldest (least recently used) entry once the
     * capacity is exceeded.
     */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> lruCache = new LruCache<>(3);
        lruCache.put("a", "a");
        lruCache.put("b", "b");
        lruCache.put("c", "c");
        // Accessing "a" moves it to the tail: order is now b, c, a.
        lruCache.get("a");
        // Adding "e" exceeds the capacity and evicts "b": c, a, e.
        lruCache.put("e", "demo");
        lruCache.forEach((k, v) -> System.out.println(k + "," + v));
    }
}
```
5. How HashMap reduces the probability of hash collisions
```java
static final int hash(Object key) {
    int h;
    // XOR the high 16 bits of the hash code into the low 16 bits.
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

// Index calculation inside putVal:
// if ((p = tab[i = (n - 1) & hash]) == null) ...
```
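The XOR above folds the high 16 bits of the hash code into the low 16 bits, so keys whose hash codes differ only in the high bits can still land in different buckets. A small illustration (the hash values are chosen purely for demonstration):

```java
public class PerturbDemo {
    static int perturb(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;           // table length
        int h1 = 0x10000;     // differ only above the low 4 bits
        int h2 = 0x20000;
        // Without perturbation both collapse to index 0:
        System.out.println(((n - 1) & h1) + " " + ((n - 1) & h2));           // 0 0
        // With the high bits XORed down they separate:
        System.out.println(((n - 1) & perturb(h1)) + " "
                + ((n - 1) & perturb(h2)));                                  // 1 2
    }
}
```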
1. Guaranteeing that no array index is out of bounds
First, HashMap requires the array length to be a power of two, so in binary the length looks like 10000…000: a single 1 followed by zeros. Consequently, length - 1 looks like 01111…111: a leading 0 followed by all 1s. Because the highest bit of the mask is 0, ANDing the hash value with length - 1 can never produce a value greater than or equal to the array length, so the index can never go out of bounds. For example, a hash of 8 is 1000 in binary and a hash of 9 is 1001; ANDed with 1111 they yield 1000 and 1001 respectively, landing in different slots of the array, which keeps the hash distribution uniform.
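A minimal demonstration that masking with length - 1 always yields an in-range index (equivalent to hash % n for non-negative hashes, but cheaper than division):

```java
public class IndexDemo {
    public static void main(String[] args) {
        int n = 16; // table length, always a power of two
        for (int hash : new int[]{8, 9, 24}) {
            // (n - 1) & hash keeps only the low bits, so the result
            // is always in [0, n).
            System.out.println(hash + " -> " + ((n - 1) & hash)); // 8, 9, 8
        }
    }
}
```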
6. Interpretation of HashMap source code
6.1 The role of modCount
As we know, java.util.HashMap is not thread-safe: if another thread structurally modifies the map while an iterator is in use, a ConcurrentModificationException is thrown. This is the so-called fail-fast strategy, and it is implemented in the source code through the modCount field. As the name implies, modCount counts modifications: every structural change to the HashMap increments it. When an iterator is created, modCount is copied into the iterator's expectedModCount; during iteration the two are compared, and if they differ, the map has been modified outside the iterator.
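A minimal sketch that triggers the fail-fast behavior (the class name is illustrative):

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("a", "1");
        map.put("b", "2");
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            it.next();
            // A structural modification outside the iterator bumps modCount,
            // so the next it.next() throws ConcurrentModificationException.
            map.put("c", "3");
        }
    }
}
```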
6.2 The JDK 1.7 HashMap resize infinite-loop problem
```java
void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            // Head insertion: the node is linked in front of the new
            // bucket, which reverses the order of each chain.
            e.next = newTable[i];
            newTable[i] = e;
            e = next;
        }
    }
}
```
Because head insertion reverses each chain, two threads resizing the same map concurrently can end up linking two nodes to each other (a → b and b → a), producing a circular linked list; a later get() on that bucket then spins forever.
6.3 How JDK 1.8 HashMap resizing works
During a resize, each bucket's linked list is split into two lists: the "low" list stays at the original index j, and the "high" list moves to index j + oldCap (the old capacity). The split test is `if ((e.hash & oldCap) == 0)`: because the hash is ANDed with oldCap itself (a single set bit) rather than oldCap - 1, the result is either 0 (the entry stays at j) or non-zero (the entry moves to j + oldCap).
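A small sketch of the split rule (the hash values are chosen so they all share old index 5):

```java
public class ResizeSplitDemo {
    public static void main(String[] args) {
        int oldCap = 16;                // power of two
        int[] hashes = {5, 21, 37, 53}; // all map to index 5 when cap == 16
        for (int h : hashes) {
            int j = h & (oldCap - 1);                          // old index
            // The single-bit test routes each entry to the low or high list.
            int newIndex = (h & oldCap) == 0 ? j : j + oldCap;
            System.out.println("hash=" + h + " old=" + j + " new=" + newIndex);
        }
    }
}
```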
6.4 Why is the HashMap load factor 0.75 instead of 1 or 0.5
Background: the load factor determines when the table resizes ahead of filling up, trading query efficiency against space.
- A larger load factor gives higher space utilization, but the probability of collisions (and thus longer chains and slower lookups) rises.
- A smaller load factor lowers the collision probability, but space utilization suffers and resizes happen more often.
0.75 is the balance point between space and time: with the default capacity of 16, the resize threshold is 16 × 0.75 = 12 entries.
Statistical background: the Poisson distribution, a common discrete probability distribution. The JDK 8 source comments note that under random hash codes and the default 0.75 threshold, the number of nodes per bin follows a Poisson distribution with a parameter of about 0.5.
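A quick illustration of how the load factor moves the resize threshold (a toy calculation, not JDK code):

```java
public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 16;
        for (float lf : new float[]{0.5f, 0.75f, 1.0f}) {
            // threshold = capacity * loadFactor: how many entries fit
            // before the table doubles.
            System.out.println(lf + " -> resize at " + (int) (capacity * lf));
        }
    }
}
```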
6.5 How to store 10,000 keys in a HashMap most efficiently
Refer to the Alibaba Java Development Manual: specify an initial capacity large enough up front that the map never has to resize while the keys are inserted.
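A minimal sketch of that advice; the sizing formula initialCapacity = (expected / loadFactor) + 1 follows the manual's recommendation:

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMapDemo {
    public static void main(String[] args) {
        int expected = 10_000;
        // Sized so the threshold is never crossed during insertion;
        // HashMap rounds the capacity up to the next power of two (16384).
        int initialCapacity = (int) (expected / 0.75f) + 1;
        Map<Integer, String> map = new HashMap<>(initialCapacity);
        for (int i = 0; i < expected; i++) {
            map.put(i, "value" + i);
        }
        System.out.println(map.size()); // 10000, with no intermediate resizes
    }
}
```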
6.6 Why the JDK team does not treat the JDK 7 resize infinite loop as a bug
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6423457
The report was closed as not a defect: HashMap is documented as not thread-safe, so unsynchronized concurrent use is an application error, and the recommended fix is to use a thread-safe map such as ConcurrentHashMap.
7. Interpretation of ConcurrentHashMap source code
7.1 JDK 1.7 ConcurrentHashMap source code interpretation
The traditional Hashtable guarantees thread safety with synchronized methods that lock the entire table: among many threads, only one at a time can execute put or get. That is safe but very inefficient. The JDK therefore recommends using neither Hashtable nor HashMap in multi-threaded code, but ConcurrentHashMap, which segments the map and is far more efficient.
ConcurrentHashMap splits one large map into n smaller hash tables (Segments), 16 by default. Each Segment has its own independent HashEntry<K,V>[] table, so threads operating on different segments do not block each other.
7.1.1 Simple implementation of ConcurrentHashMap
```java
import java.util.Hashtable;

public class MyConcurrentHashMap<K, V> {

    // One Hashtable per segment; 16 segments by default.
    private Hashtable<K, V>[] segments;

    @SuppressWarnings("unchecked")
    public MyConcurrentHashMap() {
        segments = new Hashtable[16];
    }

    public void put(K k, V v) {
        // segments.length is a power of two, so & (length - 1)
        // maps the hash onto a segment index.
        int segmentIndex = k.hashCode() & (segments.length - 1);
        Hashtable<K, V> hashtable = segments[segmentIndex];
        if (hashtable == null) {
            // Note: this lazy creation is itself racy; the real
            // ConcurrentHashMap publishes segments with CAS.
            hashtable = new Hashtable<>();
        }
        hashtable.put(k, v);
        segments[segmentIndex] = hashtable;
    }

    public V get(K k) {
        int segmentIndex = k.hashCode() & (segments.length - 1);
        Hashtable<K, V> hashtable = segments[segmentIndex];
        if (hashtable != null) {
            return hashtable.get(k);
        }
        return null;
    }

    public static void main(String[] args) {
        MyConcurrentHashMap<String, String> concurrentHashMap = new MyConcurrentHashMap<>();
        for (int i = 0; i < 10; i++) {
            concurrentHashMap.put(i + "", i + "");
        }
        for (int i = 0; i < 10; i++) {
            System.out.println(concurrentHashMap.get(i + ""));
        }
    }
}
```
7.1.2 Analysis of core parameters
1. No-argument constructor defaults: initialCapacity = 16, loadFactor = 0.75 (for each segment's HashEntry<K,V>[] table), concurrencyLevel = 16.

2. The concurrency level cannot exceed 2 to the 16th power:

```java
if (concurrencyLevel > MAX_SEGMENTS)
    concurrencyLevel = MAX_SEGMENTS;
```

3. sshift counts the number of left shifts; ssize records the size of the Segment array (the smallest power of two not less than concurrencyLevel):

```java
int sshift = 0;
int ssize = 1;
while (ssize < concurrencyLevel) {
    ++sshift;
    ssize <<= 1;
}
```

4. segmentShift and segmentMask (ssize - 1) spread keys evenly when ANDed against the hash:

```java
this.segmentShift = 32 - sshift; // 28 by default
this.segmentMask = ssize - 1;    // 15 by default
```

5. Segment 0 is initialized eagerly and assigned to subscript 0:

```java
Segment<K,V> s0 =
    new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                     (HashEntry<K,V>[])new HashEntry[cap]);
```

6. It is published into the Segment array with an ordered write:

```java
UNSAFE.putOrderedObject(ss, SBASE, s0);
```

A brief walkthrough of the put method:

```java
public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException();
    // Compute the subscript of the Segment array where the key is
    // stored (segmentShift is 28 and segmentMask is 15 by default).
    int hash = hash(key);
    int j = (hash >>> segmentShift) & segmentMask;
    // Read the Segment; if it has not been created yet, create it
    // in ensureSegment.
    if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
         (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
        s = ensureSegment(j);
    // The segment's put uses its lock to guarantee thread safety.
    return s.put(key, hash, value, false);
}
```

In-depth analysis of ensureSegment(int k):

```java
private Segment<K,V> ensureSegment(int k) {
    final Segment<K,V>[] ss = this.segments;
    long u = (k << SSHIFT) + SBASE; // raw offset
    Segment<K,V> seg;
    // Volatile read forces the Segment to be fetched from main
    // memory; null means it has not been created yet.
    if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {
        // Prototype pattern: copy the configuration of segment 0
        // into the new Segment.
        Segment<K,V> proto = ss[0]; // use segment 0 as prototype
        int cap = proto.table.length;
        float lf = proto.loadFactor;
        int threshold = (int)(cap * lf);
        HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
        // Recheck: another thread may have created it in the meantime.
        if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
            == null) {
            Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);
            while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                   == null) {
                // Publish with CAS; only one thread wins.
                if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))
                    break;
            }
        }
    }
    return seg;
}
```

Segment.put, which performs the actual insertion under the segment's lock:

```java
final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    // Try to acquire the lock; if that fails, spin (and prebuild the
    // node) in scanAndLockForPut.
    HashEntry<K,V> node = tryLock() ? null :
        scanAndLockForPut(key, hash, value);
    V oldValue;
    try {
        HashEntry<K,V>[] tab = table;
        // Compute the bucket index where the key is stored.
        int index = (tab.length - 1) & hash;
        HashEntry<K,V> first = entryAt(tab, index);
        for (HashEntry<K,V> e = first;;) {
            if (e != null) {
                K k;
                // Walk the linked list; if the key already exists,
                // overwrite its value.
                if ((k = e.key) == key ||
                    (e.hash == hash && key.equals(k))) {
                    oldValue = e.value;
                    if (!onlyIfAbsent) {
                        e.value = value;
                        ++modCount;
                    }
                    break;
                }
                e = e.next;
            }
            else {
                if (node != null)
                    node.setNext(first);
                else
                    // Create a new node; head insertion.
                    node = new HashEntry<K,V>(hash, key, value, first);
                int c = count + 1;
                // Resize ahead of time if the threshold is reached.
                if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                    rehash(node);
                else
                    setEntryAt(tab, index, node);
                ++modCount;
                count = c;
                oldValue = null;
                break;
            }
        }
    } finally {
        // Release the lock.
        unlock();
    }
    return oldValue;
}
```
7.1.3 Core APIs
getObjectVolatile
This method is similar to getObject, but adds volatile load semantics: the read is forced to go to main memory rather than use a cached value. getIntVolatile, getDoubleVolatile, and so on are the analogous primitive variants. Use it rather than getObject whenever volatile read semantics are required, in particular when the field being read is not itself declared volatile.
tryLock() returns a boolean: it attempts to acquire the lock, returning true on success and false if the lock is already held by another thread. Either way it returns immediately; it never blocks waiting for the lock.
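A minimal illustration with ReentrantLock (the same tryLock contract the Segment relies on, since a JDK 1.7 Segment extends ReentrantLock):

```java
import java.util.concurrent.locks.ReentrantLock;

public class TryLockDemo {
    private static final ReentrantLock LOCK = new ReentrantLock();

    public static void main(String[] args) {
        if (LOCK.tryLock()) {      // returns immediately: true if acquired
            try {
                System.out.println("lock acquired, doing work");
            } finally {
                LOCK.unlock();     // always release in finally
            }
        } else {
            System.out.println("lock held by another thread, not waiting");
        }
    }
}
```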
7.2 Interpretation of JDK 1.8 ConcurrentHashMap source code
JDK 1.8 ConcurrentHashMap drops the Segment design and instead uses CAS + synchronized to guarantee thread safety, shrinking the lock granularity from a whole segment down to a single bucket (array index). This is considerably finer-grained than the JDK 1.7 segment locking.
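A hedged sketch of the per-bucket locking idea (a hypothetical simplification, not the real ConcurrentHashMap code): CAS publishes the first node of an empty bucket, and synchronized on the head node protects a non-empty bucket.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class BucketLockSketch<K, V> {
    private static class Node<K, V> {
        final K key; volatile V value; volatile Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key; this.value = value; this.next = next;
        }
    }

    private final AtomicReferenceArray<Node<K, V>> table =
            new AtomicReferenceArray<>(16);

    public void put(K key, V value) {
        int index = (table.length() - 1) & key.hashCode();
        while (true) {
            Node<K, V> head = table.get(index);
            if (head == null) {
                // Empty bucket: lock-free CAS insert (like casTabAt).
                if (table.compareAndSet(index, null, new Node<>(key, value, null)))
                    return;
            } else {
                // Non-empty bucket: lock only this bucket's head node.
                synchronized (head) {
                    if (table.get(index) != head)
                        continue; // head changed, retry
                    for (Node<K, V> e = head; ; e = e.next) {
                        if (e.key.equals(key)) { e.value = value; return; }
                        if (e.next == null) {
                            e.next = new Node<>(key, value, null);
                            return;
                        }
                    }
                }
            }
        }
    }

    public V get(K key) {
        int index = (table.length() - 1) & key.hashCode();
        for (Node<K, V> e = table.get(index); e != null; e = e.next) {
            if (e.key.equals(key))
                return e.value;
        }
        return null;
    }

    public static void main(String[] args) {
        BucketLockSketch<String, Integer> map = new BucketLockSketch<>();
        map.put("a", 1);
        map.put("a", 2); // overwrite under the bucket lock
        System.out.println(map.get("a")); // 2
    }
}
```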
7.2.1 Source code interpretation
sizeCtl: defaults to 0 and controls initialization and resizing of the table:
- -1 means the table is currently being initialized.
- -N means N - 1 threads are currently performing the resize.
- Otherwise:
1. If the table has not been initialized yet, it holds the initial table size to use (0 means use the default).
2. After the table is initialized, it holds the next resize threshold, 0.75 times the table size by default; the source computes 0.75 × n as n - (n >>> 2) (for n = 16: 16 - 4 = 12).
counterCells records each thread's size increments; the map's total size is the base count plus the sum of all CounterCell values.
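The counterCells design is the same striped-counting idea as java.util.concurrent.atomic.LongAdder, which can stand in as a runnable illustration:

```java
import java.util.concurrent.atomic.LongAdder;

public class StripedCountDemo {
    public static void main(String[] args) throws InterruptedException {
        LongAdder size = new LongAdder();
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                size.increment(); // under contention each thread hits its own cell
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(size.sum()); // 200000: the sum over all cells
    }
}
```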
8. ArrayList source code interpretation
ArrayList is backed by an array.
The remove path shifts the tail of the array one slot to the left with:
```java
System.arraycopy(elementData, index + 1, elementData, index, numMoved);
```
Parameters of System.arraycopy:
- Object src: the source array
- int srcPos: the starting position in the source array
- Object dest: the destination array
- int destPos: the starting position in the destination array
- int length: the number of elements to copy
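A small worked example of the remove-by-shift call (the values are illustrative):

```java
import java.util.Arrays;

public class ArrayCopyDemo {
    public static void main(String[] args) {
        int[] a = {10, 20, 30, 40, 50};
        // Remove the element at index 1 by shifting the tail left one slot.
        int index = 1;
        int numMoved = a.length - index - 1;
        System.arraycopy(a, index + 1, a, index, numMoved);
        a[a.length - 1] = 0; // clear the now-duplicated last slot
        System.out.println(Arrays.toString(a)); // [10, 30, 40, 50, 0]
    }
}
```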
```java
import java.util.Arrays;

public class DemoArraylist<T> {

    /** Stores the data elements. */
    private Object[] elementData;

    /** Number of elements actually stored. */
    private int size = 0;

    /** Default capacity is 10. */
    private static final int DEFAULT_CAPACITY = 10;

    public void add(T t) {
        if (elementData == null) {
            elementData = new Object[DEFAULT_CAPACITY];
        }
        // Grow if adding one more element would exceed the capacity.
        if ((size + 1) - elementData.length > 0) {
            int oldCapacity = elementData.length;
            // New capacity is 1.5x the old one (e.g. 10 -> 15).
            int newCapacity = oldCapacity + (oldCapacity >> 1);
            elementData = Arrays.copyOf(elementData, newCapacity);
        }
        elementData[size++] = t;
    }

    @SuppressWarnings("unchecked")
    public T get(int index) {
        return (T) elementData[index];
    }

    @SuppressWarnings("unchecked")
    public boolean remove(T value) {
        for (int i = 0; i < size; i++) {
            T oldValue = (T) elementData[i];
            if (oldValue.equals(value)) {
                int numMoved = size - i - 1;
                if (numMoved > 0) {
                    // Shift everything after index i one slot to the left.
                    System.arraycopy(elementData, i + 1, elementData, i, numMoved);
                }
                // Clear the duplicated last slot and shrink the size.
                elementData[--size] = null;
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        DemoArraylist<String> arraylist = new DemoArraylist<>();
        for (int i = 0; i < 100; i++) {
            arraylist.add("demo" + i);
        }
        arraylist.remove("demo2");
        for (int i = 0; i < arraylist.size; i++) {
            System.out.println(arraylist.get(i));
        }
    }
}
```