Deep analysis of ThreadLocal source code

Deep analysis of ThreadLocal source code

The role of ThreadLocal

ThreadLocal is used to provide local variables in threads. To put it bluntly, it is to create a copy of variables in each thread. Compared with using various locking mechanisms to access variables, ThreadLocal's idea is to exchange space for time, so that each thread can access its own copy of variables, and the variable values do not interfere with each other, Reduce the complexity of transferring some common variables between multiple functions or components in the same thread.

Use of ThreadLocal

According to the official documents, ThreadLocal includes the following methods:

 

The get function is used to obtain the value of the ThreadLocal associated with the current thread. If the current thread does not have the value of the ThreadLocal, the initialValue function is called to obtain the initial value return. The initialValue is of protected type, so we generally need to inherit this function to give the initial value. The set function is used to set the ThreadLocal value of the current thread, and the remove function is used to delete the ThreadLocal bound value. In some cases, it needs to be called manually to prevent memory leakage.

Code example

public class Main {
    private static final ThreadLocal<Integer> threadLocal = new ThreadLocal<Integer>(){
        @Override
        protected Integer initialValue() {
            return Integer.valueOf(0);
        }
    };

    public static void main(String[] args) {
        Thread[] threads = new Thread[5];
        for (int i=0;i<5;i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    System.out.println(Thread.currentThread() +"'s initial value: " + threadLocal.get());
                    for (int j=0;j<10;j++) {
                        threadLocal.set(threadLocal.get() + j);
                    }
                    System.out.println(Thread.currentThread() +"'s last value: " + threadLocal.get());
                }
            });
        }

        for (Thread t: threads)
            t.start();
    }
}
Copy code

In the above example, threadLocal is unique to each thread, so even after accumulation, their values do not affect each other. The final output is as follows:

 

 

Analysis of ThreadLocal source code

After reading the basic use of ThreadLocal, readers must want to know how ThreadLocal is implemented internally? The main content of this article is that the code based on Java1.8 takes you to analyze the internal implementation of ThreadLocal. First look at the following picture:

 

We can see that each Thread maintains a ThreadLocalMap. What is stored in the ThreadLocalMap is a table array with entry as the element. Entry is a key value structure. Key is ThreadLocal and value is the stored value. Similar to the implementation of HashMap, each Thread stores Thread independent values with the help of a hash table. Let's take a look at the definition of entry:

 

static class Entry extends WeakReference<ThreadLocal<?>> {
    /** The value associated with this ThreadLocal. */
    Object value;

    Entry(ThreadLocal<?> k, Object v) {
        super(k);
        value = v;
    }
}
Copy code

The line between ThreadLocal and key here is a dotted line, because the Entry inherits the implementation of WeakReference. When the ThreadLocal Ref is destroyed, the only strong reference to the ThreadLocal instance in the heap disappears. Only the Entry has a weak reference to the ThreadLocal instance. Assuming you know the characteristics of weak references, the ThreadLocal instance here can be GC dropped. At this time, the key in the Entry is null, so the value in the Entry cannot be recycled until the end of the thread. Memory leaks may occur here. How to solve them will be described later.

After knowing the general data structure, let's explore the implementation of ThreadLocal methods:

get method

public T get() {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        return setInitialValue();
    }
Copy code

Looking directly at the code, you can analyze the following steps:

  1. Get the current Thread object, and get the ThreadLocalMap in the Thread through getMap
  2. If the map already exists, take the current ThreadLocal as the key, get the Entry object, and get the value from the Entry
  3. Otherwise, call setInitialValue to initialize.

Let's look at the functions mentioned above:

getMap

 ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }
Copy code

getMap is very simple. It is to return the ThreadLocalMap in the Thread. Jump to the Thread source code. ThreadLocalMap is defined as follows:

ThreadLocal.ThreadLocalMap threadLocals = null;
Copy code

Therefore, ThreadLocalMap is still defined in ThreadLocal. We have already mentioned the Entry definition in ThreadLocalMap. In order to introduce the definition of ThreadLocalMap, we put setInitialValue in front.

setInitialValue

private T setInitialValue() {
    T value = initialValue();
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
    return value;
}
Copy code

Setinitialvalue is called when the Map does not exist

  1. First, call initialValue to generate an initial value. After going deep into the initialValue function, we can see that it returns a null;
  2. Then, the following map is still in get. If the map exists, the map Set, this function will be described later;
  3. If it does not exist, createMap will be called to create ThreadLocalMap. Here, we can explain ThreadLocalMap first.

ThreadLocalMap

The definition of the createMap method is simple:

void createMap(Thread t, T firstValue) {
    t.threadLocals = new ThreadLocalMap(this, firstValue);
}
Copy code

Call the constructor of ThreadLocalMap to generate a map. Let's take a look at the definition of ThreadLocalMap:

static class ThreadLocalMap {
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }

    /**
     * The initial capacity -- MUST be a power of two.
     */
    private static final int INITIAL_CAPACITY = 16;

    /**
     * The table, resized as necessary.
     * table.length MUST always be a power of two.
     */
    private Entry[] table;

    /**
     * The number of entries in the table.
     */
    private int size = 0;

    /**
     * The next size value at which to resize.
     */
    private int threshold; // Default to 0

    /**
     * Set the resize threshold to maintain at worst a 2/3 load factor.
     */
    private void setThreshold(int len) {
        threshold = len * 2 / 3;
    }

    /**
     * Increment i modulo len.
     */
    private static int nextIndex(int i, int len) {
        return ((i + 1 < len) ? i + 1 : 0);
    }

    /**
     * Decrement i modulo len.
     */
    private static int prevIndex(int i, int len) {
        return ((i - 1 >= 0) ? i - 1 : len - 1);
    }
Copy code

ThreadLocalMap is defined as a static class, which contains the main members:

  1. The first is the definition of Entry, which has been mentioned earlier;
  2. Initial capacity_ Capability = 16;
  3. The main data structure is an Entry array table;
  4. size is used to record the actual number of entries in the Map;
  5. Threshold is the upper capacity expansion limit. When the size reaches the threshold, the entire Map needs to be resize d. The initial value of threshold is len * 2 / 3;
  6. nextIndex and prevIndex are used in the following functions to move indexes safely.

The constructor of ThreadLocalMap is as follows:

ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
    table = new Entry[INITIAL_CAPACITY];
    int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
    table[i] = new Entry(firstKey, firstValue);
    size = 1;
    setThreshold(INITIAL_CAPACITY);
}
Copy code

Create an Entry using firstKey and firstValue, calculate the index i, insert the created Entry into the i position in the table, and then set the size and threshold.

Finally, the getEntry function of the get function to complete substantive functions:

map.getEntry

private Entry getEntry(ThreadLocal<?> key) {
    int i = key.threadLocalHashCode & (table.length - 1);
    Entry e = table[i];
    if (e != null && e.get() == key)
        return e;
    else
        return getEntryAfterMiss(key, i, e);
}
Copy code
  1. First, calculate the index position i by calculating the hash%(table.length-1) of the key;
  2. According to the Entry obtained, if the Entry exists and the key of the Entry happens to be equal to ThreadLocal, the Entry object is returned directly;
  3. Otherwise, if the corresponding Entry cannot be found at this location, getEntryAfterMiss will be called.

This insight leads to the getEntryAfterMiss method:

getEntryAfterMiss

private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
    Entry[] tab = table;
    int len = tab.length;

    while (e != null) {
        ThreadLocal<?> k = e.get();
        if (k == key)
            return e;
        if (k == null)
            expungeStaleEntry(i);
        else
            i = nextIndex(i, len);
        e = tab[i];
    }
    return null;
}
Copy code

This method must be combined with the previous step, which is because e is not satisfied= Null & & e.get() == key is reduced to calling getEntryAfterMiss. Therefore, if e is null, getEntryAfterMiss will directly return null. If e is not satisfied with e.get() == key, then enter the while loop. Here is a continuous loop. If e is not empty all the time, then call nextIndex and increase i. two judgments will be made during this process:

  1. If k==key, it means that the required Entry is found and returned directly;
  2. If k==null, the key in the Entry is proved to be null, and the Entry is an expired object. Here, call expungeStaleEntry to clean up the Entry. Here is an answer to the previous problem: when ThreadLocal Ref is destroyed, the ThreadLocal instance will be deleted by the GC because only a weak reference in the Entry points to it. The key of the Entry is missing, and the value may leak memory. In fact, such entries with null keys will be cleaned up during each get and set operation.

Why circular lookup?

Here you can skip to the following set method directly, mainly because of the method of handling hash conflicts. We all know that HashMap uses the zipper method to handle hash conflicts, that is, if there is already an element at a location, it uses a linked list to link the conflicting element behind the element, while ThreadLocal uses the open address method, that is, after a conflict, it places the element to be inserted at the place where the position to be inserted is null, For specific differences between the two methods, please refer to: Main methods for resolving HASH conflicts . Therefore, the above loop is because there may not be an Entry with the key exactly equal to the key we want to find at the first calculated i position. Therefore, we can only continue to loop in the future to find whether the element is inserted later until a null element is found, because if it is inserted, it is also null.

After analyzing the causes of the cycle, you can also go deep into the expungeStaleEntry to see how it is cleaned up.

expungeStaleEntry

private int expungeStaleEntry(int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;

    // expunge entry at staleSlot
    tab[staleSlot].value = null;
    tab[staleSlot] = null;
    size--;

    // Rehash until we encounter null
    Entry e;
    int i;
    for (i = nextIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = nextIndex(i, len)) {
        ThreadLocal<?> k = e.get();
        if (k == null) {
            e.value = null;
            tab[i] = null;
            size--;
        } else {
            int h = k.threadLocalHashCode & (len - 1);
            if (h != i) {
                tab[i] = null;

                // Unlike Knuth 6.4 Algorithm R, we must scan until
                // null because multiple entries could have been stale.
                while (tab[h] != null)
                    h = nextIndex(h, len);
                tab[h] = e;
            }
        }
    }
    return i;
}
Copy code

There are two parts to the above code:

  1. expunge entry at staleSlot: in this section, the value of the Entry at position i is set to null, and the reference of the Entry is set to null, so the system will clean up this memory during GC;
  2. Rehash until we encoder null: this section is the array of entries before null after scanning the position staleSlot. Clear every Entry whose key is null. At the same time, if the key is not empty, rehash and adjust its position.

Why rehash?

Because we will set a value to null during the cleaning process, if the area behind the value is connected with the previous one, then the next time we will look up the value in the loop, we will only find null.

An example is:< Key1 (hASH1), value1>, <key2 (hASH1), value2> (that is, the hash values of key1 and key2 are the same) at this time, if <key3 (hash2) and value3> are inserted, the target location of hash calculation is occupied by <key2 (hASH1) and value2>, so look for the available location later, and the hash table may change to:< Key1 (hASH1), value1>, <key2 (hASH1), value2>, <key3 (hash2), value3> At this time, if <key2 (hASH1), value2> is cleared, obviously <key3 (hash2), value3> should be moved forward (i.e. the position should be adjusted through rehash). Otherwise, if you look up the hash table with Key3, Key3 will not be found

set method

We also described the idea of the set method in the circular search of the get method, that is, the open address method. See the specific code below:

public void set(T value) {
    Thread t = Thread.currentThread();
    ThreadLocalMap map = getMap(t);
    if (map != null)
        map.set(this, value);
    else
        createMap(t, value);
}
Copy code

First, get the current thread. Get the ThreadLocalMap according to the thread. If there is a ThreadLocalMap, call map Set (ThreadLocal <? > key, object value). If not, call createMap to create.

map.set

private void set(ThreadLocal<?> key, Object value) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);
    
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        ThreadLocal<?> k = e.get();
    
        if (k == key) {
            e.value = value;
            return;
        }
    
        if (k == null) {
            replaceStaleEntry(key, value, i);
            return;
        }
    }

    tab[i] = new Entry(key, value);
    int sz = ++size;
    if (!cleanSomeSlots(i, sz) && sz >= threshold)
        rehash();
}
Copy code

Look at the above code:

  1. First, calculate the position I according to the key, and then find the Entry at the position i,
  2. If the Entry already exists and the key is equal to the passed in key, then directly assign a new value to the Entry.
  3. If the Entry exists but the key is null, call replacestateentry to replace the Entry with an empty key
  4. Continue loop detection until a null place is encountered. At this time, if you have not return ed during the loop, create a new Entry at the null position, insert it, and increase the size by 1.
  5. Finally, call cleanSomeSlots. I won't go into details about this function. All you need to know is that the expungeStaleEntry function mentioned above has been called internally to clean up the entries with null key s. Finally, return whether the entries have been cleaned up. Next, judge sz>thresgold. Here is to judge whether the rehash condition has been met. If so, call the rehash function.

There are two functions in the above code that need to be analyzed. The first is

replaceStaleEntry

private void replaceStaleEntry(ThreadLocal<?> key, Object value,
                               int staleSlot) {
    Entry[] tab = table;
    int len = tab.length;
    Entry e;

    for (int i = prevIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = prevIndex(i, len))
        if (e.get() == null)
            slotToExpunge = i;

    for (int i = nextIndex(staleSlot, len);
         (e = tab[i]) != null;
         i = nextIndex(i, len)) {
        ThreadLocal<?> k = e.get();
        if (k == key) {
            e.value = value;

            tab[i] = tab[staleSlot];
            tab[staleSlot] = e;

            // Start expunge at preceding stale entry if it exists
            if (slotToExpunge == staleSlot)
                slotToExpunge = i;
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
            return;
        }

        if (k == null && slotToExpunge == staleSlot)
            slotToExpunge = i;
    }

    // If key not found, put new entry in stale slot
    tab[staleSlot].value = null;
    tab[staleSlot] = new Entry(key, value);

    // If there are any other stale entries in run, expunge them
    if (slotToExpunge != staleSlot)
        cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
}
Copy code

First, we recall that in the previous step, the replacestateentry was called because the key of the Entry in this location was null.

  1. The first for loop: we find the location where the key is null and record it as slotToExpunge. This is for the later cleaning process. We can ignore it;
  2. The second for loop: we start from the staleSlot to the next null. If we find an Entry whose key is equal to the incoming key, we assign a new value to the Entry, exchange it with the Entry at the staleSlot, and then call CleanSomeSlots to clean up the Entry whose key is null.
  3. If there is no Entry with the same key as the passed in key, create an Entry at the staleSlot. The function finally cleans up the Entry of an empty key.

After replacestateentry, another important function is rehash and rehash conditions:

First, call rehash when SZ > threshold:

rehash

private void rehash() {
    expungeStaleEntries();

    // Use lower threshold for doubling to avoid hysteresis
    if (size >= threshold - threshold / 4)
        resize();
}
Copy code

After clearing the entries of empty key s, if the size is greater than the threshold of 3/4, call the resize function:

private void resize() {
    Entry[] oldTab = table;
    int oldLen = oldTab.length;
    int newLen = oldLen * 2;
    Entry[] newTab = new Entry[newLen];
    int count = 0;

    for (int j = 0; j < oldLen; ++j) {
        Entry e = oldTab[j];
        if (e != null) {
            ThreadLocal<?> k = e.get();
            if (k == null) {
                e.value = null; // Help the GC
            } else {
                int h = k.threadLocalHashCode & (newLen - 1);
                while (newTab[h] != null)
                    h = nextIndex(h, newLen);
                newTab[h] = e;
                count++;
            }
        }
    }

    setThreshold(newLen);
    size = count;
    table = newTab;
}
Copy code

From the source code, we can see that the size of each expansion is doubled. Then in a for loop, clear the entries of empty keys, recalculate the hash values of entries whose keys are not empty, put them in the correct positions, and update all properties of ThreadLocalMap.

remove

The last thing to explore is the remove function, which is used to remove an unused Entry from the map. Also, the hash value is calculated first. If the first hit is missed, it will cycle until null. During this process, expungeStaleEntry will also be called to clear the empty key node. The codes are as follows:

private void remove(ThreadLocal<?> key) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
         if (e.get() == key) {
            e.clear();
            expungeStaleEntry(i);
            return;
        }
    }
}
Copy code

Best practices for using ThreadLocal

We found that no matter in the set,get or remove methods, the Entry with null key will be erased during the process, so the value in the Entry will not have a strong reference chain, and will be recycled during GC. So how can there be a memory leak? But the above idea assumes that you call the get or set method, which we haven't called many times, so the best practice is*

1 The user needs to manually call the remove function to delete the ThreadLocal

2 In addition, try to set ThreadLocal to private static, so that ThreadLocal will die together with the thread itself.

 

https://juejin.im/post/5a5efb1b518825732b19dca4

Posted by bow-viper1 on Mon, 30 May 2022 10:08:53 +0530