Author: JD Technology, Han Guokai
1.1 How circular dependencies are resolved
1.1.1 The role of the three-level cache
Circular dependencies are a fairly common problem in day-to-day development. Spring handles them for us so that, in most cases, they are resolved without us even noticing.
A simple circular dependency is A depending on B, B depending on C, and C depending on A. If the cycle were not resolved, bean creation would recurse endlessly and the application would eventually fail (for example with an OOM). Not every circular dependency can be resolved, though. Spring can only resolve cycles between singleton beans that are injected through fields or setters. Cycles between beans injected through constructors, or between non-singleton beans, cannot be resolved this way.
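The difference between setter and constructor injection can be seen without Spring at all. The following is a minimal plain-Java sketch; the classes `SetterA`, `SetterB`, `CtorA`, and `CtorB` are hypothetical and exist only for illustration. With setters, both objects can be created first and wired afterwards. With constructors, neither object can be created without the other already existing.

```java
class InjectionStyleDemo {
    // Hypothetical beans wired through setters.
    static class SetterA {
        private SetterB b;
        void setB(SetterB b) { this.b = b; }
        SetterB getB() { return b; }
    }

    static class SetterB {
        private SetterA a;
        void setA(SetterA a) { this.a = a; }
        SetterA getA() { return a; }
    }

    // Hypothetical beans wired through constructors.
    static class CtorA {
        final CtorB b;
        CtorA(CtorB b) { this.b = b; }
    }

    static class CtorB {
        final CtorA a;
        CtorB(CtorA a) { this.a = a; }
    }

    // Setter-style wiring: instantiate both objects first, then close the cycle.
    static SetterA wireWithSetters() {
        SetterA a = new SetterA();
        SetterB b = new SetterB();
        a.setB(b);
        b.setA(a);
        return a;
    }

    public static void main(String[] args) {
        SetterA a = wireWithSetters();
        System.out.println(a.getB().getA() == a); // prints "true"
        // For CtorA/CtorB there is no valid first step:
        // new CtorA(...) needs a finished CtorB, and new CtorB(...) needs a finished CtorA.
    }
}
```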
From the bean creation process described earlier, we know that when a bean is requested, Spring first tries to get it from the cache and only creates it if it is not there. The three-level cache is the key to resolving circular dependencies:
    protected Object getSingleton(String beanName, boolean allowEarlyReference) {
        // Quick check for existing instance without full singleton lock
        // Look in the first-level cache
        Object singletonObject = this.singletonObjects.get(beanName);
        if (singletonObject == null && isSingletonCurrentlyInCreation(beanName)) {
            // Look in the second-level cache
            singletonObject = this.earlySingletonObjects.get(beanName);
            // allowEarlyReference == true means the third-level cache may be consulted; it is true here
            if (singletonObject == null && allowEarlyReference) {
                synchronized (this.singletonObjects) {
                    singletonObject = this.singletonObjects.get(beanName);
                    if (singletonObject == null) {
                        singletonObject = this.earlySingletonObjects.get(beanName);
                        if (singletonObject == null) {
                            // Look in the third-level cache
                            ObjectFactory<?> singletonFactory = this.singletonFactories.get(beanName);
                            if (singletonFactory != null) {
                                singletonObject = singletonFactory.getObject();
                                this.earlySingletonObjects.put(beanName, singletonObject);
                                this.singletonFactories.remove(beanName);
                            }
                        }
                    }
                }
            }
        }
        return singletonObject;
    }
As you can see, the three-level cache is really just three maps:
    /** Cache of singleton objects: bean name to bean instance. */
    private final Map<String, Object> singletonObjects = new ConcurrentHashMap<>(256);

    /** Cache of singleton factories: bean name to ObjectFactory. */
    private final Map<String, ObjectFactory<?>> singletonFactories = new HashMap<>(16);

    /** Cache of early singleton objects: bean name to bean instance. */
    private final Map<String, Object> earlySingletonObjects = new ConcurrentHashMap<>(16);
The role of each of the three cache levels:
- singletonObjects (first level): stores fully initialized beans; a bean taken from this cache can be used directly
- earlySingletonObjects (second level): stores singleton references that were exposed early, that is, beans that have been instantiated but whose properties are not yet populated (the raw bean or, as discussed later, its proxy); used to resolve circular dependencies
- singletonFactories (third level): stores singleton object factories (ObjectFactory instances) rather than beans; used to resolve circular dependencies
1.1.2 The process of resolving a circular dependency
From the earlier discussion, we know that creating a bean consists of three main steps:
- Instantiate the bean
- Populate the bean's properties
- Initialize the bean
Suppose A depends on B and B depends on A. How does Spring use the three-level cache to resolve this cycle?
1. Try to load A from the cache; A is not there
2. Instantiate A (no properties set yet, a semi-finished object)
3. Put the instantiated A into the third-level cache
4. Start populating A's property B
5. Try to load B from the cache; B is not there
6. Instantiate B (no properties set yet, a semi-finished object)
7. Put the instantiated B into the third-level cache
8. Start populating B's property A
9. Try to load A from the cache; A is found (it was added in step 3). Remove A from the third-level cache, put it into the second-level cache, and assign it to B. B's properties are now populated
10. Initialize B, remove B from the third-level cache, and put it into the first-level cache
11. Return to step 4; A's property B is now populated
12. Initialize A and put A into the first-level cache
At this point, both A and B have been fully created.
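The steps above can be sketched as a toy container. This is not Spring's implementation: the class `ToyThreeLevelCache`, the hard-coded beans `A` and `B`, and the use of `Supplier` in place of `ObjectFactory` are all assumptions made for illustration. The three maps and the lookup order mirror the idea of `getSingleton()` shown earlier.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

class ToyThreeLevelCache {
    static class A { B b; }
    static class B { A a; }

    private final Map<String, Object> singletonObjects = new HashMap<>();             // first level
    private final Map<String, Object> earlySingletonObjects = new HashMap<>();        // second level
    private final Map<String, Supplier<Object>> singletonFactories = new HashMap<>(); // third level
    private final Set<String> inCreation = new HashSet<>();

    Object getBean(String name) {
        Object cached = getSingleton(name);
        if (cached != null) {
            return cached;
        }
        inCreation.add(name);
        Object bean = name.equals("a") ? new A() : new B(); // 1. instantiate
        singletonFactories.put(name, () -> bean);           // expose early through a factory
        if (bean instanceof A) {                            // 2. populate properties
            ((A) bean).b = (B) getBean("b");
        } else {
            ((B) bean).a = (A) getBean("a");
        }
        // 3. initialization callbacks would run here
        singletonFactories.remove(name);
        earlySingletonObjects.remove(name);
        singletonObjects.put(name, bean);                   // finished bean goes to the first level
        inCreation.remove(name);
        return bean;
    }

    private Object getSingleton(String name) {
        Object found = singletonObjects.get(name);
        if (found == null && inCreation.contains(name)) {
            found = earlySingletonObjects.get(name);
            if (found == null) {
                Supplier<Object> factory = singletonFactories.remove(name);
                if (factory != null) {
                    found = factory.get();
                    earlySingletonObjects.put(name, found); // move from third to second level
                }
            }
        }
        return found;
    }

    // Returns true when A and B end up referencing each other.
    static boolean demo() {
        ToyThreeLevelCache container = new ToyThreeLevelCache();
        A a = (A) container.getBean("a");
        B b = (B) container.getBean("b");
        return a.b == b && b.a == a;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true"
    }
}
```

In this sketch B never passes through the second level, because nothing asks for B while it is still being created. Only A, which is requested during its own creation, is moved from the third level to the second.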
Use a picture to describe:
This raises a question. In step 9, the A that B receives has only been instantiated; its properties have not been populated and it has not been initialized. A's population and initialization only complete from step 11 onward. So once everything has been created, is the A held by B a semi-finished object or a finished one that works normally? The answer is the finished one. B holds a reference to A, so the A inside B is the very same object as the A that existed before step 11. When A's properties are populated in step 11, the A seen through B is populated as well.
This is the same behavior you see when an object is passed to a method: if the method modifies the object, the caller sees the modification after the method returns. Strictly speaking, Java always passes by value. For an object, the value that is passed is the reference, so the method and the caller share one object, which is why it can be thought of as passing by reference. Mutating that object is visible to both sides, whereas reassigning the parameter inside the method (or passing a primitive) only changes the method's local copy.
    private static void fun3() {
        Student student = new Student();
        student.setName("zhangsan");
        System.out.println(student.getName());
        changeName(student);                 // mutates the shared object
        System.out.println(student.getName());

        String str = "zhangsan";
        System.out.println(str);
        changeString(str);                   // only reassigns the method's local parameter
        System.out.println(str);
    }

    private static void changeName(Student student) {
        student.setName("lisi");
    }

    private static void changeString(String str) {
        str = "lisi";
    }

    // Output:
    // zhangsan
    // lisi
    // zhangsan
    // zhangsan
As the output shows, modifying an object through the passed reference changes the object the caller sees, while reassigning the parameter inside the method does not.
2.2 Why three cache levels are needed
2.2.1 The role of the third-level cache in circular dependencies
Some readers may wonder why three cache levels are needed when two seem to be enough. If proxies are left out of the picture, two levels really can solve the problem. But if the bean being referenced is not an ordinary bean and needs to be proxied, problems appear.
The important point is that when Spring creates a proxied bean, it first instantiates the target object, and normally creates the proxy only after the target object has been initialized.
Let's set aside the question of why there are three levels for a moment and look at what goes wrong with proxies in the process described above.
Going back to the earlier example, suppose A now needs to be proxied, A depends on B, and B needs to be proxied as well. Without special handling, the following happens:
1. Try to load A from the cache; A is not there
2. Instantiate A (no properties set yet, a semi-finished object)
3. Put the instantiated A into the third-level cache
4. Start populating A's property B
5. Try to load B from the cache; B is not there
6. Instantiate B (no properties set yet, a semi-finished object)
7. Put the instantiated B into the third-level cache
8. Start populating B's property A
9. Try to load A from the cache; A is found (it was added in step 3). Remove A from the third-level cache, put it into the second-level cache, and assign it to B. B's properties are now populated
10. Initialize B, remove B from the third-level cache, and put it into the first-level cache
11. Return to step 4; A's property B is now populated
12. Initialize A and put A into the first-level cache
The process is the same as before, but now the object held by B is the plain A, not A's proxy. That is a problem.
Some readers may ask: B holds a reference, so once A has been proxied, won't B end up holding the proxy?
The answer is simply no. A and A's proxy are two different objects at two different addresses in memory, so this situation needs separate handling.
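That a proxy and its target are distinct objects can be checked with a JDK dynamic proxy. The sketch below uses plain `java.lang.reflect.Proxy`, not Spring AOP; the `Greeter` interface and `PlainGreeter` class are hypothetical names chosen for illustration.

```java
import java.lang.reflect.Proxy;

class ProxyIdentityDemo {
    interface Greeter {
        String greet();
    }

    static class PlainGreeter implements Greeter {
        public String greet() {
            return "hello";
        }
    }

    // Wraps the target in a JDK dynamic proxy that decorates the return value.
    static Greeter proxyFor(Greeter target) {
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                (proxy, method, args) -> "[proxied] " + method.invoke(target, args));
    }

    public static void main(String[] args) {
        Greeter raw = new PlainGreeter();
        Greeter proxied = proxyFor(raw);
        System.out.println(raw == proxied);   // prints "false": two different objects
        System.out.println(raw.greet());      // prints "hello"
        System.out.println(proxied.greet());  // prints "[proxied] hello"
    }
}
```

A bean that was handed `raw` keeps calling the undecorated object, no matter when the proxy is created afterwards.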
2.2.2 Solving the proxy object problem
Let's look at how Spring solves this problem.
From the bean creation process, we know that the bean is instantiated first. Right after instantiation, the following code runs:
    // 3. Decide whether the bean should be exposed early, to resolve circular dependencies
    boolean earlySingletonExposure = (mbd.isSingleton() && this.allowCircularReferences &&
            isSingletonCurrentlyInCreation(beanName));
    if (earlySingletonExposure) {
        if (logger.isTraceEnabled()) {
            logger.trace("Eagerly caching bean '" + beanName +
                    "' to allow for resolving potential circular references");
        }
        addSingletonFactory(beanName, () -> getEarlyBeanReference(beanName, mbd, bean));
    }
This is the same code covered in section 2.2.5 of the previous article (see that article for details). It has two main parts.
First, it decides whether the bean needs to be exposed early. The result is the combination of three conditions:
- Whether the bean is a singleton. Beans are singletons by default, and Spring can only resolve circular dependencies for singletons
- Whether circular references are allowed (allowCircularReferences). The default is true, and it can be changed
- Whether the bean is currently in creation. Normally this is true from the moment creation of the bean begins until it finishes
Under normal circumstances all three conditions are true for a bean, so the code enters the branch that calls addSingletonFactory(). That call takes a lambda expression, which is expanded here for readability.
    ObjectFactory<Object> objectFactory = new ObjectFactory<Object>() {
        @Override
        public Object getObject() throws BeansException {
            return getEarlyBeanReference(beanName, mbd, bean);
        }
    };
    addSingletonFactory(beanName, objectFactory);
The code creates an ObjectFactory whose getObject() method returns the result of calling getEarlyBeanReference(beanName, mbd, bean).
Now let's see what addSingletonFactory() does:
    protected void addSingletonFactory(String beanName, ObjectFactory<?> singletonFactory) {
        Assert.notNull(singletonFactory, "Singleton factory must not be null");
        synchronized (this.singletonObjects) {
            if (!this.singletonObjects.containsKey(beanName)) {
                // Add the factory to the third-level cache
                this.singletonFactories.put(beanName, singletonFactory);
                this.earlySingletonObjects.remove(beanName);
                this.registeredSingletons.add(beanName);
            }
        }
    }
The most important line is the one that adds an entry to the third-level cache, and the entry is the objectFactory that was passed in. Note that what is added is not a bean but a factory. This is what actually happens in step 3 of the circular dependency process above.
If there is a circular dependency, A's factory is already in the cache at this point. When another bean needs A as a dependency, the cache lookup from the earlier logic is executed:
    // Look in the third-level cache
    ObjectFactory<?> singletonFactory = this.singletonFactories.get(beanName);
    if (singletonFactory != null) {
        // Get the object from the ObjectFactory added earlier: calling getObject()
        // returns the result of getEarlyBeanReference()
        singletonObject = singletonFactory.getObject();
        this.earlySingletonObjects.put(beanName, singletonObject);
        this.singletonFactories.remove(beanName);
    }
In other words, the object is obtained from the ObjectFactory that was just added: getObject() is called, which returns the result of getEarlyBeanReference().
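One consequence of storing a factory rather than a bean is that the early-reference logic only runs if some other bean actually asks for the reference during creation. The following minimal sketch illustrates this with a counter. It assumes `Supplier` as a stand-in for `ObjectFactory`, and `earlyReference()` is a hypothetical placeholder for `getEarlyBeanReference()`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyEarlyReferenceDemo {
    // Counts how often the (hypothetical) early-reference logic actually runs.
    static final AtomicInteger earlyReferenceCalls = new AtomicInteger();

    static Object earlyReference(Object rawBean) {
        earlyReferenceCalls.incrementAndGet();
        return rawBean; // a real container might return a proxy here
    }

    // Registers a factory for bean "a"; 'cyclic' simulates another bean
    // requesting "a" while it is still being created.
    static int createBean(boolean cyclic) {
        earlyReferenceCalls.set(0);
        Map<String, Supplier<Object>> singletonFactories = new HashMap<>();
        Object raw = new Object();
        singletonFactories.put("a", () -> earlyReference(raw));
        if (cyclic) {
            singletonFactories.remove("a").get(); // the factory is invoked only on demand
        }
        return earlyReferenceCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(createBean(false)); // prints "0": no cycle, factory never invoked
        System.out.println(createBean(true));  // prints "1": cycle, factory invoked once
    }
}
```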
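So let's see why a factory is stored instead of a bean. getEarlyBeanReference() in the bean factory delegates to the post-processors:
    exposedObject = bp.getEarlyBeanReference(exposedObject, beanName);
For a bean that may need a proxy, the post-processor is AbstractAutoProxyCreator, whose getEarlyBeanReference() ends with:
    return wrapIfNecessary(bean, beanName, cacheKey);
Following the call chain, the value finally returned comes from wrapIfNecessary(). That is to say, when instance A is needed by another bean as a dependency, what is returned is actually the result of this method.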
    protected Object wrapIfNecessary(Object bean, String beanName, Object cacheKey) {
        if (StringUtils.hasLength(beanName) && this.targetSourcedBeans.contains(beanName)) {
            return bean;
        }
        if (Boolean.FALSE.equals(this.advisedBeans.get(cacheKey))) {
            return bean;
        }
        if (isInfrastructureClass(bean.getClass()) || shouldSkip(bean.getClass(), beanName)) {
            this.advisedBeans.put(cacheKey, Boolean.FALSE);
            return bean;
        }

        // If the bean needs to be proxied, the proxy is created and returned here
        Object[] specificInterceptors = getAdvicesAndAdvisorsForBean(bean.getClass(), beanName, null);
        if (specificInterceptors != DO_NOT_PROXY) {
            this.advisedBeans.put(cacheKey, Boolean.TRUE);
            Object proxy = createProxy(
                    bean.getClass(), beanName, specificInterceptors, new SingletonTargetSource(bean));
            this.proxyTypes.put(cacheKey, proxy.getClass());
            return proxy;
        }

        this.advisedBeans.put(cacheKey, Boolean.FALSE);
        return bean;
    }
This makes it clear how Spring handles circular dependencies that involve proxied beans.
Going back to the earlier logic: when instance B needs to populate its property A, it looks A up in the cache. If A's factory is there, the factory's getObject() method is called. If A needs to be proxied, the proxy is returned; otherwise the ordinary bean is returned.
Note also that creating the object, then creating the proxy, and then initializing the original object gives the same result as creating the proxy after initialization. This equivalence is what makes it possible to expose the proxy early.
2.2.3 The role of the second-level cache
So what is the second-level cache for?
addSingletonFactory(beanName, () -> getEarlyBeanReference(beanName, mbd, bean));
The source code shows that every call to getObject() goes through getEarlyBeanReference() and on to createProxy(), so every call produces a new proxy object, which violates the singleton contract.
(Many articles online say that new objects are produced because the lambda expression is invoked. In fact, for a bean that is not proxied, no new object is produced: the objectFactory holds the original object, so however many times it is called, it returns the same result. For a proxied bean, however, a proxy is created on each call, so what is produced is a new proxy object, not a new ordinary object. The essential reason for the second-level cache is therefore that createProxy() produces a new proxy every time it is called. As long as there is some place that can return the same proxy object for a given beanName, a second-level cache is not strictly required. You could, for instance, cache the created proxy inside getObject(), but that would be inelegant and out of keeping with Spring's coding standards.)
Object proxy = createProxy( bean.getClass(), beanName, specificInterceptors, new SingletonTargetSource(bean));
For example, suppose A depends on B, B depends on A, C, and D, and C and D each depend on A. Without any handling, after A is instantiated, B obtains A's proxy A1 during its creation, and then C and D obtain new proxies A2 and A3. This clearly violates the singleton contract.
So there needs to be a place to store the object obtained from the factory, and that is the job of the second-level cache:
    if (singletonFactory != null) {
        singletonObject = singletonFactory.getObject();
        // Store the obtained proxy object or ordinary bean
        this.earlySingletonObjects.put(beanName, singletonObject);
        // The factory in the third-level cache is no longer needed
        this.singletonFactories.remove(beanName);
    }
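The effect of this caching step can be sketched in isolation. In the following toy example, `FakeProxy` is a hypothetical stand-in for a proxy that is created anew on every factory call, and `Supplier` stands in for `ObjectFactory`. Calling the factory twice yields two different proxies, whereas going through the early cache yields the same one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class EarlyCacheDemo {
    // Hypothetical proxy type: a new wrapper around the same target on every creation.
    static class FakeProxy {
        final Object target;
        FakeProxy(Object target) { this.target = target; }
    }

    private final Map<String, Object> earlySingletonObjects = new HashMap<>();
    private final Map<String, Supplier<Object>> singletonFactories = new HashMap<>();

    void register(String name, Object raw) {
        singletonFactories.put(name, () -> new FakeProxy(raw));
    }

    // Invokes the factory at most once, then serves the cached early reference.
    Object getEarly(String name) {
        Object early = earlySingletonObjects.get(name);
        if (early == null) {
            Supplier<Object> factory = singletonFactories.remove(name);
            if (factory != null) {
                early = factory.get();
                earlySingletonObjects.put(name, early);
            }
        }
        return early;
    }

    static boolean factoryAloneGivesDistinctProxies() {
        Object raw = new Object();
        Supplier<Object> factory = () -> new FakeProxy(raw);
        return factory.get() != factory.get();
    }

    static boolean cacheGivesSameProxy() {
        EarlyCacheDemo container = new EarlyCacheDemo();
        container.register("a", new Object());
        return container.getEarly("a") == container.getEarly("a");
    }

    public static void main(String[] args) {
        System.out.println(factoryAloneGivesDistinctProxies()); // prints "true"
        System.out.println(cacheGivesSameProxy());              // prints "true"
    }
}
```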
2.2.4 When is the proxy object initialized
I thought that was the end of it, but while working through circular dependencies with proxies, I ran into another question.
Take the same example again, where A and B are both proxied and depend on each other. After B is populated with A's proxy, B is initialized, and then A starts to initialize. But the object being initialized at this point is the original bean A1, not the proxy A2, while B holds the proxy A2. So the original object A1 is initialized and A2 is not. Isn't that a problem?
After a day of searching for information and debugging, I finally found the answer in an article:
No. Whether the proxy class is generated by CGLIB or by the JDK dynamic proxy, the proxy holds a reference to the target object internally. When a method is called on the proxy, the target object's method is what actually gets called. So once A has completed initialization, the proxy object has in effect completed initialization too.
That is to say, once the original object A1 has been initialized, A2 is initialized as well, because A2 is just a wrapper that enhances A1.
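The delegation argument can be checked with a JDK dynamic proxy. This is a minimal sketch, not Spring AOP: the `Service` interface, the `RealService` class, and the `init()` method are hypothetical names. The proxy A2 is created before the target A1 is initialized, yet a call through A2 sees the initialized state, because the proxy forwards every call to A1.

```java
import java.lang.reflect.Proxy;

class DelegatingProxyDemo {
    interface Service {
        String name();
    }

    static class RealService implements Service {
        private String name; // set later, during "initialization"

        void init(String name) {
            this.name = name;
        }

        public String name() {
            return name;
        }
    }

    // Creates a proxy that simply forwards each call to the target.
    static Service proxyFor(Service target) {
        return (Service) Proxy.newProxyInstance(
                Service.class.getClassLoader(),
                new Class<?>[] { Service.class },
                (proxy, method, args) -> method.invoke(target, args));
    }

    static String nameSeenThroughProxyAfterInit() {
        RealService a1 = new RealService(); // original object, not yet initialized
        Service a2 = proxyFor(a1);          // proxy handed out early
        a1.init("ready");                   // only the original object is initialized
        return a2.name();                   // the proxy forwards to the initialized target
    }

    public static void main(String[] args) {
        System.out.println(nameSeenThroughProxyAfterInit()); // prints "ready"
    }
}
```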
2.2.5 When the proxy object is returned
One more point deserves attention. After A1 has been populated and initialized, other beans should depend on A2 from then on, and the object added to the first-level cache should also be A2. So when is A1 replaced by A2?
    // Already evaluated above; normally true
    if (earlySingletonExposure) {
        // 1. false is passed here, so the third-level cache is not consulted.
        //    If the bean was proxied early, the proxy is returned at this point
        Object earlySingletonReference = getSingleton(beanName, false);
        if (earlySingletonReference != null) {
            // 2. Check whether the post-processors changed the object. == means unchanged;
            //    in that case use the early reference (the proxy, if the bean was proxied)
            if (exposedObject == bean) {
                exposedObject = earlySingletonReference;
            }
            else if (!this.allowRawInjectionDespiteWrapping && hasDependentBean(beanName)) {
                String[] dependentBeans = getDependentBeans(beanName);
                Set<String> actualDependentBeans = new LinkedHashSet<>(dependentBeans.length);
                for (String dependentBean : dependentBeans) {
                    if (!removeSingletonIfCreatedForTypeCheckOnly(dependentBean)) {
                        actualDependentBeans.add(dependentBean);
                    }
                }
                if (!actualDependentBeans.isEmpty()) {
                    throw new BeanCurrentlyInCreationException(beanName,
                            "Bean with name '" + beanName + "' has been injected into other beans [" +
                            StringUtils.collectionToCommaDelimitedString(actualDependentBeans) +
                            "] in its raw version as part of a circular reference, but has eventually been " +
                            "wrapped. This means that said other beans do not use the final version of the " +
                            "bean. This is often the result of over-eager type matching - consider using " +
                            "'getBeanNamesForType' with the 'allowEagerInit' flag turned off, for example.");
                }
            }
        }
    }
Two things are worth noting in the code above:
- When B populated its property A, A was looked up through the third-level cache. The object that was found (possibly a proxy) was put into the second-level cache and the third-level entry was removed. So at this point getSingleton(beanName, false) returns the object in the second-level cache.
- The code checks whether the bean was changed while passing through the post-processors. Usually it is not, unless a BeanPostProcessor implementation replaces it explicitly. If the bean is unchanged, the object from the cache is used instead. In other words, if the proxy A2 is in the second-level cache at this point, A2 is returned rather than the original object A1, and an ordinary bean is handled the same way. If the object has been changed, the remaining logic decides whether an error should be reported.
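The decision described in these two points can be condensed into a small function. This is a simplified sketch, not Spring's method: `chooseExposedObject` and its parameters are hypothetical names, and the dependent-bean checks that Spring performs before reporting an error are reduced to a single exception.

```java
class FinalReferenceDemo {
    // rawBean:        the object as originally instantiated (A1)
    // afterInit:      the object returned by initialization and its post-processors
    // earlyReference: what the second-level cache holds (A2 if proxied early), or null
    static Object chooseExposedObject(Object rawBean, Object afterInit, Object earlyReference) {
        if (earlyReference == null) {
            return afterInit;      // nobody asked for the bean early
        }
        if (afterInit == rawBean) {
            return earlyReference; // unchanged by post-processors: publish the early reference
        }
        // Simplification: Spring first checks whether other beans really received the raw bean.
        throw new IllegalStateException("bean was wrapped after its raw version had been injected");
    }

    public static void main(String[] args) {
        Object a1 = new Object(); // original bean
        Object a2 = new Object(); // stand-in for its early proxy
        System.out.println(chooseExposedObject(a1, a1, a2) == a2);   // prints "true"
        System.out.println(chooseExposedObject(a1, a1, null) == a1); // prints "true"
    }
}
```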
3.1 Circular dependency summary
After studying circular dependencies with proxies, I found that the second-level cache is not strictly necessary. Spring could decide whether to create a proxy right after the bean is instantiated. If a proxy is needed, the proxy could be put into the third-level cache; otherwise the ordinary bean could be put there. Then, when another bean comes to fetch it, there would be no need to decide at that point whether to return a proxy.
Later I found that many people online have had the same thought. Drawing on their ideas and my own conclusions, the reasoning is roughly as follows.
There is no difference in efficiency between proxying the object right after instantiation and storing a factory that performs the proxying when the object is requested. One simply does the work up front and the other defers it. So why does Spring choose to defer it? My own thinking is this:
The reason is simple. Since there is no gain in efficiency, why disrupt the creation process of ordinary beans? Circular dependencies are a rare case to begin with, and changing the normal bean creation logic to accommodate them would be putting the cart before the horse. That is the point of storing a factory in the third-level cache and keeping its result in the second-level cache.