Introduction to and practice with Caffeine, a high-performance cache

Overview

In this article, we will introduce Caffeine, a high-performance cache library for Java. A fundamental difference between a cache and a Map is that a cache evicts stored elements. The eviction policy determines which objects are deleted and when. It directly affects the cache's hit rate, which is a key characteristic of any cache library. Caffeine uses the Window TinyLfu eviction policy, which provides a near-optimal hit rate.
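
To make the difference between a Map and a cache concrete, here is a minimal sketch of a bounded cache built on the JDK's LinkedHashMap. It uses plain least-recently-used (LRU) eviction, which is far simpler than Caffeine's Window TinyLfu and is swapped in here only for illustration. The class name LruCacheSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a tiny LRU cache, NOT Caffeine's Window TinyLfu.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maximumSize;

    LruCacheSketch(int maximumSize) {
        // accessOrder = true keeps entries ordered from least to most recently accessed
        super(16, 0.75f, true);
        this.maximumSize = maximumSize;
    }

    // LinkedHashMap calls this after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maximumSize;
    }
}
```

With a maximum size of 2, inserting a third key evicts whichever key was touched least recently, whereas a plain HashMap would simply keep growing.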

Add dependency

First, add the Caffeine dependency to the pom.xml file:

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>2.5.5</version>
</dependency>

You can find the latest version of Caffeine on Maven Central.

Cache population

Let's focus on Caffeine's three cache population strategies: manual, synchronous, and asynchronous.

First, let's create a DataObject class for storing in the cache:

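class DataObject {
    private final String data;

    // Counts how many objects were created (not thread-safe; for demonstration only)
    private static int objectCounter = 0;

    private DataObject(String data) {
        this.data = data;
    }

    public String getData() {
        return data;
    }

    // Factory method used by the cache examples below
    public static DataObject get(String data) {
        objectCounter++;
        return new DataObject(data);
    }
}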

Manual population

In this strategy, we manually insert values into the cache and retrieve them later.

Let's initialize the cache:

Cache<String, DataObject> cache = Caffeine.newBuilder()
  .expireAfterWrite(1, TimeUnit.MINUTES)
  .maximumSize(100)
  .build();

Now, we can use the getIfPresent method to get the value from the cache. If the value does not exist in the cache, this method returns null:

String key = "A";
DataObject dataObject = cache.getIfPresent(key);
 
assertNull(dataObject);

We can manually insert values into the cache using the put method:

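// Caffeine rejects null values, so create an object before inserting it
dataObject = DataObject.get("Data for A");
cache.put(key, dataObject);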
dataObject = cache.getIfPresent(key);
 
assertNotNull(dataObject);

We can also get a value using the get method, which takes a key and a lambda function as parameters. If the key does not exist in the cache, the function is used to compute the value, which is inserted into the cache and then returned:

dataObject = cache
  .get(key, k -> DataObject.get("Data for A"));
 
assertNotNull(dataObject);
assertEquals("Data for A", dataObject.getData());

The get method performs the computation atomically. This means that the computation happens only once, even if multiple threads request the value at the same time. This is why get is preferable to getIfPresent.
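
The "computed only once" guarantee is the same kind of guarantee that the JDK's ConcurrentHashMap.computeIfAbsent gives. The following sketch uses only the JDK, not Caffeine, to show it: several threads race on the same key and a counter records how often the loader actually ran. The class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration using only the JDK, not Caffeine itself.
class AtomicLoadSketch {
    // Requests the same key from several threads and returns how often the loader ran.
    static int loadsForConcurrentRequests(int threads) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        AtomicInteger loads = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // line the threads up so they race on the same key
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                map.computeIfAbsent("A", k -> {
                    loads.incrementAndGet();
                    return "Data for " + k;
                });
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return loads.get();
    }
}
```

A check-then-act sequence built from getIfPresent followed by put offers no such guarantee: two threads can both see a miss and both compute the value.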

Sometimes we need to manually invalidate some cached values:

cache.invalidate(key);
dataObject = cache.getIfPresent(key);
 
assertNull(dataObject);

Synchronous loading

This strategy takes a function that is used to initialize values, similar to the get method of the manual strategy. Let's see how to use it.

First, we need to initialize the cache:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumSize(100)
  .expireAfterWrite(1, TimeUnit.MINUTES)
  .build(k -> DataObject.get("Data for " + k));

Now we can use the get method to retrieve the value:

DataObject dataObject = cache.get(key);
 
assertNotNull(dataObject);
assertEquals("Data for " + key, dataObject.getData());

We can also use the getAll method to get a set of values:

Map<String, DataObject> dataObjectMap 
  = cache.getAll(Arrays.asList("A", "B", "C"));
 
assertEquals(3, dataObjectMap.size());

Values are retrieved from the initialization function that was passed to the build method. This makes it possible to use the cache as the main facade for accessing values.

Asynchronous loading

This strategy works the same way as the previous one, but it performs operations asynchronously and returns a CompletableFuture that holds the actual value:

AsyncLoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumSize(100)
  .expireAfterWrite(1, TimeUnit.MINUTES)
  .buildAsync(k -> DataObject.get("Data for " + k));

We can use the get and getAll methods in the same way, keeping in mind that they return a CompletableFuture:

String key = "A";
 
cache.get(key).thenAccept(dataObject -> {
    assertNotNull(dataObject);
    assertEquals("Data for " + key, dataObject.getData());
});
 
cache.getAll(Arrays.asList("A", "B", "C"))
  .thenAccept(dataObjectMap -> assertEquals(3, dataObjectMap.size()));

CompletableFuture has a rich and useful API; see its documentation for more information.
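
One way to picture an asynchronous loading cache is as a map from keys to futures. The sketch below uses only the JDK and is not how Caffeine is implemented; the class name AsyncCacheSketch is hypothetical. Callers that ask for the same key share one CompletableFuture and can chain further work onto it without blocking.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration of the idea behind an async loading cache.
class AsyncCacheSketch<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> futures = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    AsyncCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns immediately; the loader runs on CompletableFuture's default async executor.
    CompletableFuture<V> get(K key) {
        return futures.computeIfAbsent(key,
            k -> CompletableFuture.supplyAsync(() -> loader.apply(k)));
    }
}
```

For example, cache.get("B").thenApply(String::toUpperCase) transforms the value once it is available, and join() can be used where blocking is acceptable.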

Evicting elements

Caffeine has three eviction strategies: capacity-based, time-based, and reference-based.

Capacity-based eviction

This type of eviction occurs when the configured size limit of the cache is exceeded. There are two ways to measure the size: counting the objects in the cache or summing their weights.

Let's see how to count the objects in the cache. When the cache is initialized, its size is zero:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumSize(1)
  .build(k -> DataObject.get("Data for " + k));
 
assertEquals(0, cache.estimatedSize());

When we add a value, the size obviously increases:

cache.get("A");
 
assertEquals(1, cache.estimatedSize());

We can add a second value to the cache, which causes the first value to be evicted:

cache.get("B");
cache.cleanUp();
 
assertEquals(1, cache.estimatedSize());

Note that we call the cleanUp method before reading the cache size. Eviction is performed asynchronously, and calling this method runs any pending maintenance, so the eviction has completed by the time we check the size.

We can also pass a weigher function to specify the weight of each cache value:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumWeight(10)
  .weigher((k,v) -> 5)
  .build(k -> DataObject.get("Data for " + k));
 
assertEquals(0, cache.estimatedSize());
 
cache.get("A");
assertEquals(1, cache.estimatedSize());
 
cache.get("B");
assertEquals(2, cache.estimatedSize());

When the total weight exceeds 10, the excess values are evicted from the cache:

cache.get("C");
cache.cleanUp();
 
assertEquals(2, cache.estimatedSize());
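
The arithmetic behind the weighted example can be sketched with plain JDK code. Each entry weighs 5 and the maximum weight is 10, so two entries fit exactly and a third pushes the total to 15. The WeightSketch class and its methods are hypothetical and only do the accounting; they do not model how Caffeine picks which entry to evict.

```java
import java.util.Map;
import java.util.function.ToIntBiFunction;

// Hypothetical illustration of weight accounting only.
class WeightSketch {
    // Sums the weigher's result over all entries.
    static <K, V> long totalWeight(Map<K, V> entries, ToIntBiFunction<K, V> weigher) {
        long total = 0;
        for (Map.Entry<K, V> entry : entries.entrySet()) {
            total += weigher.applyAsInt(entry.getKey(), entry.getValue());
        }
        return total;
    }

    // True when the entries together weigh more than the configured maximum.
    static <K, V> boolean exceeds(Map<K, V> entries, ToIntBiFunction<K, V> weigher, long maximumWeight) {
        return totalWeight(entries, weigher) > maximumWeight;
    }
}
```

Note that estimatedSize in the example above still reports the number of entries, not their combined weight, which is why it returns 2 rather than 10.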

Time-based eviction

This eviction policy is based on the expiration time of the element and has three types:

  • Expire after access - the element expires once the configured time has elapsed since the last read or write.
  • Expire after write - the element expires once the configured time has elapsed since the last write.
  • Custom policy - the expiration time is calculated for each element individually by an Expiry implementation.

Let's configure the expire-after-access policy using the expireAfterAccess method:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .expireAfterAccess(5, TimeUnit.MINUTES)
  .build(k -> DataObject.get("Data for " + k));

To configure the expire-after-write policy, we use the expireAfterWrite method:

cache = Caffeine.newBuilder()
  .expireAfterWrite(10, TimeUnit.SECONDS)
  .weakKeys()
  .weakValues()
  .build(k -> DataObject.get("Data for " + k));

To initialize the custom policy, we need to implement the Expiry interface:

cache = Caffeine.newBuilder().expireAfter(new Expiry<String, DataObject>() {
    @Override
    public long expireAfterCreate(
      String key, DataObject value, long currentTime) {
        // Expiry durations are expressed in nanoseconds
        return value.getData().length() * 1000L;
    }
    @Override
    public long expireAfterUpdate(
      String key, DataObject value, long currentTime, long currentDuration) {
        return currentDuration;
    }
    @Override
    public long expireAfterRead(
      String key, DataObject value, long currentTime, long currentDuration) {
        return currentDuration;
    }
}).build(k -> DataObject.get("Data for " + k));
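
The durations returned by the Expiry methods are interpreted as nanoseconds, so multiplying the data length by 1000 gives a lifetime of only one microsecond per character. If a longer lifetime is intended, TimeUnit can do the conversion. The sketch below assumes one second per character, which is an illustrative choice and not something the original example specifies; the class and method names are hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: one second of lifetime per character of data, in nanoseconds.
class ExpiryDurationSketch {
    static long lifetimeNanos(String data) {
        return TimeUnit.SECONDS.toNanos(data.length());
    }
}
```

A value such as this could be returned from expireAfterCreate in place of the raw multiplication.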

Reference-based eviction

We can configure the cache to allow garbage collection of cached keys and values. To do this, we configure the use of WeakReference for both keys and values, while SoftReference can be configured for values only.

Using WeakReference allows objects to be garbage collected once there are no strong references to them. Using SoftReference allows objects to be garbage collected according to the JVM's global least-recently-used policy. For more details on Java references, see the documentation of the java.lang.ref package.

We use Caffeine.weakKeys(), Caffeine.weakValues(), and Caffeine.softValues() to enable each option:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .expireAfterWrite(10, TimeUnit.SECONDS)
  .weakKeys()
  .weakValues()
  .build(k -> DataObject.get("Data for " + k));
 
cache = Caffeine.newBuilder()
  .expireAfterWrite(10, TimeUnit.SECONDS)
  .softValues()
  .build(k -> DataObject.get("Data for " + k));

Refresh cache

You can configure the cache to automatically refresh elements after a defined period of time. Let's see how to do this using the refreshAfterWrite method:

Caffeine.newBuilder()
  .refreshAfterWrite(1, TimeUnit.MINUTES)
  .build(k -> DataObject.get("Data for " + k));

Here, we should understand the difference between expireAfter and refreshAfter. With the former, when an expired element is requested, execution blocks until the loading function passed to build() has computed the new value.

With the latter, if the element is eligible for refresh, the cache returns the old value and computes the new value asynchronously before inserting it into the cache. At that point, the expiration timer of the refreshed element starts over.
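
The difference can be modelled with a single cache entry and an explicit clock. The following sketch is a simplified model under stated assumptions, not Caffeine's implementation: the class RefreshVsExpireSketch, its parameters, and the unitless time values are all hypothetical, and the current time is passed in so the behaviour is deterministic.

```java
import java.util.concurrent.Executor;
import java.util.function.Supplier;

// Hypothetical single-entry model of the two policies.
class RefreshVsExpireSketch {
    private final Supplier<String> loader;
    private final Executor refreshExecutor;
    private final long refreshAfter;
    private final long expireAfter;
    private String value;
    private long writeTime;

    RefreshVsExpireSketch(Supplier<String> loader, Executor refreshExecutor,
                          long refreshAfter, long expireAfter, long now) {
        this.loader = loader;
        this.refreshExecutor = refreshExecutor;
        this.refreshAfter = refreshAfter;
        this.expireAfter = expireAfter;
        write(loader.get(), now);
    }

    synchronized String get(long now) {
        long age = now - writeTime;
        if (age >= expireAfter) {
            // Expired: the caller waits while the new value is loaded.
            write(loader.get(), now);
            return value;
        }
        String current = value;
        if (age >= refreshAfter) {
            // Stale but not expired: return the old value, reload via the executor.
            refreshExecutor.execute(() -> write(loader.get(), now));
        }
        return current;
    }

    // Storing a value restarts both the refresh and the expiration timers.
    private synchronized void write(String newValue, long now) {
        value = newValue;
        writeTime = now;
    }
}
```

With refreshAfter = 60 and expireAfter = 300, a read at age 70 still returns the old value while a reload is triggered, whereas a read at age 330 has to wait for the reload.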

Statistics

Caffeine can record statistics about cache usage:

LoadingCache<String, DataObject> cache = Caffeine.newBuilder()
  .maximumSize(100)
  .recordStats()
  .build(k -> DataObject.get("Data for " + k));
cache.get("A");
cache.get("A");
 
assertEquals(1, cache.stats().hitCount());
assertEquals(1, cache.stats().missCount());

We can also pass a supplier to recordStats that creates an implementation of StatsCounter. Every statistics-related change is then pushed to this object.
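
To show what the hit and miss counts in the example mean, here is a minimal sketch of the accounting using only the JDK. The first request for a key is a miss that triggers a load, and the second is a hit, giving a hit rate of 0.5. The StatsSketch class is hypothetical, is not thread-safe, and is not how Caffeine records its statistics.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of hit/miss accounting; not thread-safe.
class StatsSketch<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private final Function<K, V> loader;
    private long hits;
    private long misses;

    StatsSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V value = store.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++; // not cached yet: load the value and remember it
        value = loader.apply(key);
        store.put(key, value);
        return value;
    }

    long hitCount() { return hits; }

    long missCount() { return misses; }

    // Fraction of requests served from the cache; 1.0 by convention when there were no requests.
    double hitRate() {
        long requests = hits + misses;
        return requests == 0 ? 1.0 : (double) hits / requests;
    }
}
```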

Summary

In this article, we became familiar with Java's Caffeine cache library. We learned how to configure and populate the cache, and how to select the appropriate expiration or refresh policy as needed.

Tags: Java Spring Spring Boot

Posted by tobeyt23 on Tue, 31 May 2022 08:35:32 +0530