Key Concepts

Introduction

While delving into how Ehcache caches are constructed and used, this page also serves as an annotated glossary of the basic vocabulary of caching.

Definitions

  • cache-hit: When a data element is requested of the cache and the element exists for the given key, it is referrred to as a cache hit (or simply 'hit').
  • cache-miss: When a data element is requested of the cache and the element does not exist for the given key, it is referred to as a cache miss (or simply 'miss').
  • system-of-record: The core premise of caching assumes that there is a source of truth for the data. This is often referred to as a system-of-record (SOR). The cache acts as a local copy of data retrieved from or stored to the system-of-record.
  • SOR: See system-of-record.

Key Ehcache Classes

Ehcache consists of a CacheManager, which manages caches. Caches contain Elements, which are essentially name value pairs. Caches are physically implemented, either in-memory or on disk.

CacheManager

Creation of, access to, and removal of caches is controlled by the CacheManager.

CacheManager Creation Modes

CacheManager supports two creation modes: singleton and instance.

Ehcache 2.5 and higher does not allow multiple CacheManagers with the same name to exist in the same JVM. CacheManager() constructors creating non-Singleton CacheManagers can violate this rule, causing an error. If your code may create multiple CacheManagers of the same name in the same JVM, avoid this error by using the static CacheManager.create() methods, which always return the named (or default unnamed) CacheManager if it already exists in that JVM. If the named (or default unnamed) CacheManager does not exist, the CacheManager.create() methods create it.

Singleton Mode

Ehcache-1.1 supported only one CacheManager instance which was a singleton. CacheManager can still be used in this way using the static factory methods.

Instance Mode

From ehcache-1.2, CacheManager has constructors which mirror the various static create methods. This enables multiple CacheManagers to be created and used concurrently. Each CacheManager requires its own configuration.

If the Caches under management use only the MemoryStore, there are no special considerations. If Caches use the DiskStore, the diskStore path specified in each CacheManager configuration should be unique. When a new CacheManager is created, a check is made that there are no other CacheManagers using the same diskStore path. If there are, a CacheException is thrown. If a CacheManager is part of a cluster, there will also be listener ports which must be unique.

Mixed Singleton and Instance Mode

If an application creates instances of CacheManager using a constructor, and also calls a static create method, there will exist a singleton instance of CacheManager which will be returned each time the create method is called together with any other instances created via constructor. The two types will coexist peacefully.

Ehcache

All caches implement the Ehcache interface. A cache has a name and attributes. Each cache contains Elements.

A Cache in Ehcache is analogous to a cache region in other caching systems.

Cache elements are stored in the MemoryStore. Optionally they also overflow to a DiskStore.

Element

An element is an atomic entry in a cache. It has a key, a value and a record of accesses. Elements are put into and removed from caches. They can also expire and be removed by the Cache, depending on the Cache settings.

As of ehcache-1.2 there is an API for Objects in addition to the one for Serializable. Non-serializable Objects can use all parts of Ehcache except for DiskStore and replication. If an attempt is made to persist or replicate them they are discarded without error and with a DEBUG level log message.

The APIs are identical except for the return methods from Element. Two new methods on Element: getObjectValue and getKeyValue are the only API differences between the Serializable and Object APIs. This makes it very easy to start with caching Objects and then change your Objects to Seralizable to participate in the extra features when needed. Also a large number of Java classes are simply not Serializable.

Cache Usage Patterns

There are several common access patterns when using a cache. Ehcache supports the following patterns:

  • cache-aside (or direct manipulation)
  • cache-as-sor (a combination of read-through and write-through or write-behind patterns)
  • read-through
  • write-through
  • write-behind (or write-back)

cache-aside

Here, application code uses the cache directly.

This means that application code which accesses the system-of-record (SOR) should consult the cache first, and if the cache contains the data, then return the data directly from the cache, bypassing the SOR.

Otherwise, the application code must fetch the data from the system-of-record, store the data in the cache, and then return it.

When data is written, the cache must be updated with the system-of-record. This results in code that often looks like the following pseudo-code:

public class MyDataAccessClass 
{
  private final Ehcache cache;
  public MyDataAccessClass(Ehcache cache)
  {
    this.cache = cache;
  }

  /* read some data, check cache first, otherwise read from sor */
  public V readSomeData(K key) 
  {
     Element element;
     if ((element = cache.get(key)) != null) {
         return element.getValue();
     }
      // note here you should decide whether your cache
     // will cache 'nulls' or not
     if (value = readDataFromDataStore(key)) != null) {
         cache.put(new Element(key, value));
     } 
     return value;
  }
  /* write some data, write to sor, then update cache */
  public void writeSomeData(K key, V value) 
  {
     writeDataToDataStore(key, value);
     cache.put(new Element(key, value);
  }

cache-as-sor

The cache-as-sor pattern implies using the cache as though it were the primary system-of-record (SOR). The pattern delegates SOR reading and writing activies to the cache, so that application code is absolved of this responsibility.

To implement the cache-as-sor pattern, use a combination of the following read and write patterns:

  • read-through
  • write-through or write-behind

Advantages of using the cache-as-sor pattern are:

  • less cluttered application code (improved maintainability)
  • easily choose between write-through or write-behind strategies on a per-cache basis (use only configuration)
  • allow the cache to solve the "thundering-herd" problem

A disadvantage of using the cache-as-sor pattern is:

  • less directly visible code-path

read-through

The read-through pattern mimics the structure of the cache-aside pattern when reading data. The difference is that you must implement the CacheEntryFactory interface to instruct the cache how to read objects on a cache miss, and you must wrap the Ehcache instance with an instance of SelfPopulatingCache. Compare the appearance of the read-through pattern code to the code provided in the cache-aside pattern. (The full example is provided at the end of this document that includes a read-through and write-through implementation).

write-through

The write-through pattern mimics the structure of the cache-aside pattern when writing data. The difference is that you must implement the CacheWriter interface and configure the cache for write-through or write-behind. A write-through cache writes data to the system-of-record in the same thread of execution, therefore in the common scenario of using a database transaction in context of the thread, the write to the database is covered by the transaction in scope. More details (including configuration settings) can be found in the User Guide chapter on Write-through and Write-behind Caching.

write-behind

The write-behind pattern changes the timing of the write to the system-of-record. Rather than writing to the System of Record in the same thread of execution, write-behind queues the data for write at a later time.

The consequences of the change from write-through to write-behind are that the data write using write-behind will occur outside of the scope of the transaction.

This often-times means that a new transaction must be created to commit the data to the system-of-record that is separate from the main transaction. More details (including configuration settings) can be found in the User Guide chapter on Write-through and Write-behind Caching.

cache-as-sor example

public class MyDataAccessClass 
{
private final Ehcache cache;
public MyDataAccessClass(Ehcache cache)
{
   cache.registerCacheWriter(new MyCacheWriter());
   this.cache = new SelfPopulatingCache(cache);
}
/* read some data - notice the cache is treated as an SOR.  
* the application code simply assumes the key will always be available
*/
public V readSomeData(K key) 
{
   return cache.get(key);
}
/* write some data - notice the cache is treated as an SOR, it is 
* the cache's responsibility to write the data to the SOR. 
*/
public void writeSomeData(K key, V value) 
{
   cache.put(new Element(key, value);
}
/**
* Implement the CacheEntryFactory that allows the cache to provide
* the read-through strategy
*/
private class MyCacheEntryFactory implements CacheEntryFactory
{
   public Object createEntry(Object key) throws Exception
   {
       return readDataFromDataStore(key);
   }    
}
/**
* Implement the CacheWriter interface which allows the cache to provide
* the write-through or write-behind strategy.
*/
private class MyCacheWriter implements CacheWriter 
   public CacheWriter clone(Ehcache cache) throws CloneNotSupportedException;
   {
       throw new CloneNotSupportedException();
   }
    public void init() { }
   void dispose() throws CacheException { } 
    void write(Element element) throws CacheException;
   {
       writeDataToDataStore(element.getKey(), element.getValue());
   }
    void writeAll(Collection elements) throws CacheException
   {
       for (Element element : elements) {
           write(element);
       }
   }
    void delete(CacheEntry entry) throws CacheException
   {
       deleteDataFromDataStore(element.getKey());
   }
    void deleteAll(Collection entries) throws CacheException
   {
       for (Element element : elements) {
           delete(element);
       }
   }
}
}

Copy Cache

A Copy Cache can have two behaviors: it can copy Element instances it returns, when copyOnRead is true and copy elements it stores, when copyOnWrite to true.

A copy on read cache can be useful when you can't let multiple threads access the same Element instance (and the value it holds) concurrently. For example, where the programming model doesn't allow it, or you want to isolate changes done concurrently from each other.

Copy on write also lets you determine exactly what goes in the cache and when. i.e. when the value that will be in the cache will be in state it was when it actually was put in cache. All mutations to the value, or the element, after the put operation will not be reflected in the cache.

A concrete example of a copy cache is a Cache configured for XA. It will always be configured copyOnRead and copyOnWrite to provide proper transaction isolation and clear transaction boundaries (the state the objects are in at commit time is the state making it into the cache). By default, the copy operation will be performed using standard Java object serialization. We do recognize though that for some applications this might not be good (or fast) enough. You can configure your own CopyStrategy which will be used to perform these copy operations. For example, you could easily implement use cloning rather than Serialization.

More information on configuration can be found here: copyOnRead and copyOnWrite cache configuration.