Serializers
All stores except the on-heap one need some form of serialization and deserialization to store and retrieve mappings, because they cannot hold plain Java objects internally, only binary representations of them.
Serializer is the Ehcache abstraction solving this: every cache that has at least one store that cannot store by reference is going to use a pair of Serializer instances, one for the key and another one for the value.
A Serializer is scoped at the cache level and all stores of a cache will be using and sharing the same pair of serializers.
How is a serializer configured?
There are two places where serializers can be configured:
at the cache level where one can use
CacheConfigurationBuilder.withKeySerializer(Class<? extends Serializer<K>> keySerializerClass),
CacheConfigurationBuilder.withKeySerializer(Serializer<K> keySerializer),
CacheConfigurationBuilder.withValueSerializer(Class<? extends Serializer<V>> valueSerializerClass),
and
CacheConfigurationBuilder.withValueSerializer(Serializer<V> valueSerializer),
which allow configuration either by instance or by class.
at the cache manager level where one can use
CacheManagerBuilder.withSerializer(Class<C> clazz, Class<? extends Serializer<C>> serializer)
If a serializer is configured directly at the cache level, it will be used, ignoring any cache manager level configuration.
If a serializer is configured at the cache manager level, then upon initialization a cache with no specifically configured serializer will search its cache manager's list of registered serializers for one that directly matches the cache's key or value type. If that search fails, all the registered serializers will be tried, in the order they were added, to find one that handles a compatible type.
For instance, let's say you have a Person interface and two implementing classes: Employee and Customer. If you configure your cache manager as follows:
CacheManagerBuilder.newCacheManagerBuilder()
    .withSerializer(Employee.class, EmployeeSerializer.class)
    .withSerializer(Person.class, PersonSerializer.class)
then configuring a Cache<Long, Employee> would make it use the EmployeeSerializer while a Cache<Long, Customer> would make it use the PersonSerializer.
A Serializer configured at the cache level by class will not be shared with other caches when instantiated.
Note: Given the above, it is recommended to limit Serializer registration to concrete classes and not aim for generality.
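The lookup order described above (exact type first, then the first compatible registration in insertion order) can be pictured with a small model. This is only a sketch of the rule as stated in this section, not Ehcache's implementation: SerializerLookupSketch, register and find are hypothetical names, and serializers are represented by plain strings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model of the lookup rule only; this is not Ehcache code.
final class SerializerLookupSketch {

    // Domain types from the example in the text.
    interface Person {}
    static final class Employee implements Person {}
    static final class Customer implements Person {}

    // Insertion-ordered map, mirroring "tried in the order they were added".
    private final Map<Class<?>, String> registered = new LinkedHashMap<>();

    SerializerLookupSketch register(Class<?> type, String serializerName) {
        registered.put(type, serializerName);
        return this;
    }

    /** Returns the serializer name chosen for the given type, or null if none fits. */
    String find(Class<?> type) {
        // Step 1: a registration for exactly this type wins.
        String exact = registered.get(type);
        if (exact != null) {
            return exact;
        }
        // Step 2: otherwise take the first registered type that is a super type.
        for (Map.Entry<Class<?>, String> entry : registered.entrySet()) {
            if (entry.getKey().isAssignableFrom(type)) {
                return entry.getValue();
            }
        }
        return null;
    }
}
```

With Employee and Person registered, a lookup for Employee returns the exact match, while Customer falls through to the first compatible registration, which is the one for Person. The second step is why broad registrations can capture types you did not intend.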
Bundled implementations
By default, cache managers are pre-configured with specially optimized Serializer implementations that can handle the following types, in the following order:
java.io.Serializable
java.lang.Long
java.lang.Integer
java.lang.Float
java.lang.Double
java.lang.Character
java.lang.String
byte[]
All bundled Serializer implementations support both persistent and transient caches.
A consequence of providing serializers registered by default is that you will not be able to register a generic Serializer for Number or any other super type and expect it to be picked instead of the default ones for the types listed above.
However, registering a different Serializer for one of the given types means it will be used instead of the default.
Lifecycle: instances vs. class
When a Serializer is configured by providing an instance, it is up to the provider of that instance to manage its lifecycle. The provider must dispose of any resources the serializer holds and, where applicable, persist and reload the serializer's state.
When a Serializer is configured by providing a class, either at the cache or cache manager level, Ehcache is responsible for creating the instance and is therefore also responsible for disposing of it. If the Serializer implements java.io.Closeable, then close() will be called when the cache is closed and the Serializer is no longer needed.
Writing your own serializer
Serializer defines a very strict contract. So if you're planning to write your own implementation you have to keep in mind that the class of the serialized object MUST be retained after deserialization, that is:
object.getClass().equals(mySerializer.read(mySerializer.serialize(object)).getClass())
This is especially important when you are planning to write a serializer for an abstract type, e.g. a serializer of type com.pany.MyInterface should
deserialize a com.pany.MyClassImplementingMyInterface when the serialized object is of class com.pany.MyClassImplementingMyInterface
return a com.pany.AnotherClassImplementingMyInterface object when the serialized object is of class com.pany.AnotherClassImplementingMyInterface
Implement the following interface, from package org.ehcache.spi.serialization:
package org.ehcache.spi.serialization;

import java.nio.ByteBuffer;

/**
 * Defines the contract used to transform type instances to and from a serial form.
 * <P>
 * Implementations must be thread-safe.
 * </P>
 * <P>
 * When used within the default serialization provider, there are additional requirements.
 * The implementations must define either or both of the two constructors:
 * <dl>
 *   <dt><code><i>Serializer</i>(ClassLoader loader)</code>
 *   <dd>This constructor is used to initialize the serializer for transient caches.
 *   <dt><code><i>Serializer</i>(ClassLoader loader, org.ehcache.core.spi.service.FileBasedPersistenceContext context)</code>
 *   <dd>This constructor is used to initialize the serializer for persistent caches.
 * </dl>
 * The {@code ClassLoader} value may be {@code null}. If not {@code null}, the class loader
 * instance provided should be used during deserialization to load classes needed by the
 * deserialized objects.
 * </P>
 * <p>
 * The serialized object's class must be preserved; deserialization of the serial form of an
 * object must return an object of the same class. The following contract must always be true:
 * <p>
 * <code>object.getClass().equals( mySerializer.read(mySerializer.serialize(object)).getClass() )</code>
 * </p>
 * </p>
 *
 * @param <T> the type of the instances to serialize
 *
 * @see SerializationProvider
 */
public interface Serializer<T> {

  /**
   * Transforms the given instance into its serial form.
   *
   * @param object the instance to serialize
   *
   * @return the binary representation of the serial form
   *
   * @throws SerializerException if serialization fails
   */
  ByteBuffer serialize(T object) throws SerializerException;

  /**
   * Reconstructs an instance from the given serial form.
   *
   * @param binary the binary representation of the serial form
   *
   * @return the de-serialized instance
   *
   * @throws SerializerException if reading the byte buffer fails
   * @throws ClassNotFoundException if the type to de-serialize to cannot be found
   */
  T read(ByteBuffer binary) throws ClassNotFoundException, SerializerException;

  /**
   * Checks if the given instance and serial form {@link Object#equals(Object) represent} the same instance.
   *
   * @param object the instance to check
   * @param binary the serial form to check
   *
   * @return {@code true} if both parameters represent equal instances, {@code false} otherwise
   *
   * @throws SerializerException if reading the byte buffer fails
   * @throws ClassNotFoundException if the type to de-serialize to cannot be found
   */
  boolean equals(T object, ByteBuffer binary) throws ClassNotFoundException, SerializerException;
}
As the Javadoc states, there are some constructor rules; see the section Persistent vs. transient caches for details.
You can optionally implement java.io.Closeable. If you do, Ehcache will call close() when a cache using such a serializer gets disposed of, but only if Ehcache instantiated the serializer itself.
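To make the class-preservation rule concrete, here is a minimal sketch of a serializer for an abstract type that writes a one-byte type tag so that read() can rebuild the same concrete class. Everything in it is an assumption made for illustration: Shape, Circle and Square are hypothetical domain types, and SketchSerializer is a local stand-in that mirrors the three methods of org.ehcache.spi.serialization.Serializer so the example compiles without Ehcache on the classpath. In real code you would implement the Ehcache interface and report failures with SerializerException.

```java
import java.nio.ByteBuffer;

// Sketch only: SketchSerializer is a stand-in for org.ehcache.spi.serialization.Serializer.
final class ShapeSerializerSketch {

    interface SketchSerializer<T> {
        ByteBuffer serialize(T object);
        T read(ByteBuffer binary) throws ClassNotFoundException;
        boolean equals(T object, ByteBuffer binary) throws ClassNotFoundException;
    }

    // Hypothetical domain types: one interface, two concrete classes.
    interface Shape {}

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override public boolean equals(Object o) {
            return o instanceof Circle && ((Circle) o).radius == radius;
        }
        @Override public int hashCode() { return Double.hashCode(radius); }
    }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override public boolean equals(Object o) {
            return o instanceof Square && ((Square) o).side == side;
        }
        @Override public int hashCode() { return Double.hashCode(side); }
    }

    // Stateless, hence thread-safe.
    static final class ShapeSerializer implements SketchSerializer<Shape> {
        private static final byte CIRCLE = 1;
        private static final byte SQUARE = 2;

        // Constructor shape required when Ehcache instantiates the class itself.
        public ShapeSerializer(ClassLoader loader) {
            // No class is loaded by name here, so the loader is not used.
        }

        @Override
        public ByteBuffer serialize(Shape shape) {
            ByteBuffer buffer = ByteBuffer.allocate(1 + Double.BYTES);
            if (shape instanceof Circle) {
                buffer.put(CIRCLE).putDouble(((Circle) shape).radius);
            } else if (shape instanceof Square) {
                buffer.put(SQUARE).putDouble(((Square) shape).side);
            } else {
                throw new IllegalArgumentException("Unsupported shape: " + shape.getClass());
            }
            buffer.flip();
            return buffer;
        }

        @Override
        public Shape read(ByteBuffer binary) throws ClassNotFoundException {
            ByteBuffer view = binary.duplicate(); // leave the caller's position untouched
            byte tag = view.get();
            double value = view.getDouble();
            switch (tag) {
                case CIRCLE: return new Circle(value);
                case SQUARE: return new Square(value);
                default: throw new ClassNotFoundException("Unknown type tag: " + tag);
            }
        }

        @Override
        public boolean equals(Shape shape, ByteBuffer binary) throws ClassNotFoundException {
            return shape.equals(read(binary));
        }
    }
}
```

The tag is what keeps a serialized Circle from coming back as any other Shape. A serializer that always rebuilt one fixed implementation would break the contract stated above.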
ClassLoaders
When Ehcache instantiates a serializer itself, it will pass it a ClassLoader via the constructor. This class loader must be used to access the classes of the serialized types, as they might not be available in the current class loader.
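One common way to honor the supplied class loader, when the serial form is standard Java serialization, is to override ObjectInputStream.resolveClass. The sketch below assumes that approach; LoaderAwareSerializerSketch is a hypothetical name, it does not implement the Ehcache interface, and it uses IllegalStateException where a real serializer would throw SerializerException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.nio.ByteBuffer;

// Sketch only: resolves classes through the loader passed to the constructor.
final class LoaderAwareSerializerSketch<T extends Serializable> {

    private final ClassLoader loader; // may be null, as the Serializer Javadoc allows

    public LoaderAwareSerializerSketch(ClassLoader loader) {
        this.loader = loader;
    }

    public ByteBuffer serialize(T object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
        return ByteBuffer.wrap(bytes.toByteArray());
    }

    @SuppressWarnings("unchecked")
    public T read(ByteBuffer binary) throws ClassNotFoundException {
        byte[] data = new byte[binary.remaining()];
        binary.duplicate().get(data); // copy without moving the caller's position
        try (ObjectInputStream in = new LoaderAwareInputStream(new ByteArrayInputStream(data), loader)) {
            return (T) in.readObject();
        } catch (IOException e) {
            throw new IllegalStateException("Deserialization failed", e);
        }
    }

    private static final class LoaderAwareInputStream extends ObjectInputStream {
        private final ClassLoader loader;

        LoaderAwareInputStream(InputStream in, ClassLoader loader) throws IOException {
            super(in);
            this.loader = loader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc) throws IOException, ClassNotFoundException {
            if (loader == null) {
                return super.resolveClass(desc);
            }
            try {
                // Prefer the loader supplied at construction time.
                return Class.forName(desc.getName(), false, loader);
            } catch (ClassNotFoundException e) {
                // Fall back to the default resolution, which also covers primitive types.
                return super.resolveClass(desc);
            }
        }
    }
}
```

Because Java serialization records the concrete class in the stream, this style of serializer also satisfies the class-preservation contract without extra work.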
Persistent vs. transient caches
All custom serializers must have a constructor with the following signature:
public MySerializer(ClassLoader classLoader) {
}
Attempting to configure a serializer that lacks such a constructor on a cache using either of CacheConfigurationBuilder.withKeySerializer(Class<? extends Serializer<K>> keySerializerClass) or CacheConfigurationBuilder.withValueSerializer(Class<? extends Serializer<V>> valueSerializerClass) will cause an exception upon cache initialization.
But if an instance of the serializer is configured using either of CacheConfigurationBuilder.withKeySerializer(Serializer keySerializer) or CacheConfigurationBuilder.withValueSerializer(Serializer valueSerializer) it will work since the instantiation is done by the user code itself.
Registering a serializer that lacks such a constructor at the cache manager level will prevent it from being chosen for caches.
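One way to picture why the constructor matters is reflective instantiation: a framework that is handed only a class has to look up a known constructor signature before it can create an instance. The sketch below is an assumption about the general mechanism, not a copy of Ehcache's code. ConstructorCheckSketch and its nested classes are hypothetical names.

```java
import java.lang.reflect.Constructor;

// Sketch only: shows how a missing (ClassLoader) constructor surfaces at instantiation time.
final class ConstructorCheckSketch {

    public static final class WithLoaderConstructor {
        public WithLoaderConstructor(ClassLoader loader) { }
    }

    public static final class WithoutLoaderConstructor {
        public WithoutLoaderConstructor() { }
    }

    /** Creates an instance through a public constructor taking a single ClassLoader. */
    static <T> T instantiate(Class<T> type, ClassLoader loader) throws ReflectiveOperationException {
        // Throws NoSuchMethodException when the class lacks the expected constructor.
        Constructor<T> constructor = type.getConstructor(ClassLoader.class);
        return constructor.newInstance(loader);
    }

    /** Returns true if the class can be created by class alone. */
    static boolean canInstantiate(Class<?> type) {
        try {
            instantiate(type, ConstructorCheckSketch.class.getClassLoader());
            return true;
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }
}
```

Configuring by instance sidesteps this lookup entirely, because user code has already called whatever constructor it wanted.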
Custom serializer implementations could have some state that is used in the serialization/deserialization process. When configured on a persistent cache, the state of such serializers needs to be persisted across restarts.
To address these requirements you can implement StatefulSerializer, a specialized Serializer with an additional init method that has the following signature:
public void init(StateRepository repository) {
}
The method StateRepository.getPersistentStateHolder(String name, Class<K> keyClass, Class<V> valueClass, Predicate<Class<?>> isClassPermitted, ClassLoader classLoader) returns a StateHolder (a map-like structure) that you can use to store any relevant state. Here name is the name of the StateHolder, which maps objects of keyClass to objects of valueClass. The Predicate isClassPermitted authorizes the classes that may be deserialized as part of a key or value. If a Class fails the isClassPermitted test, a RuntimeException is thrown. Deserialization uses the given ClassLoader to resolve classes.
Note: StateRepository.getPersistentStateHolder(String name, Class<K> keyClass, Class<V> valueClass) has been deprecated in favour of the above method, which additionally takes isClassPermitted and classLoader as parameters.
The StateRepository is provided by the authoritative tier of the cache and hence has the same persistence properties as that tier. For persistent caches it is highly recommended to store all state in these holders: Ehcache then takes care of persisting it, so users do not have to handle that themselves.
In the case of a disk persistent cache, the contents of the state holder will be persisted locally to disk.
For clustered caches, the contents are persisted in the cluster itself so that other clients using the same cache can also access the contents of the state holder.
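The following sketch illustrates what keeping serializer state in a map-like holder can look like. It is a simplified model under stated assumptions: a ConcurrentMap stands in for the StateHolder that a real StatefulSerializer would obtain from StateRepository in init(...), the constructor plays the role of init, and InterningSerializerSketch is a hypothetical serializer that replaces each distinct string with a small integer id.

```java
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch only: the ConcurrentMap passed in stands in for a StateHolder<Integer, String>.
final class InterningSerializerSketch {

    private final ConcurrentMap<Integer, String> holder;
    // Local reverse index; rebuilt from the holder, never persisted on its own.
    private final ConcurrentMap<String, Integer> idsByLabel = new ConcurrentHashMap<>();

    // Plays the role of StatefulSerializer.init(StateRepository) in this model.
    InterningSerializerSketch(ConcurrentMap<Integer, String> holder) {
        this.holder = holder;
        // Reload whatever state an earlier instance left behind.
        for (Map.Entry<Integer, String> entry : holder.entrySet()) {
            idsByLabel.put(entry.getValue(), entry.getKey());
        }
    }

    ByteBuffer serialize(String label) {
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES);
        buffer.putInt(idFor(label));
        buffer.flip();
        return buffer;
    }

    String read(ByteBuffer binary) {
        int id = binary.duplicate().getInt(); // leave the caller's position untouched
        String label = holder.get(id);
        if (label == null) {
            throw new IllegalStateException("Unknown id: " + id);
        }
        return label;
    }

    private int idFor(String label) {
        Integer known = idsByLabel.get(label);
        if (known != null) {
            return known;
        }
        // Claim the first free id. putIfAbsent is atomic, so two writers sharing
        // the holder can never bind the same id to different labels.
        for (int candidate = 0; ; candidate++) {
            String existing = holder.putIfAbsent(candidate, label);
            if (existing == null || existing.equals(label)) {
                idsByLabel.put(label, candidate);
                return candidate;
            }
        }
    }
}
```

Creating a second instance over the same map models a restart: data written by the first instance stays readable because the id mapping lives in the holder rather than in the serializer object.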