Clusters: An Overview
Universal Messaging provides guaranteed message delivery across public, private, local and wide-area infrastructures.
A Universal Messaging cluster consists of Universal Messaging servers working together to provide high availability and reliability. An individual Universal Messaging server in the cluster is referred to as a cluster node. If one cluster node becomes unavailable, another cluster node takes over the messaging operations. For the clients, a cluster appears to be a single Universal Messaging server. A Universal Messaging cluster protects your messaging system from the following failures, and provides support for business contingency and disaster recovery:
Application and service failures
System and hardware failures
Site failures
Universal Messaging clusters can be created, configured and administered using either Universal Messaging Enterprise Manager, the Universal Messaging administration APIs, or Command Central.
Note:
Some of the industry-standard protocols that are supported by Universal Messaging do not include full support for clustered operation, e.g. clients may not automatically fail over to another server if the server they are connected to fails. For specific protocols please consult their documentation to understand their capabilities and limitations. Universal Messaging's native APIs and JMS provider implementation have full support for clustering.
Clustering Realms
A Universal Messaging cluster is a collection of Universal Messaging realms (servers) that contain common messaging resources such as channels/topics or queues. Each clustered resource exists in every realm within the cluster. Whenever the state of a clustered resource changes on one realm, the state change is updated on all realms in the cluster. For example, if an event is popped from a clustered queue on one realm, it is popped from all realms within the cluster.
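The replication behavior described above can be illustrated with a small toy model. This is only a sketch of the idea that every state change is applied on all realms; it is not how Universal Messaging is implemented, and the class name ToyClusteredQueue and the realm names are hypothetical.

```python
from collections import deque


class ToyClusteredQueue:
    """Toy model: one queue replica per realm, kept identical on every change."""

    def __init__(self, realm_names):
        # One replica of the queue per realm in the cluster.
        self.replicas = {name: deque() for name in realm_names}

    def _check_member(self, realm):
        if realm not in self.replicas:
            raise KeyError(f"unknown realm: {realm}")

    def push(self, realm, event):
        self._check_member(realm)
        # Apply the state change to every replica, not only the local one.
        for replica in self.replicas.values():
            replica.append(event)

    def pop(self, realm):
        self._check_member(realm)
        if not self.replicas[realm]:
            return None
        event = self.replicas[realm][0]
        # Popping on one realm removes the event from all realms.
        for replica in self.replicas.values():
            replica.popleft()
        return event


queue = ToyClusteredQueue(["realm1", "realm2", "realm3"])
queue.push("realm1", "order-1")
queue.push("realm2", "order-2")
first = queue.pop("realm3")
remaining = {name: list(replica) for name, replica in queue.replicas.items()}
```

After the pop on realm3, every realm holds the same remaining event, regardless of which realm the events were pushed to.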
Creating a cluster of Universal Messaging realms ensures that applications that publish or subscribe to channels, or push or pop events from queues, can connect to any of the realms and see the same state. If one of the realms in the cluster is unavailable, client applications can automatically reconnect to any of the other cluster realms and continue from where they left off.
Clustering also offers a convenient way to replicate content between servers and ultimately offers a way to split large numbers of clients over different servers in different physical locations.
Universal Messaging provides built-in support for clustering in the form of Universal Messaging Clusters and Universal Messaging Clusters with Sites. Universal Messaging clients can also use the same clustering functionality to communicate with individual, non-clustered Universal Messaging realms in Shared Storage server configurations (see the section "Shared Storage Configurations" in About Active/Passive Clustering).
From a client perspective, a cluster offers resilience and high availability. Universal Messaging clients automatically move from realm to realm in a cluster as required, or when specific realms within the cluster become unavailable to the client for any reason. The state of all client operations is maintained, so a client that moves to another realm resumes whatever operation it was previously carrying out.
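The client-side failover idea can be sketched as a loop over the realms of a cluster. This is an illustrative sketch only and does not use the Universal Messaging client API: the function connect_with_failover, the stand-in fake_connect and the realm URLs are all hypothetical.

```python
def connect_with_failover(realm_urls, connect):
    """Try each realm in order; return (url, session) for the first that accepts."""
    errors = []
    for url in realm_urls:
        try:
            return url, connect(url)
        except ConnectionError as exc:
            # Remember why this realm was skipped, then try the next one.
            errors.append(f"{url}: {exc}")
    raise ConnectionError("no realm reachable: " + "; ".join(errors))


def fake_connect(url):
    """Hypothetical stand-in for a real connection call; realm1 is down."""
    if url == "nsp://realm1:9000":
        raise ConnectionError("unreachable")
    return f"session@{url}"


# Illustrative realm URLs, assumed for this example only.
urls = ["nsp://realm1:9000", "nsp://realm2:9000", "nsp://realm3:9000"]
chosen_url, session = connect_with_failover(urls, fake_connect)
```

Here the client ends up connected to the second realm because the first one refuses the connection. A real client would additionally restore its subscriptions and other state, which this sketch leaves out.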
The following diagram represents a typical three-realm cluster distributed across three physical locations:
Tip:
Since the underlying purpose of a cluster is to provide resilience and high availability, we advise against running all the servers in a cluster on a single physical or virtual machine in a production environment.
Clustered Resources
A Universal Messaging realm server is a container for a number of messaging resources that can be clustered:
Universal Messaging Channels
JMS Topics
Universal Messaging Queues
JMS Queues
Data Groups
Access Control Lists
Resource Attributes including Type, Capacity, TTL
Client Transactions on Universal Messaging Resources
Within the context of a cluster, a single instance of a channel, topic or queue can exist on every node within the cluster. When this is the case all attributes associated with the resource are also propagated amongst every realm within the cluster. The resource in question can be written to or read from any realm within the cluster.
The basic premise for a Universal Messaging cluster is that it provides a transparent entry point to a collection of realms that share the same resources and are, in effect, a mirror image of each other.
A Universal Messaging cluster achieves this by the implementation of some basic concepts described below:
Configuration Option 1: Universal Messaging Clusters
This approach offers the following features:
Active/Active
Transparent Client Failover
Transparent Realm Failover
Universal Messaging clusters are our recommended solution for high availability and redundancy. State is replicated across all active realms.
With 51% of realms required to form a functioning cluster (see the section Quorum for an explanation of this percentage figure), this is an ideal configuration for fully automatic failover across a minimum of three realms.
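The arithmetic behind the three-realm minimum can be checked with a short sketch. It takes the 51% figure from the text literally; the function names has_quorum and tolerated_failures are hypothetical and are not part of any Universal Messaging API.

```python
def has_quorum(reachable, total, threshold_percent=51):
    """Return True if the reachable realms meet the quorum threshold."""
    if total <= 0 or not 0 <= reachable <= total:
        raise ValueError("need total > 0 and 0 <= reachable <= total")
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return reachable * 100 >= threshold_percent * total


def tolerated_failures(total):
    """Largest number of realms that can fail while the rest keep quorum."""
    failures = 0
    while failures < total and has_quorum(total - failures - 1, total):
        failures += 1
    return failures


two_realms = tolerated_failures(2)    # 1 of 2 is only 50%: no failure tolerated
three_realms = tolerated_failures(3)  # 2 of 3 is about 67%: one failure tolerated
five_realms = tolerated_failures(5)   # 3 of 5 is 60%: two failures tolerated
```

Under this reading, a two-realm cluster cannot lose any realm without losing quorum, whereas a three-realm cluster can lose one, which is why three is the smallest size that allows fully automatic failover.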
Configuration Option 2: Universal Messaging Clusters with Sites
This approach offers the following features:
Active/Active
Transparent Client Failover
Semi-Transparent Realm Failover
Universal Messaging Clusters with Sites provide most of the benefits of Universal Messaging Clusters but with less hardware and occasional manual intervention.
This configuration is designed for two sites, such as Production and Disaster Recovery sites, containing an equal number of realms (for example, one realm on each site or two realms on each site). In such a configuration, if the communication between the sites is lost, neither site can achieve the quorum of 51% of reachable realms required for a functioning cluster. This situation can be resolved by defining one of the sites to be the so-called prime site. If the prime site contains exactly 50% of reachable realms in the cluster, the prime site is allowed to form a functioning cluster. Failover is automatic should the "non-prime" site fail, and requires manual intervention only if the prime site fails. For details of clusters with sites, see the section Clusters with Sites.
Note:
Switching the prime site MUST be a manual operation by an administrator who can confirm that the previous prime site is indeed down and not merely disconnected from the other sites. Attempts to automate this process raise the risk of "split brain" situations, in which loss of data is very likely.
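The prime-site rule described above can be sketched as a small extension of the quorum check. This is a simplified model under the assumptions stated in the text (two sites with equal numbers of realms, a 51% quorum); the function name can_form_cluster and its parameters are hypothetical.

```python
def can_form_cluster(reachable, total, includes_prime_site=False):
    """Return True if the reachable realms may form a functioning cluster."""
    if total <= 0 or not 0 <= reachable <= total:
        raise ValueError("need total > 0 and 0 <= reachable <= total")
    # Normal case: the 51% quorum is met.
    if reachable * 100 >= 51 * total:
        return True
    # Special case: exactly 50% reachable is enough only for the prime site.
    return includes_prime_site and reachable * 2 == total


# Four realms, two per site, and the link between the sites is lost.
prime_side = can_form_cluster(2, 4, includes_prime_site=True)
non_prime_side = can_form_cluster(2, 4, includes_prime_site=False)
```

In this model the prime site carries on alone while the non-prime site does not, so at most one side of a partition stays active. If the prime site itself fails, the surviving site sees the same 50% without the prime flag, which is the case that needs an administrator.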
Configuration Option 3: Shared Storage Configurations
This approach offers the following features:
Active/Passive
Transparent Client Failover
Manual Realm Failover
As an alternative to native Universal Messaging Clusters, Shared Storage configurations (see the section "Shared Storage Configurations" in About Active/Passive Clustering) can be deployed to provide disaster recovery options.
This approach does not make use of Universal Messaging's built-in cluster features. Instead, it allows storage to be shared between multiple realms, of which only one is active at any one time.
In general, we recommend the use of Universal Messaging Clusters or Universal Messaging Clusters with Sites in preference to shared storage configurations.
Planning for Cluster Implementation and Deployment
A Universal Messaging cluster with three or more servers (active/active cluster) is the recommended approach for clustering. This approach supports high availability and resilience, reduces outage during failover, and uses standard (local) disks.
The table below will help you decide on the clustering approach for your Universal Messaging clustering solution.
Clustering approach | Automatic client and server failover? | Vendor-specific cluster software required? | Shared storage required? | Minimum number of servers required |
Active/Active cluster with three or more servers | Yes | No | No (uses standard local disks) | 3 |
Active/Active cluster with sites | Automatic client failover; semi-automatic server failover | No | No (uses standard local disks) | 2 |
Active/Passive cluster with shared storage | Yes | Yes | Yes | 2 |
Note: For an Active/Active cluster with sites, the administrator must manually set the "IsPrime" flag to the other site if the site with the "IsPrime" flag fails.