Analytic Engine Clustering

Analytic Engine clustering distributes the Optimize information processing load across multiple Analytic Engines, either to facilitate system high availability or to maximize Analytic Engine data throughput. Analytic Engine clustering requires a single Terracotta Server or a Terracotta Server Array (TSA) to coordinate system throughput. A TSA is a combined hardware and software solution for managing data processing that facilitates scaling and clustering. It uses in-memory storage and processing rather than a conventional database. Terracotta-based clustering is appropriate for mission-critical implementations and for some systems that experience high levels of data throughput. Depending on your system availability needs and resources, you can configure either a single-node TSA or a distributed TSA.

This section describes important considerations and the potential costs and benefits of implementing Analytic Engine clustering. It also explains how to implement clustering. Planning, implementing, and configuring a TSA involves many variables and considerations that are beyond the scope of this document. For information about planning, configuring, and running a Terracotta Server Array, refer to the Terracotta website and product documentation. For information about installing Terracotta in a webMethods environment, refer to Installing Software AG Products.

Optimize users should understand that clustering is not appropriate for every system configuration, and they should weigh the requirements and effort of implementation against the potential benefits. In general, a non-clustered system is appropriate if you have adequate hardware resources to manage your data load with a single Analytic Engine. A non-clustered configuration is also appropriate if hardware is limited, your system is not mission-critical, and you are not concerned with system downtime in the event of a hardware or software failure.

While clustering can increase system data throughput, it may or may not be an effective means of enhancing system performance, depending on your specific configuration and data volume. In fact, configuring two Analytic Engine nodes typically reduces data throughput on individual machines because of the TSA overhead involved in managing data across nodes, though this throughput penalty should disappear as you add more Analytic Engine nodes.

Clustering is an appropriate option for Optimize users who require system high availability and want to minimize the risk of an Analytic Engine becoming a single point of failure. In a clustered system, data is distributed across all nodes and redundancy is enforced so that failure of a single node will not result in system downtime. For most system configurations, a two-node Analytic Engine cluster provides a reasonable balance of high availability and system throughput. Note that with a TSA-based clustering configuration, you can add Analytic Engine nodes at any time without any system reconfiguration.
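The redundancy principle described above can be illustrated with a small toy model. This is only a sketch under stated assumptions and does not represent how Terracotta or the Analytic Engine actually places data: the function names (place_with_replica, survives_single_failure), the node names, and the round-robin placement rule are all hypothetical. It shows that when every item is held on two different nodes, the loss of any one node still leaves a copy of every item available.

```python
from typing import Dict, List, Set


def place_with_replica(items: List[str], nodes: List[str]) -> Dict[str, Set[str]]:
    """Toy placement (hypothetical): each item gets a primary node and one
    replica on the next node in round-robin order."""
    if len(nodes) < 2:
        raise ValueError("redundancy needs at least two nodes")
    placement: Dict[str, Set[str]] = {}
    for index, item in enumerate(items):
        primary = nodes[index % len(nodes)]
        replica = nodes[(index + 1) % len(nodes)]  # always differs from primary
        placement[item] = {primary, replica}
    return placement


def survives_single_failure(placement: Dict[str, Set[str]], nodes: List[str]) -> bool:
    """Return True if every item still has a holder after any one node fails."""
    for failed in nodes:
        for holders in placement.values():
            if not (holders - {failed}):
                return False
    return True


if __name__ == "__main__":
    # Hypothetical node and KPI instance names, for illustration only.
    nodes = ["ae-1", "ae-2"]
    items = [f"kpi-{i}" for i in range(6)]
    placement = place_with_replica(items, nodes)
    print(survives_single_failure(placement, nodes))
```

In this model, a placement that keeps only one copy of an item fails the check, which is the situation a non-clustered configuration is in with respect to its single Analytic Engine.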

Also note that while a clustered system configured with an appropriate TSA and multiple Analytic Engine nodes can provide a significant degree of system high availability, you must plan for and implement database and Broker high availability separately if your system configuration requires total system high availability. In addition, to meet typical high availability requirements, you must configure a distributed TSA rather than a single-node implementation. A distributed TSA requires appropriate hardware and a fairly complex system configuration.

From a performance perspective, clustering is most effective in maximizing data throughput for load profiles that involve many KPI instances and a small number of events per minute for each KPI instance. The number of rules defined against the KPI instances also affects performance. If you have multiple rules defined against each KPI instance, increasing the number of Analytic Engine nodes will generally maximize data throughput.

Analytic Engine clustering is available for all Optimize environments that use a TSA. To implement clustering, you must set up the appropriate number of machines running Analytic Engines and configure the environment as described in Defining Analytic Engine Cluster Settings. The system will automatically configure the sag.opt.clusterable.caches.xml file located on each computer in the cluster, based on the number of machines/Analytic Engines available.