Reaction to Failure
In real-life operation, various situations can challenge a running system: hardware failure, network issues, or force majeure, to name a few.
In such situations, the highly available services of a cluster show their strength.
The following topics illustrate how Adabas Cluster reacts to such a challenge.
Loss of Primary Node
This scenario assumes that the primary node is lost and that only the two secondary nodes are still running.
Election of New Primary Node (Quorum)
A new primary node is automatically elected from among the remaining nodes of the cluster. This process is called a quorum. The alphabetically first node that is up to date becomes the new primary (see the CLUSTER_NODE_NAME setting). The decisive factor is the SYNCED status displayed by adaopr; requiring this status avoids a split-brain situation.
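The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not Adabas Cluster code: the function `elect_primary`, the node names, and the `PENDING` status value are hypothetical, and plain string comparison is assumed to stand in for "alphabetically first".

```python
from typing import Dict, Optional


def elect_primary(node_status: Dict[str, str]) -> Optional[str]:
    """Pick the alphabetically first node whose status is SYNCED.

    node_status maps a node name (as configured via CLUSTER_NODE_NAME)
    to its display status. Returns None if no node is up to date.
    Assumption: ordinary string ordering models "alphabetically first".
    """
    candidates = [name for name, status in node_status.items()
                  if status == "SYNCED"]
    return min(candidates) if candidates else None


if __name__ == "__main__":
    # The primary is lost; two secondaries remain (hypothetical names).
    remaining = {"node_c": "SYNCED", "node_b": "SYNCED"}
    print(elect_primary(remaining))  # node_b
```

A node that is not up to date is never a candidate in this sketch, regardless of its name, which mirrors the role of the SYNCED status in the text.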
Secondary Switching to Primary
For the time being, the cluster runs on two nodes. The former secondary node now establishes itself as the primary node. Adabas Cluster does this automatically and fully transparently to the client application.
Clients with open transactions will receive a response 9, plus a subcode (transaction backed out automatically). The clients (or their application logic) have the opportunity to decide whether to re-run the unsuccessful transaction.
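On the client side, the decision to re-run a backed-out transaction could look like the following minimal sketch. It does not use any real Adabas client API: the exception class `TransactionBackedOut` is a hypothetical stand-in for however the client library reports response 9, and `run_transaction` and `max_attempts` are illustrative names.

```python
class TransactionBackedOut(Exception):
    """Hypothetical stand-in for a response 9 reported to the client."""

    def __init__(self, subcode=None):
        super().__init__("response 9: transaction backed out")
        self.response_code = 9
        self.subcode = subcode  # value as delivered by the database


def run_transaction(transaction, max_attempts=3):
    """Call transaction() and re-run it if it was backed out.

    transaction is any callable that performs the complete unit of work,
    so that a re-run starts again from the beginning. The last failure
    is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except TransactionBackedOut:
            if attempt == max_attempts:
                raise
```

Whether a re-run is appropriate depends on the application; the sketch assumes the unit of work can safely be repeated from the start.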
Adding New Secondary - Delta State
Now, a replacement for the missing secondary node can be established. Once this new secondary is running, it receives a Delta-State, which synchronizes it with the other nodes.
At this point, the original 3-node cluster has been restored.