Startup, Shutdown, Planned Outages and Changes

This section describes how to start up and shut down the Universal Messaging server, and what to consider when planning an outage for server maintenance or applying fixes.

Organization of a Startup

For details on how to start the Universal Messaging server, and on the modes (automatic and manual) in which it can be started, refer to the section Starting the Realm Server of the Universal Messaging Installation Guide.

To check that the realm server has started properly, examine the realm server log file nirvana.log in the <DataDir> location for text similar to the following:

[Tue Nov 27 13:02:27.538 GMT 2018] [ServerStarterThread] Startup:
Realm Server Startup sequence completed
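
The check described above can be scripted. The following is a minimal Python sketch, not part of the product. It assumes that nirvana.log is a plain-text file and that the marker text matches the sample entry above; the function name startup_completed is hypothetical.

```python
from pathlib import Path

# Marker text taken from the sample log entry above; the exact wording
# may differ between product versions (assumption).
STARTUP_MARKER = "Realm Server Startup sequence completed"


def startup_completed(log_path):
    """Return True if the startup marker appears anywhere in the log file."""
    path = Path(log_path)
    if not path.is_file():
        return False
    with path.open(encoding="utf-8", errors="replace") as log:
        return any(STARTUP_MARKER in line for line in log)
```

Call it with the path to nirvana.log in your <DataDir> location. If the log file retains entries from earlier runs, also compare the timestamp of the matching entry with the time of the start attempt.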

Organization of a Shutdown

For details on how to stop the Universal Messaging server, refer to the section Stopping the Realm Server of the Universal Messaging Installation Guide.

To check that the realm server has shut down properly, examine the realm server log file nirvana.log in the <DataDir> location for text similar to the following:

[Tue Nov 27 13:00:30.312 GMT 2018] [ServerStarterThread] Shutdown:
Realm Server completed shutdown sequence, process exiting normally
--------- Log File Closed ---------
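
To tell whether the most recent lifecycle entry in the log is a startup or a shutdown, the log can be scanned for both sample messages shown in this section. The following is a minimal Python sketch under the assumption that the marker texts match those samples; the function name last_lifecycle_event is hypothetical.

```python
from pathlib import Path

# Marker texts taken from the sample log entries in this section;
# the exact wording may differ between product versions (assumption).
STARTUP_MARKER = "Realm Server Startup sequence completed"
SHUTDOWN_MARKER = "Realm Server completed shutdown sequence"


def last_lifecycle_event(log_path):
    """Return "startup", "shutdown", or None for the most recent marker found."""
    last_event = None
    with Path(log_path).open(encoding="utf-8", errors="replace") as log:
        for line in log:
            if SHUTDOWN_MARKER in line:
                last_event = "shutdown"
            elif STARTUP_MARKER in line:
                last_event = "startup"
    return last_event
```

A result of "shutdown" after a stop request suggests that the realm server exited normally. A result of "startup" or None means that the shutdown message has not been written.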

Maintenance Window (Planned Outage)

Consider the following points before shutting down the Universal Messaging realm or cluster for scheduled maintenance:

Clients (other than Integration Server clients) that publish messages are either stopped or suspended.

For Integration Server clients, ensure that the "Enable CSQ" option under Producer Settings in the Universal Messaging connection alias settings is enabled. With this option enabled, Integration Server writes documents published through this connection alias to the client-side queue when the Universal Messaging server is unavailable, and publishes the queued messages when the Universal Messaging server is back online. Client-side queuing can also be enabled by passing "true" to the useCSQ flag when invoking the pub.jms:send and pub.jms:sendAndWait services. Set the "Maximum CSQ Size (messages)" property of the Universal Messaging connection alias settings to a value appropriate for the publishing rate and the length of the planned outage window.

For related information, refer to the description of Creating a Universal Messaging Connection Alias in the chapter Configuring Integration Server for webMethods Messaging in the Integration Server Administrator's Guide. Refer also to the descriptions of pub.jms:send and pub.jms:sendAndWait in the JMS Folder section of the Built-In Services Reference Guide, which is included in the Integration Server documentation set.

All channels and queues are drained or emptied before the clients that consume the messages are stopped or suspended.

Note:
The points above are not recommendations; they are considerations for designing the system and for planning scheduled maintenance.
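
The sizing of the "Maximum CSQ Size (messages)" property mentioned above can be approximated from the publishing rate and the length of the outage window. The following is a minimal Python sketch of that arithmetic. The function name and the safety factor are assumptions made for illustration, not product recommendations.

```python
import math


def estimate_max_csq_size(messages_per_second, outage_minutes, safety_factor=1.5):
    """Estimate how many messages the client-side queue must hold during an outage."""
    if messages_per_second < 0 or outage_minutes < 0 or safety_factor < 1:
        raise ValueError("rate and window must be non-negative; safety factor must be >= 1")
    # Messages expected to accumulate while the server is unavailable.
    expected = messages_per_second * outage_minutes * 60
    # Headroom for bursts and an overrunning window (hypothetical default of 50%).
    return math.ceil(expected * safety_factor)


# Example: 50 messages/s over a 30-minute window with 50% headroom.
print(estimate_max_csq_size(50, 30))  # 50 * 1800 * 1.5 = 135000
```

Use the peak publishing rate rather than the average if publishing is bursty, and check that the client has enough storage for a queue of the estimated size.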

Application of Fixes

Consider the following points when applying a fix to a realm or a cluster of realms:

Before applying the fix, perform the following steps:

1. Back up the realm data and configurations, following the backup procedure.

2. Shut down the realm, following the maintenance window procedure.

Apply the fix to the realm using Software AG Update Manager.

After applying the fix, perform the following steps:

1. Start the realm.

2. Perform a configuration health check by running the HealthChecker tool. The checks to perform are:

DurableSubscriberLargeStoreCheck

FixLevelCheck

JNDIStatusCheck

JoinMismatchCheck

ResourcesSafetyLimitsCheck

ServerProtectionConsistencyCheck

For more information about working with the HealthChecker tool, see Running a Configuration Health Check.

Rolling Update of Servers in a Cluster

A rolling update of the Universal Messaging servers in an active/active cluster enables you to apply a fix to a server in the cluster while the other servers remain online. You shut down only the server to which you are applying the fix and not the whole cluster.

Consider the following information before starting the procedure:

You must apply the fix to all servers in the cluster.

Software AG recommends that you first apply the fix to the secondary nodes and then to the primary node.

To apply a fix to the servers in an active/active cluster, perform the following steps on each node:

1. Shut down the node.

The node disconnects from the cluster.

2. Apply the fix to the node.

3. Start the node.

The node reconnects to the cluster and is operational again.
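
The ordering and the per-node steps above can be sketched as a small driver. The following Python sketch is illustrative only: the four callables (shut_down, apply_fix, start, wait_until_rejoined) are hypothetical hooks that the operator supplies, and waiting for a node to rejoin before moving on is an assumption rather than a documented requirement. The sketch also assumes that the primary node does not change while the secondary nodes are being updated.

```python
def rolling_update_order(nodes, primary):
    """Return the nodes with all secondaries first and the primary last."""
    if primary not in nodes:
        raise ValueError(f"primary {primary!r} is not in the node list")
    secondaries = [node for node in nodes if node != primary]
    return secondaries + [primary]


def rolling_update(nodes, primary, shut_down, apply_fix, start, wait_until_rejoined):
    """Update one node at a time; each callable is supplied by the operator."""
    for node in rolling_update_order(nodes, primary):
        shut_down(node)            # step 1: the node disconnects from the cluster
        apply_fix(node)            # step 2: apply the fix to the node
        start(node)                # step 3: the node reconnects to the cluster
        wait_until_rejoined(node)  # assumption: proceed only once the node is back
```

Updating one node at a time keeps the remaining servers online, and placing the primary node last follows the recommended order.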

Clients connected to the cluster should be able to tolerate a brief loss of connection:

When the node to which a client is connected is shut down, the client loses its connection to the node and the cluster, and must then re-establish the connection.

When the primary node is shut down, the whole cluster briefly goes offline until a new primary node is elected.
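
One generic way for a client to tolerate such a brief loss of connection is to retry with an increasing delay. The following Python sketch shows the pattern only. The connect callable, the ConnectionError it is assumed to raise, and all default values are hypothetical; Universal Messaging client libraries may already provide their own reconnect handling, so check the client documentation before adding logic like this.

```python
import time


def reconnect_with_backoff(connect, max_attempts=5, initial_delay=0.5,
                           max_delay=8.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up and let the caller handle the outage
            sleep(delay)
            delay = min(delay * 2, max_delay)
```

The cap on the delay keeps the client from waiting too long once a new primary node has been elected and the cluster is available again.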