Cluster Tool
The cluster tool is a command-line utility that allows administrators of the Terracotta Server Array to perform a variety of cluster management tasks. For example, the cluster tool can be used to:
*Configure or re-configure a cluster
*Obtain the status and configuration information of running servers
*Dump the state of running servers
*Stop the running servers
*Take backups of running servers
The cluster tool script is located in tools/cluster-tool/bin under the product installation directory as cluster-tool.bat for Windows platforms, and as cluster-tool.sh for Unix/Linux.
Usage Flow
The following is a typical flow for cluster setup and usage (a condensed command sketch follows the list):
1. Create Terracotta configuration files for each stripe in the deployment. See the section The Terracotta Configuration File for details.
2. Start up the servers in each stripe. See the section Starting and Stopping the Terracotta Server for details.
3. Make sure the stripes are online and ready.
4. Configure the cluster using the configure command of the cluster tool. See the section The "configure" Command below for details.
5. Check the current status of the cluster or specific servers in the cluster using the status command. See the section The "status" Command below for details.
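The following is an illustrative sketch of this flow for a hypothetical two-stripe deployment. The file names, cluster name and host are placeholders, and start-tc-server.sh is the server start script covered in the section Starting and Stopping the Terracotta Server; additional server start options (such as the server name) may be required for your configuration.

# Step 2: start the servers of each stripe with their respective configuration files
./start-tc-server.sh -f ~/tc-config-stripe-1.xml
./start-tc-server.sh -f ~/tc-config-stripe-2.xml

# Step 4: once the stripes are online, configure the cluster
./cluster-tool.sh configure -n tc-cluster -l ~/license.xml ~/tc-config-stripe-1.xml ~/tc-config-stripe-2.xml

# Step 5: check the status of the cluster
./cluster-tool.sh status -n tc-cluster -s localhost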
Cluster Tool commands
The cluster tool provides several commands. To list them and their respective options, run cluster-tool.sh (or cluster-tool.bat on Windows) without any arguments, or use the option -h (long option --help).
The following is a list of options that are common to all commands and must therefore be specified before the command name (an example invocation follows the list of options):
Precursor options
1. -v (long option --verbose)
This option gives verbose output, and is useful for debugging error conditions.
2. -srd (long option --security-root-directory)
This option can be used to communicate with a server which has TLS/SSL-based security configured. For more details on setting up security in a Terracotta cluster, see the section SSL/TLS Security Configuration in Terracotta.
Note: If this option is not specified while trying to connect to a secure cluster, the command will fail with a SECURITY_CONFLICT error.
3. -t (long option --timeout)
This option lets you specify a custom timeout value (in milliseconds) for connections to be established in cluster tool commands.
Note: If this option is not specified, the default value of 30,000 ms (or 30 seconds) is used.
Each command has the option -h (long option --help), which can be used to display the usage for the command.
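The following illustrative invocation shows the precursor options placed before the command name; the timeout value, security root directory path, cluster name and host are placeholders.

# Verbose output, a 60-second connection timeout, and a security root directory,
# all specified before the "status" command name
./cluster-tool.sh -v -t 60000 -srd /path/to/security-root-directory status -n tc-cluster -s localhost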
The following is a comprehensive explanation of the available commands:
The "configure" Command
The configure command creates a cluster from the otherwise independent Terracotta stripes, taking as input a mandatory license key. No functionality is available on the server until a valid license is installed. See the section Licensing for details.
All servers in any given stripe should be started with the same configuration file. The configure command configures the cluster based on the configuration(s) of the currently known active server(s) only. If there is a configuration mismatch between the active and passive server(s) within the same stripe, this command can still configure the cluster, but it will take down any passive server(s) whose configuration does not match. The same validation is also performed upon server restart, and any differences will prevent the server from starting. See the section on the reconfigure command for more information on how to update server configurations.
The command will fail if any of the following checks do not pass:
1. License checks
a. The license is valid.
b. The provided configuration files do not violate the license.
2. Configuration checks
*The provided configuration files are consistent across all the stripes.
The following configuration items are validated in the configuration files:
1. config:
a. offheap-resource
Offheap resources present in one configuration file must be present in all the files with the same sizes.
b. data-directories
Data directory identifiers present in one configuration file must be present in all the files. However, the data directories they map to can differ.
2. service
a. security
Security configuration settings present in one configuration file must match the settings in all the files.
For more details on setting up security in a Terracotta cluster, see the section SSL/TLS Security Configuration in Terracotta.
b. backup-restore
If this element is present in one configuration file, it must be present in all the files.
3. failover-priority
The failover priority setting present in one configuration file must match the setting in all the files.
Refer to the section The Terracotta Configuration File for more information on these elements.
The servers section of the configuration files is also validated. Note that it is not validated between stripes but rather against the configuration used to start the servers themselves.
*server
*host
It must be a strict match
*name
It must be a strict match
*tsa-port
It must be a strict match
Note: Once a cluster is configured, a similar validation will take place upon server restart. If differences are found, the server will fail to start.
Usage:

configure -n CLUSTER-NAME [-l LICENSE-FILE] TC-CONFIG [TC-CONFIG...]
configure -n CLUSTER-NAME [-l LICENSE-FILE] -s HOST[:PORT] [-s HOST[:PORT]]...
Parameters:
*-n CLUSTER-NAME
A name that is to be assigned to the cluster.
*-l LICENSE-FILE
The path to the license file. If you omit this option, the cluster tool looks for a license file named license.xml in the location tools/cluster-tool/conf under the product installation directory.
*TC-CONFIG [TC-CONFIG ...]
A whitespace-separated list of configuration files (minimum 1) that describes the stripes to be added to the cluster.
*-s HOST[:PORT] [-s HOST[:PORT]]...
The host:port(s) or host(s) (default port being 9410) of running servers, each specified using the -s option. Any one server from each stripe can be provided. However, multiple servers from the same stripe will work as well. The cluster will be configured with the configurations which were originally used to start the servers.
Note: The command configures the cluster only once. To update the configuration of an already configured cluster, the reconfigure command should be used.
Examples
*The example below shows a successful execution for a two-stripe configuration and a valid license.
./cluster-tool.sh configure -l ~/license.xml -n tc-cluster
~/tc-config-stripe-1.xml ~/tc-config-stripe-2.xml
Configuration successful
License installation successful

Command completed successfully
*The example below shows a failed execution because of an invalid license.
./cluster-tool.sh configure -l ~/license.xml
-n tc-cluster ~/tc-config-stripe-1.xml ~/tc-config-stripe-2.xml

Error (BAD_REQUEST): com.terracottatech.LicenseException: Invalid license
*The example below shows a failed execution where the two stripe configurations mismatch in their offheap resource sizes.
./cluster-tool.sh configure -n tc-cluster -l
~/license.xml ~/tc-config-stripe-1.xml ~/tc-config-stripe-2.xml

Error (BAD_REQUEST): Mismatched off-heap resources in provided config files:
[[primary-server-resource: 51200M], [primary-server-resource: 25600M]]
The "reconfigure" Command
The reconfigure command updates the configuration of a cluster which was configured using the configure command. With reconfigure, it is possible to:
1. Update the license on the cluster.
2. Add new offheap resources, or grow existing ones.
3. Add new data directories.
4. Add new configuration element types.
The command will fail if any of the following checks do not pass:
1. License checks
a. The new license is valid.
b. The new configuration files do not violate the license.
2. Stripe checks
a. The new configuration files have all the previously configured servers.
b. The order of the configuration files provided in the reconfigure command is the same as the order of stripes in the previously configured cluster.
3. Configuration checks
a. Stripe consistency checks
The new configuration files are consistent across all the stripes. For the list of configuration items validated in the configuration files, refer to the section The "configure" Command above for details.
b. Offheap checks
The new configuration has all the previously configured offheap resources, and the new sizes are not smaller than the old sizes.
c. Data directories checks
The new configuration has all the previously configured data directory names.
d. Configuration type checks
The new configuration has all the previously configured configuration types.
Usage:

reconfigure -n CLUSTER-NAME TC-CONFIG [TC-CONFIG...]
reconfigure -n CLUSTER-NAME -l LICENSE-FILE -s HOST[:PORT] [-s HOST[:PORT]]...
reconfigure -n CLUSTER-NAME -l LICENSE-FILE TC-CONFIG [TC-CONFIG...]
Parameters:
*-n CLUSTER-NAME
The name of the configured cluster.
*TC-CONFIG [TC-CONFIG ...]
A whitespace-separated list of configuration files (minimum 1) that describe the new configurations for the stripes.
*-l LICENSE-FILE
The path to the new license file.
*-s HOST[:PORT] [-s HOST[:PORT]]...
The host:port(s) or host(s) (default port being 9410) of servers, each specified using the -s option.
Servers in the provided list will be sequentially contacted for connectivity, and the command will be executed on the first reachable server.
reconfigure command usage scenarios:
1. License update
When it is required to update the license, most likely because the existing license has expired, the following reconfigure command syntax should be used:
reconfigure -n CLUSTER-NAME -l LICENSE-FILE -s HOST[:PORT] [-s HOST[:PORT]]...
Note: A license update does not require the servers to be restarted.
2. Configuration update
When it is required to update the cluster configuration, the following reconfigure command syntax should be used (an illustrative command sketch for this scenario appears after the list of scenarios):
reconfigure -n CLUSTER-NAME TC-CONFIG [TC-CONFIG...]
The steps below should be followed in order:
a. Update the Terracotta configuration files with the new configuration, ensuring that it meets the reconfiguration criteria mentioned above.
b. Run the reconfigure command with the new configuration files.
c. Restart the servers with the new configuration files for the new configuration to take effect.
3. License and configuration update at once
In the rare event that it is desirable to update the license and the cluster configuration in one go, the following reconfigure command syntax should be used:
reconfigure -n CLUSTER-NAME -l LICENSE-FILE TC-CONFIG [TC-CONFIG...]
The steps to be followed here are the same as those mentioned in the Configuration update section above.
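The following is an illustrative sketch of the configuration update scenario for a hypothetical two-stripe cluster. The file names, cluster name and host are placeholders, and restarting the servers is performed as described in the section Starting and Stopping the Terracotta Server (start-tc-server.sh is shown here for a single server only).

# Step b: run the reconfigure command with the updated configuration files
./cluster-tool.sh reconfigure -n tc-cluster ~/tc-config-stripe-1.xml ~/tc-config-stripe-2.xml

# Step c: stop the servers and restart them with the updated configuration files
./cluster-tool.sh stop -n tc-cluster -s localhost
./start-tc-server.sh -f ~/tc-config-stripe-1.xml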
Examples
*The example below shows a successful re-configuration of a two-stripe cluster tc-cluster with new stripe configurations.
./cluster-tool.sh reconfigure -n tc-cluster
~/tc-config-stripe-1.xml ~/tc-config-stripe-2.xml
License not updated (Reason: Identical to previously installed license)
Configuration successful

Command completed successfully.
*The example below shows a failed re-configuration because of a license violation.
./cluster-tool.sh reconfigure -n tc-cluster
-l ~/license.xml -s localhost:9410

Error (BAD_REQUEST): Cluster offheap resource is not within the limit of the license.
Provided: 409600 MB, but license allows: 102400 MB only
*The example below shows a failed re-configuration of a two-stripe cluster where the new stripe configurations have fewer data directories than the existing configuration.
./cluster-tool.sh reconfigure -n tc-cluster
~/tc-config-stripe-1.xml ~/tc-config-stripe-2.xml

License not updated (Reason: Identical to previously installed license)
Error (CONFLICT): org.terracotta.exception.EntityConfigurationException:
Entity: com.terracottatech.tools.client.TopologyEntity:topology-entity
lifecycle exception:
Entity: com.terracottatech.tools.client.TopologyEntity:topology-entity
lifecycle exception:
Entity: com.terracottatech.tools.client.TopologyEntity:topology-entity
lifecycle exception: org.terracotta.entity.ConfigurationException:
Mismatched data directories. Provided: [use-for-platform, data],
but previously known: [use-for-platform, data, myData]
The "status" Command
The status command displays the status of a cluster, or of particular server(s) in the same or different clusters.
Usage:
status -n CLUSTER-NAME -s HOST[:PORT] [-s HOST[:PORT]]...
status -s HOST[:PORT] [-s HOST[:PORT]]...
Parameters:
*-n CLUSTER-NAME
The name of the configured cluster.
*-s HOST[:PORT] [-s HOST[:PORT]]...
The host:port(s) or host(s) (default port being 9410) of running servers, each specified using the -s option.
When provided with option -n, servers in the provided list will be sequentially contacted for connectivity, and the command will be executed on the first reachable server. Otherwise, the command will be individually executed on each server in the list.
Examples
*The example below shows the execution of a cluster-level status command.
./cluster-tool.sh status -n tc-cluster -s localhost
Cluster name: tc-cluster
Stripes in the cluster: 2
Servers in the cluster: 4
Server{name='server-1', host='localhost', port=9410},
Server{name='server-2', host='localhost', port=9610} (stripe 1)
Server{name='server-3', host='localhost', port=9710},
Server{name='server-4', host='localhost', port=9910} (stripe 2)
Total configured offheap: 102400M
Backup configured: true
SSL/TLS configured: false
IP whitelist configured: false
Data directories configured: data, myData

| STRIPE: 1 |
+--------------------+----------------------+--------------------------+
| Server Name | Host:Port | Status |
+--------------------+----------------------+--------------------------+
| server-1 | localhost:9410 | ACTIVE-COORDINATOR |
| server-2 | localhost:9610 | PASSIVE-STANDBY |
+--------------------+----------------------+--------------------------+

| STRIPE: 2 |
+--------------------+----------------------+--------------------------+
| Server Name | Host:Port | Status |
+--------------------+----------------------+--------------------------+
| server-3 | localhost:9710 | ACTIVE-COORDINATOR |
| server-4 | localhost:9910 | PASSIVE-STANDBY |
+--------------------+----------------------+--------------------------+
*The example below shows the execution of a server-level status command. No server is running at localhost:9510, hence the UNKNOWN status.

./cluster-tool.sh status -s localhost:9410 -s localhost:9510 -s localhost:9910
+----------------------+--------------------------+----------------+
| Host:Port | Status | Cluster |
+----------------------+--------------------------+----------------+
| localhost:9410 | ACTIVE-COORDINATOR | tc-cluster |
| localhost:9510 | UNKNOWN | UNKNOWN |
| localhost:9910 | PASSIVE-STANDBY | tc-cluster |
+----------------------+--------------------------+----------------+

Error (PARTIAL_FAILURE): Command completed with errors.
The "dump" Command
The dump command dumps the state of a cluster, or particular server(s) in the same or different clusters. The dump of each server can be found in its logs.
Usage:
dump -n CLUSTER-NAME -s HOST[:PORT] [-s HOST[:PORT]]...
dump -s HOST[:PORT] [-s HOST[:PORT]]...
Parameters:
*-n CLUSTER-NAME
The name of the configured cluster.
*-s HOST[:PORT] [-s HOST[:PORT]]...
The host:port(s) or host(s) (default port being 9410) of running servers, each specified using the -s option.
When provided with option -n, servers in the provided list will be sequentially contacted for connectivity, and the command will be executed on the first reachable server. Otherwise, the command will be individually executed on each server in the list.
Examples
*The example below shows the execution of a cluster-level dump command.
./cluster-tool.sh dump -n tc-cluster -s localhost:9910
Command completed successfully.
*The example below shows the execution of a server-level dump command. No server is running at localhost:9510, hence the dump failure.
./cluster-tool.sh dump -s localhost:9410 -s localhost:9510 -s localhost:9910
Dump successful for server at: localhost:9410
Connection refused from server at: localhost:9510
Dump successful for server at: localhost:9910
Error (PARTIAL_FAILURE): Command completed with errors.
The "stop" Command
The stop command stops the cluster, or particular server(s) in the same or different clusters.
Usage:
stop -n CLUSTER-NAME -s HOST[:PORT] [-s HOST[:PORT]]...
stop -s HOST[:PORT] [-s HOST[:PORT]]...
Parameters:
*-n CLUSTER-NAME
The name of the configured cluster.
*-s HOST[:PORT] [-s HOST[:PORT]]...
The host:port(s) or host(s) (default port being 9410) of running servers, each specified using the -s option.
When provided with the option -n, servers in the provided list will be sequentially contacted for connectivity, and the command will be executed on the first reachable server. Otherwise, the command will be individually executed on each server in the list.
Examples
*The example below shows the execution of a cluster-level stop command.
./cluster-tool.sh stop -n tc-cluster -s localhost
Command completed successfully.
*The example below shows the execution of a server-level stop command. No server is running at localhost:9510, hence the stop failure.
./cluster-tool.sh stop -s localhost:9410 -s localhost:9510 -s localhost:9910
Stop successful for server at: localhost:9410
Connection refused from server at: localhost:9510
Stop successful for server at: localhost:9910
Error (PARTIAL_FAILURE): Command completed with errors.
The "ipwhitelist-reload" Command
The ipwhitelist-reload command reloads the IP whitelist on a cluster, or particular server(s) in the same or different clusters. See the section Securing TSA Access using a Permitted IP List for details.
Usage:
ipwhitelist-reload -n CLUSTER-NAME -s HOST[:PORT] [-s HOST[:PORT]]...
ipwhitelist-reload -s HOST[:PORT] [-s HOST[:PORT]]...
Parameters:
*-n CLUSTER-NAME
The name of the configured cluster.
*-s HOST[:PORT] [-s HOST[:PORT]]...
The host:port(s) or host(s) (default port being 9410) of running servers, each specified using the -s option.
When provided with the option -n, servers in the provided list will be sequentially contacted for connectivity, and the command will be executed on the first reachable server. Otherwise, the command will be individually executed on each server in the list.
Examples
*The example below shows the execution of a cluster-level ipwhitelist-reload command.
./cluster-tool.sh ipwhitelist-reload -n tc-cluster -s localhost
IP white-list reload successful for server at: localhost:9410
IP white-list reload successful for server at: localhost:9610
IP white-list reload successful for server at: localhost:9710
IP white-list reload successful for server at: localhost:9910
Command completed successfully.
*The example below shows the execution of a server-level ipwhitelist-reload command. No server is running at localhost:9510, hence the IP whitelist reload failure.
./cluster-tool.sh ipwhitelist-reload -s localhost:9410
-s localhost:9510 -s localhost:9910
IP white-list reload successful for server at: localhost:9410
Connection refused from server at: localhost:9510
IP white-list reload successful for server at: localhost:9910
Error (PARTIAL_FAILURE): Command completed with errors.
The "backup" Command
The backup command takes a backup of the running Terracotta cluster.
Usage:
backup -n CLUSTER-NAME -s HOST[:PORT] [-s HOST[:PORT]]...
Parameters:
*-n CLUSTER-NAME
The name of the configured cluster.
*-s HOST[:PORT] [-s HOST[:PORT]]...
The host:port(s) or host(s) (default port being 9410) of running servers, each specified using the -s option.
When provided with the option -n, servers in the provided list will be sequentially contacted for connectivity, and the command will be executed on the first reachable server. Otherwise, the command will be individually executed on each server in the list.
Examples
*The example below shows the successful execution of a cluster-level backup command. Note that the server at localhost:9610 was unreachable.
./cluster-tool.sh backup -n tc-cluster -s localhost:9610 -s localhost:9410

PHASE 0: SETTING BACKUP NAME TO : 996e7e7a-5c67-49d0-905e-645365c5fe28
localhost:9610: TIMEOUT
localhost:9410: SUCCESS
localhost:9710: SUCCESS
localhost:9910: SUCCESS

PHASE (1/4): PREPARE_FOR_BACKUP
localhost:9610: TIMEOUT
localhost:9910: NOOP
localhost:9410: SUCCESS
localhost:9710: SUCCESS

PHASE (2/4): ENTER_ONLINE_BACKUP_MODE
localhost:9710: SUCCESS
localhost:9410: SUCCESS

PHASE (3/4): START_BACKUP
localhost:9710: SUCCESS
localhost:9410: SUCCESS

PHASE (4/4): EXIT_ONLINE_BACKUP_MODE
localhost:9710: SUCCESS
localhost:9410: SUCCESS
Command completed successfully.
*The example below shows the execution of a cluster-level failed backup command.
./cluster-tool.sh backup -n tc-cluster -s localhost:9610
PHASE 0: SETTING BACKUP NAME TO : 93cdb93d-ad7c-42aa-9479-6efbdd452302
localhost:9610: SUCCESS
localhost:9410: SUCCESS
localhost:9710: SUCCESS
localhost:9910: SUCCESS

PHASE (1/4): PREPARE_FOR_BACKUP
localhost:9610: NOOP
localhost:9410: SUCCESS
localhost:9710: SUCCESS
localhost:9910: NOOP

PHASE (2/4): ENTER_ONLINE_BACKUP_MODE
localhost:9410: BACKUP_FAILURE
localhost:9710: SUCCESS

PHASE (CLEANUP): ABORT_BACKUP
localhost:9410: SUCCESS
localhost:9710: SUCCESS
Backup failed as some servers '[Server{name='server-1', host='localhost', port=9410},
[Server{name='server-2', host='localhost', port=9710}]]',
failed to enter online backup mode.
