Infrastructure Metrics
Infrastructure metrics include system metrics and container metrics. For information about container metrics, see Container Metrics.
System Metrics
Monitor the following metrics to analyze the health of the Terracotta server:
*CPU usage
*Disk usage
*Memory usage
If a metric exceeds its threshold value, note the severity listed below and perform the actions that Software AG recommends to identify and debug the problem. Contact Software AG if you need further support.
Note:
The threshold values, configurations, and severities mentioned throughout this section are guidelines that Software AG suggests for optimal performance of API Gateway. You can modify these thresholds or define actions based on your operational requirements.
To generate a thread dump and a heap dump for monitoring the system metrics, see Troubleshooting: Monitoring Terracotta Server Array.
Monitor
Description
CPU usage
If the CPU usage of the system is above the recommended threshold value, consider the severity as mentioned:
Above 80% threshold for 15 minutes continuously, Severity: WARNING
Above 90% threshold for 15 minutes continuously, Severity: CRITICAL
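The sustained-threshold rule above can be illustrated with a minimal sketch. It assumes that CPU samples are collected elsewhere as (timestamp in seconds, CPU percent) pairs, oldest first. The function name `cpu_severity` and the sample format are hypothetical and are not part of API Gateway or Terracotta.

```python
from typing import Optional, Sequence, Tuple

WINDOW_SECONDS = 15 * 60   # thresholds apply to 15 continuous minutes
WARNING_PERCENT = 80.0
CRITICAL_PERCENT = 90.0


def cpu_severity(samples: Sequence[Tuple[float, float]]) -> Optional[str]:
    """Classify CPU samples given as (timestamp_seconds, cpu_percent), oldest first.

    Returns "CRITICAL", "WARNING", or None. A severity is reported only when
    the samples span at least 15 minutes and every sample in the most recent
    15-minute window is above the corresponding threshold.
    """
    if not samples:
        return None
    latest = samples[-1][0]
    # Not enough history yet to claim 15 continuous minutes.
    if latest - samples[0][0] < WINDOW_SECONDS:
        return None
    window = [pct for ts, pct in samples if latest - ts <= WINDOW_SECONDS]
    lowest = min(window)
    if lowest > CRITICAL_PERCENT:
        return "CRITICAL"
    if lowest > WARNING_PERCENT:
        return "WARNING"
    return None
```

Using the lowest sample in the window means that a single dip below the threshold resets the alert, which is one reasonable reading of "continuously"; adjust this if your monitoring tool averages samples instead.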
The steps to identify the causes of higher CPU usage are as follows:
1. Identify the process that consumes the highest CPU.
2. Generate the thread dump.
3. Analyze the thread dump and logs to identify the problem.
4. Monitor the process closely. If the process fails, it should be restarted.
5. Check if the active-passive quorum is intact using the following script: SAGInstallDirectory/Terracotta/server/bin/server-stat.sh
6. Check if API Gateway clients can establish a connection to the Terracotta cluster using the following REST endpoint: GET /rest/apigateway/health/engine
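The health endpoint check in the last step can be scripted. The following is a minimal sketch that assumes an HTTP 200 response indicates a healthy engine and that no authentication is required; verify both for your installation. The function names and the `fetch` parameter are hypothetical helpers, and the base URL in the usage note is a placeholder.

```python
import urllib.error
import urllib.request
from typing import Callable

HEALTH_PATH = "/rest/apigateway/health/engine"  # endpoint named in the text


def http_status(url: str, timeout: float = 10.0) -> int:
    """Send a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status


def engine_is_healthy(base_url: str, fetch: Callable[[str], int] = http_status) -> bool:
    """Return True when the health endpoint answers with HTTP 200 (assumption)."""
    url = base_url.rstrip("/") + HEALTH_PATH
    try:
        return fetch(url) == 200
    except OSError:
        # Connection refused, DNS failure, or timeout: treat as unhealthy.
        return False
```

For example, `engine_is_healthy("http://apigateway.example.com:5555")` returns False when the request fails or the endpoint answers with a status other than 200. Replace the host and port with those of your API Gateway instance.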
Disk usage
If the disk usage of the Terracotta server is high, rotate logs based on a fixed size and limit the number of rotated files that are retained.
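The rotation policy described above (roll over at a fixed size, keep a fixed number of old files) can be illustrated with Python's standard `logging.handlers.RotatingFileHandler`. This sketch only demonstrates the policy; it does not configure Terracotta logging, and the helper name `build_rotating_logger` and its parameters are hypothetical.

```python
import logging
import logging.handlers


def build_rotating_logger(path: str, max_bytes: int, backup_count: int) -> logging.Logger:
    """Create a logger that rolls over at max_bytes and keeps backup_count old files."""
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(f"rotation-demo:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo output out of the root logger
    logger.addHandler(handler)
    return logger
```

With this policy, the disk space used by logs is bounded by roughly `max_bytes * (backup_count + 1)`, because the oldest rotated file is deleted on each rollover.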
Memory usage
If the memory usage is above the recommended threshold value, consider the severity as mentioned:
Above 80% threshold, Severity: WARNING
Above 90% threshold, Severity: CRITICAL
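The memory thresholds above can be applied to a host reading as in the following minimal sketch. It assumes a Linux host and treats "used" memory as `MemTotal` minus `MemAvailable` from `/proc/meminfo`, which is one common convention and not something the product prescribes. Both function names are hypothetical.

```python
from typing import Optional

WARNING_PERCENT = 80.0
CRITICAL_PERCENT = 90.0


def used_memory_percent(meminfo_text: str) -> float:
    """Compute used memory as a percentage from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])  # numeric value (kB for memory fields)
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (total - available) / total


def memory_severity(percent_used: float) -> Optional[str]:
    """Map a usage percentage to the severities listed in this section."""
    if percent_used > CRITICAL_PERCENT:
        return "CRITICAL"
    if percent_used > WARNING_PERCENT:
        return "WARNING"
    return None
```

On a Linux host, the input text can be read with `open("/proc/meminfo").read()`.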
The steps to identify the causes of higher memory usage are as follows:
*Identify the process that consumes the most memory.
*Start the Terracotta Management Console (TMC) and check the heap usage, off-heap usage and warnings.
*Analyze the memory dump and Terracotta logs to identify the issue.
*Monitor the process closely.
*Check if the active-passive quorum is intact using the following script: SAGInstallDirectory/Terracotta/server/bin/server-stat.sh
*Check if API Gateway clients can establish a connection to the Terracotta cluster using the following REST endpoint: GET /rest/apigateway/health/engine
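The quorum check with the status script can be wrapped in a small helper so that its output is captured for review or logging. This is a minimal sketch: the helper name `run_status_command` is hypothetical, and the sketch deliberately does not parse the output, because the exact output format of the script is not described here.

```python
import subprocess
from typing import List, Tuple


def run_status_command(command: List[str], timeout: float = 30.0) -> Tuple[int, str]:
    """Run a status command and return (exit_code, combined stdout and stderr)."""
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so no diagnostics are lost
        text=True,
        timeout=timeout,
        check=False,               # report the exit code instead of raising
    )
    return completed.returncode, completed.stdout
```

For example, `run_status_command(["SAGInstallDirectory/Terracotta/server/bin/server-stat.sh"])`, with `SAGInstallDirectory` replaced by your installation path, returns the exit code and the text that the script printed. Review the text to confirm that the active and passive servers report the expected roles.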