Application Metrics
Monitor the following metrics to analyze API Gateway health:
*Thread statistics
*Service errors
*Memory usage of the JVM
*HTTP or HTTPS requests
Note:
The threshold values, configurations, and severities mentioned throughout this section are guidelines that Software AG suggests for optimal performance of API Gateway. You can modify these thresholds or define actions based on your operational requirements.
For details about how to generate thread dumps and heap dumps, and for the log locations, see Troubleshooting: Monitoring API Gateway.
If a metric exceeds its threshold value, treat it with the severity mentioned, perform the actions that Software AG recommends to identify and debug the problem, and contact Software AG for further support.
Monitor the Thread statistics
Metric: sag_is_service_threads
Description: Checks the percentage of the total number of threads used for service execution, where the threads are obtained from the server thread pool.
If the thread usage stays above the recommended threshold value for more than 15 minutes, consider the severity as mentioned (a polling sketch follows the steps below):
*Above 80% threshold, Severity: ERROR
*Above 90% threshold, Severity: CRITICAL
To identify the causes of high thread usage, follow these steps:
1. Identify the process that consumes the highest number of threads.
2. Generate the thread dump.
4. Analyze the thread dump to identify thread locks.
4. Analyze the logs of all API Gateway instances in the node.
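The following Python sketch illustrates one way to check this metric against the severities above. It assumes that the sag_is_* metrics are exposed in Prometheus text format at a /metrics endpoint on the API Gateway server port (localhost:5555) and that Administrator credentials are accepted; adjust the URL and credentials to your installation. A production alert should also require the threshold to be exceeded for more than 15 minutes, which this single check does not do.
```python
import re
import requests

# Assumed endpoint and credentials -- replace with your installation's values.
METRICS_URL = "http://localhost:5555/metrics"
AUTH = ("Administrator", "manage")

# Matches Prometheus-format lines such as: sag_is_service_threads{...} 85.0
PATTERN = re.compile(r"^sag_is_service_threads(?:\{[^}]*\})?\s+([-+0-9.eE]+)", re.MULTILINE)

body = requests.get(METRICS_URL, auth=AUTH, timeout=10).text
match = PATTERN.search(body)
if match is None:
    print("sag_is_service_threads not found in the metrics output")
else:
    usage = float(match.group(1))
    if usage > 90:
        print(f"CRITICAL: service thread usage at {usage:.1f}%")
    elif usage > 80:
        print(f"ERROR: service thread usage at {usage:.1f}%")
    else:
        print(f"OK: service thread usage at {usage:.1f}%")
```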
Monitor the Service errors
Metric: sag_is_number_service_errors
Description: Checks the number of services that result in errors or exceptions.
If service errors are encountered, consider the severity as ERROR.
To identify the causes of service errors, follow these steps:
1. Check the cluster status of API Gateway using the REST endpoint GET /rest/apigateway/health/engine to verify that API Gateway is healthy and running in cluster mode (a request sketch follows these steps).
2. Check the server logs at SAGInstallDirectory\IntegrationServer\instances\instance_name\logs\server.log for any exceptions.
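A minimal sketch of step 1 is shown below. The endpoint is the one named in the step; the host, port, and credentials are assumptions to replace with your installation's values.
```python
import requests

# Assumed host, port, and credentials -- replace with your installation's values.
BASE_URL = "http://localhost:5555"
AUTH = ("Administrator", "manage")

# Call the health endpoint named in step 1.
response = requests.get(f"{BASE_URL}/rest/apigateway/health/engine", auth=AUTH, timeout=10)
response.raise_for_status()

# Inspect the payload to confirm that API Gateway is healthy and in cluster mode.
print(response.json())
```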
Monitor the Memory usage of the JVM
Metric: sag_is_used_memory_bytes
Description: Checks the percentage of the total used memory of the JVM.
If the memory usage stays above the recommended threshold value for more than 15 minutes, consider the severity as mentioned:
*Above 80% threshold, Severity: ERROR
*Above 90% threshold, Severity: CRITICAL
To identify the causes of high JVM memory usage, follow these steps:
1. Check the cluster status of API Gateway using the REST endpoint GET /rest/apigateway/health/engine to verify that API Gateway is healthy and running in cluster mode.
2. Generate the heap dump (a jcmd sketch follows this list).
3. Analyze the logs of all the API Gateway instances.
4. Identify the server that has an issue and restart the server if required.
5. Perform the following actions after restarting the server:
a. Check for the readiness of API Gateway.
b. Check the cluster status of API Gateway using the REST endpoint GET /rest/apigateway/health/engine to verify that API Gateway is healthy and running in cluster mode.
c. Check the availability of all required system resources, such as memory, heap, and disk space.
d. Check the connectivity between the API Gateway server and API Data Store.
e. For a cluster setup, check the Terracotta client logs for errors in Terracotta communication.
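For step 2, a heap dump can be generated with the standard JDK jcmd utility, as in the sketch below. The process id and output path are placeholders; see Troubleshooting: Monitoring API Gateway for the documented procedure and dump locations.
```python
import subprocess

# Assumed values -- replace with the Integration Server JVM process id and a writable path.
IS_PID = "12345"
DUMP_FILE = "/tmp/apigateway_heap.hprof"

# jcmd is a standard JDK utility; GC.heap_dump writes an hprof snapshot of the given JVM.
subprocess.run(["jcmd", IS_PID, "GC.heap_dump", DUMP_FILE], check=True)
print(f"Heap dump written to {DUMP_FILE}")
```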
Monitor the HTTP or HTTPS requests
Metric: sag_is_http_requests
Description: Checks the total number of HTTP or HTTPS requests received since the last statistics poll.
The statistics poll interval is controlled by the watt.server.stats.pollTime server configuration parameter and the default interval is 60 seconds.
If the total number of HTTP or HTTPS requests since the last statistics poll exceeds the threshold limit derived from the expected Throughput Per Second (TPS) value, consider the severity as ERROR. A sketch of this calculation follows.
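The sketch below converts the request count reported since the last statistics poll into an observed TPS figure using the default 60-second poll interval. The expected TPS value and the example request count are assumptions for illustration; use the throughput your deployment is sized for.
```python
# Assumed sizing figure; set this to the throughput your node is expected to handle.
EXPECTED_TPS = 500
# Default watt.server.stats.pollTime interval, in seconds.
POLL_INTERVAL_SECONDS = 60
# Example value read from the sag_is_http_requests metric.
requests_since_last_poll = 45_000

observed_tps = requests_since_last_poll / POLL_INTERVAL_SECONDS
if observed_tps > EXPECTED_TPS:
    print(f"ERROR: observed {observed_tps:.0f} TPS exceeds the expected {EXPECTED_TPS} TPS")
else:
    print(f"OK: observed {observed_tps:.0f} TPS")
```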
Log monitoring
In addition to the metrics, monitor the logs regularly by performing the following steps:
1. Check for the availability of all logs frequently.
2. Check if the log rotation works as configured for all file types.
3. Check the size of each log file to verify that it does not exceed the configured values.
To monitor the logs at different levels, check for the availability of log entries at the FATAL, ERROR, or WARNING level, as shown in the sketch below.
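The checks above can be scripted, as in the sketch below, which verifies that a log file exists, compares its size against a configured limit, and counts FATAL, ERROR, and WARNING entries with a simple keyword scan. The path mirrors the placeholder used earlier in this section; the size limit and keyword matching are assumptions to adapt to your log configuration.
```python
import os

# Assumed values -- point these at your installation directory, instance name,
# and configured log rotation size.
LOG_FILE = r"SAGInstallDirectory\IntegrationServer\instances\instance_name\logs\server.log"
MAX_SIZE_BYTES = 10 * 1024 * 1024   # assumed configured rotation size (10 MB)
LEVELS = ("FATAL", "ERROR", "WARNING")

if not os.path.exists(LOG_FILE):
    print(f"Missing log file: {LOG_FILE}")
else:
    size = os.path.getsize(LOG_FILE)
    if size > MAX_SIZE_BYTES:
        print(f"Log file exceeds the configured size: {size} bytes")
    counts = {level: 0 for level in LEVELS}
    with open(LOG_FILE, errors="ignore") as log:
        for line in log:
            for level in LEVELS:
                if level in line:
                    counts[level] += 1
    print(f"Entries by level: {counts}")
```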