Container Monitoring Metrics
If you have installed API Gateway through Docker or Kubernetes, Software AG recommends that you monitor the following metrics. When a metric exceeds its threshold value, you can consider the severity as mentioned below and perform the actions that Software AG recommends to identify, analyze, and debug the problem.
*PodNotReady
If the pod status is not ready for more than 10 minutes, you can consider the severity as CRITICAL and check the pod console log to find a status or exception. You must ensure that either the issue with the existing pod is resolved or a new pod is created.
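As an illustration of the PodNotReady check, the following minimal sketch reads the `Ready` condition from a pod object (for example, the output of `kubectl get pod <name> -o json` parsed into a dictionary) and applies the 10-minute threshold. The function name and the `OK` and `UNKNOWN` labels are assumptions made for this example; only CRITICAL is defined by this documentation.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the PodNotReady description above.
NOT_READY_LIMIT = timedelta(minutes=10)


def pod_not_ready_severity(pod: dict, now: datetime) -> str:
    """Classify a pod by how long its Ready condition has been false.

    `pod` is a parsed Kubernetes Pod object. The labels "OK" and "UNKNOWN"
    are hypothetical; the documentation only defines CRITICAL.
    """
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") != "Ready":
            continue
        if cond.get("status") == "True":
            return "OK"
        # lastTransitionTime is an RFC 3339 UTC timestamp, e.g. 2024-01-01T12:00:00Z
        since = datetime.strptime(
            cond["lastTransitionTime"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        return "CRITICAL" if now - since > NOT_READY_LIMIT else "OK"
    # No Ready condition reported yet, so do not guess a severity.
    return "UNKNOWN"
```

Pass the current UTC time as `now`, for example `datetime.now(timezone.utc)`.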
*DeploymentReplicasMismatch
If the pod replica count is not equal to the number of pods in the ready state, even after 10 minutes, you can consider the severity as CRITICAL. Check the pod console log, then identify and resolve the issue that prevents the new pod from being provisioned.
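The replica check depends on how long the mismatch persists, not only on whether it exists. The following sketch compares `spec.replicas` with `status.readyReplicas` on a parsed Deployment object and reports CRITICAL once the mismatch has lasted for the configured duration. The class name and the `OK` and `PENDING` labels are assumptions made for this example.

```python
from datetime import datetime, timedelta
from typing import Optional


class ReplicaMismatchTracker:
    """Tracks how long desired and ready replica counts have disagreed.

    Hypothetical helper: the labels "OK" and "PENDING" are not defined by
    the documentation, which only names the CRITICAL severity.
    """

    def __init__(self, limit: timedelta) -> None:
        self.limit = limit
        self.mismatch_since: Optional[datetime] = None

    def observe(self, workload: dict, now: datetime) -> str:
        # spec.replicas defaults to 1; readyReplicas is omitted when it is 0.
        desired = workload.get("spec", {}).get("replicas", 1)
        ready = workload.get("status", {}).get("readyReplicas", 0)
        if desired == ready:
            self.mismatch_since = None  # mismatch cleared, reset the timer
            return "OK"
        if self.mismatch_since is None:
            self.mismatch_since = now  # first observation of the mismatch
        elapsed = now - self.mismatch_since
        return "CRITICAL" if elapsed >= self.limit else "PENDING"
```

Create one tracker per workload, for example `ReplicaMismatchTracker(timedelta(minutes=10))`, and call `observe` on each polling cycle.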
*NodeNotReady
If a newly created or scaled pod is not ready in the Kubernetes cluster even after 15 minutes of deployment, you can consider the severity as CRITICAL. Check the autoscaling settings and the node provisioning events and logs, then identify and resolve the issue discovered from the logs.
*StatefulSetReplicasMismatch
If the StatefulSet replicas mismatch for longer than five minutes, you can consider the severity as CRITICAL. Check the pod console logs to find the status or exception and resolve it.
Note:
A StatefulSet is a workload API object that manages the deployment and scaling of a set of pods.
*PodCrashLooping
If Elasticsearch pods are stopping and restarting continuously for more than 10 minutes, you can consider the severity as CRITICAL and describe the pod status and check for any error. Restart the pod and check the startup logs. Ensure that all the required system resources are available to the pod and check the cluster health.
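One way to spot a crash loop is to look for containers whose waiting reason is `CrashLoopBackOff` in the pod status. The sketch below does only that detection on a parsed Pod object; it does not track the 10-minute window, which the caller would need to apply across repeated checks. The function name is an assumption made for this example.

```python
from typing import List


def crash_looping_containers(pod: dict) -> List[str]:
    """Return the names of containers currently waiting in CrashLoopBackOff.

    `pod` is a parsed Kubernetes Pod object. This hypothetical helper reports
    the current state only and does not measure how long the loop has lasted.
    """
    names = []
    for cs in pod.get("status", {}).get("containerStatuses", []):
        # state holds exactly one of: waiting, running, terminated
        waiting = cs.get("state", {}).get("waiting") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            names.append(cs.get("name", "<unnamed>"))
    return names
```

A non-empty result on consecutive checks that span more than 10 minutes would correspond to the condition described above.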
*PVC_Usage
If only 10% of the persistent volume is free at any given point in time, you can consider the severity as CRITICAL. Check the cluster health and perform the same cleanup that you would perform for the Elasticsearch (ES) metrics.
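The PVC_Usage threshold is a simple ratio of free space to capacity. The following sketch assumes that you already have the two byte counts, for example from the kubelet metrics `kubelet_volume_stats_available_bytes` and `kubelet_volume_stats_capacity_bytes`; the function name and the `OK` label are assumptions made for this example.

```python
# Threshold taken from the PVC_Usage description above: 10% free space.
FREE_SPACE_LIMIT = 0.10


def pvc_usage_severity(capacity_bytes: int, available_bytes: int) -> str:
    """Return "CRITICAL" when 10% or less of the volume is free.

    Hypothetical helper: the "OK" label is not defined by the documentation.
    """
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    free_fraction = available_bytes / capacity_bytes
    return "CRITICAL" if free_fraction <= FREE_SPACE_LIMIT else "OK"
```

For a 100 GiB volume, this reports CRITICAL once 10 GiB or less remains available.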
*PVC_Error
If the persistent volume status shows XXX at any given point of time, you can consider the severity as CRITICAL and check the API Data Store cluster status.
Analyze Trend
You can use external tools for dashboarding operations and visualizing metrics and logs.