Container Metrics
If you installed API Gateway using Docker or Kubernetes, Software AG recommends monitoring the following metrics to check whether the API Data Store container is healthy. When a metric exceeds its threshold value, treat the severity as indicated below and perform the actions that Software AG recommends to identify, analyze, and debug the problem.
Metric
Description
PodNotReady
If the pod status is not ready for more than 10 minutes, consider the severity as CRITICAL and check the pod console log to find a status or exception. Ensure that either the issue with the existing pod is resolved or a new pod is created.
DeploymentReplicasMismatch
If the pod replica count is not equal to the number of pods in the ready state even after 10 minutes, consider the severity as CRITICAL. Check the pod console log, then identify and resolve the issue that prevents the new pod from being provisioned.
NodeNotReady
If a newly created or scaled pod is not ready in the Kubernetes cluster even after 15 minutes of deployment, consider the severity as CRITICAL. Check the autoscaling settings and the node provisioning events and logs, then identify and resolve the issue found in the logs.
StatefulSetReplicasMismatch
If the StatefulSet replica counts do not match for longer than 5 minutes, consider the severity as CRITICAL. Check the pod console logs to find the status or exception, and resolve the issue.
Note:
StatefulSet is a Kubernetes workload API object that manages the deployment and scaling of a set of pods.
PodCrashLooping
If API Data Store pods stop and restart continuously for more than 10 minutes, consider the severity as CRITICAL. Describe the pod to view its status and check for errors. Restart the pod and check the startup logs. Also check the availability of system resources and the cluster health.
PVC_Usage
If only 10% or less of the persistent volume is free at any point in time, consider the severity as CRITICAL. Check the cluster health and perform the same cleanup that you would perform for the API Data Store metrics.
PVC_Error
If the persistent volume status shows XXX at any point in time, consider the severity as CRITICAL and check the API Data Store cluster status.
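The thresholds in the table above can be expressed as a small lookup, which may help when you wire them into your own alerting scripts. The following Python sketch is illustrative only and is not part of API Gateway. The names CRITICAL_AFTER_MINUTES, PVC_MIN_FREE_PERCENT, duration_severity, and pvc_usage_severity are hypothetical, and the inputs (how long a condition has lasted, and the volume capacity and usage) are assumed to come from your own monitoring tool. PVC_Error is left out because it depends on the reported volume status and not on a numeric threshold.

```python
from typing import Optional

# Minutes a condition must persist before it is treated as CRITICAL.
# The values are taken from the table above; the dictionary is illustrative.
CRITICAL_AFTER_MINUTES = {
    "PodNotReady": 10,
    "DeploymentReplicasMismatch": 10,
    "NodeNotReady": 15,
    "StatefulSetReplicasMismatch": 5,
    "PodCrashLooping": 10,
}

# PVC_Usage is CRITICAL when 10% or less of the persistent volume is free.
PVC_MIN_FREE_PERCENT = 10.0


def duration_severity(metric: str, minutes_in_condition: float) -> Optional[str]:
    """Return "CRITICAL" if the condition has lasted longer than its threshold."""
    threshold = CRITICAL_AFTER_MINUTES[metric]
    return "CRITICAL" if minutes_in_condition > threshold else None


def pvc_usage_severity(capacity_bytes: int, used_bytes: int) -> Optional[str]:
    """Return "CRITICAL" if the free share of the volume is at or below 10%."""
    free_percent = 100.0 * (capacity_bytes - used_bytes) / capacity_bytes
    return "CRITICAL" if free_percent <= PVC_MIN_FREE_PERCENT else None
```

For example, under these assumptions duration_severity("NodeNotReady", 20) returns "CRITICAL", while a volume that is half full yields None from pvc_usage_severity.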
Analyze Trend
You can use external tools for dashboarding operations and visualizing metrics and logs.