Monitor SLA
This policy monitors a set of run-time performance conditions for an API, and sends alerts to a specified destination when the performance conditions are violated. You can monitor run-time performance for one or more specified applications. You can configure this policy to define a Service Level Agreement (SLA), which is a set of conditions that defines the level of performance that an application should expect from an API. You can use this policy to identify whether the API threshold rules are met or exceeded. For example, you might define an agreement with a particular application that sends an alert to the application if responses are not sent within a certain maximum response time. You can configure an SLA for each combination of API and application.
Success count, fault count, and total request count are immediate monitoring parameters: they are evaluated as soon as the limit is breached. The remaining parameters are aggregated monitoring parameters, which are evaluated once the configured interval is over. If any parameter is breached, an event notification (Monitor event) is sent to the configured destination. In a single policy, multiple action configurations are combined as an AND condition. To achieve an OR condition, configure multiple policies.
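The AND and OR behavior described above can be sketched in Python. This is only an illustration of the logic, not API Gateway's implementation: the function names, the tuple layout of an action configuration, and the metrics dictionary are assumptions made for the example.

```python
import operator

# Operator names as they appear in the policy's Operator property.
OPERATORS = {
    "Greater Than": operator.gt,
    "Less Than": operator.lt,
    "Equals To": operator.eq,
}


def action_violated(action, metrics):
    """Return True if a single (metric name, operator, value) condition is met."""
    name, op, value = action
    return OPERATORS[op](metrics[name], value)


def policy_violated(actions, metrics):
    """All action configurations in one policy must be violated (AND)."""
    return all(action_violated(a, metrics) for a in actions)


def any_policy_violated(policies, metrics):
    """Separate policies are evaluated independently (OR)."""
    return any(policy_violated(p, metrics) for p in policies)


# Hypothetical metric values observed for one interval.
metrics = {"Fault Count": 12, "Average Response Time": 20}

fault_rule = ("Fault Count", "Greater Than", 10)
latency_rule = ("Average Response Time", "Greater Than", 30)

# One policy with both rules: no alert, because only one rule is violated.
single_policy_alert = policy_violated([fault_rule, latency_rule], metrics)

# Two policies with one rule each: an alert, because one policy is violated.
two_policy_alert = any_policy_violated([[fault_rule], [latency_rule]], metrics)
```

With these sample values, the single combined policy does not raise an alert, while splitting the two rules into separate policies does.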
The table lists the properties that you can specify for this policy:
Property
Value
Action Configuration. Specifies the type of action to be configured.
Name
Specifies the name of the metric to be monitored.
You can select one of the available metrics:
*Availability. Indicates whether the native API is available to the clients in the current interval. API Gateway calculates the availability of the native API over the specified alert interval, starting from the instant the API is activated. The availability is calculated as follows: Availability = (time for which the native API is up / total interval of time) x 100. This value is measured in %.
For example, if you set Availability as less than 90, API Gateway generates an alert whenever the availability of the native API falls below 90% in the specified time interval. Suppose the alert interval is set to 1 minute (60 seconds) and there are 7 API invocations at various times in that minute, with a combination of up and down states as shown in the table. The availability is then calculated as follows:
Request # | Invocation time (the second at which the API is invoked) | Service status | Up time (seconds)
1 | 5 | Up | 5 (from start to now)
2 | 15 | Up | 10 (between 1 and 2)
3 | 30 | Down | 15 (between 2 and 3)
4 | 40 | Down | 0 (since last state is Down)
5 | 45 | Up | 0
6 | 50 | Down | 5 (between 5 and 6)
7 | 55 | Up | 0
- | - | - | 5 (remaining 5 seconds considered as Up, in line with the last state)
Total | | | 40 (Availability is 67%)
Because the calculated availability of the native API is 66.67%, which is below 90%, API Gateway generates an alert. For an ongoing request, the API is considered to be down when API Gateway receives a connection-related error from the native API in the outbound call. If the API does not respond with an HTTP response, it is also considered to be down.
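The calculation in the table can be reproduced with a short Python sketch. It assumes, as the table suggests, that the time between two consecutive invocations is counted as up time when the earlier state was Up, and that the API is treated as Up from the start of the interval. The function name and the input layout are illustrative and are not part of API Gateway.

```python
def availability_percent(invocations, interval_seconds):
    """Compute availability (%) from (invocation_second, is_up) pairs.

    Assumption: each stretch of time is attributed to the state observed
    at its beginning, and the API counts as Up at second 0.
    """
    up_time = 0
    last_time = 0
    last_up = True
    for second, is_up in invocations:
        if last_up:
            up_time += second - last_time
        last_time, last_up = second, is_up
    # The remainder of the interval follows the last observed state.
    if last_up:
        up_time += interval_seconds - last_time
    return 100.0 * up_time / interval_seconds


# The seven invocations from the table: (second, service is up).
invocations = [
    (5, True), (15, True), (30, False), (40, False),
    (45, True), (50, False), (55, True),
]

availability = availability_percent(invocations, 60)
print(round(availability, 2))  # 66.67
```

The up time adds up to 40 of 60 seconds, so a rule of Availability less than 90 would be violated for this interval.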
*Average Response Time. Indicates the average time taken by the service to complete all invocations in the current interval. The average is calculated from the instant the API activation takes place for the configured interval.
For example, if you set an alert for Average Response Time greater than 30 ms with an interval of 1 minute, the monitoring interval starts on API activation, and API Gateway calculates the average response time of all run-time invocations of this API in that minute. If the average is greater than 30 ms, a monitor event is generated. If the Monitor SLA policy is configured with specific applications, the average response time is monitored only for the specified applications.
*Fault Count. Indicates the number of faults returned in the current interval. HTTP status codes greater than or equal to 400 returned from API Gateway are considered fault request transactions. This includes downtime errors.
*Maximum Response Time. Indicates the maximum time to respond to a request in the current interval.
*Minimum Response Time. Indicates the minimum time to respond to a request in the current interval.
*Success Count. Indicates the number of successful requests in the current interval.
*Total Request Count. Indicates the total number of requests (successful and unsuccessful) in the current interval.
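As an illustration of how the count and response-time metrics relate to each other, the following Python sketch derives them from a list of request records for one interval. The record layout and the function name are assumptions for the example. It also assumes that every request that is not a fault (status below 400) counts as a success, which the text above does not state explicitly.

```python
def summarize_interval(records):
    """Derive the monitored metrics from (http_status, response_time_ms) pairs."""
    total = len(records)
    # Status codes of 400 and above are treated as faults.
    faults = sum(1 for status, _ in records if status >= 400)
    times = [elapsed for _, elapsed in records]
    return {
        "Total Request Count": total,
        "Fault Count": faults,
        # Assumption: success is the complement of fault.
        "Success Count": total - faults,
        "Average Response Time": sum(times) / total if total else None,
        "Maximum Response Time": max(times) if times else None,
        "Minimum Response Time": min(times) if times else None,
    }


# Hypothetical invocations recorded during a 1-minute interval.
records = [(200, 20), (200, 40), (500, 60), (404, 10)]
summary = summarize_interval(records)

# A rule such as "Average Response Time Greater Than 30" would be violated here.
average_rule_violated = summary["Average Response Time"] > 30
```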
Operator
Specifies the operator applicable to the metric selected.
Select one of the available operators: Greater Than, Less Than, Equals To.
Value
Specifies the alert value for which the monitoring is applied.
Destination
Specifies the destination where the alert is to be logged.
Select the required options:
*API Gateway
*API Portal
*CentraSite
Note:
This option is applicable only for the APIs published from CentraSite to API Gateway.
*Digital Events
*Elasticsearch
*Email (you can add multiple email addresses).
Note:
If an email alias is available, you can type the email alias in the Email Address field with the following syntax, ${emailaliasname}. For example, if test is the email alias, then type ${test}.
*JDBC
*Local Log: You can select the severity of the messages to be logged (logging level) from the Log Level drop-down list. The available log levels are ERROR, INFO, and WARN.
Note: 
*Set the Integration Server Administrator's logging level for API Gateway to match the logging levels specified for the run-time actions (go to Settings > Logging > Server Logger). For example, if a Log Invocation action is set to the logging level of Error, you must also set Integration Server Administrator's logging level for API Gateway to Error. If the action's logging level is set to a low level (Warning-level or Information level), but Integration Server Administrator's logging level for API Gateway is set to a higher level (Error-level), then only the higher-level messages are written to the log file.
*Entries posted to the local log are identified by a product code of YAI and suffixed with the initial letter of the selected logging level. For example, for an error level, the entry appears as [YAI.0900.0002E].
*SNMP
*List of destinations configured using the Custom destinations section. For details on publishing to custom destinations, see How Do I Publish API-specific Traffic Monitoring Data to a Custom Destination?.
Alert Interval
Specifies the time period (in minutes) in which to monitor performance before sending an alert if a condition is violated.
The timer starts once the API is activated and resets after the configured time interval. If an API is deactivated, the interval is reset, and it starts afresh when the API is activated again.
Unit
Specifies the unit of measurement of the configured Alert Interval, that is, the period over which performance is monitored before an alert is sent. For example:
*Minutes
*Hours
*Days
*Calendar Week. The time interval starts on the first day of the week and ends on the last day of the week. By default, the start day of the week is set to Monday.
For example:
*If an API is activated on a Wednesday and Alert Interval is set to 1, the time interval ends on Sunday, that is, 5 days.
*If an API is activated on a Wednesday and Alert Interval is set to 2, the time interval still ends on a Sunday, but the period covers two calendar weeks, that is, 12 days.
You can change the start day of the week using the extended setting startDayOfTheWeek in the Administration > General > Extended settings section. Restart the API Gateway server for the changes to take effect.
*Calendar Month. The time interval starts on the first day of the month and ends on the last day of the month.
For example:
*If an API is activated in the month of August and Alert Interval is set to 1, the time interval ends on the last day of August.
*If an API is activated in the month of August and Alert Interval is set to 2, the time interval spans two calendar months and ends on the last day of September.
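The Calendar Week and Calendar Month examples above can be reproduced with a short Python sketch. The function names are illustrative, and the start day is expressed with Python's weekday numbering (0 for Monday), which is an assumption of this example and not necessarily the format that the startDayOfTheWeek setting accepts. The sketch is not API Gateway's implementation.

```python
import calendar
from datetime import date, timedelta


def calendar_week_end(activation, intervals, start_day=0):
    """Last day of an interval of `intervals` calendar weeks.

    start_day uses date.weekday() numbering: 0 is Monday, 6 is Sunday.
    """
    days_into_week = (activation.weekday() - start_day) % 7
    week_start = activation - timedelta(days=days_into_week)
    return week_start + timedelta(days=7 * intervals - 1)


def calendar_month_end(activation, intervals):
    """Last day of an interval of `intervals` calendar months."""
    month_index = activation.month - 1 + (intervals - 1)
    year = activation.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, last_day)


activated = date(2023, 8, 2)  # a Wednesday in August

one_week_end = calendar_week_end(activated, 1)   # Sunday of the same week
two_week_end = calendar_week_end(activated, 2)   # Sunday of the following week
days_one_week = (one_week_end - activated).days + 1   # 5 days, counting both ends
days_two_weeks = (two_week_end - activated).days + 1  # 12 days, counting both ends

one_month_end = calendar_month_end(activated, 1)  # last day of August
two_month_end = calendar_month_end(activated, 2)  # last day of September
```

The first interval is shorter than a full week or month because it starts at activation but still ends on the calendar boundary.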
Alert Frequency
Specifies how frequently to issue alerts for the counter-based metrics (Total Request Count, Success Count, Fault Count).
Select one of the options:
*Only Once. Triggers an alert only the first time one of the specified conditions is violated.
*Every Time. Triggers an alert every time one of the specified conditions is violated.
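The difference between the two Alert Frequency options can be sketched in Python for a counter-based metric such as Fault Count. The class and method names are hypothetical, the rule is assumed to use the Greater Than operator, and resetting the alert state at the end of each interval is an assumption of this sketch, not documented behavior.

```python
class FaultCountMonitor:
    """Counts faults in an interval and decides when to raise an alert."""

    def __init__(self, limit, frequency):
        self.limit = limit          # alert when the count is greater than this
        self.frequency = frequency  # "Only Once" or "Every Time"
        self.count = 0
        self.alerted = False

    def record_fault(self):
        """Register one fault; return True if an alert should be sent."""
        self.count += 1
        if self.count <= self.limit:
            return False
        if self.frequency == "Only Once" and self.alerted:
            return False
        self.alerted = True
        return True

    def reset_interval(self):
        """Assumed behavior: a new interval clears the counter and alert state."""
        self.count = 0
        self.alerted = False


once = FaultCountMonitor(limit=2, frequency="Only Once")
every = FaultCountMonitor(limit=2, frequency="Every Time")

# Five faults in the same interval.
once_alerts = [once.record_fault() for _ in range(5)]
every_alerts = [every.record_fault() for _ in range(5)]
```

In this sketch, Only Once sends a single alert on the third fault, while Every Time sends an alert on the third, fourth, and fifth faults.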
Alert Message
Specifies the text to be included in the alert.
Consumer Applications
Specifies the application to which this Service Level Agreement applies.
You can type a search term to find a matching application and then add it.
You can add multiple applications or delete an application that you have added.