Monitoring periodic status
In addition to the status that is shown on the card for a model, you can enable the generation of periodic status, which is published as Cumulocity IoT operations or events. See Configuration for information on setting the status_device_name and status_period_secs tenant options.
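As a rough sketch of what setting these two options could look like, the following Python snippet only builds the JSON bodies for the Cumulocity IoT tenant options endpoint (POST /tenant/options); it does not send them. The category name analytics.builder, the helper name tenant_option, and the values shown (the device name is borrowed from the example at the end of this topic, the period of 60 seconds is made up) are assumptions for illustration. Check the Configuration topic for the authoritative category and permitted values.

```python
import json

# Assumption: Analytics Builder tenant options live in this category.
CATEGORY = "analytics.builder"


def tenant_option(key, value):
    """Build the JSON body for one tenant option; the value is sent as a string."""
    return {"category": CATEGORY, "key": key, "value": str(value)}


options = [
    tenant_option("status_device_name", "apama_status"),  # name taken from the example in this topic
    tenant_option("status_period_secs", 60),              # hypothetical period in seconds
]

for body in options:
    # Each body would be the payload of one POST /tenant/options request (not performed here).
    print(json.dumps(body))
```

Sending the requests is left out on purpose, because authentication and the tenant URL depend on your environment.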
Each operation has the following parameters:
Parameter | Description |
models_running | Information about deployed models that are running. |
models_failed | Information about deployed models that have failed. |
apama_status | The Apama correlator status metrics. Many status names correspond to the key names in the Apama REST API. The values are returned by the getValues() action of the com.apama.correlator.EngineStatus event and exposed via the REST API. |
Model status
The following information is published for each deployed model that is currently running or has failed:
Name | Description |
mode | The mode of the deployed model. It is SIMULATION for models deployed in simulation mode. Otherwise, it is PRODUCTION. |
modeProperties | Any mode-specific properties of the model. This includes the start and end time of the simulation for models running in the SIMULATION mode. |
numModelEvaluations | The total number of times the model has been evaluated since it was deployed. |
numBlockEvaluations | The total number of block evaluations in the model since it was deployed. This is the sum of the evaluation counts of all blocks in the model. |
avgBlockEvaluations | The average number of blocks that have been evaluated per model evaluation. |
numOutputGenerated | The total number of outputs generated by the model since it was deployed. |
This information provides insight into how each model performs and behaves. For example, a model with a much higher numBlockEvaluations value than another model may be consuming more resources, even if its numModelEvaluations value is low. Similarly, you can compare numOutputGenerated with numModelEvaluations to find out whether a model is producing output at the expected rate relative to the number of times it is evaluated.
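The comparison described above can be sketched in a few lines of Python. The helper name model_ratios is hypothetical, and the sample numbers mirror the example at the end of this topic. The sketch derives two ratios per model: block evaluations per model evaluation, and outputs per model evaluation.

```python
def model_ratios(status):
    """Return derived ratios for one entry of models_running or models_failed."""
    evaluations = status.get("numModelEvaluations", 0)
    if evaluations == 0:
        # Avoid dividing by zero for a model that has not been evaluated yet.
        return {"blocksPerEvaluation": 0.0, "outputsPerEvaluation": 0.0}
    return {
        "blocksPerEvaluation": status["numBlockEvaluations"] / evaluations,
        "outputsPerEvaluation": status["numOutputGenerated"] / evaluations,
    }


# Sample data for illustration only.
models = {
    "Package Tracking": {"numModelEvaluations": 68, "numBlockEvaluations": 967, "numOutputGenerated": 50},
    "Build Pipeline": {"numModelEvaluations": 214, "numBlockEvaluations": 671, "numOutputGenerated": 4},
}

for name, status in models.items():
    ratios = model_ratios(status)
    print(f"{name}: {ratios['blocksPerEvaluation']:.2f} blocks/evaluation, "
          f"{ratios['outputsPerEvaluation']:.2f} outputs/evaluation")
```

In this sample, the first model is evaluated less often but does more block work per evaluation, while the second model rarely produces output. Whether either ratio indicates a problem depends on what the model is designed to do.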
You can monitor the status using the Apama REST API or the Management interface, which is an EPL plug-in. See the following topics in the Apama product documentation for further information:
"Managing and Monitoring over REST" in Deploying and Managing Apama Applications
"Using the Management interface" in Developing Apama Applications
Slowest chain status
When chains of models with a high throughput are deployed across multiple workers, a chain can fall behind in processing input events, which creates a backlog of input events that are still to be processed. Such chains are referred to as slow chains. A message is written to the correlator log if the slowest chain is delayed by more than 1 second. For example:
Analytics Builder chain of models "Model 1", "Model 2", "Model 3" is slow by 3 seconds.
See Accessing the correlator log for information on where to find the correlator log.
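If you scan the correlator log for this message, a pattern along the following lines may be a starting point. It is modelled only on the single sample message shown above, so the exact wording, any timestamp or log-level prefix, and the format of the delay value are assumptions. The function name parse_slow_chain is hypothetical.

```python
import re

# Pattern derived from the sample message only; search() is used so that a
# timestamp or log-level prefix in front of the message does not matter.
SLOW_CHAIN = re.compile(
    r"Analytics Builder chain of models (?P<models>.+) "
    r"is slow by (?P<delay>\d+(?:\.\d+)?) seconds?\."
)


def parse_slow_chain(line):
    """Return (model names, delay in seconds), or None if the line does not match."""
    match = SLOW_CHAIN.search(line)
    if match is None:
        return None
    # Model names are the double-quoted items in the captured list.
    names = re.findall(r'"([^"]*)"', match.group("models"))
    return names, float(match.group("delay"))


sample = 'Analytics Builder chain of models "Model 1", "Model 2", "Model 3" is slow by 3 seconds.'
print(parse_slow_chain(sample))
```

Model names that themselves contain double quotes are not handled by this sketch.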
The following information on the slowest chain is also available in the periodic status that is published as Cumulocity IoT operations or events, within the apama_status parameter:
Name | Description |
user-analyticsbuilder.slowestChain.models | All models contained in the slowest chain. |
user-analyticsbuilder.slowestChain.delaySec | The number of seconds the chain lags behind in processing the input events. |
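The two keys above can be read from the apama_status parameter as in the following sketch. It assumes that both values arrive as strings, as in the example in this topic, and that the model list is a comma-separated sequence of double-quoted names. The function name slowest_chain and the default threshold of 1 second (borrowed from the log message threshold) are illustrative choices.

```python
import re


def slowest_chain(apama_status, threshold_secs=1.0):
    """Return the slowest chain if its delay exceeds the threshold, else None."""
    delay_text = apama_status.get("user-analyticsbuilder.slowestChain.delaySec")
    if delay_text is None:
        return None  # No slow-chain information in this status.
    try:
        delay = float(delay_text)
    except ValueError:
        return None  # Unexpected, non-numeric value.
    if delay <= threshold_secs:
        return None
    models_text = apama_status.get("user-analyticsbuilder.slowestChain.models", "")
    return {"models": re.findall(r'"([^"]*)"', models_text), "delaySec": delay}


status = {
    "user-analyticsbuilder.slowestChain.models": '"Model 1", "Model 2", "Model 3"',
    "user-analyticsbuilder.slowestChain.delaySec": "3",
}
print(slowest_chain(status))
```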
Example
The following is an example of the status operation data that is published by Cumulocity IoT:
{
  "creationTime": "2020-07-23T21:48:54.620+02:00",
  "deviceId": "6518",
  "deviceName": "apama_status",
  "id": "8579",
  "self": "https://myown.iot.com/devicecontrol/operations/8579",
  "status": "PENDING",
  "models_running": {
    "Package Tracking": {
      "mode": "SIMULATION",
      "modeProperties": {"startTime": 1533160604, "endTime": 1533160614},
      "numModelEvaluations": 68,
      "numBlockEvaluations": 967,
      "avgBlockEvaluations": 14.2,
      "numOutputGenerated": 50
    }
  },
  "models_failed": {
    "Build Pipeline ": {
      "mode": "PRODUCTION",
      "numModelEvaluations": 214,
      "numBlockEvaluations": 671,
      "avgBlockEvaluations": 3.13,
      "numOutputGenerated": 4
    }
  },
  "apama_status": {
    "user-analyticsbuilder.slowestChain.models": "\"Model 1\", \"Model 2\", \"Model 3\"",
    "user-analyticsbuilder.slowestChain.delaySec": "3",
    "user-analytics-oldEventsDropped": "1",
    "numJavaApplications": "1",
    "numMonitors": "27",
    "user-httpServer.eventsTowardsHost": "1646",
    "numFastTracked": "183",
    "user-httpServer.authenticationFailures": "4",
    "numContexts": "5",
    "slowestReceiverQueueSize": "0",
    "numQueuedFastTrack": "0",
    "mostBackedUpInputContext": "<none>",
    "user-httpServer.failedRequests": "4",
    "slowestReceiver": "<none>",
    "numInputQueuedInput": "0",
    "user-httpServer.staticFileRequests": "0",
    "numReceived": "1690",
    "user-httpServer.failedRequests.marginal": "1",
    "numEmits": "1687",
    "numOutEventsUnAcked": "1",
    "user-httpServer.authenticationFailures.marginal": "1",
    "user-httpServer.status": "ONLINE",
    "numProcesses": "48",
    "numEventTypes": "228",
    "virtualMemorySize": "3177968",
    "numQueuedInput": "0",
    "numConsumers": "3",
    "numOutEventsQueued": "1",
    "uptime": "1383561",
    "numListeners": "207",
    "numOutEventsSent": "1686",
    "mostBackedUpICQueueSize": "0",
    "numSnapshots": "0",
    "mostBackedUpICLatency": "0",
    "numProcessed": "1940",
    "numSubListeners": "207"
  }
}
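Note that the values in apama_status in this example are all JSON strings, including the numeric ones. If you want to compare or chart them, a small conversion step such as the following sketch may help. The helper names coerce and coerce_status are hypothetical, and the sketch assumes every value is a string as shown above.

```python
def coerce(value):
    """Convert a numeric string to int or float; leave anything else unchanged."""
    for convert in (int, float):
        try:
            return convert(value)
        except (TypeError, ValueError):
            continue
    return value  # For example "<none>" or "ONLINE".


def coerce_status(apama_status):
    """Return a copy of apama_status with numeric strings converted to numbers."""
    return {key: coerce(value) for key, value in apama_status.items()}


# A few entries copied from the example above.
sample = {
    "numReceived": "1690",
    "numProcessed": "1940",
    "slowestReceiver": "<none>",
    "user-httpServer.status": "ONLINE",
}
typed = coerce_status(sample)
print(typed)
```

Because float() also accepts text such as "nan" or "inf", you may want stricter parsing if such values can occur in your environment.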