Testing query execution
When writing queries, as with any programming, it is important to test that the query is behaving as expected. Testing can be as simple as a small Apama project with the event definitions, the queries, and an .evt file of events to send to the query. You can use this project to check whether the query sends out the correct events. In Software AG Designer, use the Engine Receive view to observe the output of the query. Whether or not a query is written to send output events, you can add log statements to the query file to verify whether it has or has not triggered.
Be sure to test queries in an environment that is separate from your production environment. Preventing problems is the best way to avoid troubleshooting, so ensure that queries are sufficiently tested before you deploy them.
The following background information and troubleshooting tips provide some guidance. See also Overview of query processing.
Exceptions in queries
In a query, exceptions can occur in the following places:
Procedural code in a find statement block
having clause
retain clause
select clause
wait clause
All where clauses
All within clauses
An exception in the inputs block (retain or within clause) or the find block's wait or within clause causes the query to terminate. If there is an exception elsewhere, the query continues to process incoming events. An exception that occurs in a where or having clause causes the Boolean expression to evaluate to false.
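Because an exception in a where clause simply makes the condition false, a failing expression can look exactly like a non-matching event. The following is a minimal sketch, not a definitive implementation, of guarding such an expression in an action so that the failure case is logged. The Sale and Shipment events, their fields, the query name, and the highValue action are hypothetical. The sketch also assumes that integer division by zero raises an exception and that action definitions follow the find statement.

```epl
// Hypothetical events, assumed to be defined elsewhere (for example in a .mon file):
//   event Sale     { string orderId; integer total; }
//   event Shipment { string orderId; integer count; }
query HighValueShipments {
    inputs {
        Sale() key orderId within 60.0;
        Shipment() key orderId within 60.0;
    }

    find Sale as s -> Shipment as p where highValue(s, p) {
        log "High-value sale: " + s.toString() at INFO;
    }

    // Guarded form of "s.total / p.count > 10". Without the guard, a zero
    // count is assumed to raise an exception, which the query would treat
    // as the where clause evaluating to false, with nothing logged.
    action highValue(Sale s, Shipment p) returns boolean {
        if p.count = 0 then {
            log "highValue: zero count in " + p.toString() at WARN;
            return false;
        }
        return s.total / p.count > 10;
    }
}
```

With the guard in place, a match that is missing because of bad input data shows up in the log instead of silently failing the condition.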
Event ordering in queries
Unlike EPL monitors, queries do not necessarily process events in the order in which they were sent into the correlator. In particular, if two events that will be processed by the same query with the same key value are sent very close together in time (within about 0.1 seconds of each other), they may be processed as if they had been sent in a different order. For example, consider a query that is looking for an A event followed by an A event. If two A events with the same key arrive 1 millisecond apart, the events might not be processed in the order in which they were sent.
Queries use multiple threads to process events and to scale across multiple correlators on multiple machines. To do this efficiently, there is no enforcement that the events are processed in order. However, when events that have the same key arrive about 0.5 seconds apart or more, out-of-order processing is typically avoided, provided the system can keep up with the load. Therefore, specify a query so that it operates on partitions in which consecutive events arrive far enough apart. For example, consider a query that operates on credit card transaction events, which could mean thousands of events per second. Partition this query on the credit card number so that there is at most one event per partition per second. By following this recommendation, it becomes possible to process events that are generated at rates of up to 10,000 events per second.
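As an illustration of the partitioning recommendation, the following sketch keys a query on the card number so that each card forms its own partition. The Transaction event, its fields, and the query name are hypothetical and are not taken from an Apama sample.

```epl
// Hypothetical event, assumed to be defined elsewhere (for example in a .mon file):
//   event Transaction { string cardNumber; float amount; }
query ConsecutiveCardUse {
    inputs {
        // One partition per card number. Each partition retains only the
        // two most recent transactions for that card.
        Transaction() key cardNumber retain 2;
    }

    find Transaction as t1 -> Transaction as t2 {
        log "Consecutive transactions on card " + t1.cardNumber at INFO;
    }
}
```

Although the total rate may be thousands of events per second, an individual card is rarely used more than once per second, so events within any one partition are normally spaced well apart. Note that a total rate of 10,000 events per second with at most one event per partition per second implies at least 10,000 distinct card numbers in active use.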
When creating an .evt file for testing purposes, the recommendation is to begin the file with a &FLUSHING(1) line to cause more predictable and reliable event-processing behavior. See Event timing.
Query diagnostics
To help you monitor queries that are running on a given correlator, Apama provides data about active queries in DataViews. See Monitoring running queries.
When deploying Apama queries, you can enable the generation of diagnostic information. This information consists of log statements that explain some of the internal workings of query evaluation. In particular, the events coming into the query and the contents of the windows before the pattern is evaluated are both logged. This can help you understand how query evaluation occurs. If a query is misbehaving, providing this diagnostic logging to Apama support can help in understanding the issue.
Note:
Diagnostic logs contain the event data. You may want to consider using fake data rather than real data if the real data is sensitive.
Logging in where statements
It can be useful to modify a query so that, rather than placing the expression to be evaluated directly in a where clause, the where clause calls an action defined in the query that evaluates the expression. This allows you to log the inputs and the result of the expression. For example, instead of a query that contains the following:
find A as a -> B as b where a.x >= b.x { ...
Write the query this way:
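action compareAB(A a, B b) returns boolean {
    // Log the inputs before evaluating the condition.
    log "compareAB; inputs: A as a = " + a.toString() + ", B as b = " + b.toString();
    boolean r := (a.x >= b.x);
    // Log the result so that it can be checked against expectations.
    log "compareAB; result is " + r.toString();
    return r;
}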
find A as a -> B as b where compareAB(a, b) { ...
You can then use these log statements to check if the query is behaving as expected.
Divide and conquer
One of the advantages of testing a query with a known set of input events is that it is possible to see how changing the query affects the results. For example, if a query is not matching any events and has many within and without clauses, try removing all of them. One way to do this is to place each clause on a separate line and, in the source view, insert // at the beginning of those lines to comment them out. If the query still does not fire, use query diagnostics to check that events are being evaluated. If the query is firing, add the within and without clauses back one at a time until the query stops firing. The problem lies in the clause that stops the query from firing when it should.
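The following sketch shows what the commented-out state might look like for a simple query. The A, B, and C events, their fields, the durations, and the query name are hypothetical.

```epl
// Hypothetical events, assumed to be defined elsewhere (for example in a .mon file):
//   event A { string k; integer x; }
//   event B { string k; integer x; }
//   event C { string k; integer x; }
query DivideAndConquer {
    inputs {
        A() key k within 60.0;
        B() key k within 60.0;
        C() key k within 60.0;
    }

    find A as a -> B as b
        where a.x >= b.x
        // within 10.0
        // without C as c
    {
        log "Matched: " + a.toString() + " -> " + b.toString() at INFO;
    }
}
```

Uncomment one clause, rerun the same .evt file, and compare the logged matches before uncommenting the next clause.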
Query performance
A critical factor that affects the performance of queries is the size of the windows specified in the inputs block of the query. Aim for windows that contain no more than 100 events. Depending on the distributed cache used to store data, it may also be necessary to change the number of parallel contexts per correlator. Experiment with different values for the number of worker contexts. See also Overview of query processing.
Using external clocking when testing
When testing queries, as well as switching into single context execution, it is often useful to use external clocking. This allows &TIME events to be sent into the correlator to simulate the passage of time, which allows queries involving long durations (for example, multiple days) to be tested easily. To ensure the correct ordering of processing between events and &TIME events, you should also include &FLUSHING(1) at the beginning of the event file, before any events. See Externally generating events that keep time (&TIME events) and Event timing.
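As a rough sketch, an event file for an externally clocked test might look like the following. The A and B events and their field values are hypothetical. The sketch assumes that &TIME takes a time in seconds and that // starts a comment line in an .evt file.

```text
// Hypothetical test input for an externally clocked correlator.
// Assumes: event A { string k; integer x; } and event B { string k; integer x; }
&FLUSHING(1)
&TIME(1)
A("k1", 10)
// Advance the clock by one day (86,400 seconds).
&TIME(86401)
A("k1", 7)
// Advance the clock by a second day.
&TIME(172801)
B("k1", 5)
```

Because the &TIME lines drive the clock, a scenario that spans two days completes as quickly as the correlator can read the file.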