Aggregating event field values
A find statement can specify a pattern that aggregates event field values in order to find data based on many sets of events. A pattern that aggregates values specifies the every modifier in conjunction with select and having clauses.
Based on a series of values, an aggregate function computes a single value, such as the average of a series of numbers. See the
API Reference for EPL (ApamaDoc) for detailed information on all built-in aggregate functions.
Note:
If a built-in aggregate function does not meet your needs, you can use EPL to write a custom aggregate function. A custom aggregate function that you want to use in a query must either be a bounded function or it must support both bounded and unbounded operation. See
Defining custom aggregate functions.
For example, the following query watches for a withdrawal amount that is greater than some threshold multiplied by the average withdrawal amount of the ATMWithdrawal events in the window, which might be as many as 20 events. This query uses the last() aggregate function to identify the event added to the window most recently and uses the avg() aggregate function to find the average withdrawal amount of the events in the window. The having clause must evaluate to true for the query to send the SuspiciousTransaction event, passing the transaction ID of the suspicious withdrawal. You can use either as or the colon (:) as the coassignment operator.
using com.apama.aggregates.avg;
using com.apama.aggregates.last;
query FindSuspiciouslyLargeATMWithdrawals {
parameters {
float THRESHOLD;
}
inputs {
ATMWithdrawal() key accountId retain 20;
}
find every ATMWithdrawal as w
select last(w.transactionId) as tid
having last(w.amount) > THRESHOLD * avg(w.amount){
send SuspiciousTransaction(tid) to SuspiciousTxHandler;
}
}
To use an aggregate function in a find statement, specify the every modifier and specify one or more select or having clauses. A select clause indicates that aggregate values are to be computed. Each select clause specifies a projection expression and a projection coassignment. The projection expression can use coassignments from the pattern if the coassignments are within a single aggregate function call. For example, the following pattern computes the average value of the x member of event type A in the query's input and coassigns that average value to aax.
find every A as a select avg(a.x) as aax
A select clause can use parameter values. For example the following two select clauses are both valid if there is a parameter param:
find every A as a
select avg(param * a.x) as apax
select param * avg(a.x) as paax
You can specify multiple select clauses to produce multiple aggregate values.
In an aggregating find statement, only the projection expression can use the coassignments from the pattern. The procedural block of code can use projection coassignments and any parameters, but it cannot use coassignments from the pattern.
The first() and last() built-in aggregate functions are useful if you want to refer to the coassignment value of the oldest or newest event, respectively, in the window.
The following example determines the average price of trades other than your own:
find every Trade as t
where t.buyer != myId and t.seller != myId
select wavg(t.price, t.amount) as avgprice
Match sets used in aggregations
In find statements without the every modifier, only the most recent set of events that match the pattern are used to invoke the procedural code block. With the every modifier, every set of events that matches the pattern is available for use by the aggregate function, provided that the latest event is present in one of the sets of events. Any events or combinations of events that do not match the pattern or do not match the where clause, or are invalidated due to a within or without clause, are ignored; their values are not used in the aggregate calculation.
For example, consider the following find statement:
find every A as a -> B as b
where b.x >= 2
select avg(a.x + b.x) as aabx {
print aabx.toString();
}
The following table shows what happens as events are added to the window.
Event Added to Window | Match Sets | Average Of | Value of aabx |
A(1) | None | | |
A(2) | None | | |
B(2) | A(1), B(2) A(2), B(2) | 3 and 4 | 3.5 |
B(1) | None because B(1) causes the where clause to be false. | | |
B(3) | A(1), B(2) A(2), B(2) A(1), B(3) A(2), B(3) | 3, 4, 4, and 5 | 4 |
Note:
Only coassignments that definitely have a value may be used in aggregates. Or-terms that are on one side of an
or operator in the pattern may not be used in aggregate expressions (see also
Query or operator).
Using aggregates in namespaces
As with event types, an aggregate function is typically defined in a namespace. To use an aggregate function, specify its fully-qualified name or a using statement. The built-in aggregate functions are in the com.apama.aggregates namespace. For example, to use the avg() aggregate function you would specify the following in the query:
using com.apama.aggregates.avg;
Filtering unwanted invocation of procedural code
Each select clause defines an aggregate value to be produced. You can also specify one or more having clauses to restrict when the procedural code is invoked. For example, consider the following find statement:
find every A as a
select avg(a.x) as aax
having avg(a.x) > 10.0 {
print aax.toString();
}
This example calculates the average value of a.x for the set of A events in the window. However, it executes the procedural block only when the average value of a.x is greater than 10.0.
Multiple having clauses
You can specify multiple having clauses and you can use parameter values in having clauses. For example,
find every A as a
select avg(a.x) as aax
select sum(a.y) as aay
having avg(a.x) > 10.0
having sum(a.y) > param1
having max(a.z) < param2
{
print aax.toString(), + " : " + aay.toString();
}
When you specify more than one having clause it is equivalent to specifying the and operator, for example:
...
having avg(a.x) > 10.0 or sum(a.y) > param1
having max(a.z) < param2
...
is equivalent to
...
having ( avg(a.x) > 10.0 or sum(a.y) > param1 ) and ( max(a.z) < param2 )
...
Using a select coassignment in a having clause
Rather than specifying an aggregate expression twice, once in a select clause and then subsequently in a having clause, it is possible to refer to the aggregate value by using the select coassignment name. For example:
find every A as a
select avg(a.x) as aax
having avg(a.x) > 10.0 {
print aax.toString();
}
You can rewrite that as follows:
find every A as a
select avg(a.x) as aax
having aax > 10.0 {
print aax.toString();
}
Using a having clause without a select clause
When you want to test for an aggregate condition but you do not want to use the aggregate value, you can specify a having clause without specifying a select clause. For example,
find every A as a
having avg(a.x) > 10.0 {
print "Average value is greater than ten!";
}