Built-in aggregate functions
EPL provides the built-in aggregate functions listed in the table below. All of these functions are available for either bounded or unbounded use.
How to make use of the built-in aggregate functions
The built-in aggregate functions reside in the com.apama.aggregates package. To use a built-in aggregate function in a query, you must do one of the following:
Specify the full name of the aggregate function. For example:
select com.apama.aggregates.sum(x)
For each aggregate function you want to use in your code, add a
using statement. This lets you specify aggregate function names without specifying the package name. For example:
using com.apama.aggregates.mean;
using com.apama.aggregates.stddev;
...
...select MeanSD( mean(s), stddev(s) );
Insert the using statement after the optional package declaration and before any other declarations in the .mon file.
Overview of the built-in aggregate functions
The argument names (for example, value and weight) in the following table are placeholders for expressions. Additional information about some of these functions is provided after the table.
Aggregate Function | Argument Type | Return Type | Result Description |
avg(value) or mean(value) | decimal or float | Same as value | The arithmetic mean of the values in the window. The avg() and mean() functions do exactly the same thing. They are aliases for each other. |
count() | no argument | integer | The number of items in the window, including any not-a-number (NaN) items. |
count(predicate) | boolean | integer | The number of items for which the argument is true. You can specify, for example, count(value !=""). The predicate expression is re-evaluated each time. |
countUnique(value) | string | integer | The number of unique strings within the window. |
first(value) | decimal, float, integer, string, boolean or location | Same as value | The earliest value in the window being aggregated over. |
last(value) | decimal, float, integer, string, boolean or location | Same as value | The latest value in the window being aggregated over. |
max(value) | decimal, float or integer | Same as value | The maximum value. |
min(value) | decimal, float or integer | Same as value | The minimum value. |
nth(value,index) | decimal, float, integer, string, boolean or location for the value integer for the index | Same as value | The value of the specified item in the index position, starting with the earliest item in the window (item 0) and moving toward the latest item. nth(value,0) returns the same item as first(value). A negative index gets the items from the end of the window (-1 means the last item, -2 means the second last item, and so on). |
percentile(value,rank) | float or decimal for value float for rank | Same as value | The value that r percent (0<= r <= 100) of the data in the window is equal-to or less-than. |
stddev(value) | decimal or float | Same as value | The standard deviation of the values. |
stddev2(value) | decimal or float | Same as value | The sample standard deviation of the values. |
sum(value) | decimal, float or integer | Same as value | The sum of the values. |
wavg(value,weight) | decimal,decimal or float,float | Same as value | The weighted average of the values where each value is weighted by the corresponding weight. |
Calculations by the built-in aggregate functions might be affected by underflow and overflow. For example, adding a very large number to the collection that the sum() function operates on, then adding a very small number, and then removing the very large number will probably result in 0.0, and not the very small number. Just adding the very small number would result in behavior that you would expect. As with the rest of EPL, the overflow and underflow characteristics are as defined for IEEE 64-bit floating point numbers.
Deprecated built-in aggregate functions
The following functions are deprecated. It is recommended that you use the alternative functions mentioned in the table below.
Aggregate Function | Argument Type | Return Type | Result Description |
count(value) | decimal or float | integer | The number of items where the decimal or float value is not NaN. Use the alternative predicate aggregate function count(not value.isNaN()) instead. |
prior(value,index) | decimal, integer or float, integer | Same as value | The value of the specified decimal or float item in the index position, starting with the most recent item in the window (item 0 ) and moving toward the earliest item. prior(value,0) returns the same item as last(value). Use the alternative function nth(value,index) instead. |
Positional functions
For the first(), last(), nth(), and prior() (deprecated) functions, all values (NaN, +-, ∞, and so on) are treated the same, and position in the window is the only thing that matters.
Operating on empty windows
Except for the sum() and count() functions, if the window being aggregated over is empty or insufficiently large, then the result is not-a-number (NaN). The sum() and count() functions return zero if the window is empty.
IEEE special values in aggregate functions
Several of the built-in aggregate functions take decimal or float arguments. It is possible for a decimal or float value to be one of the following:
Positive infinity
Negative infinity
Not-a-number (NaN)
A finite number
The positional aggregates first(), last(), nth() and prior() (deprecated) are agnostic to the values in them and return the selected item regardless of its value. If the selected item does not exist (for example, selecting the fifth item from a window of three items), then the aggregate returns NaN. The index for nth() and prior() (deprecated) must not be negative. If it is, the correlator terminates the monitor instance.
All the remaining (arithmetic) aggregate functions that take float or decimal arguments ignore any NaN items that are in the window being aggregated. The result is the aggregate of the window without the NaN items. If you want to count all items including NaN items, then use the count() aggregate function that takes no arguments.
The behavior of arithmetic aggregate functions over windows that contain positive and negative infinities varies depending on the particular function. The result is either an infinity, NaN or a finite value. The table below shows for a window containing one or more positive infinities and no negative infinities, one or more negative infinities and no positive infinities, or at least one positive and at least one negative infinity, which aggregate function gives which result. In the case of the wavg() function, the result depends on whether the infinity is the value or the weight.
Input | Outputs Positive Infinity | Outputs Negative Infinity | Outputs NaN | Outputs Finite Value |
Positive Infinity | max() mean() sum() wavg(value) | | stddev() wavg(weight) | min() |
Negative Infinity | | mean() min() sum() wavg(value) | stddev() wavg(weight) | max() |
Both | max() | min() | mean() stddev() sum() wavg(value) | |
The following table shows the results for the percentile() function. The output depends on the input for the rank. As with the aggregate functions that are listed in the previous table, NaN inputs for the percentile() function are ignored and are not counted.
Input | Outputs Positive Infinity | Outputs Negative Infinity | Outputs NaN | Outputs Finite Value |
Positive Infinity | If ordinal rank corresponds to a positive infinity value | | | If ordinal rank does not correspond to a positive infinity value |
Negative Infinity | | If ordinal rank corresponds to a negative infinity value | | If ordinal rank does not correspond to a negative infinity value |
Both | If ordinal rank corresponds to a positive infinity value | If ordinal rank corresponds to a negative infinity value | If ordinal rank lies between a negative infinity value and a positive infinity value | If ordinal rank does not correspond to a positive or negative infinity value |
See also:
Select clause for information about the
select clause in Apama queries.