Understanding queries
Apama queries allow business analysts and developers to create scalable applications to process events originating from very large populations of real-world entities. Scaling, both vertically (same machine) and horizontally (across multiple machines), is inherent in Apama query applications. Scaled-out deployments, involving multiple machines, will use distributed cache technology to maintain and share application state. This makes it easy to deploy across multiple servers, and keep the application running even if some servers are taken down for maintenance or fail.
Apama queries are designed to be easy to develop for both the business analyst and the application developer. Graphical tools for specifying the application design, together with full round-trip engineering, allow both the business analyst and the developer to work on the same queries. At the developer level, an Apama query is defined using the Apama event processing language, EPL.
Apama's visual Query Designer in Software AG Designer enables business analysts to easily create new queries and to view and review existing queries.
Use cases for queries
Apama queries are well suited to problems that:
Map to a large set of partitions
Have continuous availability and/or scalability requirements
Do not require sub-millisecond latency
Partitions may correspond to customer accounts, transactions being tracked, devices or some other entity. In a query application, the correlator processes the events in each partition independently of other partitions.
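As a conceptual illustration only, the following sketch shows what partitioning by key means. It is plain Python, not EPL, and it does not use any Apama API. The Withdrawal type, its fields, and the helper functions are hypothetical names chosen for the example. Events are grouped by a partition key, and each group is then evaluated without reference to any other group.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Withdrawal:
    # Hypothetical event type, used for illustration only.
    card_number: str
    atm_id: str
    amount: float


def partition_by(events, key):
    """Group events into independent partitions using the given key function."""
    partitions = defaultdict(list)
    for event in events:
        partitions[key(event)].append(event)
    return dict(partitions)


def total_per_partition(partitions):
    """Evaluate each partition on its own; no partition reads another's events."""
    return {k: sum(e.amount for e in evs) for k, evs in partitions.items()}


events = [
    Withdrawal("card-1", "atm-A", 50.0),
    Withdrawal("card-2", "atm-A", 20.0),
    Withdrawal("card-1", "atm-B", 30.0),
]

by_card = partition_by(events, key=lambda e: e.card_number)
totals = total_per_partition(by_card)
```

Because each partition is self-contained, the work for different keys could in principle be placed on different nodes, which is the property that makes this kind of problem easy to scale out.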
Advantages of Apama queries over Apama monitors:
Platform provides active-active availability. That is, queries can be run in a cluster, where every node in the cluster contributes processing resources. The number of nodes can be changed dynamically without losing state.
Scales out across multiple servers.
Declarative pattern specification
Query evaluation is based purely on past event history. Other than events, queries have no state and so they behave uniformly over time.
Disadvantages of Apama queries compared to Apama monitors:
Higher latency than monitors. Latency is of the order of milliseconds to seconds rather than microseconds to milliseconds. Exact values depend on the deployment and the types of events being processed.
Apama monitors allow you to write custom and more powerful EPL applications that do not have the declarative and structural bounds that queries have.
To take advantage of the scalability and availability that the queries platform offers, the problem your application needs to solve should meet one or more of the following requirements:
Different partitions for a given query must be completely independent. However, different queries can use different partition keys for the same event types. For example, one query may partition ATM withdrawals by cardNumber, and another by atmId.
The average number of events in each window should be low. The recommendation is fewer than 50 events. For example, if ATM withdrawals are partitioned by cardNumber, then a window that retains withdrawals for a three-day period is fine because the typical number of withdrawals per card is likely to be low. While it is possible to have hundreds of withdrawals for a single card number, that would be an exceptional case and probably indicative of suspicious behavior.
Other than the history of events, no state is required. Queries do not provide for state to be stored. However, it is possible to mix monitors and queries in the same deployment.
The time between events destined for the same partition would typically be long, that is, more than a few seconds between events.
The exact ordering between events is not critical. A query may treat two events for the same partition that occur close in time as having occurred in an order that is different from the order in which they were sent.
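The window-size guidance above can be checked against sample data before committing to a design. The sketch below is plain Python, not EPL. It assumes events are (timestamp in seconds, cardNumber, atmId) tuples, and the function names and sample values are made up for illustration. For a chosen partition key, it counts how many events a time window would retain per partition at a given moment and then averages those counts.

```python
from collections import defaultdict

THREE_DAYS = 3 * 24 * 60 * 60  # window length in seconds


def window_sizes(events, key_index, now, window_seconds):
    """Count, per partition, the events that fall inside (now - window, now]."""
    counts = defaultdict(int)
    for event in events:
        timestamp = event[0]
        if now - window_seconds < timestamp <= now:
            counts[event[key_index]] += 1
    return dict(counts)


def average_window_size(sizes):
    """Mean number of retained events across non-empty partitions."""
    return sum(sizes.values()) / len(sizes) if sizes else 0.0


# (timestamp_seconds, cardNumber, atmId): made-up sample data
events = [
    (500, "card-1", "atm-A"),      # older than three days at `now`, so excluded
    (2_000, "card-2", "atm-A"),
    (90_000, "card-1", "atm-B"),
    (200_000, "card-3", "atm-A"),
]
now = 260_000

# The same events, partitioned two different ways.
by_card = window_sizes(events, key_index=1, now=now, window_seconds=THREE_DAYS)
by_atm = window_sizes(events, key_index=2, now=now, window_seconds=THREE_DAYS)
```

Running the same check with each candidate key shows how the choice of key changes the window population. A busy ATM is likely to accumulate far more events in three days than a single card, so a window length that suits one key may be too long for the other.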
Query application examples
Some examples of use cases for queries include:
Customer relationship management: monitoring transactions between a retailer or service provider and individual customers. For example, queries can identify:
Transactions that are implausible and indicate fraudulent activity. See the ATM fraud sample application, which is in the samples\queries directory of your Apama installation.
Users who have not yet registered an optional account on their service provider's website. See the unregistered users sample application.
Customers who may be interested in a particular retail offer.
Tracking parcels: monitoring parcels to determine when one has made no progress through the distribution system for a certain amount of time, or is in danger of not arriving at its destination.
In all of these cases, the problem can be easily partitioned (by customer account or parcel), and the number of events per partition is likely to be low and spread out in time.
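To make the parcel-tracking case concrete, here is a minimal sketch of the underlying check. It is plain Python, not EPL, and it is not how an Apama query is written. It assumes scan events are (timestamp in seconds, parcel ID) tuples, and the function name and threshold are hypothetical. Each parcel is its own partition, and a parcel is reported when its most recent scan is older than an allowed idle period.

```python
def stalled_parcels(scans, now, max_idle_seconds):
    """Return IDs of parcels whose most recent scan is older than max_idle_seconds."""
    last_seen = {}
    for timestamp, parcel_id in scans:
        # Keep only the latest scan time per parcel (one partition per parcel).
        if timestamp > last_seen.get(parcel_id, float("-inf")):
            last_seen[parcel_id] = timestamp
    return sorted(pid for pid, ts in last_seen.items() if now - ts > max_idle_seconds)


# (timestamp_seconds, parcel_id): made-up sample data
scans = [
    (0, "parcel-1"),
    (3_600, "parcel-2"),
    (7_200, "parcel-1"),
    (50_000, "parcel-3"),
]
stalled = stalled_parcels(scans, now=60_000, max_idle_seconds=24_000)
```

The check depends only on the latest timestamp seen for each parcel, so it gives the same answer if two scans for the same parcel arrive in a different order from the one in which they were sent.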