Overview of query processing

Apama executes queries in parallel, using multiple CPU cores when they are available. This improves performance, but it uses more resources on the hosts running the correlator and can, in edge cases, cause events to be processed in an order that differs from the order in which they were delivered to the correlator. To simplify testing, a serial mode is supported in which events are processed in order, no matter how quickly they are sent.

Apama processes queries as follows:

1. Based on the inputs section of a query, the query subsystem creates listeners for the required events.

2. Running Apama queries receive events sent on the default channel and on the com.apama.queries channel.

3. Events that match those listeners are forwarded to the query subsystem, which processes them.

4. The events are processed in parallel. That is, multiple threads of execution are used, which provides vertical scaling on machines that have multiple cores.

5. The query subsystem locates the relevant events for the query partition, that is, the previously encountered events that are still current according to the event windows defined for that query. The key in the incoming event is all that is required to locate these events.

6. The window contents are updated, adding the new event and discarding any events that are no longer current.

7. The system then checks the updated window contents to determine if there are any new pattern matches.
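
As a purely conceptual illustration of steps 5 to 7, the following Python sketch models one window per partition key. It is not how Apama implements queries. The 60-second window, the pattern "an A event followed by a B event", and all names are hypothetical, and the sketch assumes that events for a given key arrive in timestamp order.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60.0  # hypothetical window duration

# One window of (timestamp, event_type) entries per partition key.
windows = defaultdict(deque)

def process_event(key, event_type, timestamp):
    """Update the window for this key; return True if "A then B" newly matches."""
    window = windows[key]                    # step 5: the key alone locates the window
    window.append((timestamp, event_type))   # step 6: add the new event
    while window and timestamp - window[0][0] > WINDOW_SECONDS:
        window.popleft()                     # step 6: discard events that are no longer current
    # Step 7: only a new B event can complete the hypothetical pattern.
    if event_type != "B":
        return False
    earlier_events = list(window)[:-1]       # everything before the new event
    return any(kind == "A" for _, kind in earlier_events)
```

In this model, a B event matches only if an A event with the same key is still in the window. Events with other keys never affect the result, which is why the key is sufficient to locate the relevant state.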

In a single-correlator solution, events in a particular partition are held in one or more Apama MemoryStore records. The key from the incoming event is used to locate these records. In a multi-correlator solution, the records are held in a distributed cache, accessed by means of the MemoryStore API. All of this is internal. However, you should consider timing constraints when deciding whether a query-based solution is appropriate for a given problem. See Understanding queries.

After you inject a query into a correlator, you can send events to that query immediately. If necessary, Apama stores these events until the query is ready, for example, while the query is opening local or remote stores. Events are delivered when the query is ready to process them. There is no guarantee that the query processes the events in the same order in which they arrived in the correlator. See Event ordering in queries.

When testing, either send events at a realistic event rate, with pauses between sets of events, or use single context mode. To send events with pauses, you can place BATCH entries in the .evt file. See Event timing.
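
As a rough sketch of the BATCH approach, the following Python helper renders test events as .evt text with a BATCH entry before each set of events. It assumes that a BATCH entry is the word BATCH followed by a time offset in milliseconds; confirm the exact syntax in Event timing. The helper name and the event types Withdrawal and Deposit are hypothetical.

```python
def to_evt_text(batches):
    """Render (offset_ms, [event strings]) pairs as .evt text with BATCH entries."""
    lines = []
    for offset_ms, events in batches:
        # Assumed syntax: BATCH followed by a time offset in milliseconds.
        lines.append(f"BATCH {offset_ms}")
        lines.extend(events)
    return "\n".join(lines) + "\n"

# Hypothetical events, spaced half a second apart.
evt_text = to_evt_text([
    (0, ['Withdrawal("acct-1", 100.0)']),
    (500, ['Withdrawal("acct-1", 250.0)', 'Deposit("acct-2", 40.0)']),
])
```

Writing evt_text to a file gives a test input in which each set of events is separated by a pause, rather than all events being sent as quickly as possible.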

By default, the query subsystem determines the size of the machine it is running on (the number of cores) and scales accordingly. If the load on the host machine affects other services, or if you are testing, send one of the following events to the correlator (for example, by creating an .evt file in Software AG Designer and sending it as part of the Run Configuration) to configure how the correlator executes queries: