Defining cross-joins with two from clauses
A cross-join is defined with two from clauses, one for each stream, optionally including window definitions. A simple example of this is:
from p1 in leftPairs retain 2
from p2 in rightPairs retain 2
select sum(p1.num * p2.num);
This is illustrated in the following diagram, whose notation differs from the previous diagrams. Here, for each time point there are two columns, one for each side of the join. The first column, with purple events, represents the items from the first from clause and the second column, with cyan events represents the items from the second from clause. Events in bold arrived during this activation of the stream query and the boxes enclose the windows for each side. As in the previous diagrams, the output is given for each of the three kinds of projections.
The query before the diagram corresponds to the aggregate projection. The three queries shown here are:
Simple istream projection:
from p1 in leftPairs retain 2
from p2 in rightPairs retain 2
select p1.num * p2.num
Simple rstream projection:
from p1 in leftPairs retain 2
from p2 in rightPairs retain 2
select rstream p1.num * p2.num
Aggregate projection:
from p1 in leftPairs retain 2
from p2 in rightPairs retain 2
select sum(p1.num * p2.num);
As shown in the diagram, in a cross-join whenever an item arrives in a window, it is joined to every item in the other window to produce a separate output item for each combination.
Because the number of output items is the product of the size of the two windows, cross-joins are normally used for joins between at least one of:
A window of size 1.
A stream where you have omitted the window definition.
If both sides of the join omit the window definition, then for output to occur an item must arrive on each stream during the same activation of the query.
A more concrete example can be seen here:
spreads :=
from a in all com.apama.demo.marketdata.Depth(symbol=order.Instrument_1)
retain 1
from b in all com.apama.demo.marketdata.Depth(symbol=order.Instrument_2)
retain 1
select (a.midPrices[0] - b.midPrices[0]);
This query generates the spread between the latest prices for the two identified stocks. In each from clause, the window contains one item. Whenever a new item arrives in one window, the query executes the calculation defined in the select clause and outputs the result.
To generate a running mean and a standard deviation for this spread value, you can define the following query:
stream<MeanSD> averages := from s in spreads within 20.0
select MeansSD(mean(s),stddev(s));
Then, to obtain all three current values for the spread, the mean and the standard deviation, you can perform a join between the spreads stream and the averages stream:
stream<SpreadMeanSD> all := from s in spreads
from a in averages
select SpreadMeansSD(s, a.mean, a.stddev);
This query outputs a result only when there is an item currently in both spreads and averages.
In a cross-join, you cannot specify more than two from clauses.
CAUTION:
Be aware that cross-joins have the potential to generate a great quantity of output. It is preferable to use cross-joins only where the window size/duration of any window involved in the cross-join is small. For example, putting 8000 events through a 100x100 cross-join produces 1.6 million output events. You cannot specify a cross-join in a query that contains an unbounded window.