In RAQL SELECT aggregationFunction(x) OVER(ORDER BY y) FROM s is semantically equivalent to SELECT aggregationFunction (x) OVER(ORDER BY y ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM s . The SQL Standard however defines the semantics to correspond with SELECT aggregationFunction (x) OVER(ORDER BY y RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM s which means that not only the current row but all "peer" rows (rows identical to the current row with respect to the ordering) are contained in the window. That means that a rolling aggregate without an explicit window frame specification currently yields different results in RAQL and any SQL standard compliant DBMS.
In order to ensure a high level of compliance with the SQL standard, window frames are also allowed to be specified using the RANGE key word.
Semantics
Window frames defined using the ROWS keyword are easy to understand. The window frame clause ROWS BETWEEN x PRECEDING AND y FOLLOWING simply defines two offsets x and y that for each row in the input partition define the first and the last row number of the corresponding window relative to the current row number. PRECEDING and FOLLOWING only indicate whether the offset is negative (the row number precedes the current row number) or positive respectively.
Table 2. Example
rowNumber |
ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING |
ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING |
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING |
1 |
Window = [1,4] |
Window = [] |
Window = [1,10] |
2 |
Window = [1,5] |
Window = [1,1] |
Window = [2,10] |
3 |
Window = [1,6] |
Window = [1,2] |
Window = [3,10] |
4 |
Window = [1,7] (from row number 4-3 to 4+3) |
Window = [1,3] |
Window = [4,10] |
5 |
Window = [2,8] |
Window = [2,4] |
Window = [5,10] |
6 |
Window = [3,9] |
Window = [3,5] |
Window = [6,10] |
7 |
Window = [4,10] |
Window = [4,6] |
Window = [7,10] |
8 |
Window = [5,10] |
Window = [5,7] |
Window = [8,10] |
9 |
Window = [6,10] |
Window = [6,8] |
Window = [9,10] |
10 |
Window = [7,10] |
Window = [7,9] |
Window = [10,10] |
On the other hand the semantics of window frame specifications using the RANGE keyword does not depend on the physical row numbers but on the actual values of a column. Still, the window frame clause defines two offsets x and y, but these offsets are not added to the row number but to an actual column value. Hence, any window frame specification involving RANGE and an offset other than UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING or CURRENT ROW requires the window to be ordered on a single column and this column needs to have a numeric data type (in order to allow for offset arithmetics).
Table 3. Example
rowNumber |
salary |
SELECT avg(salary) OVER (ORDER BY salary RANGE BETWEEN 300 PRECEDING AND 300 FOLLOWING) FROM ... |
considered window |
|
1 |
1750 |
1850 |
[1450,2050] |
salaries between 1750 - 300 and 1750 + 300 |
2 |
1900 |
1850 |
[1600,2200] |
|
3 |
1900 |
1850 |
[1600,2200] |
|
4 |
2200 |
2100 |
[1900,2500] |
|
5 |
2400 |
2300 |
[2100,2700] |
|
6 |
2750 |
2866,67 |
[2450,3050] |
|
7 |
2900 |
2980 |
[2600,3200] |
|
8 |
2950 |
2980 |
[2650,3250] |
|
9 |
3100 |
2980 |
[2800,3400] |
|
10 |
3200 |
2980 |
[2900,3500] |
|
Table 4. Example
rowNumber |
salary |
SELECT avg(salary) OVER (ORDER BY salary RANGE BETWEEN 300 PRECEDING AND 300 FOLLOWING) FROM ... |
considered window |
|
1 |
1750 |
1 |
[1750,1750] |
salaries between 1750 - 0 and 1750 + 0 |
2 |
1900 |
2 |
[1900,1900] |
|
3 |
1900 |
2 |
[1900,1900] |
|
4 |
2200 |
1 |
[2200,2100] |
|
5 |
2400 |
1 |
[2400,2400] |
|
6 |
2750 |
1 |
[2750,2750] |
|
7 |
2900 |
1 |
[2900,2900] |
|
8 |
2950 |
1 |
[2950,2950] |
|
9 |
3100 |
1 |
[3100,3100] |
|
10 |
3200 |
1 |
[3200,3200] |
|
Note: The offsets in a RANGE frame clause define logical offsets in the ordered sequence defined by the order by clause. If the order by clause specifies a descending order (see next example) the offset 10 PRECEDING translates to "a value that precedes the current value in the current sequence and differs at most by 10 (years/units/...)"
Table 5. Example
rowNumber |
age |
SELECT COUNT(*) OVER (ORDER BY age DESC RANGE 10 PRECEDING) FROM ... |
considered window |
|
1 |
75 |
1 |
[85, 75] |
ages between 75 + 10 and 75 - 0 |
2 |
68 |
2 |
[78, 68] |
|
3 |
68 |
2 |
[72, 62] |
|
4 |
56 |
2 |
[66, 56] |
|
5 |
55 |
3 |
[65, 55] |
|
6 |
51 |
3 |
[61, 51] |
|
7 |
48 |
5 |
[58, 48] |
|
8 |
48 |
5 |
[58, 48] |
|
9 |
47 |
6 |
[57, 47] |
|
10 |
43 |
5 |
[53, 43] |
|
Affected Queries
As mentioned above, the new RANGE specifier was introduced to ensure compliance with the SQL standard. There are however only very few cases where existing queries should be affected by those changes. Only queries specifying a window function with
If the order-by expressions are unique, than for each row there are no peers with respect to the ordering and the implicitly added window frame clause RANGE UNBOUNDED PRECEDING is equivalent to ROWS UNBOUNDED PRECEDING and hence the query results would be the same as before. If however the order-by specification is not unique then there may be rows having one or more peer rows so that the query result differs from the result in former RAQL releases and the user might want to explicitly add the window frame clause ROWS UNBOUNDED PRECEDING to yield the former results.