Window functions over RANGEs

In RAQL SELECT aggregationFunction(x) OVER(ORDER BY y) FROM s is semantically equivalent to SELECT aggregationFunction (x) OVER(ORDER BY y ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM s . The SQL Standard however defines the semantics to correspond with SELECT aggregationFunction (x) OVER(ORDER BY y RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM s which means that not only the current row but all "peer" rows (rows identical to the current row with respect to the ordering) are contained in the window. That means that a rolling aggregate without an explicit window frame specification currently yields different results in RAQL and any SQL standard compliant DBMS.

In order to ensure a high level of compliance with the SQL standard, window frames are also allowed to be specified using the RANGE key word.

Semantics

Window frames defined using the ROWS keyword are easy to understand. The window frame clause ROWS BETWEEN x PRECEDING AND y FOLLOWING simply defines two offsets x and y that for each row in the input partition define the first and the last row number of the corresponding window relative to the current row number. PRECEDING and FOLLOWING only indicate whether the offset is negative (the row number precedes the current row number) or positive respectively.

Table 2. Example

rowNumber

ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING

ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING

ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING

1

Window = [1,4]

Window = []

Window = [1,10]

2

Window = [1,5]

Window = [1,1]

Window = [2,10]

3

Window = [1,6]

Window = [1,2]

Window = [3,10]

4

Window = [1,7] (from row number 4-3 to 4+3)

Window = [1,3]

Window = [4,10]

5

Window = [2,8]

Window = [2,4]

Window = [5,10]

6

Window = [3,9]

Window = [3,5]

Window = [6,10]

7

Window = [4,10]

Window = [4,6]

Window = [7,10]

8

Window = [5,10]

Window = [5,7]

Window = [8,10]

9

Window = [6,10]

Window = [6,8]

Window = [9,10]

10

Window = [7,10]

Window = [7,9]

Window = [10,10]

On the other hand the semantics of window frame specifications using the RANGE keyword does not depend on the physical row numbers but on the actual values of a column. Still, the window frame clause defines two offsets x and y, but these offsets are not added to the row number but to an actual column value. Hence, any window frame specification involving RANGE and an offset other than UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING or CURRENT ROW requires the window to be ordered on a single column and this column needs to have a numeric data type (in order to allow for offset arithmetics).

Table 3. Example

rowNumber

salary

SELECT avg(salary) OVER (ORDER BY salary RANGE BETWEEN 300 PRECEDING AND 300 FOLLOWING) FROM ...

considered window

 

1

1750

1850

[1450,2050]

salaries between 1750 - 300 and 1750 + 300

2

1900

1850

[1600,2200]

 

3

1900

1850

[1600,2200]

 

4

2200

2100

[1900,2500]

 

5

2400

2300

[2100,2700]

 

6

2750

2866,67

[2450,3050]

 

7

2900

2980

[2600,3200]

 

8

2950

2980

[2650,3250]

 

9

3100

2980

[2800,3400]

 

10

3200

2980

[2900,3500]

 

Table 4. Example

rowNumber

salary

SELECT avg(salary) OVER (ORDER BY salary RANGE BETWEEN 300 PRECEDING AND 300 FOLLOWING) FROM ...

considered window

 

1

1750

1

[1750,1750]

salaries between 1750 - 0 and 1750 + 0

2

1900

2

[1900,1900]

 

3

1900

2

[1900,1900]

 

4

2200

1

[2200,2100]

 

5

2400

1

[2400,2400]

 

6

2750

1

[2750,2750]

 

7

2900

1

[2900,2900]

 

8

2950

1

[2950,2950]

 

9

3100

1

[3100,3100]

 

10

3200

1

[3200,3200]

 

Note: The offsets in a RANGE frame clause define logical offsets in the ordered sequence defined by the order by clause. If the order by clause specifies a descending order (see next example) the offset 10 PRECEDING translates to "a value that precedes the current value in the current sequence and differs at most by 10 (years/units/...)"

Table 5. Example

rowNumber

age

SELECT COUNT(*) OVER (ORDER BY age DESC RANGE 10 PRECEDING) FROM ...

considered window

 

1

75

1

[85, 75]

ages between 75 + 10 and 75 - 0

2

68

2

[78, 68]

 

3

68

2

[72, 62]

 

4

56

2

[66, 56]

 

5

55

3

[65, 55]

 

6

51

3

[61, 51]

 

7

48

5

[58, 48]

 

8

48

5

[58, 48]

 

9

47

6

[57, 47]

 

10

43

5

[53, 43]

 

Affected Queries

As mentioned above, the new RANGE specifier was introduced to ensure compliance with the SQL standard. There are however only very few cases where existing queries should be affected by those changes. Only queries specifying a window function with

If the order-by expressions are unique, than for each row there are no peers with respect to the ordering and the implicitly added window frame clause RANGE UNBOUNDED PRECEDING is equivalent to ROWS UNBOUNDED PRECEDING and hence the query results would be the same as before. If however the order-by specification is not unique then there may be rows having one or more peer rows so that the query result differs from the result in former RAQL releases and the user might want to explicitly add the window frame clause ROWS UNBOUNDED PRECEDING to yield the former results.