Partitions and Windows
The
Group and Analyze Rows with Row Detail example in Getting Started introduced a simple partition using the Over clause to perform some analysis and add these results to each row in the dataset. This topic includes examples of:
Multi-Level Partitions
The column expression you use in a partition definition for Over clauses can use a list of multiple columns to define multi-level partitions, just like column expressions for Group By clauses. The following example segments US legislators into partitions for each chamber of Congress, state and political party to determine the number of legislators for each party in each state and legislative chamber:
Windows as Relative Subsets Within Partitions
Windows define subsets of rows within a partition that are relative to the current row based on row position. You define windows by adding rows between to the partition definition, such as:
This example is centered where the number of preceding and following rows is equal. Windows can be asymmetric, using different numbers of preceding and following rows or omitting either. Analytic functions are applied to just those rows within the window based on the current row and the result is added to the current row.
Windows are useful for time-based datasets, although they are not limited to this. With time-based datasets, each row represents a different slice of data for a specific time. The following example uses windows with a time-based dataset to calculate moving averages:
The moving average is calculated over the rows in the window relative to the current row so each row potentially has a different moving average. Moving averages typically show a smoother trend for the column.
Windows to Show Selected Sibling Values
With window analytic functions, such as lag() or firstvalue(), you can select values for specific siblings for each row using window definitions. These functions return the column value for a sibling as shown in this example:
The lag() function selects the column value of the preceding sibling, while first_value() selects the column value of the first preceding sibling based on the partition or window definition. In this example, Prev is added to each row, using lag() to get the last sensor reading for the current row. First is also added to each row and gets the value 5 seconds previous, based on the window size (10 preceding rows) and readings every half second.
Here is an example of the results of this query: