The RAQL Query Syntax
Once you have the dataset loaded, you can use RAQL queries to analyze and manipulate the data. RAQL is very similar to the Structured Query Language (SQL) used to access data in a database. It is:
Case-insensitive. WHERE, where and WhEre are all valid.
A focused subset of SQL features. Some SQL clauses and syntax are not found in RAQL.
Designed to handle streaming data.
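Like SQL, a RAQL query is composed of clauses. The clauses described below are Select, From, Where, Over, Group By, Having, Order By and Limit, along with set operation clauses such as Join.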
We will explore a simple use of most of these query clauses in this topic. See RAQL Queries for a synopsis of the valid expressions for each of these clauses, along with links to other examples.
As with SQL, the Select and From clauses are required. All other clauses are optional.
The Select clause determines which columns to include in the result and can also perform analysis when it is used with either the Over clause or the Group By clause. The From clause determines which dataset to query, or can define a subquery to use as the source of data.
Set operation clauses allow the query to retrieve an additional dataset, using another query, and then join, combine or filter these datasets to derive a more complex dataset. The Join set operation, for example, matches rows in both datasets based on a condition and adds columns from both datasets to the joined row.
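The matching behavior of a join can be sketched with standard SQL run through Python's built-in sqlite3 module. This is an analogy only: the query is SQL rather than RAQL, so the RAQL syntax may differ, and the customers and orders tables and their columns are hypothetical.

```python
import sqlite3

# In-memory database holding two small, made-up datasets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (3, 7.5);
""")

# The join condition pairs each order with the customer that has the same id.
# Each joined row carries columns from both datasets.
joined = conn.execute("""
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.id
    ORDER BY o.amount
""").fetchall()

print(joined)
```

Only the rows that satisfy the condition appear in the result: the order for customer 3 and the customer with no orders are both dropped.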
The Where clause filters rows from the dataset, Order By sorts the result rows, and Limit determines the maximum number of result rows that the query can return.
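As a rough illustration of how these clauses combine, the following sketch runs a standard SQL query through Python's sqlite3 module. It is SQL, not RAQL, so treat it as an analogy; the stocks table and its columns are hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stocks (symbol TEXT, price REAL);
    INSERT INTO stocks VALUES
        ('AAA', 12.0), ('BBB', 48.5), ('CCC', 30.0), ('DDD', 75.25);
""")

# SELECT picks the columns, FROM names the dataset, WHERE filters rows,
# ORDER BY sorts what is left, and LIMIT caps the number of rows returned.
top = conn.execute("""
    SELECT symbol, price
    FROM stocks
    WHERE price > 20
    ORDER BY price DESC
    LIMIT 2
""").fetchall()

print(top)
```

Three rows pass the filter, but the limit keeps only the two with the highest price.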
The Over and Group By clauses both group dataset rows into different sets based on an expression. These groups determine the scope of rows that are used in analytic functions in the Select clause. The Having clause filters the set of groups that are returned in a Group By clause.
Over and Group By are mutually exclusive because they have different effects on the data returned by the query. The Over clause performs calculations and adds the results as additional columns to each row. Group By instead performs calculations and returns just the results for each group.
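The difference in result shape can be sketched with standard SQL in Python's sqlite3 module (window functions need SQLite 3.25 or later). This is SQL rather than RAQL, so the exact RAQL syntax may differ; the sales table and its columns are hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 10), ('east', 30), ('west', 5);
""")

# OVER keeps every input row and appends the per-group total as a new column.
over_rows = conn.execute("""
    SELECT region, amount,
           SUM(amount) OVER (PARTITION BY region) AS region_total
    FROM sales
    ORDER BY region, amount
""").fetchall()

# GROUP BY collapses each group into a single row holding only the total.
group_rows = conn.execute("""
    SELECT region, SUM(amount) AS region_total
    FROM sales
    GROUP BY region
    ORDER BY region
""").fetchall()

# HAVING then filters the groups themselves, not the individual rows.
having_rows = conn.execute("""
    SELECT region, SUM(amount) AS region_total
    FROM sales
    GROUP BY region
    HAVING SUM(amount) > 20
""").fetchall()

print(over_rows, group_rows, having_rows)
```

The Over query returns three rows, one per input row, while the Group By query returns two, one per region.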
Most RAQL clauses also support the use of functions within their expressions. RAQL functions come in two varieties:
Plain functions, which perform a simple transformation on the values of a column for each row, such as upper() to change text to upper case.
Analytic functions, more commonly known as aggregate or window functions, which perform calculations using multiple rows in a group, partition or window defined in the Over or Group By clauses.
Aggregate analytic functions, such as sum(), use all rows in the current scope, while window analytic functions, such as rownumber(), use specific rows. These functions include simple arithmetic as well as statistical functions, machine learning functions and other analysis algorithms.
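The three kinds of function can be compared side by side in a standard SQL query, run here through Python's sqlite3 module (SQLite 3.25 or later). This is a SQL analogy, not RAQL: SQLite spells the window function ROW_NUMBER(), where the text above names rownumber(), and the readings table is hypothetical.

```python
import sqlite3

# In-memory database with a small, made-up dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (sensor TEXT, value INTEGER);
    INSERT INTO readings VALUES ('a', 3), ('a', 7), ('b', 4);
""")

rows = conn.execute("""
    SELECT UPPER(sensor) AS sensor_uc,            -- plain: one row in, one value out
           value,
           SUM(value) OVER (PARTITION BY sensor)  -- aggregate: all rows in the partition
               AS sensor_sum,
           ROW_NUMBER() OVER (PARTITION BY sensor ORDER BY value)
               AS row_num                         -- window: depends on row position
    FROM readings
    ORDER BY sensor, value
""").fetchall()

print(rows)
```

The sum is the same for every row in a partition, whereas the row number changes from row to row within it.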
RAQL provides a set of built-in functions (plain and analytic) as well as a way for you to define your own functions. See Built-In RAQL Functions for more information.