Supported Data Formats for RAQL

RAQL can perform queries on datasets in the following formats:

CSV: The first line of a CSV dataset must identify the column names for the data.

RAQL replaces each whitespace character in a column name with an underscore (_). For example, "First Name" becomes "First_Name".

Column names that are numeric have "column_" prepended to the name. For example, "2010" becomes "column_2010".
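The two renaming rules above can be sketched in Java as follows. This is an illustration only, not RAQL's actual implementation: the class and method names are hypothetical, and the sketch assumes that "numeric" means a name made up entirely of digits and that each whitespace character is replaced individually.

```java
final class ColumnNames {
    // Hypothetical helper that mimics the renaming rules described above.
    static String normalize(String header) {
        // Replace each whitespace character with an underscore.
        String name = header.replaceAll("\\s", "_");
        // Assumption: a "numeric" name consists only of digits.
        if (name.matches("\\d+")) {
            name = "column_" + name;
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(normalize("First Name")); // First_Name
        System.out.println(normalize("2010"));       // column_2010
    }
}
```

Queries against the dataset would then refer to the renamed columns (First_Name, column_2010) rather than the original headers.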

JDBC Result Sets: returned from databases when SQL queries or stored procedures are invoked.

XML: data must be well-formed. In addition, RAQL has the following limitations for XML data:

The structure of the XML should be flat, with a single set of repeating nodes (the rows) that contain a single level of elements (the columns) with simple content (text only). Data in any nodes that are ancestors of the repeating 'rows' is not accessible.

Data in attributes may not be accessible in some situations.
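To make the flat structure concrete, the sketch below parses a small XML document of that shape with the JDK's DOM parser and collects each repeating node as a row of column/value pairs. It does not show how RAQL itself reads XML. The element names and values are illustrative, and the class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

final class FlatXmlRows {
    // Illustrative data: one set of repeating <record> nodes (the rows),
    // each holding a single level of text-only elements (the columns).
    static final String SAMPLE =
        "<records>"
      + "<record><itemId>N2390</itemId><price>145.2</price></record>"
      + "<record><itemId>G88</itemId><price>16.95</price></record>"
      + "</records>";

    // Collects every element named rowTag as an ordered column -> text map.
    static List<Map<String, String>> readRows(String xml, String rowTag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList rowNodes = doc.getElementsByTagName(rowTag);
            for (int i = 0; i < rowNodes.getLength(); i++) {
                Map<String, String> row = new LinkedHashMap<>();
                NodeList cols = rowNodes.item(i).getChildNodes();
                for (int j = 0; j < cols.getLength(); j++) {
                    Node col = cols.item(j);
                    // Skip whitespace text nodes between column elements.
                    if (col.getNodeType() == Node.ELEMENT_NODE) {
                        row.put(col.getNodeName(), col.getTextContent());
                    }
                }
                rows.add(row);
            }
            return rows;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("XML is not well-formed or unreadable", e);
        }
    }

    public static void main(String[] args) {
        // Prints [{itemId=N2390, price=145.2}, {itemId=G88, price=16.95}]
        System.out.println(readRows(SAMPLE, "record"));
    }
}
```

Anything outside this shape, such as values held on the enclosing records element, has no place in the row/column view, which matches the limitation on ancestor nodes described above.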

JSON: data must be well-formed. The structure of the JSON should be flat, with a single array of objects (the rows) that contain name/value pairs (the columns) with simple content (number, string, boolean). Data in any objects that are ancestors of the repeating 'rows' is not accessible. For example:

{
  "records": {
    "record": [
      {
        "itemId": "N2390",
        "price": 145.2,
        ...
      },
      {
        "itemId": "G88",
        "price": 16.95,
        ...
      },
      ...
    ]
  }
}

Assuming that the above JSON data is available in file sales.json, the following EMML sample executes RAQL on it:

Java Objects: loaded in In-Memory Stores by external systems. Java objects must:

Be plain Java objects or beans with properties for each column of data in the dataset.

Be serializable. This is required when In-Memory Stores use both local memory for the MashZone NextGen Server and memory from additional BigMemory hosts. See In-Memory Dataset Management for more information.

Have search attributes defined in the configuration for the declared In-Memory Store where they will be stored. Search attributes provide the extraction class and other information that maps Java object properties to dataset columns and allows RAQL to access and work with the data.
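A minimal sketch of a Java object that meets the first two requirements might look like the following. The class name and its two properties are hypothetical, borrowed from the sales data used elsewhere in this topic. The search attribute configuration is not shown because it belongs to the In-Memory Store declaration rather than to the class. The round-trip helper is only there to demonstrate that the object can be serialized.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical bean: one property per dataset column.
final class SalesRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    private String itemId;
    private double price;

    public SalesRecord() { }

    public SalesRecord(String itemId, double price) {
        this.itemId = itemId;
        this.price = price;
    }

    public String getItemId() { return itemId; }
    public void setItemId(String itemId) { this.itemId = itemId; }
    public double getPrice() { return price; }
    public void setPrice(double price) { this.price = price; }

    // Serializes the record to bytes and reads it back, as a quick
    // check that the class really is serializable.
    static SalesRecord roundTrip(SalesRecord record) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(record);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SalesRecord) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        SalesRecord copy = roundTrip(new SalesRecord("N2390", 145.2));
        System.out.println(copy.getItemId() + " " + copy.getPrice()); // N2390 145.2
    }
}
```

Every property here is itself serializable (a String and a primitive). A bean with a non-serializable field would fail the round trip unless that field were marked transient.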