Data Access as Streaming Datasets
Datasets loaded with the <loadfrom> extension statement always use streaming, which makes some data available before all results have been received. Streaming also prevents the creation of a document object model (DOM). For examples, see Group and Analyze Rows and Group and Analyze Rows with Row Detail in Getting Started.
Datasets loaded with <directinvoke>, <sql>, or <variable> statements also use streaming if stream = "true". For examples, see A Basic RAQL Query and Getting Started with MashZone NextGen Analytics in Getting Started, and Load Data with <sql>.
Query results from a <raql> extension statement can also use streaming to populate the output variable if stream = "true". For an example, see Use an In-Memory Store to Store and Load Datasets for MashZone NextGen Analytics in Getting Started. Storing datasets to the In-Memory Store with <storeto> also always uses streaming.
There are two critical differences to keep in mind when accessing streamed data in a mashup:
Streamed data requires RAQL to access the data. Since no DOM exists, the data is not accessible in other EMML statements using XPath.
The scope of mashup variables that hold streamed data is limited to one EMML extension statement. Data is streamed to the receiving statement and then discarded.
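The single-statement scope can be pictured with a small Python analogy. This is not EMML or RAQL: a generator stands in for a streamed dataset, and the name `stream_rows` and the sample rows are hypothetical. It is only a sketch, assuming a stream behaves like a one-pass iterator, of why the data is no longer available to a second consumer.

```python
from typing import Iterator

def stream_rows() -> Iterator[dict]:
    """Hypothetical stand-in for a streamed dataset: rows arrive one at a time."""
    yield {"symbol": "AAA", "price": 10.0}
    yield {"symbol": "BBB", "price": 20.0}
    yield {"symbol": "CCC", "price": 30.0}

stream = stream_rows()

# The first consumer (the "receiving statement") can use a row as soon as it
# arrives, before the rest of the results exist.
first_row = next(stream)

# It then reads the remaining rows, which uses the stream up.
remaining_total = sum(row["price"] for row in stream)

# A second consumer finds nothing left: the rows were discarded once read.
leftover = list(stream)
```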
There are a few ways to handle streamed access when a mashup needs to process a stream in several statements:
Save the streamed dataset as a document-type variable with a DOM and use the DOM with EMML statements.
Load the dataset as a stream multiple times in a mashup.
Split each processing step for the dataset stream into its own mashup and call these mashups from another mashup.
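These three approaches can be compared in a Python analogy. Again, this is not EMML: a generator plays the role of the streamed dataset, a list plays the role of a document-type variable with a DOM, and each function plays the role of a separate mashup. All names here (`load_stream`, `count_rows`, `total_price`, `combined`) are hypothetical.

```python
from typing import Iterator

def load_stream() -> Iterator[dict]:
    """Hypothetical loader: each call starts a fresh one-pass stream."""
    yield {"symbol": "AAA", "price": 10.0}
    yield {"symbol": "BBB", "price": 20.0}

# Approach 1: materialize the stream once (the analog of a document-type
# variable), then reuse the in-memory copy in as many steps as needed.
materialized = list(load_stream())
count_from_copy = len(materialized)
total_from_copy = sum(row["price"] for row in materialized)

# Approach 2: load the dataset as a stream again for each step.
count_from_reload = sum(1 for _ in load_stream())
total_from_reload = sum(row["price"] for row in load_stream())

# Approach 3: give each processing step its own unit (the analog of a separate
# mashup) that loads its own stream, and call those units from another one.
def count_rows() -> int:
    return sum(1 for _ in load_stream())

def total_price() -> float:
    return sum(row["price"] for row in load_stream())

def combined() -> dict:
    return {"count": count_rows(), "total": total_price()}
```

In this analogy, materializing holds the whole dataset in memory for the sake of reuse, while reloading keeps only one row in memory at a time but reads the source once per step.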