Append Query Results Repeatedly
In cases where a source dataset is regularly updated, updates can be appended to the existing data in the In-Memory Store.
The following example mimics time-based updates by appending rows to an existing
In-Memory Store using the
EMML <foreach> statement. This statement loops through the Manufacturing Plants dataset (introduced in
Getting Started with
MashZone NextGen
Analytics) and selects rows for a given country. These rows are then appended to an existing
In-Memory Store named
storeAppendPlants.
The example then pauses to ensure that the rows appended are distinct based on their timestamp, as shown in this example of the results:
Some points to keep in mind in this example:
The first loop of this mashup creates the
In-Memory Store the first time the mashup is run.
The JavaScript function in this example is used solely for effect, to ensure distinctly different timestamps for each append to the
In-Memory Store.
The complete EMML code for this example is shown here.
Note: | Be aware that this mashup will take over 50 seconds to run because of the deliberate pauses. |
<mashup xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xsi:schemaLocation='http://www.openmashup.org/schemas/v1.0/EMML/../schemas/EMMLSpec.xsd'
xmlns='http://www.openmashup.org/schemas/v1.0/EMML'
xmlns:macro='http://www.openmashup.org/schemas/v1.0/EMMLMacro'
name='storeAppend'>
<output name='result' type='document' />
<variable name="plants" type="document" subType="csv"/>
<!-- Countries to interate over -->
<variable name="countries" type="document">
<countries>
<country>BELGIUM</country>
<country>CANADA</country>
<country>FRANCE</country>
<country>INDIA</country>
<country>ITALY</country>
<country>JAPAN</country>
<country>NETHERLANDS</country>
<country>SPAIN</country>
<country>SWEDEN</country>
<country>UNITED KINGDOM</country>
</countries>
</variable>
<variable name="searchFor" type="string" />
<!-- Loop for each country, retrieve plants for country and append to
in-memory store, pause 5 seconds so timestamps are distinct -->
<foreach variable='location' items='$countries/countries/country' >
<variable name="found" type="document" stream="true"/>
<assign fromexpr="$location/text()" outputvariable="$searchFor"/>
<directinvoke method='GET' stream='true' outputvariable='plants'
endpoint='http://mdc.jackbe.com/prestodocs/data/mfgplants.csv' />
<raql stream="true" outputvariable='found'>
select Country, Name, Active_Production_Lines,
Production_Lines_Under_Construction from plants
where Country = '{$location}'
</raql>
<storeto cache='storeAppendPlants' key='#unique' variable='found' />
<script type='text/javascript' outputvariable="result">
<![CDATA[
function pause(millis){
var date = new Date();
var curDate = null;
do { curDate = new Date(); }
while(curDate-date < millis);
}
pause(5000);
]]>
</script>
</foreach>
</mashup>
About Row Timestamps for Stored Datasets
In addition to keys, RAQL adds a timestamp to each row of a dataset for the date and time that row was added to the In-Memory Store. In some cases, datasets already have a column with a timestamp related to the original source of the data.
Because of potential name conflicts, the actual column name for the In-Memory Store timestamp, varies based on the dataset in question and the storage mode (append or overwrite):
If Existing Dataset | And Store Mode is | In-Memory Store Timestamp Column |
Has no column named timestamp. | Overwrite or append. | timestamp |
Has a column named timestamp. | Append rows to an existing dataset key. | timestamp In this case, the timestamp for the In-Memory Store will overwrite the original source timestamp, unless you provide an alias for the original column. |
Overwrite existing dataset. | _timestamp |