Normalizing Data for Joins, Grouping or Filtering
In many cases, the results from different mashables may contain similar data that is not expressed in identical forms. This makes the comparisons used in joins, grouping or filtering difficult or impossible until you normalize the data to a single representation.
When you need to normalize data for comparisons, the best method is to create a custom XPath function that you add to the Presto Server. This allows you to:
Use the custom function directly in the XPath expression that is comparing the disparate data. Thus the comparison is handled properly, but mashable results are not altered.
Reuse data normalization logic in any mashup that you publish to the Presto Server where the custom function is deployed.
Note: If the data normalization logic is specific to one mashup and you have no need to reuse it, you can also normalize data using scripting and the <script> statement.
See Defining Custom XPath Functions for complete instructions on how to write custom XPath functions and deploy them in the Presto Server. Once your custom function is deployed, you declare a namespace for the function in any mashup script or macro and use the function in the appropriate mashup statement.
Example Custom XPath Function for Data Normalization
This example shows a very simple data normalization function and the mashup script that uses the custom function to join the results of two mashables. The example joins mortgage rate information from two web sites. One site refers to the APR, but the second uses custom terms specific to its mortgage products to refer to rates. To combine the results for comparison, the custom terminology must be normalized.
First, you create the custom XPath function logic in a Java class that extends org.oma.emml.client.EMMLUserFunction. This class looks something like this example:
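package com.mycompany.mashups;

import java.util.HashSet;
import java.util.Set;
import org.oma.emml.client.EMMLUserFunction;
...
public class MyFinanceFunctions extends EMMLUserFunction {
  // Site-specific product names that should be treated as a 5-Year ARM.
  static Set<String> mortgageAliases = new HashSet<String>();
  static { mortgageAliases.add("5/1 Orange Mortgage"); }

  // Returns the normalized name for a known alias; otherwise returns the input unchanged.
  public static String mortgage(String data) {
    if (mortgageAliases.contains(data))
      return "5-Year ARM";
    return data;
  }
}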
Then compile and deploy the class to the Presto Server that will host mashups that need to use this function. See Defining Custom XPath Functions for instructions.
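It can be useful to check the normalization logic on its own before deploying it. The following is a minimal standalone sketch, not the Presto Server API: it copies the mortgage logic from the example class into a hypothetical class named MortgageJoinSketch, leaves out the EMMLUserFunction base class so that it runs without the server libraries, and imitates a join with a hypothetical helper, joinOnNormalizedName. The sample product names other than "5/1 Orange Mortgage" and "5-Year ARM" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical standalone class; it does not extend EMMLUserFunction.
public class MortgageJoinSketch {
    // Alias set copied from the example class.
    private static final Set<String> MORTGAGE_ALIASES =
            new HashSet<String>(Arrays.asList("5/1 Orange Mortgage"));

    // Same logic as the example: map a known alias to the standard name.
    public static String mortgage(String data) {
        return MORTGAGE_ALIASES.contains(data) ? "5-Year ARM" : data;
    }

    // Hypothetical helper: pairs names from the two sites whose
    // normalized forms are equal, imitating a join condition.
    public static List<String> joinOnNormalizedName(List<String> siteA, List<String> siteB) {
        List<String> matches = new ArrayList<String>();
        for (String a : siteA) {
            for (String b : siteB) {
                if (mortgage(a).equals(mortgage(b))) {
                    matches.add(a + " = " + b);
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample data for the two sites.
        List<String> siteA = Arrays.asList("5-Year ARM", "30-Year Fixed");
        List<String> siteB = Arrays.asList("5/1 Orange Mortgage", "15-Year Fixed");
        System.out.println(joinOnNormalizedName(siteA, siteB));
    }
}
```

Comparing the raw names would match nothing in this sample data, while comparing the normalized names pairs the one product that both sites offer. Note that the lookup is an exact, case-sensitive match, so variants in spelling or case would need their own entries in the alias set.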