Handling HTML Responses
Document type variables in EMML typically contain well-formed XML. Responses from mashables or from direct web clipping, however, can be HTML.
With HTML responses, the MashZone NextGen Server converts the response to well-formed XHTML to ensure access to the data. Two issues can arise during parsing:
* Parsing errors for some responses. You can use the subtype attribute to handle parsing errors. See Subtype for Parsing Errors for an example.
* Differences in the XHTML result for mashups from 3.7 or earlier.
  The parser that MashZone NextGen uses for conversion to XHTML changed in version 3.8. This can cause errors in mashups created in version 3.7 or earlier.
  You can control which parser is used in the mashup using the htmlparser attribute. See JTidy for Backwards Compatible Parsing for an example.
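To make the conversion concrete, here is a rough illustration of a loose HTML fragment and a well-formed XHTML equivalent. Both the input fragment and the converted markup are hypothetical. The exact result depends on which parser is used, so treat this as a sketch rather than the server's actual output.

```xml
<!-- Hypothetical HTML fragment as a site might return it (not well-formed XML):
     <P>Results<BR><A HREF=/page?id=1 CLASS=hit>First</A>
-->
<!-- Roughly what a well-formed XHTML equivalent looks like. The exact
     structure depends on the parser and is an assumption here. -->
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>Results<br/><a href="/page?id=1" class="hit">First</a></p>
  </body>
</html>
```

The examples on this page declare xmlns:xhtml="http://www.w3.org/1999/xhtml" and address the converted elements with that prefix, for instance xhtml:a. This suggests that XPath expressions against a converted response need a prefix bound to the XHTML namespace.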
Subtype for Parsing Errors
To prevent parsing errors for HTML responses, use subtype="HTML" on <output> or <variable>. For example:
<mashup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:xhtml="http://www.w3.org/1999/xhtml"
        xsi:schemaLocation="http://www.openmashup.org/schemas/v1.0/EMML/
                            ../schemas/EMMLSpec.xsd"
        xmlns="http://www.openmashup.org/schemas/v1.0/EMML"
        name="GoogleWebClipping">

  <input name="query" type="string" default="ruby"/>
  <output name="result" type="document"/>
  <!-- subtype="HTML" marks this variable as holding an HTML response -->
  <variable name="searchresult1" type="document" subtype="HTML"/>

  <!-- Fetch the search page; the response is HTML, not XML -->
  <directinvoke outputvariable="searchresult1"
                endpoint="http://www.google.com/search?q={$query}"/>

  <!-- Collect the href of every link that has a class attribute -->
  <foreach variable="link" items="$searchresult1//xhtml:a[@class]">
    <appendresult outputvariable="result">
      <itemlink>{$link/@href}</itemlink>
    </appendresult>
  </foreach>
</mashup>
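The subtype attribute can also be set on <output> instead of on a <variable>. The following is a minimal sketch of that variant, assuming the HTML response is written directly to the mashup result. The mashup name HtmlOutputSketch is hypothetical, and the endpoint is reused from the example above.

```xml
<mashup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.openmashup.org/schemas/v1.0/EMML/
                            ../schemas/EMMLSpec.xsd"
        xmlns="http://www.openmashup.org/schemas/v1.0/EMML"
        name="HtmlOutputSketch">

  <input name="query" type="string" default="ruby"/>
  <!-- subtype="HTML" is set on the output itself rather than on a variable -->
  <output name="result" type="document" subtype="HTML"/>

  <!-- Assumption: the HTML response is written directly to the mashup result -->
  <directinvoke outputvariable="result"
                endpoint="http://www.google.com/search?q={$query}"/>
</mashup>
```

This form may suit mashups that return the converted page as is, without filtering it through an intermediate variable first.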
JTidy for Backwards Compatible Parsing
In versions 3.7 and earlier, MashZone NextGen used JTidy to parse HTML responses and convert them to well-formed XHTML. For versions 3.8 and later, MashZone NextGen uses TagSoup for parsing and converting HTML responses.
This change of parser may cause parsing errors or other problems in mashups created in version 3.7 or earlier. To preserve backwards compatibility, you can select JTidy by setting htmlparser="jtidy". For example:
<mashup xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns:xhtml="http://www.w3.org/1999/xhtml"
        xsi:schemaLocation="http://www.openmashup.org/schemas/v1.0/EMML/
                            ../schemas/EMMLSpec.xsd"
        xmlns="http://www.openmashup.org/schemas/v1.0/EMML"
        name="JTidyParsing">

  <input name="query" type="string" default="scala"/>
  <output name="result" type="document"/>

  <!-- htmlparser="jtidy" selects the parser used in 3.7 and earlier -->
  <directinvoke outputvariable="searchresult" htmlparser="jtidy"
                endpoint="http://www.google.com/search?q={$query}"/>

  <!-- Collect the href of each link inside an h3 with class "r" -->
  <foreach variable="link" items="$searchresult//xhtml:h3[@class='r']/xhtml:a">
    <appendresult outputvariable="result">
      <itemlink>{$link/@href}</itemlink>
    </appendresult>
  </foreach>
</mashup>
Copyright © 2013-2016 Software AG, Darmstadt, Germany.
