ino:explain

Retrieve information about query execution for analysis and optimization.


Syntax

ino:explain (query[, level])

Description

This function provides information about the execution plan of a given query. As it is a Tamino-internal function, it has the namespace prefix ino. It takes as argument any valid query expression (OrExpr) and an optional level of explanation, which can be one of the values "path" and "tree". If level is omitted, then basic information about the processing steps involved is provided. It returns information about the execution plan of the query inside the regular <xql:result> node of the standard <ino:response>. This information is wrapped up in a new element <ino:explanation>.
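
The call forms described above can be sketched on the client side as follows. This is a minimal sketch, not part of Tamino: the helper names `explain_expr` and `build_explain_url` are hypothetical, the base URL is a placeholder, and it assumes that the level is written as a quoted string literal and that the query is submitted in the `_xql` request parameter.

```python
from urllib.parse import quote


def explain_expr(query, level=None):
    """Wrap an X-Query expression in a call to ino:explain."""
    if level is None:
        return f"ino:explain({query})"
    if level not in ("path", "tree"):
        raise ValueError('level must be "path" or "tree"')
    # Assumption: the level is passed as a quoted string literal.
    return f'ino:explain({query}, "{level}")'


def build_explain_url(base_url, query, level=None):
    """Build a request URL; assumes the query goes into the _xql parameter."""
    return f"{base_url}?_xql={quote(explain_expr(query, level), safe='')}"


BASE = "http://localhost/tamino/mydb/Patient"  # hypothetical placeholder
print(explain_expr('patient[.//surname ~= "Atkins"]', "tree"))
print(build_explain_url(BASE, 'patient[.//surname ~= "Atkins"]'))
```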

The execution time of a query depends on the number and kind of processes that are needed to resolve the query. A query is processed in Tamino as follows:

Query Parser
It takes as input the query string and parses it. If it does not conform to the syntax rules of X-Query then an error message will be returned indicating the type of error. If the query can be successfully parsed, it delivers an input tree for the optimizer.
Query Optimizer
The optimizer tries to optimize queries on the level of X-Query by applying a number of transformations and using the information from the corresponding schema. Where applicable, it performs the following kinds of transformation, among others:
- descendants expansion: abbreviated relative or absolute location paths are expanded into a disjunction of unabbreviated paths
- wildcard expansion: if you use * in a NameTest, then it will be expanded into a disjunction of all matched nodes. (e.g. insurance/* would be expanded into insurance/company or insurance/policynumber)
- path evaluation in predicate expressions: reformulate path references such as a/b/c[../d] into a/b[d]/c
- not() replacement: applying De Morgan's rules, the optimizer tries to replace expressions that call the not() function (e.g. an expression such as not(surname = "Atkins") would be transformed into surname != "Atkins")
Example: You use an abbreviated location path in your query such as in patient[.//surname ~= 'Atkins']. From the schema, the optimizer detects that the element surname can occur at six different positions in a document tree: patient/name/surname, patient/nextofkin/name/surname, and doctor/name/surname, where doctor can appear under patient/submitted, result/discharged, result/transferred, and result/deceased. The optimizer then transforms the filter expression into the following disjunction:
```
patient[./name/surname ~= 'Atkins' or
        ./nextofkin/name/surname ~= 'Atkins' or
        ./submitted/doctor/name/surname ~= 'Atkins' or
        ./result/discharged/doctor/name/surname ~= 'Atkins' or
        ./result/transferred/doctor/name/surname ~= 'Atkins' or
        ./result/deceased/doctor/name/surname ~= 'Atkins']
```
Instead of searching the complete document tree, only these nodes must be visited to determine whether the predicate expression holds. The optimizer delivers a modified tree as its result.
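The descendants expansion described above can be illustrated with a small sketch. It is not the optimizer's actual implementation: the `SCHEMA` dictionary is a hypothetical toy model covering only the positions of surname listed above, and the function names are invented for illustration.

```python
# Toy schema: element name -> child elements (hypothetical, heavily reduced).
NAME = {"name": {"surname": {}}}
DOCTOR = {"doctor": NAME}
SCHEMA = {
    "patient": {
        **NAME,
        "nextofkin": NAME,
        "submitted": DOCTOR,
        "result": {
            "discharged": DOCTOR,
            "transferred": DOCTOR,
            "deceased": DOCTOR,
        },
        "address": {},
    }
}


def expand_descendants(children, target, prefix="."):
    """Return every unabbreviated path below prefix that ends in target."""
    paths = []
    for name, grandchildren in children.items():
        path = f"{prefix}/{name}"
        if name == target:
            paths.append(path)
        paths.extend(expand_descendants(grandchildren, target, path))
    return paths


def rewrite_filter(context, target, op, literal):
    """Rewrite context[.//target op 'literal'] into a disjunction of paths."""
    clauses = [
        f"{path} {op} '{literal}'"
        for path in expand_descendants(SCHEMA[context], target)
    ]
    return f"{context}[" + " or ".join(clauses) + "]"


print(rewrite_filter("patient", "surname", "~=", "Atkins"))
```

With this toy schema, the sketch produces the same six clauses, in the same order, as the disjunction shown above.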
Processor-specific Optimizer
This component optimizes the tree with regard to the special needs of the next processing components. For both the index processor and the postprocessor, a tree is generated that best suits their needs.
Index Processor
This component evaluates all predicates containing indexed element nodes. It is further responsible for accessing documents and schemas from the database as well as for composing the XML document that contains the query result. It is possible that the index processor creates a superset of the query result which then has to be restricted in the next step. If no further processing is necessary, then the index processor returns the result set as an instance of xql:result.
Postprocessor
Since typically not every node is indexed, there is another processing stage that evaluates expressions with non-indexed nodes. The postprocessor also restricts the result set if the index processor generated a superset by scanning a doctype or collection. In addition, it makes calls to any query functions. If invoked, it returns the complete query result.

A call to ino:explain provides information about which processing components are involved, to what degree the query can be optimized, and the workload of the index processor and the postprocessor. Depending on the selected explanation level, a different amount of information is returned inside an element called ino:explanation. This element has two attributes, ino:preselection and ino:postprocessing, which serve as flags indicating how the query is processed. Inside ino:explanation, a set of elements can appear that share the namespace prefix xop. They correspond to expressions in X-Query. For example, the element xop:matches represents the match operator '~=', and the node <xop:literal xop:value="Atkins" /> represents the literal string constant "Atkins". The sections below on the explanation levels describe the principal elements and attributes of the xop namespace and how they can be used for query analysis and optimization.

Note:
The information that is returned by a call of ino:explain() shows the internal structure of query processing and is subject to change without prior notice if this is necessary because of improvements in the underlying mechanism.

No Explanation Level

In the query result only ino:explanation appears along with its two required attributes. They mean:

ino:preselection: indicates whether the index processor can restrict the set of documents to be examined. If "TRUE", the query contains some restriction that the index processor can evaluate, so documents may be rejected without calling the postprocessor. "FALSE" indicates a full scan of the doctype or collection.
ino:postprocessing: if "TRUE" then the postprocessor will be called.

To retrieve the execution plan for a query looking for patients whose surnames contain "Atkins":

ino:explain(patient[.//surname ~= "Atkins"])

The result from the server looks like this (only showing the relevant <xql:result> node):

<xql:result>
  <ino:explanation ino:preselection="TRUE" ino:postprocessing="TRUE" />
</xql:result>
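
The two flags can be read programmatically, for example as in the following sketch. It assumes the response fragment is available as a string; the namespace URIs are placeholders, since the fragment above shows no declarations, and must be replaced by those declared in the actual response. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption, not the real Tamino URIs).
INO = "urn:example:ino"

RESPONSE = """\
<xql:result xmlns:xql="urn:example:xql" xmlns:ino="urn:example:ino">
  <ino:explanation ino:preselection="TRUE" ino:postprocessing="TRUE" />
</xql:result>"""


def read_flags(xml_text):
    """Return the two ino:explanation flags as Python booleans."""
    root = ET.fromstring(xml_text)
    node = next(root.iter(f"{{{INO}}}explanation"), None)
    if node is None:
        raise ValueError("no ino:explanation element found")
    return {
        name: node.get(f"{{{INO}}}{name}") == "TRUE"
        for name in ("preselection", "postprocessing")
    }


def describe(flags):
    """Translate the flags into the meanings given above."""
    scan = ("index processor can restrict the candidate documents"
            if flags["preselection"]
            else "full scan of the doctype or collection")
    post = ("postprocessor will be called"
            if flags["postprocessing"]
            else "postprocessor is not called")
    return f"{scan}; {post}"


print(describe(read_flags(RESPONSE)))
```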

Explanation Level "path"

This level shows the query after the optimizer run. Each step of a location path is represented by its own xop:path element. Nesting of xop:path elements means traversing the location path one step further along the child axis. Any instance of xop:path uses these attributes:

xop:name: name of the element, always present
xop:searchtype: search type of the element as defined in the schema, only present if there are no child elements
xop:maptype: mapping type of the element as defined in the schema; if none is defined, the value is "no".

Using the example from above the returned ino:explanation node contains the following series of xop:path elements (the larger middle part deleted for brevity):

<xop:path xop:name="patient" xop:maptype="native">
  <xop:path xop:name="name" xop:maptype="infofield">
    <xop:path xop:name="surname" xop:maptype="infofield" />
  </xop:path>
  <xop:path xop:name="nextofkin" xop:maptype="infofield">
    <xop:path xop:name="name" xop:maptype="infofield">
      <xop:path xop:name="surname" xop:maptype="infofield" />
    </xop:path>
  </xop:path>
  ...
  <xop:path xop:name="address" xop:maptype="infofield" />
</xop:path>

The "patient" element contains the element name which in turn contains the element surname, each of them with their schema mapping type definition in the xop:maptype attribute.
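
To get an overview of a larger result, the nested xop:path elements can be flattened into one line per location path. The following is a sketch only: it works on a shortened copy of the excerpt above, uses a placeholder URI for the xop namespace (an assumption), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

XOP = "urn:example:xop"  # placeholder namespace URI (assumption)

EXPLANATION = """\
<xop:path xmlns:xop="urn:example:xop" xop:name="patient" xop:maptype="native">
  <xop:path xop:name="name" xop:maptype="infofield">
    <xop:path xop:name="surname" xop:maptype="infofield" />
  </xop:path>
  <xop:path xop:name="address" xop:maptype="infofield" />
</xop:path>"""


def attr(node, name):
    """Read an attribute from the xop namespace."""
    return node.get(f"{{{XOP}}}{name}")


def flatten(node, prefix=""):
    """Yield (location path, maptype) for a xop:path element and its descendants."""
    name = attr(node, "name")
    path = f"{prefix}/{name}" if prefix else name
    yield path, attr(node, "maptype")
    # Each nested xop:path is one further step along the child axis.
    for child in node.findall(f"{{{XOP}}}path"):
        yield from flatten(child, path)


for path, maptype in flatten(ET.fromstring(EXPLANATION)):
    print(f"{path:25} {maptype}")
```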

Explanation Level "tree"

The structure of ino:explanation reflects the query trees that are built and modified during query processing. For each tree that is used during processing, there is a corresponding xop:querytree element. The trees are distinguished by the attribute xop:treetype as follows.

Input Tree for Optimizer
This tree represents the original query and is always included in the output of ino:explain(). For example, the query patient[.//surname ~= "Atkins"]/address is represented as follows (nested elements are indented for better readability):
```
<xop:querytree xop:treetype="input tree for optimization">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" />
  <xop:filter>
    <xop:matches>
      <xop:transparent>
        <xop:curcontext />
        <xop:descendant_elem />
        <xop:nametest xop:name="surname" />
      </xop:transparent>
      <xop:literal xop:value="Atkins" />
    </xop:matches>
  </xop:filter>
  <xop:element_children />
  <xop:nametest xop:name="address" />
</xop:querytree>
```
From the collection "Patient" (<xop:list_context xop:collection="Patient" />), those child element nodes whose name equals "patient" (<xop:nametest xop:name="patient" />) are selected that satisfy the condition set in the filter expression (xop:filter). The filter contains an expression with the match operator (xop:matches) with two operands appearing in document order. The left operand is enclosed in xop:transparent as a sequence of elements that selects starting from the context node <xop:curcontext /> any descendant elements (<xop:descendant_elem />) whose name equals "surname" (<xop:nametest xop:name="surname" />). If the value of these descendant "surname" element nodes matches the literal value "Atkins" (<xop:literal xop:value="Atkins" />), then they satisfy the filter expression and form the node set from which all child element nodes with the name "address" should be selected as result of the query.
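
The reading of the tree given above can be checked with a small sketch that renders a xop:querytree back into an X-Query-like string. It handles only the element types that occur in this example, uses a placeholder URI for the xop namespace (an assumption), and is not a complete mapping of the xop vocabulary.

```python
import xml.etree.ElementTree as ET

XOP = "urn:example:xop"  # placeholder namespace URI (assumption)

TREE = """\
<xop:querytree xmlns:xop="urn:example:xop" xop:treetype="input tree for optimization">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" />
  <xop:filter>
    <xop:matches>
      <xop:transparent>
        <xop:curcontext />
        <xop:descendant_elem />
        <xop:nametest xop:name="surname" />
      </xop:transparent>
      <xop:literal xop:value="Atkins" />
    </xop:matches>
  </xop:filter>
  <xop:element_children />
  <xop:nametest xop:name="address" />
</xop:querytree>"""

# Elements that map to a fixed piece of query text.
FIXED = {"list_context": "", "element_children": "/",
         "descendant_elem": "//", "curcontext": "."}


def render(node):
    """Render one xop element (and its children) as query text."""
    kind = node.tag.split("}", 1)[1]  # local name without namespace
    if kind in FIXED:
        return FIXED[kind]
    if kind in ("querytree", "transparent"):
        return "".join(render(child) for child in node)
    if kind == "nametest":
        return node.get(f"{{{XOP}}}name")
    if kind == "literal":
        return '"' + node.get(f"{{{XOP}}}value") + '"'
    if kind == "filter":
        return "[" + "".join(render(child) for child in node) + "]"
    if kind == "matches":
        left, right = list(node)  # two operands in document order
        return f"{render(left)} ~= {render(right)}"
    raise ValueError(f"unsupported element: {kind}")


def to_query(root):
    text = render(root)
    # The first child step is relative to the collection context.
    return text[1:] if text.startswith("/") else text


print(to_query(ET.fromstring(TREE)))
# prints: patient[.//surname ~= "Atkins"]/address
```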

Output Tree from Optimizer
This tree is not returned when the optimizer run yields an empty result. Using our previous example, the optimizer converts the original query into a query containing a disjunction as outlined above. An excerpt of the resulting query tree is shown below. Only the first two clauses of the disjunction (xop:or) are complete; the structure of the other four clauses is "folded" and indicated by an ellipsis:

<xop:querytree xop:treetype="output tree from optimization">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" xop:maptype="native" xop:key="id0000000181" />
  <xop:filter>
    <xop:or xop:preselectable="FALSE">
      <xop:matches xop:preselectable="FALSE">
        <xop:transparent>
          <xop:nametest xop:name="name" xop:maptype="infofield" xop:key="id0000000183" />
          <xop:element_children />
          <xop:nametest xop:name="surname" xop:maptype="infofield" xop:key="id0000000184" />
        </xop:transparent>
        <xop:literal xop:value="Atkins" />
      </xop:matches>
      <xop:matches xop:preselectable="FALSE">
        <xop:transparent>
          <xop:nametest xop:name="nextofkin" xop:maptype="infofield" xop:key="id0000000201" />
          <xop:element_children />
          <xop:nametest xop:name="name" xop:maptype="infofield" xop:key="id0000000203" />
          <xop:element_children />
          <xop:nametest xop:name="surname" xop:maptype="infofield" xop:key="id0000000204" />
        </xop:transparent>
        <xop:literal xop:value="Atkins" />
      </xop:matches>
      <xop:matches xop:preselectable="FALSE"> ...
      <xop:matches xop:preselectable="FALSE"> ...
      <xop:matches xop:preselectable="FALSE"> ...
      <xop:matches xop:preselectable="FALSE"> ...
    </xop:or>
  </xop:filter>
  <xop:element_children />
  <xop:nametest xop:name="address" xop:maptype="infofield" xop:key="id0000000190" />
</xop:querytree>

In addition to the transformation of the query, the following attributes have been added:

xop:key has as value the internal ID assigned to this node.
xop:maptype holds the mapping type defined in the schema; possible values include "infofield" and "native".
xop:preselectable indicates with the Boolean values "TRUE" and "FALSE" if this part of the query can be processed by the index processor.
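
The xop:preselectable attribute can be used to find the parts of a query that the index processor cannot handle. The sketch below collects the operand paths of all such comparisons in a shortened copy of the tree above. The xop namespace URI is a placeholder (an assumption), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

XOP = "urn:example:xop"  # placeholder namespace URI (assumption)

TREE = """\
<xop:querytree xmlns:xop="urn:example:xop" xop:treetype="output tree from optimization">
  <xop:filter>
    <xop:or xop:preselectable="FALSE">
      <xop:matches xop:preselectable="FALSE">
        <xop:transparent>
          <xop:nametest xop:name="name" />
          <xop:element_children />
          <xop:nametest xop:name="surname" />
        </xop:transparent>
        <xop:literal xop:value="Atkins" />
      </xop:matches>
      <xop:matches xop:preselectable="FALSE">
        <xop:transparent>
          <xop:nametest xop:name="nextofkin" />
          <xop:element_children />
          <xop:nametest xop:name="name" />
          <xop:element_children />
          <xop:nametest xop:name="surname" />
        </xop:transparent>
        <xop:literal xop:value="Atkins" />
      </xop:matches>
    </xop:or>
  </xop:filter>
</xop:querytree>"""


def q(name):
    """Qualify a local name with the xop namespace."""
    return f"{{{XOP}}}{name}"


def non_preselectable_paths(root):
    """Return the operand paths of all comparisons flagged as not preselectable."""
    paths = []
    for node in root.iter():
        if node.get(q("preselectable")) != "FALSE":
            continue
        operand = node.find(q("transparent"))
        if operand is None:
            continue  # e.g. xop:or, which only groups other operators
        names = [n.get(q("name")) for n in operand.findall(q("nametest"))]
        paths.append("/".join(names))
    return paths


print(non_preselectable_paths(ET.fromstring(TREE)))
# prints: ['name/surname', 'nextofkin/name/surname']
```

The paths reported this way are the ones to examine first when deciding whether rewriting the query or defining an index would help.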

Output Tree for Index Processor
This is the tree produced by the processor-specific optimizer to be used by the index processor. It is not returned if database access is not necessary.

<xop:querytree xop:treetype="output tree for index processor">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" xop:maptype="native" xop:key="id0000000181" />
  <xop:element_children />
  <xop:nametest xop:name="address" xop:maptype="infofield" xop:key="id0000000190" />
</xop:querytree>

Output Tree for Postprocessor
This is the tree produced by the processor-specific optimizer to be used by the postprocessor. It is not returned if postprocessing is not necessary.

<xop:querytree xop:treetype="output tree for document processor">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" />
  <xop:filter>
    <xop:matches>
      <xop:transparent>
        <xop:curcontext />
        <xop:descendant_elem />
        <xop:nametest xop:name="surname" />
      </xop:transparent>
      <xop:literal xop:value="Atkins" />
    </xop:matches>
  </xop:filter>
  <xop:element_children />
  <xop:nametest xop:name="address" />
</xop:querytree>

Compatibility

Neither in XPath nor in XSLT is there an equivalent for this Tamino-specific function.

Example

Retrieve information about the execution plan of a query looking for patients whose surnames contain "Atkins":

ino:explain(patient[.//surname ~= "Atkins"])

The result from the server looks as follows (only showing the relevant <xql:result> node):

<xql:result>
  <ino:explanation ino:preselection="FALSE" ino:postprocessing="TRUE"/>
</xql:result>

So there is no index on the surname element node and a postprocessor run is necessary. You can do the following to minimize processing costs:

If you know the schema, rewrite the abbreviated path:
```
patient[name/surname ~= 'Atkins']
```
If you know that the string value you are looking for is the complete value, then use a standard equality operator such as:
```
patient[name/surname = 'Atkins']
```
Define an index on surname.