Retrieve information about query execution for analysis and optimization.
ino:explain (query[, level])
This function provides information about the execution plan of a given
query. As it is a Tamino-internal function, it has the namespace prefix ino.
It takes as arguments any valid query expression (OrExpr) and an optional
level of explanation, which can be one of the values "path" and "tree". If
level is omitted, then basic information about the processing steps involved
is provided. It returns information about the execution plan of the query
inside the regular <xql:result> node of the standard <ino:response>. This
information is wrapped in a new element <ino:explanation>.
The execution time of a query depends on the number and kind of processes that are needed to resolve the query. A query is processed in Tamino as follows:
Query Parser
It takes the query string as input and parses it. If the query does not
conform to the syntax rules of X-Query, an error message is returned
indicating the type of error. If the query can be parsed successfully, the
parser delivers an input tree for the optimizer.
Query Optimizer
The optimizer tries to optimize queries on the level of X-Query by
applying a number of transformations and using the information from the
corresponding schema. Among others, it performs the following kinds of
transformation where applicable:
descendants expansion: abbreviated relative or absolute location paths are expanded into a disjunction of unabbreviated paths
wildcard expansion: if you use * in a NameTest, it is expanded into a disjunction of all matched nodes (e.g. insurance/* would be expanded into insurance/company or insurance/policynumber)
path evaluation in predicate expressions: reformulate path references such as a/b/c[../d] into a/b[d]/c
not() replacement: applying De Morgan's rules, the optimizer tries to replace expressions that use a call of the not() function (e.g. an expression such as not(surname = "Atkins") would be transformed into surname != "Atkins")
Example: You use an abbreviated location path in your query such as
patient[.//surname ~= 'Atkins']. From the schema, the optimizer detects that
the element surname can occur at six different positions in a document tree:
patient/name/surname, patient/nextofkin/name/surname, and
doctor/name/surname, where doctor can appear under patient/submitted,
result/discharged, result/transferred, and result/deceased. The optimizer
then transforms the filter expression into the following disjunction:
patient[./name/surname ~= 'Atkins' or ./nextofkin/name/surname ~= 'Atkins' or ./submitted/doctor/name/surname ~= 'Atkins' or ./result/discharged/doctor/name/surname ~= 'Atkins' or ./result/transferred/doctor/name/surname ~= 'Atkins' or ./result/deceased/doctor/name/surname ~= 'Atkins']
Instead of searching the complete document tree, only these nodes must be visited to see whether the predicate expression holds. As a result, the optimizer delivers a modified tree.
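The descendants expansion described above can be illustrated with a minimal Python sketch. It is not Tamino's optimizer code: the helper expand_descendant_step and the list SURNAME_PATHS are hypothetical names introduced here, and the list of schema paths is simply copied from the example rather than read from a real schema.

```python
def expand_descendant_step(context, target, schema_paths, operator, literal):
    """Rewrite context[.//target OP 'literal'] as a disjunction of explicit paths."""
    prefix = context + "/"
    clauses = []
    for path in schema_paths:
        # Keep only schema paths that start at the context element
        # and end at the target element.
        if path.startswith(prefix) and path.split("/")[-1] == target:
            relative = "./" + path[len(prefix):]
            clauses.append(f"{relative} {operator} '{literal}'")
    return f"{context}[{' or '.join(clauses)}]"


# Hypothetical stand-in for the schema information: the six positions
# of surname named in the example above.
SURNAME_PATHS = [
    "patient/name/surname",
    "patient/nextofkin/name/surname",
    "patient/submitted/doctor/name/surname",
    "patient/result/discharged/doctor/name/surname",
    "patient/result/transferred/doctor/name/surname",
    "patient/result/deceased/doctor/name/surname",
]

expanded = expand_descendant_step("patient", "surname", SURNAME_PATHS, "~=", "Atkins")
print(expanded)
```

The printed expression has one clause per schema position, which is why the number of clauses grows with the number of places an element may occur in the schema.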
Processor-specific Optimizer
This component optimizes the tree with regard to the special needs of
the next processing components. For both the index processor and the
postprocessor, a tree is generated that best suits their needs.
Index Processor
This component evaluates all predicates containing indexed element
nodes. It is further responsible for accessing documents and schemas from the
database as well as for composing the XML document that contains the query
result. It is possible that the index processor creates a superset of the query
result which then has to be restricted in the next step. If no further
processing is necessary, then the index processor returns the result set as an
instance of xql:result.
Postprocessor
Since typically not every node is indexed, there is another processing
stage that evaluates expressions with non-indexed nodes. The postprocessor also
restricts the result set if the index processor generated a superset by
scanning a doctype or collection. It also makes calls to any query
functions. If invoked, it returns the complete query result.
A call to ino:explain provides information about which processing
components are involved, to what degree the query can be optimized, and the
workload of the index processor and the postprocessor. Depending on the
selected explanation level, a different amount of information is returned
inside an element called ino:explanation. This element uses two attributes,
ino:preselection and ino:postprocessing, that serve as flags and indicate the
way the query is processed. Inside ino:explanation a set of elements can
appear that share the namespace prefix xop. They correspond to expressions in
X-Query. For example, the element xop:matches represents the match operator
'~=', and the node <xop:literal xop:value="Atkins" /> represents the literal
string constant "Atkins". The sections below describing the explanation
levels contain more information about which principal elements and attributes
of the xop namespace are important and how they can be used for query
analysis and optimization.
Note:
The information that is returned by a call of ino:explain() shows the
internal structure of query processing and is subject to change without prior
notice if this is necessary because of improvements in the underlying
mechanism.
If level is omitted, only ino:explanation appears in the query result, along
with its two required attributes. They mean:
ino:preselection: indicates whether a full scan of the doctype or collection will be performed for the given query. If "TRUE", there is some restriction in the query that can be processed by the index processor, which means that there may be documents that can be rejected without calling the postprocessor. "FALSE" indicates a full scan of the doctype or collection.
ino:postprocessing: if "TRUE", the postprocessor will be called.
To retrieve the execution plan for a query looking for patients whose surname contains "Atkins":
ino:explain(patient[.//surname ~= "Atkins"])
The result from the server looks like this (only showing the relevant
<xql:result> node):
<xql:result>
  <ino:explanation ino:preselection="FALSE" ino:postprocessing="TRUE" />
</xql:result>
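For automated analysis, the two flags can be read from such a response with any XML parser. The following Python sketch is one possible approach and makes several assumptions: the namespace URIs are placeholders (the excerpt above does not show the real declarations), the fragment is a sample in the shape shown above, and read_flags is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs (assumption): the excerpt omits the real
# declarations, so the fragment is wrapped in a root element that binds
# the prefixes to stand-in values.
NS = {"ino": "urn:example:ino", "xql": "urn:example:xql"}

# Sample fragment in the shape of the <xql:result> node shown above.
FRAGMENT = (
    '<xql:result> <ino:explanation ino:preselection="FALSE" '
    'ino:postprocessing="TRUE" /> </xql:result>'
)


def read_flags(fragment):
    """Return the two ino:explanation flags as Python booleans."""
    wrapped = '<root xmlns:ino="%s" xmlns:xql="%s">%s</root>' % (
        NS["ino"], NS["xql"], fragment)
    root = ET.fromstring(wrapped)
    explanation = root.find("xql:result/ino:explanation", NS)
    flags = {}
    for name in ("preselection", "postprocessing"):
        # Namespaced attributes are stored as "{uri}localname".
        value = explanation.get("{%s}%s" % (NS["ino"], name))
        flags[name] = (value == "TRUE")
    return flags


flags = read_flags(FRAGMENT)
if not flags["preselection"]:
    print("full scan of the doctype or collection")
if flags["postprocessing"]:
    print("postprocessor will be called")
```

When parsing a complete server response, the wrapper element is unnecessary; the prefixes would then be bound by the declarations in the response itself, and NS would have to carry those URIs.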
The level "path" shows the query after the optimizer run. Each step of a
location path is represented by its own xop:path element. Nesting of xop:path
elements means traversing the location path one step further along the child
axis. Any instance of xop:path uses these attributes:
xop:name: name of the element, always present
xop:searchtype: search type of the element as defined in the schema, only present if there are no child elements
xop:maptype: mapping type of the element as defined in the schema; if none is defined, the value is "no".
Using the example from above, the returned ino:explanation node contains the
following series of xop:path elements (the larger middle part deleted for
brevity):
<xop:path xop:name="patient" xop:maptype="native">
  <xop:path xop:name="name" xop:maptype="infofield">
    <xop:path xop:name="surname" xop:maptype="infofield" />
  </xop:path>
  <xop:path xop:name="nextofkin" xop:maptype="infofield">
    <xop:path xop:name="name" xop:maptype="infofield">
      <xop:path xop:name="surname" xop:maptype="infofield" />
    </xop:path>
  </xop:path>
  ...
  <xop:path xop:name="address" xop:maptype="infofield" />
</xop:path>
The patient element contains the element name, which in turn contains the
element surname, each with its schema mapping type in the xop:maptype
attribute.
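Because nesting of xop:path elements mirrors the child axis, the nested output can be flattened into full location paths for easier reading. The Python sketch below shows one way to do this. The namespace URI is a placeholder, the sample is a shortened version of the output above, and list_paths is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XOP = "urn:example:xop"  # placeholder namespace URI (assumption)

# Shortened sample of the xop:path output shown above.
SAMPLE = """
<xop:path xmlns:xop="urn:example:xop" xop:name="patient" xop:maptype="native">
  <xop:path xop:name="name" xop:maptype="infofield">
    <xop:path xop:name="surname" xop:maptype="infofield" />
  </xop:path>
  <xop:path xop:name="address" xop:maptype="infofield" />
</xop:path>
"""


def list_paths(element, prefix=""):
    """Return (location path, maptype) pairs for element and its descendants."""
    name = element.get("{%s}name" % XOP)
    path = prefix + "/" + name if prefix else name
    rows = [(path, element.get("{%s}maptype" % XOP))]
    # Each nested xop:path is one step further along the child axis.
    for child in element.findall("{%s}path" % XOP):
        rows.extend(list_paths(child, path))
    return rows


rows = list_paths(ET.fromstring(SAMPLE.strip()))
for path, maptype in rows:
    print(path, maptype)
```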
With the level "tree", the structure of ino:explanation reflects the query
trees that are built and modified during query processing. For each tree that
is used during processing, there is a corresponding xop:querytree element.
The trees are distinguished by the attribute xop:treetype as follows.
Input Tree for Optimizer
This tree represents the original query and is always included in the
output of ino:explain(). For example, the query
patient[.//surname ~= "Atkins"]/address is represented as follows (nested
elements are indented for better readability):
<xop:querytree xop:treetype="input tree for optimization">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" />
  <xop:filter>
    <xop:matches>
      <xop:transparent>
        <xop:curcontext />
        <xop:descendant_elem />
        <xop:nametest xop:name="surname" />
      </xop:transparent>
      <xop:literal xop:value="Atkins" />
    </xop:matches>
  </xop:filter>
  <xop:element_children />
  <xop:nametest xop:name="address" />
</xop:querytree>
From the collection "Patient" (<xop:list_context xop:collection="Patient" />),
those child element nodes whose name equals "patient"
(<xop:nametest xop:name="patient" />) are selected that satisfy the condition
set in the filter expression (xop:filter). The filter contains an expression
with the match operator (xop:matches) with two operands appearing in document
order. The left operand is enclosed in xop:transparent as a sequence of
elements that selects, starting from the context node (<xop:curcontext />),
any descendant elements (<xop:descendant_elem />) whose name equals "surname"
(<xop:nametest xop:name="surname" />). If the value of these descendant
"surname" element nodes matches the literal value "Atkins"
(<xop:literal xop:value="Atkins" />), then the "patient" nodes satisfy the
filter expression and form the node set from which all child element nodes
with the name "address" are selected as the result of the query.
Output Tree from Optimizer
This tree is not returned when the optimizer run yields an empty
result. Using our previous example, the optimizer converts the original query
into a query containing a disjunction as outlined above. An excerpt of the
resulting query tree is shown below. Only the first two clauses of the
disjunction (xop:or) are complete; the structure of the other four clauses is
"folded" and indicated by an ellipsis:
<xop:querytree xop:treetype="output tree from optimization">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" xop:maptype="native" xop:key="id0000000181" />
  <xop:filter>
    <xop:or xop:preselectable="FALSE">
      <xop:matches xop:preselectable="FALSE">
        <xop:transparent>
          <xop:nametest xop:name="name" xop:maptype="infofield" xop:key="id0000000183" />
          <xop:element_children />
          <xop:nametest xop:name="surname" xop:maptype="infofield" xop:key="id0000000184" />
        </xop:transparent>
        <xop:literal xop:value="Atkins" />
      </xop:matches>
      <xop:matches xop:preselectable="FALSE">
        <xop:transparent>
          <xop:nametest xop:name="nextofkin" xop:maptype="infofield" xop:key="id0000000201" />
          <xop:element_children />
          <xop:nametest xop:name="name" xop:maptype="infofield" xop:key="id0000000203" />
          <xop:element_children />
          <xop:nametest xop:name="surname" xop:maptype="infofield" xop:key="id0000000204" />
        </xop:transparent>
        <xop:literal xop:value="Atkins" />
      </xop:matches>
      <xop:matches xop:preselectable="FALSE"> ...
      <xop:matches xop:preselectable="FALSE"> ...
      <xop:matches xop:preselectable="FALSE"> ...
      <xop:matches xop:preselectable="FALSE"> ...
    </xop:or>
  </xop:filter>
  <xop:element_children />
  <xop:nametest xop:name="address" xop:maptype="infofield" xop:key="id0000000190" />
</xop:querytree>
In addition to the transformation of the query, the following attributes have been added:
xop:key has as value the internal ID assigned to this node.
xop:maptype holds the mapping type defined in the schema; possible values include "infofield" and "native".
xop:preselectable indicates with the Boolean values "TRUE" and "FALSE" whether this part of the query can be processed by the index processor.
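To find out which parts of a query cannot be handled by the index processor, the xop:preselectable attributes described above can be collected from the optimizer's output tree. The Python sketch below is one possible way to do so. The namespace URI is a placeholder, the sample is a reduced tree with a single complete clause, and the helper names q and preselectable_report are hypothetical.

```python
import xml.etree.ElementTree as ET

XOP = "urn:example:xop"  # placeholder namespace URI (assumption)

# Reduced sample modeled on the optimizer output tree shown above.
SAMPLE = """
<xop:querytree xmlns:xop="urn:example:xop"
               xop:treetype="output tree from optimization">
  <xop:nametest xop:name="patient" xop:maptype="native" />
  <xop:filter>
    <xop:or xop:preselectable="FALSE">
      <xop:matches xop:preselectable="FALSE">
        <xop:transparent>
          <xop:nametest xop:name="name" xop:maptype="infofield" />
          <xop:element_children />
          <xop:nametest xop:name="surname" xop:maptype="infofield" />
        </xop:transparent>
        <xop:literal xop:value="Atkins" />
      </xop:matches>
    </xop:or>
  </xop:filter>
</xop:querytree>
"""


def q(local):
    """Build the qualified ElementTree name for a local name in the xop namespace."""
    return "{%s}%s" % (XOP, local)


def preselectable_report(tree):
    """List (operator, flag, path) for every node carrying xop:preselectable."""
    rows = []
    for node in tree.iter():
        flag = node.get(q("preselectable"))
        if flag is None:
            continue
        # The left operand of a comparison is wrapped in xop:transparent;
        # its name tests spell out the location path being compared.
        transparent = node.find(q("transparent"))
        steps = [] if transparent is None else [
            n.get(q("name")) for n in transparent.findall(q("nametest"))
        ]
        rows.append((node.tag.split("}")[1], flag, "/".join(steps)))
    return rows


report = preselectable_report(ET.fromstring(SAMPLE.strip()))
for operator, flag, path in report:
    print(operator, flag, path)
```

Paths reported with "FALSE" are candidates for an index definition, since those comparisons are left to the postprocessor.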
Output Tree for Index Processor
This is the tree produced by the processor-specific optimizer to be used
by the index processor. It is not returned if database access is not
necessary.
<xop:querytree xop:treetype="output tree for index processor">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" xop:maptype="native" xop:key="id0000000181" />
  <xop:element_children />
  <xop:nametest xop:name="address" xop:maptype="infofield" xop:key="id0000000190" />
</xop:querytree>
Output Tree for Postprocessor
This is the tree produced by the processor-specific optimizer to be used
by the postprocessor. It is not returned if postprocessing is not
necessary.
<xop:querytree xop:treetype="output tree for document processor">
  <xop:list_context xop:collection="Patient" />
  <xop:element_children />
  <xop:nametest xop:name="patient" />
  <xop:filter>
    <xop:matches>
      <xop:transparent>
        <xop:curcontext />
        <xop:descendant_elem />
        <xop:nametest xop:name="surname" />
      </xop:transparent>
      <xop:literal xop:value="Atkins" />
    </xop:matches>
  </xop:filter>
  <xop:element_children />
  <xop:nametest xop:name="address" />
</xop:querytree>
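Since the trees for the index processor and the postprocessor are only returned when the respective component is needed, the xop:treetype values present in an explanation indicate which components take part. The Python sketch below illustrates this check. The namespace URIs are placeholders, the sample contains empty xop:querytree elements in place of the full trees shown above, and tree_types is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

XOP = "urn:example:xop"  # placeholder namespace URI (assumption)

# Sample explanation; the tree contents are left out, only the
# xop:treetype values from the examples above are kept.
SAMPLE = """
<ino:explanation xmlns:ino="urn:example:ino" xmlns:xop="urn:example:xop"
                 ino:preselection="FALSE" ino:postprocessing="TRUE">
  <xop:querytree xop:treetype="input tree for optimization" />
  <xop:querytree xop:treetype="output tree from optimization" />
  <xop:querytree xop:treetype="output tree for index processor" />
  <xop:querytree xop:treetype="output tree for document processor" />
</ino:explanation>
"""


def tree_types(explanation):
    """Return the xop:treetype value of every xop:querytree, in document order."""
    return [
        tree.get("{%s}treetype" % XOP)
        for tree in explanation.iter("{%s}querytree" % XOP)
    ]


types = tree_types(ET.fromstring(SAMPLE.strip()))
uses_index_processor = "output tree for index processor" in types
uses_postprocessor = "output tree for document processor" in types
print("index processor involved:", uses_index_processor)
print("postprocessor involved:", uses_postprocessor)
```

The literal treetype strings are taken from the examples in this section. As the note on ino:explain() states, this internal structure may change, so a script relying on them should be treated as fragile.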
Neither in XPath nor in XSLT is there an equivalent for this Tamino-specific function.
Retrieve information about the execution plan of a query looking for patients whose surnames contain "Atkins":
ino:explain(patient[.//surname ~= "Atkins"])
The result from the server looks as follows (only showing the relevant
<xql:result> node):
<xql:result>
  <ino:explanation ino:preselection="FALSE" ino:postprocessing="TRUE"/>
</xql:result>
So there is no index on the surname element node and a postprocessor run is
necessary. You can do the following to minimize processing costs:
If you know the schema, rewrite the abbreviated path:
patient[name/surname ~= 'Atkins']
If you know that the string value you're looking for is the complete value, then use a standard equality operator such as:
patient[name/surname = 'Atkins']
Define an index on surname.