XPath 1.0 and X-Query

This document briefly explains the concepts of XPath, which is the basis of the X-Query language. It then outlines the relationship between XPath 1.0 and X-Query.


XPath 1.0 in a Nutshell

You can use XPath to address parts of an XML document. XPath is also the basis for XML-related languages such as XSLT. XPath models an XML document as a tree whose nodes are arranged in document order (the order of a depth-first traversal). The tree contains seven types of nodes:

root node;
element node;
text node;
attribute node;
namespace node;
processing instruction node;
comment node.

XPath expressions can address any number of nodes of any type. The details of this data model are described in the section Data Model of the XPath specification.
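
The following is a minimal sketch of the data model, not part of X-Query itself. It assumes Python's standard xml.dom.minidom module and a made-up sample document, and lists the nodes of that document in document order. The DOM has no namespace nodes, so that node type does not appear here, and attributes are listed right after the element they are attached to.

```python
from xml.dom import minidom, Node

# Hypothetical sample document; element names are illustrative only.
SAMPLE = (
    '<?xml-stylesheet href="style.css"?>'
    '<patient id="p1"><!-- note --><name>Ann</name></patient>'
)

# Map DOM node-type constants to the XPath node type names.
XPATH_TYPE = {
    Node.DOCUMENT_NODE: "root",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
    Node.COMMENT_NODE: "comment",
}

def walk(node):
    """Yield (XPath node type, DOM node name) pairs depth-first."""
    yield XPATH_TYPE.get(node.nodeType, "other"), node.nodeName
    # Attributes are attached to their element rather than being children.
    if node.nodeType == Node.ELEMENT_NODE:
        for i in range(node.attributes.length):
            yield "attribute", node.attributes.item(i).name
    for child in node.childNodes:
        yield from walk(child)

nodes = list(walk(minidom.parseString(SAMPLE)))
for node_type, name in nodes:
    print(node_type, name)
```

The first pair is the root node, which the DOM calls #document; the patient element follows the processing instruction because the latter comes first in the document.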

The expression syntax of XPath includes location paths for addressing tree nodes and calls to a core function library for working with node-sets, strings, numbers and booleans. A location path starts from either the document root node (absolute path) or the context node (relative path) and has one or more location steps, separated by a slash (/). A location step consists of three parts:

  • an axis, which specifies the relationship between the set of selected nodes and the context node;

  • a node test, which specifies the node type and, optionally, the name of the nodes to select; and

  • optional predicates, which further restrict the set of selected nodes.

There are thirteen axis directions, originating from the context node. The axis determines the initial node set, which is further refined by node tests and predicates. In XPath, you can specify a location path in either unabbreviated or abbreviated syntax. The following table lists the axes along with their direction (normal document order or reverse document order) and a short description. In the unabbreviated syntax a double colon '::' follows the name of the axis.

Axis                  Direction               Meaning
ancestor::            reverse                 The parent node and its ancestors up to the root node
ancestor-or-self::    reverse                 The current node and its ancestors up to the root node
attribute::           implementation-defined  All attached attribute nodes
child::               normal                  The immediate child nodes (default axis)
descendant::          normal                  All descendant nodes (children, their children, and so on)
descendant-or-self::  normal                  The current node and all its descendant nodes
following::           normal                  All nodes after the context node, excluding descendant nodes, attribute nodes and namespace nodes
following-sibling::   normal                  All following nodes that are siblings of the current node
namespace::           implementation-defined  All attached namespace nodes
parent::              normal                  The parent node (or attaching node for attribute and namespace nodes)
preceding::           reverse                 All nodes before the context node, excluding ancestor nodes, attribute nodes and namespace nodes
preceding-sibling::   reverse                 All preceding nodes that are siblings of the current node
self::                normal                  The current node
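
To make the axis directions concrete, here is a small sketch that walks a few of the axes by hand. It assumes Python's standard xml.dom.minidom module and a made-up document; the helper functions are hypothetical names chosen for this illustration and only cover element content (no attribute or namespace nodes).

```python
from xml.dom import minidom

# Hypothetical sample document; element names are illustrative only.
doc = minidom.parseString("<a><b><c/><d/><e/><g/></b><f/></a>")
e = doc.getElementsByTagName("e")[0]  # the context node

def ancestor(node):
    """Reverse axis: the parent first, then upwards to the root node."""
    while node.parentNode is not None:
        node = node.parentNode
        yield node

def descendant(node):
    """Forward axis: children, their children, ... in document order."""
    for child in node.childNodes:
        yield child
        yield from descendant(child)

def following_sibling(node):
    """Forward axis: later siblings, nearest first."""
    node = node.nextSibling
    while node is not None:
        yield node
        node = node.nextSibling

def preceding_sibling(node):
    """Reverse axis: earlier siblings, nearest first."""
    node = node.previousSibling
    while node is not None:
        yield node
        node = node.previousSibling

def following(node):
    """Forward axis: everything after the node, excluding its descendants."""
    while node is not None:
        for sibling in following_sibling(node):
            yield sibling
            yield from descendant(sibling)
        node = node.parentNode

def names(nodes):
    return [n.nodeName for n in nodes]

print(names(ancestor(e)))           # ['b', 'a', '#document']
print(names(preceding_sibling(e)))  # ['d', 'c']
print(names(following(e)))          # ['g', 'f']
```

Note how the reverse axes deliver the node nearest to the context node first, which is what a positional predicate such as [1] refers to on those axes.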

The node test determines the type and optionally the name of each node along the axis direction that is selected. For each axis, there is a principal node type: for the attribute axis, it is attribute; for the namespace axis, it is namespace; for other axes, it is element. You can select a node by applying one of the following node tests. The node is selected if the test evaluates to "true".

NodeTest                           Description
processing-instruction()           A processing instruction node (regardless of name)
comment()                          A comment node
text()                             A text node
node()                             A node of any type (regardless of name)
processing-instruction('Literal')  A processing instruction node with name Literal; if name is omitted, then the test is "true" for any processing instruction node
Name                               A node of the principal node type with the specified name
prefix:name                        According to the axis used: an element node in the specified namespace, an attribute node in the specified namespace, or an empty node-set when using the namespace axis
*                                  According to the axis used: all element nodes, all attribute nodes or all namespace nodes
prefix:*                           According to the axis used: all element nodes in the specified namespace, all attribute nodes in the specified namespace, or an empty node-set when using the namespace axis
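
As a rough illustration of how node tests filter the nodes on an axis, the sketch below applies some of the tests to the children of one element. It assumes Python's standard xml.dom.minidom module and a made-up document; the function matches is a hypothetical helper that only handles the child axis (principal node type: element) and ignores namespaces.

```python
from xml.dom import minidom, Node

# Hypothetical sample: one child of each kind that the child axis can hold.
doc = minidom.parseString("<r>text<!--c--><?pi data?><x/><y/></r>")
children = doc.documentElement.childNodes

def matches(node, test):
    """Apply a node test on the child axis (principal node type: element)."""
    if test == "node()":
        return True
    if test == "text()":
        return node.nodeType == Node.TEXT_NODE
    if test == "comment()":
        return node.nodeType == Node.COMMENT_NODE
    if test == "processing-instruction()":
        return node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
    if test == "*":
        return node.nodeType == Node.ELEMENT_NODE
    # Anything else is treated as an element name (no prefix handling).
    return node.nodeType == Node.ELEMENT_NODE and node.nodeName == test

def select(test):
    return [n.nodeName for n in children if matches(n, test)]

print(select("node()"))  # all five children
print(select("*"))       # ['x', 'y']
print(select("x"))       # ['x']
```

The test * selects only the two elements, whereas node() also selects the text, comment and processing instruction nodes.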

The abbreviated syntax is as follows:

Abbreviation  Description
no axis       Nodes along the child:: axis satisfying node tests and optional predicates
@             Nodes along the attribute:: axis satisfying node tests and optional predicates
.             The self::node(), which is the current node of any type
..            The parent::node(), which is the empty node-set if the current node is the root node; the attaching node if the current node is an attached node (of type attribute or namespace); otherwise the parent node
//            Short for /descendant-or-self::node()/. At the start of an expression, this denotes the absolute location path; elsewhere, it denotes the relative location path
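
The abbreviations can be tried out with any XPath-aware library. The sketch below assumes Python's standard xml.etree.ElementTree module, which implements only a limited subset of the abbreviated syntax, and a made-up patients document. It is meant as an illustration of the notation, not of X-Query behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element and attribute names are illustrative.
root = ET.fromstring(
    "<patients>"
    "<patient id='1'><name>Ann</name></patient>"
    "<patient id='2'><name>Bob</name></patient>"
    "</patients>"
)

# No axis: child elements named "patient".
children = root.findall("patient")

# "//": descendants at any depth; "." anchors the path at the context node.
names = [n.text for n in root.findall(".//name")]

# "..": step from each name element back to its parent.
parents = root.findall(".//name/..")

# "@": the attribute axis, which ElementTree accepts only inside a predicate.
second = root.findall("patient[@id='2']")

print(len(children), names, [p.get("id") for p in parents])
```

ElementTree evaluates paths relative to an element, which is why the descendant search is written as .//name rather than //name.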

The last, optional part of a location step is a predicate, which further restricts the set of selected nodes according to a qualifying expression enclosed in square brackets [ and ]. Within a location step, node positions in a predicate are counted along the direction of the selected axis. If the predicate is instead applied to a node-set expression, positions are counted in document order. You can use three different types of values in a predicate expression:

  • A numeric value such as patient[2] or patient[last()]. The value is the proximity position of the node in the set, beginning with 1 for the node closest to the context node along the axis.

  • A node-set expression such as medication[type] or type[@brand]. This predicate is true if the node set returned is not empty.

  • Any other expression such as medication[count(type)>2]. The result is converted to a boolean, and the predicate is true only if that boolean is true.
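
The three kinds of predicate can be sketched as follows. The example assumes Python's standard xml.etree.ElementTree module and a made-up document that reuses the element names from the bullets above; it is not the patient data set shipped with the product. ElementTree has no count() function, so the last case is emulated in Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data built around the element names used above.
root = ET.fromstring(
    "<patients>"
    "<patient><name>Ann</name>"
    "<medication><type brand='X'/><type/><type/></medication></patient>"
    "<patient><name>Bob</name><medication/></patient>"
    "<patient><name>Cy</name></patient>"
    "</patients>"
)

# Numeric predicates: positions start at 1.
second = root.find("patient[2]/name").text     # 'Bob'
last = root.find("patient[last()]/name").text  # 'Cy'

# Node-set predicates: true if the inner path selects at least one node.
with_type = root.findall(".//medication[type]")
branded = root.findall(".//type[@brand]")

# Emulation of medication[count(type)>2], since ElementTree lacks count().
many = [m for m in root.iter("medication") if len(m.findall("type")) > 2]

print(second, last, len(with_type), len(branded), len(many))
```

Only the first medication element has type children, so it is the only one selected by both the node-set predicate and the emulated count() test.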

See the XPath specification for more details. You can find examples based on the patient data set in the section Querying XML Sample Documents and in the reference section for the respective language elements.

From XPath to X-Query

This section documents the differences between expressions in X-Query and expressions in XPath. Since X-Query is based on the XPath specification, there are many more similarities than differences. However, X-Query differs from XPath in the following aspects:

  • The following XPath functions are supported:

    boolean
    ceiling
    count
    false
    floor
    last
    name
    not
    number
    position
    round
    starts-with
    string
    sum
    true

  • The following XPath functions are currently not supported, but can be implemented using Tamino server extensions, as long as they do not have a variable number of arguments:

    concat
    contains
    id
    lang
    local-name
    namespace-uri
    normalize-space
    string-length
    substring
    substring-after
    substring-before
    translate

  • X-Query supports the following additional operators that are not present in XPath:

    adj
    after
    before
    between
    intersect
    near
    sortby

  • X-Query has a binary "contains" operator ~= for text retrieval. There is no equivalent in XPath.

  • X-Query does not support the unabbreviated syntax of location paths using named axes.

  • X-Query does not support the use of variables.
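
As a rough aid when porting XPath expressions, the differences listed above can be turned into a simple lexical check. The sketch below is an assumption-laden illustration in Python: xquery_issues is a hypothetical helper, not a Tamino API, and it uses plain pattern matching, so text inside string literals can produce false hits.

```python
import re

# XPath functions that the list above names as not supported in X-Query.
NOT_SUPPORTED = {
    "concat", "contains", "id", "lang", "local-name", "namespace-uri",
    "normalize-space", "string-length", "substring", "substring-after",
    "substring-before", "translate",
}

def xquery_issues(expr):
    """Return a list of XPath features in expr that X-Query lacks."""
    issues = []
    if "::" in expr:
        issues.append("named axis (unabbreviated syntax)")
    if re.search(r"\$[A-Za-z_]", expr):
        issues.append("variable reference")
    # Node tests such as text() also match this pattern, but they are not
    # in NOT_SUPPORTED and are therefore ignored.
    for name in re.findall(r"([A-Za-z][\w-]*)\s*\(", expr):
        if name in NOT_SUPPORTED:
            issues.append("unsupported function: " + name)
    return issues

print(xquery_issues("patient[count(medication) > 2]"))
print(xquery_issues("child::patient[contains(name, $n)]"))
```

The first expression uses only supported features and yields an empty list; the second is flagged for its named axis, its variable and the contains function.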

XPath/X-Query and XML Schema

Historically, the development of XPath preceded the development of XML Schema; therefore, with a few exceptions, X-Query (which is closely related to XPath) is not related to XML Schema. In particular, an X-Query expression operates on string values or numeric values but not on typed values; in other words, an X-Query expression does not make use of the type information that is stored in the schema. For a strongly typed view of the data, consider using Tamino XQuery.