In this document you will learn about the nuts and bolts of Tamino XQuery. It will pave the way for a solid understanding of the whole language.
In XQuery, you use expressions. Expressions can be of different kinds, some of which can be nested in a general way. Each XQuery operator and function expects its operands to be of a certain type. This makes XQuery a functional, strongly-typed language.
Every expression evaluates to a sequence, which is an ordered collection of items. An item is either an atomic value or a node. An atomic value does not contain any other value and has either a primitive or a derived data type as defined in XML Schema. A node is one of seven kinds: element, attribute, namespace, text, comment, processing instruction, or document node. A node has an identity, because its creation is independent of its value.
A sequence can be empty, consist of a single item (a singleton sequence), or contain several items. Sequences have the following properties:
Sequences are ordered.
(input()/bib/book/author/first, input()/bib/book/author/last)
Even if last elements appear before first elements in the document, in this sequence the order is as follows: first the first elements, then the last elements. The comma serves as the concatenation operator on sequences.
Note:
In XPath 1.0, sets and node sets were always kept in forward or reverse document order, depending on the axis.
Sequences are always flat.
(1, 2, ("a", "b", "c"), 3, 4)
((1, (2)), (("a", "b", "c")), (3, 4))
Although you can use nested sequence constructors, the result is always a "flattened" sequence. Any nested sequence items will be arranged in the same order, as if there were no nestings at all. So, both example sequences are equivalent to:
(1, 2, "a", "b", "c", 3, 4)
Sequences may contain duplicates.
(input()/bib/book/author/first, input()/bib/book/author/last, input()/bib/book/author/first)
(1, 2, 3, 4, 3, 2, 1)
Because a sequence is ordered, an item may occur more than once in it. Such duplicates can have the same value or the same node identity.
Note:
In XPath 1.0, a node could only appear once in a node set.
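The three properties can be modelled with a short Python sketch. It is an illustration only, assuming that a sequence is represented as a Python list and that nested lists or tuples stand for nested sequence constructors; the function name make_sequence is hypothetical.

```python
from typing import Any, List


def make_sequence(*operands: Any) -> List[Any]:
    """Concatenate operands like the comma operator, flattening any nesting."""
    flat: List[Any] = []
    for operand in operands:
        if isinstance(operand, (list, tuple)):
            # A nested sequence contributes its items, never itself.
            flat.extend(make_sequence(*operand))
        else:
            flat.append(operand)
    return flat


a = make_sequence(1, 2, ("a", "b", "c"), 3, 4)
b = make_sequence((1, (2,)), (("a", "b", "c"),), (3, 4))
dups = make_sequence(1, 2, 3, 4, 3, 2, 1)

print(a)
print(a == b)
print(dups)
```

Operand order is preserved, nesting disappears, and repeated items are kept rather than merged.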
Remember that every expression in XQuery evaluates to a sequence. Even if we have an XQuery expression such as
let $x := 5 return $x * 30
that defines a local variable $x and returns its value multiplied by 30, the XQuery expression, strictly speaking, returns a sequence with the single integer value 150.
In contrast to a let variable, which can be bound to any sequence, the variables of other expressions are constrained to a special kind of sequence. For example, a for variable is always bound to a single item (identical to a singleton sequence):
for $bib in input()/bib return $bib
Note:
In XQuery, all keywords are written in lower case. Using mixed or upper case results in a parsing error.
In Tamino XQuery, there are two functions that provide access to data stored in a Tamino database. The function input() takes no parameters and is an implementation-defined method to assign nodes from a source to the input sequence which is evaluated in a query expression. In Tamino, it is always the current collection of a Tamino database that input() provides access to. The input sequence then consists of all document nodes of the current collection. Similarly, you can use the function collection() to access nodes from a collection that may be different from the default collection. The collection is specified as parameter.
input() | collection("XMP") |
input()/bib/book/title | collection("XMP")/bib/book/title |
The first input() expression returns the document instances of all doctypes in the current collection. The second input() expression returns a sequence of all title elements that are child nodes of book elements that are child nodes of the bib document element. The collection() expressions on the right side correspond to the input() expressions on the left side, provided that the current collection for the input() expressions is XMP.
In XPath 1.0, any expression locates nodes in a single document. However, in XQuery as well as in the previous X-Query language, expressions are evaluated with regard to a collection of documents. More precisely, the input for an expression is a sequence of document nodes in a collection.
In XQuery, you can conveniently compose your query result using constructors, which create new element and attribute nodes within a query expression:
let $a := input()/bib/book/author return <index type="author"> { $a/last } { $a/first } </index>
This XQuery expression compiles a name index from all authors of the book doctype in the current collection. It constructs an element index with an attribute type indicating the type of index. The index element contains two expressions enclosed in braces. They evaluate to the element nodes last and first from all author elements.
It is sufficient to literally write the start and end tags of an element to construct it. Whenever you need to evaluate some expression, you have to enclose it in braces.
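The effect of the index constructor can be modelled outside Tamino. The following is a minimal Python sketch, not Tamino code: it uses the standard xml.etree.ElementTree module, the sample document and the function name build_author_index are hypothetical, and it only mirrors how the two enclosed expressions contribute all last elements followed by all first elements.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element names follow the bib example.
SAMPLE = """
<bib>
  <book><author><last>Stevens</last><first>W.</first></author></book>
  <book><author><last>Abiteboul</last><first>Serge</first></author></book>
</bib>
""".strip()


def build_author_index(bib):
    """Model of: let $a := .../bib/book/author
    return <index type="author"> { $a/last } { $a/first } </index>"""
    authors = bib.findall("./book/author")
    # Writing the tags literally corresponds to creating the element here.
    index = ET.Element("index", {"type": "author"})
    # { $a/last }: every last element of every author, in document order.
    for author in authors:
        for last in author.findall("last"):
            index.append(copy.deepcopy(last))
    # { $a/first }: every first element of every author, in document order.
    for author in authors:
        for first in author.findall("first"):
            index.append(copy.deepcopy(first))
    return index


index = build_author_index(ET.fromstring(SAMPLE))
print(ET.tostring(index, encoding="unicode"))
```

The copies are made because a constructed element receives new nodes rather than the stored ones; in this model, copy.deepcopy plays that role.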
XQuery uses path expressions to locate nodes in a document tree in much the same way as XPath 1.0 defined it originally:
let $b := input()/bib/book/author return $b/last
input()/patient//type
The first expression returns the last child element nodes of all author elements. The second expression returns all type elements that are descendant nodes of the patient element. Here, // is the abbreviated syntax for /descendant-or-self::node()/.
The structure of a path expression has only slightly changed with regard to XPath 1.0: A path expression consists of a sequence of steps which can be distinguished into general steps and location steps. A general step is an expression that evaluates to a node sequence, e.g. the input() function that delivers the document nodes of the current collection. It can only be the first step in a path expression. A location step consists of three parts:
An axis, which specifies the relationship between the set of selected nodes and the context node,
A node test, which specifies type and/or name of the set of selected nodes, and
Zero or more predicates, which further restrict the set of selected nodes.
XQuery supports a number of axes. An axis originates in the context node and determines the initial node sequence that is further refined by node tests and predicates. In XQuery and XPath 2.0, you can specify a path in either unabbreviated or abbreviated syntax. The following table lists each axis along with its direction (normal document order or reverse document order) and a short description. In the unabbreviated syntax, a double colon '::' follows the name of the axis.
Axis | Direction | Meaning |
---|---|---|
ancestor:: | reverse | all ancestor nodes (parent, grandparent, great-grandparent, etc.) |
attribute:: | implementation-defined | attached attribute nodes |
child:: | normal | immediate child nodes (default axis) |
descendant:: | normal | all descendant child nodes |
descendant-or-self:: | normal | current node and all its descendant child nodes |
parent:: | reverse | parent node (or attaching node for attribute and namespace nodes) |
self:: | normal | the current node |
Tamino also supports the abbreviated notation of path expressions with axes. The following table shows how they correspond to the unabbreviated axes (as defined in the W3C XQuery specification):
Abbreviation | Description |
---|---|
no axis | nodes along the child:: axis satisfying node tests and optional predicates |
@ | nodes along the attribute:: axis satisfying node tests and optional predicates |
. | self::node(), which is the current node of any type |
.. | parent::node(), which is the empty sequence if the current node is the document node; the attaching node if the current node is an attached node (of type attribute or namespace); otherwise the parent node |
// | /descendant-or-self::node()/, which is the absolute path at the start of an expression, or the relative path elsewhere |
The following query expressions are thus equivalent:
1. | for $a in input()/bib/book return $a/title | for $a in input()/bib/book return $a/child::title |
2. | for $a in input()/bib/book return $a/@* | for $a in input()/bib/book return $a/attribute::* |
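To make the axes and the expansion of // more concrete, here is a small Python model. It is a sketch under stated assumptions, not Tamino code: the class TreeNode, the helper functions, and the sample patient tree are all hypothetical, only element nodes are modelled, and only some of the axes are covered.

```python
class TreeNode:
    """A minimal element node: a name, a parent link, and ordered children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def child(node):
    """child:: axis, in document order."""
    return list(node.children)


def descendant(node):
    """descendant:: axis, in document order (depth first)."""
    found = []
    for c in node.children:
        found.append(c)
        found.extend(descendant(c))
    return found


def descendant_or_self(node):
    """descendant-or-self:: axis: the node itself, then its descendants."""
    return [node] + descendant(node)


def ancestor(node):
    """ancestor:: axis, in reverse document order (nearest ancestor first)."""
    found = []
    current = node.parent
    while current is not None:
        found.append(current)
        current = current.parent
    return found


def name_test(nodes, name):
    """Keep the nodes whose name matches."""
    return [n for n in nodes if n.name == name]


def double_slash(node, name):
    """node//name, expanded to node/descendant-or-self::node()/child::name."""
    scope = descendant_or_self(node)
    hits = [c for step in scope for c in name_test(child(step), name)]
    # A path expression delivers its result in document order.
    return sorted(hits, key=scope.index)


# Hypothetical tree: patient(name, therapy(medication(type)), type)
patient = TreeNode("patient")
TreeNode("name", patient)
therapy = TreeNode("therapy", patient)
medication = TreeNode("medication", therapy)
deep_type = TreeNode("type", medication)
top_type = TreeNode("type", patient)

print([n.name for n in ancestor(deep_type)])
print(double_slash(patient, "type") == name_test(descendant(patient), "type"))
```

In this model, the abbreviated form and a plain descendant step with the same name test select the same nodes in the same order.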
The node test determines the type and optionally the name of the nodes along the axis direction. For each axis, there is a principal node type: for the attribute axis, it is attribute; for other axes, it is element. You can select a node by applying one of the following node tests. The node is selected if the test evaluates to "true".
NodeTest | Description |
---|---|
processing-instruction() | a processing instruction node (regardless of name) |
processing-instruction('Literal') | a processing instruction node with name Literal; if name is omitted, then the test is "true" for any processing instruction node |
comment() | a comment node |
text() | a text node |
node() | a node of any type (regardless of name) |
Name | a node of the principal node type with the specified name |
prefix:name | according to the axis used: an element or attribute node in the specified namespace with the specified local name |
prefix:* | according to the axis used: all element or attribute nodes in the specified namespace |
*:name | according to the axis used: all element or attribute nodes with the specified local name (regardless of namespace) |
* | according to the axis used: all element or attribute nodes |
The last, optional part of a step is one or more predicates to filter the sequence of selected nodes according to the predicate expression. This expression is always enclosed in square brackets [ and ]. A selected node is retained if the predicate truth value of the predicate expression evaluates to "true".
The predicate truth value is derived by applying the following rules, in order:
If the value of the predicate expression is an atomic value of a numeric type, the predicate truth value is true if the value of the predicate expression is equal to the context position, and is false otherwise.
Otherwise, the predicate truth value is the effective boolean value of the predicate expression.
The effective boolean value of an expression is false if its operand is any of the following:
An empty sequence
The boolean value false
A zero-length value of type xs:string or xdt:untypedAtomic
A numeric value that is equal to zero
Otherwise, fn:boolean returns "true".
The filtered node sequence is ordered according to the direction of the selected axis.
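The rules for the predicate truth value and the effective boolean value can be sketched in Python as follows. This is an illustrative model only, assuming that sequences are represented as Python lists and atomic values as Python booleans, strings and numbers; the function names are hypothetical and do not belong to any Tamino API.

```python
from numbers import Number


def is_numeric(value):
    """True for numbers; Python booleans are excluded on purpose."""
    return isinstance(value, Number) and not isinstance(value, bool)


def effective_boolean_value(value):
    """Model of the effective boolean value rules listed above."""
    if isinstance(value, list) and len(value) == 0:
        return False                      # an empty sequence
    if value is False:
        return False                      # the boolean value false
    if isinstance(value, str) and value == "":
        return False                      # a zero-length string
    if is_numeric(value) and value == 0:
        return False                      # a numeric value equal to zero
    return True


def predicate_truth_value(value, context_position):
    """A numeric predicate value is compared with the context position;
    anything else is reduced to its effective boolean value."""
    if is_numeric(value):
        return value == context_position
    return effective_boolean_value(value)


def apply_predicate(items, predicate):
    """Keep the items for which the predicate truth value is true.
    The predicate receives the item and its 1-based context position."""
    return [
        item
        for position, item in enumerate(items, start=1)
        if predicate_truth_value(predicate(item, position), position)
    ]


books = ["first book", "second book", "third book"]
print(apply_predicate(books, lambda item, position: 2))
print(apply_predicate(books, lambda item, position: item.startswith("t")))
```

A constant numeric predicate such as 2 therefore selects by position, whereas a boolean predicate filters by condition.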
The XQuery type system is much richer than that of XPath 1.0. It uses the built-in data types as defined in XML Schema 1.0. The set of built-in data types consists of primitive types and derived types. They fall roughly into these categories:
Boolean values (true and false)
Numbers: decimals, floating-point numbers with single and double precision
Character Strings
Data types for dates, times, and durations (two of which are not yet defined in XML Schema)
XML-specific data types such as QName and NOTATION
In addition, there are types that are derived from the primitive types. The XML Schema documentation contains a diagram that summarizes the primitive and derived types, all of which are supported by Tamino XQuery.
Expressions and functions expect operands and parameters to be of a certain type. If the required type cannot be provided, type conversion is attempted. The following general methods can be applied:
Atomization takes place when an atomic value or a sequence of atomic values is expected. When atomizing a given value, the following cases can be distinguished: If the value is an atomic value or the empty sequence, then that value is returned. If the value is a single node, then the typed value of that node is returned. Otherwise an error is raised.
Atomization is used when processing arithmetic expressions, comparison expressions, function calls and sort expressions.
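The three atomization cases described above can be modelled in a few lines of Python. This is a sketch, not Tamino's implementation: the class ElementNode, the exception AtomizationError and the function atomize are hypothetical names, sequences are represented as Python lists, and a singleton sequence is treated as identical to its single item.

```python
class AtomizationError(Exception):
    """Raised when a value cannot be atomized under the rules modelled here."""


class ElementNode:
    """Stand-in for a node that carries a typed value."""

    def __init__(self, name, typed_value):
        self.name = name
        self.typed_value = typed_value


def atomize(value):
    if isinstance(value, list):
        if len(value) == 0:
            return []                     # the empty sequence is returned as is
        if len(value) == 1:
            return atomize(value[0])      # a singleton equals its item
        raise AtomizationError("more than one item cannot be atomized here")
    if isinstance(value, ElementNode):
        return value.typed_value          # a single node yields its typed value
    return value                          # an atomic value is returned as is


price = ElementNode("price", 65.95)
print(atomize(price) + 10)                # arithmetic works on the typed value
```

In an arithmetic expression, this step is what allows an element to be used as if it were a number.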
During processing of arithmetic expressions and value comparisons, an atomic value can be promoted from one type to another. As a general rule, the value of a derived type can be promoted to its base type. The value of the base type is the same as that of the original type. For example, a value of type xs:long can be promoted to its primitive base type xs:decimal, retaining its original value. Two further promotions between base types are possible: a value of type xs:decimal can be promoted to xs:float, the value being as close as possible to the original value. And a value of type xs:float can be promoted to xs:double, also retaining its original value.
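The difference between promotions that retain the value and the one that only approximates it can be illustrated in Python. This is a rough model, assuming that xs:decimal corresponds to decimal.Decimal, xs:float to an IEEE single-precision number and xs:double to a Python float; the function names are hypothetical.

```python
import struct
from decimal import Decimal


def long_to_decimal(value: int) -> Decimal:
    """Derived type to base type: the value is retained exactly."""
    return Decimal(value)


def decimal_to_float(value: Decimal) -> float:
    """xs:decimal to xs:float: the result is a nearby single-precision value.
    Going through a Python float first is an approximation of this model."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


def float_to_double(value: float) -> float:
    """xs:float to xs:double: every single-precision value is also a double."""
    return float(value)


as_decimal = long_to_decimal(150)
as_float = decimal_to_float(Decimal("0.1"))
as_double = float_to_double(as_float)

print(as_decimal == 150)                  # value retained
print(as_float == 0.1)                    # not exactly 0.1 in single precision
print(as_double == as_float)              # value retained
```

Only the step from xs:decimal to xs:float may change the value, because a decimal fraction such as 0.1 has no exact binary floating-point representation.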
A number of functions that operate on different types of data and perform various tasks are defined. Most of them are defined in the W3C specification XQuery 1.0 and XPath 2.0 Functions and Operators.
let $a := input()/bib/book return <p>Currently, there are { count($a) } books stored.</p>
In addition, Tamino XQuery provides further functions that perform full-text operations or deal with special aspects of documents stored in Tamino. These functions use the namespace http://namespaces.softwareag.com/tamino/TaminoFunction, usually prefixed by tf. They do not belong to the standard namespace http://www.w3.org/2002/08/xquery-functions, which is prefixed by fn. Since tf is a predefined namespace prefix, you neither have to qualify these functions with their full namespace nor declare the namespace.
for $t in input()/bib/book where tf:containsText($t/title, "UNIX") return $t
for $a in input()/bib/book where $a/title = "TCP/IP Illustrated" return tf:getCollection($a)
The first query uses a text retrieval function to look for all books that contain the word "UNIX" in their title. The second query uses a comparison expression to look for all books whose title is equal to the string "TCP/IP Illustrated".