Tamino-Specific Extensions to the Physical Schema

The physical schema specifies physical storage information, for example for mapping and indexing.

This section describes the Tamino-specific extensions to the physical schema under the following headings:


Doctype Specific Extensions

The following topics are covered in this section:

Structure Index

This element specifies whether a structure index is created for the corresponding doctype. The element is of type xs:NMTOKEN. The tsd:structureIndex element can have one of the following values:

none

No structural indexing is done.

condensed

The repository registers the existence of an undeclared node for the doctype.

full

The repository registers both the existence of undeclared nodes and the instances in which they occur.

The default value is condensed.

Computed Indexes

Computed indexes defined for a doctype allow for user-defined indexing based on an indexing function defined in an XQuery module. For details refer to the section Advanced Indexes in the Performance Guide.

Definition of Unique Keys

For a description of how to define unique keys, refer to the topic Definition of Unique Keys in the section Physical Schema for Elements and Attributes.

Compression

The tsd:compress element specifies compression options for the physical storage of the data. It specifies whether instances belonging to the corresponding doctype are physically stored in a compressed or an uncompressed format (the latter is recommended for small data records only). Also, the user can specify maximum compression or optimal performance. The tsd:compress element is of type xs:NMTOKEN.

Note:
This option is not applicable for non-XML data.

The following values are allowed:

smart

This is the default value. Tamino checks the data to be stored and uses the best compromise between speed and size.

always

Always compress as much as possible. This choice is appropriate if you are primarily interested in reducing storage size. Especially for small documents, this minimizes disk space but increases retrieval time.

none

Do not compress small data records. Large documents are not affected by this setting. This setting is recommended if you expect most of your documents to be small (< 8000 characters) and you want to optimize processing speed, at the price of increased storage space.

off

Do no compression at all.

utf8

Each character is replaced by its UTF-8 representation. This can result in a compression factor of up to 4, depending on platform and data.

Built-In Indexing of Non-XML Data

Indexing of non-XML data is possible, but in a somewhat different manner than the indexing of XML data. The following example shows how to define a non-XML index using the tsd:nonXML element:

<?xml version = "1.0" encoding = "UTF-8"?>
<xs:schema 
 xmlns:xs = "http://www.w3.org/2001/XMLSchema"
 xmlns:tsd = "http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition">
  <xs:annotation>
    <xs:appinfo>
      <tsd:schemaInfo name = "nonXML">
        <tsd:collection name = "Hospital"></tsd:collection>
        <tsd:doctype name = "X-Rays">
          <tsd:nonXML>
            <tsd:index>
              <tsd:text></tsd:text>
            </tsd:index>
          </tsd:nonXML>
        </tsd:doctype>
      </tsd:schemaInfo>
    </xs:appinfo>
  </xs:annotation>
  .
  .
  .
</xs:schema>

A plain text index is created for non-XML character data.

Physical Schema for Elements and Attributes

This section deals with the following topics:

The figures in The Schema Header show those parts of the metaschema describing information that can be associated with elements and attributes via xs:annotation and xs:appinfo using tsd:elementInfo and tsd:attributeInfo, respectively:

The meaning and usage of these extension elements and attributes are explained in detail in the Tamino XML Schema Reference Guide. A general explanation of the tsd:elementInfo/tsd:attributeInfo subtree follows.

Attaching Physical Schema Information to Specific Nodes

tsd:which

The tsd:which element enables you to specify different physical schema information represented by different tsd:physical elements inside a single tsd:elementInfo or tsd:attributeInfo (i.e. an element or attribute), if a node can be reached via multiple paths because of references to global elements or attributes. Physical schemas for different absolute path expressions can be grouped together. Therefore, if necessary, multiple tsd:which elements are allowed in one tsd:physical element to specify different possible access paths. If no tsd:which element is specified within a tsd:physical element, all possible paths not explicitly given in any tsd:which within a sibling tsd:physical have the same physical schema information.

The tsd:which element contains a string of type xs:string that describes an XPath expression, which must match the following constraints:

  • It must be an absolute path expression of the following form:

    /doctypeName/element/.../{currentElementName | @ currentAttributeName }

    The XPath expression must refer to an element or attribute defined in the schema (including the imported schema information).

  • Each XPath expression of each tsd:physical child of one element must be unique.

  • At most one tsd:physical without a tsd:which element is allowed.

The following rules apply for the default behavior of the tsd:which element:

  • The default for a non-recursive element is all absolute paths to this element.

  • If no tsd:which element is specified, the default applies.

  • For a recursive element, the following applies: If tsd:which is applied with the default option, it relates to the first stage of recursion.

  • Attachment of physical schema definition is not possible without the tsd:which element below recursive elements.

One situation where the usage of the tsd:which element is very advantageous is multi-path indexing. You can find an explained example including usage of the tsd:which element here.

context

The context attribute is used in tsd:elementInfo or tsd:attributeInfo elements specified within tsd:schemaInfo. Similar to tsd:which, its value is a path that determines the xs:element or xs:attribute node to which the tsd:elementInfo or tsd:attributeInfo element belongs.

Definition of Unique Keys

Starting with version 4.2, Tamino allows you to specify a uniqueness constraint at the doctype level. This means that you can indicate, during schema definition, fields or combinations of fields that Tamino should monitor, ensuring that they have a unique value within their doctype. The necessary checks are performed automatically when data are inserted and updated (or update-defined). An error message is issued if an attempt is made to violate a uniqueness constraint.

Unique keys are defined centrally in tsd:schemaInfo/tsd:doctype/tsd:logical. Below tsd:logical, you can define tsd:unique elements having one or more tsd:field children. These tsd:field children allow you to specify how the uniqueness constraint is composed of one or more fields. Each tsd:field child element has an attribute xpath of type xs:string.

Note:
The path is relative to the root element.

Caution:
If there are multiple tsd:field child elements of a given tsd:unique element, each child's xpath attribute must be distinct.

graphics/unique.png

Based on that, the following schema fragment shows how to define two unique keys for the structure displayed in the diagram:

<tsd:doctype name="A">
  .
  .
  .
  <tsd:unique name="CB-key">
    <tsd:field xpath="C" />
    <tsd:field xpath="B/@b" />
  </tsd:unique>
  <tsd:unique name="D-key">
    <tsd:field xpath="D" />
  </tsd:unique>
  .
  .
  .
</tsd:doctype>

In the example, two uniqueness constraints are defined: one for the combined key comprising the element C and the attribute b of element B; and the other one for element D.

Note:
You can define an arbitrary number of tsd:unique elements below the tsd:logical element.

The names of the fields to which the uniqueness constraint applies are specified using the xpath attribute of the tsd:field element. This attribute is mandatory. It is of type xs:string, with the additional requirement that its value must be a valid XPath expression, i.e. it must comply with the XML Path Language (XPath) Version 1.0 specification published by the W3C.

Valid xpath attributes are:

  • Any XPath expression that uses only the axes "child" and "attribute" and does not have more than one attribute;

  • The special case of a single dot ".", which denotes the contents of the root element as a unique constraint.

The following rules apply for unique key definitions:

  • Order of Components
    The order of components is not significant.

    Example: The following unique key definitions are equivalent:

    <tsd:doctype name="A">
      .
      .
      .
      <tsd:unique name="CD-key">
        <tsd:field xpath="C"/>
        <tsd:field xpath="D"/>
      </tsd:unique>
      .
      .
      .
    </tsd:doctype>
    

    and:

    <tsd:doctype name="A">
      .
      .
      .
      <tsd:unique name="CD-key">
        <tsd:field xpath="D"/>
        <tsd:field xpath="C"/>
      </tsd:unique>
      .
      .
      .
    </tsd:doctype>
    
  • Number of Possible Uniqueness Constraints per Doctype
    There is no theoretical limit to the number of uniqueness constraints that can be specified for one doctype.

  • Condition for Constraint Names
    Each constraint name must be unique within the doctype.

  • Condition for Number of Components of Constraint
    Each constraint must have one or more components. Empty constraints are not allowed

  • Exclusion of Double Declarations
    In any given constraint, the same component may not be used twice.

  • Mixing Elements and Attributes
    Elements and attributes may be mixed.

  • Mixing Datatypes
    Different datatypes may be mixed within one single unique key definition.

  • Mixing Collations
    The mixing of different collations within one single unique key definition is not supported.

  • Condition for Multiplicity of a Component
    Currently, each component of a unique key must have multiplicity of exactly 1. Further restrictions result from this constraint:

    • Variety list Not Allowed
      A uniqueness constraint must not reference a node of simple type with variety "list". This includes the predefined datatypes xs:ENTITIES, xs:IDREFS and xs:NMTOKENS.

    • Reference to Nillable Elements
      A uniqueness constraint must not refer to an element with nillable="true".

  • Variety union Not Allowed
    A uniqueness constraint must not reference a node of simple type with variety "union".

  • Condition for XPath Expressions
    XPath expressions are relative to the root element.

  • Addressing the Root Element
    The expression xpath="." refers to the root element.

  • XPath Expressions and Mapped Nodes
    XPath expressions must not point to nodes with Adabas, SQL or SXS mapping.

    For correct operation of the unique key constraints, all relevant nodes must be updated under the control of Tamino. This is true for all natively-stored data, but not for data under the control of an external system, nor for mapped data whose update cannot be fully controlled by Tamino.

  • Unique Key Definitions and Compound Index Definitions on Identical Fields
    There is no interaction between unique key definitions and compound index definitions (which are technically related and are discussed later in this document). You can define a unique key constraint and a compound index that are based on the same fields. It does not matter whether or not the fields appear in the same order.

  • Criterion for Identity of Extremely Long Keys
    In order to guarantee uniqueness, Tamino uses an index to keep track of all keys. Tamino truncates keys if they are too long. Currently, the length limit of a key in Tamino's unique key storage area is 1,004 bytes in UTF-8 representation. As a consequence, two keys are considered to be identical if the first 1,004 UTF-8 characters of both keys are identical.

  • Date/Time Types Not Allowed
    Unique constraints may not be defined on elements or attributes of the following types:

For more information about how to achieve the best performance when using the unique key feature, see the Performance Guide.

Indexing XML Data for Native Storage

This section discusses some aspects of the various kinds of indices that Tamino offers and of indexing in XML databases in general. In detail, the following topics are discussed here:

Why Define an Index?

One of the most important reasons for defining a schema for XML data is to index the data for "native"storage, i.e. storing the data in Tamino's internal XML store, although a schema may also be needed for validation and extension purposes.

The performance of database systems depends largely on correct indexing. This is just as true for Tamino as it is for other database systems. Correct indexing is very important, as it is a prerequisite for efficient database processing and retrieval.

However, if you have data for native storage for which no index is required, it is not necessary to make any descriptions, declarations or specifications at all, as native is the default storage option for XML data in Tamino, and no indexing is also the default.

What is an Index?

In general, an index is a look-up table for reducing the time taken to query data stored in a database. The look-up table usually stores the corresponding record number for each occurrence of a given value in a database field. It is possible to define an index:

  • on a single field;

  • on a combination of fields.

Example

For example, assume a document with the structure displayed in the picture below: a root element A has multiple children B, each having two children named C and D. Also assume that there are two instances of this document with the ino:id values 17 and 31 containing the names given in the lowest row of the picture.

graphics/index-tree.gif

Then the index look-up table for A/B/C based on these instances would look like this:

Index for A/B/C
John 17, 31
Mike 31
Paul 17

Similarly, the look-up table for an index for A/B/D would look like the following:

Index for A/B/D
Fuller 31
Miller 17
Smith 17, 31

However, Tamino does not use record numbers; rather, it uses the ino:id to identify data within the database.

Tamino also offers more complex types of indices: see the following sections.

Index Categories Supported by Tamino

There are two ways of classifying indices in Tamino:

  • Firstly, indices can be classified according to the search technology on which they are based. From this point of view, we distinguish between text and standard indices.

    Standard index

    A standard index is the classical database index. It is based on standard database indexing technology and is usually datatype independent.

    The complete node content is used as the index value. When a document is stored or updated, Tamino puts data into a specific index, thus enabling efficient queries.

    Tamino offers standard indexing support for all available XML Schema datatypes.

    Text index

    In contrast to a standard index, a text index uses special full-text searching technology based on a word-by-word analysis of text data. Single words are stored in the index.

    For more detailed information about text indexing, see the section Text Retrieval of the Advanced Concepts documentation.

    Structure index

    A special kind of index is the structure index that is described above.

  • On the other hand, different technologies can be applied to various index types. From this point of view, we distinguish between the following kinds of index definitions:

    Simple indices

    These indices represent the normal type of index. They are quite comparable to the indices that may be defined in standard database systems. Such indices have been available in Tamino versions both prior to and following version 4.2.1. The term "simple" in this context denotes that the index is less complex than the other types discussed here.

    Multi-path indices

    This type of index was introduced in Tamino version 4.2.1.

    For more information, see the section Multi-Path Indices.

    Compound indices

    This type of index was introduced in Tamino version 4.2.1.

    Compound indices may only be defined as standard indices.

    For more information, see the section Compound Indices.

    Reference indices

    This type of index was introduced in Tamino version 4.2.1. It is based on the idea that only a sub-tree of the document tree is indexed in this particular index.

    For more information, see the section The Reference Index: Indexing with Respect to Sub-Trees of a Document.

Note:
For non-XML data, it is also possible to define text indices. This option is described in the section Built-In Indexing of Non-XML Data.

For more information about achieving the best performance when using these advanced indices, see the appropriate section of the Performance Guide.

How to Define an Index in Tamino

This can be accomplished either with help from a tool or manually. These options are described in the following sections:

Defining an Index Using Tamino Schema Editor

Using the Tamino Schema Editor documentation, it is easy to define an index:

Start of instruction setTo define an index using the Tamino Schema Editor

  1. Choose the element or attribute that is to be indexed in the tree-display on the left side of the Tamino Schema Editor.

    On the right side of the Tamino Schema Editor, two areas labeled with:

    • Logical properties

    • Physical properties

    will appear.

  2. In the Physical properties area, set the value for the property <index> to "standard" if you intend to define a standard index, or to "text" if you intend to define a text index.

    Instances of the chosen element or attribute will be indexed to optimize queries containing relations like comparisons in retrieval expressions.

    Note:
    There is also a "standard+text" option available if you want to define both standard and text indices synchronously.

For more information, see the documentation of the Tamino Schema Editor.

Defining an Index Manually

Start of instruction setTo define an index manually

  • To define a simple index for an element or attribute manually, find the definition of the element or attribute in the schema file and insert a tsd:index element as a child element of the corresponding tsd:native element in the definition of the element or attribute.

To define advanced types of indices manually, add child elements to tsd:index that contain the appropriate information.

Simple text and standard Indices

Simple Indexing
Index Definition Using the tsd:index Element

The tsd:index element may contain:

A tsd:text element

This indicates that the node is indexed for full text retrieval.

A tsd:standard element

A standard index is built. This option is usually applied for data typing.

Neither

No indexing is performed.

A text index and a standard index may also be defined synchronously. In a simple index, neither the only tsd:text element nor the only tsd:standard element may have any child element.

XML Indexing Considerations

The challenge in indexing XML data lies in deciding which XML data you wish to store and what the important queries against the stored data are likely to be. Since the "meat" of the data often lies in terminal nodes or sub-trees, the guiding principle is therefore to define an access path, i.e. an index, to those terminal nodes that are likely to be queried. Subtrees that are not relevant for filter conditions do not need to be indexed.

Tamino provides default settings for attributes that allow you to generate a Tamino schema without any editing. By default, neither elements nor attributes are indexed. Thus, for a natively-stored node without index nothing at all needs to be specified. Furthermore, it is very simple to add indexing to a natively-stored node:

Start of instruction setTo create an index for a natively-stored node

  • Specifying an index element with a text child element (i.e. an tsd:index element with a tsd:text child element) on the root node causes full text indexing in the whole instance tree. Doing this again on individual terminal nodes causes double indexing; this may be useful in cases where you wish to make information accessible via full text searches across a whole doctype as well as via specific search expressions (for example, "Give me patient records that contain the string Atkins", as well as "Give me the record of the patient whose surname is Atkins".

A special case of defining schemas for XML data is the representation of recursive structures in a schema.

The practical management of indexing is explained in the section Examples.

The Reference Index: Indexing with Respect to Sub-Trees of a Document

The General Concept of the Reference Index

A simple index indicates that a node belongs to a certain document, but it does not contain any more precise information. This may lead to poor performance if the document contains similar or equal sub-trees occurring with multiplicity that would have to be searched. Tamino provides reference indexing so that you can tune your schema for better performance in these situations.

Reference indices provide improved support for very large documents and for high-complexity documents.

We speak of high complexity if sub-trees with multiplicity occur in the schema. The complexity is even greater if recursive structures occur in the schema.

The general idea behind reference definitions and reference index definitions is to keep an additional index containing only references for a predefined partial tree of the document and to identify these references for later use.

For example, if /Doc/A/B is specified as the reference node for the reference index within the contents of the tsd:refers element, B nodes will be registered in the reference index.

In more detail, this leads to the following consequences:

  • In contrast to earlier versions of Tamino, document fragments (partial sub-trees of the document tree) can now be specifically addressed.

  • The target node of a reference index is a "dedicated" node. Such nodes define a reference chain.

  • Special queries can be significantly accelerated by defining a reference index. A particularly relevant acceleration happens to queries that contain a logical AND operation, if the AND operation is performed relative to a reference node.

  • Appropriate index definition is even possible in complex schemas containing recursive definitions.

Example of a Reference Index

Again, assume a document with the structure displayed in the picture below: a root element A has multiple children B, each having two children named C and D. Also assume there are two instances with the ino:id values 17 and 31 containing the names given in the lowest row of the picture.

graphics/index-ref-tree.gif

We now introduce reference numbers to identify the B nodes that appear with multiplicity; we also replace the ino:ids in the right column of the traditional index look-up table by the reference numbers of the B nodes that appear with multiplicity.

The look-up table for an index for A/B/C based on these instances looks like this:

Reference Index for A/B/C:
John 7,9
Mike 10
Paul 8

Similarly, the look-up table for a reference index for A/B/D looks like the following:

Reference Index for A/B/D:
Fuller 9
Miller 8
Smith 7,10
Conclusion

The tsd:refers element of TSD can be used in two different ways, depending on the intention:

Reference Definition

You can define a reference based on either a standard index or a text index.

To define a standard index or a text index that references a particular path, add a tsd:refers element as a child element of the respective tsd:standard or tsd:text element. The content of this tsd:refers element is the path to the node that is to be used as a base for a reference. (This node is marked as a reference node.) It is given there as an absolute path according to the rules of the W3C's XPath language.

This means that not the whole document tree is covered by the index, but only the sub-tree below that path (in the example below it is /A/B, for example).

...
<xs:element name = "C" type = "xs:string">
...
  <tsd:elementInfo>
    <tsd:physical>
      <tsd:which>
        /Doc/A/B/C
      </tsd:which>
      <tsd:native>
        <tsd:index>
          <tsd:text>
            <tsd:refers>
              /Doc/A/B
            <tsd:refers>
          </tsd:text>
        </tsd:index>
      </tsd:native>
    </tsd:physical>
  </tsd:elementInfo>
...
</xs:element>
...

The following rules apply:

  • The element describing the index (in the contents of the accompanying tsd:which element) must have an ancestor that matches the XPath expression of the tsd:refers value.

  • Only absolute path expressions without wildcards are supported in this context.

As long as these conditions are fulfilled, recursive structures can be indexed.

Reference Index Definition

A node B to be referenced in indexing can be defined as follows:

...
<xs:element name = "B" maxOccurs = "unbounded">
  ...
  <tsd:elementInfo>
    <tsd:physical>
      <tsd:native>
        <tsd:index>
          <tsd:reference>
            <tsd:refers/>
          </tsd:reference>
        </tsd:index>
      </tsd:native>
    </tsd:physical>
  </tsd:elementInfo>
  ...
</xs:element>

If this node has a child element C that should have a reference index relative to B, this is accomplished as follows:

For more complex schemas and their corresponding documents, TSD's reference index definitions offer the possibility of two-stage modeling, e.g. /Doc/A/B refers to /Doc/A and /Doc/A refers to the document id.

For example, imagine an element named B that should have an ancestor named A and should be described by the following element definition schema fragment:

The element B is defined by this element definition from the schema fragment with a reference index definition that works in such a way that references pointing to node B should now point to its parent /Doc/A, for which a reference must have been defined.

Note:
This schema fragment sets up a reference chain from B over A to the document root. This reference chain can also be denoted as B->A->Doc.

Another motivation for using reference index definitions may come from the fact that recursive structures can also be indexed if the rules given below are not violated.

The following rules apply generally for reference index definitions:

  • If a tsd:which element has been specified in this context:

    The parent element of the element mentioned within the contents of the accompanying tsd:which must match the XPath expression specified below the tsd:refers element.

  • Only absolute path expressions without wildcards are allowed in the XPath expression.

  • A tsd:text or tsd:standard element may have only one single tsd:refers as a child element.

    Note:
    However, if you want to define more than one reference index on the same node, you can achieve this by using multiple tsd:text or tsd:standard elements.

  • tsd:reference may not be specified for attribute nodes.

The tsd:refers child element of the tsd:reference element behaves the same as it does in a reference definition as described above.

Reference index definitions can also be combined with other index definitions, for example multi-path indices and compound indices.

Constraints on Reference Indices
  • Multiple reference index definitions are not allowed on the same node.

    <xs:element name = "B" maxOccurs = "unbounded">
      ...
      <tsd:elementInfo>
        <tsd:physical>
          <tsd:native>
            <tsd:index>
              <tsd:reference>
                <tsd:refers>/Doc/A<tsd:refers>
              </tsd:reference>
    	         ...
            </tsd:index>
          </tsd:native>
        </tsd:physical>
      </tsd:elementInfo>
      ...
    </xs:element>
    

    The above fragment is valid.

    <xs:element name = "B" maxOccurs = "unbounded">
      ...
      <tsd:elementInfo>
        <tsd:physical>
          <tsd:native>
            <tsd:index>
              <tsd:reference>
                <tsd:refers>/Doc/A<tsd:refers>
              </tsd:reference>
    	         <tsd:reference>
                <tsd:refers>/Doc/C/D<tsd:refers>
              </tsd:reference>
    	         ...
            </tsd:index>
          </tsd:native>
        </tsd:physical>
      </tsd:elementInfo>
      ...
    </xs:element>
    

    The above fragment is invalid: multiple tsd:reference elements are not allowed.

  • Defining a reference index is only allowed if the structure index is set to either "condensed" or "full".

You can find a more detailed example below in the section Example 6: Defining a Reference Index. For example, the index definitions made there accelerate the processing of the following queries:

  • /Doc/A/B[C~='my' and D=1]
  • /Doc/A[B[C~='my' and D<1]]
  • /Doc[A/B[C~='my' and D<1]]

However, the following queries do not benefit from the definition of a reference index:

  • /Doc/A[B/C~='my' and B/D<1]]

    Note:
    Indeed, for this query the execution time is longer if a reference index is defined on B (compared to a simple index).

  • /Doc[A/B/C~='my' and A/B/D>2]	index lookup C and D

    Note:
    This query is more efficient with a simple index than with a reference index.

General Rules for Index Combination

The following rules concerning combination possibilities apply to reference indices:

  • A reference index can be defined along with a standard or a text index.

  • Reference indices can be combined with both compound indices and multi-path indices.

  • Within one tsd:index element, currently only one tsd:text element containing a tsd:refers element is allowed.

Advantages and Disadvantages of Reference Indexing
Advantages
  • A reference index can be used in combination with any other kind of index (standard, text, compound, multipath).

  • Improved selectivity by combining values relative to subtrees.

Disadvantages
  • Sorting over the index with tsd:refers is not possible.

  • The number of documents in a doctype is limited by the number of allowed entries.

Effect of Reference Index Definitions on Tamino Limits

The number of documents that can be stored in a Tamino database decreases significantly when reference indices are used, especially when multiple definitions of a reference index have been made in this doctype.

Effects on Performance

The correct choice of reference indices can have a significant effect on performance. This is discussed in more detail in the Performance Guide.

Multi-Path Indices

In contrast to previous versions of Tamino, where every index reflected an absolute XPath address, it is now possible to define an index in Tamino whose XPath address matches special criteria. This index is called a multipath index. The definition of multi-path indices is discussed here under the following headings:

Motivation

The reasons that lead to the introduction of multi-path indices are as follows:

  • Versions of Tamino prior to 4.2.1 did not support indices for recursive or highly nested structures. In order to improve the usability of Tamino in such situations and also to avoid possible loss of performance, as from Tamino 4.2.1 both recursive and highly nested structures can be indexed using multi-path indices.

  • In earlier Tamino versions, index support was missing for queries that contain wildcard expressions, for example /Play//Title. Queries containing wildcard expressions performed badly, due to the time-consuming analysis of all matching indices with respect to the given predicate.

The General Concept of Multi-Path Indexing

A multi-path index allows data from multiple paths to be collected in a single index.

Whereas in simple indexing each index reflects an absolute XPath address, this does not hold for a multi-path index.

A multi-path index is filled by the data of all nodes that match the various XPath conditions. This increases the performance for queries with wildcards significantly, as only the sum of all relevant entries is searched (and not all possible combinations, as in a simple index).

A multi-path index can be defined either as a text index or as a standard index, depending on which was specified as the parent element of the tsd:multiPath element in the schema.

It is also possible to combine multi-path indexing with any other kind of indexing within one node:

- Simple standard index;
- Simple text index;
- Compound index.

Practically, a multi-path index is defined as follows:

All nodes (this includes both element nodes and attribute nodes) that should contribute to the multi-path index are identified by a special label that is unique within the entire schema. This allows you to specify easily the values that should be assembled into each index.

In the schema definition for a multi-path index, this is simply expressed as an additional element tsd:multiPath which contains the common label to be used in the index definitions of all nodes that should be indexed. All nodes (i.e. elements and attributes) that have the same label in their index definition address the same index.

For example, as you can find in the contents of the tsd:multiPath element, the three labels MultiPathIndex0, MultiPathIndex2 and MultiPathIndex3 are used in the example below.

This approach offers the following solutions:

  • It is now possible to define an index on nodes that are part of a recursive schema. Multi-path indexing provides efficient indexing, even on recursive structures that could not be indexed in earlier versions of Tamino.

  • Support for highly connected schemas is offered.

    Highly connected schemas in this sense are schemas that contain nodes that are referenced many times from other nodes (for example by usage of the xs:any or xs:anyAttribute elements within the schema).

Example Schema Excerpts for Multi-Path Indexing

First we discuss a simple example, showing the use of multi-path indexing in a recursive situation.

Imagine, for example, an element title that may be addressed as a child element within a recursive schema. See the illustration below:

graphics/tsdmulti-path-rec.png

To create a multi-path index for the element title, add the following lines to its tsd:native element:

.
.
.
<tsd:index>
  <tsd:text>
    <tsd:multiPath>Title index<tsd:multiPath>
  </tsd:text>
</tsd:index>
.
.
.

This code creates a multi-path index named Title index that covers all possible access paths that lead to title by recursion. One single index covers all recursion levels.

Now let us have a look at a simple example for highly connected schemas.

Imagine, for example, an element title that can be addressed via multiple paths within a highly connected schema. See the illustration below:

graphics/tsdmultipath-ex-1.png

To create a multi-path index for title, simply add the same lines as in the preceding example to its tsd:native element:

.
.
.
<tsd:index>
  <tsd:text>
    <tsd:multiPath>Title index<tsd:multiPath>
  </tsd:text>
</tsd:index>
.
.
.

This code creates a multi-path index named Title index covering all access paths that lead to title. However, index generation does not depend on the path leading to title in this case.

The generated multi-path index accelerates queries such as:

Play [ .//Title ~= "King" ]

Finally, let us discuss a more complex example:

.
.
.
<!-- global Element Definition for Title -->
  <xs:element name = "Title" type = "xs:string"> 
    <xs:annotation>
      <xs:appinfo>
        <tsd:elementInfo>
          <tsd:physical>
            <!-- default which for global Element Definition Title -->
            <tsd:native>
              <tsd:index>
                <tsd:text>
                  <tsd:multiPath>MultiPathIndex0</tsd:>
                </tsd:text>
              </tsd:index>
            </tsd:native>
          </tsd:physical>
          .
          . 
          .
          <tsd:physical>
            <tsd:which>/Play/Act/Title</tsd:which>
            <tsd:native>
              <tsd:index>
                <tsd:text>
                  <tsd:multiPath>MultiPathIndex2</tsd:multiPath>
                </tsd:text>
              </tsd:index>
            </tsd:native>
          </tsd:physical>
          <tsd:physical>
            <tsd:which>/Play/Act/Scene/Title</tsd:which>
            <tsd:native>
              <tsd:index>
                <tsd:text>
                  <tsd:multiPath>MultiPathIndex3</tsd:multiPath>
                </tsd:text>
              </tsd:index>
            </tsd:native>
          </tsd:physical>
        </tsd:elementInfo>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
  .
  .
  .
  <!-- local Element Definition for /Play/Prologue /Title -->
  <xs:element name = "Title" type = "xs:string">
    <xs:annotation>
      <xs:appinfo>
        <tsd:elementInfo>
          <tsd:physical>
            <tsd:native>
              <tsd:index>
                <tsd:text>
                  <tsd:multiPath>MultiPathIndex2</tsd:multiPath>
                </tsd:text>
              </tsd:index>
            </tsd:native>
          </tsd:physical>
        </tsd:elementInfo>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
  .
  .
  .
  <!-- local Element Definition for /Play/Act/Abstract/Title -->
  <xs:element name = "Title" type = "xs:string">
    <xs:annotation>
      <xs:appinfo>
        <tsd:elementInfo>
          <tsd:physical>
            <tsd:native>
              <tsd:index>
                <tsd:text>
                  <tsd:multiPath>MultiPathIndex3</tsd:multiPath>
                </tsd:text>
              </tsd:index>
            </tsd:native>
          </tsd:physical>
        </tsd:elementInfo>
      </xs:appinfo>
    </xs:annotation>
  </xs:element>
  .
  .
  .

This definition creates three multi-path indices that cover all Title nodes, namely:

  • An index for /Play/*/Title, labeled as MultiPathIndex2, containing:

    • Data from the global title element restricted by the <tsd:which>/Play/Act/Title, and

    • Data from the local element in /Play/Prologue.

    This index covers the green area in the illustration.

  • An index for /Play/Act/*/Title, labeled as MultiPathIndex3, containing:

    • Data from the global title element restricted by the <tsd:which>/Play/Act/Scene/Title, and

    • Data from the local element in /Play/Act/Abstract.

    This index covers the yellow area in the illustration.

  • An index for all other Title nodes, labeled as MultiPathIndex0.

    This index covers the light brown area in the illustration.

This situation is depicted in the following graphic:

graphics/tsdmultipath1.png

You can find another example in the section Example 4: Defining of a Multipath Index. For example, the index definitions discussed there accelerate the processing of the following queries:

  • /Doc[A//E/F ='Hy']
    
  • /Doc/A[.//E/F ='Hy']
    
  • /Doc/A[B/C='my' and .//E/F='Hy']
    
General Rules for Index Combination

The following rules concerning combination possibilities apply to multi-path indices:

  • A multi-path index may be defined along with a standard or a text index.

  • A multi-path index can be combined with both compound indices and reference indices.

Constraints and Limitations

The following constraints apply to the definition of multi-path indices:

  • The combination of indices from different paths into one physical multi-path index requires that the participating indices all be of the same kind.

    The following combinations are invalid:

    • Combination of standard and text indices.

    • Combination of different datatypes:

      All datatypes of nodes (i.e. elements and attributes) contributing to the multi-path index must be of the same common datatype.

      For standard indices, a common XML Schema datatype must be chosen for all nodes.
      For compound indices, the datatypes of all parts that are combined must be the same.

    • Combination of different collation specifications:

      Items with non-matching tsd:collation elements are rejected. All child elements of tsd:collation must match.

    • Combination of compound and non-compound indices.

    • Combination of compound indices with different layouts (number of components and sequence of datatypes).

      Note:
      However, the paths may differ.

  • The label contained in the tsd:multiPath element must not be empty.

  • If you want to work with multi-path indices, the structure index must have been activated by setting it to either "condensed" or "full".

Advantages and Disadvantages of the Use of Multi-Path Indexing
Advantages

The main advantage of multi-path indexing is easy indexing of recursive and highly-connected structures.

Disadvantages

The main disadvantage of multi-path indexing is that sort operations are not supported.

Compound Indices

This section discusses the following topics related to compound indices:

Motivation

In some queries, combined value conditions on inner nodes (i.e. nodes that are neither the root note nor leaf nodes) appear with a multiplicity of more than one. For example, /A[B[C="x" and D="y"]] represents such a query. Characteristic for this construct is:

  • the multiplicity of B greater than 1, and:

  • there is an AND operation below B.

In general, such queries perform poorly using simple indices. This is because the simple index delivers a superset of the final result, which must then be filtered in a subsequent processing step. The compound indices that are available in Tamino since version 4.2.1 are ideally suited to accelerate such queries.

Note:
Reference indices are also well suited in this situation. To decide whether to use compound indexing or reference indexing, read the sections about the advantages and disadvantages of each kind of indexing in this document.

General Concept

A compound index is somewhat different from the other kinds of index discussed here. A simple index, a reference index and a multi-path index (as discussed above) are each bound to one specific element or attribute. This not true for a compound index, which comprises two or more different components, called fields. The definition, however, takes place in the inner node (B), which is the parent of both elements in the condition (C and D in the example).

To define a compound index, add a tsd:field element as a child element directly below the tsd:standard element.

The tsd:field element must appear at least twice (with different xpath attributes) to define a valid compound index.

Note:
If the tsd:field element appears as a child element of the tsd:standard element, in order to define a valid compound index it must appear at least twice, since at least two fields are required to set up a compound index. The xpath attributes of at least two tsd:field elements belonging to the same tsd:standard element must be different.

You can use the xpath attribute of the tsd:field element to specify a path to an element or attribute contributing to the compound index.

Important:
The xpath attribute is required.

A tsd:field element with a corresponding xpath attribute must be defined for each XPath address to be included in the compound index. Each path is relative to the element in which it is defined.

Note:
The order of the tsd:field elements is significant; it determines the order in which the values contribute to the index entry.

Supported xpath attributes are:

Sequences of element names (optionally followed by a single attribute name), separated by slashes

Any XPath expression that uses only the axes "child" and "attribute".

A single dot

The special case of a single dot ".", which indicates that the index is to be created from the value of the element and of the values of its attributes.

Caution:
It is possible, although generally not desirable, to define a schema with multiple equivalent definitions of compound indices. In this case, redundant indices are created.

Example for a Compound Index

Assume a document with the structure displayed below with a root element A having multiple children B, each having two children named C and D. Also assume there are two instances with the ino:id values 17 and 31 containing the names given in the lowest row of the picture.

graphics/index-tree.gif

The look-up table for a compound index for B(C,D) based on these instances is as follows:

Compound Index for B(C, D)
John_Fuller 31
John_Smith 17
Mike_Smith 31
Paul_Miller 17
Example Schema Excerpt for Compound Indexing

Let A be the root element of a document. A has child elements B and C (there may also be others). B has an attribute named b.

The following example defines a compound index for the combination of attribute b of element B and the element C:

<xs:element ...>
  .
  .
  .
  <tsd:elementInfo>
    <tsd:physical>
      <tsd:native>
        <tsd:index>
          <tsd:standard>
            <tsd:field xpath="C"/>
            <tsd:field xpath="B/@b" />
          </tsd:standard>
          .
          .
          .
        </tsd:index>
      </tsd:native>
    </tsd:physical>
  </tsd:elementInfo>
</xs:element>

The compound index created by this schema definition contains an entry for each combination of C and B/@b occurring below the element in whose tsd:elementInfo it appears.

The performance of queries such as /A[@b="x" and C="y"] is improved by this compound index.

You can find a more detailed example in the section Example 3: Defining a Compound Index of this document. The index definitions discussed there improve the performance of the following queries:

  • /Doc/A/B[C= 'my' and D=1]
  • /Doc/A[B[C= 'my' and D<1]]
  • /Doc[A/B[C= 'my' and D<1]]
General Rules for Index Combination

The following rules concerning combination possibilities apply to compound indices:

  • A compound index may be defined only as a standard index.

  • A compound index can be combined with a reference index or with a multi-path index.

  • The order of the components in the definition of the compound index is significant.

  • You can define a compound index that covers more than two fields.

  • You can define an arbitrary number of compound indices for one schema, as long as the limit on the total number of indices required by a schema is not violated.

  • Components with arbitrary datatypes and collations are supported.

Constraints and Limitations

The following constraints apply for the definition of compound indices:

  • A compound index definition with exactly one tsd:field element is invalid.

  • If the path specified in the xpath attribute of one or more tsd:field child elements of the tsd:standard element is invalid in the sense of the W3C XML Path Language (XPath) Version 1.0 specification, then the compound index specification is invalid in its entirety.

  • It is not possible to define a compound index definition for text indices, see above.

  • A compound index whose components point to nodes with mapping to Adabas, SQL or Tamino Server Extensions is not allowed.

  • A compound index may not be defined below attributes. It can only be defined on an element.

Advantages and Disadvantages of Compound Indexing
Advantages
  • A compound index can be used together with all kinds of indices except text indices (standard, reference and multipath indices are supported).

  • Improved selectivity by combining values relative to subtrees.

  • Only one index look-up instead of multiple index look-ups.

  • Sort over all components is possible.

Disadvantages
  • Compound index definitions are not available for text indexing.

Other Sources of Information on Indexing

External Mapping

To map parts of XML documents to externally stored data (Adabas, or using a Tamino Server Extension), use the tsd:map element.

An example showing a mapping for an element follows:

<xs:element>
 <xs:annotation>
  <xs:appinfo>
   <tsd:elementInfo>
    <tsd:physical>
     <tsd:map>
     .
     .
     .
     </tsd:map>
    </tsd:physical>
   </tsd:elementInfo>
  </xs:appinfo>
 </xs:annotation>
 .
 .
 .
</xs:element>

The tsd:map element may contain an optional tsd:ignoreUpdate child element. If present, Tamino does not pass the corresponding part of the XML document to the internal data storage or X-Node during processing.

This section covers the following topics:

Mapping to Adabas Files

The Tamino schema provides constructs that allow you to store data in an Adabas file and/or retrieve data from an Adabas file via the Tamino X-Node.

Mapping to Adabas is done on a file basis. In other words, you do not have to model the whole of an Adabas database using the Tamino schema language; you model only those files and fields that you wish to access using Tamino.

Mapping to an Adabas file is explained under the following headings:

Pure vs. Hybrid Adabas Mapping

There are two possible approaches for mapping to an Adabas file:

  • Pure Mapping
    This means storing the data for a document in a single Adabas file.

    Pure XA Mapping

    Pure XA mapping defines a mapping from any XML schema to Adabas. All XML data can be stored in an Adabas file and accessed via normal XML operations, using Adabas as the XML store.

    This option is especially suited for mapping from existing XML schemas to Adabas files.

    Pure AX Mapping

    Pure AX mapping defines a mapping from any Adabas schema to an XML schema. Correspondingly, all XML operations are mapped to Adabas operations. All Adabas data can be accessed via XML operations (XML view on Adabas data).

    This option is especially suited for mapping from existing Adabas files to XML schemas.

    A pure mapping is denoted by the presence of a tsd:pure child element below the tsd:map element; otherwise, the mapping is hybrid. The tsd:pure element is described in the next subsection and in the section tsd:pure element of the Tamino XML Schema Reference Guide .

    In older versions of Tamino, the X-Node implementation was only capable of representing a hybrid XA mapping. The tsd:pure element now enables pure XA mapping. Pure Adabas mapping is specified as a physical option for a doctype:

    <tsd:doctype xmlns:tsd="http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition">
    	<tsd:physical>
    		<tsd:map>
       <tsd:pure/>
      </tsd:map>
    	</tsd:physical>
    </tsd:doctype>
  • Hybrid Mapping
    Hybrid Mapping means storing the data for a document partly in Tamino and partly in a single Adabas file. This mapping option has been supported by all former versions of Tamino.

    Hybrid XA Mapping

    Hybrid XA mapping represents the combination of several mappings together with native storage parts into one document via a single schema. Varying degrees of hybrid mapping are possible.

    Hybrid AX Mapping

    Hybrid AX mapping integrating existing Adabas and natively-stored data into a single XML view (XML view on Adabas and native XML data).

Data is handled in the following manner if a pure Adabas mapping is present:

  • Important:
    In such a doctype only one Adabas file can be mapped.

  • No data is stored in Tamino for that doctype.

    Any real content in the document must be mapped to Adabas.

  • The only permitted manipulation operations on pure Adabas mapped doctypes are:

    _process

    To insert new documents.

    In particular, inserting new documents does not assign an ino:id, and any markup (for example comments, processing instructions or insignificant whitespace) is lost.

    XQuery Update

    XQuery Update is the only possible means to update and delete documents.

    Even though a document was inserted through Tamino, it cannot be updated with _process, and it cannot be deleted with _delete (updating and deletion are only possible through _xquery update).

    _xquery update does not store anything in Tamino, any modifications of the markup (e.g. comments) are lost.

  • You cannot use a full structure index in conjunction with pure mapping to Adabas.

The following conditions must be fulfilled in order to establish a pure Adabas mapping:

  • The root element referenced by the doctype mapped to pure must contain one Adabas file mapping (tsd:subTreeAdabas).

  • Each element declaration of simple content and each attribute declaration must contain one Adabas field mapping (tsd:nodeAdabasField).

  • An element declaration of complex content is not allowed to contain mixed content (xs:complexType mixed="false").

  • The doctype must not contain recursions.

  • The doctype does not allow for open content, i.e. <tsd:content>open</tsd:content> is not allowed.

  • Elements with type="xs:anyType" are not allowed.

  • No element without xs:simpleType, xs:complexType, or type attribute may occur.

  • No wildcard (xs:any, xs:anyAttribute) with processContents = "skip" or "lax" is allowed.

  • The root node must be mapped to an Adabas file.

  • Each descendant of the root node must meet one of the following conditions:

    • It has a complex content model that does not allow for mixed content and is not empty;

    • It has simple content that is mapped to an Adabas field.

  • The same Adabas field may not be mapped to different paths. Otherwise a run-time error might be signaled.

Plain URL Addressing

Normally, when a document is inserted using plain URL addressing (HTTP PUT), Tamino returns the ino:id of the new document in the HTTP header field X-INO-ID. For pure X-Node Adabas doctypes, there is no ino:id when inserting a document, and therefore the X-INO-ID field contains zero. Consequently, the pure X-Node Adabas documents cannot be read using plain URL addressing (HTTP GET), even if they were inserted through Tamino.

Elements and Attributes for Adabas Mapping

This section describes the attributes that are of particular relevance to Adabas mapping.

Adabas-Specific Elements and Attributes

tsd:subTreeAdabas element

This element provides the database ID and file number in its attributes dbid and fnr, with an optional password in the password attribute.

dbid attribute

The dbid attribute belongs to the tsd:subTreeAdabas element. Its value contains the database ID of the Adabas database.

fnr attribute

The fnr attribute belongs to the tsd:subTreeAdabas element. It represents the file number of the Adabas file.

password attribute

The optional password attribute belongs to the tsd:subTreeAdabas element. It contains the password of the Adabas file to be accessed. The password has up to 8 characters.

tsd:subTreeAdabasPE element

This element models a mapping to an Adabas periodic group via Tamino X-Node.

tsd:nodeAdabasField element

This element is used in conjunction with the elements tsd:subTreeAdabas and tsd:subTreeAdabasPE to specify the characteristics of the leaf nodes.

shortname attribute

The shortname attribute can be an attribute either of the tsd:subTreeAdabasPE element or of the tsd:nodeAdabasField element. This is the shortname (e.g. "AH") of the corresponding field in the Adabas file. It can also be used in conjunction with Adabas MU fields.

Note:
This is the case if a tsd:multiple child element is present inside tsd:nodeAdabasField.

format attribute

The format attribute belongs to the tsd:nodeAdabasField element. It describes the format of the corresponding field in the Adabas file.

The following table presents a list of the possible format attribute values and their meaning:

format Description/Meaning
A Alphanumeric
B Binary
F Fixed Point
G Floating Point
P Packed Decimal
U Unpacked Decimal

Also refer to the Adabas documentation.

Note:
The format attribute can be specified either for a tsd:nodeAdabasField element describing a mapping to a single Adabas field (that does not have a child element tsd:multiple) or for those describing a mapping to an Adabas MU field (multiple field) tsd:nodeAdabasField (that can have a child element tsd:multiple).

Example Schema for Adabas Mapping

This example uses the Employees file in the demo database delivered with the Windows installation of Adabas Version 3.2 to illustrate mapping to an Adabas file.

The following is the Natural view of the "Employees" database:

Note:
The lines of special interest are printed in italics.


T L DB Name                              F Leng  S D Remark
- - -- --------------------------------  - ----  - - ------------------------
  1 AA PERSONNEL-ID                      A    8    D
       HD=PERSONNEL/ID
G 1 AB FULL-NAME
  2 AC FIRST-NAME                        A   20  N
  2 AD MIDDLE-I                          A    1  N
  2 AD MIDDLE-NAME                       A   20  N
  2 AE NAME                              A   20    D
  1 AF MAR-STAT                          A    1  F
       HD=MARITAL/STATUS
  1 AG SEX                               A    1  F
       HD=S/E/X
  1 AH BIRTH                             N  6.0    D
       HD=DATE/OF/BIRTH
       EM=99/99/99
G 1 A1 FULL-ADDRESS
M 2 AI ADDRESS-LINE                      A   20  N
       HD=ADDRESS
*      OCCURRENCES 1-6
  2 AJ CITY                              A   20  N D
  2 AK ZIP                               A   10  N
       HD=POSTAL/ADDRESS
  2 AK POST-CODE                         A   10  N
       HD=POSTAL/ADDRESS
  2 AL COUNTRY                           A    3  N
G 1 A2 TELEPHONE
  2 AN AREA-CODE                         A    6  N
       HD=AREA/CODE
  2 AM PHONE                             A   15  N
       HD=TELEPHONE
.
.
.

The file ada_empl.tsd is provided in the Tamino distribution kit to illustrate the mapping to the Employee database. Note that not all fields are mapped, so instances containing nodes that are not mapped are rejected. The graphical representation of this schema as it appears in the Tamino Schema Editor is shown below. The properties view is reduced to the physical properties pane, which shows as an example the mapping of the zip element to the Adabas field AK:

ada_empl TSD3 in the Schema Editor

Note that such a schema could also be a subtree within a schema for another doctype.

This schema allows Adabas data to be retrieved using Tamino X-Query expressions, for example:

/employees/employee[name/surname~="A*"]

This returns all employees whose surnames start with the letter "A".

The following XML object is an instance of the above schema and can be loaded into the Adabas database and retrieved using Tamino:

<?xml version="1.0" encoding="ISO-8859-1"?>
 <employee personnelid="007">
  <name>
   <firstname>James</firstname>
   <surname>Bond</surname>
  </name>
  <sex>M</sex>
  <birth>340401</birth>
  <address>
   <zip>SW13JB</zip>
   <city>London</city>
  </address>
</employee>

Mapping to Server Extensions

Server Extensions allow you to extend Tamino's functionality by executing user-defined application logic (written in languages such as Java or C++) at specified points in the processing of an instance. Server Extensions can be associated with elements and attributes at any level in a Tamino schema. When an element or attribute is associated with a Server Extension, the element is said to be "mapped" to the Server Extension.

Server Extensions can be used, for example, to let external programs handle indexing, storing or retrieval of data in ways not provided as standard by Tamino.

Mapping to a Server Extension means transferring control to an external function offered by the Server Extension. One Server Extension may offer several functions. Control can be passed to a Server Extension function during parsing of an incoming instance or during composition of an XML object as the result of a query. The invocation context is specified by the name of the Server Extension-specific child elements on the node. The associated Server Extension function is specified as the element value.

Mapping to Server Extension functions is described in this section under the following headings:

Node Attributes for Mapping to Server Extensions

This section describes the Tamino schema extensions that are of particular relevance to mapping elements to Server Extensions.

Server Extension-Specific Elements and Attributes

There is one single parent element in TSD offering all the functionality needed to map Tamino Server Extensions in conjunction with its child elements, namely the tsd:xTension element.

TSD provides the following mappings for Server Extensions:

tsd:xTension

This element is the parent element of the elements tsd:onDelete and tsd:onProcess.

The syntax of the values of these child elements is:

SXSModule.SXSFunction
tsd:onProcess

Specifies the name of the SXS function that is to be executed when data is stored using Tamino (that is, when using the Tamino _process request). The SXS function is called after parsing the associated element.

tsd:onDelete

Specifies the name of the SXS function that is to be executed when the associated element is to be deleted (that is, when the Tamino _delete request is issued).

Example of Mapping to Server Extensions

The following example demonstrates the mapping to a Server Extension Function. A doctype "Mail" describes the structure of an e-mail message.

The effect of the Server Extension function is that instead of storing instances of the doctype (i.e., e-mail messages) in Tamino, they are routed to an e-mail tool, which transmits the messages to the recipient.

The TSD schema describing the "Mail" doctype with mapping to Server Extensions contains a tsd:onProcess element specifying the corresponding Server Extension function to be invoked when processing a document:

<?xml version = "1.0" encoding = "UTF-8"?>
<xs:schema xmlns:xs = "http://www.w3.org/2001/XMLSchema"
 xmlns:tsd = "http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition">
  <xs:annotation>
    <xs:appinfo>
      <tsd:schemaInfo name = "SXS">
        <tsd:collection>MailSXS</tsd:collection>
        <tsd:doctype name = "Mail">
          <tsd:logical>
            <tsd:content>closed</tsd:content>
            <tsd:accessOptions>
              <tsd:insert></tsd:insert>
            </tsd:accessOptions>
          </tsd:logical>
        </tsd:doctype>
      </tsd:schemaInfo>
    </xs:appinfo>
  </xs:annotation>
  <xs:element name = "Mail">
    <xs:annotation>
      <xs:appinfo>
        <tsd:elementInfo>

        <!--- Beginning of mapping to Server Extension--->
          <tsd:physical>
            <tsd:map>
              <tsd:xTension>
                <tsd:onProcess>SXSMail.SendMail</tsd:onProcess>
              </tsd:xTension>
            </tsd:map>
          </tsd:physical>
        <!--- End of mapping to Server Extension--->
        </tsd:elementInfo>
      </xs:appinfo>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element name = "Recipient" type = "xs:string"></xs:element>
        <xs:element name = "Subject" type = "xs:string"></xs:element>
        <xs:element name = "Body" type = "xs:string"></xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Notes:

  1. In this example, the Server Extension function is specified on the root node of the doctype, thus ensuring that the function is executed after parsing the whole instance. It could also be placed at a subordinate node of a document node.
  2. Mapping to a Server Extension function means that the corresponding subtree of the XML document is not stored in Tamino, but is processed by the map-in function (see Tamino Server Extension Functions) being invoked.

The following XML object is an instance of the doctype "Mail". Using the schema definition above, this instance is not stored in Tamino, but sent to the e-mail address in the Recipient element:

<Mail>
  <Recipient>world@planet.org</Recipient>
  <Subject>Global greeting</Subject>
  <Body>Hello World!</Body>
</Mail>