Schema Operations

Defining a Schema

Schema definition falls into two categories:

Simple Schema Definition

Simple schema definition follows the steps described in the section The Logical Schema.

Defining a Cluster of Schemas

It is possible to define multiple schemas, i.e. a cluster of schemas, in a single request. This makes it possible:

  • To define a set of schemas that have circular dependencies without having to manually break the cycle;

  • To define a set of related schemas in any order.

It is also possible to delete multiple collections, schemas or doctypes with a single _undefine command.

These features are fully supported by the Tamino Schema Editor.

Updating Existing Schemas ("Update Schema")

The X-Machine _define command can be used to update existing schemas. This section describes some aspects to be considered when updating schemas, and lists the restrictions involved.

General Considerations

Important:
The guiding principle for updating existing schemas is the following: Tamino guarantees that all documents already stored in a doctype remain valid with respect to the updated schema and the doctypes defined in it. If the existing schema is too strict, it must be loosened.

It is very difficult to detect every case in which a schema modification leaves all existing instances valid with respect to the new schema. An example is changing an element's content model from arbitrary content (e.g. untyped) to an explicit content model using a complex type definition: instances already loaded into Tamino may or may not validate against the new schema. For such cases, the parameter _mode=validate can be passed with the _define command when defining the new schema. Tamino then first performs a structure-based comparison of the old schema and the new schema. If this comparison cannot guarantee that all instances validate against the new schema, Tamino explicitly validates all instances against it. If no errors are detected, the new schema is accepted. A schema update with _mode=validate can also be performed using the Tamino Schema Editor.
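
As a minimal sketch, the parameters for such an update request could be assembled as shown below. Only the names _define and _mode=validate come from this document. The function name, the placeholder schema string and the use of form encoding are assumptions made for illustration, and nothing is actually sent to a server.

```python
from urllib.parse import urlencode, parse_qs


def build_define_request(schema_xml, validate=False):
    """Return form-encoded X-Machine parameters for a _define command.

    Assumption: the parameters are transmitted as an URL-encoded form body.
    Check the X-Machine documentation for the transport your installation uses.
    """
    params = {"_define": schema_xml}
    if validate:
        # Ask Tamino to validate stored instances if the structural
        # comparison of old and new schema is not conclusive.
        params["_mode"] = "validate"
    return urlencode(params)


# Placeholder only, not a complete TSD schema.
schema = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'/>"
body = build_define_request(schema, validate=True)
decoded = parse_qs(body)
```

Keeping the validation switch as an explicit argument makes the potentially time-consuming doctype scan a deliberate choice of the caller.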

Note:
Adding indices and adding or changing collations are potentially time-consuming operations, due to the doctype scan. This is also true for _define with _mode=validate.

The guiding principle implies the following restrictions:

Logical Schema

  • Structural updates are restricted to adding optional nodes (elements or attributes).

    When adding new nodes to a schema, the same restrictions as for using the _define statement generally apply (see Mapping Type Dependencies).

  • Type changes are not possible, except that existing enumerations can be extended and constraining facets such as xs:length, xs:maxLength, xs:maxExclusive, xs:maxInclusive, xs:minExclusive, xs:minInclusive, xs:totalDigits and xs:fractionDigits can be loosened or omitted.

  • Restrictions can also be loosened by adding further xs:pattern facets, provided that an xs:pattern facet was already present; the existing xs:pattern must remain unchanged.

  • Fixed values may not be changed. Default values may be changed.

  • Changes of the multiplicity are allowed if they loosen the schema. This means:

    The minOccurs attribute for any xs:element, xs:choice or xs:sequence already described in the old schema may be decreased, whereas maxOccurs may be increased. Similarly, the use attribute of xs:attribute may be changed from "required" to "optional".

    If instances are to be stored that contain a node not yet described in the schema, the node can be added to the schema with minOccurs="0" (element) or with use="optional" (attribute).

  • Updating a schema can add, change or remove collations defined for elements or attributes.

  • Doctypes can be added, but not removed. (See the section Undefine from a Schema below.)
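
The multiplicity rule from the list above can be sketched as a small check: a change loosens the schema when the new occurrence range contains the old one. The function names are hypothetical, and this is an illustration of the rule as stated here, not Tamino's actual comparison logic.

```python
UNBOUNDED = "unbounded"


def _max_value(max_occurs):
    """Map a maxOccurs value to a number, treating "unbounded" as infinity."""
    return float("inf") if max_occurs == UNBOUNDED else int(max_occurs)


def multiplicity_loosened(old_min, old_max, new_min, new_max):
    """True if minOccurs did not increase and maxOccurs did not decrease."""
    return (int(new_min) <= int(old_min)
            and _max_value(new_max) >= _max_value(old_max))


def attribute_use_allowed(old_use, new_use):
    """An unchanged use is fine; otherwise only "required" -> "optional"."""
    return old_use == new_use or (old_use == "required" and new_use == "optional")


# 1..1 may become 0..unbounded, but 0..5 may not become 1..5.
widened = multiplicity_loosened(1, 1, 0, UNBOUNDED)
narrowed = multiplicity_loosened(0, 5, 1, 5)
```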

Physical Schema

  • The contents of the tsd:index element may be changed, except for objects mapped to a non-XML node.

  • Attribute values may be changed, as long as this does not create conflicts (for example, a node's mapping type must not be changed).

    The default attributes of xs:element and xs:attribute elements may be changed.

Changing Element and Attribute Values in the Physical Schema

Generally, for all kinds of mapping, schema information may be changed, as long as this is compatible with existing objects.

For the sake of convenience, this section lists the attributes specifically related to the mapping possibilities for SQL, Adabas and Server Extensions.

SQLTable Mapping Information: the tsd:subTreeSQL element

The contents of the following child element of the tsd:subTreeSQL element may be changed:

tsd:accessPredicate

The values of the following attributes of the tsd:subTreeSQL element may be changed:

datasource
password
schema
table
userid

SQL Column Mapping Information: the tsd:nodeSQL element

The following attribute of the tsd:nodeSQL element may be changed:

column

Adabas File Mapping Information: the tsd:subTreeAdabas element

The following attributes of the tsd:subTreeAdabas element may be changed:

dbid
encoding
fnr
password

Adabas PE Mapping Information: the tsd:subTreeAdabasPE element

The following attribute of the tsd:subTreeAdabasPE element may be changed:

shortname

Adabas Field and Adabas MU Mapping Information: the tsd:nodeAdabasField element

The following attributes of the tsd:nodeAdabasField element can be changed:

encoding
format
length
shortname

Server Extension Function Mapping Information: the tsd:xTension element

All elements and attributes identifying the Server Extension function (tsd:xTension element) can be changed.

tsd:index Element

The contents of the tsd:index element may be changed. This means adding or removing support for standard or text indexing by adding or removing the tsd:standard or tsd:text element within the tsd:index element.

An exception is the indexing of non-XML documents: Currently, this cannot be switched on or off via update-define.
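
The following sketch illustrates what adding or removing index support means at the level of the schema document: a tsd:standard or tsd:text child is added to or removed from a tsd:index element before the schema is redefined. The namespace URI bound to the tsd prefix and the helper name set_index are assumptions; check them against your own schema documents.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the tsd prefix.
TSD_NS = "http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition"
ET.register_namespace("tsd", TSD_NS)


def set_index(index_elem, kind, enabled):
    """Add or remove the tsd:standard or tsd:text child of a tsd:index element."""
    if kind not in ("standard", "text"):
        raise ValueError("kind must be 'standard' or 'text'")
    tag = f"{{{TSD_NS}}}{kind}"
    existing = index_elem.find(tag)
    if enabled and existing is None:
        ET.SubElement(index_elem, tag)
    elif not enabled and existing is not None:
        index_elem.remove(existing)


# Start with standard indexing only, then switch to text indexing only.
index = ET.fromstring(
    f'<tsd:index xmlns:tsd="{TSD_NS}"><tsd:standard/></tsd:index>'
)
set_index(index, "text", True)
set_index(index, "standard", False)
children = [child.tag.split("}")[1] for child in index]
```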

tsd:ignoreUpdate Element

The element tsd:ignoreUpdate may be added but not removed.

tsd:structureIndex Element

The mode of the structure index for XML doctypes can be changed in an arbitrary fashion.

tsd:compress Element

The compression mode used for the documents stored in a doctype may be changed via update-define. This does not change the compression of documents already stored in Tamino.

Schema Evolution for Open Content ("Update Schema" Processing)

Schema evolution (also called "update schema") for open content differs from schema evolution for normal closed content. If you do not explicitly use _mode=validate, the following applies:

With open content, unlike closed content, stored documents may contain unknown elements with arbitrary content. Therefore it is generally not possible to add elements to the schema for open content doctypes. Removal of elements, on the other hand, is possible without violating the integrity rule that all XML instances already stored in a doctype must remain valid with respect to the new schema.

The following rules apply for adding and removing elements from a content model:

  1. Element used in an open content doctype: its CNS (child element name set) cannot be extended.

  2. Element used in a closed content doctype: optional child elements can be added.

  3. Element used in both open and closed content doctypes: the conjunction of the two rules above holds, so its CNS must remain the same. This means that only the multiplicity can be defined more loosely.

  4. Element used only in closed content whose doctype is changed to open content: the disjunction of the two rules above applies, so its CNS can be reduced or extended.

Notes:

  1. The open content update schema rules on the CNS are equivalent to the rules imposed for xs:any with processContents="loose" on global elements.
  2. Analogous rules apply for attribute definitions.
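
The four rules on adding and removing elements can be sketched as a single check on the old and new CNS. The function and parameter names are hypothetical. The sketch reads rule 2 as "the CNS may grow but not shrink", which is what rule 3 implies, and it does not verify that added children are optional.

```python
def cns_change_allowed(old_cns, new_cns, in_open, in_closed,
                       closed_becomes_open=False):
    """Check a change of an element's child element name set (CNS).

    in_open / in_closed: whether the element is used in open / closed
    content doctypes. closed_becomes_open: the element was used only in
    closed content and its doctype is changed to open content (rule 4).
    """
    old_cns, new_cns = set(old_cns), set(new_cns)
    if closed_becomes_open:
        return True                    # rule 4: may be reduced or extended
    if in_open and in_closed:
        return new_cns == old_cns      # rule 3: must stay the same
    if in_open:
        return new_cns <= old_cns      # rule 1: no extension
    if in_closed:
        return new_cns >= old_cns      # rule 2: additions only
    raise ValueError("element must be used in at least one doctype")


# Removing a child is fine for open content, adding one is not.
open_removal = cns_change_allowed({"a", "b"}, {"a"}, in_open=True, in_closed=False)
open_addition = cns_change_allowed({"a"}, {"a", "b"}, in_open=True, in_closed=False)
```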

Update Schema Checks for Imported Schemas

Prior to Tamino version 4.2, updating of schemas which had been imported using the xs:import element was not permitted.

Update Schema Checks for xsi:type

There are several scenarios in schema update where all documents containing xsi:type need to be revalidated. Revalidation can be requested via _mode=validate. Such cases occur, for example:

  • if global types have been added and there are wildcards with processContents="lax";

  • if global type definitions have been removed, causing references to them to become invalid.

Undefine

One or more doctypes can be removed from a TSD schema by sending the following command to Tamino:

_undefine = {doctype|schema|collectionname},...

The operand is a comma-separated list of doctypes, schemas and/or collectionnames, where:

  • doctype is defined as collectionname/schemaname/ . . . /doctypename

  • schema is defined as collectionname/schemaname
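
A minimal sketch of assembling the operand described above is shown below. The collection, schema and doctype names are invented, the helper name undefine_operand is hypothetical, and the form encoding of the request is an assumption. Paths are treated as opaque strings, so whatever appears between the schema name and the doctype name is passed through unchanged.

```python
from urllib.parse import urlencode, parse_qs


def undefine_operand(*targets):
    """Join doctype, schema and collection paths into one comma-separated operand."""
    cleaned = [target.strip() for target in targets]
    if not cleaned or any(not target for target in cleaned):
        raise ValueError("at least one non-empty target is required")
    return ",".join(cleaned)


# Hypothetical names: one doctype, one schema and one whole collection.
operand = undefine_operand(
    "mycollection/myschema/mydoctype",
    "mycollection/otherschema",
    "oldcollection",
)
request_body = urlencode({"_undefine": operand})
```

Because undefining a doctype also deletes the documents stored in it, it is worth logging or reviewing the operand before sending such a request.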

Undefining a doctype deletes it from Tamino, including all documents stored in it. The respective tsd:doctype element is removed from the schema document stored in Tamino. This may leave a schema document that does not define any doctype at all.

Deleting a schema deletes all doctypes that are defined in it.

As a postcondition, no dangling references to imported or included schemas may exist after the undefine operation (referential integrity).