Tamino XML Server Version 9.7
 —  Tamino XML Schema User Guide  —

The Logical Schema

The following topics are discussed in this chapter:


Schema

The root element of each XML Schema-conformant schema document is the xs:schema element. It may have the optional targetNamespace attribute, which specifies the namespace to which all definitions in the current schema document belong.

<xs:schema targetNamespace="http://my-company.com/"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">

If a schema has a targetNamespace, all globally defined objects (elements, attributes, type definitions etc.) that occur as direct child elements of xs:schema have qualified names that belong to that namespace. Locally declared elements or attributes only belong to the targetNamespace if their form attribute, which defaults to the value of the elementFormDefault or attributeFormDefault attribute of xs:schema, has a value of "qualified".
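The effect of the form defaults can be illustrated with a minimal sketch. It reuses the namespace from the fragment above; the element names order and item are hypothetical and serve only as an illustration.

```xml
<xs:schema targetNamespace="http://my-company.com/"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
  <!-- Global declaration: always belongs to the target namespace -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- Local declaration: belongs to the target namespace only
             because elementFormDefault is "qualified" -->
        <xs:element name="item" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this schema, both order and item must be namespace-qualified in an instance document. If elementFormDefault were omitted (the XML Schema default is "unqualified"), order would still be qualified but item would have to appear without a namespace.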

Elements and Attributes

Elements and attributes are described by:

Overview: The Logical Part of the Meta Schema

The logical schema specifies the structural information of the schema.

The Tamino Schema Language is defined by a meta schema (also known as a schema of schemas), based on a subset of the W3C XML Schema standard. It is described in the section TSD Logical: Definitions of the Tamino XML Schema Reference Guide.

The structure of the logical schema is shown in the graphics below. The root element is the xs:schema element.

Note:
The namespace prefix xs: is omitted in all of these graphics.

Logical part of TSD: Expansion of root element xs:schema

graphics/TSD3logi-schema.gif

The top-level element xs:schema as defined in this graphic is the container for all the information pertaining to a Tamino schema. For further information about the elements and attributes as they are defined in the W3C standard, see XML Schema Elements. This section describes Tamino's implementation of the standard.

Particles

The following particle definitions appear below:

  1. xs:all

  2. xs:annotation

  3. xs:any

  4. xs:choice

  5. xs:element

  6. xs:field

  7. xs:group

  8. xs:selector

  9. xs:sequence

Logical part of TSD: Expansion of xs:all

graphics/TSD3logi-all.gif

Logical part of TSD: Expansion of xs:annotation

graphics/TSD3logi-annotation.gif

Logical part of TSD: Expansion of xs:any

graphics/TSD3logi-any.gif

Logical part of TSD: Expansion of xs:choice

graphics/TSD3logi-choice.gif

Logical part of TSD: Expansion of xs:element

graphics/TSD3logi-element.gif

Logical part of TSD: Expansion of xs:field

graphics/TSD3logi-field.gif

Logical part of TSD: Expansion of xs:group

graphics/TSD3logi-group.gif

Logical part of TSD: Expansion of xs:selector

graphics/TSD3logi-selector.gif

Logical part of TSD: Expansion of xs:sequence

graphics/TSD3logi-sequence.gif

Attribute-Related Elements

The following attribute element definitions appear below:

  1. xs:anyAttribute

  2. xs:attribute

  3. xs:attributeGroup

Logical part of TSD: Expansion of xs:anyAttribute

graphics/TSD3logi-anyAttribute.gif

Logical part of TSD: Expansion of xs:attribute

graphics/TSD3logi-attribute.gif

Logical part of TSD: Expansion of xs:attributeGroup

graphics/TSD3logi-attributeGroup.gif

Type-Related Elements

The following type element definitions appear below:

  1. xs:complexType

  2. xs:simpleType

Logical part of TSD: Expansion of xs:complexType

graphics/TSD3logi-complexType.gif

Logical part of TSD: Expansion of xs:simpleType

graphics/TSD3logi-simpleType.gif

Type Derivation Elements

The following type derivation element definitions appear below:

  1. xs:extension as a child element of xs:complexContent

  2. xs:extension as a child element of xs:simpleContent

  3. xs:restriction as a child element of xs:complexContent

  4. xs:restriction as a child element of xs:simpleContent

Logical part of TSD: Expansion of xs:complexContent/xs:extension

graphics/TSD3logi-complexContent_extension.gif

Logical part of TSD: Expansion of xs:simpleContent/xs:extension

graphics/TSD3logi-simpleContent_extension.gif

Logical part of TSD: Expansion of xs:complexContent/xs:restriction

graphics/TSD3logi-complexContent_restriction.gif

Logical part of TSD: Expansion of xs:simpleContent/xs:restriction

graphics/TSD3logi-simpleContent_restriction.gif

Note:
xs:restriction as a child element of xs:simpleType has the same content model as xs:restriction as a child element of xs:simpleContent, except that it does not allow xs:simpleType, xs:attribute, xs:attributeGroup and xs:anyAttribute as child elements.

Constraints on Element Definition

In general, there are two kinds of elements in XML Schema: elements of simple type and elements of complex type. Elements of complex type contain other elements or attributes, whereas elements of simple type contain only character data but neither child elements nor attributes. xs:element enables you to define elements of both simple type and complex type by using its child elements xs:simpleType and xs:complexType, or by referencing a user-defined or predefined named type. Lists of the predefined types offered by the XML Schema standard can be found at http://www.w3.org/TR/xmlschema-0/#CreatDt and http://www.w3.org/TR/xmlschema-2/#built-in-datatypes. All the predefined simple types are also available in Tamino. Additionally, in XML Schema there are mechanisms for deriving new types from existing types. These are:

Type derivation by restriction can be used:

The restriction mechanism is provided by the xs:restriction element.

Complex type definitions offer the possibility of defining elements with attributes, and elements with child elements. The XML Schema standard offers a wealth of possibilities for defining elements and attributes. Most but not all of them are available in the Tamino schema definition language.

In general, TSD (like XML Schema) offers both named and anonymous complex type definitions, i.e. a complex type definition may or may not have a name attribute by which it can subsequently be referenced.
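The difference between named and anonymous complex type definitions can be sketched as follows. All names (PersonName, author, editor, reviewer) are hypothetical, and the fragment assumes a schema without a target namespace, so that the type name can be referenced without a prefix.

```xml
<!-- Named complex type: can be referenced by several declarations -->
<xs:complexType name="PersonName">
  <xs:sequence>
    <xs:element name="first" type="xs:string"/>
    <xs:element name="last" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

<xs:element name="author" type="PersonName"/>
<xs:element name="editor" type="PersonName"/>

<!-- Anonymous complex type: defined inline and usable only here -->
<xs:element name="reviewer">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="initials" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```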

The complex type definition of XML Schema is described in detail at http://www.w3.org/TR/xmlschema-0/#DefnDeclars

An attribute can be defined using the extension mechanism. To define an extension, use the xs:extension child element of the xs:simpleContent element, which is in turn a child element of the xs:complexType element. Here is an example:

<xs:complexType>
  <xs:simpleContent>
    <xs:extension base = "xs:normalizedString">
      <xs:attribute name = "duration"
                    type = "xs:unsignedShort"
                    use = "required"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

Here, a new complex type is created by extending the existing type xs:normalizedString with an attribute whose name is duration that must be specified (use = required) and is of type xs:unsignedShort. The xs:simpleContent element indicates that the element to be defined has no child elements, i.e. it contains only character data and attributes.

You can create a complex type containing elements by using, for example, the xs:sequence element, which allows you to define a sequence of elements comprising the content model. Also you can specify alternatively appearing elements with the xs:choice element. If you have defined elements using the xs:sequence, xs:choice or xs:all elements and want to define additional attributes, this cannot be done as described above using the xs:simpleContent element; however, it can be achieved by the xs:attribute element.
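As a brief sketch of this case, the following declaration combines a model group with an attribute. The names album, title, artist and year are hypothetical; in XML Schema, attribute declarations follow the model group inside xs:complexType.

```xml
<xs:element name="album">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="artist" type="xs:string"/>
    </xs:sequence>
    <!-- Attribute declarations come after the model group -->
    <xs:attribute name="year" type="xs:gYear" use="optional"/>
  </xs:complexType>
</xs:element>
```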

Tamino also supports groups and attribute groups as defined by XML Schema. For more information, see the descriptions of the elements xs:group and xs:attributeGroup.

Constraints on Attribute Definition

Attributes can be declared locally or globally. They can only be of simple types. Therefore, the constraints on attributes (which correspond to the constraints on elements described above) only apply to the attribute type and the element xs:simpleType.
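A minimal sketch of both kinds of attribute declaration follows. The names lang, note and priority are hypothetical, and the fragment assumes a schema without a target namespace.

```xml
<!-- Global attribute declaration (direct child of xs:schema) -->
<xs:attribute name="lang" type="xs:language"/>

<xs:element name="note">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <!-- Reference to the global attribute -->
        <xs:attribute ref="lang"/>
        <!-- Local attribute with an anonymous simple type -->
        <xs:attribute name="priority">
          <xs:simpleType>
            <xs:restriction base="xs:NMTOKEN">
              <xs:enumeration value="low"/>
              <xs:enumeration value="high"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>
```

In both cases the attribute type is a simple type: a built-in type for lang, and a restriction of a built-in type for priority.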

For more information, see the section xs:attribute element of the Tamino XML Schema Reference Guide.

XML Datatypes

This section describes Tamino's facilities for specifying datatypes for XML data. It is subdivided into the following parts:

Datatype and Facet Support in TSD Compared to XML Schema 1.0

Tamino offers the same possibilities for defining new datatypes as XML Schema does; of these, the simple types are described in the W3C standards document XML Schema Part 2: Datatypes. All datatypes described in that document are implemented in TSD.

For readers not familiar with XML Schema, the basics of the mechanisms used for datatype definition in Tamino and XML Schema are briefly summarized here.

According to the definition in the XML Schema specification, a datatype is a set (more precisely, a 3-tuple) comprising the following items:

The value space

The value space is the set of values that is allowed for a given datatype. A value space has some properties; for example, it can be ordered.

The lexical space

The lexical space is the set of valid literals for a datatype. It may be possible to have more than one representation for one and the same member of a specific value space. These different representations are different members in the lexical space that represent the same element in the value space. For example, "1000" and "1.000E3" are two different literals and therefore two different members of the lexical space that both represent the same member in the value space of the datatype float.

A set containing one or more facets

A facet is a property of a value space that can be used to characterize that value space. It typically represents a single aspect of the value space (i.e., a single dimension in a multi-dimensional representation of the value space). A facet can be fundamental (semantically characterizing the value space) or non-fundamental (defining constraints to the value space, therefore also called a constraining facet).

Built-In Datatypes

These datatypes are predefined in Tamino (and in W3C XML Schema), so you can use them without having to declare or define them. They can be divided into two groups:

Primitive Types in Tamino

Primitive datatypes are predefined by the XML Schema standard. The following primitive types are available in Tamino:

Datatype | Description | Lexical Representation
xs:string | Character string of unlimited length. | "A short string"
xs:boolean | Boolean value. | "true", "false", "1", "0"
xs:decimal | Decimal number. A precision of at least 18 digits is supported. | "-1.23", "125.64", "0.0", "+500000.00", "170"
xs:float | Single-precision 32-bit floating point type according to the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic. This type includes the special values positive and negative zero, positive and negative infinity, and not-a-number. | "-1E3", "172.363E14", "18.73e-5", "45", "INF", "-INF", "0", "-0", "NaN"
xs:double | Double-precision 64-bit floating point type according to the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic. | "-1E4", "547.433E12", "36.78e-2", "12", "INF", "-INF", "0", "-0", "NaN"
xs:duration | A period of time. The value space is a six-dimensional space, where the coordinates designate the Gregorian year, month, day, hour, minute and second. Note: This datatype cannot be indexed. | The lexical representation follows the format "PnYnMnDTnHnMnS". An optional fractional part for seconds is allowed. Negative durations are also allowed.
xs:time | A specific time of day as defined in §5.3 of the ISO 8601 standard on date and time formats. Also see note below. | The lexical format is hh:mm:ss. Note: An optional fractional part for seconds is permitted. A time zone can be specified, if necessary: "Z" for UTC time, or a signed time difference in the format hh:mm. Examples: "05:20:23.2", "13:20:00-05:00"
xs:date | A Gregorian calendar date according to §5.2.1 of the ISO 8601 standard on date and time formats. Also see note below. | The lexical format is CCYY-MM-DD. To accommodate values outside the range 1-9999, additional digits and a negative sign can be added to the left. (The year 0000 is prohibited.) Example: "1999-05-31"
xs:dateTime | A specific instant of time (a combination of date and time) as defined in §5.4 of the ISO 8601 standard on date and time formats. Also see note below. | The lexical format is CCYY-MM-DDThh:mm:ssZ, where "T" is the delimiter character between date and time and "Z" denotes an optional time zone. Examples: "1999-05-31T13:20:00-05:00", "2001-12-01T05:20:23.2"
xs:gYearMonth | A specific Gregorian month in a specific Gregorian year. | The lexical format is CCYY-MMZ, where "Z" denotes an optional time zone. Example: "2001-05"
xs:gYear | A Gregorian year. | The lexical format is CCYYZ, where "Z" denotes an optional time zone. Example: "1994"
xs:gMonthDay | A Gregorian date that recurs every year (a specific day of a specific month). | The lexical format is --MM-DDZ, where "Z" denotes an optional time zone. Example: "--04-01"
xs:gMonth | A Gregorian month that recurs every year. | The lexical format is --MMZ, where "Z" denotes an optional time zone. Example: "--07" ("--07--" is accepted for backward compatibility)
xs:gDay | A Gregorian day that recurs every month. | The lexical format is ---DDZ, where "Z" denotes an optional time zone. Example: "---13"
xs:hexBinary | Hexadecimal-encoded arbitrary binary data. | Examples: "9a7f", "FFFF", "0100"
xs:base64Binary | Base64-encoded arbitrary binary data. The entire binary stream is encoded using the Base64 Content-Transfer-Encoding defined in Section 6.8 of RFC 2045. |
xs:anyURI | A reference to a Uniform Resource Identifier (URI). |
xs:QName | An XML qualified name, consisting of a namespace name and a local part. |
xs:NOTATION | Represents the NOTATION attribute type from XML attributes. This is an abstract datatype, i.e. users must derive their own datatype from it. |

Derived Built-In Datatypes

In addition to these primitive datatypes, it is also possible to derive datatypes in TSD from other datatypes, which in turn may be either primitive types or derived types, by two different mechanisms:

The W3C XML Schema standard contains further mechanisms for deriving datatypes from a given datatype.

The following derived datatypes are supported in Tamino:

Datatype | Derived From | Description | SQL Equivalent
xs:normalizedString | xs:string | A string after whitespace normalization. | VARCHAR, CHAR
xs:token | xs:normalizedString | Does not contain the line feed ("#xA") or tab ("#x9") characters, does not have leading or trailing spaces ("#x20") and does not have multiple consecutive internal spaces. | VARCHAR, CHAR
xs:NMTOKEN | xs:token | Represents the NMTOKEN attribute type (DTD) that is described in http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Nmtoken. | VARCHAR, CHAR
xs:NMTOKENS | xs:NMTOKEN | Represents the NMTOKENS attribute type (DTD) that is described in http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Nmtokens. | VARCHAR, CHAR
xs:Name | xs:token | Represents an XML Name as described in http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Name. | VARCHAR, CHAR
xs:NCName | xs:Name | Represents an XML "non-colonized" Name as described in http://www.w3.org/TR/xmlschema-2/#NCName. | VARCHAR, CHAR
xs:ID | xs:NCName | Represents the ID attribute type as described in http://www.w3.org/TR/xmlschema-2/#ID. | VARCHAR, CHAR
xs:IDREF | xs:NCName | Represents the IDREF attribute type as described in http://www.w3.org/TR/xmlschema-2/#IDREF. | VARCHAR, CHAR
xs:IDREFS | xs:IDREF | Represents the IDREFS attribute type as described in http://www.w3.org/TR/xmlschema-2/#IDREFS. | VARCHAR, CHAR
xs:ENTITY | xs:NCName | Represents the ENTITY attribute type as described in http://www.w3.org/TR/xmlschema-2/#ENTITY. | VARCHAR, CHAR
xs:ENTITIES | xs:ENTITY | Represents the ENTITIES attribute type as described in http://www.w3.org/TR/xmlschema-2/#ENTITIES. | VARCHAR, CHAR
xs:language | xs:token | Represents formal language identifiers, as defined by RFCs 3066, 4646 and 4647 or their successor(s). The value space and lexical space are the set of all strings that conform to the pattern [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*. | no equivalent
xs:integer | xs:decimal | The standard mathematical integer datatype. Derived from datatype decimal by setting the facet "fractionDigits" to 0. | no equivalent in SQL due to excessive value range
xs:nonPositiveInteger | xs:integer | An integer less than or equal to zero. | no equivalent
xs:negativeInteger | xs:nonPositiveInteger | An integer less than zero. | no equivalent
xs:long | xs:integer | An integer in the range -9223372036854775808 (-2^63) to 9223372036854775807 (2^63-1). | not supported
xs:int | xs:long | An integer in the range -2147483648 (-2^31) to 2147483647 (2^31-1). | INTEGER
xs:short | xs:int | An integer in the range -32768 (-2^15) to 32767 (2^15-1). | SMALLINT
xs:byte | xs:short | An integer in the range -128 (-2^7) to 127 (2^7-1). | TINYINT
xs:nonNegativeInteger | xs:integer | An integer greater than or equal to zero (0 to 2^64-1). | no equivalent
xs:unsignedLong | xs:nonNegativeInteger | An integer in the range 0 to 2^64-1. | no equivalent
xs:unsignedInt | xs:unsignedLong | An integer in the range 0 to 4294967295 (2^32-1). | no equivalent
xs:unsignedShort | xs:unsignedInt | An integer in the range 0 to 65535 (2^16-1). | no equivalent
xs:unsignedByte | xs:unsignedShort | An integer in the range 0 to 255 (2^8-1). | TINYINT
xs:positiveInteger | xs:nonNegativeInteger | An integer greater than zero. | no equivalent

User-Defined Datatypes

You can create two kinds of user-defined datatypes in Tamino (or XML Schema):

Hierarchy of Datatypes

The hierarchy of datatypes allowed in TSD is depicted in the following graphic:

Built-in Datatype Hierarchy

graphics/thschema.gif

Originally published in the W3C Recommendation "XML Schema 1.0 Part 2: Datatypes"

Ranges of Numeric Types and Related Issues

The value space of the datatype integer is from -9223372036854775808 (-2^63, approx. -9E18) to +9223372036854775807 (2^63-1, approx. 9E18). Numbers outside this range lead to overflow errors. In queries, the value -9223372036854775808 must be coded as (-9223372036854775807 -1). This type is sometimes called "signed integer".

The value space of the datatype unsigned integer is from 0 to 18446744073709551615 (2^64-1, approx. 1.8E19). Numbers outside this range lead to overflow errors.

The value space of the datatype decimal is from -999999999999999999 (-1E18+1) to +999999999999999999, or more precisely: from -999999999999999999 to -0.000000000000000001, 0, and from 0.000000000000000001 to +999999999999999999. Accuracy is limited to 18 significant digits; for example, 123456789.987654321 (18 significant digits) can be represented exactly, but 123456789.9876543215 (19 significant digits) is rounded to 123456789.987654322. Numbers between -0.000000000000000005 and 0.000000000000000005 are rounded to zero. Numbers greater than or equal to 999999999999999999.5 or less than or equal to -999999999999999999.5 lead to overflow.

The value spaces of the datatypes float and double and their binary representations are as specified in IEEE 754. All comparisons and arithmetic operations with numeric data are carried out in the internal representation. Consequently, the limitations of binary representation as described in IEEE 754: IEEE Standard for Binary Floating-Point Arithmetic apply.

If you want to use the datatypes float or double, you should understand these numeric formats and their limitations, for example by reading an introductory text on numerical mathematics.

If you have to calculate financial results, use the type decimal. Neither float nor double is suitable.

The approximate ranges of float and double in decimal notation are as follows:

The range of float is approximately from -3.402823466E+38 to -1.175494351E-38, 0, and from 1.175494351E-38 to 3.402823466E+38; also the special values -INF, INF, NaN.

The range of double is approximately from -1.7976931348623158E+308 to -2.2250738585072014E-308, 0, and from 2.2250738585072014E-308 to 1.7976931348623158E+308; also the special values -INF, INF, NaN.

Numbers outside these ranges lead to overflow or to the results -INF, INF, NaN.

The word "approximately" is used above because different conversion routines on different platforms may behave slightly differently. The conversion from string to the internal binary representation can handle any precision, but the precision of the result of the conversion cannot exceed the precision of the internal representation. The conversion of the internal representation to string also leads to limitations of precision. This implementation defines that the conversion of float to string returns at most 6 significant digits, and the conversion of double to string returns at most 14 significant digits.

Conversion to string always yields the canonical representation. This applies also to integer, unsigned integer and decimal.

Type Propagations

When two operands in an arithmetic expression or a comparison are of different numeric types, Tamino ensures that the results are correctly evaluated. For example, if you add an integer to a decimal, the integer is converted to decimal and the result is calculated as a decimal. The conversion of the integer to decimal may result in an overflow, as the value space of decimal is smaller than the value space of integer. Overflow may also occur when attempting to convert, for example, a negative integer to type unsigned integer.

Generally, types are propagated in the following order: integer --> unsigned integer --> decimal --> float --> double.

Exceptions from this rule

If, for example, an integer and a double are added, the integer is not converted via the steps in the chain. That would cause unnecessary loss of precision. Instead, the integer is converted directly to double.

If the parameters to the min() function are a sequence of integers and unsigned integers, the result is normally unsigned integer. However, if at least one sequence member is negative, the result is integer (to avoid overflow).

Ordering and Comparison Operations with Datatypes for Date and Time

The following applies for datatypes such as date or time:

Relationship Between XML Schema Types and integer Types

The signed integer type is used for the following XML Schema types:

The unsigned integer type is used for the following XML Schema types:

Defining Simple Types

There is one mechanism for creating a new simple type from an existing base type, namely restriction.

Constraining facets can be used to restrict the value space of an existing simple type (which is specified as the base attribute) using the restriction element.

This is done in Tamino in the same way as in the XML Schema standard.

The following constraining facets, as defined in the XML Schema standard, are available in Tamino:

They can be applied individually or in combination. However, not all combinations of restricting facets are valid for all datatypes. See the table of Valid Combinations of Restricting Facets and Base Datatypes for details.

For improved reusability, it is possible to use the name attribute to specify a name for a simple type. The definition can subsequently be referenced using this name. This is called a named simple type definition. At the highest level, only named type definitions are allowed; anonymous type definitions are only permitted below the highest level.

The first example below shows a named simple type definition; the second example shows an anonymous simple type definition:

Example of a simple type definition using a single facet (enumeration)

This example uses the enumeration facet to restrict the value space of the base type NMTOKEN to three possible values. The resulting simple type characterizes the traction of a locomotive: only the values "steam", "diesel" and "electric" are allowed.

<xs:simpleType name="traction">
  <xs:restriction base = "xs:NMTOKEN">
    <xs:enumeration value = "steam"/>
    <xs:enumeration value = "diesel"/>
    <xs:enumeration value = "electric"/>
  </xs:restriction>
</xs:simpleType>
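A named type such as traction can then be referenced from element or attribute declarations. The following is a brief sketch; the locomotive element, its model child and the drive attribute are hypothetical, and the fragment assumes a schema without a target namespace.

```xml
<xs:element name="locomotive">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="model" type="xs:string"/>
    </xs:sequence>
    <!-- References the named simple type by its name -->
    <xs:attribute name="drive" type="traction" use="required"/>
  </xs:complexType>
</xs:element>
```

Under this declaration, an instance such as <locomotive drive="diesel"> is accepted, whereas drive="horse" is rejected because the value is not one of the three enumerated values.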

Example of a simple type definition using multiple facets (totalDigits and fractionDigits)

The SQL datatype numeric(18,5) is expressed using the totalDigits and fractionDigits facets as:

<xs:element name = "myDecimal">
  <xs:simpleType>
    <xs:restriction base="xs:decimal">
      <xs:totalDigits value= "18"/>
      <xs:fractionDigits value = "5"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

Defining Complex Types

A simple type definition allows neither the definition of a node that contains other elements nor the definition of a node that contains attributes; therefore, the nodes that can be defined using simple type definitions are terminal nodes in the XML tree. To define more complex nodes containing elements or attributes, a more sophisticated kind of type definition is required, namely the complex type definition. It allows the following:

This section deals with the following topics:

Definition of Element and Attribute Wildcards

xs:any (Definition of Element Wildcards)

The xs:any element enables you to extend the instances with elements that are not specified by the schema. It is allowed in xs:choice and xs:sequence elements.

<xs:element name="client">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="cl_firstname" type="xs:string"/>
      <xs:element name="cl_lastname" type="xs:string"/>
      <xs:any minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

xs:anyAttribute (Definition of Attribute Wildcards)

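This element specifies attribute wildcards in complex type definitions. It allows arbitrary attributes to occur on the current element in XML instances that are validated against the schema. In XML Schema, xs:anyAttribute belongs to the attribute declarations of a type: it appears in xs:complexType, xs:extension, xs:restriction or xs:attributeGroup, after any xs:attribute or xs:attributeGroup children, rather than inside a model group such as xs:choice.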
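A minimal sketch of an attribute wildcard follows, based on the client element used above. The id attribute is hypothetical. The processContents attribute is taken from W3C XML Schema, where the value "skip" means that wildcard-matched attributes are not validated; whether and how Tamino supports it should be checked in the description of xs:anyAttribute in the reference documentation.

```xml
<xs:element name="client">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="cl_lastname" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string"/>
    <!-- Wildcard: attributes other than id are also accepted -->
    <xs:anyAttribute processContents="skip"/>
  </xs:complexType>
</xs:element>
```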

Model Groups: Choices, Sequences and All

xs:complexType is used to:

Extension and Definition of Simple Content Models

Another possibility for creating a complex type is to define a simple content model. A simple content model for a complex type can be created by extending an existing base type with additional attributes. This is done using the xs:simpleContent element and its child element, xs:extension.

The following examples illustrate this:

Example 1:

A complex type is defined with a simple content model constructed as follows:

An attribute duration of type unsignedShort is added to an element of type normalizedString

<xs:element name="Track">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base = "xs:normalizedString">
        <xs:attribute name = "duration"
                      type = "xs:unsignedShort"
                      use = "required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

Example 2:

The attributes language and length are added:

<xs:element name = "TITLE">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base = "xs:string">
        <xs:attribute name = "language"
                      type = "xs:string"/>
        <xs:attribute name = "length"
                      use = "required"
                      type = "xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

Example 3:

Perhaps surprisingly, an empty element is also modeled in TSD using an empty complex type definition with the mixed attribute set to the value false:

<xs:element name = "Tag">
  <xs:complexType mixed="false"/>
</xs:element>

Untyped Element

If neither a type attribute nor a simple or complex type is specified for an element, arbitrary attributes and child elements are permitted.
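As a small illustration, the following sketch shows an untyped declaration together with an instance fragment that it admits. The names remark, source and b are hypothetical.

```xml
<!-- Schema fragment: neither a type attribute nor an inline type -->
<xs:element name="remark"/>

<!-- Instance fragment accepted by that declaration -->
<remark source="lab">Please <b>recheck</b> the sample.</remark>
```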

Examples of XML Data Typing

For more information about the Tamino query language, see the X-Query User Guide.

Example: Natively-Stored Doctype with some Formally-Structured Nodes

The following example specifies the "born" node as a field of type integer in a doctype stored natively in XML that is to be indexed for full text retrieval. In XML Schema, it can be represented by this code:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tsd="http://namespaces.softwareag.com/tamino/TaminoSchemaDefinition">
  <!-- Schema for patient data. This could be a fragment of a larger
       schema modeling hospital data. -->
  <xs:element name='patient'>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref='name' minOccurs='0'/>
        <xs:element ref='address' minOccurs='0'/>
        <xs:element name='born' type='xs:integer'/>
      </xs:sequence>
      <xs:attribute name='ID' type='xs:string' use='optional'/>
    </xs:complexType>
  </xs:element>
  <xs:element name='name'>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref='surname'/>
        <xs:element ref='firstname'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name='surname' type='xs:string'/>
  <xs:element name='firstname' type='xs:string'/>
  ...
  <tsd:elementInfo>
    <tsd:physical>
      <tsd:native>
        <tsd:index>
          <tsd:standard/>
        </tsd:index>
      </tsd:native>
    </tsd:physical> 
  </tsd:elementInfo>
  ...
</xs:schema>

This improves the performance of requests such as "List all patients born in or after a certain year", for example:

....../patient[born >= 1950]

Substitution Groups

The motivation for substitution groups originates in the area of object-oriented design. Assume that a schema includes the following global element declarations:

<xs:element name="name" type="xs:string" />
<xs:element name="surname" type="xs:string" substitutionGroup="name" />

Then the element <surname> may replace <name> in any context where <name> would have been validated against the element declaration shown above. In general, the type of the substituting element must be the same as, or derived from, the type of the substituted element (in this example, both are xs:string).
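A sketch of how the substitution takes effect in a content model follows. The person element is hypothetical; the two global declarations are repeated from above. Substitution applies where the head element is referenced with the ref attribute.

```xml
<xs:element name="name" type="xs:string"/>
<xs:element name="surname" type="xs:string" substitutionGroup="name"/>

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <!-- The reference to the head element also admits its substitutes -->
      <xs:element ref="name"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

With these declarations, both <person><name>Miller</name></person> and <person><surname>Miller</surname></person> are valid instances.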

Identity Constraints

Identity constraints allow you to define unique constraints or keys that are checked within the scope of a single XML document that is validated against the schema.

As an example, we consider a schema that describes a company and its employees:

<xs:element name="company">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element ref="address"/>
      <xs:element ref="employee" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="employee">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="boss" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

We assume that each employee has a unique name. We further assume that, as a rule, an employee has a boss who is also an employee (of the same company). These constraints can be described by the following extension of the element declaration shown above:

<xs:element name="company">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element ref="address"/>
      <xs:element ref="employee" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:key name="emplName">
    <xs:selector xpath="employee"/>
    <xs:field xpath="name"/>
  </xs:key>

  <xs:keyref name="emplNameRef" refer="emplName">
    <xs:selector xpath="employee"/>
    <xs:field xpath="boss"/>
  </xs:keyref>

</xs:element>

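The <xs:key> constraint asserts that:

  1. The name values of the employee elements within one company element are unique.

  2. Each employee element within a company element has a name child element.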

If <xs:unique> is used instead of <xs:key>, the second condition is not enforced by the identity constraint. Based on the extended declaration of the company element, the employee:

<employee>
  <name>Bob Smith</name>
  <boss>Alex Miller</boss>
</employee>

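is only valid if:

  1. No other employee element in the same company element has the name "Bob Smith".

  2. The same company element contains an employee element whose name is "Alex Miller".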
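A sketch of an instance document that satisfies both identity constraints follows. The company name and the content of the address element are assumptions, since the declaration of address is not shown here.

```xml
<company>
  <name>Example Corp</name>
  <!-- Content of address is assumed; its declaration is not shown -->
  <address>1 Sample Street</address>
  <employee>
    <!-- No boss element: the keyref applies only where boss is present -->
    <name>Alex Miller</name>
  </employee>
  <employee>
    <name>Bob Smith</name>
    <!-- Must match the name of an employee in the same company -->
    <boss>Alex Miller</boss>
  </employee>
</company>
```

Adding a second employee named "Bob Smith" would violate the key, and changing the boss value to a name that no employee carries would violate the key reference.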
