The Logical Schema


Schema

The root element of each XML Schema-conformant schema document is the xs:schema element. It may have the optional targetNamespace attribute, which specifies the namespace to which all definitions in the current schema document belong.

<xs:schema targetNamespace="http://my-company.com/"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">

If a schema has a targetNamespace, all globally defined objects (elements, attributes, type definitions etc.) that occur as direct child elements of xs:schema have qualified names that belong to that namespace. Locally declared elements or attributes only belong to the targetNamespace if their form attribute, which defaults to the value of the elementFormDefault or attributeFormDefault attribute of xs:schema, has a value of "qualified".
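The effect of the form defaults can be sketched with a small schema. The element names order and item are hypothetical and serve only as an illustration; the fragment assumes standard XML Schema behavior.

```xml
<xs:schema targetNamespace="http://my-company.com/"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
  <!-- Global element (hypothetical name): always in the target namespace -->
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- Local element: in the target namespace only because
             elementFormDefault is "qualified" -->
        <xs:element name="item" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A matching instance qualifies both elements:

```xml
<o:order xmlns:o="http://my-company.com/">
  <o:item>pencil</o:item>
</o:order>
```

Without elementFormDefault="qualified", the local element item would have to appear in the instance without a namespace.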

Elements and Attributes

Elements and attributes are described by:

  • Elements xs:element and xs:attribute
    If these elements are child elements of the xs:schema root element, they are called global elements or attributes, respectively. Otherwise, we speak of local elements or attributes.

  • The name or ref attribute
    The value of the name attribute is the unqualified (i.e. local) name, without namespace prefix, of the element or attribute being defined. Alternatively, a ref attribute contains a qualified reference to a global element or attribute (i.e. including the namespace prefix, if relevant).

  • Cardinality
    The cardinality of the xs:element can be specified using minOccurs (default value: 1) and maxOccurs (default value: 1; the value may also be "unbounded") attributes. Similarly, the cardinality of xs:attribute is specified by the use attribute: use="optional" (default) or use="required".

  • Optional type attribute
    Only one of the following variants can be used.

    • The type attribute is a qualified reference to a globally defined type. This may be one of the more than 40 simple types predefined by XML Schema, a user-defined named simple type, or a user-defined complex type.

      Example:

      <xs:element name="fee" type="xs:decimal" />
      

      Instead of using the type attribute, one of the child elements xs:simpleType or xs:complexType may be used to define the type of the logical node.

    • xs:simpleType allows you to add constraints via facets specified as child elements of a nested xs:restriction element.

      Example:

      This example shows an element of predefined type, xs:string, with the xs:maxLength facet restricting the maximum length of the string; it has no attributes.

      <xs:element name="surname">
        <xs:simpleType>
          <xs:restriction base="xs:string">
            <xs:maxLength value="20"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:element>
      

      Alternatively, a new simple type can be constructed using xs:list or xs:union.

      <xs:simpleType name="listOfInt">
        <xs:list itemType="xs:int" />
      </xs:simpleType>

      validates any list of whitespace-separated values of type xs:int.

      <xs:simpleType name="typeOfMaxOccurs">
        <xs:union memberTypes="xs:nonNegativeInteger">
          <xs:simpleType>
            <xs:restriction base="xs:token">
              <xs:enumeration value="unbounded" />
            </xs:restriction>
          </xs:simpleType>
        </xs:union>
      </xs:simpleType>
      

      shows a possible type definition for the maxOccurs attribute of xs:element; it allows for a non-negative integer or the token "unbounded".

    • Complex type definitions are described below.

      Note:
      A complex type can only be specified or referenced by an element declaration; it is not allowed for attribute declarations.
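The name, ref and cardinality attributes described in this section can be combined as in the following sketch. All element and attribute names (invoice, position, comment, number) are hypothetical, and the schema has no target namespace, so the reference needs no prefix.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global element, declared once and referenced below -->
  <xs:element name="comment" type="xs:string"/>

  <xs:element name="invoice">
    <xs:complexType>
      <xs:sequence>
        <!-- Local element: must occur at least once, may repeat -->
        <xs:element name="position" type="xs:string" maxOccurs="unbounded"/>
        <!-- Reference to the global element; optional -->
        <xs:element ref="comment" minOccurs="0"/>
      </xs:sequence>
      <!-- Attributes are optional unless use="required" is given -->
      <xs:attribute name="number" type="xs:positiveInteger" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An instance that satisfies these declarations:

```xml
<invoice number="42">
  <position>Paper</position>
  <position>Ink</position>
</invoice>
```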

Overview: The Logical Part of the Meta Schema

The logical schema specifies the structural information of the schema.

The Tamino Schema Language is defined by a meta schema (also known as a schema of schemas), based on a subset of the W3C XML Schema standard. It is described in the section TSD Logical: Definitions of the Tamino XML Schema Reference Guide.

The structure of the logical schema is shown in the graphics below. The root element is the xs:schema element.

Note:
The namespace prefix xs: is omitted in all of these graphics.

Logical part of TSD: Expansion of root element xs:schema

graphics/TSD3logi-schema.gif

The top-level element xs:schema as defined in this graphic is the container for all the information pertaining to a Tamino schema. For further information about the elements and attributes as they are defined in the W3C standard, see XML Schema Elements. This section describes Tamino's implementation of the standard.

Particles

The following particle definitions appear below:

  1. xs:all

  2. xs:annotation

  3. xs:any

  4. xs:choice

  5. xs:element

  6. xs:field

  7. xs:group

  8. xs:selector

  9. xs:sequence

Logical part of TSD: Expansion of xs:all

graphics/TSD3logi-all.gif

Logical part of TSD: Expansion of xs:annotation

graphics/TSD3logi-annotation.gif

Logical part of TSD: Expansion of xs:any

graphics/TSD3logi-any.gif

Logical part of TSD: Expansion of xs:choice

graphics/TSD3logi-choice.gif

Logical part of TSD: Expansion of xs:element

graphics/TSD3logi-element.gif

Logical part of TSD: Expansion of xs:field

graphics/TSD3logi-field.gif

Logical part of TSD: Expansion of xs:group

graphics/TSD3logi-group.gif

Logical part of TSD: Expansion of xs:selector

graphics/TSD3logi-selector.gif

Logical part of TSD: Expansion of xs:sequence

graphics/TSD3logi-sequence.gif

Attribute-Related Elements

The following attribute element definitions appear below:

  1. xs:anyAttribute

  2. xs:attribute

  3. xs:attributeGroup

Logical part of TSD: Expansion of xs:anyAttribute

graphics/TSD3logi-anyAttribute.gif

Logical part of TSD: Expansion of xs:attribute

graphics/TSD3logi-attribute.gif

Logical part of TSD: Expansion of xs:attributeGroup

graphics/TSD3logi-attributeGroup.gif

Type-Related Elements

The following type element definitions appear below:

  1. xs:complexType

  2. xs:simpleType

Logical part of TSD: Expansion of xs:complexType

graphics/TSD3logi-complexType.gif

Logical part of TSD: Expansion of xs:simpleType

graphics/TSD3logi-simpleType.gif

Type Derivation Elements

The following type derivation element definitions appear below:

  1. xs:extension as a child element of xs:complexContent

  2. xs:extension as a child element of xs:simpleContent

  3. xs:restriction as a child element of xs:complexContent

  4. xs:restriction as a child element of xs:simpleContent

Logical part of TSD: Expansion of xs:complexContent/xs:extension

graphics/TSD3logi-complexContent_extension.gif

Logical part of TSD: Expansion of xs:simpleContent/xs:extension

graphics/TSD3logi-simpleContent_extension.gif

Logical part of TSD: Expansion of xs:complexContent/xs:restriction

graphics/TSD3logi-complexContent_restriction.gif

Logical part of TSD: Expansion of xs:simpleContent/xs:restriction

graphics/TSD3logi-simpleContent_restriction.gif

Note:
xs:restriction as a child element of xs:simpleType has the same content model as xs:restriction as a child element of xs:simpleContent, except that it does not allow xs:simpleType, xs:attribute, xs:attributeGroup and xs:anyAttribute as child elements.

Constraints on Element Definition

In general, there are two kinds of elements in XML Schema: elements of simple type and elements of complex type. Elements of complex type contain other elements or attributes, whereas elements of simple type contain only character data but neither child elements nor attributes. xs:element enables you to define elements of both simple type and complex type by using its child elements xs:simpleType and xs:complexType, or by referencing a user-defined or predefined named type. Lists of the predefined types offered by the XML Schema standard can be found at http://www.w3.org/TR/xmlschema-0/#CreatDt and http://www.w3.org/TR/xmlschema-2/#built-in-datatypes. All the predefined simple types are also available in Tamino. Additionally, in XML Schema there are mechanisms for deriving new types from existing types. These are:

  • extension;

  • restriction;

  • list;

  • union.

Type derivation by restriction can be used:

  • to introduce constraining facets, in the case of simple types or complex types with simple content;

  • to restrict attribute occurrences for complex types with simple content or complex content;

  • to restrict the content model, in the case of complex types with complex content;

  • to apply constraining facets to the original datatype (base datatype).

The restriction mechanism is provided by the xs:restriction element.

Complex type definitions offer the possibility of defining elements with attributes, and elements with child elements. The XML Schema standard offers a wealth of possibilities for defining elements and attributes. Most but not all of them are available in the Tamino schema definition language.

In general, TSD (like XML Schema) offers both named and anonymous complex type definitions, i.e. a complex type definition may or may not have a name attribute by which it can subsequently be referenced.

The complex type definition of XML Schema is described in detail at http://www.w3.org/TR/xmlschema-0/#DefnDeclars

An attribute can be defined using the extension mechanism. To define an extension, use the xs:extension child element of the xs:simpleContent element, which is, in turn, a child element of the xs:complexType element. Here is an example:

<xs:complexType>
  <xs:simpleContent>
    <xs:extension base = "xs:normalizedString">
      <xs:attribute name = "duration"
                    type = "xs:unsignedShort"
                    use = "required"/>
    </xs:extension>
  </xs:simpleContent>
</xs:complexType>

Here, a new complex type is created by extending the existing type xs:normalizedString with an attribute whose name is duration that must be specified (use = required) and is of type xs:unsignedShort. The xs:simpleContent element indicates that the element to be defined has no child elements, i.e. it contains only character data and attributes.

You can create a complex type containing child elements by using, for example, the xs:sequence element, which defines an ordered sequence of elements as the content model. Alternatives are specified with the xs:choice element. If you have defined child elements using xs:sequence, xs:choice or xs:all and also want to define attributes, you cannot use xs:simpleContent as described above; instead, add xs:attribute elements to the xs:complexType after the model group.
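As a minimal sketch of a content model combined with an attribute, the following declaration uses hypothetical names (track, title, artist); only the duration attribute is taken from the example above.

```xml
<xs:element name="track">
  <xs:complexType>
    <!-- Content model: child elements in a fixed order -->
    <xs:sequence>
      <xs:element name="title"  type="xs:string"/>
      <xs:element name="artist" type="xs:string" minOccurs="0"/>
    </xs:sequence>
    <!-- Attribute declarations follow the model group -->
    <xs:attribute name="duration" type="xs:unsignedShort" use="required"/>
  </xs:complexType>
</xs:element>
```

A valid instance would be, for example:

```xml
<track duration="215">
  <title>Example Title</title>
</track>
```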

Tamino also supports groups and attribute groups as defined by XML Schema. For more information, see the descriptions of the elements xs:group and xs:attributeGroup.

Constraints on Attribute Definition

Attributes can be declared locally or globally. They can only be of simple types. Therefore, the constraints on attributes (which correspond to the constraints on elements described above) only apply to the attribute type and the element xs:simpleType.

  • Attribute with Simple Type
    Attribute: type.
  • Attribute with Simple Type and Facets
    Element xs:simpleType with element xs:restriction containing facets.
  • Reference to Attributes
    Attribute: ref.

For more information, see the section xs:attribute element of the Tamino XML Reference Guide.
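The three variants listed above can be sketched in one schema. The names note, priority and lang are hypothetical, and the fragment assumes that the minInclusive and maxInclusive facets behave as in standard XML Schema.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global attribute with a simple type given by the type attribute -->
  <xs:attribute name="lang" type="xs:language"/>

  <xs:element name="note">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <!-- Local attribute with an anonymous simple type and facets -->
          <xs:attribute name="priority">
            <xs:simpleType>
              <xs:restriction base="xs:int">
                <xs:minInclusive value="1"/>
                <xs:maxInclusive value="5"/>
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
          <!-- Reference to the global attribute -->
          <xs:attribute ref="lang" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

An instance such as `<note priority="3" lang="en">Call back</note>` would satisfy these declarations.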

XML Datatypes

This section describes Tamino's facilities for specifying datatypes for XML data.

Datatype and Facet Support in TSD Compared to XML Schema 1.0

Tamino offers the same possibilities for defining new datatypes as XML Schema does; of these, the simple types are described in the W3C standards document XML Schema Part 2: Datatypes. All datatypes described in that document are implemented in TSD.

For readers not familiar with XML Schema, the basics of the mechanisms used for datatype definition in Tamino and XML Schema are briefly summarized here.

According to the definition in the XML Schema specification, a datatype is a set (more precisely, a 3-tuple) comprising the following items:

The value space

The value space is the set of values that is allowed for a given datatype. A value space has some properties; for example, it can be ordered.

The lexical space

The lexical space is the set of valid literals for a datatype. It may be possible to have more than one representation for one and the same member of a specific value space. These different representations are different members in the lexical space that represent the same element in the value space. For example, "1000" and "1.000E3" are two different literals and therefore two different members of the lexical space that both represent the same member in the value space of the datatype float.

A set containing one or more facets

A facet is a property of a value space that can be used to characterize that value space. It typically represents a single aspect of the value space (i.e., a single dimension in a multi-dimensional representation of the value space). A facet can be fundamental (semantically characterizing the value space) or non-fundamental (defining constraints to the value space, therefore also called a constraining facet).
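The difference between lexical space and value space can be illustrated with two instance fragments. The element name weight is hypothetical and is assumed to be declared with type xs:float.

```xml
<!-- Assumed declaration: <xs:element name="weight" type="xs:float"/> -->

<!-- Two different literals (lexical space) ... -->
<weight>1000</weight>
<weight>1.000E3</weight>
<!-- ... that denote the same member of the value space of xs:float -->
```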

Built-In Datatypes

These datatypes are predefined in Tamino (and in W3C XML Schema), so you can use them without having to declare or define them. They can be divided into two groups: primitive types and derived types.

Primitive Types in Tamino

Primitive datatypes are predefined by the XML Schema standard. The following primitive types are available in Tamino:

Datatype | Description | Lexical Representation
xs:string | Character string of unlimited length. | "A short string"
xs:boolean | Boolean value. | "true", "false", "1", "0"
xs:decimal | Decimal number. A precision of at least 18 digits is supported. | "-1.23", "125.64", "0.0", "+500000.00", "170"
xs:float | Single-precision 32-bit floating point type according to the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic. This type includes the special values positive and negative zero, positive and negative infinity, and not-a-number. | "-1E3", "172.363E14", "18.73e-5", "45", "INF", "-INF", "0", "-0", "NaN"
xs:double | Double-precision 64-bit floating point type according to the IEEE 754-1985 Standard for Binary Floating-Point Arithmetic. | "-1E4", "547.433E12", "36.78e-2", "12", "INF", "-INF", "0", "-0", "NaN"
xs:duration | This datatype specifies a period of time. The value space is a six-dimensional space, where the coordinates designate the Gregorian year, month, day, hour, minute and second. Note: This datatype cannot be indexed. | The lexical representation follows the format "PnYnMnDTnHnMnS". An optional fractional part for seconds is allowed. Negative durations are also allowed.
xs:time | A specific time of day as defined in §5.3 of the ISO 8601 standard on date and time formats. Also see note below. | The lexical format is hh:mm:ss. Note: An optional fractional part for seconds is permitted. A time zone can be specified, if necessary: "Z" for UTC time, or a signed time difference in the format hh:mm. Examples: "05:20:23.2", "13:20:00-05:00"
xs:date | A Gregorian calendar date according to §5.2.1 of the ISO 8601 standard on date and time formats. Also see note below. | The lexical format is CCYY-MM-DD. To accommodate values outside the range 1-9999, additional digits and a negative sign can be added to the left. (The year 0000 is prohibited.) Example: "1999-05-31"
xs:dateTime | A specific instant of time (a combination of date and time) as defined in §5.4 of the ISO 8601 standard on date and time formats. Also see note below. | The lexical format is CCYY-MM-DDThh:mm:ssZ, where "T" is the delimiter character between date and time and "Z" denotes an optional time zone. Examples: "1999-05-31T13:20:00-05:00", "2001-12-01T05:20:23.2"
xs:gYearMonth | This datatype represents a specific Gregorian month in a specific Gregorian year. | The lexical format is CCYY-MMZ, where "Z" denotes an optional time zone. Example: "2001-05"
xs:gYear | This datatype represents a Gregorian year. | The lexical format is CCYYZ, where "Z" denotes an optional time zone. Example: "1994"
xs:gMonthDay | This datatype specifies a Gregorian date. | The lexical format is --MM-DDZ, where "Z" denotes an optional time zone. Example: "--04-01"
xs:gMonth | This datatype denotes a Gregorian month that recurs every year. | The lexical format is --MM--Z, where "Z" denotes an optional time zone. Example: "--07" ("--07--" is accepted for backward compatibility)
xs:gDay | This datatype denotes a Gregorian day that recurs every month. | The lexical format is ---DDZ, where "Z" denotes an optional time zone. Example: "---13"
xs:hexBinary | Hexadecimal-encoded arbitrary binary data. | Examples: "9a7f", "FFFF3", "0100"
xs:base64Binary | Base64-encoded arbitrary binary data. The entire binary stream is encoded using the Base64 Content-Transfer-Encoding defined in Section 6.8 of RFC 2045. |
xs:anyURI | A reference to a Uniform Resource Identifier (URI). |
xs:QName | An XML qualified name, consisting of a namespace name and a local part. |
xs:NOTATION | Represents the NOTATION attribute type from XML attributes. This is an abstract datatype, i.e. users must derive their own datatype from it. |

Derived Built-In Datatypes

In addition to these primitive datatypes, it is also possible to derive datatypes in TSD from other datatypes, which in turn may be either primitive types or derived types, by two different mechanisms:

  • Restriction
    This means that the value space of the original datatype is constrained in some respect by a constraining facet. For example, the datatype nonNegativeInteger is derived from the datatype integer by constraining its value space to non-negative values; the constraint consists of the exclusion of negative values.

  • List
    This offers the possibility of creating a new datatype by combining elements of existing datatypes: instead of a single element of the datatype in question, a sequence of values of the same datatype is allowed. For example, the datatype xs:NMTOKENS is derived from the datatype xs:NMTOKEN by constructing a list of NMTOKEN values.

The W3C XML Schema standard contains further mechanisms for deriving datatypes from a given datatype.

The following derived datatypes are supported in Tamino:

Datatype | Derived From | Description | SQL Equivalent
xs:normalizedString | xs:string | A string after whitespace normalization. | VARCHAR, CHAR
xs:token | xs:normalizedString | Does not contain the line feed ("#xA") or tab ("#x9") characters, does not have leading or trailing spaces ("#x20") and does not have multiple consecutive internal spaces. | VARCHAR, CHAR
xs:NMTOKEN | xs:token | Represents the NMTOKEN attribute type (DTD) that is described in http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Nmtoken. | VARCHAR, CHAR
xs:NMTOKENS | xs:NMTOKEN | Represents the NMTOKENS attribute type (DTD) that is described in http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Nmtokens. | VARCHAR, CHAR
xs:Name | xs:token | Represents an XML Name as described in http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Name. | VARCHAR, CHAR
xs:NCName | xs:Name | Represents an XML "non-colonized" Name as described in http://www.w3.org/TR/xmlschema-2/#NCName. | VARCHAR, CHAR
xs:ID | xs:NCName | Represents the ID attribute type as described in http://www.w3.org/TR/xmlschema-2/#ID. | VARCHAR, CHAR
xs:IDREF | xs:NCName | Represents the IDREF attribute type as described in http://www.w3.org/TR/xmlschema-2/#IDREF. | VARCHAR, CHAR
xs:IDREFS | xs:IDREF | Represents the IDREFS attribute type as described in http://www.w3.org/TR/xmlschema-2/#IDREFS. | VARCHAR, CHAR
xs:ENTITY | xs:NCName | Represents the ENTITY attribute type as described in http://www.w3.org/TR/xmlschema-2/#ENTITY. | VARCHAR, CHAR
xs:ENTITIES | xs:ENTITY | Represents the ENTITIES attribute type as described in http://www.w3.org/TR/xmlschema-2/#ENTITIES. | VARCHAR, CHAR
xs:language | xs:token | Represents formal language identifiers, as defined by RFCs 3066, 4646 and 4647 or their successor(s). The value space and lexical space are the set of all strings that conform to the pattern [a-zA-Z]{1,8}(-[a-zA-Z0-9]{1,8})*. | no equivalent
xs:integer | xs:decimal | The standard mathematical integer datatype. Derived from datatype decimal by setting the facet "fractionDigits" to 0. | no equivalent in SQL due to excessive value range
xs:nonPositiveInteger | xs:integer | An integer less than or equal to zero. | no equivalent
xs:negativeInteger | xs:nonPositiveInteger | An integer less than zero. | no equivalent
xs:long | xs:integer | An integer in the range -9223372036854775808 (-2^63) to 9223372036854775807 (2^63-1). | not supported
xs:int | xs:long | An integer in the range -2147483648 (-2^31) to 2147483647 (2^31-1). | INTEGER
xs:short | xs:int | An integer in the range -32768 (-2^15) to 32767 (2^15-1). | SMALLINT
xs:byte | xs:short | An integer in the range -128 (-2^7) to 127 (2^7-1). | TINYINT
xs:nonNegativeInteger | xs:integer | An integer greater than or equal to zero (0 to 2^64-1). | no equivalent
xs:unsignedLong | xs:nonNegativeInteger | An integer in the range 0 to 2^64-1. | no equivalent
xs:unsignedInt | xs:unsignedLong | An integer in the range 0 to 4294967295 (2^32-1). | no equivalent
xs:unsignedShort | xs:unsignedInt | An integer in the range 0 to 65535 (2^16-1). | no equivalent
xs:unsignedByte | xs:unsignedShort | An integer in the range 0 to 255 (2^8-1). | TINYINT
xs:positiveInteger | xs:nonNegativeInteger | An integer greater than zero. | no equivalent

User-Defined Datatypes

You can create two kinds of user-defined datatypes in Tamino (or XML Schema):

  • Simple Datatypes
    Simple datatypes are datatypes that users of Tamino or XML Schema can define by themselves. An element that is defined using a simple datatype can have neither child elements nor attributes. In Tamino, a simple datatype can be defined with the xs:simpleType element.

    In Tamino's Schema Definition Language TSD, a new datatype can be specified as shown in the following example using an XML Schema simple type definition:

    <xs:element name = "---">
      <xs:simpleType>
        <xs:restriction base="xs:decimal">
          <xs:totalDigits value= "15"/>
          <xs:fractionDigits value = "5"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:element>
    

    This defines a decimal datatype with the precision set to 15 digits and the number of fraction digits set to 5.

  • Complex Datatypes
    In contrast to the simple datatype definitions, complex type definitions can be used to define elements that contain child elements and/or attributes. This very powerful technique is realized in Tamino with the xs:complexType element.

Hierarchy of Datatypes

The hierarchy of datatypes allowed in TSD is depicted in the following graphic:

Built-in Datatype Hierarchy

graphics/thschema.gif

Originally published in the W3C Recommendation "XML Schema 1.0 Part 2: Datatypes"

Ranges of Numeric Types and Related Issues

The value space of the datatype integer is from -9223372036854775808 (-2^63, approx. -9E18) to +9223372036854775807 (2^63-1, approx. 9E18). Numbers outside this range lead to overflow errors. In queries, the value -9223372036854775808 must be coded as (-9223372036854775807 -1). This type is sometimes called "signed integer".

The value space of the datatype unsigned integer is from 0 to 18446744073709551615 (2^64-1, approx. 1.8E19). Numbers outside of this range will lead to overflow errors.

The value space of the datatype decimal is from -999999999999999999 (-(1E18-1)) to +999999999999999999 (1E18-1), or more precisely: from -999999999999999999 to -0.000000000000000001, 0, and from 0.000000000000000001 to +999999999999999999. Accuracy is limited to 18 significant digits; for example, 123456789.987654321 (18 significant digits) can be represented exactly, but 123456789.9876543215 (19 significant digits) is rounded to 123456789.987654322. Numbers whose absolute value is less than 5E-19, i.e. half of the smallest representable magnitude 1E-18, are rounded to zero. Numbers greater than or equal to 999999999999999999.5 or less than or equal to -999999999999999999.5 lead to overflow.

The value spaces of the datatypes float and double and their binary representations are as specified in IEEE 754. All comparisons and arithmetic operations with numeric data are carried out in the internal representation. Consequently, the limitations of binary representation as described in IEEE 754: IEEE Standard for Binary Floating-Point Arithmetic apply.

If you want to use the datatypes float or double, you should understand these numeric formats, e.g. by reading an introductory text on numerical mathematics, in order to know exactly the difficulties and limitations.

If you have to calculate financial results, use the type decimal. Neither float nor double is suitable.

The approximate ranges of float and double in decimal notation are as follows:

The range of float is approximately from -3.402823466E+38 to -1.175494351E-38, 0, and from 1.175494351E-38 to 3.402823466E+38; also the special values -INF, INF, NaN.

The range of double is approximately from -1.7976931348623158E+308 to -2.2250738585072014E-308, 0, and from 2.2250738585072014E-308 to 1.7976931348623158E+308; also the special values -INF, INF, NaN.

Numbers outside these ranges lead to overflow or to the results -INF, INF, NaN.

The word "approximately" is used above because different conversion routines on different platforms may behave slightly differently. The conversion from string to the internal binary representation can handle any precision, but the precision of the result of the conversion cannot exceed the precision of the internal representation. The conversion of the internal representation to string also leads to limitations of precision. This implementation defines that the conversion of float to string returns at most 6 significant digits, and the conversion of double to string returns at most 14 significant digits.

Conversion to string always yields the canonical representation. This applies also to integer, unsigned integer and decimal.

Type Propagations

When two operands in an arithmetic expression or a comparison are of different numeric types, Tamino ensures that the results are correctly evaluated. For example, if you add an integer to a decimal, the integer is converted to decimal and the result is calculated as a decimal. The conversion of the integer to decimal may result in an overflow, as the value space of decimal is smaller than the value space of integer. Overflow may also occur when attempting to convert, for example, a negative integer to type unsigned integer.

Generally, types are propagated in the following order: integer --> unsigned integer --> decimal --> float --> double.

Exceptions from this rule

If, for example, an integer and a double are added, the integer is not converted via the steps in the chain. That would cause unnecessary loss of precision. Instead, the integer is converted directly to double.

If the parameters to the min() function are a sequence of integers and unsigned integers, the result is normally unsigned integer. However, if at least one sequence member is negative, the result is integer (to avoid overflow).

Ordering and Comparison Operations with Datatypes for Date and Time

The following applies for datatypes such as date or time:

Relationship Between XML Schema Types and integer Types

The signed integer type is used for the following XML Schema types:

The unsigned integer type is used for the following XML Schema types:

Defining Simple Types

There is one mechanism for creating a new simple type from an existing base type, namely restriction.

Constraining facets can be used to restrict the value space of an existing simple type (which is specified as the base attribute) using the restriction element.

This is done in Tamino in the same way as in the XML Schema standard.

The following constraining facets, as defined in the XML Schema standard, are available in Tamino:

They can be applied individually or in combination. However, not all combinations of restricting facets are valid for all datatypes. See the table of Valid Combinations of Restricting Facets and Base Datatypes for details.

For improved reusability, it is possible to use the name attribute to specify a name for a simple type. The definition can subsequently be referenced using this name. This is called a named simple type definition. At the highest level, only named type definitions are allowed; anonymous type definitions are only permitted below the highest level.

The first example below shows a named simple type definition; the second example shows an anonymous simple type definition:

Example of a simple type definition using a single facet (enumeration)

This example uses the enumeration facet to restrict the value space of the base type NMTOKEN to three possible values. It defines a simple type that characterizes the traction of a locomotive; only the three values "steam", "diesel" and "electric" are valid for the defined type:

<xs:simpleType name="traction">
  <xs:restriction base = "xs:NMTOKEN">
    <xs:enumeration value = "steam"/>
    <xs:enumeration value = "diesel"/>
    <xs:enumeration value = "electric"/>
  </xs:restriction>
</xs:simpleType>
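Because the type above is named, it can be referenced from other declarations. The following sketch assumes a schema without a target namespace (so no prefix is needed in the reference) and uses the hypothetical element name locomotive.

```xml
<xs:element name="locomotive">
  <xs:complexType>
    <!-- Reference to the named simple type "traction" -->
    <xs:attribute name="traction" type="traction" use="required"/>
  </xs:complexType>
</xs:element>
```

With this declaration, `<locomotive traction="diesel"/>` would be valid, whereas `<locomotive traction="horse"/>` would be rejected because "horse" is not one of the enumerated values.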

Example of a simple type definition using multiple facets (totalDigits and fractionDigits)

The SQL datatype numeric(18,5) is expressed using the totalDigits and fractionDigits facets as:

<xs:element name = "myDecimal">
  <xs:simpleType>
    <xs:restriction base="xs:decimal">
      <xs:totalDigits value= "18"/>
      <xs:fractionDigits value = "5"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
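To show how the two facets interact, the following instance fragments are checked against the myDecimal declaration above, assuming standard XML Schema validation of totalDigits and fractionDigits.

```xml
<!-- valid: 13 integer digits + 5 fraction digits = 18 digits in total -->
<myDecimal>1234567890123.12345</myDecimal>

<!-- not valid: 6 fraction digits exceed fractionDigits = 5 -->
<myDecimal>0.123456</myDecimal>

<!-- not valid: 19 digits exceed totalDigits = 18 -->
<myDecimal>1234567890123456789</myDecimal>
```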

Defining Complex Types

A simple type definition allows neither the definition of a node that contains other elements nor the definition of a node that contains attributes; therefore, the nodes that can be defined using simple type definitions are terminal nodes in the XML tree. To define more complex nodes containing elements or attributes, a more sophisticated kind of type definition is required, namely the complex type definition. It allows the following:

This section deals with the following topics:

Definition of Element and Attribute Wildcards

xs:any (Definition of Element Wildcards)

The xs:any element enables you to extend the instances with elements that are not specified by the schema. It is allowed in xs:choice and xs:sequence elements.

<xs:element name="client">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="cl_firstname" type="xs:string"/>
      <xs:element name="cl_lastname" type="xs:string"/>
      <xs:any minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

xs:anyAttribute (Definition of Attribute Wildcards)

This element specifies attribute wildcards in complex type definitions. It allows an element in the XML instance to carry arbitrary attributes that are not explicitly declared in the schema. In XML Schema, xs:anyAttribute is not part of a model group such as xs:choice; it is written after the attribute declarations of an xs:complexType, xs:extension, xs:restriction or xs:attributeGroup.
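A minimal sketch of an attribute wildcard, following the placement rules of W3C XML Schema (xs:anyAttribute as the last child of xs:complexType, after the model group). The attribute processContents="lax" is part of the W3C standard; that Tamino supports it in the same way is an assumption here.

```xml
<xs:element name="client">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="cl_lastname" type="xs:string"/>
    </xs:sequence>
    <!-- Wildcard: undeclared attributes are accepted; with "lax",
         they are validated only if a matching declaration exists -->
    <xs:anyAttribute processContents="lax"/>
  </xs:complexType>
</xs:element>
```

An instance may then carry attributes that the schema does not declare, for example (status is a hypothetical attribute):

```xml
<client status="gold">
  <cl_lastname>Hillary</cl_lastname>
</client>
```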

Model Groups: Choices, Sequences and All

xs:complexType is used to:

  • Define a complex content model using model groups. A model group is composed of particles, which are elements, wildcards (represented by xs:any) and nested model groups. A particle has a valid occurrence count, which is determined by the values of the minOccurs and maxOccurs attributes. The default value of each of these attributes is 1. The following types of model groups exist:

    xs:choice

    xs:choice specifies a set of mutually exclusive particles. With this model group you can specify that exactly one of the particles specified in xs:choice must occur in the element.

    xs:sequence

    xs:sequence specifies an ordered set of particles. All of the particles must occur in the given order in the element.

    xs:all

    xs:all specifies an unordered set of elements. All elements specified must occur in the element, but they can occur in any order.

    Model groups themselves can be nested inside an xs:sequence or xs:choice element.

    Example:

    An element named "address", which may contain either a postal address or a telephone number or an email address:

    <xs:element name="address">
      <xs:complexType>
        <xs:choice>
          <xs:sequence>
            <xs:element name="street" type="xs:string"/>
            <xs:element name="zip"    type="xs:integer" minOccurs="0"/>
            <xs:element name="city"   type="xs:string"/>
          </xs:sequence>
          <xs:element name="phone" type="xs:string"/>
          <xs:element name="email" type="xs:string"/>
        </xs:choice>
      </xs:complexType>
    </xs:element>
    

    The following fragments validate against this schema fragment:

    <address>
      <street>5th Avenue</street>
      <city>New York</city>
    </address>
    
    <address>
      <phone>32168</phone>
    </address>
    
    <address>
      <email>E.Hillary@mt-everest.org</email>
    </address>
    

    whereas

    <address>
      <street>5th Avenue</street>
      <city>New York</city>
      <email>E.Hillary@mt-everest.org</email>
    </address>
    

    does not validate.

  • Add attribute definitions.

    Example 1:

    An element named fee with an amount attribute and a currency attribute, but empty content:

    <xs:element name="fee">
      <xs:complexType>
        <xs:attribute name="amount" type="xs:decimal" />
        <xs:attribute name="currency" type="xs:token" />
      </xs:complexType>
    </xs:element>
    

    An example of a valid instance:

    <fee amount="10" currency="USD"/>
    

    Example 2:

    An element named fee containing a decimal number with a currency attribute:

    <xs:element name="fee">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="xs:decimal">
            <xs:attribute name="currency" type="xs:token"/>
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
    

    An example of a valid instance:

    <fee currency="EUR">10</fee>
    
  • Allow for additional more or less arbitrary child elements or attributes using wildcards, i.e. the elements xs:any and xs:anyAttribute respectively.

    The XML Schema for schemas (and thus TSD) allows arbitrary attributes belonging to any other non-XML-schema namespace to be specified in any element. For example, the definition of the xs:schema element has the following structure:

    Example:

    <xs:element name="schema">
      <xs:complexType>
    
        <xs:sequence>
          ...
        </xs:sequence>
    
        <xs:attribute name="targetNamespace" type="xs:anyURI" />
        <xs:attribute name="version" type="xs:string" />
        ...
        <xs:anyAttribute namespace="##other" />
    
      </xs:complexType>
    </xs:element>
    
  • Allow for mixed content (where both text and child elements are allowed) by using the mixed="true" attribute.
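
Two points from the list above can be sketched briefly. The element names remark, em and contact are hypothetical and serve only as illustrations. First, a complex type with mixed content, in which text may surround the child elements:

<xs:element name="remark">
  <xs:complexType mixed="true">
    <xs:sequence>
      <xs:element name="em" type="xs:string"
                  minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<remark>Payment is due <em>within 30 days</em> of delivery.</remark>

Second, an xs:all group, whose elements may appear in either order in the instance:

<xs:element name="contact">
  <xs:complexType>
    <xs:all>
      <xs:element name="phone" type="xs:string"/>
      <xs:element name="email" type="xs:string"/>
    </xs:all>
  </xs:complexType>
</xs:element>

<contact>
  <email>E.Hillary@mt-everest.org</email>
  <phone>32168</phone>
</contact>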

Extension and Definition of Simple Content Models

Another possibility for creating a complex type is to define a simple content model. A simple content model for a complex type can be created by extending an existing base type with additional attributes. This is done using the xs:simpleContent element and its child element, xs:extension.

The following examples illustrate this:

Example 1:

A complex type is defined with a simple content model constructed as follows:

A required attribute duration of type xs:unsignedShort is added to an element whose content is of type xs:normalizedString:

<xs:element name="Track">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base = "xs:normalizedString">
        <xs:attribute name = "duration"
                      type = "xs:unsignedShort"
                      use = "required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

Example 2:

The optional attribute language and the required attribute length are added to an element whose content is of type xs:string:

<xs:element name = "TITLE">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base = "xs:string">
        <xs:attribute name = "language"
                      type = "xs:string"/>
        <xs:attribute name = "length"
                      use = "required"
                      type = "xs:string"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

Example 3:

Perhaps surprisingly, an empty element is also modeled in TSD using an empty complex type definition with the mixed attribute set to the value false:

<xs:element name = "Tag">
  <xs:complexType mixed="false"/>
</xs:element>
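
For illustration, instances along the following lines should validate against the three declarations above. The text content and the attribute values are invented for this sketch:

<Track duration="215">Yesterday</Track>

<TITLE language="en" length="short">A Short Title</TITLE>

<Tag/>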

Untyped Element

If neither a type attribute nor an anonymous simple or complex type definition is specified for an element, the element is untyped: arbitrary attributes, child elements and text content are permitted.
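
In XML Schema terms, such an element has the type xs:anyType. As a minimal sketch, assuming a hypothetical element named comment, the following declaration carries no type information:

<xs:element name="comment"/>

An instance with arbitrary attributes and mixed content should then be accepted, for example:

<comment priority="high">Call <person>Bob Smith</person> back.</comment>

The attribute and child element names used here are illustrative only.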

Examples of XML Data Typing

For more information about the Tamino query language, see the X-Query User Guide.

Example: Natively-Stored Doctype with some Formally-Structured Nodes

The following example declares the "born" node as an element of type xs:integer in a natively stored doctype and assigns it a standard index (tsd:standard), which supports comparisons on the element's value. In XML Schema, it can be represented by this code:

<xs:schema xmlns:xs ="http://www.w3.org/2001/XMLSchema">
  <!-- schema for patient data. This could be a fragment of a larger
       DTD modeling hospital data -->
  <xs:element name='patient'>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref='name' minOccurs='0'/>
        <xs:element ref='address' minOccurs='0'/>
        <xs:element name='born' type='xs:integer'/>
      </xs:sequence>
      <xs:attribute name='ID' type='xs:string' use='optional'/>
    </xs:complexType>
  </xs:element>
  <xs:element name='name'>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref='surname'/>
        <xs:element ref='firstname'/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name='surname' type='xs:string'/>
  <xs:element name='firstname' type='xs:string'/>
  ...
  <tsd:elementInfo>
    <tsd:physical>
      <tsd:native>
        <tsd:index>
          <tsd:standard/>
        </tsd:index>
      </tsd:native>
    </tsd:physical> 
  </tsd:elementInfo>
  ...
</xs:schema>

The index improves the performance of requests such as "List all patients born in or after a given year", for example:

....../patient[born >= 1950]
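
A document instance that fits the element declarations shown in the schema above could look as follows. The ID and the content values are made up for illustration, and the optional address element is omitted because its declaration is elided in the fragment:

<patient ID="p-001">
  <name>
    <surname>Atkins</surname>
    <firstname>Paul</firstname>
  </name>
  <born>1962</born>
</patient>

With born set to 1962, this patient would be selected by the predicate [born >= 1950].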

Substitution Groups

The motivation for substitution groups originates in the area of object-oriented design. Assume that a schema includes the following global element declarations:

<xs:element name="name" type="xs:string" />
<xs:element name="surname" type="xs:string" substitutionGroup="name" />

Then the element <surname> may replace <name> in any context where <name> would have been validated against the element declaration shown above. In general, the type of the substituting element must be the same as, or derived from, the type of the substituted element.
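
To make this concrete, the following sketch assumes a hypothetical person element that refers to the global name element:

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <!-- reference to the head of the substitution group -->
      <xs:element ref="name"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

Under this assumption, both of the following instances should validate, because surname may appear wherever name is referenced:

<person>
  <name>Hillary</name>
</person>

<person>
  <surname>Hillary</surname>
</person>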

Identity Constraints

Identity constraints allow you to define unique constraints or keys that are checked within the scope of a single XML document that is validated against the schema.

As an example, we consider a schema that describes a company and its employees:

<xs:element name="company">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element ref="address"/>
      <xs:element ref="employee" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="employee">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="boss" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

We assume that each employee has a unique name. We further assume that, as a rule, an employee has a boss who is also an employee (of the same company). These constraints can be described by the following extension of the element declaration shown above:

<xs:element name="company">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element ref="address"/>
      <xs:element ref="employee" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:key name="emplName">
    <xs:selector xpath="employee"/>
    <xs:field xpath="name"/>
  </xs:key>

  <xs:keyref name="emplNameRef" refer="emplName">
    <xs:selector xpath="employee"/>
    <xs:field xpath="boss"/>
  </xs:keyref>

</xs:element>

The <xs:key> constraint asserts that:

  • Each employee's name is unique within the company;

  • Each employee has a name;

  • An employee cannot have more than one name.

If <xs:unique> is used instead of <xs:key>, the second condition is not enforced by the identity constraint. Based on the extended declaration of the company element, the employee:

<employee>
  <name>Bob Smith</name>
  <boss>Alex Miller</boss>
</employee>

is only valid if:

  • There is no other employee named Bob Smith; and

  • There is another employee named Alex Miller.
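
As a sketch of both conditions, the following company instance should satisfy the key and the keyref. The company name and the phone number are invented, and the address content assumes the address declaration with the phone alternative shown earlier in this chapter:

<company>
  <name>Everest Expeditions</name>
  <address>
    <phone>32168</phone>
  </address>
  <employee>
    <name>Alex Miller</name>
  </employee>
  <employee>
    <name>Bob Smith</name>
    <boss>Alex Miller</boss>
  </employee>
</company>

Alex Miller has no boss element, so the keyref imposes no condition on that employee. Changing Bob Smith's boss to a name that no employee carries, or adding a second employee named Bob Smith, would make the document invalid.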