UML (Unified Modeling Language) is a popular object-oriented modeling method. Because it has been submitted as an ISO standard, we also discuss it here in the context of modeling for XML.
Most commercial CASE tools that support UML, such as Rational Rose or TogetherSoft, also support importing and exporting XML DTDs and/or XML Schema. In the simplest case, an existing DTD or XML Schema is imported into the CASE tool, resulting in a number of UML classes that represent the different nodes of the XML document. Side effects of this functionality are the ability to convert from DTD to XML Schema and vice versa, and to generate a Java-based access layer for a given document type.
However, you should not misinterpret this technique as "conceptual" modeling: it results in a model of an implementation object. Generating XML schemas from a conceptual model is somewhat more demanding. In this chapter, we discuss how this can be achieved with relatively simple means.
What we should not expect in this context, however, is a complete solution that supports round-trip engineering. UML was developed with object-oriented implementation and design methods in mind. We should therefore expect (and tolerate) a slight impedance mismatch between UML and XML.
One way to generate code with a CASE tool is to write production rules for the tool's code generator. However, this is a proprietary approach, and we would have to demonstrate different solutions for each CASE tool on the market.
We therefore choose a method that can be applied to most CASE tools. Practically all CASE tools on the market support the exporting of metadata to XMI (XML Metadata Interchange). XMI is an XML-based standard for the exchange of modeling data between different design and development tools. It can capture virtually all information within a UML model.
In the context of this tutorial we use Poseidon for UML (the Community Edition is freeware, available from http://www.gentleware.com/), a commercialized version of ArgoUML, as our CASE tool. We define our jazz example in UML, then export it to XMI, and finally convert the resulting XMI into XML Schema with the help of an XSLT stylesheet.
Here are the mapping rules to cast an asset-oriented model onto UML:
We decorate all identifying assets of business objects with the stereotype entity. This allows us to generate arcs leading to these assets differently (as these arcs lead to separate documents).
We use qualified names for all assets (that is, names with namespace prefixes). Because the colon is not a valid name character in most programming languages, we replace it with an underscore.
Since UML is an object-oriented technology, it does not have a native concept of primary keys. It is conventional to decorate primary keys with the stereotype primaryKey.
We represent the arcs of our conceptual model as unnamed UML associations. If required, we can decorate the source end of an association with a role name and the target end with a cardinality constraint.
The exceptions to this rule are the arcs that are decorated with an is_a label. These are represented as UML generalization/specialization relationships. Because UML allows multiple inheritance but XML Schema does not, the conversion process must resolve such inheritance relations.
UML attribute specifications can include a type and an initial value. Other XML Schema-specific constraints, such as minOccurs, maxOccurs, form, maxLength, length, totalDigits, fractionDigits and enumeration, have no specific equivalent in UML but can be specified as tagged values (which we name appropriately xs_minOccurs, xs_maxOccurs, etc.). Similarly, a tagged value xs_fixed=true can be used to determine if the initial value shall be regarded as a fixed value or as a default value.
We can use Java-based datatypes for attributes. These are already built into the modeler and can be mapped automatically onto XML Schema datatypes during the conversion process. We can also use the built-in datatypes of XML Schema directly, but we have to declare them explicitly in UML. We do this by defining classes such as xs_NMTOKEN or xs_ID and decorating them with stereotype type. We also introduce a pseudo datatype xs_any to indicate wildcards.
UML does not support complex attribute definitions. Instead, we have to resolve complex properties. We have two options: (1) represent a complex property as an explicit aggregation, or (2) define a separate datatype for a complex property. In this example, we opt for the latter. For example, we introduce the datatypes tPerformedAt for performedAt(location&time), tPeriod for period(from,to) and tName for name(first,middle?,last).
Alternatives (choice groups) require extra care. In UML we model them as a datatype generalization. For example, the property (performedAt(location&time)|period(from,to)) in asset collaboration is modeled as an element (which we call collaborationContext) with a type that is a generalization of the datatypes tPerformedAt and tPeriod.
Clusters are also represented as generalizations. To represent, for example, the cluster containing all the instruments, we introduce a generalized class instrument. Because we do not want this class to appear in the final schema, we define it as an abstract class. Similarly, we introduce a generalized class representing all classes that are subject to reviews, such as jazz musicians and albums.
By default, we assume an ordered sequence for the attributes of a UML class and therefore generate an xs:sequence connector. If we want an unordered sequence (resulting in an xs:all connector), we indicate this by attaching the tagged value xs_ordered=false to the respective UML class.
Similarly, we attach the tagged value xs_mixed=true if a class shall contain mixed content.
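To make the tagged-value conventions more concrete, the following Python fragment sketches how a converter might translate them into XML Schema constructs. It is a minimal illustration under stated assumptions, not the stylesheet used in this chapter: the UmlAttribute and UmlClass records, the function names and the small Java-to-XML Schema type table are all hypothetical, and facets such as maxLength or enumeration are left out.

```python
from dataclasses import dataclass, field
from typing import Optional
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Assumed mapping for illustration; a real converter would cover more types.
JAVA_TO_XS = {"String": "xs:string", "int": "xs:int",
              "boolean": "xs:boolean", "double": "xs:double"}


@dataclass
class UmlAttribute:            # hypothetical record for one UML attribute
    name: str
    type: str
    initial: Optional[str] = None
    tags: dict = field(default_factory=dict)


@dataclass
class UmlClass:                # hypothetical record for one UML class
    name: str
    attributes: list
    tags: dict = field(default_factory=dict)


def xs_type(uml_type):
    """Map a Java type or an xs_-prefixed UML class name to a schema type."""
    if uml_type in JAVA_TO_XS:
        return JAVA_TO_XS[uml_type]
    if uml_type.startswith("xs_"):
        return "xs:" + uml_type[3:]      # underscore back to colon
    return uml_type                      # user-defined type such as tName


def element_for(attr):
    """Build an xs:element from a UML attribute and its tagged values."""
    el = ET.Element(f"{{{XS}}}element",
                    {"name": attr.name, "type": xs_type(attr.type)})
    for tag in ("minOccurs", "maxOccurs", "form"):
        value = attr.tags.get("xs_" + tag)
        if value is not None:
            el.set(tag, value)
    if attr.initial is not None:
        # xs_fixed=true turns the initial value into a fixed value.
        key = "fixed" if attr.tags.get("xs_fixed") == "true" else "default"
        el.set(key, attr.initial)
    return el


def complex_type_for(cls):
    """Build an xs:complexType, honoring xs_ordered and xs_mixed."""
    ct = ET.Element(f"{{{XS}}}complexType", {"name": cls.name})
    if cls.tags.get("xs_mixed") == "true":
        ct.set("mixed", "true")
    connector = "all" if cls.tags.get("xs_ordered") == "false" else "sequence"
    group = ET.SubElement(ct, f"{{{XS}}}{connector}")
    for attr in cls.attributes:
        group.append(element_for(attr))
    return ct


# Usage: the tName datatype for name(first,middle?,last).
t_name = UmlClass("tName", [
    UmlAttribute("first", "String"),
    UmlAttribute("middle", "String", tags={"xs_minOccurs": "0"}),
    UmlAttribute("last", "String"),
])
print(ET.tostring(complex_type_for(t_name), encoding="unicode"))
```

Keeping the tagged values as plain strings mirrors how they would arrive from an exported model, so the sketch makes no assumption about how a particular tool types them.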
Applying these rules, we finally arrive at the following model:
Most UML tools provide a function to serialize a model into XMI format. Because XMI is XML based, it can be converted with the help of XSLT stylesheets into other formats such as XML Schema. An example of such a stylesheet can be found at http://www.aomodeling.org/.
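The idea behind this conversion can be sketched in a few lines. The fragment below uses Python's standard XML library in place of XSLT, purely to show the shape of the transformation; it is not the stylesheet mentioned above. The input is a heavily simplified, hypothetical XMI-like fragment: real exports are far more verbose, and their element names and namespace differ between tools and XMI versions, so the names used here are assumptions.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
UML = "org.omg.xmi.namespace.UML"   # namespace URI assumed for this sketch
ET.register_namespace("xs", XS)

# Hypothetical, simplified stand-in for an exported model.
XMI = """
<XMI xmlns:UML="org.omg.xmi.namespace.UML">
  <UML:Class name="instrument" isAbstract="true"/>
  <UML:Class name="tPeriod" isAbstract="false">
    <UML:Attribute name="from"/>
    <UML:Attribute name="to"/>
  </UML:Class>
</XMI>
"""


def xmi_to_schema(xmi_text):
    """Emit one xs:complexType per concrete class found in the model."""
    model = ET.fromstring(xmi_text.strip())
    schema = ET.Element(f"{{{XS}}}schema")
    for cls in model.iter(f"{{{UML}}}Class"):
        # Abstract classes (e.g. clusters) must not appear in the schema.
        if cls.get("isAbstract") == "true":
            continue
        ct = ET.SubElement(schema, f"{{{XS}}}complexType",
                           {"name": cls.get("name")})
        seq = ET.SubElement(ct, f"{{{XS}}}sequence")
        for attr in cls.iter(f"{{{UML}}}Attribute"):
            ET.SubElement(seq, f"{{{XS}}}element", {"name": attr.get("name")})
    return schema


print(ET.tostring(xmi_to_schema(XMI), encoding="unicode"))
```

An XSLT stylesheet would express the same traversal declaratively, with one template per model construct, which is why XSLT is the natural choice once the rule set grows.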