XML Schema: root element
Problem Overview
The following post asks how to indicate that an element is the root element in an XML schema:
I have followed the w3schools tutorial on XML Schema but something is still not clear.
Consider example schema 2 from https://www.w3schools.com/xml/schema_example.asp
(reproduced below for convenience). How does this code indicate that <shiporder>
is the root element? Isn't the example saying that all elements
are valid as root elements?
------------------ instance ----------------------------------
<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="shiporder.xsd">
<orderperson>John Smith</orderperson>
<shipto>
<name>Ola Nordmann</name>
<address>Langgt 23</address>
<city>4000 Stavanger</city>
<country>Norway</country>
</shipto>
<item>
<title>Empire Burlesque</title>
<note>Special Edition</note>
<quantity>1</quantity>
<price>10.90</price>
</item>
<item>
<title>Hide your heart</title>
<quantity>1</quantity>
<price>9.90</price>
</item>
</shiporder>
----------------------- schema ------------------------
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- definition of simple elements -->
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string"/>
<xs:element name="quantity" type="xs:positiveInteger"/>
<xs:element name="price" type="xs:decimal"/>
<!-- definition of attributes -->
<xs:attribute name="orderid" type="xs:string"/>
<!-- definition of complex elements -->
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element ref="name"/>
<xs:element ref="address"/>
<xs:element ref="city"/>
<xs:element ref="country"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="item">
<xs:complexType>
<xs:sequence>
<xs:element ref="title"/>
<xs:element ref="note" minOccurs="0"/>
<xs:element ref="quantity"/>
<xs:element ref="price"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element ref="orderperson"/>
<xs:element ref="shipto"/>
<xs:element ref="item" maxOccurs="unbounded"/>
</xs:sequence>
<xs:attribute ref="orderid" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
From my point of view an XML Schema should do two things:
- define what can occur inside each node
- define where each node can be placed
And it seems the example fails at the second point. Any suggestions?
Xml Solutions
Solution 1 - Xml
As far as I know, any globally defined element can be used as the root element, and XML Schema has no way to specify which element is supposed to be the root.
You can, however, work around this by designing your XML Schema so that it has only one globally defined element. Then only that element is valid as the root element.
An example of this can be found at W3Schools (under the heading Using Named Types). That example has only one globally defined element, and thus only one possible root element.
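To make the point about globally defined elements concrete, here is a minimal Python sketch (standard library only) that lists the xs:element declarations sitting directly under xs:schema, since those are the possible root elements. The two schemas are abbreviated, hypothetical stand-ins for the two styles discussed, and global_elements is just an illustrative helper name.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def global_elements(xsd_text):
    """Return the names of xs:element declarations that are direct children of xs:schema."""
    schema = ET.fromstring(xsd_text)
    # findall() with a plain tag only matches direct children, i.e. global declarations.
    return [el.get("name") for el in schema.findall(XS + "element")]

# Abbreviated, hypothetical schema in the style of the question (global elements + ref).
REF_STYLE = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="orderperson" type="xs:string"/>
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="orderperson"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Same content model, but the child element is declared locally.
LOCAL_STYLE = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

print(global_elements(REF_STYLE))    # ['orderperson', 'shiporder']
print(global_elements(LOCAL_STYLE))  # ['shiporder']
```

In the first style both names are candidate roots; in the second, only shiporder is.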
Solution 2 - Xml
Not everyone agrees with it, but the fact that XML Schema can't specify a root element is by design. The thinking is that if an <invoice>
is valid when it's the only thing in a document, then it is equally valid if it is contained in something else. The idea is that content should be reusable, and you shouldn't be allowed to prevent someone using valid content as part of something larger.
(The fact that ID and IDREF are scoped to a document rather goes against this policy; but then the language was designed by a rather large committee.)
Solution 3 - Xml
Yes, you are right. The XSD should be:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- definition of attributes -->
<xs:attribute name="orderid" type="xs:string"/>
<!-- definition of complex elements -->
<xs:complexType name="shiptoType">
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:element name="address" type="xs:string" />
<xs:element name="city" type="xs:string" />
<xs:element name="country" type="xs:string" />
</xs:sequence>
</xs:complexType>
<xs:complexType name="itemType">
<xs:sequence>
<xs:element name="title" type="xs:string" />
<xs:element name="note" minOccurs="0" type="xs:string" />
<xs:element name="quantity" type="xs:positiveInteger" />
<xs:element name="price" type="xs:decimal" />
</xs:sequence>
</xs:complexType>
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element name="orderperson" type="xs:string" />
<xs:element name="shipto" type="shiptoType"/>
<xs:element name="item" maxOccurs="unbounded" type="itemType"/>
</xs:sequence>
<xs:attribute ref="orderid" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
As you can see, there is now only one global xs:element, and it is the only one that can be a valid root element :)
Solution 4 - Xml
How does this code indicate that <shiporder> is the root element?
John, that schema simply defines all the elements, and any of them can be chosen as the root element. If you generate a sample XML document with a tool such as Altova XMLSpy, you will be asked to choose which element should be the root.
So any of those elements can be the root.
To prevent ambiguity, use one globally defined element.
Solution 5 - Xml
The disadvantage of having lots of global elements is that they can all be used as root elements for documents. The advantage is that you can then reference those elements when defining new types, which ensures that the namespace of the child elements matches that of the parent type.
I have changed from thinking there should be only one global element to thinking that every complex type should have a global element.
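One reading of the namespace remark above: a global element declaration always belongs to the schema's target namespace, while a locally declared element is unqualified unless form or elementFormDefault says otherwise. The Python sketch below (standard library only) illustrates that rule on a small hypothetical schema. The namespace urn:example:orders and the helper name instance_names are assumptions made for illustration, and the helper ignores cases such as the same local name being declared twice.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def instance_names(xsd_text):
    """Map each declared element name to the name an instance document must use."""
    schema = ET.fromstring(xsd_text)
    tns = schema.get("targetNamespace", "")
    default_form = schema.get("elementFormDefault", "unqualified")
    top_level = schema.findall(XS + "element")
    names = {}
    for decl in schema.iter(XS + "element"):
        name = decl.get("name")
        if name is None:
            continue  # ref="..." reuses a global declaration that is listed on its own
        # Global declarations always take the target namespace; local ones do so
        # only when form (or elementFormDefault) is "qualified".
        is_global = any(decl is g for g in top_level)
        qualified = is_global or decl.get("form", default_form) == "qualified"
        names[name] = "{%s}%s" % (tns, name) if qualified and tns else name
    return names

# Hypothetical schema: "item" is global and referenced, "orderperson" is local.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:o="urn:example:orders"
           targetNamespace="urn:example:orders">
  <xs:element name="item" type="xs:string"/>
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="orderperson" type="xs:string"/>
        <xs:element ref="o:item" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

for name, qname in instance_names(SCHEMA).items():
    print(name, "->", qname)
```

Here item and shiporder come out in the urn:example:orders namespace, while the local orderperson has no namespace, so the children of shiporder end up in two different namespaces unless they are all referenced global elements or the schema sets elementFormDefault="qualified".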
Solution 6 - Xml
Based on the example that you provided, it is possible to find the only root element.
You can get the list of global elements, then get the list of nested elements that are declared or referenced in a complexType under an xs:sequence node. The root element is the one that appears in the global list but not in the nested list.
I have done this using the XmlSchemaSet class in .NET. Here is the code snippet:
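// Requires: using System.Linq; using System.Xml.Schema;
// schemaSet is an XmlSchemaSet on which Compile() has already been called.
var localSchema = schemaSet.Schemas().OfType<XmlSchema>()
    .Where(x => !x.SourceUri.StartsWith("http"))
    .ToList();
// Global elements whose type is a complex type.
var globalComplexTypes = localSchema
    .SelectMany(x => x.Elements.Values.OfType<XmlSchemaElement>())
    .Where(x => x.ElementSchemaType is XmlSchemaComplexType)
    .ToList();
// Elements declared or referenced inside those complex types.
var nestedTypes = globalComplexTypes.Select(x => x.ElementSchemaType)
    .OfType<XmlSchemaComplexType>()
    .Select(x => x.ContentTypeParticle)
    .OfType<XmlSchemaGroupBase>()
    .SelectMany(x => x.GetNestedTypes())
    .ToList();
// Compare element names rather than type names: anonymous complex types have an empty type name.
var nestedNames = nestedTypes
    .Select(y => y.RefName.IsEmpty ? y.QualifiedName : y.RefName)
    .ToList();
// The root is the global element that no other element contains.
var rootElement = globalComplexTypes.Single(x => !nestedNames.Contains(x.QualifiedName));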
The extension method GetNestedTypes:
static IEnumerable<XmlSchemaElement> GetNestedTypes(this XmlSchemaGroupBase xmlSchemaGroupBase)
{
    if (xmlSchemaGroupBase != null)
    {
        foreach (var xmlSchemaObject in xmlSchemaGroupBase.Items)
        {
            var element = xmlSchemaObject as XmlSchemaElement;
            if (element != null)
                yield return element;
            else
            {
                var group = xmlSchemaObject as XmlSchemaGroupBase;
                if (group != null)
                    foreach (var item in group.GetNestedTypes())
                        yield return item;
            }
        }
    }
}
However, this approach still has problems with schemas in general. For example, in DotNetConfig.xsd, which Visual Studio uses for configuration files, the root element is defined as follows:
<xs:element name="configuration">
<xs:complexType>
<xs:choice minOccurs="0" maxOccurs="unbounded">
<xs:any namespace="##any" processContents="lax" />
</xs:choice>
<xs:anyAttribute namespace="http://schemas.microsoft.com/XML-Document-Transform" processContents="strict"/>
</xs:complexType>
</xs:element>
I haven't found a complete solution that handles all kinds of schemas yet. I will keep working on it.
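As a rough illustration of why the heuristic breaks down, here is a Python sketch (standard library only, not the .NET code above) of the same idea: take the global elements and drop every one that is referenced via ref somewhere. It returns all remaining candidates instead of insisting on exactly one. The schemas are abbreviated and hypothetical (appSettings is an assumed second global element), and prefixed references are handled by simply dropping the prefix, which is a simplification.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def root_candidates(xsd_text):
    """Global elements that no xs:element ref="..." in the same schema points at."""
    schema = ET.fromstring(xsd_text)
    global_names = [el.get("name") for el in schema.findall(XS + "element")]
    # Simplification: drop any prefix from ref="p:name" instead of resolving it.
    referenced = {el.get("ref").split(":")[-1]
                  for el in schema.iter(XS + "element") if el.get("ref")}
    return [name for name in global_names if name not in referenced]

# Abbreviated schema in the style of the question: one unreferenced global element.
ORDER = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="item" type="xs:string"/>
  <xs:element name="shiporder">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="item" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Hypothetical schema whose root only contains a wildcard, so nothing is referenced.
WILDCARD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="configuration">
    <xs:complexType>
      <xs:choice minOccurs="0" maxOccurs="unbounded">
        <xs:any namespace="##any" processContents="lax"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <xs:element name="appSettings" type="xs:string"/>
</xs:schema>
"""

print(root_candidates(ORDER))     # ['shiporder']
print(root_candidates(WILDCARD))  # ['configuration', 'appSettings']
```

With the wildcard schema, every global element stays a candidate, which matches the schema's actual semantics: nothing in it singles out one root.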