Overview
In this chapter you will learn:
- What is an XML schema?
- What is a DTD, and what types of DTD are there?
- How is a document validated?
- What is an XML validator?
Document Type Definition
SGML formalizes the concept of a document type and provides for a separate file called a Document Type Definition (DTD), which identifies all the elements in a document and indicates the structural relationships among them.
Unlike its predecessor SGML, XML does not require a document type declaration (also known as the DOCTYPE declaration) in all circumstances: a document without one can still be well-formed, although it cannot be validated against a DTD. When a DTD is present, it contains the rules that the XML markup in the document must follow.
The syntax of a DOCTYPE declaration is:
| Syntax |
| <!DOCTYPE DTDname [options]> |
- DTDname is the name of the document type. It must be the same as the name of the document's root element.
- The options carry further specifications, for example an external identifier that indicates where an external DTD is located, or an internal subset of declarations enclosed in square brackets.
- A DTD can be declared inline in your XML document, or as an external reference.
Internal DTDs (also known as internal subset)
If the DTD is included in your XML source file, it must be wrapped in a DOCTYPE declaration, with the following syntax:
| Syntax |
| <!DOCTYPE Rootname [element declarations]> |
- When all the declarations are in the internal DTD subset, the XML processor does not need to read and process external documents.
- The first line of an XML document is the XML declaration, which can include a standalone document declaration:
<?xml version="1.0" standalone="yes"?>
The setting standalone="yes" means that no markup declarations external to the document entity affect the information passed from the XML processor to the application.
The keyword DOCTYPE must be in uppercase.
| Example |
<?xml version="1.0"?>
<!DOCTYPE Vehicle [
<!ELEMENT Vehicle (two-wheeler, three-wheeler, four-wheeler)>
<!ELEMENT two-wheeler (#PCDATA)>
<!ELEMENT three-wheeler (#PCDATA)>
<!ELEMENT four-wheeler (#PCDATA)>
]>
<Vehicle>
<two-wheeler>Bi-cycle</two-wheeler>
<three-wheeler>Auto-rickshaw</three-wheeler>
<four-wheeler>Car</four-wheeler>
</Vehicle>
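To see what validation adds, here is a sketch of a document that is well-formed but not valid against the same DTD. A validating parser (an XML validator) would be expected to report an error, because the content model of Vehicle requires exactly one two-wheeler, one three-wheeler, and one four-wheeler, in that order. A non-validating parser would accept the document without complaint.

```xml
<?xml version="1.0"?>
<!DOCTYPE Vehicle [
<!ELEMENT Vehicle (two-wheeler, three-wheeler, four-wheeler)>
<!ELEMENT two-wheeler (#PCDATA)>
<!ELEMENT three-wheeler (#PCDATA)>
<!ELEMENT four-wheeler (#PCDATA)>
]>
<!-- Well-formed, but NOT valid: three-wheeler is missing and
     the remaining children are in the wrong order. -->
<Vehicle>
<four-wheeler>Car</four-wheeler>
<two-wheeler>Bi-cycle</two-wheeler>
</Vehicle>
```

The tags are properly nested and closed, so the document is well-formed. It only fails when it is checked against the declarations in the DTD.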
External DTDs (also known as external subset)
The DTD portion of the document doesn't always have to be stored inside the related XML document. Instead, it can be saved in a file for reference by one document or by several different documents.
| Syntax |
| <!DOCTYPE Rootname SYSTEM "filename"> |
- The keyword SYSTEM must be in uppercase.
- An external DTD is intended for use with more than one XML document.
- For an external DTD, standalone is set to "no" in the XML declaration, which indicates that the external DTD must be processed as well as all internal declarations. If the standalone declaration is omitted and external markup declarations exist, "no" is assumed.
| Example |
A simple XML document :
<?xml version="1.0"?>
<page>
<head>
<title> Fruits </title>
</head>
<body>
<title> Select your favorite fruits </title>
</body>
</page>
Developing a DTD from the XML code:
<!DOCTYPE page
[
<!ELEMENT page (head, body)>
<!ELEMENT head (title)>
<!ELEMENT body (title)>
<!ELEMENT title (#PCDATA)>
]>
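The declarations for the page document can also be kept in a separate file to form an external DTD. As a minimal sketch, assuming the file is saved as page.dtd (a hypothetical name) in the same directory as the document, the DTD file would hold only the declarations, without a DOCTYPE wrapper:

```xml
<!-- page.dtd (hypothetical file name): declarations only -->
<!ELEMENT page (head, body)>
<!ELEMENT head (title)>
<!ELEMENT body (title)>
<!-- title must be declared too, or the document cannot be valid -->
<!ELEMENT title (#PCDATA)>
```

The XML document would then refer to that file with a SYSTEM identifier:

```xml
<?xml version="1.0" standalone="no"?>
<!DOCTYPE page SYSTEM "page.dtd">
<page>
<head>
<title> Fruits </title>
</head>
<body>
<title> Select your favorite fruits </title>
</body>
</page>
```

Any number of documents can point at the same page.dtd, which is the main reason for keeping the DTD external.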
Although DTDs are valuable and effective tools for defining document types, they have several drawbacks:
- DTDs have their own syntax, which differs from the syntax of XML documents, so a DTD cannot be read or transformed as an ordinary XML document. It would be better, and make learning much easier, if the tools used to process XML documents could also be used to process their document models.
- DTDs have limited ability to describe the data in elements and attributes. For example, you can't indicate that character data should be a number, a date, or a currency amount.
- DTDs have limited support for namespaces, and they can't define or restrict the content of an element according to its context. (Namespace declarations, which are special attributes, prevent name collisions. However, they must themselves be declared as attributes in the respective DTDs.)
These drawbacks led to efforts to create a different way of describing XML data: schema languages, which are used to create schemas, that is, descriptions of the data in an XML file. The important thing about these schemas is that the description of the data can itself be written in XML.
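The data typing limitation is easiest to see side by side. The following is a minimal sketch in W3C XML Schema syntax. The element names order, quantity, orderDate, and price are hypothetical and are used only for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- In a DTD all three children could only be declared as (#PCDATA) -->
        <xs:element name="quantity" type="xs:positiveInteger"/>
        <xs:element name="orderDate" type="xs:date"/>
        <xs:element name="price" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With such a schema, a validator would be expected to accept an orderDate of 2024-01-31 and reject one written as 31 January, a distinction a DTD cannot express.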
What is XML Schema?
A schema is a definition of the syntax of an XML-based language; that is, it defines a class of XML documents. A schema language is a formal language for expressing schemas. Schemas are composed of declarations for concepts and classes of objects, with class hierarchies, properties, constraints, and relationships. Like a DTD, a schema is a model for describing the structure and content of data. However, XML Schema was developed as a content modeling language that is an application of XML, not of SGML, so it pertains only to XML and XML-related languages.
Schemas define the elements that can appear in an XML document and the attributes that can be associated with those elements.
- Schemas define the document's structure, that is:
  - which elements are children of others,
  - the order in which the child elements can appear,
  - the number of child elements.
- Schemas specify whether an element is empty or can include text.
- Schemas can also specify default values for attributes.
- Schemas are more powerful and flexible than DTDs and use XML syntax.
- Schemas support scoped (local) definitions.
- Schema standards are defined by the World Wide Web Consortium (W3C). The W3C site provides a comprehensive reference for XML schemas.
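Several of the points above can be illustrated in one short sketch written in W3C XML Schema syntax. The names playlist, title, track, and format are hypothetical and chosen only for this illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="playlist">
    <xs:complexType>
      <!-- Order: title comes first, then the tracks.
           Number: exactly one title, one or more tracks. -->
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="track" maxOccurs="unbounded">
          <!-- An empty element: no text and no children, only an
               attribute with a default value. -->
          <xs:complexType>
            <xs:attribute name="format" type="xs:string" default="mp3"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because title and track are declared inside playlist, they are local to it, which is one way to read the scoped definitions mentioned in the list.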
| Example |
A Simple XML file :
<?xml version="1.0"?>
<Colors>
<First>
RED
</First>
<Second>
PINK
</Second>
<Body>
Welcome to the world of colors
</Body>
</Colors>
A schema that defines the Colors data type, written in the early XML-Data Reduced (XDR) style with Schema and ElementType elements:
<?xml version="1.0"?>
<Schema name="ColorsDef"
xmlns="http://www.xyz.com/XMLSchema">
<ElementType name="First" content="textOnly"/>
<ElementType name="Second" content="textOnly"/>
<ElementType name="Body" content="textOnly"/>
<ElementType name="Colors" content="eltOnly">
<element type="First"/>
<element type="Second"/>
<element type="Body"/>
</ElementType>
</Schema>
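The Schema and ElementType vocabulary used above predates the W3C XML Schema recommendation. For comparison, here is a hedged sketch of roughly the same model in W3C XML Schema (XSD) syntax, assuming that First, Second, and Body hold plain text and must appear once each in the listed order:

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Colors contains only child elements, in a fixed order -->
  <xs:element name="Colors">
    <xs:complexType>
      <xs:sequence>
        <!-- Each child holds text only -->
        <xs:element name="First" type="xs:string"/>
        <xs:element name="Second" type="xs:string"/>
        <xs:element name="Body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The xs:sequence element plays the role of content="eltOnly" with an ordered list of children, and type="xs:string" plays the role of content="textOnly".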