4. Defining Web Document Types


  1. Document Types
  2. Document Type Definitions (DTDs)
  3. Valid XML
  4. DTD syntax
  5. Examples of DTD element declarations
  6. DTD syntax
  7. DTD for RSS
  8. Validation of XML Documents
  9. Referencing a DTD
  10. Declaring an Internal DTD
  11. Declaring an External DTD (1)
  12. Declaring an External DTD (2)
  13. DTD for CD Example
  14. Attributes
  15. Some Attribute Types
  16. Attribute Defaults
  17. Example Declaring HTML Attributes
  18. Mixed Content Models
  19. Some exercises
  20. Entities
  21. General Entities
  22. Parameter Entities
  23. Limitations of DTDs
  24. JSON Schema
  25. Example Product Schema (1)
  26. Example Product Schema (2)
  27. Example Product Schema (3)
  28. Some Other JSON Schema Features
  29. Exercises
  30. Links to more information

4.1. Document Types

4.2. Document Type Definitions (DTDs)

4.3. Valid XML

[Figure: an XML parser checking that a document is valid]

4.4. DTD syntax

4.5. Examples of DTD element declarations

4.6. DTD syntax

DTD Syntax   Meaning
b            element b must occur
b,c          both b and c must occur, in the order specified
b|c          one (and only one) of b or c must occur
b*           b may occur zero or more times
b+           b must occur one or more times
b?           b may occur at most once (zero or one occurrence)
EMPTY        no element content is allowed
ANY          any content (of declared elements and text) is allowed
#PCDATA      content is text rather than an element

#PCDATA stands for "parsed character data", meaning an XML parser should parse the characters to resolve character and entity references.
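
The operators in the table behave much like regular-expression operators applied to the sequence of child elements. The following is a minimal Python sketch of that correspondence, not a DTD validator: the helper names model_to_regex and matches are made up for illustration, and the sketch only handles content models built from element names, so EMPTY, ANY and #PCDATA are out of scope.

```python
import re

def model_to_regex(model):
    """Translate a DTD content model such as "(b, c)" into a regex.

    The regex is matched against child element names, each followed
    by a single space, e.g. "b c " for the children b then c.
    Assumption: the model contains only element names and the
    operators ( ) , | * + ? (no EMPTY, ANY or #PCDATA).
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[(),|*+?]", model):
        if token == ",":
            continue                      # a sequence is plain concatenation
        if token == "(":
            parts.append("(?:")           # non-capturing group
        elif token in (")", "|", "*", "+", "?"):
            parts.append(token)           # same meaning in a regex
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))

def matches(model, children):
    """Return True if the list of child element names fits the model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None

print(matches("(b, c)", ["b", "c"]))    # children in the declared order
print(matches("(b, c)", ["c", "b"]))    # wrong order
print(matches("(b | c)", ["b", "c"]))   # only one of b or c is allowed
print(matches("(b+)", []))              # at least one b is required
```

Each name is followed by a space so that an element called b cannot be mistaken for the start of a longer name such as bc.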

4.7. DTD for RSS

4.8. Validation of XML Documents

4.9. Referencing a DTD

4.10. Declaring an Internal DTD

<?xml version="1.0"?>
<!DOCTYPE rss [
    <!-- all declarations for rss DTD go here -->
    ...
    <!ELEMENT rss ... >
    ...
]>
<rss>
   <!-- This is an instance of a document of type rss -->
   ...
</rss>

4.11. Declaring an External DTD (1)

<?xml version="1.0"?>
<!DOCTYPE rss SYSTEM "rss.dtd">
<rss>
   <!-- This is an instance of a document of type rss -->
   ...
</rss>

4.12. Declaring an External DTD (2)

<?xml version="1.0"?>
<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"
     "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">
<math>
   <!-- This is an instance of a mathML document type -->
   ...
</math>

Formal public identifiers are meant for widely used entities and should be unique worldwide. Processing software might either come with such entities already installed or know the most efficient sites from which to download them. If not, the URI is used to retrieve the DTD.
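
The lookup order described here (try the public identifier first, then fall back to the URI) can be sketched in Python using the standard library's xml.dom.minidom, which exposes the identifiers from the DOCTYPE declaration but does not itself fetch or validate against the DTD. The catalog dictionary, its local file path and the function name locate_dtd are assumptions made for illustration only.

```python
from xml.dom import minidom

# Hypothetical catalog: public identifier -> locally installed DTD file.
# The path below is an assumed example, not a standard location.
LOCAL_CATALOG = {
    "-//W3C//DTD MathML 2.0//EN": "/usr/share/xml/mathml2.dtd",
}

def locate_dtd(xml_text):
    """Return where a processor might look for the document's DTD."""
    doctype = minidom.parseString(xml_text).doctype
    if doctype is None:
        return None                              # no DOCTYPE declaration
    if doctype.publicId in LOCAL_CATALOG:
        return LOCAL_CATALOG[doctype.publicId]   # already installed locally
    return doctype.systemId                      # fall back to the URI

MATHML_DOC = """<?xml version="1.0"?>
<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"
     "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">
<math/>
"""

RSS_DOC = """<?xml version="1.0"?>
<!DOCTYPE rss SYSTEM "rss.dtd">
<rss/>
"""

print(locate_dtd(MATHML_DOC))   # resolved through the catalog
print(locate_dtd(RSS_DOC))      # no public identifier, so the system identifier
```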

4.13. DTD for CD Example

4.14. Attributes

4.15. Some Attribute Types

4.16. Attribute Defaults

4.17. Example Declaring HTML Attributes

4.18. Mixed Content Models

4.19. Some exercises

4.20. Entities

4.21. General Entities

4.22. Parameter Entities

4.23. Limitations of DTDs

4.24. JSON Schema

4.25. Example Product Schema (1)

{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Product",
    "description": "A product from Acme's catalog",
    "type": "object"
}

4.26. Example Product Schema (2)

{
    ...
    "type": "object",
    "properties": {
        "id": {
            "description": "The unique identifier for a product",
            "type": "integer"
        },
        "name": {
            "description": "The name of the product",
            "type": "string"
        }
    },
    "required": ["id", "name"]
}

4.27. Example Product Schema (3)

{
    ...
    "properties": {
        ...
        "price": {
            "type": "number",
            "exclusiveMinimum": 0
        },
        "tags": {
            "type": "array",
            "items": {
                "type": "string"
            },
            "minItems": 1,
            "uniqueItems": true
        }
    },
    "required": ["id", "name", "price"]
}
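
To make the effect of the keywords in the three Product schema fragments concrete, the following Python sketch checks a product by hand against the same constraints. It is an illustration under stated assumptions rather than a JSON Schema validator: the function name check_product is made up, and only the keywords shown in the fragments above are covered.

```python
def check_product(product):
    """Return a list of problems found in a product (empty if none).

    Hand-written equivalent of the Product schema fragments; a real
    validator would interpret the schema document itself.
    """
    errors = []

    # "required": ["id", "name", "price"]
    for key in ("id", "name", "price"):
        if key not in product:
            errors.append("missing required property: " + key)

    # "id": {"type": "integer"}
    # Simplification: only Python ints are accepted (bool is excluded).
    if "id" in product:
        value = product["id"]
        if not isinstance(value, int) or isinstance(value, bool):
            errors.append("id must be an integer")

    # "name": {"type": "string"}
    if "name" in product and not isinstance(product["name"], str):
        errors.append("name must be a string")

    # "price": {"type": "number", "exclusiveMinimum": 0}
    if "price" in product:
        value = product["price"]
        if not isinstance(value, (int, float)) or isinstance(value, bool):
            errors.append("price must be a number")
        elif value <= 0:
            errors.append("price must be greater than 0")

    # "tags": array of strings, "minItems": 1, "uniqueItems": true
    if "tags" in product:
        tags = product["tags"]
        if not isinstance(tags, list):
            errors.append("tags must be an array")
        else:
            if not all(isinstance(tag, str) for tag in tags):
                errors.append("every tag must be a string")
            elif len(set(tags)) != len(tags):
                errors.append("tags must be unique")
            if len(tags) < 1:
                errors.append("tags must contain at least one item")

    return errors

print(check_product({"id": 1, "name": "Widget", "price": 12.5, "tags": ["home"]}))
print(check_product({"id": 1, "name": "Widget", "price": 0}))
```

Note that tags is optional: it is constrained only when present, because it does not appear in the "required" list.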

4.28. Some Other JSON Schema Features

4.29. Exercises

  1. Write an XML DTD which will define the following structure for documents of type exam. An exam has a course code, a title and a date, which comprises only the month and year. These are followed by a list of questions. Exams consist of either 5 or 6 questions. Each question has one or more parts. Parts of questions can themselves comprise parts along with text.

    Give an instance of an exam document which is valid with respect to your DTD and two instances which are invalid, explaining why they are invalid. Check your answers using an online XML validator.


  2. Write an XML DTD for representing information about students on an MSc programme. All information should be represented using elements rather than attributes. The root element of the document is programme. A programme has a degree and a year. These elements are followed by the results for the programme. The results are partitioned into distinction, merit, pass and fail. Within each is a sequence of name elements, each containing the name of a person having achieved the corresponding result for the programme.


  3. Consider the information described above about an MSc programme, along with the sample JSON data you produced for the corresponding exercise at the end of the Web Languages section. Write a suitable JSON schema for such data and check that your sample data validates against the schema. You can use an online schema validator to perform this check.

4.30. Links to more information

DTDs are covered in Chapter 4 of [Moller and Schwartzbach] and briefly in Chapter 2 of [Jacobs].