circle element must have an r (radius) attribute
<!ELEMENT name (model) >
where
ELEMENT is a keyword
name is the element name being declared
model is the element content model (the allowed contents of the element)
html element must contain a head element followed by a body element:
<!ELEMENT html (head, body) >
where "," is the sequence (or concatenation) operator
list element (not in HTML) must contain either a ul element or an ol element (but not both):
<!ELEMENT list (ul|ol) >
where "|" is the alternation (or "exclusive or") operator
ul element must contain zero or more li elements:
<!ELEMENT ul (li)* >
where "*" is the repetition (or "Kleene star") operator
DTD Syntax | Meaning |
---|---|
b | element b must occur |
b,c | both b and c must occur, in the order specified |
b\|c | one (and only one) of b or c must occur |
b* | b may occur zero or more times |
b+ | b must occur one or more times |
b? | b may occur at most once (zero or one time) |
EMPTY | no element content is allowed |
ANY | any content (of declared elements and text) is allowed |
#PCDATA | content is text rather than an element |
b and c can themselves be content models, e.g., (a,b)*
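b+ is short for (b,b*)
b? is short for (b|EMPTY), where EMPTY informally stands for the empty sequence (in actual DTD syntax, EMPTY cannot appear inside a group)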
#PCDATA stands for "parsed character data", meaning an XML parser should parse the characters to resolve character and entity references.
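Since the operators above are those of regular expressions, a content model can be checked with an ordinary regex engine. The following is a minimal sketch, assuming element-only content models (no #PCDATA, EMPTY or ANY); the function names model_to_regex and allows are hypothetical and not part of any XML library.

```python
import re

def model_to_regex(model):
    """Translate a DTD element-content model into a compiled Python regex.

    Every element name becomes the token 'name ' (the name plus one space),
    so a list of child elements can be matched as a single string.
    """
    parts = []
    # The tokenizer skips "," and white-space: sequence is plain concatenation.
    for token in re.findall(r"[A-Za-z_][\w.-]*|[()|*+?]", model):
        if token == "(":
            parts.append("(?:")          # group without capturing
        elif token in ")|*+?":
            parts.append(token)          # same meaning in both notations
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))

def allows(model, children):
    """Return True if the list of child element names satisfies the model."""
    text = "".join(name + " " for name in children)
    return model_to_regex(model).fullmatch(text) is not None

print(allows("(head, body)", ["head", "body"]))   # True
print(allows("(ul|ol)", ["ul", "ol"]))            # False
print(allows("(li)*", []))                        # True
```

A validating parser works differently internally, but the accepted sequences are the same for models of this kind.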
<!ELEMENT rss (channel) >
<!ELEMENT channel (title,link,description,item+) >
<!ELEMENT item (title,description,link,pubDate?) >
<!ELEMENT title (#PCDATA) >
<!ELEMENT link (#PCDATA) >
<!ELEMENT description (#PCDATA) >
<!ELEMENT pubDate (#PCDATA) >
rss-fragment-dtd-invalid.xml giving results
rss-fragment-dtd-valid.xml giving results
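To make the difference between a valid and an invalid document concrete, here is a rough sketch of checking a document against the rss DTD above. It is not a real DTD validator: the content models are translated by hand into regular expressions, text inside element-only content is not checked, and the two sample documents are made up for illustration (they are not the contents of the files named above).

```python
import re
import xml.etree.ElementTree as ET

# Content models of the rss DTD, hand-translated to regexes over
# space-terminated child names; None marks a (#PCDATA) element.
MODELS = {
    "rss": r"channel ",
    "channel": r"title link description (item )+",
    "item": r"title description link (pubDate )?",
    "title": None, "link": None, "description": None, "pubDate": None,
}

def validate(elem, errors=None):
    """Collect content-model violations for elem and all its descendants."""
    errors = [] if errors is None else errors
    if elem.tag not in MODELS:
        errors.append("undeclared element: " + elem.tag)
        return errors
    children = "".join(child.tag + " " for child in elem)
    model = MODELS[elem.tag]
    if model is None:
        if len(elem):
            errors.append(elem.tag + ": only text allowed")
    elif re.fullmatch(model, children) is None:
        errors.append(elem.tag + ": bad content [" + children.strip() + "]")
    for child in elem:
        validate(child, errors)
    return errors

HEAD = "<rss><channel><title>T</title><link>L</link><description>D</description>"
VALID = HEAD + ("<item><title>A</title><description>B</description>"
                "<link>C</link></item></channel></rss>")
INVALID = HEAD + ("<item><title>A</title><link>C</link>"
                  "<description>B</description></item></channel></rss>")

print(validate(ET.fromstring(VALID)))     # []
print(validate(ET.fromstring(INVALID)))   # ['item: bad content [title link description]']
```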
<!DOCTYPE rss ... >
<?xml version="1.0"?>
<!DOCTYPE rss [
  <!-- all declarations for rss DTD go here -->
  ...
  <!ELEMENT rss ... >
  ...
]>
<rss>
  <!-- This is an instance of a document of type rss -->
  ...
</rss>
element rss must be defined in the DTD
the name after DOCTYPE (i.e., rss) must match the root element of the document
<?xml version="1.0"?>
<!DOCTYPE rss SYSTEM "rss.dtd">
<rss>
  <!-- This is an instance of a document of type rss -->
  ...
</rss>
what follows SYSTEM is a URI
rss.dtd is a relative URI, assumed to be in the same directory as the source document
<?xml version="1.0"?>
<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"
  "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">
<math>
  <!-- This is an instance of a mathML document type -->
  ...
</math>
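PUBLIC means what follows is a formal public identifier with 4 fields, separated by //:
registration: ISO for ISO standard, + for approval by other standards body, and - for everything else
owner of the DTD: here W3C
text class and description: here DTD MathML 2.0
language: here EN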
Formal public identifiers are meant for widely used entities. They should be unique world-wide. Processing software might either come with such entities already installed or it might know the most efficient sites from which to download them. If not, the URI is used to retrieve the DTD.
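The parts of a document type declaration can be inspected with Python's standard expat bindings. The sketch below (the helper name doctype_info is hypothetical) reads the DOCTYPE name, the public identifier and the system identifier, compares the name with the root element, and splits the formal public identifier into its four fields. As used here, expat does not fetch the external DTD, so no network access is involved.

```python
import xml.parsers.expat

def doctype_info(xml_text):
    """Return the DOCTYPE name, identifiers and root element of a document."""
    info = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        info["doctype"] = name
        info["system"] = system_id
        info["public"] = public_id

    def on_start(tag, attrs):
        info.setdefault("root", tag)     # only the first start tag is the root

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return info

doc = ('<?xml version="1.0"?>\n'
       '<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"\n'
       '  "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">\n'
       '<math></math>')
info = doctype_info(doc)
print(info["doctype"] == info["root"])   # True
print(info["public"].split("//"))        # ['-', 'W3C', 'DTD MathML 2.0', 'EN']
```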
CD element contains a composer followed by one or more performance elements
performance element must contain composition and date elements
between these there may be an optional soloist element, followed by either both an orchestra and conductor or neither of them
declarations for the CD and performance elements might be as follows:
<!ELEMENT CD (composer, (performance)+)>
<!ELEMENT performance (composition, (soloist)?, (orchestra, conductor)?, date)>
href="file.html" in an HTML a start tag
for the rss and guid elements, these might be
<!ATTLIST rss version CDATA #FIXED "2.0" >
<!ATTLIST guid isPermaLink (true|false) "true" >
version is an attribute name, CDATA is its type, and "true" is a default value
CDATA: any valid character data
ID: an identifier unique within the document
IDREF: a reference to a unique identifier
IDREFS: a reference to several unique identifiers (separated by white-space)
(a|b|c), e.g.: (enumerated attribute type) possible values are one of a, b or c
#IMPLIED: attribute may be omitted (optional)
#REQUIRED: attribute must be present
#FIXED "x", e.g.: attribute optional; if present, value must be x
"x", e.g.: value will be x if attribute is omitted
IMG element is empty and must have src and alt attributes
it may also have height and width attributes
<!ELEMENT IMG EMPTY>
<!ATTLIST IMG src    CDATA #REQUIRED
              alt    CDATA #REQUIRED
              height CDATA #IMPLIED
              width  CDATA #IMPLIED >
FORM element has an optional method attribute
its value is GET or POST, with default value GET:
<!ATTLIST FORM method (GET|POST) "GET">
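How the attribute defaults behave can be sketched in a few lines. This is an illustration only, assuming a hand-built table in place of parsed ATTLIST declarations; the table layout, the "DEFAULT" marker and the function name apply_attlist are hypothetical. A #FIXED attribute that is omitted is filled in with its fixed value, as a validating processor would do.

```python
# Hypothetical in-memory form of the ATTLIST declarations discussed above:
# element -> attribute -> (enumerated values or None, default kind, value).
ATTLIST = {
    "IMG": {
        "src": (None, "#REQUIRED", None),
        "alt": (None, "#REQUIRED", None),
        "height": (None, "#IMPLIED", None),
        "width": (None, "#IMPLIED", None),
    },
    "FORM": {"method": (("GET", "POST"), "DEFAULT", "GET")},
    "rss": {"version": (None, "#FIXED", "2.0")},
}

def apply_attlist(element, attrs):
    """Return (attributes with defaults filled in, list of error messages)."""
    result, errors = dict(attrs), []
    declared = ATTLIST[element]
    for name in attrs:
        if name not in declared:
            errors.append(name + ": not declared")
    for name, (allowed, kind, value) in declared.items():
        if name not in result:
            if kind == "#REQUIRED":
                errors.append(name + ": required")
            elif kind in ("#FIXED", "DEFAULT"):
                result[name] = value     # the processor supplies the value
            continue
        if allowed is not None and result[name] not in allowed:
            errors.append(name + ": must be one of " + "|".join(allowed))
        if kind == "#FIXED" and result[name] != value:
            errors.append(name + ": must be " + value)
    return result, errors

print(apply_attlist("FORM", {}))                  # ({'method': 'GET'}, [])
print(apply_attlist("IMG", {"src": "a.png"}))     # ({'src': 'a.png'}, ['alt: required'])
print(apply_attlist("rss", {"version": "1.0"}))   # ({'version': '1.0'}, ['version: must be 2.0'])
```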
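in the rss DTD all content models comprise only elements or only text
HTML, by contrast, allows text mixed with elements such as em, img, b, etc.
to allow text together with em, img and b as contents of element p:
<!ELEMENT p (#PCDATA | em | img | b)* >
#PCDATA must be first (in the definition)
the alternatives must be separated by |
the alternatives must have * applied to them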
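Mixed content is visible in how a parser hands back the children of p. The sketch below uses Python's ElementTree, where text before the first child is in .text and text after each child is in that child's .tail; the helper names and the sample paragraph are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Elements allowed alongside text by (#PCDATA | em | img | b)*
ALLOWED_IN_P = {"em", "img", "b"}

def mixed_items(elem):
    """List the text runs and child elements of elem in document order."""
    items = []
    if elem.text:
        items.append(("text", elem.text))
    for child in elem:
        items.append(("element", child.tag))
        if child.tail:                   # text following this child
            items.append(("text", child.tail))
    return items

def valid_p(elem):
    """Any order and number of children is fine; only their names matter."""
    return all(child.tag in ALLOWED_IN_P for child in elem)

p = ET.fromstring("<p>Some <em>mixed</em> content<img/></p>")
print(mixed_items(p))
# [('text', 'Some '), ('element', 'em'), ('text', ' content'), ('element', 'img')]
print(valid_p(p))                                   # True
print(valid_p(ET.fromstring("<p><ul/></p>")))       # False
```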
Consider the content models (zero, one)* and (zero | one)*. Give an example of a sequence of elements allowed by one model but not by the other.
Consider the elements day, month and year. Produce a content model which allows each of the sequences
year
month year
day month year
but no others.
<!ENTITY BBK "Birkbeck, University of London">
&BBK; substitutes value of entity for its name in document
BBK is an example of a general entity
the predefined general entities are &amp; (&), &lt; (<), &gt; (>), &quot; ("), &apos; (')
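A quick way to see entity substitution is to parse a small document that declares BBK in its internal subset. This sketch uses Python's ElementTree, which expands internal general entities and the predefined ones while parsing; the note element is a made-up wrapper for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE note [
  <!ENTITY BBK "Birkbeck, University of London">
]>
<note>&BBK; &amp; friends: 1 &lt; 2</note>"""

note = ET.fromstring(doc)
# The application sees the replacement text, not the entity references.
print(note.text)   # Birkbeck, University of London & friends: 1 < 2
```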
<!ENTITY HTML-chapter SYSTEM "html.xml" >
&HTML-chapter; includes contents of file html.xml at point of reference
standalone="no" in XML declaration
parameter entities are declared with % between ENTITY and name, e.g.,
<!ENTITY % list "OL | UL" >
<!ENTITY % heading "H1 | H2 | H3 | H4 | H5 | H6" >
they are referenced using % and ; delimiters, e.g.,
<!ENTITY % block "P | %list; | %heading; | ..." >
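Parameter entities behave like textual macros inside the DTD. The following sketch mimics the substitution with a regular expression; it is a simplification (a real parser also pads replacement text with spaces in some contexts and rejects recursive references), and the function name expand is hypothetical.

```python
import re

# Matches a parameter-entity reference such as %list;
REFERENCE = re.compile(r"%([A-Za-z_][\w.-]*);")

def expand(text, entities):
    """Replace %name; references by their values until none remain.

    Assumes the declarations are not recursive, as XML requires.
    """
    while REFERENCE.search(text):
        text = REFERENCE.sub(lambda match: entities[match.group(1)], text)
    return text

entities = {
    "list": "OL | UL",
    "heading": "H1 | H2 | H3 | H4 | H5 | H6",
}
entities["block"] = "P | %list; | %heading;"   # the elided "..." part is left out

print(expand("(%block;)*", entities))
# (P | OL | UL | H1 | H2 | H3 | H4 | H5 | H6)*
```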
&
operator from SGML)
json-schema.org
)
json-schema.org
):
{
  "id": 1234,
  "name": "Bowers & Wilkins Zeppelin Wireless",
  "price": 499.50,
  "tags": ["airplay", "spotify", "bluetooth"]
}
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Product",
  "description": "A product from Acme's catalog",
  "type": "object"
}
$schema keyword specifies the vocabulary used
title and description keywords are descriptive only; they do not add constraints
type keyword defines the first constraint: it has to be a JSON object
id and name are always required
id is an integer, while name is a string
{
  ...
  "type": "object",
  "properties": {
    "id": {
      "description": "The unique identifier for a product",
      "type": "integer"
    },
    "name": {
      "description": "The name of the product",
      "type": "string"
    }
  },
  "required": ["id", "name"]
}
price is required and must be a positive number
tags is an array of strings, where there must be at least one tag and all tags must be unique
{
  ...
  "properties": {
    ...
    "price": {
      "type": "number",
      "exclusiveMinimum": 0
    },
    "tags": {
      "type": "array",
      "items": { "type": "string" },
      "minItems": 1,
      "uniqueItems": true
    }
  },
  "required": ["id", "name", "price"]
}
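To see what these constraints mean operationally, here is a hand-rolled checker for just the keywords used so far. It is a sketch, not a JSON Schema implementation: it ignores $schema, title and description, treats "integer" as a Python int only, and the function name check and the error wording are made up for illustration.

```python
import json

SCHEMA = json.loads("""
{
  "type": "object",
  "properties": {
    "id": {"type": "integer"},
    "name": {"type": "string"},
    "price": {"type": "number", "exclusiveMinimum": 0},
    "tags": {"type": "array", "items": {"type": "string"},
             "minItems": 1, "uniqueItems": true}
  },
  "required": ["id", "name", "price"]
}
""")

# JSON type names mapped to Python types (simplified: 1.0 is not an integer).
TYPES = {"object": dict, "array": list, "string": str,
         "integer": int, "number": (int, float)}

def check(value, schema, path="$"):
    """Return a list of violations; an empty list means the value conforms."""
    kind = schema.get("type")
    if kind is not None:
        if isinstance(value, bool) or not isinstance(value, TYPES[kind]):
            return [path + ": expected " + kind]
    errors = []
    if "exclusiveMinimum" in schema and isinstance(value, (int, float)):
        if not value > schema["exclusiveMinimum"]:
            errors.append(path + ": must be greater than "
                          + str(schema["exclusiveMinimum"]))
    if isinstance(value, dict):
        for name in schema.get("required", []):
            if name not in value:
                errors.append(path + ": missing required property " + name)
        for name, subschema in schema.get("properties", {}).items():
            if name in value:
                errors += check(value[name], subschema, path + "." + name)
    if isinstance(value, list):
        if len(value) < schema.get("minItems", 0):
            errors.append(path + ": too few items")
        if schema.get("uniqueItems") and \
                len({json.dumps(item) for item in value}) != len(value):
            errors.append(path + ": items are not unique")
        for index, item in enumerate(value):
            errors += check(item, schema.get("items", {}),
                            path + "[" + str(index) + "]")
    return errors

product = {"id": 1234, "name": "Bowers & Wilkins Zeppelin Wireless",
           "price": 499.50, "tags": ["airplay", "spotify", "bluetooth"]}
print(check(product, SCHEMA))                    # []
print(check({"id": 1, "name": "n"}, SCHEMA))     # ['$: missing required property price']
```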
maximum and minimum values for numbers
maxLength, minLength and pattern (regular expression) values for strings
"anyOf": [ ... ] specifies that at least one alternative should match
"oneOf": [ ... ] specifies that exactly one alternative should match
"allOf": [ ... ] specifies that all alternatives should match
"additionalProperties": false specifies that no other properties are allowed
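The difference between anyOf, oneOf and allOf comes down to counting how many alternatives match. A minimal sketch, with each alternative modelled as a Python predicate rather than a subschema (the function names are hypothetical):

```python
def count_matches(value, alternatives):
    """Number of alternatives (predicates) that accept the value."""
    return sum(1 for matches in alternatives if matches(value))

def any_of(value, alternatives):
    return count_matches(value, alternatives) >= 1

def one_of(value, alternatives):
    return count_matches(value, alternatives) == 1

def all_of(value, alternatives):
    return count_matches(value, alternatives) == len(alternatives)

def multiple_of_3(n):
    return n % 3 == 0

def multiple_of_5(n):
    return n % 5 == 0

alternatives = [multiple_of_3, multiple_of_5]
for n in (9, 15, 7):
    print(n, any_of(n, alternatives), one_of(n, alternatives), all_of(n, alternatives))
# 9 True True False
# 15 True False True
# 7 False False False
```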
exam. An exam has a course code, a title and a date, which comprises only the month and year. These are followed by a list of questions. Exams consist of either 5 or 6 questions. Each question has one or more parts. Parts of questions can themselves comprise parts along with text.
programme. A programme has a degree and a year. These elements are followed by the results for the programme. The results are partitioned into distinction, merit, pass and fail. Within each is a sequence of name elements, each containing the name of a person having achieved the corresponding result for the programme.
www.w3.org/TR/REC-xml.html
www.w3schools.com/dtd
www.xmlvalidation.com
en.wikipedia.org/wiki/Chomsky_hierarchy
json-schema.org
DTDs are covered in Chapter 4 of [Moller and Schwartzbach] and briefly in Chapter 2 of [Jacobs].