circle element must have an r (radius) attribute
<hr> must appear as <hr/> (or <hr /> to fool old browsers)
<!ELEMENT name (model) >
where
ELEMENT is a keyword
name is the element name being declared
model is the element content model (the allowed contents of the element)
html element must contain a head element followed by a body element:
<!ELEMENT html (head, body) >
where "," is the sequence (or concatenation) operator
list element (not in HTML) must contain either a ul element or an ol element (but not both):
<!ELEMENT list (ul|ol) >
where "|" is the alternation (or "exclusive or") operator
ul element must contain zero or more li elements:
<!ELEMENT ul (li)* >
where "*" is the repetition (or "Kleene star") operator

DTD Syntax | Meaning |
---|---|
b | element b must occur |
b,c | both b and c must occur, in the order specified |
b\|c | one (and only one) of b or c must occur |
b* | zero or more occurrences of b must occur |
b+ | one or more occurrences of b must occur |
b? | zero or one occurrence of b must occur |
EMPTY | no element content is allowed |
ANY | any content (of declared elements and text) is allowed |
#PCDATA | content is text rather than an element |
b and c can themselves be content models, e.g., (a,b)*
b+ is short for (b,b*)
b? is short for (b|EMPTY)
#PCDATA stands for "parsed character data", meaning an XML parser should parse the characters to resolve character and entity references.
<!ELEMENT rss (channel) >
<!ELEMENT channel (title,link,description,item+) >
<!ELEMENT item (title,description,link,pubDate?) >
<!ELEMENT title (#PCDATA) >
<!ELEMENT link (#PCDATA) >
<!ELEMENT description (#PCDATA) >
<!ELEMENT pubDate (#PCDATA) >
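Since content models are built from sequence, alternation and repetition, they can be read as regular expressions over child element names. The following is a minimal sketch of that reading in Python, not a DTD validator: the regular expressions are translated by hand from the channel and item declarations above, and the names CONTENT_MODELS and children_match are hypothetical, introduced only for this illustration.

```python
import re

# Hand translation of two content models from the rss DTD into regular
# expressions. Each child element name is written followed by one space.
CONTENT_MODELS = {
    # <!ELEMENT channel (title,link,description,item+) >
    "channel": r"title link description (item )+",
    # <!ELEMENT item (title,description,link,pubDate?) >
    "item": r"title description link (pubDate )?",
}


def children_match(parent, child_names):
    """Return True if the sequence of child names fits the parent's model."""
    sequence = "".join(name + " " for name in child_names)
    return re.fullmatch(CONTENT_MODELS[parent], sequence) is not None


# A channel needs at least one item after title, link and description.
print(children_match("channel", ["title", "link", "description", "item"]))  # True
print(children_match("channel", ["title", "link", "description"]))          # False
# In an item, description must come before link; pubDate is optional.
print(children_match("item", ["title", "description", "link"]))             # True
print(children_match("item", ["title", "link", "description"]))             # False
```

The sketch only checks the order and number of child elements; it ignores text content and attributes.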
rss-fragment-dtd-invalid.xml giving results
rss-fragment-dtd-valid.xml giving results
<!DOCTYPE rss ... >
<?xml version="1.0"?>
<!DOCTYPE rss [
  <!-- all declarations for rss DTD go here -->
  ...
  <!ELEMENT rss ... >
  ...
]>
<rss>
  <!-- This is an instance of a document of type rss -->
  ...
</rss>
rss must be defined in the DTD
name after DOCTYPE (i.e., rss) must match root element of document
<?xml version="1.0"?>
<!DOCTYPE rss SYSTEM "rss.dtd">
<rss>
  <!-- This is an instance of a document of type rss -->
  ...
</rss>
what follows SYSTEM is a URI
rss.dtd is a relative URI, assumed to be in same directory as source document
<?xml version="1.0"?>
<!DOCTYPE math PUBLIC "-//W3C//DTD MathML 2.0//EN"
  "http://www.w3.org/TR/MathML2/dtd/mathml2.dtd">
<math>
  <!-- This is an instance of a mathML document type -->
  ...
</math>
PUBLIC means what follows is a formal public identifier with 4 fields, separated by //:
ISO for ISO standard, + for approval by other standards body, and - for everything else
owner: W3C
class and description: DTD MathML 2.0
language: EN
Formal public identifiers are meant for widely used entities. They should be unique world-wide. Processing software might either come with such entities already installed or it might know the most efficient sites from which to download them. If not, the URI is used to retrieve the DTD.
href="file.html" in an HTML a start tag
for the rss and guid elements, these might be
<!ATTLIST rss version CDATA #FIXED "2.0" >
<!ATTLIST guid isPermaLink (true|false) "true" >
version is an attribute name
CDATA is an attribute type
"true" is a default value
CDATA: any valid character data
ID: an identifier unique within the document
IDREF: a reference to a unique identifier
IDREFS: a reference to several unique identifiers (separated by white-space)
(a|b|c), e.g.: (enumerated attribute type) possible values are one of a, b or c
#IMPLIED: attribute may be omitted (optional)
#REQUIRED: attribute must be present
#FIXED "x", e.g.: attribute optional; if present, value must be x
"x", e.g.: value will be x if attribute is omitted
in the rss DTD all content models comprise only elements or only text
mixed content allows text interspersed with elements such as em, img, b, etc.
to allow text, em, img and b as contents of element p:
<!ELEMENT p (#PCDATA | em | img | b)* >
#PCDATA must be first (in the definition)
the alternatives must be separated by | and have * applied to them
Consider the content models (zero, one)* and (zero | one)*.
Give an example of a sequence of elements allowed by the one model but not by the other.
Consider elements day, month and year.
Produce a content model which allows for each of the sequences
year
month year
day month year
but no others.
<!ELEMENT family (parent, (parent)?, (child)*)>
<!ELEMENT parent (name)>
<!ELEMENT child (name)>
<!ELEMENT name (#PCDATA)>
<!ATTLIST parent
  pno ID #IMPLIED
  role (mother|father) #IMPLIED
  spouse IDREF #IMPLIED>
<!ATTLIST child
  cno ID #IMPLIED
  date-of-birth CDATA #IMPLIED
  siblings IDREFS #IMPLIED>
spouse attribute is meant to be interpreted as a reference to a pno attribute
siblings attribute is meant to be interpreted as a set of references to cno attributes
<?xml version="1.0"?>
<!-- <!DOCTYPE family [ ... DTD goes here ... ]> -->
<family>
  <parent pno="p1" role="mother" spouse="p2">
    <name>Janet</name>
  </parent>
  <parent pno="p2" role="father" spouse="p1">
    <name>John</name>
  </parent>
  <child cno="c1" siblings="c2 c3">
    <name>Tom</name>
  </child>
  <child cno="c2" siblings="c1 c3">
    <name>Dick</name>
  </child>
  <child cno="c3" siblings="c1 c2">
    <name>Harry</name>
  </child>
</family>
<?xml version="1.0"?>
<!-- <!DOCTYPE family [ ... DTD goes here ... ]> -->
<family>
  <parent pno="janet">
    <name>Janet</name>
  </parent>
  <child date-of-birth="yesterday">
    <name>Tom</name>
  </child>
</family>
no need for an ID to be referenced
date-of-birth cannot be restricted to a valid date by a DTD
<family>
  <parent role="stepmother" spouse="john jim">
    <name>Janet</name>
  </parent>
  <parent pno="john" spouse="janet"></parent>
  <parent pno="jim" spouse="janet"></parent>
</family>
if role is given, must be mother or father
spouse must refer to only one value of type ID
spouse must refer to a value of type ID which exists
parent must have a name
at most two parent elements allowed
validating invalid-family.xml using the online XML validator gives results
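The two spouse errors listed above concern ID and IDREF values, which a validating parser checks across the whole document. The following is a rough sketch of that check in Python, assuming the attribute types from the family DTD are written out by hand; it is not a DTD validator, and the names ID_ATTRS, IDREF_ATTRS, IDREFS_ATTRS and check_references are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Attribute types copied by hand from the family DTD (an assumption of this
# sketch; a validating parser would read them from the ATTLIST declarations).
ID_ATTRS = {"pno", "cno"}
IDREF_ATTRS = {"spouse"}
IDREFS_ATTRS = {"siblings"}


def check_references(xml_text):
    """Return a list of ID/IDREF problems found in the document."""
    root = ET.fromstring(xml_text)
    problems = []
    ids = set()
    # First pass: collect all ID values and report duplicates.
    for elem in root.iter():
        for attr, value in elem.attrib.items():
            if attr in ID_ATTRS:
                if value in ids:
                    problems.append(f"duplicate ID {value!r}")
                ids.add(value)
    # Second pass: every IDREF or IDREFS token must name an existing ID.
    for elem in root.iter():
        for attr, value in elem.attrib.items():
            if attr in IDREF_ATTRS:
                tokens = value.split()
                if len(tokens) != 1:
                    problems.append(f"{attr}={value!r} is not a single IDREF")
                    continue
            elif attr in IDREFS_ATTRS:
                tokens = value.split()
            else:
                continue
            for token in tokens:
                if token not in ids:
                    problems.append(f"{attr} refers to unknown ID {token!r}")
    return problems


INVALID_FAMILY = """<family>
  <parent role="stepmother" spouse="john jim"><name>Janet</name></parent>
  <parent pno="john" spouse="janet"></parent>
  <parent pno="jim" spouse="janet"></parent>
</family>"""

for problem in check_references(INVALID_FAMILY):
    print(problem)
```

The sketch reports the multi-valued spouse attribute and the two references to the non-existent ID janet. The invalid role value, the missing name elements and the third parent element are content-model and enumeration errors, which this check does not cover.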
<!ENTITY BBK "Birkbeck, University of London">
&BBK; substitutes value of entity for its name in document
xmas.xml can be produced by
<!DOCTYPE xmas [
  <!ENTITY on "On the">
  <!ENTITY day "day of Christmas my true love sent to me">
  <!ENTITY partridge "<line>a partridge in a pear tree.</line>">
  <!ENTITY doves "<line>two turtle doves and</line> &partridge;">
  <!ENTITY hens "<line>three French hens,</line> &doves;">
  <!ELEMENT xmas (verse+)>
  <!ELEMENT verse (line+)>
  <!ELEMENT line (#PCDATA)>
]>
<xmas>
  <verse><line>&on; first &day;</line> &partridge;</verse>
  <verse><line>&on; second &day;</line> &doves;</verse>
  <verse><line>&on; third &day;</line> &hens;</verse>
</xmas>
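Entity substitution can be observed with any XML parser that reads the internal DTD subset. The following is a small sketch using Python's standard xml.etree.ElementTree module, which expands internal general entities while parsing but does not validate. The document is a cut-down version of the example above with only the two text entities; the root element verse and the variable names are choices made for this illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down version of the xmas example: two text-only general entities
# declared in the internal DTD subset and referenced inside one line.
DOCUMENT = """<!DOCTYPE verse [
  <!ENTITY on "On the">
  <!ENTITY day "day of Christmas my true love sent to me">
]>
<verse><line>&on; first &day;</line></verse>"""

root = ET.fromstring(DOCUMENT)
line_text = root.find("line").text

# The parser has already replaced each entity reference by its value.
print(line_text)  # On the first day of Christmas my true love sent to me
```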
BBK is an example of a general entity
predefined general entities: &amp; (&), &lt; (<), &gt; (>), &quot; ("), &apos; (')
<!ENTITY HTML-chapter SYSTEM "html.xml" >
&HTML-chapter; includes contents of file html.xml at point of reference
standalone="no" in XML declaration
parameter entities are declared with % between ENTITY and name, e.g.,
<!ENTITY % list "OL | UL" >
<!ENTITY % heading "H1 | H2 | H3 | H4 | H5 | H6" >
and referenced using % and ; delimiters, e.g.,
<!ENTITY % block "P | %list; | %heading; | ..." >
& operator from SGML)
exam. An exam has a course code, a title and a date, which comprises only the month and year. These are followed by a list of questions. Exams consist of either 5 or 6 questions. Each question has one or more parts. Parts of questions can themselves comprise parts along with text.
programme. A programme has a degree and a year. These elements are followed by the results for the programme. The results are partitioned into distinction, merit, pass and fail. Within each is a sequence of name elements, each containing the name of a person having achieved the corresponding result for the programme.
teaches with attributes course and lecturer, representing the relationship between courses taught on an MSc programme and the lecturers who teach them. Give an XML DTD for representing this information.
www.w3.org/TR/REC-xml.html
www.w3schools.com/dtd
validator.w3.org
en.wikipedia.org/wiki/Chomsky_hierarchy
DTDs are covered in Chapter 4 of [Moller and Schwartzbach] and briefly in Chapter 2 of [Jacobs].