This chapter is based on Chapters 1 and 2 of Eric Ray's book [Ray 2001], Chapters 4, 5, 20 and 21 of the Deitel et al. tome [Deitel 2001], Part 1 of Elliotte Rusty Harold's book [Harold 1999], plus additional material from the web.
Figure 1 illustrates how a collection of CDs may be put together in XML.
<?xml version="1.0"?>
<!DOCTYPE cdcollection SYSTEM "cd-collection.dtd">
<cdcollection>
<album id="540 590-2">
<title>Sheryl Crow</title>
<artist>Sheryl Crow</artist>
<label>A &amp; M Records</label>
<track time="4:56">maybe angels</track>
<track time="3:50">a change</track>
<track time="4:51">home</track>
<track time="3:58">sweet rosalyn</track>
<track time="5:23">if it makes you happy</track>
<track time="4:27">redemption day</track>
<track time="3:07">hard to make a stand</track>
<track time="4:16">everyday is a winding road</track>
<track time="4:43">love is a good thing</track>
<track time="3:30">oh marie</track>
<track time="4:58">superstar</track>
<track time="4:34">the book</track>
<track time="3:55">ordinary morning</track>
<track time="3:20">free man</track>
</album>
<album id="332 80-2">
<title>Slide on This</title>
<artist>Ronnie Wood</artist>
<label>KOCH International</label>
<track>Somebody Else Might</track>
<track>Testify</track>
<track>Ain't Rock'n Roll</track>
<track>Josephine</track>
<track>Knock Yer Teeth Out</track>
<track>Ragtime Annie (Lillie's Bordello)</track>
<track>Must Be Love</track>
<track>Fear For Your Future</track>
<track>Show Me</track>
<track>Always Wanted More</track>
<track>Thinkin'</track>
<track>Like It</track>
<track>Breath On Me</track>
<track>Somebody Else Might (Remix)</track>
</album>
</cdcollection>
The document root is
<cdcollection>, which contains one or more
<album>s, and so on. Figure 2 illustrates the
collection rendered in HTML, with the individual tracks as
items in an ordered list.
[Figure 2: HTML rendering of the collection, with one heading per album: "Sheryl Crow" (Sheryl Crow) and "Slide on This" (Ronnie Wood).]
Note that track times, if known, are provided as attributes
of <track> instead of as elements in
their own right. Figure 3 illustrates the tree-like
structure that all XML documents possess.
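As a rough illustration of this tree structure, the following Python sketch prints an indented outline of a cut-down version of the collection from Figure 1. It relies on the standard library module xml.etree.ElementTree; the abbreviated sample data and the helper name outline are assumptions made for this example and do not come from the original material.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the Figure 1 collection (two tracks per album) so the
# sketch is self-contained; the DOCTYPE line is left out.
CD_XML = """<cdcollection>
  <album id="540 590-2">
    <title>Sheryl Crow</title>
    <artist>Sheryl Crow</artist>
    <track time="4:56">maybe angels</track>
    <track time="3:50">a change</track>
  </album>
  <album id="332 80-2">
    <title>Slide on This</title>
    <artist>Ronnie Wood</artist>
    <track>Somebody Else Might</track>
    <track>Testify</track>
  </album>
</cdcollection>"""


def outline(element, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    attrs = "".join(f' {name}="{value}"' for name, value in element.attrib.items())
    text = (element.text or "").strip()
    line = "  " * depth + f"<{element.tag}{attrs}>" + (f" {text}" if text else "")
    lines = [line]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CD_XML)
tree_lines = outline(root)
print("\n".join(tree_lines))
```

Each level of indentation corresponds to one level of nesting in the tree, and the time attribute appears only on the tracks where it is known.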
This structure enables us to process or
parse the document by starting at the root and
visiting the nodes of interest. For example, we could visit
just the <artist> nodes and generate an
alphabetic list of artists. This will be demonstrated when
we look at the Extensible Stylesheet Language (XSL).
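Until XSL is introduced, the same idea can be sketched in Python. This is only a minimal illustration, assuming the standard library module xml.etree.ElementTree and a shortened copy of the Figure 1 data; it is not the XSL approach the chapter goes on to describe.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Figure 1 collection, reduced to the elements
# that matter for this sketch.
CD_XML = """<cdcollection>
  <album id="540 590-2">
    <title>Sheryl Crow</title>
    <artist>Sheryl Crow</artist>
    <label>A &amp; M Records</label>
  </album>
  <album id="332 80-2">
    <title>Slide on This</title>
    <artist>Ronnie Wood</artist>
    <label>KOCH International</label>
  </album>
</cdcollection>"""

root = ET.fromstring(CD_XML)

# iter("artist") walks the tree from the root and yields only <artist>
# elements; the set removes duplicates before sorting alphabetically.
artists = sorted({artist.text.strip() for artist in root.iter("artist")})

for name in artists:
    print(name)
```

The parser also expands the &amp; entity, so the label text read from the first album is "A & M Records".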
There are no hard and fast rules about when to use child elements and when to use attributes. In general, the data itself should be stored in elements, with information about the data (meta-data) stored in attributes. If you imagine a document with all the tags (and their attributes) removed, the basic information should still be present. Attributes are good places for storing IDs, URLs, references, and other information not directly relevant to the reader of the document. Keep the following points in mind.
The first two points are illustrated by the XML fragment given in Figure 4.
<article date="10/11/2000"> Yet More XML </article>
The date attribute has structure signified by
the / character. This structure is difficult to extract
and potentially ambiguous (10 November or 11 October?),
depending on your country and upbringing. Parsers and people
entering data can interpret the date differently from you.
Figure 5 illustrates the alternative approach.
<article>
<date>
<day>10</day>
<month>11</month>
<year>2000</year>
</date>
<title>
Yet More XML
</title>
</article>
Now the XML is unambiguous. With CSS or XSL it's easy to
format the date, or even omit parts. It also allows more
than one <date> to be associated with an
element, for example revisions of the article. Note that
once the <article> element has child
elements, it is preferable to explicitly identify text
children, such as <title> in Figure 5,
rather than leave their meaning implicit, as in Figure 4.
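The difference between the two figures can be sketched in Python. The example below is a minimal illustration that assumes the standard library modules datetime and xml.etree.ElementTree; the variable names are hypothetical. It shows that the attribute string from Figure 4 parses successfully under two conventions, while the child elements of Figure 5 can be read without guessing.

```python
from datetime import date, datetime
import xml.etree.ElementTree as ET

# The attribute form from Figure 4: one string, two plausible readings.
raw = "10/11/2000"
day_first = datetime.strptime(raw, "%d/%m/%Y").date()    # 10 November 2000
month_first = datetime.strptime(raw, "%m/%d/%Y").date()  # 11 October 2000

# The element form from Figure 5: each part of the date is named.
ARTICLE_XML = """<article>
  <date>
    <day>10</day>
    <month>11</month>
    <year>2000</year>
  </date>
  <title>
    Yet More XML
  </title>
</article>"""

article = ET.fromstring(ARTICLE_XML)
date_el = article.find("date")
published = date(
    int(date_el.findtext("year")),
    int(date_el.findtext("month")),
    int(date_el.findtext("day")),
)
title = article.findtext("title").strip()

print(day_first, month_first)  # both parses succeed, yet they disagree
print(published, title)
```

Because both strptime calls succeed, nothing in the attribute value itself reveals which reading was intended; the structured form removes that choice from the consumer of the document.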