SGML entities

This introduction explains some basic SGML concepts for writers who are not yet very familiar with SGML, and illustrates some SGML strategies that you can use to handle common writing issues. It assumes that you already understand the basic idea of marking up documents in an SGML language that indicates what the content is, not what it should look like, and that you know a few acronyms, like DTD (document type definition) and ISO (International Organization for Standardization).

BASIC STRUCTURE OF AN SGML DOCUMENT

So what does an SGML document look like? Well, there are at least three different pieces: a main document file, the DTD for that type of document, and the SGML Declaration for that type of document. The main document file can also use any number of other files containing text marked up in SGML or other types of information like graphics or multimedia files.

The SGML declaration contains a lot of esoteric information, but basically it describes the SGML conventions that documents of that type follow, such as the character set and the characters used as markup delimiters. The DTD describes the structure for documents of that type and the set of tags (the language) that mark up the document to define the content and structure. SGML systems need all of this information to read the SGML document correctly.

When SGML systems work with SGML documents, they start with the main document:

  1. The main document defines the start and end of your document and identifies the DTD and SGML declaration to use with the document. It also identifies any other files with content that are included in your document.

    The main document starts with a line identifying the DTD for this document that looks something like this:

    <!DOCTYPE  MyDocuments PUBLIC "-//Joan Duvall//DTD Joan's Documents//EN">

    Tags are enclosed in start/end tag characters. The most common characters for this are < (start) and > (end).

    The !DOCTYPE keyword tells SGML systems that this is a document type declaration, which identifies the document type for this document.

    MyDocuments is the name of the document type and is always the name of the beginning and ending tags for documents that use this DTD. All of the content for the document comes between these tags.

    PUBLIC and the long, strange name in quotes identify the file for the DTD and for the SGML declaration. The quoted name is called a formal public identifier, or FPI. An explanation of FPIs comes later.

  2. Inside the document type declaration, a document can also declare other information for use in the document. These declarations are known as the internal subset and are contained inside brackets, [ and ], after the DTD identifier and before the end tag character. The declaration then looks like this:

    <!DOCTYPE  MyDocuments PUBLIC "-//Joan Duvall//DTD Joan's Documents//EN" [
    ...some other declarations here...
    ]>

  3. After the document type declaration comes the document. All of the content for your document must be enclosed in a beginning and ending tag that matches the document type. In our example, this looks like:

    <MyDocuments> ... the content of your document ... </MyDocuments>

    Tags that enclose content come in pairs: a start tag like <MyDocuments> and an end tag like </MyDocuments>. Both use the same name; the slash identifies the end tag. The name inside a tag names an element type and is also known as a generic identifier (GI).
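
The three steps above can be sketched in code. The following Python fragment is only an illustration under simplifying assumptions, not how real SGML systems work: the names split_document and root_matches are made up for this example, the pattern handles only PUBLIC identifiers in double quotes with the usual < and > characters, and names are compared exactly as typed. It splits a main document into the document type name, the FPI, the optional internal subset, and the content, then checks that the content is enclosed in tags named after the document type.

```python
import re

# Hypothetical pattern for illustration only. A real SGML parser also handles
# comments, SYSTEM identifiers, other delimiters, and much more.
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+(?P<name>[A-Za-z][\w.-]*)\s+'   # document type name
    r'PUBLIC\s+"(?P<fpi>[^"]*)"'                  # formal public identifier
    r'(?:\s*\[(?P<subset>.*?)\])?\s*>',           # optional internal subset
    re.DOTALL,
)


def split_document(text):
    """Split a main document into its declaration fields and its content."""
    match = DOCTYPE_RE.search(text)
    if match is None:
        raise ValueError("no PUBLIC document type declaration found")
    return {
        "doctype": match.group("name"),
        "fpi": match.group("fpi"),
        "internal_subset": match.group("subset"),  # None when there is no [ ... ]
        "body": text[match.end():].strip(),
    }


def root_matches(parts):
    """Check that the content is enclosed in tags named after the document type."""
    name = parts["doctype"]
    body = parts["body"]
    return body.startswith("<" + name + ">") and body.endswith("</" + name + ">")


sample = '''<!DOCTYPE MyDocuments PUBLIC "-//Joan Duvall//DTD Joan's Documents//EN" [
...some other declarations here...
]>
<MyDocuments> ... the content of your document ... </MyDocuments>'''

parts = split_document(sample)
print(parts["doctype"])     # MyDocuments
print(parts["fpi"])         # -//Joan Duvall//DTD Joan's Documents//EN
print(root_matches(parts))  # True
```

The sketch keeps the internal subset as plain text and does not look inside it, because what may appear there depends on declarations that this introduction has not covered yet.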