In this chapter we first explain what a "document type definition" is and then describe gapdoc.dtd in detail. That file, together with the current chapter, defines what a GAPDoc document has to look like. It can be found in the main directory of the GAPDoc package and is reproduced in Appendix B.
We do not give many examples in this chapter, which is intended more as a formal reference for all GAPDoc elements. Instead we provide an extra document with book name GAPDocExample (also accessible from the GAP online help). It uses all the constructs introduced in this chapter, and you can easily compare the source code with how it looks in the different output formats. Furthermore, recall that many basic things about XML markup were already explained by example in the introductory chapter 1.
A document type definition (DTD) is a formal declaration of how an XML document has to be structured. It is itself structured such that programs that handle documents can read it and treat the documents accordingly. There are for example parsers and validity checkers that use the DTD to validate an XML document, see 2.1-13.
The main thing a DTD does is to specify which elements may occur in documents of a certain document type, how they can be nested, and what attributes they can or must have. So, for each element there is a rule.
Note that a DTD cannot ensure that a document which is "valid" also makes sense to the converters! It only says something about the formal structure of the document.
For the remaining part of this chapter we have divided the elements of GAPDoc documents into several subsets, each of which will be discussed in one of the next sections.
See the following three subsections to learn by example how a DTD works. We do not want to be too formal here, but just want to enable the reader to understand the declarations in gapdoc.dtd. For a precise description of the syntax of DTDs, see again the official standard at:
http://www.xml.com/axml/axml.html
A GAPDoc document contains on its top level exactly one element with name Book
. This element is declared in the DTD as follows:
<!ELEMENT Book (TitlePage, TableOfContents?, Body, Appendix*, Bibliography?, TheIndex?)>
<!ATTLIST Book Name CDATA #REQUIRED>
After the keyword ELEMENT
and the name Book
there is a list in parentheses. This is a comma separated list of names of elements which can occur (in the given order) in the content of a Book
element. Each name in such a list can be followed by one of the characters "?", "*" or "+", meaning that the corresponding element can occur zero or one time, an arbitrary number of times, or at least once, respectively. Without such an extra character the corresponding element must occur exactly once. Instead of a single name, the list can also contain a list of element names separated by "|" characters; this denotes any element with one of the names (i.e., "|" means "or").
So, the Book
element must contain first a TitlePage
element, then an optional TableOfContents
element, then a Body
element, then zero or more elements of type Appendix
, then an optional Bibliography
element, and finally an optional element of type TheIndex
.
Note that only these elements are allowed in the content of the Book
element. No other elements or text are allowed in between. An exception to this is that there may be whitespace between the end tag of one element and the start tag of the next; this should be ignored when the document is processed to some output format. An element like this is called an element with "element content".
The second declaration starts with the keyword ATTLIST
and the element name Book
. After that there is a triple of whitespace separated parameters (in general an arbitrary number of such triples, one for each allowed attribute name). The first (Name
) is the name of an attribute for a Book
element. The second (CDATA
) is always the same for all of our declarations, it means that the value of the attribute consists of "character data". The third parameter #REQUIRED
means that this attribute must be specified with any Book
element. Later we will also see optional attributes which are declared as #IMPLIED
.
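To make the two declarations concrete, here is a minimal sketch of a document skeleton that follows the content model of Book. The book name, titles, author, and database name are hypothetical placeholders; the elements inside Body and Appendix are explained later in this chapter.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book SYSTEM "gapdoc.dtd">
<!-- The Name attribute is #REQUIRED; "MyPackage" is a made-up name -->
<Book Name="MyPackage">
  <!-- exactly one TitlePage, first -->
  <TitlePage>
    <Title>My Package</Title>
    <Author>A. N. Author</Author>
  </TitlePage>
  <!-- optional (TableOfContents?) -->
  <TableOfContents/>
  <!-- exactly one Body -->
  <Body>
    <Chapter>
      <Heading>First Chapter</Heading>
      Some text.
    </Chapter>
  </Body>
  <!-- zero or more appendices (Appendix*) -->
  <Appendix>
    <Heading>An Appendix</Heading>
    More text.
  </Appendix>
  <!-- optional, in this order -->
  <Bibliography Databases="mybib"/>
  <TheIndex/>
</Book>
```

Swapping two of the children, or placing text directly between them, would make the document invalid with respect to the declaration above.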
<!ELEMENT TitlePage (Title, Subtitle?, Version?, Author+, Date?, Abstract?, Copyright?, Acknowledgements?, Colophon?)>
Within this element information for the title page is collected. Note that more than one author can be specified. The elements must appear in this order because there is no sensible way to specify in a DTD something like "the following elements may occur in any order but each exactly once".
Before going on with the other elements inside the Book
element we explain the elements for the title page.
<!ELEMENT Title (%Text;)*>
Here is the last construct you need to understand for reading gapdoc.dtd
. The expression "%Text;
" is a so-called "parameter entity". It is something like a macro within the DTD. It is defined as follows:
<!ENTITY % Text "%InnerText; | List | Enum | Table">
This means that every occurrence of "%Text;" in the DTD is replaced by the expression
%InnerText; | List | Enum | Table
which is then expanded further because of the following definition:
<!ENTITY % InnerText "#PCDATA |
                      Alt |
                      Emph | E |
                      Par | P |
                      Keyword | K | Arg | A | Quoted | Q | Code | C |
                      File | F | Button | B | Package |
                      M | Math | Display |
                      Example | Listing | Log | Verb |
                      URL | Email | Homepage | Cite | Label |
                      Ref | Index" >
These are the only two parameter entities we are using. They expand to lists of element names which are explained in the sequel and the keyword #PCDATA
(concatenated with the "or" character "|
").
So, the element (Title
) is of so-called "mixed content": It can contain parsed character data which does not contain further markup (#PCDATA
) or any of the other above mentioned elements. Mixed content must always have the asterisk qualifier (like in Title
) such that any sequence of elements (of the above list) and character data can be contained in a Title
element.
The %Text;
parameter entity is used in all places in the DTD where "normal text" should be allowed, including lists, enumerations, and tables, but no sectioning elements.
The %InnerText;
parameter entity is used in all places in the DTD where "inner text" should be allowed. This means that no structures like lists, enumerations, and tables are allowed. This is used, for example, in headings.
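The difference between the two parameter entities can be sketched as follows. The content is purely illustrative, and the package name is a hypothetical placeholder.

```xml
<!-- Mixed content: character data interleaved with %InnerText; elements -->
<Title>The <Package>MyPackage</Package> <Emph>Reference</Emph> Manual</Title>

<!-- Abstract is declared with %Text;, so a List is allowed as well -->
<Abstract>
  This manual covers:
  <List>
    <Item>installation,</Item>
    <Item>usage.</Item>
  </List>
</Abstract>
```

An element declared with %InnerText; only (such as a heading) could contain the Package and Emph markup of the first fragment, but not the List of the second.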
<!ELEMENT Subtitle (%Text;)*>
Contains the subtitle of the document.
<!ELEMENT Version (#PCDATA|Alt)*>
Note that the version can only contain character data and no further markup elements (except for Alt
, which is necessary to resolve the entities described in 2.2-3). The converters will not put the word "Version" in front of the text in this element.
<!ELEMENT Author (%Text;)*>    <!-- There may be more than one Author! -->
As noted in the comment, there may be more than one element of this type. Each such element should contain the name of an author and possibly an Email address and/or a WWW Homepage element for this author, see 3.5-6 and 3.5-7.
<!ELEMENT Date (#PCDATA)>
Only character data is allowed in this element which gives a date for the document. No automatic formatting is done.
<!ELEMENT Abstract (%Text;)*>
This element contains an abstract of the whole book.
<!ELEMENT Copyright (%Text;)*>
This element is used for the copyright notice. Note the &copyright; entity as described in section 2.2-3.
<!ELEMENT Acknowledgements (%Text;)*>
This element contains the acknowledgements.
<!ELEMENT Colophon (%Text;)*>
The "colophon" page is used to say something about the history of a document.
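Putting the title page elements together, a complete TitlePage might look like the following sketch. All names, dates, and addresses are invented for illustration; note that the children appear in exactly the order required by the declaration of TitlePage.

```xml
<TitlePage>
  <Title>An Example Package</Title>
  <Subtitle>A hypothetical GAP package</Subtitle>
  <!-- the converters do not add the word "Version" themselves -->
  <Version>Version 1.0</Version>
  <!-- Author+ : at least one, possibly several -->
  <Author>A. N. Author
    <Email>author@example.org</Email>
    <Homepage>https://www.example.org/~author/</Homepage>
  </Author>
  <Author>Another Author</Author>
  <!-- character data only, no automatic formatting -->
  <Date>January 2024</Date>
  <Abstract>This package provides some example functionality.</Abstract>
  <Copyright>&copyright; 2024 by the authors.</Copyright>
  <Acknowledgements>We thank our test users.</Acknowledgements>
  <Colophon>This manual was first written in 2024.</Colophon>
</TitlePage>
```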
<!ELEMENT TableOfContents EMPTY>
This element may occur in the Book
element after the TitlePage
element. If it is present, a table of contents is generated and inserted into the document. Note that because this element is declared to be EMPTY
one can use the abbreviation
<TableOfContents/>
to denote this empty element.
<!ELEMENT Bibliography EMPTY>
<!ATTLIST Bibliography Databases CDATA #REQUIRED
                       Style     CDATA #IMPLIED>
This element may occur in the Book
element after the last Appendix
element. If it is present, a bibliography section is generated and inserted into the document. The attribute Databases must be specified and refers to BibTeX databases. The database names must be separated by commas and must be given without the .bib extension. A bibliography style for the LaTeX output of the document may be specified with the optional Style attribute, also without the .bst extension (the default is alpha). See also section 3.5-3 for a description of the Cite element, which is used to include bibliography references in the text.
The reference for the format of BibTeX database files is [L85, Appendix B].
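As a small illustration of the two attributes, the following sketch refers to two BibTeX files; the file names mybib.bib and otherbib.bib are hypothetical.

```xml
<!-- Database names are comma separated and given without ".bib";
     the style is given without ".bst" -->
<Bibliography Databases="mybib,otherbib" Style="alpha"/>
```

Since alpha is the default, the Style attribute could be omitted here without changing the result.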
<!ELEMENT TheIndex EMPTY>
This element may occur in the Book
element after the Bibliography
element. If it is present, an index is generated and inserted into the document. There are elements in GAPDoc which implicitly generate index entries (e.g., Func
(3.4-2)) and there is an element Index
(3.5-4) for explicitly adding index entries.
A GAPDoc book is divided into chapters, sections, and subsections. The idea is, of course, that a chapter consists of sections, which in turn consist of subsections. However, for the sake of flexibility, the rules are not too restrictive. First, text is allowed everywhere in the body of the document (and not only within sections). Second, the chapter level may be omitted. The exact rules are described below.
Appendices are a flavor of chapters, occurring after all regular chapters. There is a special type of subsection called "ManSection
". This is a subsection devoted to the description of a function, operation or variable. It is analogous to a manpage in the UNIX environment. Usually each function, operation, method, and so on should have its own ManSection
.
Cross referencing is done on the level of Subsection
s, respectively ManSection
s. The topics in GAP's online help also point to subsections, so these should not be too long.
We start our description of the sectioning elements "top-down":
The Body
element marks the main part of the document. It must occur after the TableOfContents
element. There is a big difference between inside and outside of this element: Whereas regular text is allowed nearly everywhere in the Body
element and its subelements, this is not true for the outside. This also has implications for the handling of whitespace. Outside, superfluous whitespace is usually ignored when it occurs between elements. Inside of the Body
element whitespace matters because character data is allowed nearly everywhere. Here is the definition in the DTD:
<!ELEMENT Body ( %Text;| Chapter | Section )*>
The fact that Chapter
and Section
elements are allowed here makes it possible to omit the chapter level entirely in the document. For a description of %Text;
see here.
(Remark: The purpose of this element is to make sure that a valid GAPDoc document has a correct overall structure, which is only possible when the top element Book
has element content.)
<!ELEMENT Chapter (%Text;| Heading | Section)*>
<!ATTLIST Chapter Label CDATA #IMPLIED>    <!-- For reference purposes -->
A Chapter
element can have a Label
attribute, such that this chapter can be referenced later on with a Ref
element (see section 3.5-1). Note that you have to specify a label to reference the chapter as there is no automatic labelling!
Chapter
elements can contain text (for a description of %Text;
see here), Section
elements, and Heading
elements.
The following additional rule cannot be stated in the DTD because we want a Chapter
element to have mixed content. There must be exactly one Heading
element in the Chapter
element, containing the heading of the chapter. Here is its definition:
<!ELEMENT Heading (%InnerText;)*>
This element is used for headings in Chapter
, Section
, Subsection
, and Appendix
elements. It may only contain %InnerText;
(for a description see here).
Each of the mentioned sectioning elements must contain exactly one direct Heading
element (i.e., one which is not contained in another sectioning element).
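The rules for Chapter and Heading can be sketched with a small fragment. The labels and headings are hypothetical; the Section element used here is declared further below.

```xml
<Chapter Label="chap:basics">
  <!-- exactly one direct Heading; it may contain %InnerText; markup -->
  <Heading>Basic <Emph>Concepts</Emph></Heading>
  Text may appear directly inside a chapter.
  <Section Label="sec:first">
    <!-- this Heading belongs to the Section, not to the Chapter -->
    <Heading>A First Section</Heading>
    Text of the section.
  </Section>
</Chapter>
```

A second Heading placed directly inside the Chapter would still be valid with respect to the DTD, but it would violate the additional rule stated above.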
<!ELEMENT Appendix (%Text;| Heading | Section)*>
<!ATTLIST Appendix Label CDATA #IMPLIED>    <!-- For reference purposes -->
The Appendix
element behaves exactly like a Chapter
element (see 3.3-2) except for the position within the document and the numbering. While chapters are counted with numbers (1., 2., 3., ...) the appendices are counted with capital letters (A., B., ...).
Again there is an optional Label
attribute used for references.
<!ELEMENT Section (%Text;| Heading | Subsection | ManSection)*>
<!ATTLIST Section Label CDATA #IMPLIED>    <!-- For reference purposes -->
A Section
element can have a Label
attribute, such that this section can be referenced later on with a Ref
element (see section 3.5-1). Note that you have to specify a label to reference the section as there is no automatic labelling!
Section
elements can contain text (for a description of %Text;
see here), Heading
elements, and subsections.
There must be exactly one direct Heading
element in a Section
element, containing the heading of the section.
Note that a subsection is either a Subsection
element or a ManSection
element.
<!ELEMENT Subsection (%Text;| Heading)*>
<!ATTLIST Subsection Label CDATA #IMPLIED>    <!-- For reference purposes -->
The Subsection
element can have a Label
attribute, such that this subsection can be referenced later on with a Ref
element (see section 3.5-1). Note that you have to specify a label to reference the subsection as there is no automatic labelling!
Subsection
elements can contain text (for a description of %Text;
see here), and Heading
elements.
There must be exactly one Heading
element in a Subsection
element, containing the heading of the subsection.
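A section with one subsection might be written as in the following sketch; the labels and headings are made up for illustration.

```xml
<Section Label="sec:usage">
  <Heading>Usage</Heading>
  Introductory text of the section.
  <Subsection Label="subsec:install">
    <Heading>Installation</Heading>
    Text of the subsection.
  </Subsection>
</Section>
```

Because there is no automatic labelling, the Label attributes are what make it possible to refer to these units later with a Ref element.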
Another type of subsection is a ManSection
, explained now:
ManSections are intended to describe a function, operation, method, variable, or some other technical instance. They are analogous to manpages in the UNIX environment.
<!ELEMENT ManSection (((Func, Returns?) | (Oper, Returns?) |
                       (Meth, Returns?) | (Filt, Returns?) |
                       (Prop, Returns?) | (Attr, Returns?) |
                       Var | Fam | InfoClass)+, Description )>
<!ATTLIST ManSection Label CDATA #IMPLIED>    <!-- For reference purposes -->
<!ELEMENT Returns (%Text;)*>
<!ELEMENT Description (%Text;)*>
The ManSection
element can have a Label
attribute, such that this subsection can be referenced later on with a Ref
element (see section 3.5-1). But this is probably rarely necessary because the elements Func
and so on (explained below) automatically generate labels for cross referencing.
The content of a ManSection
element is one or more elements describing certain items in GAP, each of them optionally followed by a Returns
element, followed by a Description
element, which contains %Text;
(see here) describing it. (Remember to include examples in the description as often as possible, see 3.7-10). The classes of items GAPDoc knows of are: functions (Func
), operations (Oper
), methods (Meth
), filters (Filt
), properties (Prop
), attributes (Attr
), variables (Var
), families (Fam
), and info classes (InfoClass
). One ManSection
should describe several such items only when they are very closely related.
Each element for an item corresponding to a GAP function can be followed by a Returns
element. In output versions of the document the string "Returns: " will be put in front of the content text. The text in the Returns
element should usually be a short hint about the type of object returned by the function. This is intended to give a good mnemonic for the use of a function (together with a good choice of names for the formal arguments).
ManSection
s are also sectioning elements which count as subsections. A possible heading is generated automatically from the first element.
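As a sketch of the content model, a ManSection for a single function could look as follows. The function LibFunc and its description are purely illustrative; the Func element itself is explained below.

```xml
<ManSection>
  <!-- one item element, optionally followed by Returns -->
  <Func Name="LibFunc" Arg="x[, y]"/>
  <Returns>an integer</Returns>
  <!-- exactly one Description, after all item elements -->
  <Description>
    Computes a value from <A>x</A> and, if given, <A>y</A>.
  </Description>
</ManSection>
```

Several closely related items can share one Description by listing several item elements (each with its optional Returns) before it.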
<!ELEMENT Func EMPTY>
<!ATTLIST Func Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>
This element is used within a ManSection
element to specify the usage of a function. The Name
attribute is required and its value is the name of the function. The value of the Arg
attribute (also required) contains the full list of arguments including optional parts, which are denoted by square brackets. The arguments are separated by whitespace or commas.
The name of the function is also used as label for cross referencing. When the name of the function appears in the text of the document it should always be written with the Ref
element, see 3.5-1. This makes it possible to use a uniform typesetting style for function names and enables automatic cross referencing.
If the optional Label
attribute is given, it is appended (with a colon :
in between) to the name of the function for cross referencing purposes. The text of the label can also appear in the document text. So, it should be a kind of short explanation.
<Func Arg="x[, y]" Name="LibFunc" Label="for my objects"/>
The optional Comm
attribute should be a short description of the function, usually at most one line long.
This element automatically produces an index entry with the name of the function and, if present, the text of the Label
attribute as subentry (see also 3.2-14 and 3.5-4).
<!ELEMENT Oper EMPTY>
<!ATTLIST Oper Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>
This element is used within a ManSection
element to specify the usage of an operation. The attributes are used exactly in the same way as in the Func
element (see 3.4-2).
Note that multiple descriptions of the same operation may occur in a document because there may be several declarations in GAP. Furthermore there may be several ManSection
s for methods of this operation (see 3.4-4) which also use the same name. For reference purposes these must be distinguished by different Label
attributes.
<!ELEMENT Meth EMPTY>
<!ATTLIST Meth Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>
This element is used within a ManSection
element to specify the usage of a method. The attributes are used exactly in the same way as in the Func
element (see 3.4-2).
Since many methods are often installed for the same operation, it can be useful to document them independently. This is possible by using the same method name in different ManSection
s. It is however required that these subsections and those describing the corresponding operation are distinguished by different Label
attributes.
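The following sketch shows an operation and one of its methods documented in separate ManSections and distinguished by their Label attributes. The name MyOperation, the argument names, and the label texts are hypothetical.

```xml
<ManSection>
  <Oper Name="MyOperation" Arg="obj" Label="for an object"/>
  <Description>Describes the operation in general.</Description>
</ManSection>

<ManSection>
  <!-- same Name as the operation, but a different Label -->
  <Meth Name="MyOperation" Arg="list" Label="for a list"/>
  <Description>Describes the method that is used for lists.</Description>
</ManSection>
```

A reference can then pick out one of them by giving the label in addition to the name, for example `<Ref Meth="MyOperation" Label="for a list"/>`.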
<!ELEMENT Filt EMPTY>
<!ATTLIST Filt Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #IMPLIED
               Comm  CDATA #IMPLIED
               Type  CDATA #IMPLIED>
This element is used within a ManSection
element to specify the usage of a filter. The first four attributes are used in the same way as in the Func
element (see 3.4-2), except that the Arg
attribute is optional.
The Type
attribute can be any string, but it is intended to be something like "Category" or "Representation".
<!ELEMENT Prop EMPTY>
<!ATTLIST Prop Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>
This element is used within a ManSection
element to specify the usage of a property. The attributes are used exactly in the same way as in the Func
element (see 3.4-2).
<!ELEMENT Attr EMPTY>
<!ATTLIST Attr Name  CDATA #REQUIRED
               Label CDATA #IMPLIED
               Arg   CDATA #REQUIRED
               Comm  CDATA #IMPLIED>
This element is used within a ManSection
element to specify the usage of an attribute (in GAP). The attributes are used exactly in the same way as in the Func
element (see 3.4-2).
<!ELEMENT Var EMPTY>
<!ATTLIST Var Name  CDATA #REQUIRED
              Label CDATA #IMPLIED
              Comm  CDATA #IMPLIED>
This element is used within a ManSection
element to document a global variable. The attributes are used exactly in the same way as in the Func
element (see 3.4-2) except that there is no Arg
attribute.
<!ELEMENT Fam EMPTY>
<!ATTLIST Fam Name  CDATA #REQUIRED
              Label CDATA #IMPLIED
              Comm  CDATA #IMPLIED>
This element is used within a ManSection
element to document a family. The attributes are used exactly in the same way as in the Func
element (see 3.4-2) except that there is no Arg
attribute.
<!ELEMENT InfoClass EMPTY>
<!ATTLIST InfoClass Name  CDATA #REQUIRED
                    Label CDATA #IMPLIED
                    Comm  CDATA #IMPLIED>
This element is used within a ManSection
element to document an info class. The attributes are used exactly in the same way as in the Func
element (see 3.4-2) except that there is no Arg
attribute.
Cross referencing in the GAPDoc system differs somewhat from the usual LaTeX cross referencing in that a reference knows "which type of object" it is referencing. For example, a "reference to a function" is distinguished from a "reference to a chapter". The idea is that the markup must contain this information so that the converters can produce better output. The HTML converter can, for example, typeset a function reference as the name of the function with a link to its description, and a chapter reference as a number with a link.
Referencing is done with the Ref
element:
<!ELEMENT Ref EMPTY>
<!ATTLIST Ref Func      CDATA #IMPLIED
              Oper      CDATA #IMPLIED
              Meth      CDATA #IMPLIED
              Filt      CDATA #IMPLIED
              Prop      CDATA #IMPLIED
              Attr      CDATA #IMPLIED
              Var       CDATA #IMPLIED
              Fam       CDATA #IMPLIED
              InfoClass CDATA #IMPLIED
              Chap      CDATA #IMPLIED
              Sect      CDATA #IMPLIED
              Subsect   CDATA #IMPLIED
              Appendix  CDATA #IMPLIED
              Text      CDATA #IMPLIED
              Label     CDATA #IMPLIED
              BookName  CDATA #IMPLIED
              Style (Text | Number) #IMPLIED>    <!-- normally automatic -->
The Ref
element is defined to be EMPTY
. If one of the attributes Func, Oper, Meth, Filt, Prop, Attr, Var, Fam, InfoClass, Chap, Sect, Subsect, Appendix is given, then exactly one of them may be given, making the reference one to the corresponding object. The Label attribute can be specified in addition to make the reference unique, for example if more than one method with a given name is present. (Note that there is no way to specify in the DTD that exactly one of the first listed attributes must be given; this is an additional rule.)
A reference to a Label
element defined below (see 3.5-2) is done by giving the Label
attribute and optionally the Text
attribute. If the Text
attribute is present its value is typeset in place of the Ref
element, if linking is possible (for example in HTML). If this is not possible, the section number is typeset. This type of reference is also used for references to tables (see 3.6-5).
Optionally an external reference into another book can be specified by using the BookName
attribute. In this case the Label
attribute must be specified and refers to a search string as in the GAP help system. It is guaranteed that the reference points to the position in the other book that the GAP help system finds as first match if one types the value of the Label attribute after a question mark.
The optional attribute Style
can take only the values Text
and Number
. It can be used with references to sectioning units, and it controls whether an explicit section number or text is generated. Normally all references to sections generate numbers and references to a GAP object generate the name of the corresponding object with some additional link or sectioning information, which is the behavior of Style="Text". With Style="Number" an explicit section number is generated in all cases. So
<Ref Subsect="Func" Style="Text"/> described in section <Ref Subsect="Func" Style="Number"/>
produces: 3.2-3 described in section ???.
<!ELEMENT Label EMPTY>
<!ATTLIST Label Name CDATA #REQUIRED>
This element is used to define a label for referencing a certain position in the document, if this is possible. If an exact reference is not possible (like in a printed version of the document) a reference to the corresponding subsection is generated. The value of the Name
attribute must be unique among all Label
elements.
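A Label and a reference to it might be used as in the following sketch; the label name and the link text are hypothetical.

```xml
<!-- marks a position in the text -->
<Label Name="remark:whitespace"/>
Whitespace matters inside the body of the document.

<!-- elsewhere in the document: refers back to that position -->
See <Ref Label="remark:whitespace" Text="the remark on whitespace"/>.
```

In output formats that support links the value of Text is shown as the link; otherwise the number of the enclosing section is typeset, as described for the Ref element above.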
<!ELEMENT Cite EMPTY>
<!ATTLIST Cite Key   CDATA #REQUIRED
               Where CDATA #IMPLIED>
This element is for bibliography citations. It is EMPTY
by definition. The attribute Key
is the key for a lookup in a BibTeX database that has to be specified in the Bibliography
element (see 3.2-13). The value of the Where
attribute specifies the position in the document as in the corresponding LaTeX syntax \cite[...]{...}
.
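For illustration, a citation with a position could be written as in the following sketch. The key L85 is only assumed here; the actual key depends on the entries of the BibTeX database in use.

```xml
<!-- corresponds to \cite[Appendix B]{L85} in LaTeX; "L85" is an assumed key -->
The format of database files is described in <Cite Key="L85" Where="Appendix B"/>.
```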
<!ELEMENT Index (%InnerText;)*>
<!ATTLIST Index Key    CDATA #IMPLIED
                Subkey CDATA #IMPLIED>
This element generates an index entry. The text within the element is typeset in the index entry, which is sorted under the value specified in the Key
and Subkey
attributes. If they are not specified, the typeset text itself is used as the key.
Note that all Func
and similar elements automatically generate index entries. If the TheIndex
element (3.2-14) is not present in the document all Index
elements are ignored.
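The two ways of specifying an index entry can be sketched as follows; the entry texts and keys are illustrative only.

```xml
<!-- sorted under the typeset text itself -->
<Index>mixed content</Index>

<!-- typeset with markup, but sorted under the given key and subkey -->
<Index Key="DTD" Subkey="parameter entity">parameter entities in a <Emph>DTD</Emph></Index>
```

Explicit keys are mainly useful when the typeset text contains markup or would otherwise sort in an unexpected place.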
<!ELEMENT URL (#PCDATA)>    <!-- Can we define this better? -->
<!ATTLIST URL Text CDATA #IMPLIED>    <!-- This is for output formats that have links like HTML -->
This element is for references to resources on the internet. The text within the element should be a valid URL. It is typeset in the document. For output formats that support links, the value of the attribute Text
is typeset as visible text for the link.
<!ELEMENT Email (#PCDATA)>
This element type is the special case of a URL specifying an email address. The content of the element should be the email address without any prefix like "mailto:
". This address is typeset by all converters, also without any prefix. In the case of an output document format like HTML the converter can produce a link with a "mailto:
" prefix.
<!ELEMENT Homepage (#PCDATA)>
This element type is the special case of a URL specifying a WWW homepage. The content of the element should be a valid URL specifying a world wide web page. In contrast to the URL
element the address is visible in all output formats.
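The three elements might be used as in this sketch. The URL is the one cited earlier in this chapter; the email address and homepage are invented placeholders.

```xml
<!-- the Text attribute is used as link text where links are supported -->
<URL Text="the XML specification">http://www.xml.com/axml/axml.html</URL>

<!-- no "mailto:" prefix in the content -->
<Email>author@example.org</Email>

<!-- the address itself is shown in all output formats -->
<Homepage>https://www.example.org/~author/</Homepage>
```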
The GAPDoc system offers some limited access to structural elements like lists, enumerations, and tables. Although it is possible to use all LaTeX constructs one always has to think about other output formats. The elements in this section are guaranteed to produce something reasonable in all output formats.
<!ELEMENT List ( ((Mark,Item)|(BigMark,Item)|Item)+ )>
<!ATTLIST List Only CDATA #IMPLIED
               Not  CDATA #IMPLIED>
This element produces a list. Each item in the list corresponds to an Item
element. Every Item
element is optionally preceded by a Mark
element. The content of this is used as a marker for the item. Note that this marker can be a whole word or even a sentence. It will be typeset in some emphasized fashion and most converters will provide some indentation for the rest of the item.
The Only
and Not
attributes can be used to specify that the list is included in the output by only one type of converter (Only
) or all but one type of converter (Not
). Of course at most one of the two attributes may occur in one element. The following values are allowed as of now: "LaTeX
", "HTML
", and "Text
". See also the Alt
element in 3.9-1 for more about text alternatives for certain converters.
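A list with marks, and a list restricted to one converter, could be sketched as follows; the item texts are illustrative.

```xml
<List>
  <!-- each Mark belongs to the Item that follows it -->
  <Mark>Func</Mark>
  <Item>describes a function,</Item>
  <Mark>Oper</Mark>
  <Item>describes an operation.</Item>
</List>

<!-- included by the LaTeX converter only -->
<List Only="LaTeX">
  <Item>This item appears in the LaTeX output only.</Item>
</List>
```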
<!ELEMENT Mark ( %InnerText;)*>
This element is used in the List
element to mark items. See 3.6-1 for an explanation.
<!ELEMENT Item ( %Text;)*>
This element is used in the List
, Enum
, and Table
elements to specify the items. See sections 3.6-1, 3.6-4, and 3.6-5 for further information.
<!ELEMENT Enum ( Item+ )>
<!ATTLIST Enum Only CDATA #IMPLIED
               Not  CDATA #IMPLIED>
This element is used identically to the List
element (see 3.6-1) except that the items may not have marks attached to them. Instead, the items are numbered automatically. The same comments about the Only
and Not
attributes as above apply.
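A minimal enumeration, with illustrative item texts, could look like this:

```xml
<!-- the items are numbered automatically; no Mark elements are allowed -->
<Enum>
  <Item>Write the XML source.</Item>
  <Item>Validate it against the DTD.</Item>
  <Item>Run the converters.</Item>
</Enum>
```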
<!ELEMENT Table ( Caption?, (Row | HorLine)+ )>
<!ATTLIST Table Label CDATA #IMPLIED
Only CDATA #IMPLIED
Not CDATA #IMPLIED
Align CDATA #REQUIRED>
<!-- We allow | and l,c,r, nothing else -->
<!ELEMENT Row ( Item+ )>
<!ELEMENT HorLine EMPTY>
<!ELEMENT Caption ( %InnerText;)*>
A table in GAPDoc consists of an optional Caption
element followed by a sequence of Row
and HorLine
elements. A HorLine
element produces a horizontal line in the table. A Row
element consists of a sequence of Item
elements as they also occur in List
and Enum
elements. The Only
and Not
attributes have the same functionality as described in the List
element in 3.6-1.
The Align
attribute is written like a LaTeX tabular alignment specifier but only the letters "l
", "r
", "c
", and "|
" are allowed meaning left alignment, right alignment, centered alignment, and a vertical line as delimiter between columns respectively.
If the Label
attribute is there, one can reference the table with the Ref
element (see 3.5-1) using its Label
attribute.
Usually only simple tables should be used. If you want a complicated table in the LaTeX output you should provide alternatives for text and HTML output. Note that in HTML-4.0 there is no possibility to interpret the "|
" column separators and HorLine
elements as intended. There are lines between all columns and rows or no lines at all.
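As a sketch of a simple table, the following fragment tabulates the qualifiers of content models explained earlier in this chapter. The label and caption are hypothetical.

```xml
<!-- three columns: left aligned, centered, right aligned, with vertical lines -->
<Table Align="|l|c|r|" Label="tab:qualifiers">
  <Caption>Qualifiers in content models</Caption>
  <HorLine/>
  <Row><Item>Qualifier</Item><Item>Minimum</Item><Item>Maximum</Item></Row>
  <HorLine/>
  <Row><Item>?</Item><Item>0</Item><Item>1</Item></Row>
  <Row><Item>*</Item><Item>0</Item><Item>unbounded</Item></Row>
  <Row><Item>+</Item><Item>1</Item><Item>unbounded</Item></Row>
  <HorLine/>
</Table>
```

Each Row should contain as many Item elements as there are column letters in the Align attribute. The table can be referenced with `<Ref Label="tab:qualifiers"/>`.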
3.7 Types of Text
This section covers the markup of text. Various types of "text" exist. The following elements are used in the GAPDoc system to mark them. They mostly come in pairs, one long name which is easier to remember and a shortcut to make the markup "lighter".
Most of the following elements are meant to contain only character data and no further markup elements. It is however necessary to allow Alt
elements to resolve the entities described in section 2.2-3.
3.7-1 Emph and E
<!ELEMENT Emph (%InnerText;)*> <!-- Emphasize something -->
<!ELEMENT E (%InnerText;)*> <!-- the same as shortcut -->
This element is used to emphasize some piece of text. It may contain %InnerText;
(see here).
3.7-2 Quoted and Q
<!ELEMENT Quoted (%InnerText;)*> <!-- Quoted (in quotes) text -->
<!ELEMENT Q (%InnerText;)*> <!-- Quoted text (shortcut) -->
This element is used to put some piece of text into " "-quotes. It may contain %InnerText;
(see here).
3.7-3 Keyword and K
<!ELEMENT Keyword (#PCDATA|Alt)*> <!-- Keyword -->
<!ELEMENT K (#PCDATA|Alt)*> <!-- Keyword (shortcut) -->
This element is used to mark something as a keyword. Usually this will be a GAP keyword such as "if
" or "for
". No further markup elements are allowed within this element except for the Alt
element, which is necessary.
3.7-4 Arg and A
<!ELEMENT Arg (#PCDATA|Alt)*> <!-- Argument -->
<!ELEMENT A (#PCDATA|Alt)*> <!-- Argument (shortcut) -->
This element is used inside Description
s in ManSection
s to mark something as an argument (of a function, operation, or such). It is guaranteed that the converters typeset those exactly as in the definition of functions. No further markup elements are allowed within this element.
3.7-5 Code and C
<!ELEMENT Code (#PCDATA|Alt)*> <!-- GAP code -->
<!ELEMENT C (#PCDATA|Alt)*> <!-- GAP code (shortcut) -->
This element is used to mark something as a piece of code like for example a GAP expression. It is guaranteed that the converters typeset this exactly as in the Listing
element (compare section 3.7-9). No further markup elements are allowed within this element.
3.7-6 File and F
<!ELEMENT File (#PCDATA|Alt)*> <!-- Filename -->
<!ELEMENT F (#PCDATA|Alt)*> <!-- Filename (shortcut) -->
This element is used to mark something as a filename or a pathname in the file system. No further markup elements are allowed within this element.
3.7-7 Button and B
<!ELEMENT Button (#PCDATA|Alt)*> <!-- "Button" (also Menu, Key, ...) -->
<!ELEMENT B (#PCDATA|Alt)*> <!-- "Button" (shortcut) -->
This element is used to mark something as a button. It can also be used for other items in a graphical user interface like menus, menu entries, or keys. No further markup elements are allowed within this element.
3.7-8 Package
<!ELEMENT Package (#PCDATA|Alt)*> <!-- A package name -->
This element is used to mark something as a name of a package. This is for example used to define the entities GAP, XGAP or GAPDoc (see section 2.2-3). No further markup elements are allowed within this element.
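The text elements introduced so far can be combined as in the following sketch. The function, file, and package names are hypothetical, and the short and long element names can be used interchangeably.

```xml
The function <C>LibFunc</C> takes an argument <A>x</A>; see the file
<F>lib/example.g</F> of the <Package>MyPackage</Package> package.
Use the keyword <K>if</K> <E>only</E> in <Q>conditional</Q> code, and
press <B>Return</B> to finish the input.
```

Writing `<Emph>only</Emph>` instead of `<E>only</E>` gives the same result; the shortcuts merely make the markup lighter.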
3.7-9 <Listing>
<!ELEMENT Listing (#PCDATA)> <!-- This is just for GAP code listings -->
<!ATTLIST Listing Type CDATA #IMPLIED> <!-- a comment about the type of
listed code, may appear in
output -->
This element is used to embed listings of programs into the document. Only character data and no other elements are allowed in the content. You should not use the character entities described in section 2.2-3 but instead type the characters directly. Only the general XML rules from section 2.1 apply. Note especially the usage of <![CDATA[ sections described there. It is guaranteed that all characters use a fixed width font for typesetting Listing elements. Compare also the usage of the Code and C elements in 3.7-5.
The Type attribute contains a comment about the type of listed code. It may appear in the output.
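A sketch of a listing that relies on the general XML rules only: the code is wrapped in a CDATA section so that the character < can be typed directly. The function name and the Type text are invented for illustration.

```xml
<Listing Type="GAP function (hypothetical)">
<![CDATA[
IsSmall := function(n)
  return n < 10 and n > -10;
end;
]]>
</Listing>
```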
3.7-10 <Example> and <Log>
<!ELEMENT Example (#PCDATA)> <!-- This is subject to the automatic
example checking mechanism -->
<!ELEMENT Log (#PCDATA)> <!-- This not -->
These two elements behave exactly like the Listing element (see 3.7-9). They are intended for transcripts of GAP sessions. The only difference between the two is that Example sections are intended to be subject to an automatic manual checking mechanism, used to ensure the correctness of the GAP manual, whereas Log is not touched by this.
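As a rough guide to choosing between the two: output that is the same in every session suits Example, while output that varies from run to run is better placed in Log. The session below is a sketch, and the number shown in the Log is an arbitrary illustrative value.

```xml
<Example>
gap> 2 + 3;
5
gap> Factorial(5);
120
</Example>

<!-- the runtime below is an invented value and differs between sessions -->
<Log>
gap> Runtime();
1234
</Log>
```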
3.7-11 <Verb>
There is one further type of verbatim-like element.
<!ELEMENT Verb (#PCDATA)>
The content of such an element is guaranteed to be put into an output version exactly as it is using some fixed width font. Before the content a new line is started. If the line after the end of the start tag consists of whitespace only then this part of the content is skipped.
This element is intended to be used together with the Alt element to specify pre-formatted ASCII alternatives for complicated Display formulae or Tables.
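The combination with Alt might look as follows. This is only a sketch with an invented formula: the LaTeX converter gets the displayed formula, all other converters get a hand-formatted ASCII version.

```xml
<!-- illustrative formula, not taken from the GAP manual -->
<Alt Only="LaTeX"><Display>\frac{a + b}{c + d}</Display></Alt>
<Alt Not="LaTeX"><Verb>
   a + b
  -------
   c + d
</Verb></Alt>
```

The line directly after the Verb start tag is empty here, so by the rule stated above it is skipped rather than printed.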
3.8 Elements for Mathematical Formulae
3.8-1 <Math> and <Display>
<!-- Normal TeX math mode formula -->
<!ELEMENT Math (#PCDATA|A|Arg|Alt)*>
<!-- TeX displayed math mode formula -->
<!ELEMENT Display (#PCDATA|A|Arg|Alt)*>
These elements are used for mathematical formulae. As described in section 2.2-2 they correspond to LaTeX's math and display math mode respectively.
The formulae are typed in as in LaTeX, except that the standard XML entities, see 2.1-9 (in particular the characters < and &), must be escaped, either by using the corresponding entities or by enclosing the formula between "<![CDATA[" and "]]>". (The main reference for LaTeX is [L85].)
The only element type that is allowed within the formula elements is the Arg or A element (see 3.7-4), which is used to typeset identifiers that are arguments to GAP functions or operations.
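The two ways of escaping special characters, and the use of an argument inside a formula, can be sketched like this (the formulae and the argument name n are invented for illustration):

```xml
<!-- escaping with the predefined entity -->
<Math>a &lt; b</Math>

<!-- the same kind of content protected by a CDATA section -->
<Display><![CDATA[\begin{array}{cc} a & b \\ c & d \end{array}]]></Display>

<!-- an argument of a hypothetical GAP function inside a formula -->
<Math><A>n</A>^2 + 1</Math>
```

A CDATA section cannot contain markup, so a formula that uses the A element has to fall back on entities for any special characters.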
In text and HTML output these formulae are shown as LaTeX source code. For simple formulae (and you should try to make all your formulae simple!) there is the element M (see 3.8-2) for which there is a well defined translation into text, which can be used for text and HTML output versions of the document. So, if possible, try to avoid the Math and Display elements or provide useful text substitutes for complicated formulae via Alt elements (see 3.9-1 and 3.7-11).
3.8-2 <M>
<!-- Math with well defined translation to text output -->
<!ELEMENT M (#PCDATA|A|Arg|Alt)*>
The "M" element type is intended for formulae in the running text for which there is a sensible ASCII version. For the LaTeX version of a GAPDoc document the M and Math elements are equivalent. The remarks in 3.8-1 about special characters and the Arg element apply here as well. A document which has all formulae enclosed in M elements can be well readable in text terminal output and printed output versions.
The following LaTeX macros have a sensible ASCII translation and are guaranteed to be translated accordingly by text (and HTML) converters:
Table: LaTeX macros with special text translation

Macro              Text translation
\ldots             ...
\mid               |
\left              (empty)
\right             (empty)
\mathbb            (empty)
\mathop            (empty)
\limits            (empty)
\cdot              *
\ast               *
\geq               >=
\leq               <=
\pmod              mod
\equiv             =
\rightarrow        ->
\hookrightarrow    ->
\to                ->
\longrightarrow    -->
\Rightarrow        =>
\Longrightarrow    ==>
\Leftarrow         <=
\iff               <=>
\mapsto            ->
\leftarrow         <-
\langle            <
\rangle            >
\setminus          \
In all other macros only the backslash is removed. Whitespace is normalized (to one blank) but not removed. Note that whitespace is not added, so you may want to add a few more spaces than you usually do in your LaTeX documents.
Braces {} are removed in general, however pairs of double braces are converted to one pair of braces. This can be used to write x^{12} for x^12 and x_{{i+1}} for x_{i+1}.
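A few formulae that stay within the translation rules described above, given as a sketch with invented content. According to the table, \to should appear as -> and \leq as <= in text output. Spaces are typed explicitly around the macros because the converters do not add any.

```xml
<!-- illustrative formulae using only macros from the table -->
<M>f: G \to H</M>
<M>1 \leq i \leq n</M>
<M>x^{12} + x_{{i+1}}</M>
```

In the last formula the single braces around 12 are dropped in text output, while the double braces around i+1 leave one pair of braces behind.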
3.9 Everything else
3.9-1 <Alt>
This element is used to specify alternatives for different output formats within normal text. See also sections 3.6-1, 3.6-4, and 3.6-5 for alternatives in lists and tables.
<!ELEMENT Alt (%InnerText;)*> <!-- This is only to allow "Only" and
"Not" attributes for normal text -->
<!ATTLIST Alt Only CDATA #IMPLIED
Not CDATA #IMPLIED>
Of course exactly one of the two attributes must occur in one element. The following values are allowed as of now: "LaTeX", "HTML", and "Text". If the Only attribute is specified then only the corresponding converter will include the content of the element into the output document. If the Not attribute is specified the corresponding converter will ignore the content of the element.
Within the element only %InnerText; (see here) is allowed. This is to ensure that the same set of chapters, sections, and subsections show up in all output formats.
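A sketch of format-specific wording in running text, using the three attribute values named above (the sentences are invented for illustration):

```xml
<!-- illustrative text; each converter keeps a different subset -->
This manual is available
<Alt Only="HTML">as a set of linked pages.</Alt>
<Alt Only="LaTeX">as a printable document.</Alt>
<Alt Only="Text">in the terminal help viewer.</Alt>
<Alt Not="Text">It contains typeset formulae.</Alt>
```

Each converter should keep exactly one of the first three alternatives, and the last sentence is dropped only by the text converter.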
3.9-2 <Par> and <P>
<!ELEMENT Par EMPTY> <!-- this is intentionally empty! -->
<!ELEMENT P EMPTY> <!-- this is intentionally empty! -->
This EMPTY element marks the boundary of paragraphs. Note that an empty line in the input does not mark a new paragraph as opposed to the LaTeX convention.
(Remark: it would be much easier to parse a document and to understand its sectioning and paragraph structure if there were an element whose content is the text of a paragraph. In practice, however, many paragraph boundaries are implicitly clear, which would make it somewhat painful to enclose each paragraph in extra tags. The introduction of the P or Par elements as above delegates this pain to the writer of a conversion program for GAPDoc documents.)
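The difference from the LaTeX convention can be sketched as follows (the text is invented for illustration). Only the empty element starts a new paragraph; the blank line further down does not.

```xml
<!-- illustrative text -->
First paragraph of the text.
<P/>
Second paragraph, started by the empty element above.

This line still belongs to the second paragraph.
```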