GAPDoc ( Version 1.6.7 ) February 2024 Frank Lübeck Max Neunhöffer Frank Lübeck Email: mailto:Frank.Luebeck@Math.RWTH-Aachen.De Homepage: https://www.math.rwth-aachen.de/~Frank.Luebeck ------------------------------------------------------- Copyright © 2000-2024 by Frank Lübeck and Max Neunhöffer GAPDoc is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License (https://www.fsf.org/licenses/gpl.html) as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. ------------------------------------------------------- Contents (GAPDoc) 1 Introduction and Example 1.1 XML 1.2 A complete example 1.3 Some questions 2 How To Type a GAPDoc Document 2.1 General XML Syntax 2.1-1 Head of XML Document 2.1-2 Comments 2.1-3 Processing Instructions 2.1-4 Names in XML and Whitespace 2.1-5 Elements 2.1-6 Start Tags 2.1-7 End Tags 2.1-8 Combined Tags for Empty Elements 2.1-9 Entities 2.1-10 Special Characters in XML 2.1-11 Rules for Attribute Values 2.1-12 CDATA 2.1-13 Encoding of an XML Document 2.1-14 Well Formed and Valid XML Documents 2.2 Entering GAPDoc Documents 2.2-1 Other special characters 2.2-2 Mathematical Formulae 2.2-3 More Entities 3 The Document Type Definition 3.1 What is a DTD? 
3.2 Overall Document Structure 3.2-1 <Book> 3.2-2 <TitlePage> 3.2-3 <Title> 3.2-4 <Subtitle> 3.2-5 <Version> 3.2-6 <TitleComment> 3.2-7 <Author> 3.2-8 <Date> 3.2-9 <Address> 3.2-10 <Abstract> 3.2-11 <Copyright> 3.2-12 <Acknowledgements> 3.2-13 <Colophon> 3.2-14 <TableOfContents> 3.2-15 <Bibliography> 3.2-16 <TheIndex> 3.3 Sectioning Elements 3.3-1 <Body> 3.3-2 <Chapter> 3.3-3 <Heading> 3.3-4 <Appendix> 3.3-5 <Section> 3.3-6 <Subsection> 3.4 ManSection–a special kind of subsection 3.4-1 <ManSection> 3.4-2 <Func> 3.4-3 <Oper> 3.4-4 <Constr> 3.4-5 <Meth> 3.4-6 <Filt> 3.4-7 <Prop> 3.4-8 <Attr> 3.4-9 <Var> 3.4-10 <Fam> 3.4-11 <InfoClass> 3.5 Cross Referencing and Citations 3.5-1 <Ref> 3.5-2 <Label> 3.5-3 <Cite> 3.5-4 <Index> 3.5-5 <URL> 3.5-6 <Email> 3.5-7 <Homepage> 3.6 Structural Elements like Lists 3.6-1 <List> 3.6-2 <Mark> 3.6-3 <Item> 3.6-4 <Enum> 3.6-5 <Table> 3.7 Types of Text 3.7-1 <Emph> and <E> 3.7-2 <Quoted> and <Q> 3.7-3 <Keyword> and <K> 3.7-4 <Arg> and <A> 3.7-5 <Code> and <C> 3.7-6 <File> and <F> 3.7-7 <Button> and <B> 3.7-8 <Package> 3.7-9 <Listing> 3.7-10 <Log> and <Example> 3.7-11 <Verb> 3.8 Elements for Mathematical Formulae 3.8-1 <Math> and <Display> 3.8-2 <M> 3.9 Everything else 3.9-1 <Alt> 3.9-2 <Par> and <P> 3.9-3 <Br> 3.9-4 <Ignore> 4 Distributing a Document into Several Files 4.1 The Conventions 4.2 A Tool for Collecting a Document 4.2-1 ComposedDocument 4.2-2 OriginalPositionDocument 4.2-3 FilenameGAP 5 The Converters and an XML Parser 5.1 Producing Documentation from Source Files 5.1-1 MakeGAPDocDoc 5.2 Parsing XML Documents 5.2-1 ParseTreeXMLString 5.2-2 StringXMLElement 5.2-3 EntitySubstitution 5.2-4 DisplayXMLStructure 5.2-5 ApplyToNodesParseTree 5.2-6 GetTextXMLTree 5.2-7 XMLElements 5.2-8 CheckAndCleanGapDocTree 5.2-9 AddParagraphNumbersGapDocTree 5.2-10 InfoXMLParser 5.2-11 XMLValidate 5.2-12 ValidateGAPDoc 5.3 The Converters 5.3-1 GAPDoc2LaTeX 5.3-2 GAPDoc2Text 5.3-3 GAPDoc2TextPrintTextFiles 5.3-4 AddPageNumbersToSix 5.3-5 PrintSixFile 5.3-6
SetGAPDocTextTheme 5.3-7 GAPDoc2HTML 5.3-8 GAPDoc2HTMLPrintHTMLFiles 5.3-9 Stylesheet files 5.3-10 CopyHTMLStyleFiles 5.3-11 SetGAPDocHTMLStyle 5.3-12 InfoGAPDoc 5.3-13 SetGapDocLanguage 5.4 Testing Manual Examples 5.4-1 ExtractExamples 5.4-2 RunExamples 6 String and Text Utilities 6.1 Text Utilities 6.1-1 WHITESPACE 6.1-2 TextAttr 6.1-3 WrapTextAttribute 6.1-4 FormatParagraph 6.1-5 SubstitutionSublist 6.1-6 StripBeginEnd 6.1-7 StripEscapeSequences 6.1-8 RepeatedString 6.1-9 NumberDigits 6.1-10 LabelInt 6.1-11 PositionMatchingDelimiter 6.1-12 WordsString 6.1-13 Base64String 6.2 Unicode Strings 6.2-1 Unicode Strings and Characters 6.2-2 Encode 6.2-3 Lengths of UTF-8 strings 6.2-4 InitialSubstringUTF8String 6.3 Print Utilities 6.3-1 PrintTo1 6.3-2 StringPrint 6.3-3 PrintFormattedString 6.3-4 Page 6.3-5 StringFile 7 Utilities for Bibliographies 7.1 Parsing BibTeX Files 7.1-1 ParseBibFiles 7.1-2 NormalizedNameAndKey 7.1-3 WriteBibFile 7.1-4 LabelsFromBibTeX 7.1-5 InfoBibTools 7.2 The BibXMLext Format 7.3 Utilities for BibXMLext data 7.3-1 Translating BibTeX to BibXMLext 7.3-2 HeuristicTranslationsLaTeX2XML.Apply 7.3-3 StringBibAsXMLext 7.3-4 ParseBibXMLextString 7.3-5 WriteBibXMLextFile 7.3-6 Bibliography Entries as Records 7.3-7 RecBibXMLEntry 7.3-8 AddHandlerBuildRecBibXMLEntry 7.3-9 StringBibXMLEntry 7.3-10 TemplateBibXML 7.4 Getting BibTeX entries from MathSciNet 7.4-1 SearchMR A The File 3k+1.xml B The File gapdoc.dtd C The File bibxmlext.dtd ──────────────────────────────────────────────────────────────────────────── 1 Introduction and Example The main purpose of the GAPDoc package is to define a file format for documentation of GAP-programs and -packages (see [GAP06]). The problem is that such documentation should be readable in several output formats. 
For example it should be possible to read the documentation inside the terminal in which GAP is running (a text mode) and there should be a printable version in high typesetting quality (produced by some version of TeX). It is also popular to view GAP's online help with a Web-browser via an HTML-version of the documentation. Nowadays one can use LaTeX and standard viewer programs to produce and view on the screen dvi- or pdf-files with full support of internal and external hyperlinks. Certainly there will be other interesting document formats and tools in this direction in the future.

Our aim is to find a format for writing the documentation which allows a relatively easy translation into the output formats just mentioned and which hopefully makes it easy to translate to future output formats as well.

To make documentation written in the GAPDoc format directly usable, we also provide a set of programs, called converters, which produce text-, hyperlinked LaTeX- and HTML-output versions of a GAPDoc document. These programs are developed by the first named author. They run completely inside GAP, i.e., no external programs are needed. You only need latex and pdflatex to process the LaTeX output. These programs are described in Chapter 5.

1.1 XML

The definition of the GAPDoc format uses XML, the "eXtensible Markup Language". This is a standard (defined by the W3C consortium, see https://www.w3c.org) which lays down a syntax for adding markup to a document or to some data. It allows one to define document structures by introducing markup elements and certain relations between them. This is done in a document type definition. The file gapdoc.dtd contains such a document type definition and is the central part of the GAPDoc package.

The easiest way to get a good idea of this is probably to look at an example. Appendix A contains a short but complete GAPDoc document for a fictitious share package.
In the next section we will go through this document, explain basic facts about XML and the GAPDoc document type, and give pointers to more details in later parts of this documentation.

In the last Section 1.3 of this introductory chapter we try to answer some general questions about the decisions which led to the GAPDoc package.

1.2 A complete example

In this section we recall the lines from the example document in Appendix A and give some explanations.

───────────────────────────── from 3k+1.xml ──────────────────────────────
<?xml version="1.0" encoding="UTF-8"?>
────────────────────────────────────────────────────────────────────────────

This line just tells a human reader and computer programs that the file is a document with XML markup and that the text is encoded in the UTF-8 character set (other common encodings are ASCII or ISO-8859-X encodings).

───────────────────────────── from 3k+1.xml ──────────────────────────────
<!-- A complete "fake package" documentation  -->
────────────────────────────────────────────────────────────────────────────

Everything in an XML file between "<!--" and "-->" is a comment and not part of the document content.

───────────────────────────── from 3k+1.xml ──────────────────────────────
<!DOCTYPE Book SYSTEM "gapdoc.dtd">
────────────────────────────────────────────────────────────────────────────

This line says that the document contains markup which is defined in the system file gapdoc.dtd and that the markup obeys certain rules defined in that file (the ending dtd means "document type definition"). It further says that the actual content of the document consists of an element with name "Book".

And we can really see that the remaining part of the file is enclosed as follows:

───────────────────────────── from 3k+1.xml ──────────────────────────────
<Book Name="3k+1">
[...]
(content omitted)
</Book>
────────────────────────────────────────────────────────────────────────────

This demonstrates the basics of the markup in XML. This part of the document is an "element". It consists of the "start tag" <Book Name="3k+1">, the "element content" and the "end tag" </Book> (end tags always start with </). This element also has an "attribute" Name whose "value" is 3k+1.

If you know HTML, this will look familiar to you. But there are some important differences: The element name Book and attribute name Name are case sensitive. The value of an attribute must always be enclosed in quotes. In XML every element has a start and end tag (which can be combined for elements defined as "empty", see for example <TableOfContents/> below).

If you know LaTeX, you are familiar with quite different types of markup, for example: The equivalent of the Book element in LaTeX is \begin{document} ... \end{document}. The sectioning in LaTeX is not done by explicit start and end markup, but implicitly via heading commands like \section. Other markup is done by using braces {} and putting some commands inside. And for mathematical formulae one can use the $ for the start and the end of the markup. In XML all markup looks similar to that of the Book element.

The content of the book starts with a title page.

───────────────────────────── from 3k+1.xml ──────────────────────────────
<TitlePage>
  <Title>The <Package>ThreeKPlusOne</Package> Package</Title>
  <Version>Version 42</Version>
  <Author>Dummy Authör
    <Email>3kplusone@dev.null</Email>
  </Author>

  <Copyright>&copyright; 2000 The Author. <P/>
    You can do with this package what you want.<P/>
    Really.
  </Copyright>
</TitlePage>
────────────────────────────────────────────────────────────────────────────

The content of the TitlePage element consists again of elements. In Chapter 3 we describe which elements are allowed within a TitlePage and that their ordering is prescribed in this case. In the (stupid) name of the author you see that a German umlaut is used directly (in the UTF-8 encoding declared in the first line of the file).

Contrary to LaTeX- or HTML-files this markup does not say anything about the actual layout of the title page in any output version of the document. It just adds information about the meaning of pieces of text.

Within the Copyright element there are two more things to learn about XML markup. The <P/> is a complete element. It is a combined start and end tag. This shortcut is allowed for elements which are defined to be always "empty", i.e., to have no content. You may have already guessed that <P/> is used as a paragraph separator. Note that empty lines do not separate paragraphs (contrary to LaTeX).

The other construct we see here is &copyright;. This is an example of an "entity" in XML and is a macro for some substitution text. Here we use an entity as a shortcut for a complicated expression which makes it possible that the term copyright is printed as some text like (C) in text terminal output and as a copyright character in other output formats. In GAPDoc we predefine some entities. Certain "special characters" must be typed via entities, for example "<", ">" and "&" to avoid a misinterpretation as XML markup. It is possible to define additional entities for your document inside the <!DOCTYPE ...> declaration, see 2.2-3.

Note that elements in XML must always be properly nested, as in this example. A construct like <a><b>...</a></b> is not allowed.

───────────────────────────── from 3k+1.xml ──────────────────────────────
<TableOfContents/>
────────────────────────────────────────────────────────────────────────────

This is another example of an "empty element". It just means that a table of contents for the whole document should be included into any output version of the document.

After this the main text of the document follows inside certain sectioning elements:

───────────────────────────── from 3k+1.xml ──────────────────────────────
<Body>
<Chapter> <Heading>The 3k+1 Problem</Heading>
<Section Label="sec:theory"> <Heading>Theory</Heading>
[...] (content omitted)
</Section>
<Section> <Heading>Program</Heading>
[...] (content omitted)
</Section>
</Chapter>
</Body>
────────────────────────────────────────────────────────────────────────────

These elements are used similarly to "\chapter" and "\section" in LaTeX. But note that the explicit end tags are necessary here.

The sectioning elements allow one to assign an optional attribute "Label". This can be used for referring to a section inside the document.

The text of the first section starts as follows. The whitespace in the text is unimportant and the indenting is not necessary.

───────────────────────────── from 3k+1.xml ──────────────────────────────
Let <M>k \in &NN;</M> be a natural number. We consider the
sequence <M>n(i, k), i \in &NN;,</M> with <M>n(1, k) = k</M> and
else
────────────────────────────────────────────────────────────────────────────

Here we come to the interesting question of how to type mathematical formulae in a GAPDoc document. We did not find any alternative to writing formulae in TeX syntax. (There is MathML, but even simple formulae contain a lot of markup, become quite unreadable and are cumbersome to type. Furthermore there seem to be no tools available which translate such formulae in a nice way into TeX and text.) So, formulae are essentially typed as in LaTeX. (Actually, it is also possible to type Unicode characters of some mathematical symbols directly, or via an entity like the &NN; above.)

There are three types of elements containing formulae: "M", "Math" and "Display". The first two are for in-text formulae and the third is for displayed formulae. Here "M" and "Math" are equivalent when translating a GAPDoc document into LaTeX. But they are handled differently for terminal text (and HTML) output. For the content of an "M"-element there are defined rules for a translation into well readable terminal text. More complicated formulae are in "Math" or "Display" elements and they are just printed as they are typed in text output.
So, to make a section well readable inside a terminal window you should try to put as many formulae as possible into "M"-elements. In our example text we used the notation n(i, k) instead of n_i(k) because it is easier to read in text mode. See Sections 2.2-2 and 3.8 for more details.

A few lines further on we find two non-internal references.

───────────────────────────── from 3k+1.xml ──────────────────────────────
problem, see <Cite Key="..."/> or
<URL>http://mathsrv.ku-eichstaett.de/MGF/homes/wirsching/</URL>
────────────────────────────────────────────────────────────────────────────

The first within the "Cite"-element is the citation of a book. In GAPDoc we use the widely used BibTeX database format for reference lists. This does not use XML but has a well documented structure which is easy to parse. And many people have collections of references readily available in this format. The reference list in an output version of the document is produced with the empty element

───────────────────────────── from 3k+1.xml ──────────────────────────────
<Bibliography Databases="..."/>
────────────────────────────────────────────────────────────────────────────

close to the end of our example file. The attribute "Databases" gives the name(s) of the database (.bib) files which contain the references.

Putting a Web-address into an "URL"-element allows one to create a hyperlink in output formats which allow this.

The second section of our example contains a special kind of subsection defined in GAPDoc.

───────────────────────────── from 3k+1.xml ──────────────────────────────
<ManSection>
<Func Name="ThreeKPlusOneSequence" Arg="k[, max]"/>
<Description>
This function computes for a natural number <A>k</A> the
beginning of the sequence <M>n(i, k)</M> defined in section
<Ref Sect="sec:theory"/>. The sequence stops at the first
<M>1</M> or at <M>n(<A>max</A>, k)</M>, if <A>max</A> is
given.
<Example>
gap> ThreeKPlusOneSequence(101);
"Sorry, not yet implemented. Wait for Version 84 of the package"
</Example>
</Description>
</ManSection>
────────────────────────────────────────────────────────────────────────────

A "ManSection" contains the description of some function, operation, method, filter and so on.
The "Func"-element describes the name of a function (there are also similar elements "Oper", "Meth", "Filt" and so on) and names for its arguments, with optional arguments enclosed in square brackets. See Section 3.4 for more details.

In the "Description" we write the argument names as "A"-elements. A good description of a function should usually contain an example of its use. For this there are some verbatim-like elements in GAPDoc, like "Example" above (here, clearly, whitespace matters which causes a slightly strange indenting).

The text contains an internal reference to the first section via the explicitly defined label sec:theory.

The first section also contains a "Ref"-element which refers to the function described here. Note that there is no explicit label for such a reference. The pair <Func Name="..." /> and <Ref Func="..." /> does the cross referencing (and hyperlinking if possible) implicitly via the name of the function.

Here is one further element from our example document which we want to explain.

───────────────────────────── from 3k+1.xml ──────────────────────────────
<TheIndex/>
────────────────────────────────────────────────────────────────────────────

This is again an empty element which just says that an output version of the document should contain an index. Many entries for the index are generated automatically because the "Func" and similar elements implicitly produce such entries. It is also possible to include explicit additional entries in the index.

1.3 Some questions

Are those XML files too ugly to read and edit?
Just have a look and decide yourself. The markup needs more characters than most TeX or LaTeX markup. But the structure of the document is easier to see. If you configure your favorite editor well, you do not need more key strokes for typing the markup than in LaTeX.

Why do we not use LaTeX alone?
LaTeX is good for writing books.
But LaTeX files are generally difficult to parse and to process to other output formats like text for browsing in a terminal window or HTML (or new formats which may become popular in the future). GAPDoc markup is one step more abstract than LaTeX insofar as it describes meaning instead of appearance of text. The inner workings of LaTeX are too complicated to learn without pain, which makes it difficult to overcome problems that occur occasionally.

Why XML and not a newly defined markup language?
XML is a well defined standard that is more and more widely used. Lots of people have thought about it. Years of experience with SGML went into the design. It is easy to explain and easy to parse, lots of tools are available, and there will be more in the future.

2 How To Type a GAPDoc Document

In this chapter we give a more formal description of what you need to start to type documentation in GAPDoc XML format. Many details were already explained by example in Section 1.2 of the introduction.

We do not answer the question "How to write a GAPDoc document?" in this chapter. You can (hopefully) find an answer to this question by studying the example in the introduction, see 1.2, and learning about more details in the reference Chapter 3.

The definitive source for all details of the official XML standard with useful annotations is:

https://www.xml.com/axml/axml.html

Although this document must be quite technical, it is surprisingly well readable.

2.1 General XML Syntax

We will now discuss the pieces of text which can occur in a general XML document. We start with those pieces which do not contribute to the actual content of the document.

2.1-1 Head of XML Document

Each XML document should have a head which states that it is an XML document in some encoding and which XML-defined language is used. In case of a GAPDoc document this should always look as in the following example.
──────────────────────────────── Example ─────────────────────────────────
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book SYSTEM "gapdoc.dtd">
────────────────────────────────────────────────────────────────────────────

See 2.1-13 for a remark on the "encoding" statement.

(There may be local entity definitions inside the DOCTYPE statement, see Subsection 2.2-3 below.)

2.1-2 Comments

A "comment" in XML starts with the character sequence "<!--" and ends with the sequence "-->". Between these sequences there must not be two adjacent dashes "--".

2.1-3 Processing Instructions

A "processing instruction" in XML starts with the character sequence "<?" followed by a name and ends with the sequence "?>". The sequence "?>" must not occur within the processing instruction.

And now we turn to those parts of the document which contribute to its actual content.

2.1-4 Names in XML and Whitespace

A "name" in XML (used for element and attribute identifiers, see below) must start with a letter (in the encoding of the document) or with a colon ":" or underscore "_" character. The following characters may also be digits, dots "." or dashes "-".

This is a simplified description of the rules in the standard, which are concerned with lots of Unicode ranges to specify what a "letter" is.

Sequences only consisting of the following characters are considered as whitespace: blanks, tabs, carriage return characters and new line characters.

2.1-5 Elements

The actual content of an XML document consists of "elements". An element has some "content" with a leading "start tag" (2.1-6) and a trailing "end tag" (2.1-7). The content can contain further elements but they must be properly nested. One can define elements whose content is always empty; those elements can also be entered with a single combined tag (2.1-8).

2.1-6 Start Tags

A "start-tag" consists of a less-than-character "<" directly followed (without whitespace) by an element name (see 2.1-4), optional attributes, optional whitespace, and a greater-than-character ">".
An "attribute" consists of some whitespace and then its name followed by an equal sign "=" which is optionally enclosed by whitespace, and the attribute value, which is enclosed either in single or double quotes. The attribute value may not contain the type of quote used as a delimiter or the character "<"; the character "&" may only appear to start an entity, see 2.1-9. We describe in 2.1-11 how to enter special characters in attribute values.

Note especially that no whitespace is allowed between the starting "<" character and the element name. The quotes around an attribute value cannot be omitted. The names of elements and attributes are case sensitive.

2.1-7 End Tags

An "end tag" consists of the two characters "</" directly followed by the element name, optional whitespace and a greater-than-character ">".

2.1-8 Combined Tags for Empty Elements

Elements which always have empty content can be written with a single tag. This looks like a start tag (see 2.1-6) except that the trailing greater-than-character ">" is substituted by the two character sequence "/>".

2.1-9 Entities

An "entity" in XML is a macro for some substitution text. There are two types of entities.

A "character entity" can be used to specify characters in the encoding of the document (this can be useful for entering non-ASCII characters which you cannot manage to type in directly). They are entered with a sequence "&#", directly followed by either some decimal digits or an "x" and some hexadecimal digits, directly followed by a semicolon ";". Using such a character entity is just equivalent to typing the corresponding character directly.

Then there are references to "named entities". They are entered with an ampersand character "&" directly followed by a name which is directly followed by a semicolon ";". Such entities must be declared somewhere by giving a substitution text. This text is included in the document and the document is parsed again afterwards. The exact rules are a bit subtle but you probably want to use this only in simple cases.
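As a small illustration of the two kinds of entities, the following fragment is a sketch only. It assumes a UTF-8 document using gapdoc.dtd, so that the named entity &GAP; (listed among the predefined entities in 2.2-3) is declared; the sentences themselves are made up for this illustration.

```xml
<!-- Character entities: decimal and hexadecimal form of the same
     character U+00F6 (the umlaut in "Authör") -->
Dummy Auth&#246;r and Dummy Auth&#xF6;r denote the same text.

<!-- Named entity: must be declared somewhere; here it is assumed
     to come from gapdoc.dtd -->
This package needs &GAP;.
```

Both spellings of the author name are equivalent to typing the umlaut directly, whereas &GAP; is replaced by its declared substitution text before the document is parsed further.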
Predefined entities for GAPDoc are described in 2.1-10 and 2.2-3.

2.1-10 Special Characters in XML

We have seen that the less-than-character "<" and the ampersand character "&" start a tag or entity reference in XML. To get these characters into the document text one has to use entity references, namely "&lt;" to get "<" and "&amp;" to get "&". Furthermore "&gt;" must be used to get ">" when the string "]]>" appears in element content (and not as delimiter of a CDATA section explained below).

Another possibility is to use a CDATA statement explained in 2.1-12.

2.1-11 Rules for Attribute Values

Attribute values can contain entities which are substituted recursively. But except for the entity &lt; or a character entity it is not allowed that a < character is introduced by the substitution (there is no XML parsing for evaluating the attribute value, just entity substitutions).

2.1-12 CDATA

Pieces of text which contain many characters which can be misinterpreted as markup can be enclosed by the character sequences "<![CDATA[" and "]]>". Everything between these sequences is considered as content of the document and is not further interpreted as XML text. All the rules explained so far in this section do not apply to such a part of the document. The only document content which cannot be entered directly inside a CDATA statement is the sequence "]]>". This can be entered as "]]&gt;" outside the CDATA statement.

──────────────────────────────── Example ─────────────────────────────────
A nesting of tags like <![CDATA[ <a> <b> </a> </b> ]]> is not allowed.
────────────────────────────────────────────────────────────────────────────

2.1-13 Encoding of an XML Document

We suggest using the UTF-8 encoding for writing GAPDoc XML documents. But the tools described in Chapter 5 also work with ASCII or the various ISO-8859-X encodings (ISO-8859-1 is also called latin1 and covers most special characters for western European languages).
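For instance, a document saved in latin1 would announce this in its head. This is only a sketch, under the assumption that the file really is stored in ISO-8859-1; apart from the encoding name it is the same head as in 2.1-1.

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- The declared encoding must match the bytes actually stored in the file -->
<!DOCTYPE Book SYSTEM "gapdoc.dtd">
```

If the declaration and the actual file encoding disagree, non-ASCII characters such as umlauts are likely to be misread or rejected by an XML parser.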
2.1-14 Well Formed and Valid XML Documents

We want to mention two further important words which are often used in the context of XML documents. A piece of text becomes a "well formed" XML document if all the formal rules described in this section are fulfilled.

But this says nothing about the content of the document. To give this content a meaning one needs a declaration of the element and corresponding attribute names as well as of named entities which are allowed. Furthermore there may be restrictions on how such elements can be nested. This definition of an XML based markup language is done in a "document type definition". An XML document which contains only elements and entities declared in such a document type definition and obeys the rules given there is called "valid (with respect to this document type definition)".

The main file of the GAPDoc package is gapdoc.dtd. This contains such a definition of a markup language. We are not going to explain the formal syntax rules for document type definitions in this section. But in Chapter 3 we will explain enough about it to understand the file gapdoc.dtd and so the markup language defined there.

2.2 Entering GAPDoc Documents

Here are some additional rules for writing GAPDoc XML documents.

2.2-1 Other special characters

As GAPDoc documents are used to produce LaTeX and HTML documents, the question arises how to deal with characters with a special meaning for other applications. For example "&", "#", "$", "%", "~", "\", "{", "}", "_", "^" and " " (the non-breakable space, "~" in LaTeX) have a special meaning for LaTeX, and "&", "<", ">" have a special meaning for HTML (and XML). In GAPDoc you can usually just type these characters directly; it is the task of the converter programs which translate to some output format to take care of such special characters. The exceptions to this simple rule are:

• & and < must be entered as &amp; and &lt; as explained in 2.1-10.
• The content of the GAPDoc elements <M>, <Math> and <Display> is LaTeX code, see 3.8.
• The content of an <Alt> element with Only attribute contains code for the specified output type, see 3.9-1.

Remark: In former versions of GAPDoc one had to use particular entities for all the special characters mentioned above (&tamp;, &hash;, &dollar;, &percent;, &tilde;, &bslash;, &obrace;, &cbrace;, &uscore;, &circum;, &tlt;, &tgt;). These are no longer needed, but they are still defined for backwards compatibility with older GAPDoc documents.

2.2-2 Mathematical Formulae

Mathematical formulae in GAPDoc are typed as in LaTeX. They must be the content of one of three types of GAPDoc elements concerned with mathematical formulae: "Math", "Display", and "M" (see Sections 3.8-1 and 3.8-2 for more details). The first two correspond to LaTeX's math mode and display math mode. The last one is a special form of the "Math" element type that imposes certain restrictions on the content. On the other hand the content of an "M" element is processed in a well defined way for text terminal or HTML output. The "Display" element also has an attribute such that its content is processed as in "M" elements.

Note that the content of these elements is LaTeX code, but the special characters "<" and "&" for XML must be entered via the entities described in 2.1-10 or by using a CDATA statement, see 2.1-12.

2.2-3 More Entities

In GAPDoc there are some more predefined entities:

┌─────────────┬─────────┐
│ &GAP;       │ GAP     │
├─────────────┼─────────┤
│ &GAPDoc;    │ GAPDoc  │
├─────────────┼─────────┤
│ &TeX;       │ TeX     │
├─────────────┼─────────┤
│ &LaTeX;     │ LaTeX   │
├─────────────┼─────────┤
│ &BibTeX;    │ BibTeX  │
├─────────────┼─────────┤
│ &MeatAxe;   │ MeatAxe │
├─────────────┼─────────┤
│ &XGAP;      │ XGAP    │
├─────────────┼─────────┤
│ &copyright; │ ©       │
├─────────────┼─────────┤
│ &nbsp;      │ " "     │
├─────────────┼─────────┤
│ &ndash;     │ –       │
└─────────────┴─────────┘

Table: Predefined Entities in the GAPDoc system

Here &nbsp; is a non-breakable space character.
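The predefined entities are used like any other named entity. The following sentence is a made-up sketch, not taken from a real manual; it only assumes a document that uses gapdoc.dtd, so that the entity names from the table are available.

```xml
<!-- Hypothetical sentence, only to show the entity references in use -->
The &GAPDoc; converters run inside &GAP; and produce &LaTeX; output;
references are read from &BibTeX; files (see pages 3&ndash;5).
```

Writing the names via entities rather than as plain text lets each converter pick a suitable rendering for its output format.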
Additional entities are defined for some mathematical symbols, see 3.8 for more details.

One can define further local entities right inside the head (see 2.1-1) of a GAPDoc XML document as in the following example.

──────────────────────────────── Example ─────────────────────────────────
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Book SYSTEM "gapdoc.dtd"
  [ <!ENTITY MyEntity "some longish <E>text</E> possibly with markup">
  ]>
────────────────────────────────────────────────────────────────────────────

These additional definitions go into the <!DOCTYPE ...> declaration, enclosed in square brackets.

3 The Document Type Definition

3.2 Overall Document Structure

3.2-1 <Book>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
<!ELEMENT Book (TitlePage,
                TableOfContents?,
                Body,
                Appendix*,
                Bibliography?,
                TheIndex?)>
<!ATTLIST Book Name CDATA #REQUIRED>
────────────────────────────────────────────────────────────────────────────

After the keyword ELEMENT and the name Book there is a list in parentheses. This is a comma separated list of names of elements which can occur (in the given order) in the content of a Book element. Each name in such a list can be followed by one of the characters "?", "*" or "+", meaning that the corresponding element can occur zero or one time, an arbitrary number of times, or at least once, respectively. Without such an extra character the corresponding element must occur exactly once. Instead of one name in this list there can also be a list of element names separated by "|" characters; this denotes any element with one of the names (i.e., "|" means "or").

So, the Book element must contain first a TitlePage element, then an optional TableOfContents element, then a Body element, then zero or more elements of type Appendix, then an optional Bibliography element, and finally an optional element of type TheIndex.

Note that only these elements are allowed in the content of the Book element. No other elements or text are allowed in between. An exception to this is that there may be whitespace between the end tag of one and the start tag of the next element - this should be ignored when the document is processed to some output format. An element like this is called an element with "element content".

The second declaration starts with the keyword ATTLIST and the element name Book.
After that there is a triple of whitespace separated parameters (in general an arbitrary number of such triples, one for each allowed attribute name). The first (Name) is the name of an attribute for a Book element. The second (CDATA) is always the same for all of our declarations; it means that the value of the attribute consists of "character data". The third parameter #REQUIRED means that this attribute must be specified with any Book element. Later we will also see optional attributes which are declared as #IMPLIED.

3.2-2 <TitlePage>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
<!ELEMENT TitlePage (Title, Subtitle?, Version?, TitleComment?,
                     Author+, Date?, Address?, Abstract?, Copyright?,
                     Acknowledgements?, Colophon?)>
────────────────────────────────────────────────────────────────────────────

Within this element information for the title page is collected. Note that more than one author can be specified. The elements must appear in this order because there is no sensible way to specify in a DTD something like "the following elements may occur in any order but each exactly once".

Before going on with the other elements inside the Book element we explain the elements for the title page.

3.2-3 <Title>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
<!ELEMENT Title (%Text;)*>
────────────────────────────────────────────────────────────────────────────

Here is the last construct you need to understand for reading gapdoc.dtd. The expression "%Text;" is a so-called "parameter entity". It is something like a macro within the DTD.
It is defined as follows:

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This means that every occurrence of "%Text;" in the DTD is replaced by the expression

──────────────────────────── From gapdoc.dtd ─────────────────────────────
%InnerText; | List | Enum | Table
────────────────────────────────────────────────────────────────────────────

which is then expanded further because of the following definition:

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

These are the only two parameter entities we are using. They expand to lists of element names which are explained in the sequel and the keyword #PCDATA (concatenated with the "or" character "|").

So, the element (Title) is of so-called "mixed content": It can contain parsed character data which does not contain further markup (#PCDATA) or any of the other above mentioned elements. Mixed content must always have the asterisk qualifier (like in Title) such that any sequence of elements (of the above list) and character data can be contained in a Title element.

The %Text; parameter entity is used in all places in the DTD where "normal text" should be allowed, including lists, enumerations, and tables, but no sectioning elements.

The %InnerText; parameter entity is used in all places in the DTD where "inner text" should be allowed. This means that no structures like lists, enumerations, and tables are allowed. This is used for example in headings.

3.2-4 <Subtitle>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

Contains the subtitle of the document.
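
As a rough illustration of mixed content, the following sketch shows a Title and a Subtitle element that combine character data with inline markup. The package name and the wording are hypothetical, and it is assumed that Subtitle admits the same content as Title; the inline elements used are Package (see 3.7-8) and E (see 3.7-1).

```xml
<!-- Hypothetical title page fragment: character data mixed with inline markup -->
<Title>The <Package>MyPkg</Package> Package</Title>
<Subtitle>An <E>illustrative</E> manual</Subtitle>
```

Because of the asterisk qualifier, text and inline elements may alternate in any order and any number of times within such an element.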
3.2-5 <Version>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

Note that the version can only contain character data and no further markup elements (except for Alt, which is necessary to resolve the entities described in 2.2-3). The converters will not put the word "Version" in front of the text in this element.

3.2-6 <TitleComment>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

Sometimes a title and subtitle are not sufficient to give a rough idea about the content of a package. In this case use this optional element to specify an additional text for the front page of the book. This text should be short; use the Abstract element (see 3.2-10) for longer explanations.

3.2-7 <Author>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

As noted in the comment there may be more than one element of this type. This element should contain the name of an author and possibly an Email and/or Homepage element for this author, see 3.5-6 and 3.5-7. You can also specify an individual postal address here, instead of using the Address element described below, see 3.2-9.

3.2-8 <Date>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

Only character data is allowed in this element, which gives a date for the document. No automatic formatting is done.

3.2-9 <Address>
──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This optional element can be used to specify a postal address of the author or the authors. If there are several authors with different addresses then put the Address elements inside the Author elements.

Use the Br element (see 3.9-3) to mark the line breaks in the usual formatting of the address on a letter.

Note that often it is not necessary to use this element because a postal address is easy to find via a link to a personal web page.

3.2-10 <Abstract>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element contains an abstract of the whole book.

3.2-11 <Copyright>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used for the copyright notice. Note the &copyright; entity as described in section 2.2-3.

3.2-12 <Acknowledgements>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element contains the acknowledgements.

3.2-13 <Colophon>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

The "colophon" page is used to say something about the history of a document.

3.2-14 <TableOfContents>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element may occur in the Book element after the TitlePage element. If it is present, a table of contents is generated and inserted into the document.
Note that because this element is declared to be EMPTY one can use the abbreviation

──────────────────────────────── Example ─────────────────────────────────
<TableOfContents/>
────────────────────────────────────────────────────────────────────────────

to denote this empty element.

3.2-15 <Bibliography>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element may occur in the Book element after the last Appendix element. If it is present, a bibliography section is generated and inserted into the document. The attribute Databases must be specified; it can contain the names of several data files, separated by commas.

Two kinds of files can be specified in Databases: The first are BibTeX files as defined in [Lam85, Appendix B]. Such files must have a name with extension .bib, and in Databases the name must be given without this extension. Note that such .bib files should be in latin1 encoding (or ASCII encoding). The second are files in BibXMLext format as defined in Section 7.2. These files must have an extension .xml and in Databases the full name must be specified.

We suggest using the BibXMLext format because it makes it possible to produce potentially nicer bibliography entries in text and HTML documents.

A bibliography style may be specified with the Style attribute. The optional Style attribute (for LaTeX output of the document) must also be specified without the .bst extension (the default is alpha). See also section 3.5-3 for a description of the Cite element which is used to include bibliography references into the text.

3.2-16 <TheIndex>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element may occur in the Book element after the Bibliography element. If it is present, an index is generated and inserted into the document.
There are elements in GAPDoc which implicitly generate index entries (e.g., Func (3.4-2)) and there is an element Index (3.5-4) for explicitly adding index entries.

3.3 Sectioning Elements

A GAPDoc book is divided into chapters, sections, and subsections. The idea is of course that a chapter consists of sections, which in turn consist of subsections. However, for the sake of flexibility, the rules are not too restrictive. Firstly, text is allowed everywhere in the body of the document (and not only within sections). Secondly, the chapter level may be omitted. The exact rules are described below.

Appendices are a flavor of chapters, occurring after all regular chapters. There is a special type of subsection called "ManSection". This is a subsection devoted to the description of a function, operation or variable. It is analogous to a manpage in the UNIX environment. Usually each function, operation, method, and so on should have its own ManSection.

Cross referencing is done on the level of Subsections, respectively ManSections. The topics in GAP's online help also point to subsections. So, they should not be too long.

We start our description of the sectioning elements "top-down":

3.3-1 <Body>
The Body element marks the main part of the document. It must occur after the TableOfContents element. There is a big difference between inside and outside of this element: Whereas regular text is allowed nearly everywhere in the Body element and its subelements, this is not true for the outside. This also has implications for the handling of whitespace. Outside, superfluous whitespace is usually ignored when it occurs between elements. Inside of the Body element whitespace matters because character data is allowed nearly everywhere. Here is the definition in the DTD:

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

The fact that Chapter and Section elements are allowed here makes it possible to omit the chapter level entirely in the document. For a description of %Text; see 3.2-3.

(Remark: The purpose of this element is to make sure that a valid GAPDoc document has a correct overall structure, which is only possible when the top element Book has element content.)

3.3-2 <Chapter>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

A Chapter element can have a Label attribute, such that this chapter can be referenced later on with a Ref element (see section 3.5-1). Note that you have to specify a label to reference the chapter as there is no automatic labelling!

Chapter elements can contain text (for a description of %Text; see 3.2-3), Section elements, and Heading elements.

The following additional rule cannot be stated in the DTD because we want a Chapter element to have mixed content. There must be exactly one Heading element in the Chapter element, containing the heading of the chapter.
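
The rules for a Chapter element can be sketched as follows. The label values, headings, and text are hypothetical; the fragment only illustrates a Label attribute, exactly one direct Heading element, some text directly inside the chapter, and a Section with its own Heading.

```xml
<!-- Hypothetical chapter: one direct Heading, text, and a section -->
<Chapter Label="FirstSteps">
  <Heading>First Steps</Heading>
  Introductory text may appear directly inside the chapter.
  <Section Label="Installation">
    <Heading>Installation</Heading>
    Text of the section.
  </Section>
</Chapter>
```

The Heading inside the Section does not count as a second heading of the chapter, because it is contained in another sectioning element.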
Here is the definition of the Heading element:

3.3-3 <Heading>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used for headings in Chapter, Section, Subsection, and Appendix elements. It may only contain %InnerText; (for a description see 3.2-3).

Each of the mentioned sectioning elements must contain exactly one direct Heading element (i.e., one which is not contained in another sectioning element).

3.3-4 <Appendix>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

The Appendix element behaves exactly like a Chapter element (see 3.3-2) except for the position within the document and the numbering. While chapters are counted with numbers (1., 2., 3., ...) the appendices are counted with capital letters (A., B., ...).

Again there is an optional Label attribute used for references.

3.3-5 <Section>
──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

A Section element can have a Label attribute, such that this section can be referenced later on with a Ref element (see section 3.5-1). Note that you have to specify a label to reference the section as there is no automatic labelling!

Section elements can contain text (for a description of %Text; see 3.2-3), Heading elements, and subsections.

There must be exactly one direct Heading element in a Section element, containing the heading of the section.

Note that a subsection is either a Subsection element or a ManSection element.

3.3-6 <Subsection>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

The Subsection element can have a Label attribute, such that this subsection can be referenced later on with a Ref element (see section 3.5-1). Note that you have to specify a label to reference the subsection as there is no automatic labelling!

Subsection elements can contain text (for a description of %Text; see 3.2-3), and Heading elements.

There must be exactly one Heading element in a Subsection element, containing the heading of the subsection.

Another type of subsection is a ManSection, explained now:

3.4 ManSection–a special kind of subsection

ManSections are intended to describe a function, operation, method, variable, or some other technical instance. A ManSection is analogous to a manpage in the UNIX environment.

3.4-1 <ManSection>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

The ManSection element can have a Label attribute, such that this subsection can be referenced later on with a Ref element (see section 3.5-1).
But this is probably rarely necessary because the elements Func and so on (explained below) automatically generate labels for cross referencing.

The content of a ManSection element is one or more elements describing certain items in GAP, each of them optionally followed by a Returns element, followed by a Description element, which contains %Text; (see 3.2-3) describing them. (Remember to include examples in the description as often as possible, see 3.7-10). The classes of items GAPDoc knows of are: functions (Func), operations (Oper), constructors (Constr), methods (Meth), filters (Filt), properties (Prop), attributes (Attr), variables (Var), families (Fam), and info classes (InfoClass). A single ManSection should describe several such items only when they are very closely related.

Each element for an item corresponding to a GAP function can be followed by a Returns element. In output versions of the document the string "Returns: " will be put in front of the content text. The text in the Returns element should usually be a short hint about the type of object returned by the function. This is intended to give a good mnemonic for the use of a function (together with a good choice of names for the formal arguments).

ManSections are also sectioning elements which count as subsections. Usually there should be no Heading element in a ManSection; in that case a heading is generated automatically from the first Func-like element. Sometimes this default behaviour does not look appropriate, for example when there are several Func-like elements. For such cases an optional Heading is allowed.

3.4-2 <Func>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to specify the usage of a function. The Name attribute is required and its value is the name of the function.
The value of the Arg attribute (also required) contains the full list of arguments including optional parts, which are denoted by square brackets. The argument names can be separated by whitespace, commas or the square brackets for the optional arguments, like "grp[, elm]" or "xx[y[z] ]". If GAP options are used, this can be followed by a colon : and one or more assignments, like "n[, r]: tries := 100".

The name of the function is also used as label for cross referencing. When the name of the function appears in the text of the document it should always be written with the Ref element, see 3.5-1. This allows a uniform typesetting style for function names and automatic cross referencing.

If the optional Label attribute is given, it is appended (with a colon : in between) to the name of the function for cross referencing purposes. The text of the label can also appear in the document text. So, it should be a kind of short explanation.

──────────────────────────────── Example ─────────────────────────────────
────────────────────────────────────────────────────────────────────────────

The optional Comm attribute should be a short description of the function, usually at most one line long (this is currently not used anywhere).

This element automatically produces an index entry with the name of the function and, if present, the text of the Label attribute as subentry (see also 3.2-16 and 3.5-4).

3.4-3 <Oper>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to specify the usage of an operation. The attributes are used exactly in the same way as in the Func element (see 3.4-2).

Note that multiple descriptions of the same operation may occur in a document because there may be several declarations in GAP. Furthermore there may be several ManSections for methods of this operation (see 3.4-5) which also use the same name.
For reference purposes these must be distinguished by different Label attributes.

3.4-4 <Constr>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to specify the usage of a constructor. The attributes are used exactly in the same way as in the Func element (see 3.4-2).

Note that multiple descriptions of the same constructor may occur in a document because there may be several declarations in GAP. Furthermore there may be several ManSections for methods of this constructor (see 3.4-5) which also use the same name. For reference purposes these must be distinguished by different Label attributes.

3.4-5 <Meth>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to specify the usage of a method. The attributes are used exactly in the same way as in the Func element (see 3.4-2).

Frequently, an operation is implemented by several different methods. Therefore it can be useful to document them independently. This is possible by using the same method name in different ManSections. It is however required that these subsections and those describing the corresponding operation are distinguished by different Label attributes.

3.4-6 <Filt>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to specify the usage of a filter. The first four attributes are used in the same way as in the Func element (see 3.4-2), except that the Arg attribute is optional.

The Type attribute can be any string, but it is intended to be something like "Category" or "Representation".
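
A minimal sketch of a ManSection documenting a filter may look as follows. The filter name IsMyObject and the description text are hypothetical; only the attribute names Name, Arg, and Type and the overall ManSection layout (item element, optional Returns, Description) are taken from the descriptions above.

```xml
<!-- Hypothetical ManSection for a filter; IsMyObject is an invented name -->
<ManSection>
  <Filt Name="IsMyObject" Arg="obj" Type="Category"/>
  <Returns><K>true</K> or <K>false</K></Returns>
  <Description>
    Tests whether <A>obj</A> lies in the category <C>IsMyObject</C>.
  </Description>
</ManSection>
```

No Heading element is given here, so a heading would be generated from the Filt element, as described in 3.4-1.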
3.4-7 <Prop>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to specify the usage of a property. The attributes are used exactly in the same way as in the Func element (see 3.4-2).

3.4-8 <Attr>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to specify the usage of an attribute (in GAP). The attributes are used exactly in the same way as in the Func element (see 3.4-2).

3.4-9 <Var>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to document a global variable. The attributes are used exactly in the same way as in the Func element (see 3.4-2) except that there is no Arg attribute.

3.4-10 <Fam>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to document a family. The attributes are used exactly in the same way as in the Func element (see 3.4-2) except that there is no Arg attribute.

3.4-11 <InfoClass>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

This element is used within a ManSection element to document an info class. The attributes are used exactly in the same way as in the Func element (see 3.4-2) except that there is no Arg attribute.

3.5 Cross Referencing and Citations

Cross referencing in the GAPDoc system is somewhat different from the usual LaTeX cross referencing in so far as a reference knows "which type of object" it is referencing.
For example a "reference to a function" is distinguished from a "reference to a chapter". The idea is that the markup must contain this information so that the converters can produce better output. The HTML converter can for example typeset a function reference just as the name of the function with a link to the description of the function, or a chapter reference as a number with a link in the other case.

Referencing is done with the Ref element:

3.5-1 <Ref>

──────────────────────────── From gapdoc.dtd ─────────────────────────────
────────────────────────────────────────────────────────────────────────────

The Ref element is defined to be EMPTY. If one of the attributes Func, Oper, Constr, Meth, Prop, Attr, Var, Fam, InfoClass, Chap, Sect, Subsect, Appendix is given then there must be exactly one of these, making the reference one to the corresponding object. The Label attribute can be specified in addition to make the reference unique, for example if more than one method with a given name is present. (Note that there is no way to specify in the DTD that exactly one of the first listed attributes must be given; this is an additional rule.)

A reference to a Label element defined below (see 3.5-2) is done by giving the Label attribute and optionally the Text attribute. If the Text attribute is present its value is typeset in place of the Ref element, if linking is possible (for example in HTML). If this is not possible, the section number is typeset. This type of reference is also used for references to tables (see 3.6-5).

An external reference into another book can be specified by using the BookName attribute. In this case the Label attribute or, if this is not given, the function or section like attribute, is used to resolve the reference. The generated reference points to the first hit when asking "?book name: label" inside GAP.

The optional attribute Style can take only the values Text and Number.
It can be used with references to sectioning units and it gives a hint to the converter programs whether an explicit section number or text is generated. Normally all references to sections generate numbers and references to a GAP object generate the name of the corresponding object with some additional link or sectioning information, which is the behavior of Style="Text". With Style="Number" an explicit section number is generated in all cases. So

──────────────────────────────── Example ─────────────────────────────────
  described in section   
────────────────────────────────────────────────────────────────────────────

produces: '<Func>' described in section 3.4-2.

3.5-2 <Label>