The Basics of Deploying Structured Authoring Volume 4: Assembling Publications

The defining feature of structured documentation is that content is divided into independently controlled modules. Because of this division, the content must be assembled into a whole when it is published. This article discusses the details that must be accounted for when a publication is assembled.

There are at least four noteworthy details related to assembling publications from material prepared using structured authoring. Firstly, the contents of the publication must be set in an appropriate sequence and hierarchy. Secondly, there must be a stylesheet suited to the format of the publication. Stylesheets are discussed in more detail in the fifth installment of this series. Thirdly, the publication template itself must be assembled by determining the appropriate settings for both the selection and the layout of content. The fourth detail relates to accounting for differences between the publishing languages. Each of these four details has a section dedicated to it here.

This is the fourth installment on the differences between structured authoring and direct text editing. Links to the previous installments are provided below.

The Basics of Deploying Structured Authoring Volume 1: Differences in the Base Material

The Basics of Deploying Structured Authoring Volume 2: Supplemented Content

The Basics of Deploying Structured Authoring Volume 3: Revisions and Workflow

Organizing Content

*A demonstration of how topic trees are assembled in DoX CMS.*

Content modules may be stored according to any organizational logic. The more they are reused, the more sense it makes to group them by subject, for example. Consequently, the content for any given publication must first be picked from this store and then organized for that publication.

The organization of content covers both its sequence and its hierarchy. The sequence of content is the linear arrangement of modules. The hierarchy of content, on the other hand, involves nesting content into levels of organization. Sub-sections sit beneath higher-level sections, and this nesting expresses how each sub-section contributes to a larger whole on a given subject.

For example, when the publication adheres to the ISO 2145 standard, the number identifier of the first main section is ‘1’ and the identifier of its first sub-section is ‘1.1’. Sub-sections are designated by appending the ordinal value of the sub-section to the number identifier of the section directly above it in the hierarchy. Thus, the identifier of the first sub-section of section ‘1.1’ is ‘1.1.1’, while the identifier of the second one is ‘1.1.2’.
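The numbering scheme described above can be sketched with CSS counters. This is a minimal illustration rather than the way DoX CMS necessarily generates its numbers: it assumes that section titles are rendered as sibling h1, h2, and h3 elements, and the counter names are made up for the example.

```css
/* Assumption: headings are rendered as sibling h1/h2/h3 elements. */
body {
    counter-reset: section;             /* main sections start from zero */
}
h1 {
    counter-increment: section;         /* 1, 2, 3, ... */
    counter-reset: subsection;          /* restart sub-sections under each main section */
}
h2 {
    counter-increment: subsection;      /* 1.1, 1.2, ... */
    counter-reset: subsubsection;       /* restart the third level under each sub-section */
}
h3 {
    counter-increment: subsubsection;   /* 1.1.1, 1.1.2, ... */
}

/* Each identifier appends its own ordinal to the identifier of its parent. */
h1::before {
    content: counter(section) " ";
}
h2::before {
    content: counter(section) "." counter(subsection) " ";
}
h3::before {
    content: counter(section) "." counter(subsection) "." counter(subsubsection) " ";
}
```

Because the numbers are generated from the position of each heading, moving a module within the hierarchy renumbers it without any change to the module itself.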

In the case of DoX CMS, organizing content in this way happens with so-called topic trees. All content is part of a master topic tree and parts of it can be used to assemble new topic trees by dragging content to the proper positions in the correct menu.

Essentially, organizing content is part of the writing process only within each module. Beyond that, it is its own step in the content creation process. This enables the same content to be used in different publications, or even in multiple positions within the same publication, should that prove necessary. The layout of content that has been organized in this way depends on the specifications provided by the stylesheet in use.

One possible exception is content used in a non-linear publishing format such as a virtual environment. Unless such content has a predetermined logic by which it is arranged based on its sequence, it must be arranged separately once it has been published. This is how DoX VAD will operate, for example.

Stylesheets

Since publications written using structured authoring consist of separate content modules which are not tied to any publishing format, determining the details for a given publication’s style depends on a suitable stylesheet.

The stylesheet helps determine forced page breaks and how much content each page around a page break must contain, for example. It also specifies how the header and footer are constructed and how both cover pages are laid out (unless the covers have been made elsewhere in advance). Should there be language-specific differences in such details, the stylesheet may account for them. In the case of DoX CMS, the stylesheet is also where you specify the effects of element classes, which are used to create exceptions to the general layout rules for the affected kinds of elements.

Layout can be controlled with either CSS or XSL. XSL also allows the contents within and between modules to be rearranged, so that new content can be assembled from existing content where necessary. It once covered the styling of documents in general as well (XSL-FO). Because XSL-FO is no longer actively developed, this section focuses on CSS for styling.

How CSS Works

Each command written in CSS consists of a selector and a set of rules, which the CSS specification calls declarations. The selector specifies the elements to which the rules are applied. The rules state what is done to those elements.

A simple selector specifies a single type of element such as images (img). There are various ways in which it can be specified in more detail. A selector may, for example, apply the related rules to the images within the second cell of each row within a table that has a specified element class. Such a solution would help with situations like specifying the layout of a table that explains different types of warnings.
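As a sketch, the selector described above could be written as follows. The element class value "WarningTypes" and the declarations are hypothetical, and the example assumes that the element class is exposed on the table element through the doxelementclass attribute.

```css
/* Hypothetical element class "WarningTypes" on the table element. */
table[doxelementclass="WarningTypes"] tr > td:nth-child(2) img {
    /* Applies only to images in the second cell of each row of such tables. */
    max-width: 2em;
    vertical-align: middle;
}
```

Reading the selector from right to left shows how it narrows the match: an image, inside a cell that is the second child of its row, inside a table with the given element class. Images elsewhere in the publication are unaffected.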

Rules are used to determine the changes made to the layout and style of the elements designated by a selector. Such rules can concern the typeface, backgrounds, page breaks, added elements, and so forth.
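A few such rules are sketched below, one group for each kind of change. The font name and the class name "caution" are placeholders, and how widely properties such as orphans and widows are honored depends on the rendering engine behind the publishing format.

```css
/* Typeface: applies to all body text unless a more specific rule overrides it. */
body {
    font-family: "Noto Sans", sans-serif;
    font-size: 11pt;
}

/* Page breaks: start each main section on a new page. */
h1 {
    break-before: page;
}

/* Keep at least two lines of a paragraph on each side of a page break. */
p {
    orphans: 2;
    widows: 2;
}

/* Added elements: generated text placed before every matching element.
   The class name "caution" is hypothetical. */
.caution::before {
    content: "Caution: ";
    font-weight: bold;
}
```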

Three factors determine which set of rules applies when more than one selector would apply to a given element. Firstly, the rules related to more specific selectors override those related to more generic selectors. The order for this, from more generic to more specific, is as follows: 1. types of elements, 2. classes of elements, and 3. elements with unique identifiers. Secondly, rules at the same level of specificity but expressed later override those above them. The system thus always uses the last applicable specification. The third factor is an exception which overrides the other rules unless a later rule repeats it. To create one, you add the qualifier ‘!important’ after a rule.

A short CSS snippet is provided below to demonstrate how all this works in practice. It causes simpletable elements in DoX CMS to have a black background and white text, whereas full tables (.dita-table) have a white background with black text. Because the latter rule is based on a class rather than the element type, it applies regardless of order. Finally, the snippet includes a rule based on an element class that overrides the text color of either alternative and turns the text red instead.

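/* Class selector: more specific than the element type, so it wins for full tables. */
.dita-table {
    background-color: white;
    color: black;
}

/* Type selector: applies to every table that no more specific rule covers. */
table {
    background-color: black;
    color: white;
}

/* Element class with !important: overrides the text color set by either rule above. */
[doxelementclass="RedText"] {
    color: red !important;
}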
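The snippet above shows specificity and ‘!important’ at work. The remaining cases, order between equally specific rules, identifiers outranking classes, and a repeated ‘!important’, can be sketched in the same way. The identifier "legal-notice" and the class "notice" are invented for the illustration.

```css
/* Same specificity (one element type each): the later rule wins,
   so paragraphs are dark gray rather than black. */
p { color: black; }
p { color: dimgray; }

/* An identifier outranks a class regardless of order: an element with
   id="legal-notice" and class="notice" gets navy text. */
#legal-notice { color: navy; }
.notice { color: green; }

/* A repeated !important with the same selector: the later one takes over,
   so the text ends up dark red. */
[doxelementclass="RedText"] { color: red !important; }
[doxelementclass="RedText"] { color: darkred !important; }
```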

Even though it is technically possible to include rules for different publishing formats in the same stylesheet, it is better to use a separate stylesheet for each format. Rules that only take effect under the conditions of a specific format cannot affect other publishing formats. Even so, the layout of the cover page may need to be specified separately for a paged publication and an online publication.
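For reference, the single-stylesheet approach mentioned above might look like the following sketch, which uses standard CSS media queries as the format-specific conditions. Whether a given publishing pipeline evaluates media queries this way is an assumption, and the declarations are only examples.

```css
/* Shared by every publishing format. */
body {
    font-family: sans-serif;
}

/* Takes effect only in paged (print) output. */
@media print {
    h1 {
        break-before: page;
    }
    nav {
        display: none;   /* on-screen navigation has no use on paper */
    }
}

/* Takes effect only in on-screen output. */
@media screen {
    body {
        max-width: 45em;
        margin: 0 auto;
    }
}
```

With separate stylesheets, the contents of each conditional block would simply move into the stylesheet of the corresponding format.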

The Details of a Publication Template

*A demonstration of how publication templates are assembled in DoX CMS. Tags are selected in a separate tab.*

Once the other details that a publication requires have been determined, they need only be combined into a publication template that can be used to compile the publication. In the case of DoX CMS, a publication template requires the following details: 1. the topic tree, 2. the style, 3. cover pages, 4. values for publishing variables, 5. tags, and 6. revisions.

Topic trees and styles were discussed above. In these respects, it is only a matter of selecting one of the prepared alternatives. Cover pages are simply content modules that have been designated as possible cover pages. As such, this section will focus on the three other factors: publishing variables, tags, and revisions.

Publishing Variables

Publishing variables are a form of referenced content used within DoX CMS. They are also briefly alluded to in the first installment of this series.

Should the system being used not contain a similar feature, the same effect can be achieved by conditioning strings of the base material. In this case, the method to use corresponds to selecting tags in DoX CMS.

Revisions

Finally, you must select which revisions are used for any given module in each publication. The third installment in this series discusses content revisions in more detail. It can prove necessary to use older revisions when you are dealing with an older version of the publication or when the newest revisions remain work in progress.

DoX CMS lets users either select the revision used for each module separately or apply a rule that determines which revision each module uses. It is possible, for example, to lock revisions to the latest approved alternatives at the time. By default, the system sets new revisions of a publication to always use the newest revisions of its content modules.

It is important to always check which revisions a publication uses by default. Should someone have locked specific revisions in a prior revision of the publication, for example, those choices are inherited by later revisions. As a result, the locked modules must either be updated to the correct revisions or be set to update automatically.

Differences Between Languages

Even though the documents written using structured authoring are compiled automatically, it is best to review the layouts of the different versions. Prime among them are translations to different languages.

Proofreading translations may reveal otherwise unseen deficiencies in stylesheets. For example, it is easy to forget to translate the parts that are only included through the stylesheet, such as the titles of different types of note elements. Additionally, differences in details such as word length may break tables with automatically calculated column widths. CSS rules can account for such situations, or you can add hyphenation by hand to resolve such issues. In some instances, it can even prove necessary to add forced page breaks to help the layout. In DoX CMS, this happens with the help of element classes as this piece explains.
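The situations described above could be handled along the following lines. This is a sketch under several assumptions: the class name "note" and the element class value "PageBreakBefore" are hypothetical, the compiled output is assumed to declare its language (for example with a lang attribute), and automatic hyphenation only works where the rendering engine has a hyphenation dictionary for that language.

```css
/* A note title that exists only in the stylesheet must be translated here too. */
.note::before {
    content: "Note: ";
}
.note:lang(fi)::before {
    content: "Huomautus: ";   /* Finnish title for the same note type */
}

/* Long words in table cells: hyphenate automatically where possible
   and allow a break inside a word as a last resort. */
td, th {
    hyphens: auto;
    overflow-wrap: break-word;
}

/* A forced page break attached to a hypothetical element class. */
[doxelementclass="PageBreakBefore"] {
    break-before: page;
}
```

The language-specific rule is more specific than the general one, so it applies to Finnish content regardless of the order in which the two rules appear.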

You must also make sure that all parts that require translating have actually been translated. In the case of DoX CMS, variables are important to account for in this respect. Both complex variables and publishing variables can be included in translation projects. They include the title of the table of contents, for example. The name of the publication itself must also be translated if the variable which retrieves this information is used anywhere, such as on the cover of the publication.

Summary

Structured authoring makes determining the details of publications its own process. This process requires constructing parts that are distinct from content modules. The most important among these are topic trees and stylesheets. Once the necessary material has been prepared, the publication template itself must be completed using them. Even then, the versions in different languages may require adjustments of their own.

When content is organized into a topic tree, the modules in it are set to a publication-specific sequence and hierarchy. The same topic tree can be used in multiple publications if its content is properly conditioned. Outside topic trees, modules can be organized based on any logic at all.

Stylesheets control the layout and style of content. The primary language used to write them is CSS, which consists of a set of selectors and the associated rules. Stylesheets can account for required page breaks, for example, and thus make the results of compiling content easier to anticipate. As such, content can also be written to account for the details of the stylesheet if its (intended) content is already known.

Assembling publication templates mostly involves a set of choices between prepared alternatives. It is particularly important to select the identifiers that determine what conditioned content is included. DoX CMS uses its tags for this purpose. You must also make sure that the correct revisions for content modules are being used.

Even though translated content is compiled based on the same specifications as the original publication, it must be proofread separately. This ensures that the stylesheet being used accounts for any differences adequately and that all required parts have in fact been translated.

Once the necessary preparations have been completed to a satisfactory degree, however, it is simple to compile new publications and to reuse content. The more content is reused, the less effort is required to control it over the long term. In particular, assembling publications as described here is easier than rewriting content or editing it separately for each publication.