Structured authoring

Understand what structured authoring is and how the content rules are defined in FrameMaker for structured authoring.

In an unstructured authoring workflow, you create relatively free-flowing, narrative-based documents. For example, you can have headings followed by paragraphs, or graphics with captions or alternate text. In structured authoring, content rules enforce a consistent structure across similar pieces of information. For example, you can decide to enforce the following content rules:

A topic must always start with a title.
A heading must be followed by a paragraph.
A table must have a heading row.
A graphic must have a caption.

These content rules are defined in either a document type definition (DTD) or an XML schema. Documents are automatically checked for conformance against the DTD or schema.
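
As a rough illustration, the four rules listed above could be written as DTD content models. This is a minimal sketch, not the DTD of any real structured application: every element name here (topic, section, headrow, and so on) is an assumption chosen for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic [
  <!-- A topic must always start with a title. -->
  <!ELEMENT topic (title, (section | table | figure)*)>
  <!-- A heading must be followed by a paragraph. -->
  <!ELEMENT section (heading, para+)>
  <!-- A table must have a heading row. -->
  <!ELEMENT table (headrow, row*)>
  <!-- A graphic must have a caption. -->
  <!ELEMENT figure (graphic, caption)>

  <!-- Supporting declarations (hypothetical names) -->
  <!ELEMENT headrow (cell+)>
  <!ELEMENT row (cell+)>
  <!ELEMENT graphic EMPTY>
  <!ATTLIST graphic href CDATA #REQUIRED>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT para (#PCDATA)>
  <!ELEMENT cell (#PCDATA)>
  <!ELEMENT caption (#PCDATA)>
]>
<topic>
  <title>Installing the product</title>
  <section>
    <heading>Before you begin</heading>
    <para>Check the system requirements.</para>
  </section>
  <figure>
    <graphic href="installer.png"/>
    <caption>The installer welcome screen</caption>
  </figure>
</topic>
```

A validating parser would reject this document if the title were removed or if the figure had no caption, because the content models list those elements as required.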

For example, consider the structure of a home address. Suppose that the content rules require an address to contain an employee name, house number, street, city, and ZIP code. In unstructured authoring, an address without a house number can be discovered only through editing or review. In structured authoring, the structure is validated and automatically checked for completeness. Consistent organization and sequence are therefore enforced.
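
The address case might look like the following sketch. The element names are assumptions made for illustration, and the instance deliberately omits the house number to show the kind of error a validating parser would report.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE address [
  <!-- All five children are required, in this order (hypothetical names). -->
  <!ELEMENT address (name, house_number, street, city, zip)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT house_number (#PCDATA)>
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
  <!ELEMENT zip (#PCDATA)>
]>
<!-- Well-formed but not valid: house_number is missing after name. -->
<address>
  <name>Jane Doe</name>
  <street>Example Street</street>
  <city>Springfield</city>
  <zip>00000</zip>
</address>
```

Because the content model fixes both the set of children and their order, the missing element is caught by validation rather than by a reviewer.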

Benefits

Enforces a consistent organization of information

For example, you can create a structured application that requires a bulleted list to contain at least two items, or an image to include a caption.
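
In DTD terms, an "at least two items" rule can be sketched as one required item followed by one or more further items. The element names list and item are assumptions for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE list [
  <!-- One item, then one or more items: a minimum of two in total. -->
  <!ELEMENT list (item, item+)>
  <!ELEMENT item (#PCDATA)>
]>
<list>
  <item>Unpack the installer.</item>
  <item>Run the setup program.</item>
</list>
```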

Automatically validates the organization of information

FrameMaker provides visual cues to indicate when the structure of a document is broken.

Figure: Visual cues that indicate the broken structure of a document. Here, the title element is missing from a DITA topic.

Consistency of content

Imposing structure results in improved consistency of content across multiple documents in a document-set.

Supports content reuse

FrameMaker provides content reuse functionality through its user interface, such as DITAVAL, Filter By Attribute, and relationship tables, so that users can easily reuse content.

Supports metadata to add information to documents

Besides content such as text and images, you can also associate metadata, such as the author of a document, with a structured document. You can also use attributes to associate metadata with specific elements in a document. The Filter by Attribute feature in FrameMaker allows you to set attribute values and then filter the content in a structured document based on these attributes.
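
As a sketch of both kinds of metadata, the DITA topic below records an author in its prolog and tags two paragraphs with the standard DITA audience attribute. The id, author name, and attribute values are placeholders invented for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="install_overview">
  <title>Installation overview</title>
  <prolog>
    <!-- Document-level metadata -->
    <author>Jane Doe</author>
  </prolog>
  <body>
    <!-- Element-level metadata carried by attributes -->
    <p audience="administrator">Configure the license server first.</p>
    <p audience="end_user">Start the application from the desktop.</p>
  </body>
</topic>
```

Attribute values like these are what attribute-based filtering works on. In DITA, a DITAVAL file can express the same kind of condition, for example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<val>
  <prop att="audience" val="administrator" action="include"/>
  <prop att="audience" val="end_user" action="exclude"/>
</val>
```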

Separating content and formatting

Writers focus on content. Formatting and the appearance of the final output are controlled by the publishing workflow. For example, print output may use a different font from online output.

However, FrameMaker supports formatting in structured applications, so the FrameMaker structured authoring environment displays formatted content. This gives users visual cues about the formatting of a document.

Figure: The same content shown in XML View and in WYSIWYG View

Reduces localization effort

Since structured documents separate content from formatting, the use of localization technologies can substantially reduce localization effort and cost.

SGML, XML, and XHTML

Using FrameMaker, you can import and export structured documents in either SGML or XML (including XHTML 1.0) format. Once you import a structured file, it is no longer an SGML or XML file; it is a structured FrameMaker document. To return it to its original format, save it as an SGML or XML file.

SGML

Standard Generalized Markup Language (SGML) is an international standard (ISO 8879) for defining markup languages used for data exchange and storage.

SGML is a descriptive, rather than procedural, markup language, meaning the same document can be processed by different systems. Each system applies different processing instructions to relevant sections. You can transfer SGML documents from one system (hardware and software environment) to another without any loss of data.

SGML was the first language to implement the Document Type Definition (DTD), which formally defines the document by its components and structure. Documents of the same type can then be verified and processed uniformly.

A document that conforms to the structure of a DTD is said to be valid.

XML

Extensible Markup Language (XML) is a generalized format for representing structured information, especially for the web. Like HTML and SGML, XML requires the use of elements and structure.

However, XML differs from HTML in that it is extensible. You can define not only your own tags but also their order, the relationships among them, and the way they are processed and displayed. XML markup resembles HTML markup, except that you define the tags or elements yourself.

Use XML to define and implement a structure that is appropriate for your content. An XML document that conforms to the structure of a DTD or schema is said to be valid. An XML document that follows the syntax rules of the XML specification (for example, a single root element and properly nested, closed tags) is said to be well-formed.
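
The difference between the two terms can be seen in a small sketch. The note vocabulary below is invented for the example; the document shown is both well-formed and valid against its internal DTD.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Reviewers</to>
  <body>The draft is attached.</body>
</note>
```

Removing the to element would leave the document well-formed but no longer valid, because the DTD requires it. Removing the closing body tag would make the document not well-formed, so a parser would reject it before validation is even attempted.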

XHTML 1.0

Extensible Hypertext Markup Language (XHTML) is a reformulation of HTML that is based on XML and is designed to work with XML-based applications. It can be viewed, edited, and validated with standard XML tools. Using XHTML is an easy way to migrate from HTML to XML while retaining forward and backward compatibility of your content.
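
For reference, a minimal XHTML 1.0 Strict document looks roughly like this. The title and paragraph text are placeholders; the DOCTYPE and namespace are the ones defined by the W3C for XHTML 1.0.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>Sample page</title>
  </head>
  <body>
    <!-- Unlike HTML, every element is closed, including empty ones. -->
    <p>First line<br />second line</p>
  </body>
</html>
```

The markup reads like HTML, but it obeys XML syntax: lowercase element names, quoted attribute values, and explicit closing of every element.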

XML vs HTML

Whereas HTML describes formatting, XML describes content itself. Humans can read HTML documents rendered in a browser. Both machines and humans can read XML.

Instead of style-based, paragraph-oriented word processing and desktop publishing, XML provides a foundation for structured authoring. XML describes content according to elements that are organized in a hierarchical tree.

In word-processing environments (such as unstructured FrameMaker), the relationship among the various document components is apparent through formatting on the page. The document file, however, does not capture these relationships because a word processor document is made up of a string of paragraphs. For example, unstructured FrameMaker does not capture the subordination of a Body paragraph tag to its preceding Heading1 tag. Structured authoring, however, does capture the hierarchical relationships among the document components.
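
A short sketch of what capturing the hierarchy means: in the structured form below, each paragraph is a child of the section that its heading introduces, and a subsection is nested inside its parent section. The element names Section, Head, and Para are assumptions for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Section>
  <Head>Installing the product</Head>
  <!-- This paragraph belongs to the section above by containment. -->
  <Para>Run the setup program.</Para>
  <Section>
    <Head>Silent installation</Head>
    <Para>Pass the options on the command line.</Para>
  </Section>
</Section>
```

In an unstructured document the same content would be a flat run of paragraphs tagged Heading1, Body, Heading2, Body, with the subordination visible only through formatting.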

DITA and DocBook

Two off-the-shelf structured applications available for technical documentation are DITA and DocBook.

DITA

DITA, or Darwin Information Typing Architecture, provides an off-the-shelf DTD and set of rules designed specifically for writing online documentation, such as software help files. It defines a tag structure suited to authoring, producing, and delivering technical documentation. DITA elements include <topic>, <title>, <shortdesc>, <prolog>, <body>, and <concept>. The following are some distinguishing DITA features:

DITA is topic-oriented. Each topic can be a piece of modular writing that can be reused in multiple contexts.
Because DITA separates content from context, multiple architectures of information are possible in DITA. DITA can also be extended to allow for the definition of information types.
DITA supports specialization. It provides three basic topic types (concept, task, and reference) and allows these topic types to be specialized for individual needs.
DITA uses a ditamap, which contains links to the XML files in the documentation set. Each XML file can be a topic or a collection of topics.
DITA supports multiple output formats, ranging from PDF and HTML to variable documents. However, all output forms require some development work.
DITA is better suited for larger documentation sets.
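
To make the topic and map features above concrete, here is a minimal sketch of a DITA concept topic. The id, title, and text are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept id="product_overview">
  <title>Product overview</title>
  <shortdesc>What the product does and who it is for.</shortdesc>
  <conbody>
    <p>The product manages structured documents.</p>
  </conbody>
</concept>
```

A ditamap then assembles topics like this one into a deliverable. The file names below are assumptions for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<map>
  <title>Product guide</title>
  <!-- Nesting topicref elements defines the hierarchy of the output. -->
  <topicref href="product_overview.dita">
    <topicref href="installing.dita"/>
  </topicref>
</map>
```

Because the hierarchy lives in the map rather than in the topics, the same topic file can be referenced from several maps.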

DocBook

DocBook is also an open standard, designed for technical articles and documentation. DocBook provides a DTD for writing technical books and articles, with the structure that such forms imply. DocBook tags include <article>, <section>, <title>, <articleinfo>, and <pubdate>.

Following are some distinguishing DocBook features:

DocBook is more book-oriented or section-oriented.
DocBook is hierarchical by nature and requires additional development for true single-sourcing. The content is not independent of its context.
DocBook has a fixed but large set of elements and attributes.
DocBook uses an XML include file that pulls in all the other files.
DocBook outputs include PDF, HTML, and HTMLHelp. It can be extended for other output forms with some development work.
DocBook is easy to set up and is better suited for small to medium documentation sets.
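
For comparison with DITA, a minimal DocBook article using the tags named above might look like this sketch. It assumes DocBook XML 4.5 (where articleinfo is used); the title, date, and text are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
  "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<article>
  <articleinfo>
    <title>Getting started</title>
    <pubdate>2024-01-15</pubdate>
  </articleinfo>
  <section>
    <title>Installation</title>
    <para>Run the setup program.</para>
    <!-- Sections nest, so the hierarchy is part of the content itself. -->
    <section>
      <title>Silent installation</title>
      <para>Pass the options on the command line.</para>
    </section>
  </section>
</article>
```

Here the section hierarchy is written into the article, which is what the list above means by content not being independent of its context.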