NOTE: This article was originally published in the Center for Information-Development Management’s (CIDM) Best Practices newsletter in October 2016. The article was contributed by Richard Ackerman, Digabit’s Senior Director of Technical Sales.
The technical publishing world has increasingly gravitated toward structured data concepts and tools over the past decade. From XML to DITA to component content management systems (CCMSs), the evolution continues thanks to savings in labor, improved accuracy, and optimized workflows afforded by structured data methodologies.
While using structured data to publish technical documentation offers significant efficiencies, it is quite common to find publishers who impose unnecessary challenges upon themselves. This article addresses the top three traps publishers should recognize and avoid.
Stop Putting Intelligence in File Names!
What is the purpose of a file name?
From a software perspective, the only hard requirement for a file name is usually that it be a unique string of permitted characters. In practice, however, people saving files very commonly embed information about the content, or “intelligence,” in the name.
While it is important to capture metadata, the file name is typically the worst place to include attributes of the content.
As a hypothetical exercise, let’s pretend the file name for a particular manufactured assembly includes the model number of the product where the assembly is used. If engineering decides to use the same assembly on a related model (with a different model number), will the publications department update the file name accordingly? Probably not.
While this is a simple example, building intelligence into a file name, whether to facilitate search or for some other well-meaning objective, inevitably results in conflict, confusion, or unnecessary maintenance work.
It is much easier to set up a system where the file name simply uses the next available number or some other arbitrary convention. This system allows all respective attributes to be applied as metadata.
If you are diligent about keeping intelligence out of file names, a one-to-many relationship, such as one assembly used on several models, no longer causes confusion. You also avoid the unnecessary work of maintaining file naming conventions that are based on some descriptive aspect of the content.
Best practices for structured data suggest that you should leverage software to manage the attributes of a file or data element. Consider the file name to be the equivalent of a “key” in a database. A file name in structured publishing should simply be a unique identifier. Don’t be lured by the appeal of making it mean something more.
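To make the idea concrete, here is a minimal Python sketch of a registry that hands out meaningless sequential file names and keeps every attribute as metadata. It is only an illustration: the ContentRegistry class, the six-digit numbering scheme, and the model numbers MX-100 and MX-200 are hypothetical and do not come from any particular CCMS.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ContentRecord:
    file_name: str                       # opaque key; carries no meaning
    models: set = field(default_factory=set)
    title: str = ""


class ContentRegistry:
    """Hypothetical in-memory stand-in for a CCMS or database."""

    def __init__(self):
        self._numbers = count(1)         # "next available number"
        self._records = {}

    def add(self, extension, title, models):
        # The name is just the next number; all attributes go in metadata.
        file_name = f"{next(self._numbers):06d}.{extension}"
        self._records[file_name] = ContentRecord(file_name, set(models), title)
        return file_name

    def get(self, file_name):
        return self._records[file_name]

    def find_by_model(self, model):
        # Search runs against metadata, never against the file name.
        return sorted(
            name for name, record in self._records.items()
            if model in record.models
        )


registry = ContentRegistry()
pump = registry.add("xml", "Hydraulic pump assembly", {"MX-100"})
frame = registry.add("xml", "Frame assembly", {"MX-200"})

# Engineering reuses the pump assembly on a second model.
# Only the metadata changes; the file name stays exactly as it was.
registry.get(pump).models.add("MX-200")

files_for_mx200 = registry.find_by_model("MX-200")
```

Because the model number lives in metadata rather than in the name, adding a second model is a one-line change and nothing has to be renamed or relinked.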
Choose Your Storage Wisely!
Most companies have a huge assortment of software tools, databases, server locations, and so on, any of which could feasibly store important information. Depending on the type of information, however, one location usually offers clear advantages over the others.
Let’s consider a scenario in which text-based data is stored within an engineering drawing. Maintaining this information in CAD is significantly more expensive than managing the same data in text-based databases. Revising data in CAD requires skilled, highly compensated resources (usually in short supply), and also involves more demanding workflows to implement changes.
Obviously, high-quality models and drawings require CAD; however, text-based information is better suited for storage in other applications.
One example occurs when OEMs place vendor information on a drawing. This practice is not recommended, even if a part or assembly is sole sourced. Instead, vendor data can be captured in an ERP system and then programmatically applied to the related purchase order. If a new vendor is eventually contracted to supply the part in question, updating a single ERP record takes far less time than revising every drawing that names the old vendor.
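The sketch below illustrates the pattern in Python. It assumes a simple lookup table standing in for the ERP system; the part number, vendor names, and the build_purchase_order function are all hypothetical.

```python
# Hypothetical vendor table standing in for an ERP system.
# The drawing itself carries no vendor information.
preferred_vendor = {"P-1001": "Acme Bearings"}


def build_purchase_order(part_number, quantity, vendors):
    """Look up the current vendor at the moment the order is created."""
    return {
        "part_number": part_number,
        "quantity": quantity,
        "vendor": vendors[part_number],
    }


po_before = build_purchase_order("P-1001", 10, preferred_vendor)

# A new vendor is contracted: one text field changes, no CAD revision.
preferred_vendor["P-1001"] = "Bolt and Gear Supply"

po_after = build_purchase_order("P-1001", 10, preferred_vendor)
```

Every purchase order created after the change picks up the new vendor automatically, while the drawing is never opened.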
With regard to electronic parts catalogs, including text-based information in an illustration is similarly detrimental. Use software that has appropriate places to capture text-based data. It is always easier to update a text field than an illustration. The advantage is most apparent when the same text field is re-used in many places.
Furthermore, combining data elements such as an illustration and its text attributes undermines the relational database whenever one element needs to change and the other does not. Combining elements like this negates one of the major benefits of implementing a database publishing system.
Don’t Combine Data Elements!
Manufacturers employ a number of common strategies to increase publishing efficiency. Some of these strategies endure, even though they were born in an unstructured world.
For instance, consider a parts list matrix within a parts book that shows the corresponding part for each model. In other words, a single page details all of the parts within the machine assembly for all models. While this was extremely effective when creating standalone PDF-type documents, it is incredibly constraining in a structured publishing environment.
With structured database publishing software, tying multiple data elements into a rigid form severely limits the ability of the relational database to manage the data. Each data element needs to be distinct.
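As a rough illustration, the parts list matrix can be broken into distinct elements: parts, models, and the relationships between them. The Python sketch below uses the standard sqlite3 module with an in-memory database. The table names, column names, and sample data are assumptions made for this example, not the schema of any specific publishing product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part  (part_id  INTEGER PRIMARY KEY, description TEXT NOT NULL);
    CREATE TABLE model (model_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    -- One row per (model, item number) replaces one cell of the matrix.
    CREATE TABLE part_usage (
        model_id INTEGER NOT NULL REFERENCES model(model_id),
        item_no  INTEGER NOT NULL,
        part_id  INTEGER NOT NULL REFERENCES part(part_id),
        PRIMARY KEY (model_id, item_no)
    );
""")
conn.executemany("INSERT INTO part VALUES (?, ?)", [
    (1, "Filter element"),
    (2, "Filter element, high flow"),
    (3, "O-ring"),
])
conn.executemany("INSERT INTO model VALUES (?, ?)", [(1, "MX-100"), (2, "MX-200")])
conn.executemany("INSERT INTO part_usage VALUES (?, ?, ?)", [
    (1, 1, 1), (1, 2, 3),   # MX-100: standard filter, shared O-ring
    (2, 1, 2), (2, 2, 3),   # MX-200: high-flow filter, shared O-ring
])


def parts_list(connection, model_name):
    """Return (item number, description) rows for a single model."""
    return connection.execute(
        """
        SELECT u.item_no, p.description
        FROM part_usage AS u
        JOIN part  AS p ON p.part_id  = u.part_id
        JOIN model AS m ON m.model_id = u.model_id
        WHERE m.name = ?
        ORDER BY u.item_no
        """,
        (model_name,),
    ).fetchall()


mx100_before = parts_list(conn, "MX-100")

# The shared O-ring is renamed once; every model's list reflects it.
conn.execute("UPDATE part SET description = ? WHERE part_id = ?", ("O-ring, Viton", 3))

mx100_after = parts_list(conn, "MX-100")
mx200_after = parts_list(conn, "MX-200")
```

A per-model parts list becomes a query rather than a hand-maintained page, and a change to a shared part is made in exactly one place.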
This rule of thumb also applies to our earlier example of placing text-based information within an illustration. Another common error of this type is placing metadata (or information that can be captured in metadata) within a description field.
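The difference is easy to see in a small Python sketch. Here the material and length are held in their own fields rather than being typed into the description. The PartRecord fields, part numbers, and values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartRecord:
    part_number: str
    description: str      # what the part is, and nothing else
    material: str         # discrete attribute, not buried in the description
    length_mm: int        # discrete attribute with a known unit


parts = [
    PartRecord("P-2001", "Hydraulic hose", "rubber", 900),
    PartRecord("P-2002", "Hydraulic hose", "steel braid", 1200),
    PartRecord("P-2003", "Return line", "rubber", 1500),
]


def find_parts(records, **criteria):
    """Return part numbers whose fields exactly match every criterion."""
    return [
        record.part_number
        for record in records
        if all(getattr(record, name) == value for name, value in criteria.items())
    ]


rubber_parts = find_parts(parts, material="rubber")
rubber_hoses = find_parts(parts, description="Hydraulic hose", material="rubber")
```

Had the material been written into the description as free text, the same search would depend on every author spelling and ordering that text identically.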
It is incredible how many unique strategies creative publishers have invented in order to save time. While these approaches may have been advantageous in the past, and may still offer short-term benefits, these “benefits” eventually add up to an opportunity cost that manufacturers cannot afford. Continuing legacy practices that are incompatible with new technologies is guaranteed to create havoc when a modern system is inevitably adopted.
It may be time to pause and sharpen your axe before swinging at the trees. Are the “keys” to your relational database unintelligent? Are you storing each type of data in the most efficient location? Are all of your data elements discrete? If you answer “No” to any of these questions, you have great opportunities to maximize efficiency in your current and future publishing tools.