MEDIA COVERAGE
August, 2006
Business Solutions
“XML: The Future of Content Management?”
XML Conversion
QAI offers cost-effective document conversion services for organizations adopting XML as a standard enterprise content format. Whatever your legacy format, QAI can convert your content to any XML schema or DTD. Our solution is designed to minimize the cost of XML content conversion: it can convert any file that can be printed to PostScript or PDF into XML. The process is automated and requires no human involvement for most documents, yet it allows user intervention where desired.
This process results in a flexible framework that supports unique conversion workflows and offers dramatic cost savings over traditional conversion methodologies. Unlike manual hand-tagging, our process is accurate and fast, and it minimizes human resource requirements, so valuable expertise can be employed more effectively elsewhere. And unlike scripted conversion, it does not depend on consistently applied formatting styles and requires no programming expertise to develop or maintain configuration scripts. Our conversion solution uses visual cues to uncover a document’s structure. The valid XML output file not only maintains the original document’s content and logical structure, but also retains all relevant formatting information.
For more information on QAI's XML Conversion services, download our XML Solutions PDF.
XML Conversion Process
- The first step in the process analyzes a document’s PostScript or PDF representation
to extract all information about the appearance of the document.
This includes the characters in the document and their typography, and any
other visual objects. Because the process extracts text directly from the
input datastream, all content is accurately retained during conversion.
- The next step in the process identifies the basic building blocks of document
structure, including many important visual cues, and the large-scale
layout areas of the page.
- The third step places these now-identified building blocks into a tree structure.
This phase identifies sections, paragraphs, quotes, lists, tables, footnotes,
and other graphical objects, and forms a complete, cohesive, internal
representation of the structured document.
- The final step in the process uses the internal representation of the document,
from step three, to export an XML file that not only presents the
document’s content and logical structure but also retains all relevant
formatting information.
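
The four steps above can be illustrated with a minimal Python sketch. It is not QAI's implementation: the `TextRun` record, the `classify`, `build_tree`, and `convert` functions, the element names, and the rule that "text larger than the most common font size is a heading" are all assumptions made for illustration. Step one (reading PostScript or PDF) is replaced here by a hand-written list of text runs.

```python
from collections import Counter
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class TextRun:
    """Hypothetical step-one output: one line of text plus its appearance."""
    text: str
    font_size: float


def classify(runs):
    """Step two: label each run from visual cues alone (no style names)."""
    # Assumption: the most common font size on the page is body text.
    body_size = Counter(r.font_size for r in runs).most_common(1)[0][0]
    labelled = []
    for run in runs:
        if run.font_size > body_size:
            labelled.append(("heading", run))
        elif run.text.startswith("- "):
            labelled.append(("item", run))
        else:
            labelled.append(("para", run))
    return labelled


def build_tree(labelled):
    """Step three: arrange the labelled blocks into a document tree."""
    root = ET.Element("document")
    section = root          # blocks before the first heading attach to the root
    current_list = None     # open <list>, if the previous block was an item
    for kind, run in labelled:
        if kind == "heading":
            section = ET.SubElement(root, "section")
            node = ET.SubElement(section, "title")
            node.text = run.text
            current_list = None
        elif kind == "item":
            if current_list is None:
                current_list = ET.SubElement(section, "list")
            node = ET.SubElement(current_list, "item")
            node.text = run.text[2:]  # drop the "- " bullet marker
        else:
            node = ET.SubElement(section, "para")
            node.text = run.text
            current_list = None
        # Keep formatting information alongside the logical structure.
        node.set("font-size", str(run.font_size))
    return root


def convert(runs):
    """Step four: export the internal tree as an XML string."""
    return ET.tostring(build_tree(classify(runs)), encoding="unicode")


if __name__ == "__main__":
    sample = [
        TextRun("Overview", 18.0),
        TextRun("First paragraph.", 10.0),
        TextRun("- one", 10.0),
        TextRun("- two", 10.0),
        TextRun("Closing paragraph.", 10.0),
    ]
    print(convert(sample))
```

The sketch infers structure only from appearance (font size and a leading bullet), which mirrors the idea of relying on visual cues rather than on consistently applied style names. A real converter would need far richer cues, such as position, spacing, and column layout.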