XML: The Standard with So Many Flavors

Those of us who have been around since the late 1990s (and even earlier!) remember the promises of XML. It was going to simplify the world by establishing a standard syntax for document interchange between companies and applications: a simple, self-describing tag language that inherently supports nested elements and descriptive attributes. Anyone would be able to use this easy-to-understand format to move data into and out of their applications.

Honestly, this notion had serious potential and was viewed as a “disruptive” technology shift. The XML “standard” described how XML documents are structurally composed, and it extended the idea to external document definitions: first through Document Type Definitions (DTDs), and later through XML Schema documents (XSDs) that define both document structure and element data types. The standard also provides a way to avoid element name conflicts, called Namespaces. A namespace qualifies an element name so that the same field, such as “address 1”, can appear in multiple contexts, such as Bill-To and Ship-To, without ambiguity. As you can imagine, this is quite useful from a practical implementation standpoint.
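To make the namespace idea concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The namespace URIs (urn:example:...), the element names and the read_addresses helper are all assumptions made up for illustration; they do not come from any real Purchase Order standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the same local name, Address1, appears twice,
# but each occurrence is qualified by a different (made-up) namespace.
PO_XML = """
<po:PurchaseOrder xmlns:po="urn:example:po"
                  xmlns:bill="urn:example:billto"
                  xmlns:ship="urn:example:shipto">
  <bill:Address1>100 Main Street</bill:Address1>
  <ship:Address1>55 Dock Road</ship:Address1>
</po:PurchaseOrder>
"""

# Prefix-to-URI map used when searching; the prefixes here only need to
# match the URIs, not the prefixes used inside the document.
NS = {
    "po": "urn:example:po",
    "bill": "urn:example:billto",
    "ship": "urn:example:shipto",
}


def read_addresses(xml_text):
    """Return the Bill-To and Ship-To address lines as a dict."""
    root = ET.fromstring(xml_text)
    return {
        "bill_to": root.find("bill:Address1", NS).text,
        "ship_to": root.find("ship:Address1", NS).text,
    }


addresses = read_addresses(PO_XML)
print(addresses)
```

Because each Address1 is qualified by its own namespace URI, the parser treats them as two distinct element names, and the lookup never has to guess which one is meant.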

The hidden “gotcha” is that XML does not address how a document is used in a business context. Consider a Purchase Order as an example. We know that Purchase Orders contain a Bill-To, a Ship-To and Line Items. However, XML doesn’t help us with this from a business standpoint; it is only a mechanical way to express the data. The furthest XML can go is to support nested elements, which can be used to represent collections of sub-items. Defining the application-level standard is left to the software vendor, to the dominant partner in a business relationship, or to one of the industry consortia and initiatives that have appeared, such as ebXML, OASIS and ECMA.

To understand the true impact of this issue, let’s investigate the urban legend that XML would replace EDI. I remember hearing this in early 1998, and by my reckoning, it’s 2009 and EDI is still going strong. The main reason is that EDI, including each specific business document such as a PO, Invoice or Load Tender, is defined by a standards body: DISA’s X12 in the US and the UN/ECE’s EDIFACT outside the US. The standards body defines the fields, both structure and data typing, needed to use the document for a specific business purpose, not just the fact that the document is a multi-format delimited flat-file.

Now, XML has been very useful and is widely used for integrating data between applications and with external organizations. To be successful with XML integrations, you really need the capability to transform from XML to XML, because your XML document schema and my XML document schema are probably quite different.
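As a rough illustration of what an XML-to-XML transformation involves, the sketch below copies values from one document layout into another using a hand-written field map. Everything in it is hypothetical: the partner document, the target element names, the FIELD_MAP table and the transform function. In practice this job is often done with XSLT or an integration tool; plain Python is used here only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical document as a trading partner might send it.
PARTNER_XML = (
    "<Order>"
    "<PO>4500012345</PO>"
    "<ShipTo><Addr1>55 Dock Road</Addr1></ShipTo>"
    "</Order>"
)

# Hand-written mapping: path in the partner schema -> path in our schema.
# Both sides are made-up names for illustration.
FIELD_MAP = {
    "PO": "Header/PurchaseOrderNumber",
    "ShipTo/Addr1": "ShipTo/Address1",
}


def transform(source_xml, field_map, target_root="PurchaseOrder"):
    """Build a new element tree by copying mapped values from the source."""
    source = ET.fromstring(source_xml)
    target = ET.Element(target_root)
    for src_path, dst_path in field_map.items():
        src = source.find(src_path)
        if src is None:
            continue  # field absent in this document; skip it
        # Walk the destination path, creating intermediate elements on demand.
        node = target
        for tag in dst_path.split("/"):
            child = node.find(tag)
            if child is None:
                child = ET.SubElement(node, tag)
            node = child
        node.text = src.text
    return target


result = transform(PARTNER_XML, FIELD_MAP)
print(ET.tostring(result, encoding="unicode"))
```

Note that the field map is the expensive part: someone who understands both schemas has to write it, once per partner and document type.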

XML also offers an interesting opportunity that other data formats, such as flat files, cannot provide. I’m speaking about using the tag names of XML elements to drive a dictionary-based discovery of mappings between elements in different XML document schemas. The idea of software using a business dictionary to discover that “PO” really means “Purchase Order Number” offers a powerful way to interface with external XML data sources. XML’s hierarchical structure and support for element attributes also allow software to discover semantic relationships between sub-branches of two different XML schemas. This capability opens the door to systems that can receive XML files and make statistical attempts at creating mappings between the documents on the fly. These concepts, if they come to fruition, would reduce the effort needed to integrate external XML into your organization. The reality is that we are not there yet. But there is a lot of interesting work being done, and it won’t be long before we start to see some of XML’s initial promises fulfilled.
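The simplest form of the dictionary idea can be sketched in a few lines of Python. This is a toy under stated assumptions: the BUSINESS_DICTIONARY contents, the sample documents and the discover_mappings function are all invented for illustration, and the matching is a plain synonym lookup rather than the statistical approach described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical business dictionary: canonical term -> known synonyms,
# stored in normalized form (lowercase, letters and digits only).
BUSINESS_DICTIONARY = {
    "PurchaseOrderNumber": {"po", "ponumber", "ordernum"},
    "ShipToAddress1": {"shiptoaddr1", "shipaddress1"},
}


def normalize(tag):
    """Lowercase a tag name and drop separators such as '_' or '-'."""
    return "".join(ch for ch in tag.lower() if ch.isalnum())


def leaf_tags(xml_text):
    """Return the tag names of elements that have no child elements."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if len(el) == 0]


def canonical_term(tag, dictionary):
    """Look up the canonical business term for a tag, or None if unknown."""
    key = normalize(tag)
    for term, synonyms in dictionary.items():
        if key == normalize(term) or key in synonyms:
            return term
    return None


def discover_mappings(source_xml, target_xml, dictionary):
    """Pair source and target tags that resolve to the same business term."""
    targets = {}
    for tag in leaf_tags(target_xml):
        term = canonical_term(tag, dictionary)
        if term is not None:
            targets[term] = tag
    mappings = {}
    for tag in leaf_tags(source_xml):
        term = canonical_term(tag, dictionary)
        if term in targets:
            mappings[tag] = targets[term]
    return mappings


SOURCE = "<Order><PO>1</PO><ShipAddress1>x</ShipAddress1><Notes>n</Notes></Order>"
TARGET = "<PurchaseOrder><Purchase_Order_Number/><ShipTo_Addr1/></PurchaseOrder>"

mappings = discover_mappings(SOURCE, TARGET, BUSINESS_DICTIONARY)
print(mappings)
```

The Notes element has no dictionary entry, so it is left unmapped. Unmatched fields like this one are where a person still has to step in today.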

In our next installment, we’ll explore Flat-Files in their myriad forms and discuss some of the challenges in using this format to integrate to/from our back-end systems.
