If you’ve ever dealt with changes to a working version of a schema, whether it is a database, EDI, XML, or whatever format your data may be in, then you know how painful it is. In the typical data processing scenario, most shops use either a tool or a custom program to process data in one format and convert it to another format, which is then piped off for further processing somewhere else. The most difficult case to deal with can be a change to an XML schema, because XML is so extensible that just about anything can be done with it. The contrasting example is EDI data, where the changes are usually minuscule and the structure itself does not change much. The typical example that most IT shops face is a change to a database: the addition, deletion, or move of a table or column, or a change in table or column properties.
If we look at this from the perspective of a model, a schema is really a tree or a graph (depending on whether it’s recursive) whose nodes are the entities that make up the schema structure. An upgrade then boils down to a series of additions, deletions, and renames that change the graph representing the old schema into the graph representing the new schema. The upgrade path is much simpler (but still painful) if we’re dealing with a custom program, because the intent of the changes is known in house and should be absolutely clear. The real problem is coming up with a generalized solution.
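To see why intent matters, here is a minimal sketch in Python of a naive schema comparison. The schemas, the flattened "table.column" path representation, and the diff_schemas helper are all hypothetical, made up for illustration only.

```python
# Hypothetical schemas: each maps a flattened "table.column" path to a type.
old_schema = {
    "customer.id": "int",
    "customer.name": "varchar(50)",
    "customer.phone": "varchar(20)",
}
new_schema = {
    "customer.id": "int",
    "customer.full_name": "varchar(50)",
    "customer.phone": "varchar(30)",
}


def diff_schemas(old, new):
    """Return the raw differences between two flattened schemas."""
    old_paths, new_paths = set(old), set(new)
    return {
        # Paths that exist only in the new schema.
        "added": sorted(new_paths - old_paths),
        # Paths that exist only in the old schema.
        "removed": sorted(old_paths - new_paths),
        # Paths present in both whose properties differ.
        "changed": sorted(p for p in old_paths & new_paths if old[p] != new[p]),
    }


changes = diff_schemas(old_schema, new_schema)
print(changes)
```

Notice that the rename of name to full_name shows up as one removal plus one addition. A plain comparison cannot tell a rename from an unrelated delete and add, which is exactly the knowledge an in-house team has and a generalized tool lacks.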
While there is no panacea for the generalized upgrade problem, a partial solution can be reached by trying to determine which elements of the old version of the schema pair up with which elements of the new version. The only known algorithms for doing this are heuristic: they work with probabilities to determine the likelihood of a match between elements of the old and new schemas, and they can never be 100% correct, especially in complex cases. In my next blog entry I will talk in more depth about EXTOL’s generalized solution to the problem.
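To make the idea of likelihood-based pairing a little more concrete in the meantime, here is a minimal sketch in Python. It is not EXTOL’s algorithm. The element lists, the 0.7/0.3 weights, the 0.6 threshold, and the greedy pairing strategy are all assumptions chosen for illustration, and the score is a rough similarity measure rather than a true probability. Name similarity comes from the standard library’s difflib.SequenceMatcher.

```python
from difflib import SequenceMatcher


def score(old_el, new_el):
    """Blend name similarity and type equality into a score between 0 and 1.

    Each element is a (name, type) tuple. The weights are arbitrary assumptions.
    """
    name_sim = SequenceMatcher(None, old_el[0].lower(), new_el[0].lower()).ratio()
    type_sim = 1.0 if old_el[1] == new_el[1] else 0.0
    return 0.7 * name_sim + 0.3 * type_sim


def match(old_els, new_els, threshold=0.6):
    """Greedily pair old and new elements, best-scoring candidates first."""
    candidates = sorted(
        ((score(o, n), o[0], n[0]) for o in old_els for n in new_els),
        reverse=True,
    )
    pairs, used_old, used_new = [], set(), set()
    for s, old_name, new_name in candidates:
        if s < threshold:
            break  # everything after this point is too unlikely to be a match
        if old_name in used_old or new_name in used_new:
            continue  # each element may be paired at most once
        pairs.append((old_name, new_name, round(s, 2)))
        used_old.add(old_name)
        used_new.add(new_name)
    return pairs


# Hypothetical elements from an old and a new version of a schema.
old_els = [("name", "varchar"), ("phone", "varchar"), ("fax", "varchar")]
new_els = [("full_name", "varchar"), ("phone", "varchar"), ("email", "varchar")]

matches = match(old_els, new_els)
for old_name, new_name, s in matches:
    print(f"{old_name} -> {new_name} (score {s})")
```

With this sample data, phone pairs with phone and name pairs with full_name, while fax and email are left unmatched and would be treated as a delete and an add. Change the weights or the threshold and the answers change too, which is why this kind of matching can only ever be a best guess.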