Tag Archives: Data Integration

Creating XML in Oracle

In a previous posting (Creating XML in MSSQL: http://www.extol.com/blog/?p=651) I described the feature in MSSQL that allows XML data to be generated with a specially structured database query.  Other databases offer similar capabilities, so this post highlights how to accomplish the same task using Oracle.

For the purposes of this post I’ll be referring to Oracle 11g, which improved on the XML handling in 10g and added features such as a binary XML data type, superior XQuery support, and the ability to embed XML within PL/SQL.
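As a rough illustration of the kind of query involved (not taken from the original post), the sketch below runs an Oracle SQL/XML query from Python using the cx_Oracle driver. The connection details and the customers/orders tables and columns are hypothetical placeholders; the SQL/XML functions used (XMLELEMENT, XMLFOREST, XMLAGG, XMLSERIALIZE) are part of Oracle’s standard SQL/XML support.

import cx_Oracle  # Oracle database driver for Python

# Hypothetical connection details -- adjust user, password, and DSN for your instance.
conn = cx_Oracle.connect(user="demo", password="demo", dsn="localhost/XE")
cur = conn.cursor()

# Build one XML document per customer, with that customer's orders nested inside.
# XMLELEMENT and XMLFOREST construct the nodes, XMLAGG collects the repeating
# <order> elements, and XMLSERIALIZE renders the result as indented text.
sql = """
SELECT XMLSERIALIZE(
         CONTENT
         XMLELEMENT("customer",
           XMLFOREST(c.customer_id   AS "id",
                     c.customer_name AS "name"),
           (SELECT XMLAGG(
                     XMLELEMENT("order",
                       XMLFOREST(o.order_id   AS "orderId",
                                 o.order_date AS "orderDate")))
              FROM orders o
             WHERE o.customer_id = c.customer_id))
         AS VARCHAR2(4000) INDENT) AS customer_xml
  FROM customers c
"""

cur.execute(sql)
for (customer_xml,) in cur:
    print(customer_xml)  # one serialized <customer> document per row

cur.close()
conn.close()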

The Importance of Risk and Change Management Policies

As organizations continue to look for improved efficiencies and reductions in cost, significant emphasis is being placed on “Change Management” – particularly at a time when legacy applications are being replaced with newer, “leading edge” technologies.  An organization’s ability to develop and execute sound risk and change management policies will often determine its success or failure.  Internal stress can be minimized if such policies are evaluated and put in place before any implementation occurs, and utilized throughout it.

Ask yourself the following two questions:

1. How would our organization react if a critical business application or integration database became corrupted?

2. What might happen if we implemented a change to a production data transformation process (e.g., APP/DB to XML) without first testing that change?

Risk management means forecasting and rating issues, and formulating resolutions, before a problem occurs.  The rating weighs how likely an issue is to occur against how significant the consequences are if it does.

Responding to the first question: if the database is unrecoverable, the consequences are major, because all of the objects within the application and database would need to be recreated.  Since this is a critical risk, proper backup and restoration procedures should be in place before the system is moved to a production state.

The second question relates to both change and risk management.  The likelihood of failure depends largely on the severity of the change being made.  The consequences could include the transformation process failing completely, or the receiving entity rejecting all of the data.  If the process handles high-volume business transactions, reprocessing the erroneous data may also impact system performance.  Depending on the time sensitivity of the data, the consequences can range from minimal to severe.

Question:  How do we mitigate such a risk?

Answer:  Adopt a change management policy for updating production configurations.

By employing a fully functional development system, changes to existing production configurations can be tested and then approved for production, so the risk of a failure is minimized or possibly eliminated.

The key to a successful implementation of a business integration application is to adopt strict risk and change management policies before moving into a production state.

EXTOL’s Migration Assistant

EXTOL’s Migration Assistant provides a way for users to upgrade, in place, the schemas used in a Transformation.  The Migration Assistant is a first-class archivable object found in its own node in the Workbench, and it works for all syntax categories.  Typical use cases are an upgrade of an EDI version (e.g., a 4010 850 to a 5010 850), an upgrade of an .xsd (e.g., elements or attributes added to or removed from an XML schema, or other structural changes), changes made to a database (e.g., tables or columns added or removed), an upgrade of a delimited flat file (e.g., fields added or removed), or an upgrade of a spreadsheet (e.g., cells or tabs added or removed).

The Migration Assistant is a two-step process.  First, the user creates a map, called a Schema Association Map, between the “Old Schema” and the “New Schema”.  This map is built in a Ruleset-like editor called the Schema Association Map Editor, either manually or with the Auto Association Algorithm.  To use the Auto Association Algorithm, toggle the “Auto-Associate Children” button in the toolbar and drag a desired source node onto a desired target node; all child nodes of the source will then be auto-associated with the child nodes of the target.  The recommended Associations are displayed in an Approval Editor, ranked by a confidence level, where the user can review and change them.  For schemas that are drastically different, it is recommended that users of the Auto Association Algorithm start with lower-level source elements that they are confident belong with lower-level target elements and work their way up to the root.  For schemas where the changes between the “Old” and “New” schemas are modest (no large structural changes), it is acceptable to associate the two roots and let all of their children be auto-associated, but the results should still be reviewed carefully.

The second step is to apply the Schema Association Map to the desired Ruleset.  This can be done in batch mode (used when a schema is referenced by multiple Rulesets) or against a single Ruleset.  To apply it to a single Ruleset, right-click a Ruleset in the Rulesets tab and click Convert Ruleset->Source/Target.  A dialog asks which Schema Association Map to apply; once the proper map is selected, the Ruleset Conversion Editor is displayed.  This diff-like editor lets the user review and apply the changes dictated by the Schema Association Map.  To initiate batch conversion, right-click the desired Schema Association Map and select Convert Rulesets.  A dialog lists all of the Rulesets to which the selected Schema Association Map can be applied.  For each conversion, the user chooses whether to review the changes in the Ruleset Conversion Editor or to automatically apply the changes dictated by the Schema Association Map to the Ruleset.

The intent is that the solution be flexible enough that simple cases require very little of the user, while still handling the genuinely difficult cases.  Hopefully this has been a helpful and informative guide to upgrading your schemas in EBI.

Small- versus Large-Scale Provisioning

As applied to business integration, the term “provisioning” has many meanings, but overall, it refers to the process of defining integration endpoints and establishing connections and integration processes between them.   If an integration service that connects a pair of endpoints is simple and tightly constrained – for example, a data syndication service with a fixed process model and limited output options – provisioning can be as simple as selecting from a fixed list of connection and data format delivery options, and specifying the delivery endpoint’s address.

In most cases, however, business integration provisioning involves more steps, because the business problem to be solved requires tailored integration between some set of sources and targets, e.g., integration of an XML transaction set with a Warehouse Management System.  Those steps might include definition or specification of endpoints, communication and interface connections, documents/messages and envelopes, data routing, business processes, and data transformations.  By combining building blocks that implement such object types, you can solve most kinds of business-to-business, application, and data integration problems.

Dipping a Toe in the NoSQL Pool

There’s a great deal of data floating around and it has to be put somewhere. In 1970 the relational model was proposed, and the relational database (RDBMS) was built on it to help store that data. Data that was related could be grouped together in a table. A schema was used to define the structure of the data within the database and the relationships among the tables. SQL was used to manipulate the database and the data it contained. Many databases use this model, such as MySQL, PostgreSQL, and Oracle.

For many years, relational databases were the cornerstones of applications. Organizations used them as the backend store for their thick clients, and they became an integral part of the LAMP stack used in early web applications. Only recently has our software needed a little extra oomph.

The NoSQL movement promises to fulfill requirements of high availability, horizontal scaling, replication, schemaless design, and complex computational capabilities. It contests the notion that an RDBMS is always the best place to store your data and opens the door to greater freedom when choosing your storage mechanism.

The framework used to evaluate these systems is based on consistency, availability, and partition tolerance (CAP). The CAP theorem was developed by Eric Brewer to formally describe the tradeoffs in highly scalable systems [1]. As with many other decisions in the software world, you can only pick two of the three criteria.

The NoSQL movement doesn’t subscribe to a particular data model like RDBMS do. There are three other models that are part of the crowd:

key-value: much like a map that supports put, get, and remove (Redis, Dynamo)

column-oriented: still uses tables like the relational model, but without joins (BigTable)

document-oriented: stores structured documents like JSON or XML (CouchDB)

You may be thinking, “Ok, so what is the best one?” I only wish the answer was that simple. Many different factors go into choosing, and you are not limited to one mechanism per application. You can choose different stores for different types of data and functionality [2, 3].
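To make the key-value model above concrete, here is a small, hedged sketch using the redis-py client against a local Redis instance; the key name and the JSON payload are made-up placeholders. The store itself imposes no schema: the application decides what the value means.

import json

import redis  # redis-py client for the Redis key-value store

# Hypothetical local Redis instance on the default host and port.
store = redis.Redis(host="localhost", port=6379, db=0)

# "put": associate an opaque value with a key.
session = {"user": "jdoe", "cart": ["sku-1001", "sku-2002"]}
store.set("session:42", json.dumps(session))

# "get": fetch the value back by key (redis-py returns bytes).
raw = store.get("session:42")
print(json.loads(raw))

# "remove": delete the key when it is no longer needed.
store.delete("session:42")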

Structuring your application to take advantage of these data store capabilities requires analysis of your data requirements. You may need fast access or maybe your data is written more than it’s read. Perhaps you need to perform calculations such as map/reduce or graph manipulations. Maybe your data is of the binary variety. And of course, the availability rabbit hole – do you trust your server not to fail when you’ve just been featured on the 6 o’clock news (or Digg)?

While this is a lot to think about, the benefits of charting your way through the NoSQL forest are worth the effort in the long run. Your application will be better positioned to grow, and your maintenance effort may decrease. However, there’s no cause to throw out your SQL books…just yet.

More info & references:

1. http://www.julianbrowne.com/article/viewer/brewers-cap-theorem

2. http://blog.nahurst.com/visual-guide-to-nosql-systems

3. http://blog.heroku.com/archives/2010/7/20/nosql/

4. http://architects.dzone.com/news/nosql-old-wine-new-bottle