‘High Availability’ 101 – Replication

On Friday we discussed redundancy as a universal attribute of highly available (HA) components. See “‘High Availability’ 101 – Redundancy.” This time we’ll explore data replication, an application of redundancy that is common in many HA systems.

Often, the first hurdle in creating an HA component is accessing data in a highly available fashion. Unfortunately, it is also one of the most complicated and expensive problems to solve. Applying the principle of redundancy introduced on Friday, highly available data means redundant data, and redundant data means copying it to redundant locations. This copying is usually called ‘replication’, so highly available data requires some form of replication. The complications come from the overhead of the copying itself and from keeping the disparate copies consistent. Consistency is especially troublesome when the data can be read and modified concurrently.
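To make the overhead and consistency concerns concrete, here is a minimal sketch in Python. The `Replica` and `ReplicatedStore` classes are hypothetical names invented for this illustration, and the `up` flag merely simulates an outage. The sketch makes no attempt to resynchronize copies, which is exactly the part real systems spend most of their effort on.

```python
class Replica:
    """One copy of the data; `up` simulates whether this location is reachable."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class ReplicatedStore:
    """Naive replication: write to every reachable copy, read from the first one."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Each write is repeated on every reachable replica (the overhead).
        written = 0
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = value
                written += 1
        if written == 0:
            raise RuntimeError("no replica available")
        return written

    def read(self, key):
        # Any reachable replica can serve the read (the availability).
        for replica in self.replicas:
            if replica.up:
                return replica.data.get(key)
        raise RuntimeError("no replica available")


a, b = Replica("a"), Replica("b")
store = ReplicatedStore([a, b])

store.write("x", 1)                     # both copies now hold x = 1
a.up = False
value_during_outage = store.read("x")   # still readable, served by b
store.write("x", 2)                     # only b sees the update
a.up = True
stale_value = store.read("x")           # a answers first, with its old copy
```

After the outage, the two copies disagree: `b` holds the newer value while `a` still serves the old one. Closing that gap without giving up availability is where most of the complexity and expense comes from.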

Fortunately, in a layered architecture, data availability may be inherited. A layer higher in the stack may offer data availability because it is relying on another component for data management, and that component provides data availability. This greatly reduces, or eliminates, complexity for the higher layer.

Take, for instance, a messaging broker (like a JMS provider). The act of publishing new messages may be made highly available by creating redundant brokers, each capable of accepting messages for a particular destination. If one broker fails, the publisher is routed to another that is still working. However, the act of consuming messages at the other end may be problematic. A connection may always be available, but the messages themselves may not. If each redundant broker maintains its own data store, and a broker becomes unavailable, so do the messages on that broker. It is important to note that those messages may be safe in stable storage; they just aren’t available to be delivered until the broker resumes normal functioning. Meanwhile, new messages continue to flow through the system.
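The scenario above can be sketched in a few lines of Python. This is not JMS or any real broker API; `Broker`, `publish`, and `consume_all` are hypothetical names, and the `up` flag only simulates an outage. Each broker keeps a private queue, standing in for its own data store.

```python
from collections import deque


class Broker:
    """A broker with its own private message store (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.queue = deque()


def publish(brokers, message):
    # Route the publisher to the first broker that is still working.
    for broker in brokers:
        if broker.up:
            broker.queue.append(message)
            return broker.name
    raise RuntimeError("no broker available")


def consume_all(brokers):
    # Only messages held by reachable brokers can be delivered.
    delivered = []
    for broker in brokers:
        if broker.up:
            while broker.queue:
                delivered.append(broker.queue.popleft())
    return delivered


brokers = [Broker("b1"), Broker("b2")]

publish(brokers, "m1")              # accepted by b1
brokers[0].up = False               # b1 goes down with m1 in its store
publish(brokers, "m2")              # publishing still works, via b2

delivered = consume_all(brokers)    # only m2 can be delivered
stranded = list(brokers[0].queue)   # m1 is safe, but unavailable

brokers[0].up = True
recovered = consume_all(brokers)    # m1 is delivered once b1 returns
```

Publishing never stopped and nothing was lost, yet `m1` could not be delivered for the duration of the outage.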

If there are availability requirements for the messages themselves, they need to be replicated amongst the brokers. If any one node becomes unavailable, the messages are deliverable via their replicated copies. As stated previously, the broker system doesn’t necessarily have to implement that replication itself. All the redundant brokers could share a central data store, such as a database. If the central store is not highly available, neither is the broker. But, if the database offers data availability (it is doing replication under the covers), the broker can offer availability of its messages as well.
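As a rough sketch of the shared-store arrangement, the Python below has every broker delegate storage to one central object. `SharedStore`, `Broker`, and `first_available` are hypothetical names for this illustration, and any replication inside the store is assumed rather than shown.

```python
from collections import deque


class SharedStore:
    """Stand-in for a central database; its internal replication is assumed."""

    def __init__(self):
        self.up = True
        self.messages = deque()


class Broker:
    """A broker that keeps no messages of its own and relies on the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.up = True
        self.store = store

    def _check(self):
        if not self.up:
            raise RuntimeError(f"broker {self.name} is down")
        if not self.store.up:
            raise RuntimeError("shared store is down")

    def publish(self, message):
        self._check()
        self.store.messages.append(message)

    def consume(self):
        self._check()
        return self.store.messages.popleft() if self.store.messages else None


def first_available(brokers):
    # Route the client to any broker that is still working.
    for broker in brokers:
        if broker.up:
            return broker
    raise RuntimeError("no broker available")


store = SharedStore()
brokers = [Broker("b1", store), Broker("b2", store)]

first_available(brokers).publish("m1")          # accepted by b1
brokers[0].up = False                           # b1 fails
received = first_available(brokers).consume()   # b2 delivers m1 anyway

store.up = False                                # now the store itself fails
try:
    first_available(brokers).publish("m2")
    store_outage_blocks_publish = False
except RuntimeError:
    store_outage_blocks_publish = True
```

Losing a broker no longer strands messages, but losing the store stops every broker at once, so the brokers are only as available as the store beneath them.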

Note that this is only an example of what is possible, not a recommendation for this or any other situation. Sharing a store certainly reduces the complexity of the broker’s internals, but other requirements may call for the broker to handle replication itself.

I’ve purposefully left out many of the complicating details, but if you want to analyze a system that makes high availability claims, look for where it places redundancy and replication. Enterprise DBMSs, virtualization and ‘cloud’ technologies, JEE application servers, SAN, RAID, and so on all use these basic principles. Looking for them can be a useful technique for comparing apples to apples.

One thought on “‘High Availability’ 101 – Replication”

  1. Scott Nye

    This post about high availability is very accurate. Jason Barkanic really seems to know his stuff. Please have him do more posts on this subject in detail. I once heard him speak at the University in TN on Interacting Galaxies and modeling. The guy is a genius.
