This document provides an introduction to using the High Availability and Fault Tolerant Architecture (HAFTA) Toolkit.
This document assumes a working knowledge of the requirements for high availability and fault tolerance.
HAFTA provides two major components to support developing highly available and/or fault tolerant systems.
The first component is a checkpoint library, which lets you design and develop your application with sequenced operations, checkpoints, and rollbacks built in. Because this has to be part of the design from the start, it is best applied when the initial low-level design is conceived.
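To make the idea of sequenced operations, checkpoints, and rollbacks concrete, here is a minimal, language-neutral sketch in Python. It is not the HAFTA checkpoint library API: the class `CheckpointedState` and the helpers `credit` and `failing_op` are hypothetical names invented for this illustration, and the snapshots are kept in memory only.

```python
import copy


class CheckpointedState:
    """Hypothetical helper (not part of HAFTA): state plus saved checkpoints."""

    def __init__(self, initial):
        self.state = initial
        self.sequence = 0          # number of operations applied so far
        self._checkpoints = []     # list of (sequence, snapshot) pairs

    def apply(self, operation):
        """Run one sequenced operation against the current state."""
        operation(self.state)
        self.sequence += 1

    def checkpoint(self):
        """Save a deep copy of the state together with its sequence number."""
        self._checkpoints.append((self.sequence, copy.deepcopy(self.state)))

    def rollback(self):
        """Restore the most recent checkpoint, discarding later changes."""
        if not self._checkpoints:
            raise RuntimeError("no checkpoint to roll back to")
        self.sequence, snapshot = self._checkpoints[-1]
        self.state = copy.deepcopy(snapshot)


def credit(amount):
    """Build an operation that adds `amount` to the balance."""
    def op(state):
        state["balance"] += amount
    return op


def failing_op(state):
    """Simulate a fault that leaves the state half-updated."""
    state["balance"] = -1
    raise RuntimeError("simulated fault")


tracker = CheckpointedState({"balance": 100})
tracker.apply(credit(50))
tracker.checkpoint()               # known-good point after operation 1
try:
    tracker.apply(failing_op)
except RuntimeError:
    tracker.rollback()             # return to the last known-good point
```

The point of the sketch is the design constraint described above: the application has to be written in terms of discrete operations and known-good points from the beginning, which is hard to retrofit later.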
The checkpoint library includes:
The second component is a runtime watchdog process which reads and implements a system definition description. This description tells the runtime, called the overlord, which processes depend on which resources and what actions to take if a dependency fails. This component can be added after the system has been developed, but the most complete implementation would integrate it into each component of your system.
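As a rough illustration of what a system definition expresses, the sketch below models one as a Python dictionary and looks up the actions for a failed resource. The format, the process and resource names, and the function `actions_for_failed_resource` are all assumptions made for this example; the actual syntax read by the overlord is not shown here.

```python
# Hypothetical system definition: each process lists the resources it
# depends on and the action to take if one of them fails.
SYSTEM_DEFINITION = {
    "logger": {"depends_on": ["disk"], "on_failure": "restart"},
    "controller": {"depends_on": ["network", "disk"], "on_failure": "failover"},
}


def actions_for_failed_resource(definition, failed_resource):
    """Return {process: action} for every process that depends on the resource."""
    return {
        name: entry["on_failure"]
        for name, entry in definition.items()
        if failed_resource in entry["depends_on"]
    }


network_actions = actions_for_failed_resource(SYSTEM_DEFINITION, "network")
disk_actions = actions_for_failed_resource(SYSTEM_DEFINITION, "disk")
```

In this sketch a network failure affects only `controller`, while a disk failure affects both processes. A real watchdog would also have to detect the failure and carry out the action, which is outside the scope of this example.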
The overlord includes:
HAFTA is implemented as a platform- and OS-independent toolkit which currently runs on QNX6, QNX4, Linux, and Solaris, and should run on any POSIX-compliant OS which has the POSIX Realtime Extensions available.
We will take you through the initial creation of your own working copy of HAFTA, followed by a simple example to see how it can be used for your projects.