What is fault tolerance? And how does fault tolerance help to ensure high availability functions?
Fault tolerance is the ability of a system to experience a fault but continue to operate nonetheless. Furthermore, it is the ability of the system to continue operating with interruption even if some of its components fail. The objective of establishing fault-tolerant systems is to prevent interruptions arising from a single point of failure (SPoF) and thereby ensuring the high availability (BC) and business continuity (BC) of mission critical systems of an organization.
Fault tolerant systems use redundant components that automatically switch and take over the failed components before loss of service occurs. The fault-tolerant systems may be achieved by one or more of the following components:
- Software systems
- Hardware systems
- Power supply
- Redundant links/paths