II Fault-Tolerant Cluster Configurations
A cluster solution supports availability at three levels:
1) Hot Standby
2) Active takeover
3) Fault tolerant
Hot standby is a redundancy method in which a standby system runs in parallel with an identical primary system. When a failure occurs, the hot standby system immediately replaces the primary system. Because both systems hold identical data, no data is lost in the switch. A hot standby system may be located close to the primary system, in the same building, in another building, or even in another state or country. Examples of components that are commonly kept on hot standby include network printers, hard drives, and audio or visual switches.
In active takeover, when a node fails, the application fails over to an available node in the cluster. The failover may take some time to complete, so the user will experience some delay in the application.
In a fault-tolerant (failover) cluster, when a component fails, the remaining components take over the job of the failed component in order to maintain availability.
To identify whether a node or system is active, a heartbeat technique is used: a stream of heartbeat messages is sent from one node to another. If the system does not receive the heartbeat, it can conclude that the node has failed.
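The heartbeat check described above can be sketched as follows. This is a minimal illustration, not a production detector: the class name `HeartbeatMonitor`, the 3-second timeout, and the node names are all assumptions made for the example, and a fake clock is injected so that the behaviour is deterministic.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (hypothetical helper)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock      # injectable so the example can use a fake clock
        self.last_seen = {}     # node name -> time of the last heartbeat

    def record_heartbeat(self, node):
        """Call this whenever a heartbeat message arrives from `node`."""
        self.last_seen[node] = self.clock()

    def failed_nodes(self):
        """Return the nodes whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


# Usage with a fake clock (assumed values, for illustration only).
fake_time = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=3.0, clock=lambda: fake_time[0])
monitor.record_heartbeat("node-a")
monitor.record_heartbeat("node-b")
fake_time[0] = 2.0
monitor.record_heartbeat("node-a")   # node-a keeps sending, node-b goes silent
fake_time[0] = 4.0
print(monitor.failed_nodes())        # ['node-b']
```

A real cluster would normally require several consecutive missed heartbeats before declaring a node failed, so that a single delayed message does not trigger an unnecessary failover.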
A cluster uses multiple networks to connect its nodes. One node acts as the master node and the remaining nodes are slave nodes. Each slave node sends a heartbeat message to the master node, and the master node detects a failure when it receives no message from a slave node over either of the networks. Once the failure is identified, the system sends a notification to the failed node, and the node again sends its messages to the master node.
The failed component must then be recovered using one of two recovery schemes: backward recovery and forward recovery. Suppose the drive holding the data crashes and the data has to be restored from a backup taken some days earlier. The log file can then be used to reapply (roll forward) all the transactions that were completed but lost because of the system crash. This process is forward recovery. Backward recovery rolls back the transactions that had not completed at the time of the failure, so that those transactions can be run again later.
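The two recovery schemes can be illustrated with a small sketch. The log format used here, tuples of the form `("write", txn, key, old_value, new_value)` and `("commit", txn)`, is an assumption made for this example and does not correspond to any particular database; the function names are likewise hypothetical.

```python
def committed_transactions(log):
    """Collect the ids of transactions that have a commit record in the log."""
    return {entry[1] for entry in log if entry[0] == "commit"}


def forward_recovery(backup, log):
    """Start from a backup and redo committed writes in log order."""
    committed = committed_transactions(log)
    state = dict(backup)
    for entry in log:
        if entry[0] == "write" and entry[1] in committed:
            _, _, key, _old, new = entry
            state[key] = new
    return state


def backward_recovery(state, log):
    """Undo the writes of uncommitted transactions, newest first."""
    committed = committed_transactions(log)
    recovered = dict(state)
    for entry in reversed(log):
        if entry[0] == "write" and entry[1] not in committed:
            _, _, key, old, _new = entry
            recovered[key] = old
    return recovered


# Transaction t1 committed; t2 was still in progress at the crash.
log = [
    ("write", "t1", "x", 1, 5),
    ("commit", "t1"),
    ("write", "t2", "y", 2, 9),
]

# Forward recovery: the drive is lost, so start from the older backup.
print(forward_recovery({"x": 1, "y": 2}, log))    # {'x': 5, 'y': 2}

# Backward recovery: the data survived but contains t2's partial write.
print(backward_recovery({"x": 5, "y": 9}, log))   # {'x': 5, 'y': 2}
```

Both paths end in the same consistent state: the committed work of `t1` is preserved and the incomplete work of `t2` is removed, so `t2` can be submitted again later.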
Fault tolerance in the real world:
A system failure can be caused by either hardware or software. A software issue could be a bug that makes the system hang or crash, or an operating system crash. A hardware issue could be a hard disk failure and so on. Suppose the server running your application suffers a power supply problem at its location. If the application is hosted only at that location, a separate provider is needed to keep the application running. If the system is fault tolerant and hosted in multiple locations, then when one location goes down another continues the tasks that were handled by the failed node, so that users do not experience any issue while accessing the application.
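The multi-location scenario can be sketched as a simple selection rule: requests go to the first location that passes a health check. The function `pick_location`, the site names, and the health check itself are hypothetical and stand in for whatever monitoring the hosting setup actually provides.

```python
def pick_location(locations, is_healthy):
    """Return the first location that passes the health check, or None."""
    for location in locations:
        if is_healthy(location):
            return location
    return None


# Hypothetical sites; "site-east" has lost power in this scenario.
locations = ["site-east", "site-west"]
down = {"site-east"}

active = pick_location(locations, lambda loc: loc not in down)
print(active)   # site-west
```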