Last updated on September 30, 2015

Overview of Recovery and Failover

Understanding Failover

Failover is a backup operational process in which the functions of a primary system are assumed by a secondary system when the primary system becomes unavailable through either failure or scheduled downtime.

Which failover system is best for your business depends on a number of business and staffing factors. While organizations operating business-critical systems often favor an automatic failover system, a manual failover system can be equally effective with the right monitoring tools and staff.

The sections below discuss the two prominent methods of backup and recovery.

The Cold Standby Method

Cold standby involves having an additional server acting as a backup to the primary server. The secondary server remains offline in the absence of failure and is only called upon when the primary server fails.

The primary server’s data is preserved through periodic database and content backups, which are restored on the secondary server in the event of a failure. The final switchover occurs when network traffic is diverted from the primary to the secondary server, thus resuming application availability.
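The switchover sequence described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the Backup record and the restore, start_secondary, and divert_traffic callables are hypothetical placeholders for whatever backup tooling and traffic routing your environment actually uses.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Backup:
    taken_at: datetime
    path: str  # hypothetical location of the backup archive


def latest_backup(backups):
    """Return the most recent backup, which is the one the secondary restores."""
    if not backups:
        raise ValueError("no backups available to restore")
    return max(backups, key=lambda b: b.taken_at)


def cold_failover(backups, restore, start_secondary, divert_traffic):
    """Run the cold standby switchover steps in order; return the backup used.

    All three callables are assumptions supplied by the caller.
    """
    backup = latest_backup(backups)
    start_secondary()   # bring the offline secondary server up
    restore(backup)     # load database and content from the backup
    divert_traffic()    # final switchover: route traffic to the secondary
    return backup
```

Passing the steps in as callables keeps the ordering explicit while leaving the actual restore and routing mechanisms to your own tools.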

While this strategy is typically assumed to introduce longer downtime during a failure, it is really an organization’s backup schedule (e.g., every few hours or every day) that can make this an equally effective solution for recovery: the more frequent the backups, the less data is lost when the secondary server takes over. With the right tools, a cold standby server can be operational in less than thirty minutes. Cold standby systems are typically favored for non-critical applications or in cases where data changes infrequently.
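To make the effect of the backup schedule concrete, the following sketch computes how much data is at risk with cold standby. It assumes that everything written after the last completed backup is lost on failover; the function names and the example schedule are illustrative only.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup_at, failure_at):
    """Span of changes lost: everything written since the last backup."""
    if failure_at < last_backup_at:
        raise ValueError("failure time precedes the last backup")
    return failure_at - last_backup_at


def worst_case_loss(backup_interval):
    """At worst, the failure happens just before the next scheduled backup."""
    return backup_interval


# Illustrative schedule: a backup every four hours.
interval = timedelta(hours=4)
last_backup = datetime(2015, 9, 30, 8, 0)
failure = datetime(2015, 9, 30, 11, 30)

lost = data_loss_window(last_backup, failure)
print(lost)                       # 3:30:00
print(worst_case_loss(interval))  # 4:00:00
```

Shortening the backup interval reduces the worst-case loss directly, which is why the schedule matters more than the standby model itself.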

For a sample implementation of this method, read Backup Using Cold Standby.

The Warm Standby Method

Similar to the Cold Standby model, the Warm Standby method utilizes two servers with one acting as a primary server and one acting as a secondary server. However, with this method, both servers will be online at all times.

Data replication occurs at regular intervals between the two servers, ensuring both servers are up to date with one another. Switchover occurs via a heartbeat system that connects one server to the other, functioning as a pulse to determine the health of both servers. Should the secondary server fail to detect the heartbeat of the primary server, it automatically assumes the role of the primary server.
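The heartbeat logic on the secondary server can be sketched as follows. This is a simplified illustration under stated assumptions: the HeartbeatMonitor class, its timeout value, and the way heartbeats are delivered (a call to record_heartbeat) are hypothetical, and a real deployment would also need to guard against both servers acting as primary at once.

```python
import time


class HeartbeatMonitor:
    """Runs on the secondary; promotes it when the primary's pulse stops."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # assumed tolerance for silence
        self._clock = clock
        self._last_beat = clock()
        self.role = "secondary"

    def record_heartbeat(self):
        """Call whenever a heartbeat from the primary is received."""
        self._last_beat = self._clock()

    def check(self):
        """Promote to primary if no heartbeat arrived within the timeout."""
        silent_for = self._clock() - self._last_beat
        if self.role == "secondary" and silent_for > self.timeout_seconds:
            self.role = "primary"
        return self.role
```

In use, the secondary would call check() on a regular schedule and, once it returns "primary", begin serving traffic. The clock parameter exists only so the timing can be tested without waiting.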

For a sample implementation of this method, read Backup Using Warm Standby.