Resource · Glossary

    What Is Failover?

    Failover is the automatic transfer of work from a failed component to a standby one — the mechanism that turns redundant hardware into actual resilience. When it works, users never notice. When it silently rots, the second failure becomes the outage the first one should have been.

    Where It Lives

    Failover at Every Layer

    Servers

    Clusters and VM restart/migration when a host dies.

    Network

    Redundant links, VRRP gateways, and dynamic rerouting.

    Storage

    Multipath I/O, RAID, and replicated arrays.

    Power

    Dual PSUs on A/B feeds, UPS, and generator transfer.

    The Quiet Failure Mode

    Redundancy Rots When Nobody Watches It

    The most dangerous state in infrastructure isn't a failure — it's running on the standby without knowing it. A power supply died last month and the second one is carrying the box alone; one path of the storage multipath dropped and nobody noticed; the standby's health degraded while it idled. Everything works, until the next single failure has no backup left.

    That's why failover is a monitoring problem as much as an architecture problem. Component-level hardware telemetry — both PSUs, every fan, each link and path — is what catches redundancy loss while it's still a maintenance ticket instead of a 2 a.m. outage.

    Detect running-on-standby conditions
    Alert on lost redundancy, not just loss
    Verify standby health continuously
    Track A/B power feed balance
    Test failover with confidence
    FAQ

    Common Questions About Failover

    What is failover in simple terms?

    Failover is the automatic switch to a standby system when the primary one fails — a second server, network path, or power feed takes over so the service keeps running.

    What is the difference between failover and high availability?

    High availability is the goal (a service that stays up); failover is one of the mechanisms that achieves it. HA designs combine redundancy, failover, and health monitoring to survive component failures.

    What is failover testing?

    Failover testing deliberately fails the primary — pulling a link, stopping a node — to verify the standby actually takes over within the expected time. Untested failover is a hope, not a design.

    Active-active vs active-passive failover?

    Active-passive keeps a standby idle until the primary fails. Active-active runs both nodes, sharing load and taking each other's traffic on failure — better utilization, but requires state synchronization.

    Know your redundancy is real — before you need it