Failover Testing:

What is a Failover Test?

  • A Failover test checks a system’s ability to switch over to back-up systems when a system failure occurs.

Function of Failover Tests:

  • Failover testing demonstrates that an application can fail over and recover from one data center to another seamlessly, proving resiliency across data centers. Tests should cover Active-Active and Active-Passive scenarios, as applicable:
    • Active - Active: consists of at least two nodes, both actively running the same kind of service simultaneously. The main purpose of an active-active cluster is to achieve load balancing. Load balancing distributes workloads across all nodes in order to prevent any single node from getting overloaded. Because there are more nodes available to serve, there will also be a marked improvement in throughput and response times. [1]
      • Let it fail, but fix it fast. This is the premise behind active - active. [2]
      • Nodes and database copies are geographically distributed. Should a disaster take out a node or a database site, there are others in the network to take its place. [2]
      • Active - Active diagram link [1]
    • Active - Passive: consists of at least two nodes, but only one node is active at a time while all other nodes stand by. [1]
      • The passive server serves as the Failover/Backup server and is activated if the active server goes down.
      • For both Active - Active and Active - Passive, it is important that all instances of the server have exactly the same settings.
      • Active - Passive diagram link [1]
  • Failover testing also determines whether a system can bring in extra resources, such as additional CPU or servers, during critical failures or when the system reaches a performance threshold.
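
The difference between the two cluster types can be illustrated with a small sketch. This is a toy model only, not a real cluster manager: the Node class, the two helper functions, and the node names dc-east and dc-west are all hypothetical and exist just to show which nodes serve traffic before and after a failure.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical stand-in for a server or data center."""
    name: str
    healthy: bool = True


def active_active_targets(nodes):
    """Active-active: every healthy node serves traffic at the same time."""
    return [n.name for n in nodes if n.healthy]


def active_passive_target(nodes):
    """Active-passive: only the first healthy node serves; the rest stand by."""
    for n in nodes:
        if n.healthy:
            return n.name
    return None  # no healthy node left, so failover is impossible


nodes = [Node("dc-east"), Node("dc-west")]
print(active_active_targets(nodes))   # ['dc-east', 'dc-west']
print(active_passive_target(nodes))   # dc-east

nodes[0].healthy = False              # simulate losing the first data center
print(active_active_targets(nodes))   # ['dc-west']
print(active_passive_target(nodes))   # dc-west
```

In both cases the surviving node takes over. The active-active cluster simply loses capacity, while the active-passive cluster promotes its standby.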

Testing Results:

  • Developers need to show evidence that the Primary environment is active and the app is running without error before Failover.
  • The Secondary environment is then enabled so that both environments are running at the same time.
  • The Primary environment is then turned off. Developers should run a Smoke test to show that the Secondary environment is active and the app is running without error, which proves that Failover worked as expected.
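
The three steps above can be sketched as a small script that records evidence at each stage. This is a minimal simulation under stated assumptions: the Environment class and run_failover_test function are hypothetical, and smoke_test here only checks a flag, where a real Smoke test would exercise health checks or key user flows.

```python
class Environment:
    """Hypothetical stand-in for a deployed environment (not a real API)."""

    def __init__(self, name, running=False):
        self.name = name
        self.running = running

    def smoke_test(self):
        # Assumption: "running" is all we check in this simulation.
        return self.running


def run_failover_test(primary, secondary):
    """Walk through the three steps and return (step, passed) evidence pairs."""
    evidence = []

    # Step 1: the Primary must be healthy before Failover starts.
    evidence.append(("primary healthy before failover", primary.smoke_test()))

    # Step 2: enable the Secondary so both environments run together.
    secondary.running = True
    evidence.append(("both environments running",
                     primary.smoke_test() and secondary.smoke_test()))

    # Step 3: turn off the Primary; the Secondary must pass on its own.
    primary.running = False
    evidence.append(("secondary healthy after primary is off",
                     secondary.smoke_test() and not primary.running))
    return evidence


results = run_failover_test(Environment("primary", running=True),
                            Environment("secondary"))
for step, passed in results:
    print(f"{'PASS' if passed else 'FAIL'}: {step}")
```

Keeping the evidence as a list of named steps makes it easy to attach the output to a test report.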

Oracle Failover Diagram [3]:

OracleDB_Failover.gif


SQL Failover Diagram [4]:

FailoverSQLDB.png


Resource(s):


Index of Testing Types: https://ultra.guide/bin/view/Testing/DifferentTypesSoftwareTestingIndex
Topic revision: r10 - 28 Apr 2020, KellyEverlyHall