The Pure Storage FlashArray has built-in data protection mechanisms that enable simple, streamlined replication without any add-ons or additional software costs. An organization's RPO requirements for its specific workloads ultimately determine which of the three protection mechanisms it should use. In this blog, I want to discuss our latest replication capability, ActiveDR.
ActiveDR is a continuous source-to-target data replication technique that enables you to replicate data between a source FlashArray and a target FlashArray regardless of distance or latency. In most cases, update-by-update replication makes it possible to meet RPOs of seconds rather than minutes or longer.
ActiveDR leverages pods, similar to ActiveCluster, so volumes, protection groups (pgroups), snapshots, and configuration changes are all replicated. With its journal-based architecture, ActiveDR supports non-disruptive failover testing of DR procedures without affecting production workloads. In the event of an actual disaster on the primary FlashArray, the administrator can promote the ActiveDR replica and be back up within minutes.
Here is a quick demo of ActiveDR in action.
Deciding which replication methodology to implement ultimately depends on two primary factors:
1. Distance between arrays and the associated latency
2. RPO requirements of the data
ActiveCluster synchronous replication:
ActiveCluster replicates data between two FlashArrays synchronously and requires peer latency under 15 ms. There are no source and target roles. With the help of a remote mediator (either Pure1 or an on-premises mediator), both arrays maintain identical images of replicated data. Hosts access the same volumes regardless of which array they connect to, and they can read, write, reconfigure, snap, clone, and destroy replicated volumes via either array. ActiveCluster keeps data identical at both locations by staging every update in both arrays before acknowledging write completion to a host, hence the low-latency requirement. In many cases, hosts connect to both arrays so that if an array or path fails, host multipathing provides uninterrupted access. This provides an RPO of 0.
FlashArray asynchronous replication leverages snapshot replication between arrays. This protection scheme forwards the differences between successive snapshots of a protection group (pgroup) of volumes on a source FlashArray to a target FlashArray. The target array uses the differences to create replicas of the pgroup snapshots. These snapshots are immutable, so to use a replica, a target array administrator clones the needed volumes from it and restarts the applications, connecting them to the clones. With asynchronous replication, RPOs of 5–10 minutes can usually be satisfied.
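The snapshot-difference idea can be illustrated with a minimal Python sketch. It is not how Purity implements replication. It assumes a snapshot can be modeled as a dictionary of block number to data, and the function names `snapshot_delta` and `apply_delta` are hypothetical.

```python
def snapshot_delta(previous, current):
    """Return (changed, removed) blocks between two successive snapshots.

    Snapshots are modeled as dicts of block number -> data (an assumption
    made for illustration only).
    """
    changed = {
        blk: data
        for blk, data in current.items()
        if blk not in previous or previous[blk] != data
    }
    removed = [blk for blk in previous if blk not in current]
    return changed, removed


def apply_delta(replica, changed, removed):
    """Build the next replica snapshot from the previous one plus a delta.

    A new dict is returned so the earlier replica stays untouched,
    mirroring the immutability of snapshots.
    """
    new_replica = dict(replica)
    new_replica.update(changed)
    for blk in removed:
        new_replica.pop(blk, None)
    return new_replica


# Only the differences cross the wire, not the whole snapshot.
snap_1 = {0: "a", 1: "b"}
snap_2 = {0: "a", 1: "c", 2: "d"}
changed, removed = snapshot_delta(snap_1, snap_2)
replica_2 = apply_delta(snap_1, changed, removed)
```

In this model the target rebuilds each snapshot from the previous replica and the forwarded delta, so the recovery point is bounded by the interval between snapshots.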
ActiveDR adds a third option: it replicates updates from a source FlashArray to a target FlashArray as they occur. Sending updates continuously rather than in periodic snapshot deltas minimizes lag. The lag varies with several factors (source update rate, other load on both arrays, available network bandwidth), but it is typically on the order of a few seconds.
For a given pod, ActiveDR is unidirectional: it replicates from a source array to a target array, which maintains a distinct copy of the replicated data. Unlike asynchronous replication, however, ActiveDR forwards updates to the target continuously. It applies updates to volume replicas in order of occurrence, so an ActiveDR replica’s content always represents some previous state of the data at the source site.
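The ordering property can be illustrated with a small Python simulation. It is a sketch under stated assumptions and not the ActiveDR implementation: the classes `SourceArray` and `TargetArray`, the in-memory `journal` list, and the `max_updates` parameter (standing in for limited bandwidth) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SourceArray:
    volume: dict = field(default_factory=dict)
    journal: list = field(default_factory=list)

    def write(self, block, data):
        """Apply a host write and record it in order of occurrence."""
        self.volume[block] = data
        self.journal.append((len(self.journal), block, data))


@dataclass
class TargetArray:
    volume: dict = field(default_factory=dict)
    applied: int = 0  # number of journal entries applied so far

    def receive(self, source, max_updates):
        """Apply up to max_updates pending entries, strictly in order."""
        pending = source.journal[self.applied:self.applied + max_updates]
        for seq, block, data in pending:
            self.volume[block] = data
            self.applied = seq + 1
        return len(pending)


src, tgt = SourceArray(), TargetArray()
history = []  # source state after each write, for comparison
for block, data in [(0, "a"), (1, "b"), (0, "c")]:
    src.write(block, data)
    history.append(dict(src.volume))

tgt.receive(src, max_updates=2)
# The replica lags by one write but matches the source as it was after write 2.
lagging_state_matches = tgt.volume == history[tgt.applied - 1]
```

Because entries are applied strictly in sequence, the replica in this model never holds a mix of old and new writes that the source itself never held. It only trails behind.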
ActiveDR runs at lower priority than host I/O execution on both source and target arrays, so it has negligible effect on host response time. The distance between source and target arrays can therefore extend to thousands of kilometers, protecting against data loss in virtually any conceivable disaster scenario.
For applications such as email, knowledge workers’ home directories, project shares, and others that can tolerate seconds, but not minutes, of lost updates in a disaster, ActiveDR makes it possible to place recovery sites very far from production data centers with minimal risk of lost data, achieving a near-zero RPO.
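The two deciding factors and the three mechanisms described above can be summarized in a short Python helper. This is only a rough sketch, not official sizing guidance. The function name `choose_replication` is hypothetical, and the thresholds are assumptions taken from the figures quoted in this post (peer latency under 15 ms for ActiveCluster, RPOs of about 5 minutes or more for asynchronous replication).

```python
# Assumed thresholds, taken from the figures quoted in this post.
SYNC_MAX_LATENCY_MS = 15      # ActiveCluster needs peer latency under this
ASYNC_MIN_RPO_SECONDS = 300   # asynchronous replication usually meets 5-10 min


def choose_replication(latency_ms, rpo_seconds):
    """Suggest a FlashArray replication mechanism (illustrative only)."""
    if latency_ms < 0 or rpo_seconds < 0:
        raise ValueError("latency and RPO must be non-negative")
    if rpo_seconds == 0:
        if latency_ms < SYNC_MAX_LATENCY_MS:
            return "ActiveCluster"
        # A zero RPO requires synchronous replication, which the link
        # latency rules out in this case.
        raise ValueError("RPO of 0 is not achievable at this latency")
    if rpo_seconds < ASYNC_MIN_RPO_SECONDS:
        return "ActiveDR"
    return "asynchronous replication"


suggestion = choose_replication(latency_ms=80, rpo_seconds=10)  # "ActiveDR"
```

A real decision would also weigh bandwidth, change rate, and failover requirements, so treat the result as a starting point for discussion.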
- Pod-based replication: Uses a storage pod as a management container for replication, failover, and consistency. An active pod on a source array can be linked to a passive pod on a target array to form a pod-to-pod replication pair.
- Near-zero Recovery Point Objective (RPO): Achieves near-zero data loss for rapid disaster recovery at the DR site by keeping the data on the source and target FlashArrays almost synchronized.
- Test recovery without disrupting replication: Enables failover testing while data replication to the recovery site continues, so the RPO is maintained.
- Pre-configured volume and host connection: Allows hosts to be connected to the volumes on the target FlashArray at the recovery site before a failover, which speeds up and simplifies the failover process.
- Bidirectional replication: Allows different pods on the same two FlashArrays to be linked and to replicate in opposite directions across sites.
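The pod-to-pod pairing and the bidirectional case can be pictured with a small Python data model. This is an illustrative sketch rather than the Purity object model: `ReplicaLink`, `validate_links`, `is_bidirectional`, and the array and pod names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaLink:
    """One pod-to-pod replication pair: active source pod -> passive target pod."""
    source_array: str
    source_pod: str
    target_array: str
    target_pod: str


def validate_links(links):
    """Check that each link spans two arrays and no pod is both source and target."""
    sources = {(l.source_array, l.source_pod) for l in links}
    targets = {(l.target_array, l.target_pod) for l in links}
    for link in links:
        if link.source_array == link.target_array:
            raise ValueError(f"{link.source_pod}: link must span two arrays")
    if sources & targets:
        raise ValueError("a pod cannot be both a source and a target")


def is_bidirectional(links):
    """True if different pods replicate in opposite directions between two arrays."""
    directions = {(l.source_array, l.target_array) for l in links}
    return any((dst, src) in directions for src, dst in directions)


links = [
    ReplicaLink("array-nyc", "pod-erp", "array-sfo", "pod-erp-dr"),
    ReplicaLink("array-sfo", "pod-web", "array-nyc", "pod-web-dr"),
]
validate_links(links)
both_ways = is_bidirectional(links)  # each pod is one-way; the array pair is two-way
```

Each link stays unidirectional in this model. The array pair counts as bidirectional only because different pods point in opposite directions.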