Data availability and durability

This page discusses concepts related to data availability and durability in Cloud Storage, including how Cloud Storage redundantly stores data, the default replication behavior for dual-regions and multi-regions, the turbo replication feature for dual-regions, and the cross-bucket replication feature.

Key concepts

  • Cloud Storage is designed for 99.999999999% (11 9's) annual durability.

    • To achieve this, Cloud Storage uses erasure coding and stores data pieces redundantly across multiple devices located in multiple availability zones.

    • Cloud Storage redundantly stores objects that are written to it in at least two different availability zones before considering the write to be successful.

    • Checksums are stored and regularly revalidated to proactively verify the integrity of all data at rest as well as to detect corruption of data in transit. If required, corrections are automatically made using redundant data.

  • The monthly availability of data stored in Cloud Storage depends on the storage class of the data and the location type of the bucket. For more information, see available storage classes.

  • Objects stored in a dual-region or multi-region bucket are stored redundantly in at least two separate geographic places.

    • For dual-regions, you select the specific regions in which your objects are stored.

    • For multi-regions, the specific data centers used for storing your data are determined by Cloud Storage as needed, but are located within the geographic boundary of the multi-region and are separated by at least 100 miles. This provides redundancy across regions at a lower storage cost than dual-regions.

    • In the unlikely event of a region-wide outage, such as one caused by a natural disaster, dual-region and multi-region buckets remain available, with no need to change storage paths.

    For more information about region-specific considerations, see Geography and regions.

  • Objects stored in dual-region and multi-region buckets are typically replicated across geographic places using default replication.

    • If one of the locations where an object is stored becomes unavailable after the object is successfully uploaded but before it is replicated to the second location, Cloud Storage's strong consistency ensures that stale versions of the object aren't served and that subsequent overwrites aren't reverted when the location becomes available again.

    • Objects stored in dual-regions can optionally use turbo replication to achieve a faster, more predictable replication across regions.

  • To achieve redundancy between a pair of regions that isn't available as a dual-region, consider creating a separate bucket in each region and using Storage Transfer Service event-driven transfers or cross-bucket replication to keep the buckets in sync.
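To put the durability figure above in perspective, the following is a minimal sketch that reads 99.999999999% as an independent per-object probability of surviving one year. That reading is a simplifying assumption made for illustration, not a statement of how Cloud Storage defines or measures durability, and the function name is hypothetical.

```python
from fractions import Fraction

# 99.999999999% annual durability, kept as an exact fraction to avoid
# floating-point rounding at eleven significant digits.
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_loss(object_count: int) -> Fraction:
    """Expected number of objects lost per year under the simplified model."""
    return object_count * (1 - ANNUAL_DURABILITY)


loss = expected_annual_loss(10**9)  # one billion stored objects
print(loss)             # 1/100
print(float(1 / loss))  # 100.0
```

Under this model, a bucket holding one billion objects would be expected to lose one object roughly every 100 years.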

Redundancy across regions

While traditional storage models often rely on an active-passive approach with "primary" and "secondary" geographic locations, Cloud Storage provides an active-active architecture based on a single bucket with redundancy across regions. This simplifies the disaster recovery process by eliminating the need for users to replicate data from one bucket to another or manually fail over to a secondary bucket when the primary region is down.

Cloud Storage always understands the current state of a bucket and transparently serves objects from an available region as required. As a result, dual-region and multi-region buckets are designed to have a recovery time objective (RTO) of zero, and temporary regional failures are normally invisible to users; in the case of a regional outage, dual-region and multi-region buckets automatically continue serving all data that has been replicated across regions.

However, redundancy across regions occurs asynchronously, and any data that does not finish replicating across regions prior to a region becoming unavailable is inaccessible until the downed region comes back online. Data could potentially be lost in the very unlikely case of physical destruction of the region.

Default replication in Cloud Storage is designed to provide redundancy across regions for 99.9% of newly written objects within a target of one hour and 100% of newly written objects within a target of 12 hours. Newly written objects include uploads, rewrites, copies, and compositions.
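The two default replication targets can be expressed as a simple check over a set of observed replication delays. The sketch below is illustrative only: the function name and the idea of collecting per-object delays in minutes are assumptions, and it does not reflect how Cloud Storage itself measures conformance.

```python
from typing import Sequence

ONE_HOUR_MIN = 60          # target window for 99.9% of newly written objects
TWELVE_HOURS_MIN = 12 * 60  # target window for 100% of newly written objects


def meets_default_targets(delays_minutes: Sequence[float]) -> bool:
    """Return True if the delays satisfy both default replication targets."""
    if not delays_minutes:
        return True  # nothing was written, so nothing can miss a target
    within_hour = sum(1 for d in delays_minutes if d <= ONE_HOUR_MIN)
    # Integer arithmetic avoids float rounding at the 99.9% boundary.
    enough_within_hour = within_hour * 1000 >= 999 * len(delays_minutes)
    all_within_twelve_hours = max(delays_minutes) <= TWELVE_HOURS_MIN
    return enough_within_hour and all_within_twelve_hours


# 999 objects replicate in 5 minutes; one takes 90 minutes.
print(meets_default_targets([5] * 999 + [90]))  # True
```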

Turbo replication

Turbo replication provides faster redundancy across regions for data in your dual-region buckets, which reduces exposure to data loss and helps support uninterrupted service following a regional outage. When enabled, turbo replication is designed to replicate 100% of newly written objects to both regions that constitute the dual-region within a recovery point objective (RPO) of 15 minutes, regardless of object size.

Note that even for default replication, most objects finish replication within minutes.

While redundancy across regions and turbo replication help support business continuity and disaster recovery (BCDR) efforts, administrators should plan and implement a full BCDR architecture that's appropriate for their workload.

For more information, see the Step-by-step guide to designing disaster recovery for applications in Google Cloud.

Limitations

  • Turbo replication is only available for buckets in dual-regions.

  • Turbo replication cannot be managed through the XML API, including creating a new bucket with turbo replication enabled.

  • When turbo replication is enabled on a bucket, it can take up to 10 seconds before it begins to apply to newly written objects.

  • Object writes that began prior to enabling turbo replication on a bucket replicate across regions at the default replication rate.

    • Object composition that uses any source objects written using default replication in the last 12 hours creates a composite object that also uses default replication.
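The timing rules in the limitations above can be sketched as a small decision function. Everything here is a hypothetical model for reasoning about which replication rate to expect; the function, its parameters, and the "either" result for writes inside the 10-second window are illustrative assumptions, not part of any Cloud Storage API.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple

PROPAGATION_WINDOW = timedelta(seconds=10)  # turbo can take up to 10 s to apply
COMPOSE_LOOKBACK = timedelta(hours=12)      # look-back period for compose sources


def expected_replication(
    write_started: datetime,
    turbo_enabled_at: datetime,
    compose_sources: Iterable[Tuple[datetime, str]] = (),
) -> str:
    """Return "default", "turbo", or "either" for one newly written object.

    compose_sources holds (written_at, mode) pairs for the source objects
    of a composition; leave it empty for ordinary writes.
    """
    # Writes that began before turbo was enabled use the default rate.
    if write_started < turbo_enabled_at:
        return "default"
    # A composite inherits default replication from any recent default source.
    for written_at, mode in compose_sources:
        if mode == "default" and write_started - written_at <= COMPOSE_LOOKBACK:
            return "default"
    # Inside the propagation window, either rate might apply.
    if write_started - turbo_enabled_at < PROPAGATION_WINDOW:
        return "either"
    return "turbo"


enabled = datetime(2024, 1, 1, 9, 0, 0)
print(expected_replication(enabled + timedelta(minutes=1), enabled))  # turbo
```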

Cross-bucket replication

In some cases, you might want to maintain a copy of your data in a second bucket. Cross-bucket replication copies new and updated objects asynchronously from a source bucket to a destination bucket.

Cross-bucket replication differs from default replication and turbo replication in that your data exists in two buckets, each with its own configuration, such as storage location, encryption, access, and storage class. As a result, it offers data recovery and availability, but is also suitable for:

  • Data sovereignty: Maintain data across geographically distant regions.
  • Maintain separate development and production versions: Create distinct buckets and namespaces, so that development doesn't affect your production workload.
  • Share data: Replicate data to a bucket owned by a vendor or partner.
  • Data aggregation: Combine data from different buckets into a single bucket to run analytics workloads.
  • Manage cost, security, and compliance: Maintain your data under different ownerships, storage classes, and retention periods.

Cross-bucket replication uses Storage Transfer Service to replicate objects and Pub/Sub to receive notifications of changes to the source and destination buckets. Cross-bucket replication can be enabled on new buckets you create and on existing buckets. Most objects are replicated within minutes, while objects larger than 1 GiB can take several hours.

For instructions on using cross-bucket replication, see Use cross-bucket replication.

Limitations

  • Object deletions in the source bucket are not replicated to the destination bucket.

  • Object lifecycle configurations aren't replicated.

  • When objects are replicated, timestamp metadata (for example, timeCreated and timeUpdated) isn't preserved. See Transfers between Cloud Storage buckets for details on metadata preservation.
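The effect of the first limitation is that the destination bucket can accumulate objects that no longer exist in the source. The toy model below illustrates this with in-memory dictionaries standing in for buckets. It is a hypothetical sketch: it ignores the asynchronous delay, metadata, and everything else about the real service.

```python
from typing import Dict, Iterable, Tuple


def simulate_replication(
    events: Iterable[Tuple[str, ...]],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Apply write/delete events to a source and show the destination state."""
    source: Dict[str, str] = {}
    destination: Dict[str, str] = {}
    for op, name, *payload in events:
        if op == "write":
            # New and updated objects are copied to the destination.
            source[name] = payload[0]
            destination[name] = payload[0]
        elif op == "delete":
            # Deletions affect only the source; they are not replicated.
            source.pop(name, None)
    return source, destination


source, destination = simulate_replication([
    ("write", "a.txt", "v1"),
    ("write", "a.txt", "v2"),
    ("write", "b.txt", "v1"),
    ("delete", "b.txt"),
])
print(source)       # {'a.txt': 'v2'}
print(destination)  # {'a.txt': 'v2', 'b.txt': 'v1'}
```

If the two buckets need to match exactly, deletions have to be handled separately from replication.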

Performance monitoring

Cloud Storage monitors the oldest unreplicated objects. If an object remains unreplicated for longer than its RPO (recovery point objective), it's considered to be out of RPO. Each minute in which one or more objects are out of RPO is counted as a "bad" minute.

For example, if one object yielded 20 bad minutes from 9:00-9:20 AM, and another object yielded 10 bad minutes from 9:15-9:25 AM, then there are two objects for the month that are out of RPO. The total number of bad minutes for the month is 25 minutes, because from 9:00 AM to 9:25 AM there was at least one object that was missing its RPO.
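The count in this example is the length of the union of the two intervals, not their sum. A minimal sketch of that calculation follows; the function name and the representation of out-of-RPO periods as (start, end) pairs are assumptions made for illustration.

```python
from datetime import datetime
from typing import Iterable, List, Tuple


def total_bad_minutes(intervals: Iterable[Tuple[datetime, datetime]]) -> int:
    """Count minutes during which at least one object was out of RPO."""
    merged: List[List[datetime]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval, so extend it instead of adding.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return sum(int((end - start).total_seconds() // 60) for start, end in merged)


out_of_rpo = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 20)),   # first object
    (datetime(2024, 1, 1, 9, 15), datetime(2024, 1, 1, 9, 25)),  # second object
]
print(total_bad_minutes(out_of_rpo))  # 25
```

The five minutes from 9:15 to 9:20 are covered by both objects but counted once, which is why the total is 25 rather than 30.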

  • For buckets using turbo replication, the RPO for objects is 15 minutes.

  • For buckets using default replication, the RPO for objects is 12 hours.

    • For buckets that use default replication, objects are typically replicated in one hour or less.

  • Cross-bucket replication doesn't provide an RPO.

Within the Google Cloud console, the Percent of minutes out of RPO graph lets you monitor the percentage of bad minutes during the past 30 days for your bucket. This service level indicator can be used to monitor your bucket's Monthly Replication Time Conformance. Similarly, the Percent of objects out of target graph tracks object replications that did not occur within the RPO. This service level indicator can be used to monitor the bucket's Monthly Replication Volume Conformance. For more information, see Cloud Storage monitoring and Cloud Storage SLA.
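As a rough guide to the scale of these percentages, the sketch below computes both indicators from raw counts. It assumes the denominators are all minutes in the 30-day window and all objects written in that window; those denominators and the function names are illustrative assumptions, so check the Cloud Storage SLA for the authoritative definitions.

```python
def percent_minutes_out_of_rpo(bad_minutes: int, window_days: int = 30) -> float:
    """Percentage of minutes in the window with at least one object out of RPO."""
    total_minutes = window_days * 24 * 60  # 43,200 minutes in 30 days
    return 100 * bad_minutes / total_minutes


def percent_objects_out_of_target(late_objects: int, total_objects: int) -> float:
    """Percentage of written objects that did not replicate within the RPO."""
    if total_objects == 0:
        return 0.0
    return 100 * late_objects / total_objects


# 25 bad minutes and 2 late objects out of 1,000,000 written in the window.
print(round(percent_minutes_out_of_rpo(25), 4))  # 0.0579
print(percent_objects_out_of_target(2, 1_000_000))
```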

What's next