AWS S3 (Cross Region) Replication

AWS S3 Cross Region Replication is a bucket-level configuration that enables automatic, asynchronous copying of objects across buckets in different AWS Regions. These buckets are referred to as the source bucket and the destination bucket.

By default, when replication is set up:

  • Replicas have the same key names and the same metadata—for example, creation time, user-defined metadata, and version ID
  • Amazon S3 stores object replicas using the same storage class as the source object, unless you explicitly specify a different storage class in the replication configuration
  • Assuming that the object replica continues to be owned by the source object owner, when Amazon S3 initially replicates objects, it also replicates the corresponding object access control list (ACL)

Use cases

Cross-region replication is useful for various reasons, including:

  • Compliance requirements might dictate that copies of data are stored a certain distance apart. Cross-region replication allows you to replicate data between distant AWS Regions to satisfy these requirements.
  • If you have clients in different geographic locations, you can minimize latency in accessing objects by maintaining copies of objects in AWS Regions that are geographically closer to those users.
  • If there are compute clusters in different regions, it might be a good idea operationally to maintain object copies in those regions for performance reasons.

Requirements for cross-region replication

  • The source and destination buckets must have versioning enabled
  • The source and destination buckets must be in different AWS Regions.
  • Amazon S3 must have permissions to replicate objects from the source bucket to the destination bucket on your behalf. These permissions are most easily granted through an IAM role.
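
The first two requirements can be checked from code before a rule is created. The following is a minimal sketch in Python, assuming a boto3-style S3 client is passed in. The function name check_replication_requirements is my own and not part of any AWS API; the only client calls used are get_bucket_versioning and get_bucket_location.

```python
def check_replication_requirements(s3_client, source_bucket, destination_bucket):
    """Return a list of problems; an empty list means both checks passed.

    s3_client is assumed to be a boto3 S3 client (or any object exposing the
    same get_bucket_versioning / get_bucket_location methods).
    """
    problems = []
    regions = {}
    for bucket in (source_bucket, destination_bucket):
        versioning = s3_client.get_bucket_versioning(Bucket=bucket)
        # "Status" is absent when versioning has never been enabled.
        if versioning.get("Status") != "Enabled":
            problems.append(f"{bucket}: versioning is not enabled")

        location = s3_client.get_bucket_location(Bucket=bucket)
        # Buckets in us-east-1 report an empty LocationConstraint.
        regions[bucket] = location.get("LocationConstraint") or "us-east-1"

    if regions[source_bucket] == regions[destination_bucket]:
        problems.append("source and destination buckets are in the same region")
    return problems
```

With boto3 installed and credentials configured, the client would come from boto3.client("s3"). If versioning turns out to be off, it can be switched on with the client's put_bucket_versioning call, passing VersioningConfiguration={"Status": "Enabled"}.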

Setting up

In this article I will set up replication from a bucket called haddad-cloud-bucket-1, which is in the EU (London) region, to a bucket called haddad-cloud-bucket-1-tokyo in the Asia Pacific (Tokyo) region. Configuring replication in the AWS console is done through the Management tab of the bucket page.

Bucket Management Page

Clicking on the Add rule link starts the wizard. When configuring the source, you can choose to replicate everything in the bucket or only the objects under a prefix, which means a folder path in the bucket.

Replication Source Configuration

By default, Amazon S3 uses the storage class of the source object to create an object replica. When configuring the destination in the next step of the wizard, you can optionally specify a different, cheaper storage class, such as Standard-Infrequent Access (Standard-IA).

For replication to work, both the source and destination buckets must have versioning enabled. At the time of writing, it is not possible to replicate to multiple destination buckets or to daisy-chain replicating buckets.

Replication Destination Configuration

The next stage sets up permissions through an IAM role.

IAM Roles
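
For readers who prefer to create the role outside the wizard, the sketch below builds the two policy documents such a role typically needs: a trust policy that lets the S3 service assume the role, and a permissions policy covering reads on the source bucket and replication writes on the destination. The function name is my own, and the action list is my understanding of what AWS documents for a basic same-account setup, so treat it as a starting point and not as the exact policy the wizard generates.

```python
import json


def build_replication_role_policies(source_bucket, destination_bucket):
    """Return (trust_policy, permissions_policy) as plain dicts."""
    source_arn = f"arn:aws:s3:::{source_bucket}"
    destination_arn = f"arn:aws:s3:::{destination_bucket}"

    # Lets the S3 service assume the role on your behalf.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }

    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {   # Read the replication configuration and list the source bucket.
                "Effect": "Allow",
                "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
                "Resource": [source_arn],
            },
            {   # Read object versions, ACLs and tags from the source.
                "Effect": "Allow",
                "Action": [
                    "s3:GetObjectVersionForReplication",
                    "s3:GetObjectVersionAcl",
                    "s3:GetObjectVersionTagging",
                ],
                "Resource": [f"{source_arn}/*"],
            },
            {   # Write replicas, delete markers and tags to the destination.
                "Effect": "Allow",
                "Action": [
                    "s3:ReplicateObject",
                    "s3:ReplicateDelete",
                    "s3:ReplicateTags",
                ],
                "Resource": [f"{destination_arn}/*"],
            },
        ],
    }
    return trust_policy, permissions_policy


trust, permissions = build_replication_role_policies(
    "haddad-cloud-bucket-1", "haddad-cloud-bucket-1-tokyo"
)
print(json.dumps(permissions, indent=2))
```

Serialized with json.dumps, the trust policy can be passed as AssumeRolePolicyDocument to the IAM create_role call, and the permissions policy as PolicyDocument to put_role_policy.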

Finally, the wizard lets you review your options before you create the rule.

Summary
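
The choices made in the wizard (source prefix, destination bucket, optional storage class, and IAM role) map onto a single replication configuration document. The sketch below builds one as a Python dict in the shape that boto3's put_bucket_replication expects, using the rule format with a top-level Prefix key. The helper name, the rule ID, the account number 123456789012 and the role name are all placeholders of mine, not values from this setup.

```python
def build_replication_config(role_arn, destination_bucket, prefix="",
                             storage_class=None, rule_id="example-rule"):
    """Build a ReplicationConfiguration dict with a single enabled rule.

    An empty prefix means "replicate everything in the bucket".
    """
    destination = {"Bucket": f"arn:aws:s3:::{destination_bucket}"}
    if storage_class is not None:
        # Omit the key entirely to keep the source object's storage class.
        destination["StorageClass"] = storage_class

    return {
        "Role": role_arn,
        "Rules": [{
            "ID": rule_id,
            "Status": "Enabled",
            "Prefix": prefix,
            "Destination": destination,
        }],
    }


config = build_replication_config(
    role_arn="arn:aws:iam::123456789012:role/example-replication-role",  # placeholder
    destination_bucket="haddad-cloud-bucket-1-tokyo",
    storage_class="STANDARD_IA",
)
print(config["Rules"][0]["Destination"])
```

Applying it would then be a matter of calling put_bucket_replication on a boto3 S3 client with Bucket="haddad-cloud-bucket-1" and ReplicationConfiguration=config.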

Once the rule is in place, replication will soon begin. However, it is worth noting a few points about what is replicated and what isn’t.

What is Replicated

  • Any new objects created after replication is configured
  • Object metadata is also replicated
  • Object tags, if any, are replicated
  • Any object ACL updates are replicated, unless Amazon S3 is configured to change the replica ownership in a cross-account scenario
  • Only objects in the source bucket for which the bucket owner has permissions to read objects and read ACLs will be replicated
  • When an object is deleted from the source bucket:
    • If a DELETE request is made without specifying an object version ID, Amazon S3 adds a delete marker, which cross-region replication replicates to the destination bucket.
    • If a DELETE request specifies a particular object version ID to delete, Amazon S3 deletes that object version in the source bucket, but it does not replicate the deletion in the destination bucket (in other words, it does not delete the same object version from the destination bucket). This behavior protects data from malicious deletions.
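
The two delete cases above are easier to see in a toy model. The sketch below is an in-memory simulation of my own, not the S3 API: each bucket is a dict of version lists, and a body of None stands in for a delete marker. It only illustrates which deletes cross over to the destination.

```python
import itertools
from dataclasses import dataclass, field

_version_ids = itertools.count(1)


@dataclass
class ToyBucket:
    # key -> list of (version_id, body); a body of None is a delete marker
    versions: dict = field(default_factory=dict)

    def version_ids(self, key):
        return [vid for vid, _ in self.versions.get(key, [])]


def put_object(source, replica, key, body):
    """New objects are written to the source and replicated."""
    version_id = f"v{next(_version_ids)}"
    for bucket in (source, replica):
        bucket.versions.setdefault(key, []).append((version_id, body))
    return version_id


def delete_object(source, replica, key, version_id=None):
    if version_id is None:
        # No version ID: a delete marker is added, and it is replicated.
        marker_id = f"v{next(_version_ids)}"
        for bucket in (source, replica):
            bucket.versions.setdefault(key, []).append((marker_id, None))
        return marker_id
    # Version-specific delete: applied to the source only.
    source.versions[key] = [
        v for v in source.versions.get(key, []) if v[0] != version_id
    ]
    return version_id


source, replica = ToyBucket(), ToyBucket()
first = put_object(source, replica, "Dockerfile", "FROM alpine")
marker = delete_object(source, replica, "Dockerfile")
delete_object(source, replica, "Dockerfile", version_id=first)

print(source.version_ids("Dockerfile"))   # only the delete marker is left
print(replica.version_ids("Dockerfile"))  # original version plus the marker
```

After the version-specific delete, the source holds only the delete marker, while the destination still holds the original version as well, which is the protection against malicious deletions described above.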

What is Not Replicated

  • Objects that existed in the source bucket before replication was configured are not retroactively replicated
  • Objects in the source bucket for which the bucket owner does not have permissions
  • Objects in the source bucket that are replicas, created by another cross-region replication, are not replicated.
  • The following encrypted objects are not replicated:
    • Objects created with server-side encryption using customer-provided (SSE-C) encryption keys.
    • Objects created with server-side encryption using AWS KMS–managed encryption (SSE-KMS) keys, unless this option is explicitly enabled.
  • Updates to bucket-level subresources are not replicated.
    • For example, you might change lifecycle configuration on your source bucket or add notification configuration to your source bucket.
    • These changes are not applied to the destination bucket.
    • This allows you to have different bucket configurations on the source and destination buckets.
  • Only customer actions are replicated. Actions performed by lifecycle configuration are not replicated.
    • For example, if lifecycle configuration is enabled only on your source bucket, Amazon S3 creates delete markers for expired objects, but it does not replicate those markers.
    • However, you can have the same lifecycle configuration on both the source and destination buckets if you want the same lifecycle configuration applied to both buckets.
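
The exclusions in this list can be summarised as a small decision function. The sketch below is purely illustrative: the SourceObject fields and the skip_reason helper are names I made up to encode the rules above, and S3 does not expose anything like them directly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceObject:
    key: str
    created_after_rule: bool = True    # False: existed before the rule
    owner_can_read: bool = True        # bucket owner can read object and ACL
    is_replica: bool = False           # created by another replication
    encryption: Optional[str] = None   # None, "SSE-C" or "SSE-KMS" here
    written_by_lifecycle: bool = False # e.g. an expiration delete marker


def skip_reason(obj, replicate_kms=False):
    """Return why an object is not replicated, or None if it is."""
    if not obj.created_after_rule:
        return "existed before replication was configured"
    if not obj.owner_can_read:
        return "bucket owner lacks permissions on the object"
    if obj.is_replica:
        return "object is itself a replica"
    if obj.encryption == "SSE-C":
        return "encrypted with customer-provided keys (SSE-C)"
    if obj.encryption == "SSE-KMS" and not replicate_kms:
        return "SSE-KMS replication is not enabled"
    if obj.written_by_lifecycle:
        return "created by a lifecycle action"
    return None


objects = [
    SourceObject("2.jpg", created_after_rule=False),
    SourceObject("Dockerfile"),
    SourceObject("banana-3.jpg", created_after_rule=False),
]
for obj in objects:
    print(obj.key, "->", skip_reason(obj) or "replicated")
```

Bucket-level changes such as lifecycle or notification configuration are not objects at all, so they fall outside this function; they simply stay on the bucket where they were made.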

In the example below, the source bucket has three files (2.jpg, Dockerfile, and banana-3.jpg), but only the file called Dockerfile was updated after replication was set up.

Source Bucket

As you can see below, the destination bucket only has the one file (Dockerfile) which was updated after the replication was set up.

Destination Bucket
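
Besides comparing the two bucket listings by eye, the same check can be done per object. As far as I know, the head_object response for a source object carries a ReplicationStatus field (PENDING, COMPLETED or FAILED, with REPLICA reported on the destination copy), and the field is missing for objects that were never picked up by a rule. The sketch below collects that field for a list of keys; the helper name is mine, and it assumes a boto3-style S3 client is passed in.

```python
def replication_statuses(s3_client, bucket, keys):
    """Map each key to its ReplicationStatus, or None when the field is absent.

    s3_client is assumed to be a boto3 S3 client (or any object with a
    compatible head_object method).
    """
    statuses = {}
    for key in keys:
        response = s3_client.head_object(Bucket=bucket, Key=key)
        # Objects that no replication rule applied to have no status at all.
        statuses[key] = response.get("ReplicationStatus")
    return statuses
```

Called against the source bucket with the three keys from this example, I would expect a status only for Dockerfile, since the other two files predate the rule.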
