Set cache-control for entire S3 bucket automatically (using bucket policies?)

Amazon S3 Problem Overview


I need to set cache-control headers for an entire S3 bucket, covering both existing and future files, and was hoping to do it in a bucket policy. I know I can edit the existing ones, and I know how to specify them on PUT if I upload the files myself, but unfortunately the app that uploads them cannot set the headers because it uses s3fs to copy the files there.

Amazon S3 Solutions


Solution 1 - Amazon S3

There are now three ways to get this done: via the AWS Console, via the AWS command line (aws-cli), or via the s3cmd command line tool.


AWS Console Instructions

This is now the recommended solution. It is straightforward, but it can take some time.

  • Log in to AWS Management Console
  • Go into S3 bucket
  • Select all files by route
  • Choose "More" from the menu
  • Select "Change metadata"
  • In the "Key" field, select "Cache-Control" from the drop-down menu
  • Enter max-age=604800 (7 days) for Value
  • Press "Save" button
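The max-age value is a number of seconds, so 604800 is simply 7 days × 86400 seconds. As a small sanity check, here is a minimal Python sketch that builds the values used on this page from a number of days. The helper name max_age_header is made up for illustration and is not part of any AWS tool.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def max_age_header(days, public=False):
    """Build a Cache-Control value with max-age expressed in seconds.

    Hypothetical helper for illustration only.
    """
    directives = [f"max-age={days * SECONDS_PER_DAY}"]
    if public:
        directives.append("public")
    return ",".join(directives)


print(max_age_header(7))                 # max-age=604800
print(max_age_header(30, public=True))   # max-age=2592000,public
print(max_age_header(365, public=True))  # max-age=31536000,public
```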

(thanks to @biplob - please give him some love below)


AWS Command Line Solution

Originally, when I created this, bucket policies were a no go, so I figured out how to do it using aws-cli, and it is pretty slick. When researching I couldn't find any examples in the wild, so I thought I would post some of my solutions to help those in need.

NOTE: By default, aws-cli only copies a file's current metadata, EVEN IF YOU SPECIFY NEW METADATA.

To use the metadata that is specified on the command line, you need to add the '--metadata-directive REPLACE' flag. Here are some examples.

For a single file

aws s3 cp s3://mybucket/file.txt s3://mybucket/file.txt --metadata-directive REPLACE \
--expires 2034-01-01T00:00:00Z --acl public-read --cache-control max-age=2592000,public

For an entire bucket (note --recursive flag):

aws s3 cp s3://mybucket/ s3://mybucket/ --recursive --metadata-directive REPLACE \
--expires 2034-01-01T00:00:00Z --acl public-read --cache-control max-age=2592000,public

A little gotcha I found: if you only want to apply it to a specific file type, you need to exclude all the files first, then include the ones you want.

Only jpgs and pngs:

aws s3 cp s3://mybucket/ s3://mybucket/ --exclude "*" --include "*.jpg" --include "*.png" \
--recursive --metadata-directive REPLACE --expires 2034-01-01T00:00:00Z --acl public-read \
--cache-control max-age=2592000,public
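The reason for the exclude-then-include order is that everything is included by default and later filters take precedence over earlier ones. The following Python sketch is only a rough model of that rule (the function is_selected is hypothetical, and the real CLI matches patterns against paths relative to the source directory), but it shows why a lone --include changes nothing:

```python
from fnmatch import fnmatchcase


def is_selected(key, filters):
    """Rough model of ordered --exclude/--include filters.

    Every key starts out included; each matching filter, applied in
    order, overrides the decision made by the ones before it.
    """
    selected = True
    for kind, pattern in filters:
        if fnmatchcase(key, pattern):
            selected = (kind == "include")
    return selected


images_only = [("exclude", "*"), ("include", "*.jpg"), ("include", "*.png")]
print(is_selected("img/logo.png", images_only))            # True
print(is_selected("notes.txt", images_only))               # False
# Without the leading exclude, nothing is filtered out:
print(is_selected("notes.txt", [("include", "*.jpg")]))    # True
```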

Here are some links to the manual if you need more info:

Known Issues:

"Unknown options: --metadata-directive, REPLACE"

This can be caused by an out-of-date awscli; see @eliotRosewater's answer below.


S3cmd tool

S3cmd is a "Command line tool for managing Amazon S3 and CloudFront services". While this solution requires a git pull, it might be a simpler and more comprehensive solution.

For full instructions, see @ashishyadaveee11's post below


Hope it helps!

Solution 2 - Amazon S3

Now, it can be changed easily from the AWS console.

  • Log in to AWS Management Console
  • Go into S3 bucket
  • Select all files by route
  • Choose "More" from the menu
  • Select "Change metadata"
  • In the "Key" field, select "Cache-Control" from the drop down menu
  • Enter max-age=604800 (7 days) for Value
  • Press "Save" button

How long it takes depends on the number of files in your bucket. If you accidentally close the browser, start over from the beginning.

Solution 3 - Amazon S3

Steps

  1. git clone https://github.com/s3tools/s3cmd
  2. Run s3cmd --configure (You will be asked for the two keys - copy and paste them from your confirmation email or from your Amazon account page. Be careful when copying them! They are case sensitive and must be entered accurately or you'll keep getting errors about invalid signatures or similar. Remember to add s3:ListAllMyBuckets permissions to the keys or you will get an AccessDenied error while testing access.)
  3. ./s3cmd --recursive modify --add-header="Cache-Control:public, max-age=31536000" s3://your_bucket_name/

Solution 4 - Amazon S3

Were it that my reputation score were >50, I'd just comment. But it's not (yet) so here's another full answer.


I'd been banging my head on this problem for a while, until I found and read the docs. Sharing what I learned here in case it helps anyone else.

What ended up reliably working for me was this command. I chose a 1 second expiration time for testing to verify expected results:

aws s3 cp \
  --metadata-directive REPLACE \
  --cache-control max-age=1,s-maxage=1 \
  s3://bucket/path/file \
  s3://bucket/path/file

  • --metadata-directive REPLACE is required when using "cp" to modify metadata on an existing file in S3
  • max-age sets the browser caching age, in seconds
  • s-maxage sets the caching age for shared caches such as CloudFront, in seconds

Likewise, if setting these Cache-Control header values on a file while uploading to S3, the command would look like:

aws s3 cp \
  --cache-control max-age=1,s-maxage=1 \
  /local/path/file \
  s3://bucket/path/file

Solution 5 - Amazon S3

I don't think you can specify this at the bucket level but there are a few workarounds for you.

  1. Copy the object to itself on S3, setting the appropriate cache-control headers for the copy operation.

  2. Specify response headers in the URL to the files. You need to use pre-signed URLs for this to work, but you can specify certain response headers in the query string, including cache-control and expires. For a full list of the available options see: http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectGET.html?r=5225
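For the pre-signed URL workaround, here is a minimal sketch using boto3 (an assumption; the answer does not name an SDK). The function name presigned_get_url, the bucket and key in the usage comment, and the one-hour expiry are all illustrative. The client is passed in so the function itself has no hard dependency on credentials.

```python
def presigned_get_url(s3, bucket, key, cache_control, expires_in=3600):
    """Return a pre-signed GET URL that overrides the Cache-Control
    response header for this one request.

    `s3` is expected to be a boto3 S3 client (boto3.client("s3")).
    The stored object metadata is not changed.
    """
    return s3.generate_presigned_url(
        ClientMethod="get_object",
        Params={
            "Bucket": bucket,
            "Key": key,
            # Sent as the response-cache-control query string parameter.
            "ResponseCacheControl": cache_control,
        },
        ExpiresIn=expires_in,  # URL lifetime in seconds
    )


# Usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   url = presigned_get_url(boto3.client("s3"), "mybucket", "file.txt",
#                           "max-age=604800")
```

Note that the URL itself expires, so this only suits cases where you control how links to the files are generated.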

Solution 6 - Amazon S3

You can always configure a Lambda function with a trigger on PutObject in S3; the Lambda will simply change the header of the particular object that was just put.

Then you can run the copy command mentioned above one last time, and all the new objects will be fixed by the lambda.

UPDATE:

Here is a good place to start from: https://www.aaronfagan.ca/blog/2017/how-to-configure-aws-lambda-to-automatically-set-cache-control-headers-on-s3-objects/
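As a rough idea of what such a Lambda could look like, here is a minimal Python sketch using boto3. It is not the code from the linked article. The function name, the CACHE_CONTROL value, and the optional s3 parameter (added so the logic can be exercised with a stub client) are assumptions. The copy-to-self fires another object-created event, so the sketch skips objects that already carry the desired header to avoid looping.

```python
from urllib.parse import unquote_plus

CACHE_CONTROL = "max-age=604800"  # assumed value: 7 days


def set_cache_control_on_put(event, context, s3=None):
    """S3-triggered Lambda handler: rewrite Cache-Control on new objects."""
    if s3 is None:
        import boto3  # imported lazily; available in the Lambda runtime
        s3 = boto3.client("s3")

    updated = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])

        head = s3.head_object(Bucket=bucket, Key=key)
        if head.get("CacheControl") == CACHE_CONTROL:
            continue  # already fixed; also stops the copy from re-triggering us

        # Copy the object onto itself, replacing metadata but carrying
        # over the existing content type and user metadata.
        s3.copy_object(
            Bucket=bucket,
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},
            MetadataDirective="REPLACE",
            CacheControl=CACHE_CONTROL,
            ContentType=head.get("ContentType", "binary/octet-stream"),
            Metadata=head.get("Metadata", {}),
        )
        updated.append(key)
    return updated
```

A single copy_object call only works for objects up to 5 GB, and REPLACE drops any other system metadata (such as Content-Encoding) that is not passed explicitly, so extend the call if your objects rely on those.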

Solution 7 - Amazon S3

Bucket policies grant permissions on the bucket and the objects stored inside it, so this road won't yield the results you are looking for. The other answers modify the object metadata using automated means, but you can also use Lambda@Edge if you are willing to move the bucket behind CloudFront.

With Lambda@Edge you can run arbitrary code for each client request and it can change the headers returned from the origin (S3 bucket in this case). It requires a bit more configuration and it costs some money, but here's a blueprint of the solution:

  • create a CloudFront distribution
  • add the S3 bucket as the origin
  • create a lambda function that modifies the response header
  • use the CloudFront distribution's URL to access the files

The AWS documentation has an example of how to modify response headers. If you happen to use Terraform to manage the infrastructure, I've written an article on how to do it.
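For orientation, the Lambda function in the blueprint above could be as small as the following Python sketch, attached to the distribution as an origin-response trigger. The function name and the max-age value are assumptions; the event layout follows the CloudFront Lambda@Edge event structure, where header names are lowercase dictionary keys.

```python
CACHE_CONTROL = "max-age=604800"  # assumed value: 7 days


def add_cache_control_header(event, context):
    """Lambda@Edge origin-response handler: set Cache-Control on the
    response coming back from the S3 origin."""
    response = event["Records"][0]["cf"]["response"]
    # Each header is a list of {"key", "value"} pairs under a lowercase name.
    response["headers"]["cache-control"] = [
        {"key": "Cache-Control", "value": CACHE_CONTROL}
    ]
    return response
```

The objects in S3 stay untouched; only responses served through the CloudFront URL get the header.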

Solution 8 - Amazon S3

To those attempting to use Dan's answer and getting the error:

> "Unknown options: --metadata-directive, REPLACE"

I ran into the issue, and the problem was that I installed awscli using

> sudo apt-get install awscli

This installed an old version of the awscli which is missing the --metadata-directive option. So I used sudo apt-get remove awscli to remove it.

Then I reinstalled it following the procedure from Amazon: http://docs.aws.amazon.com/streams/latest/dev/kinesis-tutorial-cli-installation.html

The only difference is that I had to use sudo -H because of permission issues, which others might also run into.

Solution 9 - Amazon S3

Previous answers either don't really correspond with the question or incur a cost (Lambda).

What you should do is set the "cache-control" header when you upload the file (PutObject or MultipartUpload).

Depending on your language, it can be somewhat different. The documentation is not very clear (as presumably AWS hopes you would pay them with the other solutions).

An example with PHP:

$uploader = new MultipartUploader($s3, $filename, [
    ...,
    'before_initiate' => function (\Aws\Command $command) {
        // Runs before CreateMultipartUpload; sets the header on the new object
        $command['CacheControl'] = 'max-age=31536000,public';
    },
    ...
]);

Another example with Go:

cc := "max-age=31536000,public"
input := &s3.PutObjectInput{
	...,
	CacheControl: &cc,
}
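For comparison, the same idea in Python with boto3 might look like the sketch below. The function name, its default header value, and the content type guessing are assumptions added for illustration; the client is passed in as a parameter so the upload logic can be tried with a stub.

```python
import mimetypes


def upload_with_cache_control(s3, path, bucket, key,
                              cache_control="max-age=31536000,public"):
    """Upload a local file and set Cache-Control at upload time.

    `s3` is expected to be a boto3 S3 client. The content type is
    guessed from the file name so browsers render the file instead of
    downloading it.
    """
    content_type = mimetypes.guess_type(path)[0] or "binary/octet-stream"
    s3.upload_file(
        path,
        bucket,
        key,
        ExtraArgs={"CacheControl": cache_control, "ContentType": content_type},
    )
    return content_type


# Usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   upload_with_cache_control(boto3.client("s3"), "photo.jpg",
#                             "mybucket", "images/photo.jpg")
```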

Solution 10 - Amazon S3

Figured I'd share my usage since previous answers misled me. Only two commands with AWS CLI:

aws s3 cp s3://bucketname/ s3://bucketname/ --cache-control max-age=12345 --recursive

That's it for already existing stuff, using cp. Setting --cache-control like that is a valid option.

If you are uploading you might as well sync, for which the command is:

aws s3 sync z:\source\folder s3://bucketname/folder --delete --cache-control max-age=12345 --acl public-read

Notice that I do not use --metadata-directive AT ALL, since by using it you'll lose your guessed content types, which will make stuff like images get downloaded by the browser instead of being displayed. My solution preserves the guessed value, and allows the guessing with the sync.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | thattommyhall | View Question on Stackoverflow
Solution 1 - Amazon S3 | Dan Williams | View Answer on Stackoverflow
Solution 2 - Amazon S3 | biplob | View Answer on Stackoverflow
Solution 3 - Amazon S3 | ashishyadaveee11 | View Answer on Stackoverflow
Solution 4 - Amazon S3 | roens | View Answer on Stackoverflow
Solution 5 - Amazon S3 | Geoff Appleford | View Answer on Stackoverflow
Solution 6 - Amazon S3 | Ibrahim Bou Ncoula | View Answer on Stackoverflow
Solution 7 - Amazon S3 | Tamás Sallai | View Answer on Stackoverflow
Solution 8 - Amazon S3 | eliotRosewater | View Answer on Stackoverflow
Solution 9 - Amazon S3 | Jiulin Teng | View Answer on Stackoverflow
Solution 10 - Amazon S3 | Firsh - justifiedgrid.com | View Answer on Stackoverflow