AWS S3: How to check if a file exists in a bucket using bash

Bash, Amazon Web-Services, Amazon S3

Bash Problem Overview


I'd like to know if it's possible to check if there are certain files in a certain bucket.

This is what I've found:

https://stackoverflow.com/questions/17455782/checking-if-a-file-is-in-a-s3-bucket-using-the-s3cmd

> It should fix my problem, but for some reason it keeps returning that the file doesn't exist, while it does. This solution is also a little dated and doesn't use the doesObjectExist method.

Summary of all the methods that can be used in the Amazon S3 web service

> This gives the syntax of how to use this method, but I can't seem to make it work.

Do they expect you to make a boolean variable to save the status of the method, or does the function directly give you an output / throw an error?

This is the code I'm currently using in my bash script:

existBool=doesObjectExist(${BucketName}, backup_${DomainName}_${CurrentDate}.zip)

if $existBool ; then
        echo 'No worries, the file exists.'
fi

I tested it using only the name of the file, instead of giving the full path. But since the error I'm getting is a syntax error, I'm probably just using it wrong.

Hopefully someone can help me out and tell me what I'm doing wrong.

Edit:

I ended up looking for another way to do this since using doesObjectExist isn't the fastest or easiest.

Bash Solutions


Solution 1 - Bash

Last time I saw performance comparisons, getObjectMetadata was the fastest way to check whether an object exists. Using the AWS CLI, that corresponds to the head-object subcommand. Example:

aws s3api head-object --bucket www.codeengine.com --key index.html

which returns:

{
    "AcceptRanges": "bytes",
    "ContentType": "text/html; charset=utf-8",
    "LastModified": "Sun, 08 Jan 2017 22:49:19 GMT",
    "ContentLength": 38106,
    "ContentEncoding": "gzip",
    "ETag": "\"bda80810592763dcaa8627d44c2bf8bb\"",
    "StorageClass": "REDUCED_REDUNDANCY",
    "CacheControl": "no-cache, no-store",
    "Metadata": {}
}
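
For scripting you often don't need the JSON body at all; the command's exit status is enough. A minimal sketch (the helper name is hypothetical, and valid AWS credentials are assumed):

```shell
# Sketch: wrap head-object so its exit status signals existence.
# Returns 0 if the object exists, non-zero otherwise.
s3_object_exists() {
  local bucket=$1 key=$2
  aws s3api head-object --bucket "$bucket" --key "$key" > /dev/null 2>&1
}

# Example usage (not run here):
# if s3_object_exists "www.codeengine.com" "index.html"; then
#   echo "it exists"
# fi
```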

Solution 2 - Bash

Following the answers from @DaveMaple & @MichaelGlenn, here is the condition I'm using:

aws s3api head-object --bucket <some_bucket> --key <some_key> || not_exist=true
if [ "$not_exist" = true ]; then
  echo "it does not exist"
else
  echo "it exists"
fi

Solution 3 - Bash

Note that "aws s3 ls" does not quite work, even though that answer was accepted: it searches by prefix, not by a specific object key. I found this out the hard way when someone renamed a file by appending a '1' to the filename, and the existence check for the original name still returned true.
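
If you do want a listing-based check, the prefix pitfall can be avoided by comparing the key column exactly. A sketch (the helper name is hypothetical, and keys are assumed to contain no whitespace):

```shell
# Sketch: `aws s3 ls` matches by prefix and prints "date time size key",
# so filter its output and count only an exact key match.
s3_key_exists_exact() {
  local bucket=$1 key=$2
  aws s3 ls "s3://$bucket/$key" \
    | awk -v k="$key" '$4 == k { found = 1 } END { exit !found }'
}

# With only "backup.zip1" in the bucket, a check for "backup.zip"
# now fails instead of matching the prefix.
```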

(Tried to add this as a comment, but do not have enough rep yet.)

Solution 4 - Bash

One simple way is to use aws s3 ls:

exists=$(aws s3 ls "$path_to_file")
if [ -z "$exists" ]; then
  echo "it does not exist"
else
  echo "it exists"
fi

Solution 5 - Bash

I usually use set -eufo pipefail, and the following works better for me because I do not need to worry about unset variables or the entire script exiting:

object_exists=$(aws s3api head-object --bucket "$bucket" --key "$key" || true)
if [ -z "$object_exists" ]; then
  echo "it does not exist"
else
  echo "it exists"
fi
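
Why the `|| true` matters under `set -e` can be demonstrated without AWS at all; in this sketch, `false` stands in for a failing head-object call:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Under set -e, an unguarded failing command aborts the script.
# `|| true` masks the failure, so the variable simply ends up empty
# and execution continues. `false` stands in for a failing
# `aws s3api head-object` call.
object_exists=$(false || true)

if [ -z "$object_exists" ]; then
  echo "it does not exist"
fi
echo "script is still running"
```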

Solution 6 - Bash

This statement will return a true or false response:

aws s3api list-objects-v2 \
  --bucket <bucket_name> \
  --query "contains(Contents[].Key, '<object_name>')"

So, in case of the example provided in the question:

aws s3api list-objects-v2 \
  --bucket ${BucketName} \
  --query "contains(Contents[].Key, 'backup_${DomainName}_${CurrentDate}.zip')"

I like this approach, because:

  • The --query option uses the JMESPath syntax for client-side filtering, and its usage is well documented.

  • Since the --query option is built into the AWS CLI, no additional dependencies need to be installed.

  • You can first run the command without the --query option, like:

      aws s3api list-objects-v2 --bucket <bucket_name> 
    

    That returns a nicely formatted JSON, something like:

      {
          "Contents": [
              {
                  "Key": "my_file_1.tar.gz",
                  "LastModified": "----",
                  "ETag": "\"-----\"",
                  "Size": -----,
                  "StorageClass": "------"
              },
              {
                  "Key": "my_file_2.txt",
                  "LastModified": "----",
                  "ETag": "\"----\"",
                  "Size": ----,
                  "StorageClass": "----"
              },
              ...
          ]
      }

  • This then allows you to design an appropriate query. In this case you want to check if the JSON contains a list Contents and that an item in that list has a Key equal to your file (object) name:

    --query "contains(Contents[].Key, '<object_name>')"
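
The printed boolean can be captured and branched on directly. A sketch (the helper name is hypothetical; the default JSON output format, which prints the literal string "true" or "false", is assumed):

```shell
# Sketch: turn the true/false printed by --query into an exit status.
s3_bucket_contains_key() {
  local bucket=$1 key=$2 result
  result=$(aws s3api list-objects-v2 \
    --bucket "$bucket" \
    --query "contains(Contents[].Key, '$key')" \
    --output json)
  [ "$result" = "true" ]
}

# Example usage (not run here):
# if s3_bucket_contains_key "$BucketName" "backup_${DomainName}_${CurrentDate}.zip"; then
#   echo "it exists"
# fi
```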
    

Solution 7 - Bash

From awscli, we do a ls along with a grep.

Example: aws s3 ls s3:// | grep 'filename'

This can be included in the bash script.
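
Since grep matches substrings, this inherits the same pitfall noted in Solution 3: "filename1" would match a search for "filename". Anchoring the pattern tightens it; a sketch (the helper name is hypothetical, and the key is assumed to contain no regex metacharacters or whitespace):

```shell
# Sketch: `aws s3 ls` prints "date time size key"; anchor the pattern
# so only an exact key at end-of-line matches, not any filename that
# merely contains it.
s3_ls_grep() {
  local bucket=$1 key=$2
  aws s3 ls "s3://$bucket/" | grep -q " $key\$"
}
```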

Solution 8 - Bash

Inspired by the answers above, I use this to also check the file size, because my bucket was trashed by some script that stored 404 responses as objects. It requires jq, though.

minsize=100
s3objhead=$(aws s3api head-object \
  --bucket "$BUCKET" --key "$KEY" \
  --output json || echo '{"ContentLength": 0}')

if [ "$(printf "%s" "$s3objhead" | jq '.ContentLength')" -lt "$minsize" ]; then
  echo "missing or too small"
else
  echo "it exists and is big enough"
fi

Solution 9 - Bash

A simpler solution, though not as sophisticated as the other aws s3api approaches, is to use the exit code:

aws s3 ls <full path to object>

This returns a non-zero exit code if the object doesn't exist, and 0 if it does.
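
A sketch of branching on that exit code (the helper name is hypothetical; note that the prefix-matching caveat from Solution 3 still applies to `aws s3 ls`):

```shell
# Sketch: branch directly on the exit status of `aws s3 ls`,
# discarding its output.
s3_ls_check() {
  local path=$1
  if aws s3 ls "$path" > /dev/null 2>&1; then
    echo "it exists"
  else
    echo "it does not exist"
  fi
}

# Example usage (not run here):
# s3_ls_check "s3://my-bucket/path/to/object"
```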

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Question: J. Swaelen
Solution 1 - Bash: Dave Maple
Solution 2 - Bash: ItayB
Solution 3 - Bash: Michael Glenn
Solution 4 - Bash: traceformula
Solution 5 - Bash: Amri
Solution 6 - Bash: Arjaan Buijk
Solution 7 - Bash: Sandy
Solution 8 - Bash: Colin
Solution 9 - Bash: Soundararajan