AWS S3: How to check if a file exists in a bucket using bash
Tags: Bash, Amazon Web Services, Amazon S3

Problem Overview
I'd like to know if it's possible to check if there are certain files in a certain bucket.
This is what I've found:
https://stackoverflow.com/questions/17455782/checking-if-a-file-is-in-a-s3-bucket-using-the-s3cmd
> It should fix my problem, but for some reason it keeps returning that the file doesn't exist, while it does. This solution is also a little dated and doesn't use the doesObjectExist method.
Summary of all the methods that can be used in the Amazon S3 web service
> This gives the syntax of how to use this method, but I can't seem to make it work.
Do they expect you to make a boolean variable to save the status of the method, or does the function directly give you an output / throw an error?
This is the code I'm currently using in my bash script:
existBool=doesObjectExist(${BucketName}, backup_${DomainName}_${CurrentDate}.zip)
if $existBool ; then
echo 'No worries, the file exists.'
fi
I tested it using only the name of the file, instead of giving the full path. But since the error I'm getting is a syntax error, I'm probably just using it wrong.
Hopefully someone can help me out and tell me what I'm doing wrong.
Edit: I ended up looking for another way to do this, since using doesObjectExist isn't the fastest or easiest option.
Bash Solutions
Solution 1 - Bash
Last time I saw performance comparisons, getObjectMetadata was the fastest way to check whether an object exists. Using the AWS CLI, that corresponds to the head-object command, for example:
aws s3api head-object --bucket www.codeengine.com --key index.html
which returns:
{
"AcceptRanges": "bytes",
"ContentType": "text/html; charset=utf-8",
"LastModified": "Sun, 08 Jan 2017 22:49:19 GMT",
"ContentLength": 38106,
"ContentEncoding": "gzip",
"ETag": "\"bda80810592763dcaa8627d44c2bf8bb\"",
"StorageClass": "REDUCED_REDUNDANCY",
"CacheControl": "no-cache, no-store",
"Metadata": {}
}
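If you only need a yes/no answer rather than the metadata, you can branch on head-object's exit status directly, since it exits non-zero when the object is missing. A minimal sketch (the bucket and key names are placeholders):

```shell
# object_exists BUCKET KEY -> exit status 0 if the object exists
object_exists() {
  # Suppress both the JSON output and the "Not Found" error message
  aws s3api head-object --bucket "$1" --key "$2" > /dev/null 2>&1
}

if object_exists "my-bucket" "backup.zip"; then
  echo "it exists"
else
  echo "it does not exist"
fi
```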
Solution 2 - Bash
Following the answers from @DaveMaple & @MichaelGlenn, here is the condition I'm using:
aws s3api head-object --bucket <some_bucket> --key <some_key> || not_exist=true
if [ $not_exist ]; then
echo "it does not exist"
else
echo "it exists"
fi
Solution 3 - Bash
Note that "aws s3 ls" does not quite work, even though the answer was accepted. It searches by prefix, not by a specific object key. I found this out the hard way when someone renamed a file by adding a '1' to the end of the filename, and the existence check would still return True.
(Tried to add this as a comment, but do not have enough rep yet.)
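The pitfall can be demonstrated without touching S3. The listing line below is made-up sample output in the shape `aws s3 ls` produces, after a hypothetical rename of `backup.zip` to `backup.zip1`:

```shell
# Simulated `aws s3 ls` output line: date, time, size, key
listing="2017-01-08 22:49:19      38106 backup.zip1"

# Substring/prefix match: still "finds" the file even though backup.zip is gone
echo "$listing" | grep -q "backup.zip" && echo "substring match: found"

# Comparing the key column exactly reports it as missing, as it should
key=$(echo "$listing" | awk '{print $4}')
[ "$key" = "backup.zip" ] && echo "exact match: found" || echo "exact match: not found"
```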
Solution 4 - Bash
One simple way is using aws s3 ls
exists=$(aws s3 ls "$path_to_file")
if [ -z "$exists" ]; then
echo "it does not exist"
else
echo "it exists"
fi
Solution 5 - Bash
I usually use set -eufo pipefail, and the following works better for me because I do not need to worry about unset variables or the entire script exiting.

object_exists=$(aws s3api head-object --bucket "$bucket" --key "$key" || true)
if [ -z "$object_exists" ]; then
echo "it does not exist"
else
echo "it exists"
fi
Solution 6 - Bash
This statement will return a true or false response:
aws s3api list-objects-v2 \
--bucket <bucket_name> \
--query "contains(Contents[].Key, '<object_name>')"
So, in case of the example provided in the question:
aws s3api list-objects-v2 \
--bucket ${BucketName} \
--query "contains(Contents[].Key, 'backup_${DomainName}_${CurrentDate}.zip')"
I like this approach, because:
- The --query option uses the JMESPath syntax for client-side filtering, and its usage is well documented.

- Since the --query option is built into the aws cli, no additional dependencies need to be installed.

- You can first run the command without the --query option, like:

  aws s3api list-objects-v2 --bucket <bucket_name>

  That returns a nicely formatted JSON, something like:

  {
      "Contents": [
          {
              "Key": "my_file_1.tar.gz",
              "LastModified": "----",
              "ETag": "\"-----\"",
              "Size": -----,
              "StorageClass": "------"
          },
          {
              "Key": "my_file_2.txt",
              "LastModified": "----",
              "ETag": "\"----\"",
              "Size": ----,
              "StorageClass": "----"
          },
          ...
      ]
  }

- This then allows you to design an appropriate query. In this case you want to check that the JSON contains a list Contents, and that an item in that list has a Key equal to your file (object) name: --query "contains(Contents[].Key, '<object_name>')"
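To use the result in a script, you can capture the printed boolean in a variable. A sketch with placeholder bucket and key names; with the default JSON output, the CLI prints a lowercase "true" or "false" (note that on an empty bucket, Contents is absent and the query may not return a clean false):

```shell
# Placeholders: replace the bucket and key with your own values
exists=$(aws s3api list-objects-v2 \
  --bucket "my-bucket" \
  --query "contains(Contents[].Key, 'backup.zip')" \
  --output json)

if [ "$exists" = "true" ]; then
  echo "it exists"
else
  echo "it does not exist"
fi
```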
Solution 7 - Bash
From the AWS CLI, you can do an ls combined with a grep, for example (bucket, path, and file name are placeholders):

aws s3 ls s3://<bucket_name>/<path>/ | grep <file_name>

This can be included in a bash script.
Solution 8 - Bash
Inspired by the answers above, I use this to also check the file size, because my bucket was trashed by some script that left 404 responses. It requires jq, though.
minsize=100
s3objhead=$(aws s3api head-object \
  --bucket "$BUCKET" --key "$KEY" \
  --output json || echo '{"ContentLength": 0}')
if [ "$(printf "%s" "$s3objhead" | jq '.ContentLength')" -lt "$minsize" ]; then
  echo "missing or too small"
else
  echo "exists and is big enough"
fi
Solution 9 - Bash
A simpler solution, though not as sophisticated as the other aws s3api approaches, is to use the exit code of:

aws s3 ls <full path to object>

It returns a non-zero exit code if the object doesn't exist, and 0 if it exists.
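A minimal sketch of that pattern (the bucket and key path are placeholders); keep in mind the caveat from Solution 3 that aws s3 ls matches by prefix, so an exact-key check like head-object may be safer:

```shell
# Placeholders: use your own bucket and object path
if aws s3 ls "s3://my-bucket/backups/backup.zip" > /dev/null 2>&1; then
  echo "it exists"
else
  echo "it does not exist"
fi
```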