Check file size on S3 without downloading?

Amazon S3

Amazon S3 Problem Overview


I have customer files uploaded to Amazon S3, and I would like to add a feature that totals the size of those files for each customer. Is there a way to "peek" at the file size without downloading the files? I know you can view it in the Amazon control panel, but I need to do it programmatically.

Amazon S3 Solutions


Solution 1 - Amazon S3

Send an HTTP HEAD request to the object. A HEAD request will retrieve the same HTTP headers as a GET request, but it will not retrieve the body of the object (saving you bandwidth). You can then parse out the Content-Length header value from the HTTP response headers.
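
As a rough illustration of the HEAD approach, here is a minimal Python sketch using only the standard library. The function name content_length is made up for this example, and it assumes the object URL is either publicly readable or presigned for the HEAD method; a private object would answer with 403 instead.

```python
import urllib.request


def content_length(url: str, timeout: float = 10.0) -> int:
    """Return the Content-Length reported by a HEAD request to `url`.

    Assumption: `url` is publicly readable or presigned for HEAD.
    """
    # method="HEAD" asks for headers only, so no object body is transferred.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        length = response.headers.get("Content-Length")
    if length is None:
        raise ValueError("response carried no Content-Length header")
    return int(length)
```

Calling content_length with an object URL returns the size in bytes; urlopen raises urllib.error.HTTPError for 403 or 404 responses.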

Solution 2 - Amazon S3

Node.js example:

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

function sizeOf(key, bucket) {
    return s3.headObject({ Key: key, Bucket: bucket })
        .promise()
        .then(res => res.ContentLength);
}


// A test
sizeOf('ahihi.mp4', 'output').then(size => console.log(size));

Doc is here.

Solution 3 - Amazon S3

You can simply use the s3 ls command:

aws s3 ls s3://mybucket --recursive --human-readable --summarize

Outputs

2013-09-02 21:37:53   10 Bytes a.txt
2013-09-02 21:37:53  2.9 MiB foo.zip
2013-09-02 21:32:57   23 Bytes foo/bar/.baz/a
2013-09-02 21:32:58   41 Bytes foo/bar/.baz/b
2013-09-02 21:32:57  281 Bytes foo/bar/.baz/c
2013-09-02 21:32:57   73 Bytes foo/bar/.baz/d
2013-09-02 21:32:57  452 Bytes foo/bar/.baz/e
2013-09-02 21:32:57  896 Bytes foo/bar/.baz/hooks/bar
2013-09-02 21:32:57  189 Bytes foo/bar/.baz/hooks/foo
2013-09-02 21:32:57  398 Bytes z.txt

Total Objects: 10
   Total Size: 2.9 MiB

Reference: https://docs.aws.amazon.com/cli/latest/reference/s3/ls.html

Solution 4 - Amazon S3

This is a solution for anyone using Java and the S3 Java library provided by Amazon. If you are using com.amazonaws.services.s3.AmazonS3, you can use a GetObjectMetadataRequest, which lets you query the object length.

The libraries you have to use are:

<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-s3 -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-s3</artifactId>
    <version>1.11.511</version>
</dependency>

Imports:

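import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;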

And the code you need to get the content length:

GetObjectMetadataRequest metadataRequest = new GetObjectMetadataRequest(bucketName, fileName);
final ObjectMetadata objectMetadata = s3Client.getObjectMetadata(metadataRequest);
long contentLength = objectMetadata.getContentLength();

Before you can execute the code above, you will need to build the S3 client. Here is some example code for that:

AWSCredentials credentials = new BasicAWSCredentials(
        accessKey,
        secretKey
);
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
        .withRegion(clientRegion)
        .withCredentials(new AWSStaticCredentialsProvider(credentials))
        .build();

Solution 5 - Amazon S3

Using Michael's advice, my successful code looked like this:

require 'net/http'
require 'uri'

file_url = MyObject.first.file.url

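url = URI.parse(file_url)
req = Net::HTTP::Head.new url.path
# Enable TLS for https URLs; without it the request to port 443 fails.
res = Net::HTTP.start(url.host, url.port, use_ssl: url.scheme == 'https') {|http|
  http.request(req)
}

# Header values are strings, so convert to an integer byte count.
file_length = res["content-length"].to_i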

Solution 6 - Amazon S3

.NET AWS SDK, using ListObjectsRequest, ListObjectsResponse, and S3Object:

AmazonS3Client s3 = new AmazonS3Client();
SpaceUsed(s3, "putBucketNameHere");

static void SpaceUsed(AmazonS3Client s3Client, string bucketName)
{
    ListObjectsRequest request = new ListObjectsRequest();
    request.BucketName = bucketName;
    // A single ListObjects call returns at most 1,000 keys, so for larger
    // buckets this total covers only the first page of results.
    ListObjectsResponse response = s3Client.ListObjects(request);
    long totalSize = 0;
    foreach (S3Object o in response.S3Objects)
    {
        totalSize += o.Size;
    }
    Console.WriteLine("Total Size of bucket " + bucketName + " is " +
        Math.Round(totalSize / 1024.0 / 1024.0, 2) + " MB");
}

Solution 7 - Amazon S3

I do something like this in Python to get the cumulative size of all files under a given prefix:

import boto3

bucket = 'your-bucket-name'
prefix = 'some/s3/prefix/'

s3 = boto3.client('s3')

size = 0

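result = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
# 'Contents' is absent when no keys match the prefix, so default to an empty list.
size += sum(x['Size'] for x in result.get('Contents', []))

# Each response holds at most 1,000 keys; follow the continuation token for the rest.
while result['IsTruncated']:
    result = s3.list_objects_v2(
        Bucket=bucket, Prefix=prefix,
        ContinuationToken=result['NextContinuationToken'])
    size += sum(x['Size'] for x in result.get('Contents', []))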

print('Total size in MB: ' + str(size / (1000**2)))

Solution 8 - Amazon S3

There is a better solution:

$info = $s3->getObjectInfo($yourbucketName, $yourfilename);
print $info['size'];

Solution 9 - Amazon S3

You can also do a listing of the contents of the bucket. The metadata in the listing contains the file sizes of all of the objects. This is how it's implemented in the AWS SDK for PHP.
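
Since the question asks for a total per customer, a listing can be grouped by key prefix. The following Python sketch assumes a hypothetical key layout of <customer-id>/<file path>, which the question does not specify; size_by_customer is an illustrative name. It expects a boto3 S3 client and uses its list_objects_v2 paginator so that buckets with more than 1,000 objects are fully covered.

```python
from collections import defaultdict


def size_by_customer(s3_client, bucket: str, prefix: str = "") -> dict:
    """Sum object sizes per first path segment below `prefix`.

    Assumption: keys look like "<prefix><customer-id>/...".
    `s3_client` is expected to be a boto3 S3 client.
    """
    totals = defaultdict(int)
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is missing from pages that hold no keys.
        for obj in page.get("Contents", []):
            relative_key = obj["Key"][len(prefix):]
            customer = relative_key.split("/", 1)[0]
            totals[customer] += obj["Size"]
    return dict(totals)


# Usage with a real client (requires boto3 and configured credentials):
#   import boto3
#   print(size_by_customer(boto3.client("s3"), "your-bucket-name"))
```

Only listing metadata is transferred, so the cost is one request per 1,000 objects regardless of how large the objects are.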

Solution 10 - Amazon S3

Android Solution

Integrate the AWS SDK and you get a fairly straightforward solution:

// ... put this in background thread
List<S3ObjectSummary> s3ObjectSummaries;
s3ObjectSummaries = s3.listObjects(registeredBucket).getObjectSummaries();
for (int i = 0; i < s3ObjectSummaries.size(); i++) {
    S3ObjectSummary s3ObjectSummary = s3ObjectSummaries.get(i);
    Log.d(TAG, "doInBackground: size " + s3ObjectSummary.getSize());
}

  • Here is a link to the official documentation.
  • It is very important to run this code in an AsyncTask or otherwise on a background thread; Android throws an exception when network calls are made on the UI thread.

Solution 11 - Amazon S3

The following Python code prints the size of each of the first 1,000 objects (at most) under a prefix in S3:

import boto3

bucket = 'bucket_name'
prefix = 'prefix'

s3 = boto3.client('s3')
# 'Contents' is absent when no keys match the prefix, so default to an empty list.
contents = s3.list_objects_v2(Bucket=bucket, MaxKeys=1000, Prefix=prefix).get('Contents', [])

for c in contents:
    print('Size (KB):', float(c['Size'])/1000)

Solution 12 - Amazon S3

Ruby solution with head_object:

require 'aws-sdk-s3'

s3 = Aws::S3::Client.new(
  region:               'us-east-1',     #or any other region
  access_key_id:        AWS_ACCESS_KEY_ID,
  secret_access_key:    AWS_SECRET_ACCESS_KEY
)

res = s3.head_object(bucket: bucket_name, key: object_key)
file_size = res[:content_length]

Solution 13 - Amazon S3

PHP code to check an S3 object's size (or any other object header). Note the use of stream_context_set_default to make sure get_headers sends only a HEAD request:

stream_context_set_default(
    array(
        'http' => array(
            'method' => 'HEAD'
        )
    )
);

$headers = get_headers('http://s3.amazonaws.com/bucketname/filename.jpg', 1);
$headers = array_change_key_case($headers);

$size = trim($headers['content-length'], '"');

Solution 14 - Amazon S3

Golang example, same principle: run a HEAD request against the object in question:

func returnKeySizeInMB(bucketName string, key string) int {
	output, err := svc.HeadObject(
		&s3.HeadObjectInput{
			Bucket: aws.String(bucketName),
			Key:    aws.String(key),
		})
	if err != nil {
		log.Fatalf("Unable to send head request to item %q, %v", key, err)
	}

	// ContentLength is in bytes; integer division truncates to whole MiB.
	return int(*output.ContentLength / 1024 / 1024)
}

Here, the parameter key means the path to the file.

For example, if the URI of the file is s3://my-personal-bucket/folder1/subfolder1/myfile.pdf, then the call would look like:

output, err := svc.HeadObject(
		&s3.HeadObjectInput{
			Bucket: aws.String("my-personal-bucket"),
			Key:    aws.String("folder1/subfolder1/myfile.pdf"),
		})

Solution 15 - Amazon S3

AWS SDK for C++ solution to get the file size:

//! Step 1: create s3 client
Aws::S3::S3Client s3Client(cred, config); //! Uses cred and config here; other constructors are available.

//! Step 2: Head Object request
Aws::S3::Model::HeadObjectRequest headObj;
headObj.SetBucket(bucket);
headObj.SetKey(key);

//! Step 3: read size from object header metadata
auto object = s3Client.HeadObject(headObj);
if (object.IsSuccess())
{
    fileSize = object.GetResultWithOwnership().GetContentLength();
}
else
{
    std::cout << "Head Object error: "
        << object.GetError().GetExceptionName() << " - "
        << object.GetError().GetMessage() << std::endl;
}

Note: Do not use GetObject just to get the size; it downloads the object body, whereas HeadObject returns only the metadata.

Solution 16 - Amazon S3

If the object is private, you can fetch its headers through the SDK, which signs the request for you.

PHP example:

$head = $client->headObject(
 [
   'Bucket' => $bucket,
   'Key' => $key,
 ]
);
$result = (int) ($head->get('ContentLength') ?? 0);

Solution 17 - Amazon S3

These days you could also use Amazon S3 Inventory which gives you:

> Size – The object size in bytes.
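
As a sketch of how an inventory report could be totalled, the Python below sums the Size column of one report file. It assumes the CSV output format, where the data files have no header row and the column order comes from the fileSchema string in the inventory's manifest.json; the function name and the sample schema are illustrative, and Size must be one of the fields selected in the inventory configuration.

```python
import csv
import io


def total_size_from_inventory(csv_text: str, file_schema: str) -> int:
    """Sum the Size column of one decompressed S3 Inventory CSV file.

    Assumption: `file_schema` is the comma-separated column list taken
    from the inventory manifest, e.g. "Bucket, Key, Size".
    """
    columns = [name.strip() for name in file_schema.split(",")]
    size_index = columns.index("Size")
    total = 0
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip blank lines and rows with an empty size field.
        if row and row[size_index]:
            total += int(row[size_index])
    return total
```

Inventory reports are generated on a schedule, so this suits periodic reporting rather than an up-to-the-minute size check.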

Solution 18 - Amazon S3

If you want to do this for a single file, you can use the AWS CLI head-object command to get only the metadata, without downloading the file itself:

$ aws s3api head-object --bucket mybucket --key myfile.csv | jq -r .ContentLength

Explanation

  • s3api head-object retrieves the object metadata in JSON format
  • jq -r .ContentLength parses the JSON to get the size of the body in bytes; the -r flag prints the value as raw output, without JSON quoting.
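
For comparison, the same metadata-only lookup can be sketched in Python. The helper name object_size is illustrative; it expects a boto3 S3 client, whose head_object call returns the size under the ContentLength key without fetching the object body.

```python
def object_size(s3_client, bucket: str, key: str) -> int:
    """Return an object's size in bytes via a metadata-only HeadObject call.

    `s3_client` is expected to be a boto3 S3 client.
    """
    response = s3_client.head_object(Bucket=bucket, Key=key)
    return response["ContentLength"]


# Usage with a real client (requires boto3 and configured credentials):
#   import boto3
#   print(object_size(boto3.client("s3"), "mybucket", "myfile.csv"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub in tests.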

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | ycseattle | View Question on Stackoverflow
Solution 1 - Amazon S3 | Michael Dowling | View Answer on Stackoverflow
Solution 2 - Amazon S3 | Ninh Pham | View Answer on Stackoverflow
Solution 3 - Amazon S3 | Kyle Bridenstine | View Answer on Stackoverflow
Solution 4 - Amazon S3 | gil.fernandes | View Answer on Stackoverflow
Solution 5 - Amazon S3 | LennonR | View Answer on Stackoverflow
Solution 6 - Amazon S3 | Stephen C | View Answer on Stackoverflow
Solution 7 - Amazon S3 | matt2000 | View Answer on Stackoverflow
Solution 8 - Amazon S3 | Ronak Patel | View Answer on Stackoverflow
Solution 9 - Amazon S3 | Ryan Parman | View Answer on Stackoverflow
Solution 10 - Amazon S3 | hannunehg | View Answer on Stackoverflow
Solution 11 - Amazon S3 | tahir siddiqui | View Answer on Stackoverflow
Solution 12 - Amazon S3 | kli | View Answer on Stackoverflow
Solution 13 - Amazon S3 | Ludo - Off the record | View Answer on Stackoverflow
Solution 14 - Amazon S3 | Jonny Rimek | View Answer on Stackoverflow
Solution 15 - Amazon S3 | Atom | View Answer on Stackoverflow
Solution 16 - Amazon S3 | Denis Viunyk | View Answer on Stackoverflow
Solution 17 - Amazon S3 | Marcin | View Answer on Stackoverflow
Solution 18 - Amazon S3 | enharmonic | View Answer on Stackoverflow