Open S3 object as a string with Boto3

Python, Amazon S3, Boto, Boto3

Python Problem Overview


I'm aware that with Boto 2 it's possible to open an S3 object as a string with: get_contents_as_string()

Is there an equivalent function in boto3?

Python Solutions


Solution 1 - Python

read() returns bytes. In Python 3, if you want a string, you have to decode it using the encoding the object was actually written in:

import boto3

s3 = boto3.resource('s3')

obj = s3.Object(bucket, key)
body = obj.get()['Body'].read().decode('utf-8')
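
The call chain above can be wrapped in a small helper that plays the role of Boto 2's get_contents_as_string(). This is only a minimal sketch: read_s3_text and its encoding parameter are hypothetical names, not part of boto3, and the s3 argument is assumed to behave like boto3.resource('s3').

```python
def read_s3_text(s3, bucket, key, encoding="utf-8"):
    """Return the body of an S3 object as a str.

    `s3` is assumed to be a boto3 S3 resource, i.e. boto3.resource("s3").
    `read_s3_text` is a hypothetical helper name, not a boto3 function.
    """
    obj = s3.Object(bucket, key)
    # get() returns a dict; "Body" is a stream whose read() yields bytes.
    raw = obj.get()["Body"].read()
    # Decode with the encoding the object was stored in (UTF-8 assumed by default).
    return raw.decode(encoding)
```

With real credentials this would be called as read_s3_text(boto3.resource("s3"), bucket, key). Passing the resource in, rather than creating it inside the helper, keeps the function easy to exercise with a stand-in object.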

Solution 2 - Python

I had trouble reading and parsing the S3 object through .get() when running Python 2.7 inside an AWS Lambda function.

I added json to the example to show that the result becomes parsable :)

import boto3
import json

s3 = boto3.client('s3')

obj = s3.get_object(Bucket=bucket, Key=key)
j = json.loads(obj['Body'].read())

NOTE (for Python 2.7): My object is all ASCII, so I don't need .decode('utf-8').

NOTE (for Python 3.6+): We moved to Python 3.6 and discovered that read() now returns bytes, so if you want to get a string out of it, you must use:

j = json.loads(obj['Body'].read().decode('utf-8'))

Solution 3 - Python

This isn't in the boto3 documentation. This worked for me:

object.get()["Body"].read()

Here object is an S3 Object resource: http://boto3.readthedocs.org/en/latest/reference/services/s3.html#object

Solution 4 - Python

Python 3, using the boto3 API.

With the S3.Client.download_fileobj API and a Python file-like object, the content of an S3 object can be retrieved into memory.

Since the retrieved content is bytes, it needs to be decoded to convert it to str.

import io
import boto3

client = boto3.client('s3')
bytes_buffer = io.BytesIO()
client.download_fileobj(Bucket=bucket_name, Key=object_key, Fileobj=bytes_buffer)
byte_value = bytes_buffer.getvalue()
str_value = byte_value.decode()  # Python 3: decode() defaults to UTF-8
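
The same steps can be collected into one function. This is a sketch under assumptions: download_s3_text is a hypothetical name, and the client argument is assumed to be a boto3 S3 client created with boto3.client('s3').

```python
import io


def download_s3_text(client, bucket, key, encoding="utf-8"):
    """Download an S3 object into memory and return it as a str.

    `client` is assumed to be boto3.client("s3"); `download_s3_text`
    is a hypothetical helper name used only for illustration.
    """
    buffer = io.BytesIO()
    # download_fileobj writes the object's bytes into the file-like object.
    client.download_fileobj(Bucket=bucket, Key=key, Fileobj=buffer)
    # getvalue() returns everything written, regardless of stream position.
    return buffer.getvalue().decode(encoding)
```

Note that the whole object is held in memory in the buffer, so this approach suits objects that comfortably fit in RAM.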

Solution 5 - Python

Decoding the whole object body to one string:

import boto3
s3 = boto3.resource("s3")
obj = s3.Object(bucket, key).get()
big_str = obj["Body"].read().decode("utf-8")

Decoding the object body to strings line-by-line:

import csv
obj = s3.Object(bucket, key).get()
reader = csv.reader(line.decode("utf-8") for line in obj["Body"].iter_lines())
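
The line-by-line variant can be written as a generator so that rows are produced as the body is streamed. A minimal sketch, assuming body is the "Body" value of a get() response and exposes iter_lines(); the name iter_csv_rows is hypothetical.

```python
import csv


def iter_csv_rows(body, encoding="utf-8"):
    """Yield CSV rows from a streaming body, one decoded line at a time.

    `body` is assumed to offer iter_lines() yielding bytes, as the
    "Body" of an S3 get() response does. `iter_csv_rows` is a
    hypothetical helper name.
    """
    # Decode lazily: each line is turned into str only when csv asks for it.
    lines = (line.decode(encoding) for line in body.iter_lines())
    yield from csv.reader(lines)
```

One caveat with this design: the lines arrive without their line endings, so a quoted CSV field that contains an embedded newline may not be reconstructed exactly. For such files, reading and decoding the whole body first is likely the safer route.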

When decoding as JSON, there is no need to convert to a string first: since Python 3.6, json.loads also accepts bytes (encoded as UTF-8, UTF-16, or UTF-32):

import json
obj = s3.Object(bucket, key).get()
data = json.loads(obj["Body"].read())
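
The bytes handling of json.loads can be checked without S3 at all. The sketch below uses a made-up payload purely for illustration and needs Python 3.6 or later.

```python
import json

# Hypothetical payload standing in for the content of an S3 object.
payload = {"name": "Zoë", "count": 3}

# Bytes in UTF-8 or UTF-16 can be passed to json.loads directly.
utf8_bytes = json.dumps(payload).encode("utf-8")
utf16_bytes = json.dumps(payload).encode("utf-16")
from_utf8 = json.loads(utf8_bytes)
from_utf16 = json.loads(utf16_bytes)

# Bytes in any other encoding (latin-1 here) must be decoded explicitly first.
latin1_bytes = json.dumps(payload, ensure_ascii=False).encode("latin-1")
from_latin1 = json.loads(latin1_bytes.decode("latin-1"))
```

So skipping the decode step is safe only when the object is known to be stored in one of the encodings json.loads detects on its own.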

Solution 6 - Python

If Body contains an io.StringIO, use getvalue() instead:

object.get()['Body'].getvalue()
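
If the code may receive either kind of body, one option is to normalize both cases in a single function. This is a sketch only: body_to_text is a hypothetical name, and the assumption is that any body that is not an io.StringIO offers a read() method.

```python
import io


def body_to_text(body, encoding="utf-8"):
    """Return the content of a response body as a str.

    Handles an io.StringIO (already text) as well as a byte stream
    with read(). `body_to_text` is a hypothetical helper name.
    """
    if isinstance(body, io.StringIO):
        # getvalue() returns the full text regardless of stream position.
        return body.getvalue()
    data = body.read()
    # Byte streams need decoding; anything already str is returned unchanged.
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data
```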

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Gahl Levy | View Question on Stackoverflow
Solution 1 - Python | Kamil Sindi | View Answer on Stackoverflow
Solution 2 - Python | EvgenyKolyakov | View Answer on Stackoverflow
Solution 3 - Python | Gahl Levy | View Answer on Stackoverflow
Solution 4 - Python | Gatsby Lee | View Answer on Stackoverflow
Solution 5 - Python | ericbn | View Answer on Stackoverflow
Solution 6 - Python | Pyglouthon | View Answer on Stackoverflow