Serialize base64-encoded data in JSON

Json, Python 3.x, Serialization, Base64

Json Problem Overview


I'm writing a script to automate data generation for a demo, and I need to serialize some data to JSON. Part of this data is an image, so I encoded it in base64, but when I try to run my script I get:

Traceback (most recent call last):
  File "lazyAutomationScript.py", line 113, in <module>
    json.dump(out_dict, outfile)
  File "/usr/lib/python3.4/json/__init__.py", line 178, in dump
    for chunk in iterable:
  File "/usr/lib/python3.4/json/encoder.py", line 422, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/usr/lib/python3.4/json/encoder.py", line 396, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.4/json/encoder.py", line 396, in _iterencode_dict
    yield from chunks
  File "/usr/lib/python3.4/json/encoder.py", line 429, in _iterencode
    o = _default(o)
  File "/usr/lib/python3.4/json/encoder.py", line 173, in default
    raise TypeError(repr(o) + " is not JSON serializable")
  TypeError: b'iVBORw0KGgoAAAANSUhEUgAADWcAABRACAYAAABf7ZytAAAABGdB...
     ...
   BF2jhLaJNmRwAAAAAElFTkSuQmCC' is not JSON serializable

As far as I know, base64-encoded data (a PNG image, in this case) is just a string, so serializing it should pose no problem. What am I missing?

Json Solutions


Solution 1 - Json

You must be careful about the datatypes.

If you read a binary image, you get bytes. If you encode these bytes in base64, you get ... bytes again! (see documentation on b64encode)

json can't handle raw bytes, which is why you get the error.

Here is an example with comments; I hope it helps:

from base64 import b64encode
from json import dumps

ENCODING = 'utf-8'
IMAGE_NAME = 'spam.jpg'
JSON_NAME = 'output.json'

# first: reading the binary stuff
# note the 'rb' flag
# result: bytes
with open(IMAGE_NAME, 'rb') as open_file:
    byte_content = open_file.read()

# second: base64 encode read data
# result: bytes (again)
base64_bytes = b64encode(byte_content)

# third: decode these bytes to text
# result: str (base64 output is pure ASCII, so decoding it as utf-8 is safe)
base64_string = base64_bytes.decode(ENCODING)

# optional: doing stuff with the data
# result here: some dict
raw_data = {IMAGE_NAME: base64_string}

# now: encoding the data to json
# result: string
json_data = dumps(raw_data, indent=2)

# finally: writing the json string to disk
# note the 'w' flag, no 'b' needed as we deal with text here
with open(JSON_NAME, 'w') as another_open_file:
    another_open_file.write(json_data)
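
To check the result, the reverse direction can be sketched as well: load the JSON, then base64-decode the string to get the original bytes back. This is only a minimal sketch and not part of the original answer. The helper names write_image_json and read_image_json are hypothetical, and stand-in bytes in a temporary directory replace the real image file.

```python
import json
import os
import tempfile
from base64 import b64decode, b64encode


def write_image_json(image_bytes, key, json_path):
    """Store raw bytes under `key` as base64 text in a JSON file."""
    # b64encode returns bytes; decode to str so json can serialize it
    payload = {key: b64encode(image_bytes).decode('ascii')}
    with open(json_path, 'w', encoding='utf-8') as json_file:
        json.dump(payload, json_file, indent=2)


def read_image_json(key, json_path):
    """Load the JSON file and return the original raw bytes for `key`."""
    with open(json_path, 'r', encoding='utf-8') as json_file:
        payload = json.load(json_file)
    # b64decode accepts the ASCII str and returns bytes
    return b64decode(payload[key])


# stand-in for the image content (assumption: any bytes behave the same way)
original = bytes(range(256))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'output.json')
    write_image_json(original, 'spam.jpg', path)
    restored = read_image_json('spam.jpg', path)

assert restored == original
```

Note that base64 text is about a third larger than the bytes it represents, so the JSON file will be noticeably bigger than the image.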

Solution 2 - Json

An alternative solution is to encode bytes on the fly with a custom encoder:

import json
from base64 import b64encode

class Base64Encoder(json.JSONEncoder):
    # pylint: disable=method-hidden
    def default(self, o):
        if isinstance(o, bytes):
            return b64encode(o).decode()
        return super().default(o)

With that defined, you can do:

m = {'key': b'\x9c\x13\xff\x00'}
json.dumps(m, cls=Base64Encoder)

It will produce:

'{"key": "nBP/AA=="}'
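
If a subclass feels like too much, json.dumps and json.dump also accept a default= callable that is invoked only for objects the encoder cannot serialize by itself. The following is a minimal sketch of that variant. The function name encode_bytes is hypothetical, and treating bytearray the same way as bytes is an assumption of this example.

```python
import json
from base64 import b64encode


def encode_bytes(o):
    """Fallback for json.dumps: called only for objects json cannot handle."""
    if isinstance(o, (bytes, bytearray)):
        # base64 output is pure ASCII, so this decode cannot fail
        return b64encode(o).decode('ascii')
    # keep the usual behaviour for every other unsupported type
    raise TypeError(
        "Object of type %s is not JSON serializable" % type(o).__name__
    )


m = {'key': b'\x9c\x13\xff\x00'}
result = json.dumps(m, default=encode_bytes)

assert result == '{"key": "nBP/AA=="}'
```

The same callable can be passed to json.dump(out_dict, outfile, default=encode_bytes) when writing straight to a file.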

Solution 3 - Json

> What am I missing?

The error is telling you that a bytes object is not JSON serializable.

from base64 import b64encode

# *binary representation* of the base64 string
assert b64encode(b"binary content")                 == b'YmluYXJ5IGNvbnRlbnQ='

# base64 string
assert b64encode(b"binary content").decode('utf-8') ==  'YmluYXJ5IGNvbnRlbnQ='

The latter is definitely "JSON serializable" because it is the base64 string representation of the binary b"binary content".
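
The difference can be checked directly against json.dumps. This is a small sketch under the same assumptions as the assertions above; the "data" key and the variable names are only illustrative.

```python
import json
from base64 import b64encode

encoded = b64encode(b"binary content")  # still bytes

# passing the bytes object reproduces the TypeError from the question
try:
    json.dumps({"data": encoded})
    failed = False
except TypeError:
    failed = True

# decoding to str first makes the value serializable
as_text = json.dumps({"data": encoded.decode("utf-8")})

assert failed
assert as_text == '{"data": "YmluYXJ5IGNvbnRlbnQ="}'
```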

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | frollo | View Question on Stackoverflow
Solution 1 - Json | spky | View Answer on Stackoverflow
Solution 2 - Json | ssubbotin | View Answer on Stackoverflow
Solution 3 - Json | Filippo Vitale | View Answer on Stackoverflow