Binary buffer in Python

Tags: Python, Binary, Io, Buffer

Python Problem Overview


In Python you can use StringIO as a file-like buffer for character data. A memory-mapped file does something similar for binary data, but it requires a file to serve as its basis. Does Python have a file object that is intended for binary data and lives in memory only, equivalent to Java's ByteArrayOutputStream?

My use case is creating a ZIP file in memory, and ZipFile requires a file-like object.

Python Solutions


Solution 1 - Python

You are probably looking for the io.BytesIO class. It works exactly like StringIO except that it supports binary data:

from io import BytesIO
bio = BytesIO(b"some initial binary data: \x00\x01")

StringIO, by contrast, raises a TypeError when given bytes:

from io import StringIO
sio = StringIO(b"some initial binary data: \x00\x01")
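
To make the file-like behaviour concrete, here is a minimal Python 3 sketch of writing to, reading from, and extracting the contents of a BytesIO. The appended bytes and the variable names are assumptions chosen for illustration only.

```python
import os
from io import BytesIO, StringIO

# Start with some initial bytes; the position starts at 0.
bio = BytesIO(b"some initial binary data: \x00\x01")

# Seek to the end before writing, otherwise the write overwrites from position 0.
bio.seek(0, os.SEEK_END)
bio.write(b"\x02\x03")

# getvalue() returns the whole buffer regardless of the current position.
data = bio.getvalue()

# Reading works like a file: rewind first, then read.
bio.seek(0)
first_four = bio.read(4)

# StringIO rejects bytes in its constructor with a TypeError.
try:
    StringIO(b"bytes are not text")
    string_io_rejected_bytes = False
except TypeError:
    string_io_rejected_bytes = True
```

Because BytesIO supports write(), read(), seek() and tell(), it can generally be passed wherever a binary file object is expected.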

Solution 2 - Python

This answer applies to Python 2, where StringIO is a standalone module. As long as you don't try to put any unicode data into your StringIO, and you are careful NOT to use cStringIO, you should be fine.

According to the StringIO documentation, as long as you keep to either unicode or 8-bit strings, everything works as expected. Presumably, StringIO does something special when someone calls f.write(u"asdf") (which ZipFile does not do, to my knowledge). Anyway:

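# Python 2 only: the StringIO module was removed in Python 3.
import zipfile
import StringIO

s = StringIO.StringIO()
z = zipfile.ZipFile(s, "w")
z.write("test.txt")  # adds an existing file from disk to the archive
z.close()            # finalizes the archive inside the buffer

# Open in binary mode so the archive bytes are written unmodified.
f = open("x.zip", "wb")
f.write(s.getvalue())
s.close()
f.close()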

works just as expected, and there's no difference between the file in the resulting archive and the original file.

If you know of a particular case where this approach does not work, I'd be most interested to hear about it :)
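
On Python 3, where the StringIO module no longer exists, the same idea can be sketched with io.BytesIO. This is a minimal illustration, not the answer author's code: the member name and payload are hypothetical, and writestr() is used so the example does not depend on a file on disk.

```python
import io
import zipfile

# Hypothetical member name and payload, used only for this illustration.
member_name = "test.txt"
payload = b"hello\x00binary\xff world\n"

# Build the archive entirely in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(member_name, payload)

# The complete archive as a bytes object.
zip_bytes = buffer.getvalue()

# Re-open the bytes as an archive and confirm the member round-trips unchanged.
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
    names = archive.namelist()
    restored = archive.read(member_name)
```

From here, zip_bytes can be written to disk with a file opened in "wb" mode or handed to any API that accepts bytes.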

Solution 3 - Python

Look at the struct module: https://docs.python.org/library/struct.html. It allows you to interpret strings as packed binary data.

Not sure if this will completely answer your question, but you can use struct.unpack() to convert binary data to Python objects.


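import struct

fmt = ">hl"
with open(filename, "rb") as f:  # filename: path to your binary file
    # With ">", standard sizes apply: 2 + 4 = 6 bytes, no padding.
    s = f.read(struct.calcsize(fmt))
x, y = struct.unpack(fmt, s)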


In this example, ">" specifies big-endian byte order, "h" reads a 2-byte short, and "l" reads a 4-byte long. You can change these to match whatever you need to read out of the binary data.
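
As a self-contained illustration of the same format string, the following Python 3 sketch packs two values into an in-memory buffer and unpacks them again. The values 258 and 16909060 are arbitrary assumptions, and io.BytesIO stands in for the file so nothing on disk is needed.

```python
import io
import struct

RECORD_FORMAT = ">hl"  # big-endian: 2-byte short, then 4-byte long
record_size = struct.calcsize(RECORD_FORMAT)

# Pack two hypothetical values into bytes and place them in an in-memory buffer.
buffer = io.BytesIO(struct.pack(RECORD_FORMAT, 258, 16909060))

# Read exactly one record and unpack it; unpack() requires the exact size.
raw = buffer.read(record_size)
x, y = struct.unpack(RECORD_FORMAT, raw)
```

Using struct.calcsize() to decide how many bytes to read keeps the read length and the format string in step, since struct.unpack() raises an error when the buffer size does not match the format.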

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | jelovirt | View Question on Stackoverflow
Solution 1 - Python | akhan | View Answer on Stackoverflow
Solution 2 - Python | Henrik Gustafsson | View Answer on Stackoverflow
Solution 3 - Python | mmattax | View Answer on Stackoverflow