Reading integers from binary file in Python

PythonFileBinaryInteger

Python Problem Overview


I'm trying to read a BMP file in Python. I know the first two bytes indicate the BMP firm. The next 4 bytes are the file size. When I execute:

fin = open("hi.bmp", "rb")
firm = fin.read(2)  
file_size = int(fin.read(4))  

I get:

> ValueError: invalid literal for int() with base 10: 'F#\x13'

What I want to do is reading those four bytes as an integer, but it seems Python is reading them as characters and returning a string, which cannot be converted to an integer. How can I do this correctly?

Python Solutions


Solution 1 - Python

The read method returns a sequence of bytes as a string. To convert from a string byte-sequence to binary data, use the built-in struct module: <http://docs.python.org/library/struct.html>;.

import struct

print(struct.unpack('i', fin.read(4)))

Note that unpack always returns a tuple, so struct.unpack('i', fin.read(4))[0] gives the integer value that you are after.

You should probably use the format string '<i' (< is a modifier that indicates little-endian byte-order and standard size and alignment - the default is to use the platform's byte ordering, size and alignment). According to the BMP format spec, the bytes should be written in Intel/little-endian byte order.

Solution 2 - Python

An alternative method which does not make use of 'struct.unpack()' would be to use NumPy:

import numpy as np

f = open("file.bin", "r")
a = np.fromfile(f, dtype=np.uint32)

'dtype' represents the datatype and can be int#, uint#, float#, complex# or a user defined type. See numpy.fromfile.

Personally prefer using NumPy to work with array/matrix data as it is a lot faster than using Python lists.

Solution 3 - Python

As of Python 3.2+, you can also accomplish this using the from_bytes native int method:

file_size = int.from_bytes(fin.read(2), byteorder='big')

Note that this function requires you to specify whether the number is encoded in big- or little-endian format, so you will have to determine the endian-ness to make sure it works correctly.

Solution 4 - Python

Except struct you can also use array module

import array
values = array.array('l') # array of long integers
values.read(fin, 1) # read 1 integer
file_size  = values[0]

Solution 5 - Python

As you are reading the binary file, you need to unpack it into a integer, so use struct module for that

import struct
fin = open("hi.bmp", "rb")
firm = fin.read(2)  
file_size, = struct.unpack("i",fin.read(4))

Solution 6 - Python

When you read from a binary file, a data type called bytes is used. This is a bit like list or tuple, except it can only store integers from 0 to 255.

Try:

file_size = fin.read(4)
file_size0 = file_size[0]
file_size1 = file_size[1]
file_size2 = file_size[2]
file_size3 = file_size[3]

Or:

file_size = list(fin.read(4))

Instead of:

file_size = int(fin.read(4))

Solution 7 - Python

Here's a late solution but I though it might help.

fin = open("hi.bmp", "rb")
firm = fin.read(2)
file_size = 0
for _ in range(4):  
    (file_size << 8) += ord(fin.read(1))

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionManuel AraozView Question on Stackoverflow
Solution 1 - PythoncodeapeView Answer on Stackoverflow
Solution 2 - PythonEmanuel EyView Answer on Stackoverflow
Solution 3 - PythonCrepeGoatView Answer on Stackoverflow
Solution 4 - PythonNick DandoulakisView Answer on Stackoverflow
Solution 5 - PythonAnurag UniyalView Answer on Stackoverflow
Solution 6 - PythonProgrammer SView Answer on Stackoverflow
Solution 7 - PythonMediumSpringGreenView Answer on Stackoverflow