Understanding Pickling in Python

Python, Pickle, Python 2.7

Python Problem Overview


I have recently got an assignment where I need to put a dictionary (where each key refers to a list) in pickled form. The only problem is I have no idea what pickled form is. Could anyone point me in the right direction of some good resources to help me learn this concept?

Python Solutions


Solution 1 - Python

The pickle module implements a fundamental, but powerful algorithm for serializing and de-serializing a Python object structure.

Pickling is the process whereby a Python object hierarchy is converted into a byte stream, and unpickling is the inverse operation, whereby a byte stream is converted back into an object hierarchy.

Pickling (and unpickling) is alternatively known as serialization, marshalling, or flattening.
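As a minimal sketch of that round trip for the case in the question (a dictionary where each key refers to a list), something like the following should work. The `grades` dictionary and its contents are made up for illustration.

```python
import pickle

# Hypothetical data in the shape the question describes: each key maps to a list.
grades = {'alice': [90, 85], 'bob': [72, 88, 95]}

pickled = pickle.dumps(grades)      # object hierarchy -> byte string
restored = pickle.loads(pickled)    # byte string -> new, equal object

print(restored == grades)   # the contents match
print(restored is grades)   # but it is a separate object
```

The pickled form is just a byte string, so it can be written to a file, stored in a database, or sent over a socket.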

import pickle

data1 = {'a': [1, 2.0, 3, 4+6j],
         'b': ('string', u'Unicode string'),
         'c': None}

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

output = open('data.pkl', 'wb')

# Pickle dictionary using protocol 0.
pickle.dump(data1, output)

# Pickle the list using the highest protocol available.
pickle.dump(selfref_list, output, -1)

output.close()

To read from a pickled file:

import pprint, pickle

pkl_file = open('data.pkl', 'rb')

data1 = pickle.load(pkl_file)
pprint.pprint(data1)

data2 = pickle.load(pkl_file)
pprint.pprint(data2)

pkl_file.close()

source - https://docs.python.org/2/library/pickle.html
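The reading code above calls pickle.load once per object because it knows two objects were written. When the count is not known in advance, one possible approach is to keep loading until pickle raises EOFError. The helper name `load_all` is hypothetical, and an in-memory `io.BytesIO` buffer stands in for the file here.

```python
import io
import pickle

def load_all(fileobj):
    """Yield every pickled object in fileobj until the stream is exhausted."""
    while True:
        try:
            yield pickle.load(fileobj)
        except EOFError:
            return

# Write two objects back to back, the same way the example above does with a file.
buffer = io.BytesIO()
pickle.dump({'a': [1, 2]}, buffer)
pickle.dump([3, 4, 5], buffer)
buffer.seek(0)

objects = list(load_all(buffer))
print(objects)
```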

Solution 2 - Python

Pickling is a mini-language that can be used to convert the relevant state from a Python object into a string, where this string uniquely represents the object. Then (un)pickling can be used to convert the string to a live object, by "reconstructing" the object from the saved state found in the string.

>>> import pickle
>>> 
>>> class Foo(object):
...   y = 1
...   def __init__(self, x):
...     self.x = x
...     return
...   def bar(self, y):
...     return self.x + y
...   def baz(self, y):
...     Foo.y = y  
...     return self.bar(y)
... 
>>> f = Foo(2)
>>> f.baz(3)
5
>>> f.y
3
>>> pickle.dumps(f)
"ccopy_reg\n_reconstructor\np0\n(c__main__\nFoo\np1\nc__builtin__\nobject\np2\nNtp3\nRp4\n(dp5\nS'x'\np6\nI2\nsb."

What you can see here is that pickle doesn't save the source code for the class, but does store a reference to the class definition. Basically, you can almost read the pickled string… it says (roughly translated) "call copy_reg's reconstructor where the arguments are the class defined by __main__.Foo and then do other stuff". The other stuff is the saved state of the instance. If you look deeper, you can extract that "string x" is set to "the integer 2" (roughly: S'x'\np6\nI2). This is actually a clipped part of the pickled string for a dictionary entry… the dict being f.__dict__, which is {'x': 2}. If you look at the source code for pickle, it very clearly gives a translation for each type of object and operation from Python to pickled byte code.
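If reading the raw string is too painful, the standard library's pickletools module can disassemble it into named opcodes. Here is a rough sketch using a plain dict shaped like f.__dict__ rather than the Foo instance itself. The exact opcodes you see may differ between Python versions.

```python
import pickle
import pickletools

# Protocol 0 is requested explicitly so the output matches the text format discussed above.
pickled = pickle.dumps({'x': 2}, 0)

# Print a human-readable listing: one line per opcode, with its argument.
pickletools.dis(pickled)

# genops yields (opcode, argument, position) tuples; collect just the opcode names.
names = [opcode.name for opcode, arg, pos in pickletools.genops(pickled)]
print(names)
```

Every pickle ends with a STOP opcode (the trailing "." in the string), which is how the unpickler knows the object is complete.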

Note also that there are different variants of the pickling language. In Python 2 the default is protocol 0, which is the most human-readable. There's also protocol 2, shown below (and 1, 3, and 4, depending on the version of Python you are using).

>>> pickle.dumps([1,2,3])
'(lp0\nI1\naI2\naI3\na.'
>>> 
>>> pickle.dumps([1,2,3], -1)
'\x80\x02]q\x00(K\x01K\x02K\x03e.'

Again, it's still a dialect of the pickling language, and you can see that the protocol 0 string says "get a list, include I1, I2, I3", while the protocol 2 is harder to read, but says the same thing. The first bit \x80\x02 indicates that it's protocol 2 -- then you have ] which says it's a list, then again you can see the integers 1,2,3 in there. Again, check the source code for pickle to see the exact mapping for the pickling language.
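To check the "same thing, different dialect" claim yourself, a small sketch like this one pickles the same list under protocols 0 and 2 and loads both back. The protocol numbers are passed explicitly, so it does not depend on whatever default your Python version uses.

```python
import pickle

values = [1, 2, 3]

text_form = pickle.dumps(values, 0)    # protocol 0: ASCII opcodes
binary_form = pickle.dumps(values, 2)  # protocol 2: binary opcodes

print(repr(text_form))
print(repr(binary_form))

# Protocol 2 starts with the PROTO opcode (\x80) followed by the protocol number.
print(binary_form[:2] == b'\x80\x02')

# loads works out the protocol on its own, so no protocol argument is needed.
print(pickle.loads(text_form) == pickle.loads(binary_form) == values)
```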

To reverse the pickling and turn the string back into an object, use load/loads.

>>> p = pickle.dumps([1,2,3])
>>> pickle.loads(p)
[1, 2, 3]

Solution 3 - Python

Pickling is just serialization: putting data into a form that can be stored in a file and retrieved later. Here are the docs on the pickle module:

http://docs.python.org/release/2.7/library/pickle.html

Solution 4 - Python

http://docs.python.org/library/pickle.html#example

import pickle

data1 = {'a': [1, 2.0, 3, 4+6j],
         'b': ('string', u'Unicode string'),
         'c': None}

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

output = open('data.pkl', 'wb')

# Pickle dictionary using protocol 0.
pickle.dump(data1, output)

# Pickle the list using the highest protocol available.
pickle.dump(selfref_list, output, -1)

output.close()

Solution 5 - Python

Pickling in Python is used to serialize and de-serialize Python objects, like the dictionary in your case. I usually use the cPickle module (Python 2 only), as it can be much faster than the pickle module.

import cPickle as pickle    

def serializeObject(pythonObj):
    return pickle.dumps(pythonObj, pickle.HIGHEST_PROTOCOL)

def deSerializeObject(pickledObj):
    return pickle.loads(pickledObj)
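Since cPickle only exists in Python 2, a common pattern (sketched here as an assumption about how you might want to structure it) is to try the import and fall back to plain pickle. The snake_case function names are hypothetical variants of the ones above.

```python
try:
    import cPickle as pickle   # Python 2: the C implementation
except ImportError:
    import pickle              # Python 3: the C accelerator is used automatically when available

def serialize_object(python_obj):
    """Return the pickled form of python_obj using the highest available protocol."""
    return pickle.dumps(python_obj, pickle.HIGHEST_PROTOCOL)

def deserialize_object(pickled_obj):
    """Rebuild the object from its pickled form."""
    return pickle.loads(pickled_obj)

restored = deserialize_object(serialize_object({'k': [1, 2, 3]}))
print(restored)
```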

Solution 6 - Python

Sometimes we want to save objects so we can retrieve them later (even after the program that generated the data has terminated), or we want to transmit an object to someone or something outside our application. The pickle module is used for serializing and deserializing objects.

Serializing an object (pickling): create a representation of the object.
Deserializing an object (unpickling): re-load the object from that representation.

dump: pickle to a file.
load: unpickle from a file.
dumps: return the pickled representation, which we can store in a variable.
loads: unpickle from the supplied variable.

Example:

import pickle

print("Using dumps and loads to store it in variable")
list1 = [2, 4]
dict1 = {1: list1, 2: 'hello', 3: list1}
pickle_dict = pickle.dumps(dict1)
print(pickle_dict)

dict2 = pickle.loads(pickle_dict)
print(dict2)

# dict1 == dict2 => True
# dict1 is dict2 => False

print(id(dict1.get(1)), id(dict1.get(3)))
print(id(dict2.get(1)), id(dict2.get(3)))
print("*" * 100)
print("Using dump and load to store it in File ")

cars = ["Audi", "BMW", "Maruti 800", "Maruti Suzuki"]
file_name = "mycar.pkl"
fileobj = open(file_name, 'wb')
pickle.dump(cars, fileobj)
fileobj.close()

file_name = "mycar.pkl"
fileobj = open(file_name, 'rb')
mycar = pickle.load(fileobj)
fileobj.close()  # close the file after reading as well
print(mycar)
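The id() calls in the example above hint at a useful property: when the same list appears twice inside one structure, the unpickled copy keeps that sharing, even though it is a brand-new object. A small sketch that spells this out with explicit checks instead of printed ids (the variable names are illustrative only):

```python
import pickle

shared = [2, 4]
original = {1: shared, 2: 'hello', 3: shared}

copy = pickle.loads(pickle.dumps(original))

print(copy == original)      # equal contents
print(copy is original)      # a different object
print(copy[1] is copy[3])    # sharing inside the structure survives the round trip
print(copy[1] is shared)     # but the list is a new one, not the original
```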

Solution 7 - Python

Pickling allows you to serialize and de-serialize Python object structures. In short, pickling is a way to convert a Python object into a byte stream so that this stream contains all the information necessary to reconstruct the object in another Python script.

import pickle

def pickle_data():
    data = {
        'name': 'sanjay',
        'profession': 'Software Engineer',
        'country': 'India'
    }
    filename = 'PersonalInfo'
    # The with statement closes the file even if dump raises an error.
    with open(filename, 'wb') as outfile:
        pickle.dump(data, outfile)

pickle_data()
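The function above only writes the file. A matching reader could look roughly like the sketch below. The name `unpickle_data` is hypothetical, and the example writes its own small file into a temporary directory so that it runs on its own without touching the PersonalInfo file from above.

```python
import os
import pickle
import tempfile

def unpickle_data(filename):
    """Load and return the object stored in filename."""
    with open(filename, 'rb') as infile:
        return pickle.load(infile)

# Write a small file first so the sketch is self-contained.
folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'PersonalInfo')
with open(filename, 'wb') as outfile:
    pickle.dump({'name': 'sanjay', 'country': 'India'}, outfile)

info = unpickle_data(filename)
print(info['name'])

# Clean up the temporary file and directory.
os.remove(filename)
os.rmdir(folder)
```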

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Spencer | View Question on Stackoverflow
Solution 1 - Python | Srikar Appalaraju | View Answer on Stackoverflow
Solution 2 - Python | Mike McKerns | View Answer on Stackoverflow
Solution 3 - Python | Tom Zych | View Answer on Stackoverflow
Solution 4 - Python | John Riselvato | View Answer on Stackoverflow
Solution 5 - Python | Raunak | View Answer on Stackoverflow
Solution 6 - Python | Sandeep Makwana | View Answer on Stackoverflow
Solution 7 - Python | sanjay | View Answer on Stackoverflow