Python Forum
Why does unpickling only work outside of a function?
#1
I wrote a little test program to understand pickling and unpickling a custom class.

Pickling a custom class by passing it as a parameter to a function that does the pickling works just fine. Trying to unpickle in the same way does not work. You have to make the unpickling function return the unpickled data as the function value and assign the function result to the class instance outside of the function.

Why is that? Any explanation you can provide would be appreciated.

Example code and output for the non-working and working versions are below. System is Win10-64, Python is 3.8.5.

Peter

Unpickling inside a function does not work:

tstpikl1.py:
import numpy as np


class Pkltest:
    def __init__(self):
        self.vect = np.full([5, 5], 0, dtype=np.int32)
        self.int1 = 0
        self.real1 = 0.0
        self.bool1 = False


def saveclass(class2pkl, savename):
    try:
        fp = open(savename, "wb")
    except IOError:
        print("Can't open file {}".format(savename))
        raise
    from pickle import dump
    dump(class2pkl, fp)
    fp.close()
    return


def restclass(class2lod, loadname):
    try:
        fp = open(loadname, "rb")
    except IOError:
        print("Can't open file {}".format(loadname))
        raise
    from pickle import load
    class2lod = load(fp)
    fp.close()
    return


savename = "classsave.pkl"
pktest1 = Pkltest()

pktest1.vect[1:-1, 1:-1] = 10
pktest1.int1 = 10
pktest1.real1 = 10
pktest1.bool1 = True

print("1)Before function save:\n{}\n".format(pktest1.__dict__))
saveclass(pktest1, savename)

pktest1.vect[1:-1, 1:-1] = 5
pktest1.int1 = 5
pktest1.real1 = 5
pktest1.bool1 = False
print("2)After modify:\n{}\n".format(pktest1.__dict__))

restclass(pktest1, savename)

print("3)After function load:\n{}\n".format(pktest1.__dict__))

try:
    fp = open(savename, "rb")
except IOError:
    print("Can't open file {}".format(savename))
    raise
from pickle import load
pktest1 = load(fp)
fp.close()

print("4)After inline load:\n{}\n".format(pktest1.__dict__))
Output from tstpikl1.py:
Output:
1)Before function save:
{'vect': array([[ 0, 0, 0, 0, 0], [ 0, 10, 10, 10, 0], [ 0, 10, 10, 10, 0], [ 0, 10, 10, 10, 0], [ 0, 0, 0, 0, 0]]), 'int1': 10, 'real1': 10, 'bool1': True}

2)After modify:
{'vect': array([[0, 0, 0, 0, 0], [0, 5, 5, 5, 0], [0, 5, 5, 5, 0], [0, 5, 5, 5, 0], [0, 0, 0, 0, 0]]), 'int1': 5, 'real1': 5, 'bool1': False}

3)After function load:
{'vect': array([[0, 0, 0, 0, 0], [0, 5, 5, 5, 0], [0, 5, 5, 5, 0], [0, 5, 5, 5, 0], [0, 0, 0, 0, 0]]), 'int1': 5, 'real1': 5, 'bool1': False}

4)After inline load:
{'vect': array([[ 0, 0, 0, 0, 0], [ 0, 10, 10, 10, 0], [ 0, 10, 10, 10, 0], [ 0, 10, 10, 10, 0], [ 0, 0, 0, 0, 0]]), 'int1': 10, 'real1': 10, 'bool1': True}

Unpickling outside of a function works:

tstpikl2.py:
import numpy as np


class Pkltest:
    def __init__(self):
        self.vect = np.full([5, 5], 0, dtype=np.int32)
        self.int1 = 0
        self.real1 = 0.0
        self.bool1 = False


def saveclass(class2pkl, savename):
    try:
        fp = open(savename, "wb")
    except IOError:
        print("Can't open file {}".format(savename))
        raise
    from pickle import dump
    dump(class2pkl, fp)
    fp.close()
    return


def restclass(loadname):
    try:
        fp = open(loadname, "rb")
    except IOError:
        print("Can't open file {}".format(loadname))
        raise
    from pickle import load
    class2lod = load(fp)
    fp.close()
    return class2lod


savename = "classsave.pkl"
pktest1 = Pkltest()

pktest1.vect[1:-1, 1:-1] = 10
pktest1.int1 = 10
pktest1.real1 = 10
pktest1.bool1 = True

print("1)Before function save:\n{}\n".format(pktest1.__dict__))
saveclass(pktest1, savename)

pktest1.vect[1:-1, 1:-1] = 5
pktest1.int1 = 5
pktest1.real1 = 5
pktest1.bool1 = False
print("2)After modify:\n{}\n".format(pktest1.__dict__))

pktest1 = restclass(savename)

print("3)After function load:\n{}\n".format(pktest1.__dict__))
Output from tstpikl2.py:
Output:
1)Before function save:
{'vect': array([[ 0, 0, 0, 0, 0], [ 0, 10, 10, 10, 0], [ 0, 10, 10, 10, 0], [ 0, 10, 10, 10, 0], [ 0, 0, 0, 0, 0]]), 'int1': 10, 'real1': 10, 'bool1': True}

2)After modify:
{'vect': array([[0, 0, 0, 0, 0], [0, 5, 5, 5, 0], [0, 5, 5, 5, 0], [0, 5, 5, 5, 0], [0, 0, 0, 0, 0]]), 'int1': 5, 'real1': 5, 'bool1': False}

3)After function load:
{'vect': array([[ 0, 0, 0, 0, 0], [ 0, 10, 10, 10, 0], [ 0, 10, 10, 10, 0], [ 0, 10, 10, 10, 0], [ 0, 0, 0, 0, 0]]), 'int1': 10, 'real1': 10, 'bool1': True}

#2
Example with pickle and json.

import pickle
import json


class Dummy:
    def __init__(self, name, value):
        """
        Minimal Dummy class to demonstrate functionality
        """
        self.name = name
        self.value = value

    def __repr__(self):
        return f"Dummy('{self.name}', {self.value})"

    @classmethod
    def from_pickle(cls, file):
        """
        Restore object from pickle file.
        """
        with open(file, "rb") as fd:
            return pickle.load(fd)

    @classmethod
    def from_json(cls, file):
        """
        Restore object from json file.
        """
        with open(file, "rb") as fd:
            data = json.load(fd)
            return cls(data["name"], data["value"])

    def save_pickle(self, file):
        """
        Save object to pickle file.
        """
        with open(file, "wb") as fd:
            pickle.dump(self, fd)

    def save_json(self, file):
        """
        This method saves only the attributes name and value.
        Use the method from_json to restore the object.
        """
        with open(file, "wt", encoding="utf8") as fd:
            json.dump({"name": self.name, "value": self.value}, fd)


if __name__ == "__main__":
    print("With pickle:")
    dummy1 = Dummy("foo", 42)
    dummy1.extra = "test"
    dummy1.save_pickle("dummy1.pickle")
    dummy2 = Dummy.from_pickle("dummy1.pickle")
    print(dummy1)
    print(dummy2)
    print(vars(dummy2))

    print()

    print("With json:")
    # json can not serialize user defined classes
    # the save_json and from_json methods are different
    dummy3 = Dummy("foo", 13)
    dummy3.extra = "test"  # this attribute is lost
    dummy3.save_json("dummy3.json")
    dummy4 = Dummy.from_json("dummy3.json")
    print(dummy3)
    print(dummy4)
    print(vars(dummy4))
Almost dead, but too lazy to die: https://sourceserver.info
All humans together. We don't need politicians!
#3
(Dec-03-2020, 11:53 AM)DeaD_EyE Wrote: Example with pickle and json.
<Example code snipped>

Thanks for the response, but that does not answer my question. In your example you have this line in the "main" code:

dummy2 = Dummy.from_pickle("dummy1.pickle")

My question is why does the function have to RETURN the value of pickle.load(fp)? Why can't a class instance be passed as an argument to a function that does the pickle.load(fp) and be directly assigned the value returned from pickle.load(fp) INSIDE the function? I realize that probably isn't possible (or at least seems to me like it ought not to be possible) inside of a class method of the class being loaded, but it should be possible in a function defined outside of the class at a global level.

My understanding (perhaps incorrect) is that function arguments are passed by reference, not by value, and not a reference to an intermediate copy but a reference to the actual object. If that is the case, why can't a function just assign the value returned by pickle.load() to the argument reference inside the function thus changing the object pointed to by the passed class instance variable?

Peter
#4
Quote:My question is why does the function have to RETURN the value of pickle.load(fp)? Why can't a class instance be passed as an argument to a function that does the pickle.load(fp) and be directly assigned the value returned from pickle.load(fp) INSIDE the function?

What I showed is how to create a new instance of a class.
from_pickle is an alternative constructor alongside __init__.
There are many examples in the standard library.


itertools.chain is a class. When you call it, the __init__ method is executed after the instance has been created and populates the object with attributes. In addition, there is the class method itertools.chain.from_iterable.

Class methods take the class as their first argument, so cls is used as the placeholder instead of self. The @classmethod decorator converts the method into a class method.
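As a small illustration of the alternative-constructor idea mentioned here, the sketch below builds the same kind of object in both ways. The variable names are only for this example.

```python
from itertools import chain

# Regular construction: each iterable is passed as a separate argument.
a = chain([1, 2], [3, 4])

# Alternative constructor: a class method that takes ONE iterable of iterables.
b = chain.from_iterable([[1, 2], [3, 4]])

# Both calls produce an instance of the same class.
same_type = type(a) is chain and type(b) is chain

list_a = list(a)
list_b = list(b)
print(list_a)
print(list_b)
print(same_type)
```

Both routes end with a chain instance; only the way the arguments are supplied differs.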

    @classmethod # <- make a classmethod from_pickle
    def from_pickle(cls, file): # <- cls is the Class itself
        """
        Restore object from pickle file.
        """
        with open(file, "rb") as fd:
            return pickle.load(fd) # <- this is annoying
The last line is a bit annoying because you can't see what is really happening in the background.
I would usually expect a call to cls rather than a direct return of whatever was loaded. It could even be a completely different object.
In the first place, I take the already existing instance of the class and serialize it with pickle.
It's like dumping the object from memory: copying the values, putting them into a structure, and saving it to disk.
The load(s) functions do the inverse: they reconstruct the object, and then it is in memory as before.

You can see this difference clearly with the from_json method. JSON can't serialize classes and user-defined objects; only a subset of data types is supported. Instead of returning the object loaded from JSON directly, I extract the data from it, create an instance with the cls() call, and return this newly created instance.


I think the way you wanted to go is:
  • create an instance of the class
  • save it
  • continue using the same instance
  • load it from disk
  • update the existing instance with the loaded data

My time is up. I'll post an example of the other approach later.
Just look up what classmethod, staticmethod and property do.

The @ before classmethod is syntactic sugar.

@classmethod
def foo(cls):
    ...


# is the same as

def foo(cls):
    ...

foo = classmethod(foo)
classmethod returns a classmethod object that wraps the function foo.
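To check that equivalence, here is a minimal sketch with two hypothetical classes (WithDecorator and WithoutDecorator are names made up for this example). Both spellings behave the same when called on the class or on an instance.

```python
class WithDecorator:
    @classmethod
    def who(cls):
        # cls is the class the method was called on
        return cls.__name__


class WithoutDecorator:
    def who(cls):
        return cls.__name__

    # Same effect as the decorator: rebind the name to the wrapped function.
    who = classmethod(who)


print(WithDecorator.who())
print(WithoutDecorator.who())
print(WithoutDecorator().who())  # calling through an instance still passes the class
```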
#5
(Dec-04-2020, 07:19 AM)DeaD_EyE Wrote:
Quote:My question is why does the function have to RETURN the value of pickle.load(fp)? Why can't a class instance be passed as an argument to a function that does the pickle.load(fp) and be directly assigned the value returned from pickle.load(fp) INSIDE the function?
What I showed is how to create a new instance of a class.
from_pickle is an alternative constructor alongside __init__.
There are many examples in the standard library.

<Snipped>

I think the way you wanted to go is:
  • create instance of class
  • save it
  • continue using the same instance
  • loading it from disk
  • change the existing instance with the loaded data
<Snipped>

That is exactly what I am trying to model. Save the instance to disk, perhaps continue to use it for some purpose, then reload the saved version of the instance and restore the current instance to the same state as the saved instance.

Your examples seem to assume that I want the "save" and "restore" functions to be methods of the class I am saving, but that is not the case. I expect the "save class" and "reload class" functionality to be done in globally defined ordinary functions that can be used to save or restore any custom class at all, not just one class, so the save and restore operations should not be methods of any class. What I originally tried to do with the "non-working" example was to assign to the function argument passed in to the "reload class" function (which was the current instance of the class) the result of the pickle.load(fp) value, but that does not work and I do not understand why it does not. It just bothers me that I have to return the unpickled data and assign it to the class instance at the outer level. If that works, why does assigning inside the function not work?
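One possible way to get that behaviour from ordinary module-level functions is to copy the loaded state into the existing instance instead of assigning to the parameter name. The sketch below is only an illustration under some assumptions: save_object, restore_into, and the Settings class are hypothetical names, and the approach assumes instances keep their state in __dict__ (no __slots__).

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    """Pickle any picklable object to path."""
    with open(path, "wb") as fp:
        pickle.dump(obj, fp)


def restore_into(obj, path):
    """Replace obj's attributes with those of the object pickled at path."""
    with open(path, "rb") as fp:
        loaded = pickle.load(fp)
    # Mutate the object the caller passed in; do NOT rebind the local name.
    obj.__dict__.clear()
    obj.__dict__.update(loaded.__dict__)


class Settings:  # hypothetical example class
    def __init__(self):
        self.level = 0
        self.tags = []


path = os.path.join(tempfile.mkdtemp(), "settings.pkl")

settings = Settings()
settings.level = 10
alias = settings              # a second reference to the same instance
save_object(settings, path)

settings.level = 5            # keep using and modifying the instance
restore_into(settings, path)  # restore the saved state in place

print(settings.level)
print(alias is settings, alias.level)
```

Because the instance itself is modified, every reference to it sees the restored state. Returning the loaded object, as in tstpikl2.py, remains the simpler option when a fresh instance is acceptable.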

I believe I do understand the usefulness of static methods, though the property decorator has yet to seem useful in the programming I have been doing thus far.

I will reread the official documentation of those three again however and see if any new comprehension comes to me.

Regards, and thanks for your replies.

Peter
#6
I have found the answer to my basic question in this tutorial:

Pass by Reference in Python

Understanding that function arguments are passed by *assignment* and not, as I incorrectly thought, by reference (i.e., by passing a pointer to the caller's variable in the classic C language sense) makes the necessity of returning the unpickled value clear to me.

The parameter name is bound to the object that was passed in, but assigning a new value to the parameter inside the function only rebinds the LOCAL name represented by that parameter; the caller's variable still refers to the original object.

The straightforward way to get the new, unpickled object back to the caller is to return it, so that it can be assigned to the original variable passed into the function.
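A minimal sketch of that distinction, using a plain list and two hypothetical helper functions (rebind and mutate): assignment to the parameter is invisible to the caller, while changing the passed object in place is visible.

```python
def rebind(items):
    # Binds the LOCAL name to a brand-new list; the caller's list is untouched.
    items = [1, 2, 3]


def mutate(items):
    # Changes the object that both the caller and the parameter refer to.
    items.append(99)


data = []

rebind(data)
after_rebind = list(data)  # snapshot of the caller's list after rebind()

mutate(data)
after_mutate = list(data)  # snapshot of the caller's list after mutate()

print(after_rebind)
print(after_mutate)
```

The line class2lod = load(fp) in tstpikl1.py is the same kind of operation as rebind here.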

It could also be done by using a "global" statement in the function and assigning directly to a specific external name, but I can see many reasons why that is frowned upon. It requires the function to know the exact global name that should receive the value, which for a generic "unpickle" function would be almost completely untenable, and at the very least a maintenance headache.
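For illustration only, here is a sketch of what the "global" variant looks like and why it does not generalize. The names restored, other, and load_into_global are made up for this example, and a plain dict stands in for the unpickled object.

```python
restored = None  # the one module-level name the function is hard-wired to
other = None     # a second variable the function can never target


def load_into_global(value):
    # 'global' makes the assignment rebind the module-level name 'restored'.
    global restored
    restored = value


load_into_global({"int1": 10})

print(restored)
print(other)
```

The function can only ever fill restored; restoring into other would need a second function or a change to this one.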

Tricky stuff, so I am replying to myself so that others may learn as I have.

Peter