Python Forum
Help with object interning in Python 3
#1


Just so we are on the same page, interning refers to the re-use of already created objects: if I ask for an object whose signature matches one that already exists, I get the existing object back instead of a brand new one. CPython does this itself for some immutable objects, such as small integers and certain strings.
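For a concrete reference point, the standard library exposes this idea for strings through sys.intern(). A small sketch (the variable names are only for illustration):

```python
import sys

# Build two equal strings at runtime so the compiler cannot merge them.
parts = ["regex", " ", "interning"]
a = "".join(parts)
b = "".join(parts)

# sys.intern() returns the canonical copy for a given string value,
# so both calls hand back the very same object.
ia = sys.intern(a)
ib = sys.intern(b)

same_value = (a == b)
same_object = (ia is ib)
print(same_value, same_object)  # True True
```

What I want is the same guarantee for my own classes: equal signature, same object.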

I tried a few different methods for accomplishing this, but I decided on using metaclasses:

class MetaRegex(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        key = cls.make_key(*args)

        if key in cls._instances:
            obj = cls._instances[key]
            obj.refcount += 1
        else:
            obj = cls._instances[key] = \
                super(MetaRegex, cls).__call__(*args, key=key, **kwargs)
        return obj
This works fine and behaves as intended.
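To show how the metaclass is meant to be used, here is a minimal sketch. The Sym class, its make_key() and the way it sets refcount are made up for this example; only the metaclass logic mirrors the code above.

```python
class MetaRegex(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        key = cls.make_key(*args)
        if key in cls._instances:
            # Seen before: hand back the cached object and count the reuse.
            obj = cls._instances[key]
            obj.refcount += 1
        else:
            # First request: build the object normally and remember it.
            obj = cls._instances[key] = super().__call__(*args, key=key, **kwargs)
        return obj


# Hypothetical leaf class, only here to exercise the metaclass.
class Sym(metaclass=MetaRegex):
    def __init__(self, name, key):
        self.name = name
        self.key = key
        self.refcount = 1

    @staticmethod
    def make_key(*args):
        return ("SYM", args)


a1 = Sym("a")
a2 = Sym("a")
b = Sym("b")
print(a1 is a2, a1.refcount, a1 is b)  # True 2 False
```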

However, there are certain optimizations I perform within __new__() of my derived classes. For example, if I'm creating an OR expression and either the left or the right side is 0 (the empty set), I can simplify the expression by returning the other, non-null child and skip creating this particular tree.

That works great except for Python's policy of:

https://docs.python.org/3/reference/data...ct.__new__

If __new__() returns an instance of cls, then the new instance’s __init__() method will be invoked like __init__(self[, ...]), where self is the new instance and the remaining arguments are the same as were passed to __new__().

This behavior is BAD for me because if I return an already created object, Python will call __init__() on it and proceed to screw up my object by re-initializing its members, completely defeating the optimization.
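A stripped-down illustration of the effect, without any metaclass. The Node class and its cache are invented for this sketch and are not part of my real code:

```python
class Node:
    _cache = {}
    init_calls = 0

    def __new__(cls, name):
        # Return the cached instance if one already exists for this name.
        if name in cls._cache:
            return cls._cache[name]
        obj = super().__new__(cls)
        cls._cache[name] = obj
        return obj

    def __init__(self, name):
        # Runs on every Node(...) call, even when __new__ returned a cached object.
        Node.init_calls += 1
        self.name = name
        self.children = []  # existing state is wiped here


first = Node("a")
first.children.append("x")
second = Node("a")  # same object comes back, but __init__ runs again
print(first is second, Node.init_calls, first.children)  # True 2 []
```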

To mitigate this, I would be forced to add silly protective code with hasattr(self, 'someattr') to protect all the initializers of my derived classes.

Is there a clean way to get around this? I can't be the only person wanting to do this kind of thing.
#2
Could you provide a minimal code snippet that reproduces the problem? Your code uses __call__ but your question seems to be about __new__.
#3
Back in the late 1960s I heard these called string constants (or immutable strings), up until some pseudo-genius decided that they were 'interned' (because they were constant and couldn't be mutated). To me, 'constant' describes what they are, and the name should have been left alone, but so moves the world.

I seem to recall that collections, as in symbol tables, were also involved somehow.
#4
Apologies for the confusion. Here's a simple snippet which demonstrates the problem. The code creating the tree is contrived (one of the arguments is None), but imagine we arrived at such a tree request through a series of calculations and simplifications. In this case, we don't want to create a new trivial tree with one of the branches being None. We'd just return the left or right branch. And because we are returning a tree of the type in question (a chain of RegexOr), we're going to call __init__() whether we like it or not.

class Regex(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        key = cls.make_key(*args)

        if key in cls._instances:
            return cls._instances[key]
        obj = cls._instances[key] = super(Regex, cls).__call__(*args, **kwargs)
        return obj

class RegexOr(metaclass=Regex):
    def __new__(cls, *args, **kwargs):
        print('New RegexOr(%s)' % str(args))
        left, right = args

        if left is None:
            return right
        if right is None:
            return left

        self = super().__new__(cls)

        return self

    def __init__(self, *args, **kwargs):
        print('Init RegexOr(%s)' % str(args))

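        # Fails if __init__ runs on an object that was already initialized.
        if hasattr(self, 'left'):
            assert False, '__init__ called on an already-initialized RegexOr'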

        left, right = args

        self.left = left
        self.right = right

    @staticmethod
    def make_key(*args):
        return ('OR', frozenset(args))

class RegexSym(metaclass=Regex):
    def __init__(self, sym):
        print('Init sym: %s' % sym)
        self.sym = sym

    @staticmethod
    def make_key(*args):
        return ('SYM', args)


y = RegexOr(RegexSym('b'), RegexSym('a'))
x = RegexOr(RegexSym('a'), RegexSym('b'))

z = RegexOr(x, y)

zz = RegexOr(z, None)

print(x == y)
#5
I should probably add: I did this duplication check work originally in Regex.__new__(), but my frustration with derived.__init__() behavior and having to check if this object had already been initialized drove me to consider metaclasses. However, derived.__new__() is the best place to check for simplification since the rules are specific to each subclass...
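One possible way around this, offered as a sketch rather than a definitive answer: since the metaclass already intercepts construction, it can call __new__() and __init__() itself instead of delegating to type.__call__(), and skip __init__() when __new__() hands back an object that was initialized earlier. The names InternMeta, Sym, Or and the _ready marker attribute are all invented for this example.

```python
class InternMeta(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        key = cls.make_key(*args)
        if key in cls._instances:
            return cls._instances[key]

        obj = cls.__new__(cls, *args, **kwargs)
        # Initialize only fresh instances of cls. `_ready` is a marker
        # made up for this sketch; it is set once, here, and nowhere else.
        if isinstance(obj, cls) and not getattr(obj, "_ready", False):
            obj.__init__(*args, **kwargs)
            obj._ready = True
        cls._instances[key] = obj
        return obj


class Sym(metaclass=InternMeta):
    def __init__(self, sym):
        self.sym = sym

    @staticmethod
    def make_key(*args):
        return ("SYM", args)


class Or(metaclass=InternMeta):
    init_calls = 0

    def __new__(cls, left, right):
        # Subclass-specific simplification stays in __new__.
        if left is None:
            return right
        if right is None:
            return left
        return super().__new__(cls)

    def __init__(self, left, right):
        Or.init_calls += 1
        self.left = left
        self.right = right

    @staticmethod
    def make_key(*args):
        return ("OR", frozenset(args))


x = Or(Sym("a"), Sym("b"))
y = Or(Sym("b"), Sym("a"))
z = Or(x, None)  # simplifies to x without re-running __init__
print(x is y, z is x, Or.init_calls, x.left.sym)  # True True 1 a
```

The initialized check still exists, but it lives in one place in the metaclass rather than in every derived __init__(), and the simplification rules stay in each subclass's __new__().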