How to auto register a class when it's defined

PythonOopDesign PatternsDecoratorMetaclass

Python Problem Overview


I want to have an instance of class registered when the class is defined. Ideally the code below would do the trick.

registry = {}

def register( cls ):
   registry[cls.__name__] = cls() #problem here
   return cls

@register
class MyClass( Base ):
   def __init__(self):
      super( MyClass, self ).__init__() 

Unfortunately, this code generates the error NameError: global name 'MyClass' is not defined.

What's going on is at the #problem here line I'm trying to instantiate a MyClass but the decorator hasn't returned yet so it doesn't exist.

Is the someway around this using metaclasses or something?

Python Solutions


Solution 1 - Python

Yes, meta classes can do this. A meta class' __new__ method returns the class, so just register that class before returning it.

class MetaClass(type):
    def __new__(cls, clsname, bases, attrs):
        newclass = super(MetaClass, cls).__new__(cls, clsname, bases, attrs)
        register(newclass)  # here is your register function
        return newclass

class MyClass(object):
    __metaclass__ = MetaClass

The previous example works in Python 2.x. In Python 3.x, the definition of MyClass is slightly different (while MetaClass is not shown because it is unchanged - except that super(MetaClass, cls) can become super() if you want):

#Python 3.x

class MyClass(metaclass=MetaClass):
    pass

As of Python 3.6 there is also a new __init_subclass__ method (see PEP 487) that can be used instead of a meta class (thanks to @matusko for his answer below):

class ParentClass:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        register(cls)

class MyClass(ParentClass):
    pass

[edit: fixed missing cls argument to super().__new__()]

[edit: added Python 3.x example]

[edit: corrected order of args to super(), and improved description of 3.x differences]

[edit: add Python 3.6 __init_subclass__ example]

Solution 2 - Python

Since python 3.6 you don't need metaclasses to solve this

In python 3.6 simpler customization of class creation was introduced (PEP 487).

> An __init_subclass__ hook that initializes all subclasses of a given class.

Proposal includes following example of subclass registration

class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)

> In this example, PluginBase.subclasses will contain a plain list of > all subclasses in the entire inheritance tree. One should note that > this also works nicely as a mixin class.

Solution 3 - Python

The problem isn't actually caused by the line you've indicated, but by the super call in the __init__ method. The problem remains if you use a metaclass as suggested by dappawit; the reason the example from that answer works is simply that dappawit has simplified your example by omitting the Base class and therefore the super call. In the following example, neither ClassWithMeta nor DecoratedClass work:

registry = {}
def register(cls):
    registry[cls.__name__] = cls()
    return cls

class MetaClass(type):
    def __new__(cls, clsname, bases, attrs):
        newclass = super(cls, MetaClass).__new__(cls, clsname, bases, attrs)
        register(newclass)  # here is your register function
        return newclass

class Base(object):
    pass


class ClassWithMeta(Base):
    __metaclass__ = MetaClass
    
    def __init__(self):
        super(ClassWithMeta, self).__init__()


@register
class DecoratedClass(Base):
    def __init__(self):
        super(DecoratedClass, self).__init__()

The problem is the same in both cases; the register function is called (either by the metaclass or directly as a decorator) after the class object is created, but before it has been bound to a name. This is where super gets gnarly (in Python 2.x), because it requires you to refer to the class in the super call, which you can only reasonably do by using the global name and trusting that it will have been bound to that name by the time the super call is invoked. In this case, that trust is misplaced.

I think a metaclass is the wrong solution here. Metaclasses are for making a family of classes that have some custom behaviour in common, exactly as classes are for making a family of instances that have some custom behavior in common. All you're doing is calling a function on a class. You wouldn't define a class to call a function on a string, neither should you define a metaclass to call a function on a class.

So, the problem is a fundamental incompatibility between: (1) using hooks in the class creation process to create instances of the class, and (2) using super.

One way to resolve this is to not use super. super solves a hard problem, but it introduces others (this is one of them). If you're using a complex multiple inheritance scheme, super's problems are better than the problems of not using super, and if you're inheriting from third-party classes that use super then you have to use super. If neither of those conditions are true, then just replacing your super calls with direct base class calls may actually be a reasonable solution.

Another way is to not hook register into class creation. Adding register(MyClass) after each of your class definitions is pretty equivalent to adding @register before them or __metaclass__ = Registered (or whatever you call the metaclass) into them. A line down the bottom is much less self-documenting than a nice declaration up the top of the class though, so this doesn't feel great, but again it may actually be a reasonable solution.

Finally, you can turn to hacks that are unpleasant, but will probably work. The problem is that a name is being looked up in a module's global scope just before it's been bound there. So you could cheat, as follows:

def register(cls):
    name = cls.__name__
    force_bound = False
    if '__init__' in cls.__dict__:
        cls.__init__.func_globals[name] = cls
        force_bound = True
    try:
        registry[name] = cls()
    finally:
        if force_bound:
            del cls.__init__.func_globals[name]
    return cls

Here's how this works:

  1. We first check to see whether __init__ is in cls.__dict__ (as opposed to whether it has an __init__ attribute, which will always be true). If it's inherited an __init__ method from another class we're probably fine (because the superclass will already be bound to its name in the usual way), and the magic we're about to do doesn't work on object.__init__ so we want to avoid trying that if the class is using a default __init__.
  2. We lookup the __init__ method and grab it's func_globals dictionary, which is where global lookups (such as to find the class referred to in a super call) will go. This is normally the global dictionary of the module where the __init__ method was originally defined. Such a dictionary is about to have the cls.__name__ inserted into it as soon as register returns, so we just insert it ourselves early.
  3. We finally create an instance and insert it into the registry. This is in a try/finally block to make sure we remove the binding we created whether or not creating an instance throws an exception; this is very unlikely to be necessary (since 99.999% of the time the name is about to be rebound anyway), but it's best to keep weird magic like this as insulated as possible to minimise the chance that someday some other weird magic interacts badly with it.

This version of register will work whether it's invoked as a decorator or by the metaclass (which I still think is not a good use of a metaclass). There are some obscure cases where it will fail though:

  1. I can imagine a weird class that doesn't have an __init__ method but inherits one that calls self.someMethod, and someMethod is overridden in the class being defined and makes a super call. Probably unlikely.
  2. The __init__ method might have been defined in another module originally and then used in the class by doing __init__ = externally_defined_function in the class block. The func_globals attribute of the other module though, which means our temporary binding would clobber any definition of this class' name in that module (oops). Again, unlikely.
  3. Probably other weird cases I haven't thought of.

You could try to add more hacks to make it a little more robust in these situations, but the nature of Python is both that these kind of hacks are possible and that it's impossible to make them absolutely bullet proof.

Solution 4 - Python

The answers here didn't work for me in python3, because __metaclass__ didn't work.

Here's my code registering all subclasses of a class at their definition time:

registered_models = set()

class RegisteredModel(type):
    def __new__(cls, clsname, superclasses, attributedict):
        newclass = type.__new__(cls, clsname, superclasses, attributedict)
        # condition to prevent base class registration
        if superclasses:
            registered_models.add(newclass)
        return newclass

class CustomDBModel(metaclass=RegisteredModel):
    pass

class BlogpostModel(CustomDBModel):
    pass

class CommentModel(CustomDBModel):
   pass

# prints out {<class '__main__.BlogpostModel'>, <class '__main__.CommentModel'>}
print(registered_models)

Solution 5 - Python

Calling the Base class directly should work (instead of using super()):

  def __init__(self):
        Base.__init__(self)

Solution 6 - Python

It can be also done with something like this (without a registry function)

_registry = {}

class MetaClass(type):
    def __init__(cls, clsname, bases, methods):
        super().__init__(clsname, bases, methods)
        _registry[cls.__name__] = cls


class MyClass1(metaclass=MetaClass): pass
class MyClass2(metaclass=MetaClass): pass

print(_registry)
# {'MyClass1': <class '__main__.MyClass1'>, 'MyClass2': <class '__main__.MyClass2'>}

Additionally, if we need to use a base abstract class (e.g. Base() class), we can do it this way (notice the metacalss inherits from ABCMeta instead of type)

from abc import ABCMeta

_registry = {}

class MetaClass(ABCMeta):
    def __init__(cls, clsname, bases, methods):
        super().__init__(clsname, bases, methods)
        _registry[cls.__name__] = cls

class Base(metaclass=MetaClass): pass
class MyClass1(Base): pass
class MyClass2(Base): pass

print(_registry)
# {'Base': <class '__main__.Base'>, 'MyClass1': <class '__main__.MyClass1'>, 'MyClass2': <class '__main__.MyClass2'>}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questiondeft_codeView Question on Stackoverflow
Solution 1 - PythondappawitView Answer on Stackoverflow
Solution 2 - PythonmatuskoView Answer on Stackoverflow
Solution 3 - PythonBenView Answer on Stackoverflow
Solution 4 - PythonhlidkaView Answer on Stackoverflow
Solution 5 - PythonAndreas JungView Answer on Stackoverflow
Solution 6 - PythonAziz AltoView Answer on Stackoverflow