A simple introduction to using Python’s data model methods __getattr__ and __getattribute__, together with *args and **kwargs, to implement dynamic methods and properties for classes and objects.

Techniques are introduced for refactoring while maintaining backwards compatibility, e.g. for an API or library.

Prerequisite:
Python Installation

What is a dynamic method or property?

Simply, a dynamic method or property is an attribute that is not explicitly defined. Instead, its functionality is dynamically resolved at runtime, rather than pointing to an exact function in memory.

How can dynamic methods and properties be used?

One use of a dynamic method or property is for improving syntax while reducing code: it allows for the generic definition of “families” of methods for an object instead of multiple explicitly defined methods that perform nearly identical tasks.

To illustrate, let’s consider the scenario of an AnimalShelter which tracks tuples defining individual animals in a list:

# tracks list of animals, each animal identified by a tuple of properties
class AnimalShelter(object):
    def __init__(self):
        self.animals = []
    def add_cat(self, breed: str, name: str):
        self.animals.append(("cat", breed, name))
    def add_dog(self, breed: str, name: str):
        self.animals.append(("dog", breed, name))
    def add_horse(self, breed: str, name: str):
        self.animals.append(("horse", breed, name))

We can improve this by abstracting repeated logic into a method add_animal():

# abstract common logic into add_animal method
class AnimalShelter(object):
    def __init__(self):
        self.animals = []
    def add_animal(self, species: str, breed: str, name: str):
        self.animals.append((species, breed, name))
    def add_cat(self, breed: str, name: str):
        self.add_animal("cat", breed, name)
    def add_dog(self, breed: str, name: str):
        self.add_animal("dog", breed, name)
    def add_horse(self, breed: str, name: str):
        self.add_animal("horse", breed, name)

Notice how each explicitly defined add_* method has the same underlying structure and method call to add_animal. We could completely remove add_cat(), add_dog(), add_horse() and only use add_animal()… this is one typical way of performing abstraction and code simplification:

# remove similar methods, force API to use add_animal()
class AnimalShelter(object):
    def __init__(self):
        self.animals = []
    def add_animal(self, species: str, breed: str, name: str):
        self.animals.append((species, breed, name))

Let’s suppose we want our API to continue supporting methods like add_cat(), add_dog(), add_horse(). This could be for a variety of reasons: the AnimalShelter class is utilized by many end-users and we’d like to preserve backwards compatibility, or maybe there is a preference for the explicit syntax.

Whatever the case is, Python’s data model provides methods for resolving accesses to undefined attributes. Let’s use these methods to define logic for how to handle any request that begins with the string add_.

There are two methods we will be using: __getattr__ and __getattribute__. We will examine both of these methods and different approaches to utilizing them. It is important to realize that simple logic errors can result in infinite recursion in either method. Infinite recursion tends to be easier to produce in __getattribute__, because it runs on every attribute access, including accesses made inside the method itself.
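As a minimal sketch of how such recursion can arise, the hypothetical Broken class below (not part of the AnimalShelter example) reads self._prefix inside __getattr__ without ever defining it. The failed lookup of _prefix calls __getattr__ again, which reads self._prefix again, until Python raises a RecursionError:

```python
class Broken:
    """Hypothetical class: __getattr__ reads an attribute that was never set."""
    def __getattr__(self, attr: str):
        # self._prefix does not exist, so this lookup re-enters __getattr__
        if attr.startswith(self._prefix):
            return None
        raise AttributeError(attr)

def triggers_recursion() -> bool:
    """Return True if accessing an undefined attribute recurses without end."""
    try:
        Broken().add_cat
    except RecursionError:
        return True
    return False

print(triggers_recursion())  # True
```

A common way to avoid this is to make sure every attribute read inside __getattr__ is guaranteed to exist, or to compare against a constant instead of instance state.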

Here is how attribute access is resolved in Python using __getattr__ and __getattribute__:

__getattribute__ is called for every attribute access on an object, be it a method or property
– if __getattribute__ cannot find the requested attribute, an AttributeError is raised
– if __getattr__ is defined, it is then called with the attribute name as a fallback, and its result (or its own AttributeError) becomes the outcome of the access

Overriding the __getattribute__ method allows modifying how all attribute accesses occur for an object. Let’s use that first:

# dynamic methods implemented using __getattribute__
# overrides all attribute accesses
class AnimalShelter(object):
    def __init__(self):
        self.animals = []
        self._species = ''
    def add_animal(self, breed: str, name: str):
        self.animals.append((self._species, breed, name))
    def __getattribute__(self, attr: str):
        try:
            return super().__getattribute__(attr)
        except AttributeError as e:
            if len(attr) > 4 and attr.startswith("add_"):
                self._species = attr[4:]
                return self.add_animal
            raise e

Try-except super().__getattribute__

For every attribute access attempt, the object first tries to use the language’s built-in __getattribute__ defined by the object class. It calls this method using super().__getattribute__().

For reference, super() returns a proxy object that delegates attribute lookups to the next class after AnimalShelter in the MRO (Method Resolution Order) of the instance self, which here is object. The Python 2 syntax would be: super(AnimalShelter, self). In Python 3, a bare super() inside a method is equivalent to super(AnimalShelter, self).
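To make the equivalence concrete, here is a small sketch with a hypothetical Example class whose three helper methods all perform the same lookup through the parent class:

```python
class Example(object):
    def __init__(self):
        self.value = 42

    def lookup_implicit(self, attr: str):
        # Python 3 zero-argument form
        return super().__getattribute__(attr)

    def lookup_explicit(self, attr: str):
        # explicit two-argument form, also valid in Python 3
        return super(Example, self).__getattribute__(attr)

    def lookup_direct(self, attr: str):
        # naming the parent class directly bypasses the MRO search
        return object.__getattribute__(self, attr)

example = Example()
results = [
    example.lookup_implicit("value"),
    example.lookup_explicit("value"),
    example.lookup_direct("value"),
]
print(results)  # [42, 42, 42]
```

With single inheritance from object all three behave the same. With multiple inheritance, only the super() forms follow the MRO.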

If the parent __getattribute__ (the Python built-in __getattribute__ method defined for the object class) returns a value, then the try-except block does not raise an AttributeError because the AnimalShelter object already has the desired attribute.

However, if an AttributeError is raised, that means the desired attr is not defined for the AnimalShelter instance. The following except block tries to dynamically resolve the attribute access by seeing if it is an attempt to call add_cat(), add_dog(), or add_horse().

It does this by checking what attr is requested and matching the request to the following pattern:

– the length of the attr requested is more than four characters,
– the attr requested starts with the string add_

If both conditions are true, then we assume that the user wants to add a specific species of animal, so we store that species in an internal variable, self._species. The leading underscore is a naming convention that marks the variable as “protected”: it is needed for the class to function correctly, and should not be modified outside of the class unless the user knows what they are doing.

Appropriately, the add_animal() method is updated to expect the instance variable self._species, and the __init__() method is updated to initialize self._species to an empty string ''.

If the user is requesting an undefined add_ method, a reference to the add_animal() method is now returned instead of a call to it. Note that a method can be referenced without being called by omitting the parentheses (), e.g. someObject.someMethod evaluates to a bound method object (loosely, a function pointer) whereas someObject.someMethod() is a function call.

Therefore, when a user calls add_cat("Calico", "Princess Carolyn") and add_dog("Golden Retriever", "Mr. Peanut Butter"), add_animal() is called instead with the respective arguments. By that point the attribute lookup has already set self._species to "cat" or "dog", so the species is passed internally.
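A short usage sketch of this approach follows. The class is repeated so the snippet runs on its own, and the variable names shelter and method are only illustrative:

```python
class AnimalShelter(object):
    def __init__(self):
        self.animals = []
        self._species = ''
    def add_animal(self, breed: str, name: str):
        self.animals.append((self._species, breed, name))
    def __getattribute__(self, attr: str):
        try:
            return super().__getattribute__(attr)
        except AttributeError:
            if len(attr) > 4 and attr.startswith("add_"):
                self._species = attr[4:]
                return self.add_animal
            raise

shelter = AnimalShelter()
shelter.add_cat("Calico", "Princess Carolyn")
shelter.add_dog("Golden Retriever", "Mr. Peanut Butter")

# without parentheses we only obtain a reference; nothing is added yet
method = shelter.add_horse

print(shelter.animals)
# [('cat', 'Calico', 'Princess Carolyn'), ('dog', 'Golden Retriever', 'Mr. Peanut Butter')]
```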

Use __getattr__ instead of __getattribute__ and try-except

As discussed earlier, Python actually calls __getattr__ automatically if __getattribute__ raises an AttributeError. Rather than overriding __getattribute__, which affects every attribute access and can have far-reaching and undesirable consequences, let’s implement what we want by overriding __getattr__ instead:

# dynamic methods implemented using __getattr__ instead of __getattribute__
class AnimalShelter(object):
    def __init__(self):
        self.animals = []
        self._species = ''
    def add_animal(self, breed: str, name: str):
        self.animals.append((self._species, breed, name))
    def __getattr__(self, attr: str):
        if len(attr) > 4 and attr.startswith("add_"):
            self._species = attr[4:]
            return self.add_animal
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{attr}'")

Notice how the implementation of __getattr__ is nearly identical to the __getattribute__ implementation. All that was done:

– the try-except block was removed
– the logic inside the except block was moved out
– the method explicitly raises an AttributeError mimicking Python’s built-in error
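One caveat of storing the species on the instance is that a saved reference such as shelter.add_cat uses whichever species was looked up most recently. A possible variation, which is an assumption here rather than part of the original design, binds the species at lookup time with functools.partial so that each returned callable keeps its own value:

```python
from functools import partial

class AnimalShelter(object):
    def __init__(self):
        self.animals = []
    def add_animal(self, species: str, breed: str, name: str):
        self.animals.append((species, breed, name))
    def __getattr__(self, attr: str):
        if len(attr) > 4 and attr.startswith("add_"):
            # bind the species now instead of storing it on the instance
            return partial(self.add_animal, attr[4:])
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{attr}'")

shelter = AnimalShelter()
add_cat = shelter.add_cat
add_dog = shelter.add_dog   # looked up later, but does not affect add_cat
add_cat("Calico", "Princess Carolyn")
add_dog("Golden Retriever", "Mr. Peanut Butter")

print(shelter.animals)
# [('cat', 'Calico', 'Princess Carolyn'), ('dog', 'Golden Retriever', 'Mr. Peanut Butter')]
```

This also keeps add_animal() as an explicit three-argument method, at the cost of creating a new partial object on every lookup.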

Make methods generic using *args

Functions can be defined as accepting any number of unnamed (positional) inputs using *args. This syntax collects all extra positional arguments into a tuple and passes them to the function as a single variable. To let add_animal() accept either two or three arguments, that is, with or without species, we can write:

# use inspect to determine a function's name from within that function
import inspect

# use *args for add_animal
class AnimalShelter(object):
    def __init__(self):
        self.animals = []
        self._species = ''
    def add_animal(self, *args):
        if len(args) == 2:
            species = self._species
            breed = args[0]
            name = args[1]
        elif len(args) == 3:
            species = args[0]
            breed = args[1]
            name = args[2]
        else:
            func = inspect.stack()[0][3]
            raise TypeError(f"{func}() takes 3 to 4 positional arguments but {len(args) + 1} were given")
        self.animals.append((species, breed, name))
    def __getattr__(self, attr: str):
        if len(attr) > 4 and attr.startswith("add_"):
            self._species = attr[4:]
            return self.add_animal
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{attr}'")

Notice how we now branch on the length of args. If two arguments are given, we assume they are breed and name, in that order. If three arguments are given, we assume they are species, breed, and name.

Notice also how in all other circumstances, we raise a TypeError mimicking Python’s built-in behaviour indicating that an unexpected number of positional (non-keyword) arguments was specified. Take care to realize that self is the implicit first argument of all instance methods, therefore our error message says “3 to 4 positional arguments” instead of “2 to 3”.

It is worth noting that the variable itself can be renamed: the single-asterisk syntax * is what is significant. Therefore, we could rewrite our method prototype as def add_animal(self, *stuff) if we so desired.

The *args syntax encompasses all non-keyword arguments in excess of defined arguments to a method, meaning we can express our function prototype in a number of ways:

add_animal(self, *args)
add_animal(self, breed, *args)
add_animal(self, breed, name, *args)
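As a small illustration of the second prototype, the hypothetical standalone function describe() below takes one explicit argument and lets *args absorb whatever positional arguments remain:

```python
def describe(breed, *args):
    # breed takes the first positional argument; args holds the rest
    return breed, args

only_breed = describe("Calico")
with_extras = describe("Calico", "Princess Carolyn", "extra")

print(only_breed)         # ('Calico', ())
print(with_extras)        # ('Calico', ('Princess Carolyn', 'extra'))
print(type(with_extras[1]).__name__)  # tuple
```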

Make methods more generic using **kwargs

Keyword args, or **kwargs, can also be used to make methods more generic. The above example with *args is a bit limited because if the arguments are passed in by keyword, the add_animal() method will fail and raise a TypeError.


# use **kwargs for add_animal
class AnimalShelter(object):
    def __init__(self):
        self.animals = []
        self._species = ''
    def add_animal(self, **kwargs):
        # fall back to the species set by __getattr__ when none is passed
        species = kwargs.get("species", self._species)
        # if species has not been set, it is required as an argument
        if not species:
            raise TypeError("Missing required argument: 'species'")
        if "breed" not in kwargs:
            raise TypeError("Missing required argument: 'breed'")
        if "name" not in kwargs:
            raise TypeError("Missing required argument: 'name'")
        self.animals.append((species, kwargs["breed"], kwargs["name"]))
    def __getattr__(self, attr: str):
        if len(attr) > 4 and attr.startswith("add_"):
            self._species = attr[4:]
            return self.add_animal
        raise AttributeError(f"'{type(self).__name__}' object has no attribute '{attr}'")

Syntactically, it is not necessary to call the variable **kwargs: it can be called anything that is convenient. The double-asterisk ** is the essential syntax here: it collects the keyword arguments into a dict keyed by argument name. Despite the visual resemblance, it is unrelated to a double dereference in C.

Just like *args, the **kwargs variable “encompasses” any undefined or arbitrary keyword arguments. In other words, we can pass all our arguments through **kwargs, or we can explicitly define some keyword arguments and not others. To illustrate, viable method prototypes include:

add_animal(self, **kwargs)
add_animal(self, breed, **kwargs)
add_animal(self, breed, name, **kwargs)
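The second prototype can be sketched with a hypothetical standalone function describe(), in which breed is explicit and every other keyword argument ends up in kwargs:

```python
def describe(breed, **kwargs):
    # breed is matched by position or by name; the rest is collected in a dict
    return breed, kwargs

by_position = describe("Calico", name="Princess Carolyn")
by_keyword = describe(name="Princess Carolyn", species="cat", breed="Calico")

print(by_position)  # ('Calico', {'name': 'Princess Carolyn'})
print(by_keyword)   # ('Calico', {'name': 'Princess Carolyn', 'species': 'cat'})
```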

The above implementation has an issue: we can no longer pass positional arguments, so every argument must be passed by name. That is, the following call will now raise a TypeError: self.add_animal("Cat", "Calico", "Princess Carolyn"). We must perform method calls like so: self.add_animal(species="Cat", breed="Calico", name="Princess Carolyn")
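A minimal sketch of this limitation, using a simplified standalone add_animal() function (an assumption for brevity, not the class method above):

```python
def add_animal(**kwargs):
    # only keyword arguments are accepted by this signature
    return (kwargs["species"], kwargs["breed"], kwargs["name"])

try:
    add_animal("Cat", "Calico", "Princess Carolyn")
    positional_failed = False
except TypeError as error:
    positional_failed = True
    print(error)

by_name = add_animal(species="Cat", breed="Calico", name="Princess Carolyn")
print(by_name)  # ('Cat', 'Calico', 'Princess Carolyn')
```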

Combining *args and **kwargs

Both *args and **kwargs can be combined in a single signature, but this should be done with care: the interplay between the two inputs, such as the same argument arriving both positionally and by keyword, needs to be handled explicitly. Typically, it is instead recommended to identify the use-cases (or user stories) for software and enforce specific paradigms of usage.
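To give a sense of the care involved, here is one possible sketch of a combined signature. The field order, the error messages, and the omission of the add_* dispatch are all simplifying assumptions made for this example:

```python
class AnimalShelter(object):
    def __init__(self):
        self.animals = []

    def add_animal(self, *args, **kwargs):
        fields = ("species", "breed", "name")
        if len(args) > len(fields):
            raise TypeError(f"add_animal takes at most {len(fields)} arguments but {len(args)} were given")
        # positional values fill the fields from left to right
        values = dict(zip(fields, args))
        for key, value in kwargs.items():
            if key not in fields:
                raise TypeError(f"add_animal got an unexpected keyword argument '{key}'")
            if key in values:
                raise TypeError(f"add_animal got multiple values for argument '{key}'")
            values[key] = value
        missing = [field for field in fields if field not in values]
        if missing:
            raise TypeError(f"add_animal missing required arguments: {missing}")
        self.animals.append(tuple(values[field] for field in fields))

shelter = AnimalShelter()
shelter.add_animal("cat", "Calico", name="Princess Carolyn")
shelter.add_animal(species="dog", breed="Golden Retriever", name="Mr. Peanut Butter")
print(shelter.animals)
# [('cat', 'Calico', 'Princess Carolyn'), ('dog', 'Golden Retriever', 'Mr. Peanut Butter')]
```

Most of the method body re-implements checks that Python performs automatically for an explicit signature such as add_animal(self, species, breed, name).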

In general, it is better practice to use explicit arguments in function prototypes instead of *args or **kwargs.
