Creating a Custom List in Python

Author: Kay Herklotz
DevOps-savvy developer passionate about software architecture, with a knack for streamlining development workflows, automating processes, and ensuring robust, scalable systems. Dedicated to optimizing collaboration between development and operations teams to achieve continuous delivery and innovation.

Overview #

In this blog post, I want to explore how to define custom lists in Python. I also show a sample application that illustrates why you would want to define your own list, and when you should rather stick to Python’s list object.

The code for this short tutorial can be found here.

Motivation #

First, why would you want to create a custom list type? The main answer is that it gives you more control over the list and enables you to expand its functionality. For a simple Python list it is possible to add any type of data to the list. Take the following code as an example:

data_list = [2, "Peter", {"working": "yes"}, True]

This code would run just fine in Python, but it would be a source of all kinds of problems. Having different types in one list makes it hard to loop through and process the data, since at every iteration you need to check whether the type is correct for the kind of processing you want to do. Conversely, if you want to make sure that all data entered into a list is of a certain type, you will always have to check that before inserting it into the list. A custom list can spare you the redundancy of always checking the type.

A second important reason for a custom list is to always have custom behaviour for the list readily available. Say you have a list storing signal data, i.e. a list full of floats. You may want to operate on the signal data, for example by filtering or normalizing it. Of course, you could write separate functions that perform these actions, but you would always need to make sure that these functions are available wherever the list is used. Your list can exist and be filled with data without having any of those functions. An elegant way to solve that problem is to have both the data structure and its methods in one place.

list vs. UserList #

In order to write your own custom list, you can either inherit directly from list or use UserList from the collections module. While this post focuses on UserList, I would like to list the considerations that need to be made when choosing which class to inherit from.

Inheriting Directly from list: #

When you inherit directly from list, your custom list is essentially a subclass of the built-in list type. This approach can be suitable if you don’t need to override or extend many methods of the built-in list and you want a more direct connection to the standard list behavior.

However, there are some considerations:

  • Dunder Methods: If you want to override or extend specific behaviour, you’ll need to override every relevant method, including dunder methods such as __getitem__, __setitem__ and __delitem__, since the built-in methods of list don’t necessarily call each other.
  • Compatibility: Inheriting directly from list might lead to unexpected behavior, because an override does not take effect everywhere you might expect. For example, overriding append does not change what extend or the list constructor do.
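
The following minimal sketch illustrates the two points above. LoggedList is a hypothetical class, not part of the tutorial’s code: it overrides append on a direct list subclass and shows that extend and += do not go through that override.

```python
class LoggedList(list):
    """Hypothetical list subclass that records every call to append."""

    def __init__(self, *args):
        super().__init__(*args)
        self.appended = []

    def append(self, item):
        self.appended.append(item)
        super().append(item)


numbers = LoggedList()
numbers.append(1)       # goes through our override
numbers.extend([2, 3])  # list.extend does not call our append
numbers += [4]          # neither does in-place addition

print(list(numbers))     # [1, 2, 3, 4]
print(numbers.appended)  # [1]
```

Only the direct call to append was recorded, so any validation placed in append alone would be skipped by the other methods.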

Inheriting from UserList: #

The UserList class, on the other hand, is specifically designed to make it easier to create custom list-like objects. It is a wrapper around the standard list type: the contents are kept in a regular list, which is accessible through the data attribute. This provides a clear interface for subclassing and extending.

Advantages of using UserList:

  • Clear Interface: UserList provides a clear and consistent interface for subclassing, making it easier to understand and work with custom lists.
  • Default Implementation: It comes with default implementations for many common list-related methods, reducing the amount of boilerplate code you need to write.
  • Consistency: By using UserList, you’re less likely to encounter unexpected behavior or bugs that might arise from inheriting directly from list.
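
One concrete difference can be sketched as follows. PlainSub and WrappedSub are hypothetical, empty subclasses used only for comparison. On current Python versions, slicing or adding a UserList subclass gives back an instance of the subclass, while a direct list subclass falls back to a plain list.

```python
from collections import UserList


class PlainSub(list):
    """Hypothetical subclass inheriting directly from list."""


class WrappedSub(UserList):
    """Hypothetical subclass inheriting from UserList."""


plain = PlainSub([1, 2, 3])
wrapped = WrappedSub([1, 2, 3])

# A direct list subclass loses its type on slicing and addition.
print(type(plain[:2]).__name__)     # list
print(type(plain + [4]).__name__)   # list

# UserList builds the result with self.__class__, so the type is kept.
print(type(wrapped[:2]).__name__)    # WrappedSub
print(type(wrapped + [4]).__name__)  # WrappedSub

# The underlying storage is an ordinary list.
print(wrapped.data)  # [1, 2, 3]
```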

Performance Considerations #

While custom lists offer great flexibility, it’s essential to consider performance implications. Depending on your use case, the overhead of maintaining a custom list might be negligible or significant. Profiling and benchmarking your code can help you make informed decisions about when to use custom lists.
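
As a rough sketch of such a benchmark, the snippet below uses the standard timeit module to compare filling a plain list with filling a type-checked UserList subclass. CheckedList, fill and the chosen sizes are assumptions made for illustration. The absolute numbers depend on your machine and Python version, so no expected output is shown.

```python
import timeit
from collections import UserList


class CheckedList(UserList):
    """Hypothetical UserList subclass with a type check in append."""

    def append(self, item):
        if not isinstance(item, int):
            raise TypeError("CheckedList can only append int objects")
        super().append(item)


def fill(container_type, n=10_000):
    """Append n integers to a fresh container of the given type."""
    container = container_type()
    for i in range(n):
        container.append(i)
    return container


# Run each variant 20 times and report the total time in seconds.
plain_seconds = timeit.timeit(lambda: fill(list), number=20)
checked_seconds = timeit.timeit(lambda: fill(CheckedList), number=20)

print(f"list:        {plain_seconds:.4f} s")
print(f"CheckedList: {checked_seconds:.4f} s")
```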

We will take advantage of the UserList in this post and show you how to implement a custom list and show some features you can add for a nice developer experience.

Implementation #

In my example, I want to create a custom list which only holds instances of the type Person:

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int
    is_employed: bool = False
    num_children: int = 0

We begin by writing the append method. It is called whenever we want to add a single instance to our list.

from collections import UserList
from typing import Self  # available from Python 3.11

class PersonList(UserList):
    def append(self: Self, item: Person) -> None:
        if not isinstance(item, Person):
            raise TypeError("PersonList can only append Person objects")
        return super().append(item)

What happens if we want to add two PersonLists together? This can either be done by calling the extend method or by adding two lists. Since we have already written a valid way to append a new item, we can build upon the append method. This is generally considered a best practice, since it avoids repeated code and is easier to test.

from collections.abc import Iterable

class PersonList(UserList):
    # ... (previous code)

    def extend(self: Self, other: Iterable) -> None:
        # Reuse append so that every item passes the type check
        for person in other:
            self.append(person)

    def __add__(self: Self, other: Iterable) -> Self:
        new = self.copy()
        new.extend(other)
        return new

As you can see, our extend method relies on append, and __add__ relies on extend. Hence, once we have a good append method implemented, the rest will follow. Now that we have created a custom list which enforces that only instances of type Person are added, how can we further benefit from such a list?
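
Before moving on, one caveat is worth sketching. The methods shown so far cover append, extend and +, but other ways of putting data into the list, such as the constructor, insert and item assignment, are not checked. The following StrictPersonList is a hypothetical variant, not part of the original code, that routes these paths through the same check. It is a sketch under these assumptions rather than a complete implementation; for example, += is still unchecked.

```python
from collections import UserList
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    is_employed: bool = False
    num_children: int = 0


class StrictPersonList(UserList):
    """Hypothetical variant that validates more insertion paths."""

    def __init__(self, initlist=None):
        super().__init__()
        if initlist is not None:
            self.extend(initlist)  # validate the initial data as well

    @staticmethod
    def _check(item):
        if not isinstance(item, Person):
            raise TypeError("StrictPersonList can only hold Person objects")

    def append(self, item):
        self._check(item)
        super().append(item)

    def extend(self, other):
        for item in other:
            self.append(item)

    def insert(self, index, item):
        self._check(item)
        super().insert(index, item)

    def __setitem__(self, index, item):
        if isinstance(index, slice):
            item = list(item)  # slice assignment takes several items
            for element in item:
                self._check(element)
        else:
            self._check(item)
        super().__setitem__(index, item)


people = StrictPersonList([Person("Alice", 25)])
try:
    StrictPersonList(["not a person"])
except TypeError as error:
    print(error)  # StrictPersonList can only hold Person objects
```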

Imagine now that you want to be able to quickly filter the people in your list. Maybe you don’t want people below a certain age in your list, or only people who are employed. We can implement filters in our PersonList class, which are applied to the data contained in the list. Here are a few examples:

class PersonList(UserList):
    # ... (previous code)

    def filter_employed(self: Self) -> Self:
        return self.__class__(x for x in self if x.is_employed)

    def filter_older_than(self: Self, age: int) -> Self:
        return self.__class__(x for x in self if x.age > age)

Note that the content of the original list is not altered, since each filter returns a new PersonList. You can chain several filters and receive a list containing only the entries that match all of them. Should you print out the original instance again, you will notice that all entries are still available. That way, you can save the filtered content in a new variable if required.

Further Enhancements #

Custom Iteration #

In Python, the process of iteration is facilitated by the __iter__ and __next__ methods. The __iter__ method is responsible for returning an iterator object, and the __next__ method is called to retrieve the next item in the iteration. When there are no more items to iterate over, the __next__ method should raise the StopIteration exception.

In the context of our PersonList example, let’s consider a scenario where we want to iterate over the list of persons based on a specific criterion, such as sorting them by age.

class PersonList(UserList):
    # ... (previous code)

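    def __iter__(self):
        # Sort the list by age before iterating.
        # Note: this reorders self.data permanently, and because the list acts
        # as its own iterator, only one iteration can be active at a time.
        self.index = 0
        self.data = sorted(self.data, key=lambda x: x.age)
        return self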

    def __next__(self):
        if self.index < len(self.data):
            result = self.data[self.index]
            self.index += 1
            return result
        else:
            raise StopIteration

Alternative Implementation #

Alternatively, __iter__ can return an iterator created with Python’s built-in iter function. This is more concise and readable, and it avoids keeping the iteration state on the list itself.

class PersonList(UserList):
    # ... (previous code)

    def __iter__(self):
        return iter(sorted(self.data, key=lambda x: x.age))

In this version, the __iter__ method simply returns an iterator over a sorted copy of the data, created with Python’s built-in iter function. The sorting happens each time an iteration starts and leaves self.data in its original order. This approach is often preferred for its simplicity and readability.
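
A generator-based variant is another option. The sketch below is an assumption on my part rather than code from the tutorial’s repository: it uses yield from, so that every call to __iter__ creates an independent generator. It redefines Person and PersonList only to be runnable on its own.

```python
from collections import UserList
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    is_employed: bool = False
    num_children: int = 0


class PersonList(UserList):
    def __iter__(self):
        # yield from turns __iter__ into a generator function, so each
        # loop gets its own iteration state and self.data is not modified.
        yield from sorted(self.data, key=lambda person: person.age)


people = PersonList([
    Person("Alice", 25, True, 1),
    Person("Bob", 30, True, 2),
    Person("Charlie", 22, False, 0),
])

print([person.name for person in people])       # ['Charlie', 'Alice', 'Bob']
print([person.name for person in people.data])  # ['Alice', 'Bob', 'Charlie']

# Nested loops over the same list work, since the iterators are independent.
pairs = [(a.name, b.name) for a in people for b in people]
print(len(pairs))  # 9
```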

Serialization and Deserialization #

For applications that involve data persistence or communication between systems, you might find it useful to implement serialization and deserialization methods for your custom list. This ensures that your data can be easily converted to a format suitable for storage or transmission. While there are many formats that can be serialised to and from, JSON is chosen here as an example.

import json

class PersonList(UserList):
    # ... (previous code)

    def to_json(self) -> str:
        return json.dumps([vars(person) for person in self])

    @classmethod
    def from_json(cls, json_data: str) -> Self:
        data = json.loads(json_data)
        return cls([Person(**person_data) for person_data in data])

Here, the to_json method converts the PersonList to a JSON-formatted string, and the from_json method creates a new PersonList instance from a JSON string.

Application of the custom list #

Now that we have laid the groundwork, let’s see how such a list can benefit us. Ideally, we now have an object that is intuitive to work with, which enhances the developer experience.

Data Integrity and Type Enforcement #

The primary advantage of using a PersonList over a regular Python list is the enforcement of data integrity. With the custom append method, we ensure that only instances of the Person class can be appended to the list. The check happens once, when an item is added, so code that processes the list no longer has to verify the type of every element.

# Example usage
people = PersonList()
person1 = Person("Alice", 25, True, 1)
people.append(person1)  # Valid
people.append("Invalid")  # Raises TypeError

Combining and Copying Lists #

The custom list provides methods for extending itself with another iterable (using the extend method) and for adding two lists together (using the + operator, which calls __add__). These operations build on the well-defined append method, promoting code reusability and maintainability.

# Example usage
people_a = PersonList([Person("Bob", 30, True, 2)])
people_b = PersonList([Person("Charlie", 22, False, 0)])
combined_people = people_a + people_b
# or
people_a.extend(people_b)

Applying Filters #

One powerful aspect of the custom list is the ability to apply filters directly to the data it contains. The PersonList class includes methods for filtering employed individuals (filter_employed) and individuals older than a specified age (filter_older_than). These methods create new instances of the PersonList, preserving the original data.

# Example usage
employed_people = people.filter_employed()
older_than_25 = people.filter_older_than(25)

Chaining Filters #

An additional advantage is the ability to chain filters seamlessly. This allows for complex filtering criteria to be applied, providing a convenient and expressive way to manipulate the data.

# Example chaining
filtered_people = people.filter_employed().filter_older_than(25)

Iterating over PersonList #

Now, when you iterate over a PersonList instance, the custom iterator ensures that the persons are presented in the desired order.

# Example usage
people = PersonList([
    Person("Alice", 25, True, 1),
    Person("Bob", 30, True, 2),
    Person("Charlie", 22, False, 0)
])

# Iterating over the custom-sorted PersonList
for person in people:
    print(person.name)

This results in age-sorted iteration, yielding Charlie, Alice and Bob.

To and Back From JSON #

Serialising the data to JSON is now as simple as:

# Example usage
people = PersonList([
    Person("Alice", 25, True, 1),
    Person("Bob", 30, True, 2),
    Person("Charlie", 22, False, 0)
])

json_string = people.to_json()
print(json_string)

And we can create a new PersonList from JSON data:

# Note: JSON booleans are written in lowercase (true/false)
json_data = '[{"name": "David", "age": 28, "is_employed": true, "num_children": 1}]'
new_people = PersonList.from_json(json_data)

Keep in mind, though, that data from untrusted sources should be validated when it is deserialized. The json module only parses data and does not execute code from the input, but from_json passes every parsed key straight to Person, so unexpected keys or wrongly typed values can still cause errors or invalid entries.
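
One way to approach such validation is sketched below. parse_people, EXPECTED_TYPES and REQUIRED_KEYS are hypothetical names introduced for illustration. The helper checks the overall shape, the keys and the value types before any Person is created, and returns a plain list that could then be passed to PersonList.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    is_employed: bool = False
    num_children: int = 0


# Hypothetical schema: allowed keys with their exact JSON-decoded types.
EXPECTED_TYPES = {"name": str, "age": int, "is_employed": bool, "num_children": int}
REQUIRED_KEYS = {"name", "age"}


def parse_people(json_data: str) -> list[Person]:
    """Parse a JSON array of person objects, rejecting unexpected input."""
    raw = json.loads(json_data)
    if not isinstance(raw, list):
        raise ValueError("expected a JSON array of person objects")

    people = []
    for entry in raw:
        if not isinstance(entry, dict):
            raise ValueError("each entry must be a JSON object")
        unknown = set(entry) - set(EXPECTED_TYPES)
        missing = REQUIRED_KEYS - set(entry)
        if unknown or missing:
            raise ValueError(f"unknown keys {unknown}, missing keys {missing}")
        for key, value in entry.items():
            # Exact type comparison, so that a boolean is not accepted as an int.
            if type(value) is not EXPECTED_TYPES[key]:
                raise ValueError(f"wrong type for {key!r}")
        people.append(Person(**entry))
    return people


valid = '[{"name": "David", "age": 28, "is_employed": true, "num_children": 1}]'
print(parse_people(valid))
```

Raising ValueError for every kind of malformed input is a design choice of this sketch. A real application might prefer a dedicated exception type or a schema validation library.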

Conclusion #

In this post, we explored the creation of custom lists in Python using the UserList class from the collections module. Furthermore, we added custom features to the list to show the potential power of a custom list.

I hope this post has taught you a little about custom types and why the collections module is worth using, and has given you a few ideas for possible use cases.