Python's Named Tuples and Why They Rock

I’m a big fan of using the right tool for the job. When it comes to storing data, there are all sorts of flashy buzzwords thrown around (predictive analytics, cognitive computing, data mining), but at the root of it all, data typing remains… almost unchanged. You’ve got strings, integers, arrays… but Python also gives first-class treatment to a type that gets far less attention: the tuple.

Tuples are a powerful tool in a Python programmer’s toolbox; they’re generally quicker to create and more memory-efficient than lists, benefits which can add up in code that handles lots of small records. They also come in two varieties: named and unnamed. Named tuples are something that I find more and more opportunities to use.

This post goes into detail on named tuples and why they’ll help you cut down your code. Soon, you’ll be seeing how you can get a more readable project by utilising this often-overlooked Python gem.

Tuples vs. Lists: What’s the Diff?

Ask someone the difference between a tuple and a list and the most common response is that a tuple is immutable and a list is mutable; in other words, the contents of a list can be changed easily, while the contents of a tuple are locked in.
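
As a quick illustration of that difference, here’s a minimal sketch. The variable names (shopping, point) are made up for the example and don’t appear anywhere else in this post.

```python
# A list can be changed in place...
shopping = ["eggs", "milk"]
shopping[0] = "bread"
shopping.append("tea")

# ...whereas a tuple rejects item assignment.
point = (3, 4)
try:
    point[0] = 10
    tuple_was_modified = True
except TypeError:
    tuple_was_modified = False

print(shopping)
print(point, tuple_was_modified)
```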

There’s more to it, though. Even though each one’s elements are accessed using the same syntax, they have different purposes.

A list contains an ordered collection of “things”. In a homogeneous list all the elements will be of the same type, but of course lists can contain different types.

Tuples are designed to store a collection of related bits of data, which, all together, represent one “thing”. They’re kind of like a lightweight “class”, but without any associated methods.

To put it more simply, a tuple is meant to represent exactly one “thing”, whereas a list can represent one or many things.

A simple example that I’ll use throughout this post is some kind of result that is to be returned from a function. Each result has a success flag (a bool) and a message (a str).

If I was reading some code and I saw this list:

result = [False, "No space left on device."]

My first thought would be that maybe this was a typo. Perhaps it should be results? And for some reason the first result is a bool and the second is a str?

To someone other than the author, the use of a list can introduce ambiguity. The above list could be referring to one aggregate result from one “thing”, or it could be referring to multiple results from multiple “things”. How are you supposed to know which one it is, without reading through even more code?

This code is more intuitively comprehensible:

result = (False, "No space left on device.")

Since the result is returned as a tuple, I can immediately infer that the result is made up of a bool and a str and (probably) quite easily guess the purpose of each item.

Later on, though, when I want to access the data inside the tuple, errors can creep in.

print("Message is " + result[0])

Whoops, that was meant to be result[1], wasn’t it? You’ve gotta admit, result[0] and result[1] aren’t exactly informative as to what’s supposed to be in which field.

We can make this all simpler with the use of named tuples.

Why not a class?

Since most Python developers are familiar with classes we can take a peek at how you might implement a Result class in the “normal” way.

class Result:
    def __init__(self, success, message):
        self.success = success
        self.message = message

This is how you’d use the class to store your result:

result = Result(False, "No space left on device.")

Accessing the attributes is a bit more clear as well:

print("Message is " + result.message)

But writing the same class boilerplate all the time can be annoying. Each __init__ method is so similar, yet subtly different – can we do better?

Enter named tuples

Named tuples do some magic to automatically generate classes, kind of like what you see above. But, they come with a couple extra nice features.

A code snippet is worth 1,000 words, so let’s see how we make use of them. First, we need to import the namedtuple factory function from the collections module.

from collections import namedtuple

Then we define our named tuple class with a name and its fields. The fields can be defined as a space separated string:

Result = namedtuple("Result", "success message")

or, as a list of strings:

Result = namedtuple("Result", ["success", "message"])

Both of these behave equivalently, but the second format saves Python from having to split the string into field names. As such, it’s marginally faster; for this example, we’re looking at about 0.15 seconds over 100,000 iterations (or 1.5 microseconds per iteration). My guess is that this time difference might be larger if your named tuple has more fields, but it’s unlikely to make much difference to your code’s performance overall, so use whichever syntax you think looks nicer.
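
If you’d like to measure this on your own machine, something like the sketch below should do it. ITERATIONS is a made-up name and is deliberately small so the script finishes quickly; your timings will depend on your Python version and hardware, so don’t expect to match my numbers exactly.

```python
import timeit
from collections import namedtuple

ITERATIONS = 1_000  # hypothetical value; raise it for a steadier measurement

# Time how long it takes to define the class from a space separated string...
string_time = timeit.timeit(
    lambda: namedtuple("Result", "success message"), number=ITERATIONS
)
# ...and from a list of strings.
list_time = timeit.timeit(
    lambda: namedtuple("Result", ["success", "message"]), number=ITERATIONS
)

print(f"string fields: {string_time:.4f}s")
print(f"list fields:   {list_time:.4f}s")

# Either way, the generated classes end up with identical fields.
FromString = namedtuple("Result", "success message")
FromList = namedtuple("Result", ["success", "message"])
print(FromString._fields == FromList._fields)
```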

Once you’ve defined your named tuple Result “class”, instantiate it like a normal class.

result = Result(True, "Completed successfully.")

The benefit of this is that it gives you options for ways you can access the attributes. You can still access them by their index:

>>> result[0]
True
>>> result[1]
'Completed successfully.'

But now, you can also access them by name:

>>> result.success
True
>>> result.message
'Completed successfully.'

You can also unpack them, like a normal tuple.

>>> s, m = result
>>> s
True
>>> m
'Completed successfully.'

And like a normal tuple, it’s immutable.

>>> result[0] = False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Result' object does not support item assignment
>>> result.message = "New Message"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute

You can’t even set attributes that weren’t already in the definition.

>>> result.foo = "abc123"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Result' object has no attribute 'foo'

So if you want to treat them like an ordinary object and be lackadaisical about adding arbitrary attributes, you’re out of luck.
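
If you do need a changed copy, named tuples provide a _replace() method that builds a new instance rather than modifying the existing one. A small sketch, reusing the Result definition from earlier:

```python
from collections import namedtuple

Result = namedtuple("Result", ["success", "message"])

original = Result(True, "Completed successfully.")

# _replace returns a brand new Result; `original` is left untouched.
updated = original._replace(message="New Message")

print(original)
print(updated)
```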

This is also part of what makes them faster and more memory-efficient; since they’re immutable and won’t get any longer or shorter, they’ll never require more or less memory space than they’re assigned upon creation. And since (in CPython, at least) a tuple keeps its items in the same block of memory as the object itself, unlike a list, which holds its items in a separately allocated array, indexing is a little faster too.
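
You can get a rough feel for the memory side of this with sys.getsizeof. This is only a sketch: the exact byte counts vary between Python versions and platforms, and getsizeof reports the container alone, not the objects inside it.

```python
import sys
from collections import namedtuple

Result = namedtuple("Result", ["success", "message"])
message = "No space left on device."

as_list = [False, message]
as_tuple = (False, message)
as_named = Result(False, message)

for label, obj in [("list", as_list), ("tuple", as_tuple), ("named tuple", as_named)]:
    # Size of the container only; the bool and str are shared by all three.
    print(f"{label}: {sys.getsizeof(obj)} bytes")
```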

Named tuples can also be converted to dicts with the built in _asdict() method.

>>> result._asdict()
{'success': True, 'message': 'Completed successfully.'}

(Don’t worry about the leading _ in the method name. It’s not intended to be a private method; the _ is just there to prevent conflicts with user defined attributes.)

Can named tuples get any better?

By now I hope you’re seeing the benefits of named tuples, but they have one more trick up their sleeve: a variant that supports type annotations.

These are implemented slightly differently. To use typed named tuples, we import NamedTuple from the typing module instead of the collections module (which means you need Python 3.6 or later). Then our named tuple should be defined as a class that inherits from NamedTuple. The fields are defined with type annotations as we would with a normal class.

from typing import NamedTuple

class Result(NamedTuple):
    success: bool
    message: str

result = Result(True, "Completed successfully.")

It may require a little bit more laborious keyboard work, but you get type checking from a static type checker (like mypy), and it’s still less work than having to write out the whole class with the __init__ method each time.
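
One thing worth keeping in mind is that the annotations are only checked by external tools; Python doesn’t enforce them when your code runs. The sketch below uses a deliberately wrong call (the variable name suspicious is just for illustration) that I’d expect a checker like mypy to complain about, yet it still runs.

```python
from typing import NamedTuple


class Result(NamedTuple):
    success: bool
    message: str


# A static type checker should flag this call, because the arguments
# don't match the annotations on the fields...
suspicious = Result("yes", 42)

# ...but annotations aren't enforced at runtime, so the instance is
# created regardless.
print(suspicious)
```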

You’ll find that setting defaults on fields becomes a bit easier too. Untyped named tuples also support default values through the defaults argument added in Python 3.7 (see the documentation), although it’s not quite as intuitive as the normal Python way.
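
For comparison, this is roughly what defaults look like on an untyped named tuple. It assumes Python 3.7 or later, where namedtuple accepts a defaults argument; the names bare and full are just for the example.

```python
from collections import namedtuple

# Defaults are applied to the rightmost fields, so with a single
# default only `message` gets one.
Result = namedtuple("Result", ["success", "message"], defaults=[None])

bare = Result(True)
full = Result(False, "No space left on device.")

print(bare)
print(full)
```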

And since typed named tuples are written with ordinary class syntax, you can add methods right in the class body. (With collections.namedtuple you’d have to subclass the generated class to do the same.)

Here’s an example of a typed named tuple with a default attribute value and a method:

from typing import NamedTuple, Optional

class Result(NamedTuple):
    success: bool
    message: Optional[str] = None

    def get_error_message(self) -> Optional[str]:
        # return the message if we weren't successful,
        # since that must be an error message
        return None if self.success else self.message

And here’s our snippet in action:

>>> result = Result(True, "Success")
>>> result.message
'Success'
>>> result.get_error_message()
>>> error = Result(False, "Something bad happened")
>>> error.get_error_message()
'Something bad happened'

Like a normal class, you can use it to annotate types:

def get_the_result() -> Result:
    return Result(True, "This is good!")

You can annotate functions with non-typed named tuples as well, but if you’re using the type system, you may as well go all in and use the typed version of named tuples.

So you get built-in type safety, shorter and more legible code, nice fast immutable objects, and IDE autocompletions. Typed named tuples are a WIN-WIN-WIN!

Some concrete examples

Now that I’ve developed a “named tuple mindset”, I like to use them anywhere a function needs to return more than a simple value, or where a function takes a bunch of related values as arguments.

Here are a few examples from some of my various projects. This first one is a named tuple representing an HTTP Request that can be “executed”.

import typing

# HttpMethod is defined elsewhere in the project (not shown)
class HttpRequest(typing.NamedTuple):
    method: HttpMethod
    path: str
    parameters: typing.Optional[dict] = None
    data: typing.Any = None
    data_is_encoded: bool = False

This saves me having to pass method, path, and all other parameters down through a chain of functions.
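
To show what I mean by a chain of functions, here’s a stripped-down sketch. The HttpMethod enum and the describe and execute functions are hypothetical stand-ins written for this post, not the real project code, and nothing is actually sent over the network.

```python
import enum
import typing


class HttpMethod(enum.Enum):  # hypothetical stand-in for the real type
    GET = "GET"
    POST = "POST"


class HttpRequest(typing.NamedTuple):
    method: HttpMethod
    path: str
    parameters: typing.Optional[dict] = None
    data: typing.Any = None
    data_is_encoded: bool = False


def describe(request: HttpRequest) -> str:
    # Innermost helper: reads whichever fields it needs, by name.
    return f"{request.method.value} {request.path}"


def execute(request: HttpRequest) -> str:
    # Hypothetical "execute" step; a real one would send the request.
    return "Would send: " + describe(request)


summary = execute(HttpRequest(HttpMethod.GET, "/status"))
print(summary)
```

Each function takes a single request argument, so adding a new field later doesn’t mean changing every signature along the way.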

Here’s another named tuple that represents a DNS entry.

class ZoneRecord(typing.NamedTuple):
    record_type: str
    content: str

I could hypothetically implement this with a base Record class which ARecord and CnameRecord, etc, inherit from… but honestly, that sounds like a lot more code. Plus I’d need a field on those classes to store the record type anyway – either that or work it out from reflection or something. Named tuples in this vein have saved me a lot of typing.
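
As a rough idea of how that single record_type field gets used, here’s a small sketch. The records and addresses are invented for the example.

```python
import typing


class ZoneRecord(typing.NamedTuple):
    record_type: str
    content: str


# Made-up records, purely for illustration.
records = [
    ZoneRecord("A", "192.0.2.10"),
    ZoneRecord("CNAME", "www.example.com."),
    ZoneRecord("A", "192.0.2.11"),
]

# No class hierarchy needed: filter on the field instead.
a_records = [r.content for r in records if r.record_type == "A"]
print(a_records)
```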

This last one is fairly basic too; it represents a log message that has a level (like INFO, WARNING, or ERROR, etc) and a string with the message itself.

class Message(typing.NamedTuple):
    level: int
    message: str

This is a simple way of passing around messages that might need to be logged or displayed — for example, as a Django message.
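
Here’s a minimal sketch of the logging case, assuming the level field holds one of the standard logging module’s integer levels. The log_all function, the pending list and the "example" logger name are made up for the illustration.

```python
import logging
import typing


class Message(typing.NamedTuple):
    level: int
    message: str


def log_all(messages: typing.Iterable[Message], logger: logging.Logger) -> None:
    # Each Message carries its own level, so one call handles them all.
    for msg in messages:
        logger.log(msg.level, msg.message)


pending = [
    Message(logging.INFO, "Backup started."),
    Message(logging.ERROR, "No space left on device."),
]

logging.basicConfig(level=logging.INFO)
log_all(pending, logging.getLogger("example"))
```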

Conclusion

As I said at the start of this post, I’m a big fan of named tuples, both in my own work and others’. They’re so handy, they cut down on code, and they ultimately make your project more understandable — again, both to yourself and others. Like enums, I don’t see them getting enough love in Python projects.

I hope this post inspires you to start using (or keep using) named tuples. If you’ve never heard of them before now, I urge you to give them a try and see what they can do for you. Once you start using them, you’ll wonder if you’ll ever have to define a class again!

About the author

Ben Shaw (B. Eng) is the Director of Tera Shift Ltd. He has over 15 years’ experience in Software Engineering, across a range of industries. He has consulted for companies ranging in size from startups to major enterprises, including some of New Zealand’s largest household names.

Email ben@terashift.co.nz