my personal blog

Some fun with Python Enum

September 8, 2024

Most of my Python experience is with Python 2.X, so I’ve been taking some time recently to catch up on the latest features in Python 3.X.

Enum is a class, introduced in Python 3.4, that groups a set of related named values with some type safety. It’s got some nifty features I’d like to explore.

Why even use Enum?

You could just use a str or an int to represent some concept, especially now that Literal Types give you static type checking to help catch type errors.

Here’s an example from a discussion on why Literal Types are helpful:

from typing import IO, Any, Literal, overload

@overload
def open(name: str, mode: Literal['r']) -> IO[str]: ...

@overload
def open(name: str, mode: Literal['rb']) -> IO[bytes]: ...

@overload
def open(name: str, mode: str) -> IO[Any]: ...

The mode parameter can be set to one of three options: "r", "rb", or any other str, and in each case the type checker knows exactly what the corresponding return type will be (i.e. IO[str], IO[bytes] or IO[Any]).
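To see the same idea end to end, here is a small runnable sketch of my own. The function load and its "text" and "binary" modes are made up for illustration; they are not part of any library.

```python
from typing import Literal, Union, overload


@overload
def load(payload: bytes, mode: Literal["text"]) -> str: ...


@overload
def load(payload: bytes, mode: Literal["binary"]) -> bytes: ...


def load(payload: bytes, mode: str) -> Union[str, bytes]:
    # Hypothetical helper: the overloads above tell the type checker
    # which return type goes with which literal mode.
    if mode == "text":
        return payload.decode("utf-8")
    return payload


text = load(b"hello", "text")    # a type checker infers str here
raw = load(b"hello", "binary")   # and bytes here
```

At runtime nothing stops a caller from passing some other string; the overloads only help the static checker.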

But how far can we go with just Literal Types?

Example: unix file permissions

Suppose we want to build a new function that works with unix file permissions, e.g. rw-r--r--. We can define a permissions parameter that covers all the options,

from typing import Literal

FilePermission = Literal["r", "w", "x"]
FilePermissionSet = set[FilePermission]

def handle_file(f: File, permissions: FilePermissionSet) -> None:
    ...

# handles file with read+write permission
handle_file(f, {"r", "w"})

Seems great, but what happens when we mix up user permissions with group permissions? The static type checker doesn’t know if the read r permission pertains to the user or the group.

p: FilePermissionSet = get_user_permission()
set_group_permission(p)  # Uh oh!

This is one of the reasons Enum was created. From PEP 435,

It is possible to simply define a sequence of values of some other basic type, such as int or str, to represent discrete arbitrary values. However, an enumeration ensures that such values are distinct from any others including, importantly, values within other enumerations…

Enums can only be compared to the same enum type

This is one of the key invariants and advantages of using Enum. Members of two different Enum classes never compare equal: UserPermission.READ == GroupPermission.READ returns False even if both members have the same underlying value, and the type checker will complain if a UserPermission is passed where a GroupPermission is expected.

We get both static type safety and runtime safety.

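Note: IntEnum and StrEnum break this invariant. Their members are also plain int or str values, so they compare equal to matching values from anywhere, including other enums. Be careful when using them.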
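A quick sketch of the difference. The class names below are made up for illustration (the post never defines UserPermission or GroupPermission in full), and every member gets the same value on purpose.

```python
from enum import Enum, IntEnum


class UserPermission(Enum):
    READ = 1


class GroupPermission(Enum):
    READ = 1


# Plain Enum members only equal members of their own class.
print(UserPermission.READ == GroupPermission.READ)  # False
print(UserPermission.READ == 1)                     # False


class UserLevel(IntEnum):
    READ = 1


class GroupLevel(IntEnum):
    READ = 1


# IntEnum members are ints, so the invariant no longer holds.
print(UserLevel.READ == GroupLevel.READ)  # True
print(UserLevel.READ == 1)                # True
```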

Using __new__ to enforce invariants

Something I see myself using is the ability to provide additional information along with an Enum value. Suppose we have some software to manage our cookie business. With __new__ we can ensure that every cookie type is marked as vegan or not.


from enum import Enum

class CookieType(Enum):
    CHOCOLATE_CHIP = ("chocolate chip", False)
    OATMEAL_RAISIN = ("oatmeal raisin", True)

    def __new__(cls, value: str, is_vegan: bool):
        # Reject an explicit None before building the member.
        if is_vegan is None:
            raise ValueError("need to know if cookie is vegan or not")

        member = object.__new__(cls)
        member._value_ = value  # the first tuple item becomes the member's value
        member.is_vegan = is_vegan
        return member

    @classmethod
    def get_vegan_cookies(cls) -> list[str]:
        return [cookie.value for cookie in cls if cookie.is_vegan]

# We can easily inform our vegan-conscious customers!
print(CookieType.get_vegan_cookies())

Yes, we could use a dict to store a mapping like,

cookie_is_vegan_by_type = {
    CookieType.CHOCOLATE_CHIP: False,
    CookieType.OATMEAL_RAISIN: True,
    ...
}

But then we lose the benefits of the runtime check. With __new__, a member defined without an is_vegan value fails as soon as the class is created, whereas nothing forces us to update the dict when we add a new cookie type.

It’s also nice to just be able to access the attribute easily on the Enum member itself: CookieType.CHOCOLATE_CHIP.is_vegan.
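Here is a minimal sketch of that runtime check in action. BrokenCookieType and incomplete_definition_fails are names I made up for the demo. The exact exception depends on the Python version (as far as I can tell it is a TypeError, which Python 3.11 wraps in a RuntimeError), so the sketch catches both.

```python
from enum import Enum


class CookieType(Enum):
    CHOCOLATE_CHIP = ("chocolate chip", False)
    OATMEAL_RAISIN = ("oatmeal raisin", True)

    def __new__(cls, value: str, is_vegan: bool):
        member = object.__new__(cls)
        member._value_ = value
        member.is_vegan = is_vegan
        return member


print(CookieType.CHOCOLATE_CHIP.is_vegan)  # False
print(CookieType("oatmeal raisin"))        # CookieType.OATMEAL_RAISIN


def incomplete_definition_fails() -> bool:
    """Return True if defining a member without is_vegan raises."""
    try:
        class BrokenCookieType(Enum):
            SUGAR = "sugar"  # oops, forgot the is_vegan flag

            def __new__(cls, value: str, is_vegan: bool):
                member = object.__new__(cls)
                member._value_ = value
                member.is_vegan = is_vegan
                return member
    except (TypeError, RuntimeError):
        return True
    return False


print(incomplete_definition_fails())  # True
```

The failure happens when the class body is evaluated, so a forgotten flag shows up at import time rather than deep inside some request handler.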

The power of enum.Flag

We talked about the advantage of using Enum in our unix file permission example, but how would we represent a permission like rw-r--r--? Would we need to define our Enum classes like this:

class UserUnixPermission(Enum):
    NONE = 0
    READ = 1
    WRITE = 2
    EXECUTE = 3
    READ_WRITE = 4
    READ_EXECUTE = 5
    WRITE_EXECUTE = 6
    READ_WRITE_EXECUTE = 7

class GroupUnixPermission(Enum):
    NONE = 0
    READ = 1
    WRITE = 2
    ...

...

That seems like a heavy cost for the benefit.

We’ll need to pass a unix file permission around as a tuple (user_permission, group_permission, world_permission). If we want a function that checks if the file is executable by anyone at all then we’ll have to check every value in the tuple and check if some form of _EXECUTE is set.

We can greatly simplify all of this with the Flag enum.

Flag is the same as Enum, but its members support the bitwise operators & (AND), | (OR), ^ (XOR), and ~ (INVERT); the results of those operations are (aliases of) members of the enumeration.

Using Flag our Enum looks like this,

from enum import Flag, auto

class FilePermission(Flag):
    USER_READ = auto()
    USER_WRITE = auto()
    USER_EXECUTE = auto()

    GROUP_READ = auto()
    GROUP_WRITE = auto()
    GROUP_EXECUTE = auto()

    WORLD_READ = auto()
    WORLD_WRITE = auto()
    WORLD_EXECUTE = auto()

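Now rw-r--r-- can be represented as FilePermission.USER_READ | FilePermission.USER_WRITE | FilePermission.GROUP_READ | FilePermission.WORLD_READ.

We can even set variables like ANY_READ = FilePermission.USER_READ | FilePermission.GROUP_READ | FilePermission.WORLD_READ and use them in functions,

def has_read_permission(p: FilePermission) -> bool:
    # A Flag member never equals the int 0, so test the truthiness of the intersection.
    return bool(p & ANY_READ)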

The advantage of a Flag over some sort of bit array is that the resulting code is more readable and less likely to introduce errors. Just like how Enum provides a little extra safety and readability compared to using native str or int types.
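Coming back to the "is this file executable by anyone at all?" question from earlier, here is a sketch of how that check could look with Flag. ANY_EXECUTE, is_executable_by_anyone and the rw_r__r__ variable are my own names for the demo.

```python
from enum import Flag, auto


class FilePermission(Flag):
    USER_READ = auto()
    USER_WRITE = auto()
    USER_EXECUTE = auto()

    GROUP_READ = auto()
    GROUP_WRITE = auto()
    GROUP_EXECUTE = auto()

    WORLD_READ = auto()
    WORLD_WRITE = auto()
    WORLD_EXECUTE = auto()


# Hypothetical helper constant: every "execute" bit combined.
ANY_EXECUTE = (
    FilePermission.USER_EXECUTE
    | FilePermission.GROUP_EXECUTE
    | FilePermission.WORLD_EXECUTE
)


def is_executable_by_anyone(p: FilePermission) -> bool:
    # An empty intersection is falsy, any overlap is truthy.
    return bool(p & ANY_EXECUTE)


# rw-r--r--
rw_r__r__ = (
    FilePermission.USER_READ
    | FilePermission.USER_WRITE
    | FilePermission.GROUP_READ
    | FilePermission.WORLD_READ
)

print(is_executable_by_anyone(rw_r__r__))                                 # False
print(is_executable_by_anyone(rw_r__r__ | FilePermission.GROUP_EXECUTE))  # True
print(FilePermission.USER_WRITE in rw_r__r__)                             # True
```

No tuple of three values to unpack, and no hunting for every _EXECUTE variant.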

My one feature request

I wish mypy checked match statements for exhaustiveness out of the box.

def handle_file(f: File, permissions: FilePermission):
    match permissions:  # I'd like mypy to report this as non-exhaustive!
        case FilePermission.USER_READ:
            print("file handled.")

There does seem to be a way to support this, but it doesn’t feel super ergonomic.

Overall though I’m very happy with the introduction of Enum and its features in Python 3.4+. I have a tendency to misspell things and I can already feel the benefits of avoiding obvious string typos: "receive" != "recieve".