Some fun with Python Enum
September 8, 2024
Most of my Python experience is with Python 2.X, so I’ve been taking some time recently to catch up on the latest features in Python 3.X.
Enum is a class type that was introduced in Python 3.4 which can be used to group values with some type-safety. It’s got some nifty features I’d like to explore.
Why even use Enum?
You could just use a str or an int to represent some concept. Especially with the addition of Literal Types, one gets static type checking to help catch type errors.
Here’s an example from a discussion on why Literal Types are helpful:
@overload
def open(name: str, mode: Literal['r']) -> IO[str]: ...
@overload
def open(name: str, mode: Literal['rb']) -> IO[bytes]: ...
@overload
def open(name: str, mode: str) -> IO[Any]: ...
The mode parameter can be set to one of three options: "r", "rb", or any other str, and the type checker will know exactly what the corresponding return type will be (i.e. IO[str], IO[bytes], or IO[Any]).
But how far can we go with just Literal Types?
Example: unix file permissions
Suppose we want to build a new function that works with Unix file permissions, e.g. rw-r--r--. We can define a permissions parameter that covers all the options,
from typing import Literal

FilePermission = Literal["r", "w", "x"]
FilePermissionSet = set[FilePermission]

def handle_file(f: File, permissions: FilePermissionSet):
    pass

# handles file with read+write permission
handle_file(f, {"r", "w"})
Seems great, but what happens when we mix up user permissions with group permissions? The static type checker doesn’t know whether the read permission r pertains to the user or the group.
p: FilePermissionSet = get_user_permission()
set_group_permission(p) # Uh oh!
This is one of the reasons Enum was created. From PEP 435,
It is possible to simply define a sequence of values of some other basic type, such as int or str, to represent discrete arbitrary values. However, an enumeration ensures that such values are distinct from any others including, importantly, values within other enumerations…
Enum members only compare equal within the same Enum type
This is one of the key invariants and advantages of using Enum. Members of two different Enum classes never compare equal, even when their names and values match: UserPermission.READ == GroupPermission.READ will return False, and the type checker will complain if a GroupPermission is expected and a UserPermission is used.
We get both static type safety and runtime safety.
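Here is a minimal sketch of that behaviour. The member names and values of UserPermission and GroupPermission are my own assumptions for illustration, as is the set_group_permission signature.

```python
from enum import Enum


class UserPermission(Enum):  # assumed definition, for illustration only
    READ = "r"
    WRITE = "w"
    EXECUTE = "x"


class GroupPermission(Enum):  # assumed definition, for illustration only
    READ = "r"
    WRITE = "w"
    EXECUTE = "x"


def set_group_permission(permission: GroupPermission) -> str:
    # A type checker such as mypy should reject a UserPermission argument here.
    return f"group permission set to {permission.name}"


# Same name and same underlying value, but still different members.
same = UserPermission.READ == GroupPermission.READ
print(same)
print(set_group_permission(GroupPermission.READ))
```

The runtime comparison is False, and passing UserPermission.READ to set_group_permission is the kind of mistake a static type checker can report before the code ever runs.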
Note: IntEnum and StrEnum actually break this invariant, because their members are also int or str instances and compare equal to plain values, so be careful when using them.
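A small sketch of the difference, using IntEnum only (StrEnum needs a newer Python). All four class names are hypothetical and exist only for this comparison.

```python
from enum import Enum, IntEnum


class PlainUserPermission(Enum):
    READ = 1


class PlainGroupPermission(Enum):
    READ = 1


class IntUserPermission(IntEnum):
    READ = 1


class IntGroupPermission(IntEnum):
    READ = 1


# Plain Enum members of different classes never compare equal.
plain_equal = PlainUserPermission.READ == PlainGroupPermission.READ

# IntEnum members are ints, so equal values compare equal across classes,
# and they also compare equal to a bare integer.
int_equal = IntUserPermission.READ == IntGroupPermission.READ
equals_bare_int = IntUserPermission.READ == 1

print(plain_equal, int_equal, equals_bare_int)
```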
Using __new__ to enforce invariants
Something I see myself using is the ability to provide additional information along with an Enum value. Suppose we have some software to manage our cookie business. With __new__ we can ensure that every cookie type is marked as vegan or not.
from enum import Enum


class CookieType(Enum):
    CHOCOLATE_CHIP = ("chocolate chip", False)
    OATMEAL_RAISIN = ("oatmeal raisin", True)

    def __new__(cls, value: str, is_vegan: bool):
        member = object.__new__(cls)
        # The first tuple element becomes the member's value.
        member._value_ = value
        if is_vegan is None:
            raise ValueError("need to know if cookie is vegan or not")
        member.is_vegan = is_vegan
        return member

    @classmethod
    def get_vegan_cookies(cls) -> list[str]:
        return [cookie.value for cookie in cls if cookie.is_vegan]


# We can easily inform our vegan conscious customers!
print(CookieType.get_vegan_cookies())
Yes, we could use a dict to store a mapping like,
cookie_is_vegan_by_type = {
    CookieType.CHOCOLATE_CHIP: False,
    CookieType.OATMEAL_RAISIN: True,
    ...
}
But then we lose the benefits of the runtime check. We could accidentally forget to update the dict when we add a new cookie type.
It’s also nice to just be able to access the property easily on the Enum member itself: CookieType.CHOCOLATE_CHIP.is_vegan.
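To make the runtime check concrete, here is a sketch under a few assumptions. TaggedCookie is a hypothetical member-less base class I added so the same __new__ can be reused, and BrokenCookieType with its SNICKERDOODLE member is made up to show what happens when the vegan flag is forgotten.

```python
from enum import Enum


class TaggedCookie(Enum):
    """Hypothetical member-less base so the __new__ check can be reused."""

    def __new__(cls, value: str, is_vegan: bool):
        member = object.__new__(cls)
        member._value_ = value
        member.is_vegan = is_vegan
        return member


class CookieType(TaggedCookie):
    CHOCOLATE_CHIP = ("chocolate chip", False)
    OATMEAL_RAISIN = ("oatmeal raisin", True)


# Members are looked up by the first tuple element, which became the value.
cookie = CookieType("oatmeal raisin")
print(cookie, cookie.is_vegan)

# Forgetting the flag fails as soon as the class body is evaluated.
try:
    class BrokenCookieType(TaggedCookie):
        SNICKERDOODLE = "snickerdoodle"  # is_vegan is missing
    definition_failed = False
except (TypeError, RuntimeError):
    # Depending on the Python version, the TypeError may arrive wrapped
    # in a RuntimeError, so both are caught here.
    definition_failed = True

print(definition_failed)
```

The failure happens at import time rather than when some code path eventually reads the missing value, which is exactly what the separate dict cannot give us.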
The power of enum.Flag
We talked about the advantage of using Enum in our unix file permission example, but how would we represent a permission like rw-r--r--? Would we need to define our Enum classes like this:
class UserUnixPermission(Enum):
    NONE = 0
    READ = 1
    WRITE = 2
    EXECUTE = 3
    READ_WRITE = 4
    READ_EXECUTE = 5
    WRITE_EXECUTE = 6
    READ_WRITE_EXECUTE = 7

class GroupUnixPermission(Enum):
    NONE = 0
    READ = 1
    WRITE = 2
    ...
...
That seems like a heavy cost for the benefit.
We’ll need to pass a unix file permission around as a tuple (user_permission, group_permission, world_permission). If we want a function that checks whether the file is executable by anyone at all, then we’ll have to go through every value in the tuple and check if some form of _EXECUTE is set.
We can greatly simplify all of this with the Flag enum.
Flag is the same as Enum, but its members support the bitwise operators & (AND), | (OR), ^ (XOR), and ~ (INVERT); the results of those operations are (aliases of) members of the enumeration.
Using Flag our Enum looks like this,
from enum import Flag, auto


class FilePermission(Flag):
    USER_READ = auto()
    USER_WRITE = auto()
    USER_EXECUTE = auto()
    GROUP_READ = auto()
    GROUP_WRITE = auto()
    GROUP_EXECUTE = auto()
    WORLD_READ = auto()
    WORLD_WRITE = auto()
    WORLD_EXECUTE = auto()
Now rw-r--r-- can be represented as USER_READ | USER_WRITE | GROUP_READ | WORLD_READ.
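A quick sketch of what working with such a combined value looks like. The variable names (mode, read_only and so on) are mine, not part of any API.

```python
from enum import Flag, auto


class FilePermission(Flag):
    USER_READ = auto()
    USER_WRITE = auto()
    USER_EXECUTE = auto()
    GROUP_READ = auto()
    GROUP_WRITE = auto()
    GROUP_EXECUTE = auto()
    WORLD_READ = auto()
    WORLD_WRITE = auto()
    WORLD_EXECUTE = auto()


# rw-r--r--
mode = (
    FilePermission.USER_READ
    | FilePermission.USER_WRITE
    | FilePermission.GROUP_READ
    | FilePermission.WORLD_READ
)

# Membership tests read naturally with `in`.
can_user_write = FilePermission.USER_WRITE in mode
can_group_write = FilePermission.GROUP_WRITE in mode

# Dropping a permission: AND with the inverse of the flag to remove.
read_only = mode & ~FilePermission.USER_WRITE

print(can_user_write, can_group_write, read_only)
```

The whole permission is a single value, so there is no tuple to unpack and no combinatorial list of members to maintain.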
We can even define constants like ANY_READ = FilePermission.USER_READ | FilePermission.GROUP_READ | FilePermission.WORLD_READ and use them in functions,
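def has_read_permission(p: FilePermission) -> bool:
    # A Flag is not an int, so comparing with `!= 0` would always be True;
    # test the truthiness of the masked value instead.
    return bool(p & ANY_READ)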
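Here is a self-contained sketch of this kind of check, including the "executable by anyone" question from earlier. ANY_EXECUTE and is_executable_by_anyone are names I made up for the example. One detail worth knowing: a plain Flag is not an int, so the sketch tests the truthiness of the masked value rather than comparing it with 0.

```python
from enum import Flag, auto


class FilePermission(Flag):
    USER_READ = auto()
    USER_WRITE = auto()
    USER_EXECUTE = auto()
    GROUP_READ = auto()
    GROUP_WRITE = auto()
    GROUP_EXECUTE = auto()
    WORLD_READ = auto()
    WORLD_WRITE = auto()
    WORLD_EXECUTE = auto()


ANY_READ = (
    FilePermission.USER_READ | FilePermission.GROUP_READ | FilePermission.WORLD_READ
)
# Hypothetical constant, mirroring ANY_READ.
ANY_EXECUTE = (
    FilePermission.USER_EXECUTE
    | FilePermission.GROUP_EXECUTE
    | FilePermission.WORLD_EXECUTE
)


def has_read_permission(p: FilePermission) -> bool:
    # An empty Flag value is falsy, any set bit makes it truthy.
    return bool(p & ANY_READ)


def is_executable_by_anyone(p: FilePermission) -> bool:
    return bool(p & ANY_EXECUTE)


write_only = FilePermission.USER_WRITE

# Pitfall: the masked value is an empty Flag, yet it is still "not equal" to 0.
pitfall = (write_only & ANY_READ) != 0

print(has_read_permission(write_only), pitfall)
```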
The advantage of a Flag over some sort of bit array is that the resulting code is more readable and less likely to introduce errors, just like how Enum provides a little extra safety and readability compared to using native str or int types.
My one feature request
I wish mypy could check if match statements are exhaustive.
def handle_file(f: File, permissions: FilePermission):
    match permissions:  # ideally: mypy reports a non-exhaustive match here!
        case FilePermission.USER_READ:
            print("file handled.")
            return
There does seem to be a way to support this, but it doesn’t feel super ergonomic.
Overall though I’m very happy with the introduction of Enum and its features in Python 3.4+. I have a tendency to misspell things and I can already feel the benefits of avoiding obvious string typos: "receive" != "recieve".