We use arbitrary numbers in software all the time. In relational databases, we use 8- to 128-bit numbers as unique identifiers, much like SSNs. If the text in a table row changes, the ID value stays the same, and relational integrity is preserved. We also use numbers to indicate status. They are often used in C/C++ as return values from functions, where every possible return value indicates a distinctly different result. Integers are suitable for this purpose because they are far more efficient to return than, say, strings.
Since numbers used as status codes are pretty arbitrary, the numbers themselves have very little significance outside the programmer's mind. If a function call fails and logs an error code of 1002 to an event log, it means very little to anyone lacking source code. If you use a function written by someone else, and it returns 1002, it likewise means very little. Enter enumerations, wherein numeric values have a text name in code. An example:
enum MyEnum {
    Ok,
    NotSoGood,
    QuiteBadActually,
};

MyEnum someFunction()
{
    //...blah, blah, blah
    return Ok;
}
If you call into someFunction(), you can test the return value by name using the definition of MyEnum, even though its values map to 0, 1, and 2, respectively. The names give you a sense of the meaning attached to each return value.
Unfortunately, if you do the following in C++:
MyEnum result = someFunction();
cout << result;
... you get numeric output to the console. So when you try to log status codes defined in enumerations, you're still stuck with totally arbitrary information that's only meaningful to the original programmer. Even worse, since enumerated values can be defined without explicit assignment, you can't search source code and always find the definition of the error code.
.NET helps with this, somewhat. If you convert an enumerated value to a string, it extracts the name rather than the numeric value -- much better for event logging! Programmers are never satisfied, though, and they always want to come up with better ways of doing things. So if you get a return value of NotSoGood, what does that mean to a person in China viewing the event log? Frankly, it doesn't say much even in English. What we need is for status codes to be more than just a key/value pair -- they should also supply descriptive information in the user's native language.
Not all functions return status codes. I write a lot of void functions/methods that throw exceptions when they get upset. If we're going to make a better enum, we should make it throwable. In the Python 2 used here, an instance of any old-style class can be raised, so our code below can be used for exception tossing.
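To make the "throwable status" idea concrete, here is a minimal sketch. It is written for Python 3, where a raised object must derive from BaseException, rather than for the Python 2 of the listing further down. StatusError and save_record are hypothetical names made up for illustration; they are not part of enumdef.py.

```python
class StatusError(Exception):
    """Hypothetical exception that carries a named status code."""

    def __init__(self, name, value, description=""):
        super().__init__(name, value, description)
        self.name = name
        self.value = value
        self.description = description

    def __str__(self):
        # Same "name=value;description" shape that EnumValue prints
        text = "%s=%d" % (self.name, self.value)
        if self.description:
            text += ";" + self.description
        return text


def save_record(record):
    """A void-style function: returns nothing, raises when it gets upset."""
    if not record:
        raise StatusError("NotSoGood", 1, "record was empty")


try:
    save_record({})
except StatusError as err:
    message = str(err)
    print(message)
```

The caller gets the name, the number, and the description in one object, whether it catches the exception or just logs it.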
I can't think of a language where it's not possible to implement what I will show in Python. I'm creating 'status' objects called EnumValue, and each instance belongs to a containing enumeration. EnumValue encapsulates a name, a number, and a localized description. I've done this in C++, and it required code generation to be practical (in C++, I found I needed smart pointer classes to return class objects, and that is outside the scope of this discussion). I've seen similar exploits in Java. It's probably trivial in .NET.
There are a lot of techniques available for making enums in Python, and a lot of great blog entries showing various ways to define them. I chose to use dynamic definition of classes as my secret sauce, and by doing this, I got around having to implement a singleton class around each enumeration. I'll come back to this later.
Alas, my code:
from new import classobj
import sys
import inspect
import types

class EnumValue:
    def __init__(self, ownerClass, name, value):
        v = value + 0 #test that it's an int without having to know what to throw
        v = name + 'test' #same sort of thing here
        if not issubclass(ownerClass, Enumeration):
            #String exceptions are rejected by newer Python 2 releases
            raise TypeError('Owner must be an Enumeration')
        self._owner = ownerClass
        self._value = value
        self._name = name

    @property
    def Name(self):
        return self._name

    @property
    def Value(self):
        return self._value

    @property
    def Owner(self):
        return self._owner

    def __str__(self):
        s = self._owner.__name__ + "." + self._name + "=" + str(self._value)
        if len(self.Description) > 0:
            s = s + ";" + self.Description
        return s

    _descript = None #demand-loaded member

    @property
    def Description(self):
        if not self._descript:
            #Try to load a resource string...
            #... default to "" if we don't have a localized resource
            self._descript = ""
        return self._descript

class EnumScope:
    def __init__(self, name, initValue=0):
        fcur = frame = None #so the finally block never sees unbound names
        try: #Must clean up the frame references per inspect module docs
            fcur = inspect.currentframe()
            frame = fcur.f_back
            self._locals = frame.f_locals
        finally:
            if frame:
                del frame
            if fcur:
                del fcur
        self._name = name
        self._counter = initValue - 1

    @property
    def Locals(self):
        return self._locals

    @property
    def Name(self):
        return self._name

    @property
    def Next(self):
        self._counter = self._counter + 1
        return self._counter

#Just a tag class
class Enumeration:
    def __str__(self):
        s = "Enumeration " + self.__class__.__name__
        for key in self._keyOrder:
            if len(s) > 0:
                s = s + "\n"
            s = s + ' ' + str(getattr(self, key))
        return s

def __makeenum(scope, members, keyOrderList):
    cls = classobj("EnumClass_" + scope.Name, (Enumeration,), {})
    cls.__shared_state = {'_keyOrder':keyOrderList, '__class__': cls}
    #Change the defining module to that of the calling scope.
    cls.__module__ = scope.Locals['__name__']
    lastVal = 0
    for key in members:
        numval = members[key]
        if type(numval) != types.IntType:
            numval = lastVal + 1
        lastVal = numval
        cls.__shared_state[key] = EnumValue(cls, key, numval)
    inst = cls()
    inst.__dict__ = cls.__shared_state
    scope.Locals[scope.Name] = inst

def makeenum(scope, **members):
    mems = {}
    for key in members:
        mems[key] = members[key]
    __makeenum(scope, mems, members.keys() )

def makeenum2(scope, *members):
    mems = {}
    nextVal = scope.Next #was misspelled 'nextval', a NameError when the first member has no '='
    keyOrder = []
    for key in members:
        ar = key.split('=')
        if len(ar) > 1:
            key = ar[0]
            val = int(ar[1])
        else:
            val = nextVal
        mems[key] = val
        keyOrder.append(key)
        nextVal = val + 1
    __makeenum(scope, mems, keyOrder )
enumdef.py
import enumdef
#Make an enum named MyEnum...
scope = enumdef.EnumScope("MyEnum", 1)
enumdef.makeenum(scope, a=scope.Next, b=scope.Next, c=scope.Next)
#Did it really set the module scope to THIS module?
assert MyEnum.__module__ == globals()['__name__']
#Now use a different syntax to create an enum OtherEnum...
enumdef.makeenum2(enumdef.EnumScope("OtherEnum",0), "x=4", "y", "z")
print MyEnum.a
print MyEnum.b
print MyEnum.c
print OtherEnum.x
print OtherEnum.y
print OtherEnum.z
print 'Test out string coercion:'
print MyEnum #Note the key order is random.
print OtherEnum #Keys are well-ordered here.
PyEnumHarness.py -- Test harness application.
You can run the code simply by typing PyEnumHarness.py from a command prompt, or just create a shortcut to it. The PY extension should map to Python.exe in Windows. Here's the output of the program:
EnumClass_MyEnum.a=1
EnumClass_MyEnum.b=2
EnumClass_MyEnum.c=3
EnumClass_OtherEnum.x=4
EnumClass_OtherEnum.y=5
EnumClass_OtherEnum.z=6
Test out string coercion:
Enumeration EnumClass_MyEnum
EnumClass_MyEnum.a=1
EnumClass_MyEnum.c=3
EnumClass_MyEnum.b=2
Enumeration EnumClass_OtherEnum
EnumClass_OtherEnum.x=4
EnumClass_OtherEnum.y=5
EnumClass_OtherEnum.z=6
I created two ways to define an enum. First, MyEnum is defined using named arguments. Named args have the disadvantage that they do not preserve key order: they reach the function as a dict, which is an unordered hash table in the Python 2 used here. Because of this, I created another option, makeenum2, which I used in the definition of OtherEnum. It takes the members as positional strings, so key order is preserved. By key, I am referring to the name associated with each EnumValue.
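The ordering trick in makeenum2 comes down to walking a sequence of strings and appending to a list as you go. Here is a small sketch of just that step, written for Python 3; parse_members is a hypothetical helper name, not something enumdef.py defines.

```python
def parse_members(specs, start=0):
    """Turn specs such as "x=4" or "y" into an ordered list of (name, value)."""
    pairs = []
    next_value = start
    for spec in specs:
        # partition() returns an empty separator when there is no "="
        name, sep, number = spec.partition("=")
        value = int(number) if sep else next_value
        pairs.append((name, value))
        next_value = value + 1
    return pairs


ordered = parse_members(["x=4", "y", "z"])
print(ordered)
```

Because the result is a list, the members come back in exactly the order they were written, no matter how a dict would have arranged them.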
The EnumScope class was necessary for a couple of reasons. It caches away the context in which it was instantiated, and a variable will be created within that scope. In this case, it's the scope of the actual module. Notice that you never actually declare the variable MyEnum, but I use it a couple of lines afterward. This is a bit of a magic step that can baffle anyone looking at the code, but it illustrates how you can manipulate a variable scope in Python non-declaratively.
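The scope trick can be shown on its own in a few lines. This is a sketch for Python 3 with a hypothetical helper, define_in_caller. It writes to the caller's f_globals rather than the f_locals that EnumScope stores: at module level the two are the same dictionary, and writing to a function's f_locals is not a dependable way to create variables.

```python
import inspect


def define_in_caller(name, value):
    """Bind `name` to `value` in the module namespace of whoever called us."""
    frame = inspect.currentframe()
    caller = frame.f_back
    try:
        caller.f_globals[name] = value
    finally:
        # Drop frame references promptly, as the inspect docs advise
        del frame, caller


define_in_caller("Injected", 42)
print(Injected)  # never assigned in this file, yet the name exists
```

The same caveat applies as in the article: a reader scanning the file will not find where Injected is defined, so this is best kept to cases where the convenience is worth the surprise.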
I've left it as an exercise for the reader to implement the mathematical operations on EnumValue. Doing so would let you add, subtract, or perform bitwise operations on EnumValues as if they were integers.
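One possible shape for that exercise, sketched for Python 3. IntLikeValue is a hypothetical stand-in that holds only a name and a number; it is not the EnumValue class from enumdef.py, and having the operators return plain ints is just one design choice among several.

```python
class IntLikeValue:
    """Hypothetical stand-in for EnumValue holding only a name and a number."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __int__(self):
        return self.value

    def __add__(self, other):
        return self.value + int(other)

    __radd__ = __add__  # addition is commutative, so reuse it for int + value

    def __sub__(self, other):
        return self.value - int(other)

    def __or__(self, other):
        return self.value | int(other)

    __ror__ = __or__

    def __and__(self, other):
        return self.value & int(other)

    __rand__ = __and__


read = IntLikeValue("Read", 1)
write = IntLikeValue("Write", 2)
both = read | write          # plain int holding both bits
print(both, write + 1, write - read, both & write)
```

Returning plain ints keeps the sketch short, at the cost of losing the name after any arithmetic; returning a new named value would need a rule for what the combined name should be.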
One member worth noting is EnumValue.Description. This can be beefed up to perform a resource lookup for localized text. Remember, part of the use case was to be able to supply localized, descriptive information about the EnumValue.
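As a rough idea of what a beefed-up lookup might do, here is a Python 3 sketch. CATALOG and describe are hypothetical names, and the in-memory dictionary merely stands in for whatever resource mechanism an application really uses (a message catalog such as gettext, a resource file, a database table).

```python
# Hypothetical in-memory catalog: (enum name, member name) -> {language: text}
CATALOG = {
    ("MyEnum", "NotSoGood"): {
        "en": "The operation finished with warnings.",
        "de": "Der Vorgang wurde mit Warnungen abgeschlossen.",
    },
}


def describe(enum_name, member_name, language="en"):
    """Return localized text, falling back to English, then to ""."""
    texts = CATALOG.get((enum_name, member_name), {})
    return texts.get(language) or texts.get("en", "")


print(describe("MyEnum", "NotSoGood", "de"))
print(describe("MyEnum", "NotSoGood", "fr"))  # no French entry: English text
print(repr(describe("MyEnum", "Ok")))         # no entry at all: empty string
```

The final fallback to an empty string matches the default in EnumValue.Description, so __str__ would simply omit the description when nothing is found.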
I mentioned the singleton pattern earlier. This is a point of much discussion in the Pythonic blogosphere. I actually danced around having to implement a singleton by basing my enum definitions on function calls. An instance of a dynamic class is returned, but the rest of your code will be blissfully unaware of the existence of that class. Your code will be happy to just use the Enumeration instance variable as if it were a singleton.
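The core of that dodge fits in a few lines. The sketch below is for Python 3, where the three-argument form of the built-in type() plays the role that new.classobj plays in enumdef.py. make_enum_instance is a hypothetical name, and the members are stored as plain class attributes rather than through the shared-state dictionary used above.

```python
def make_enum_instance(name, **members):
    """Build a one-off class with type() and return its only instance."""
    cls = type("EnumClass_" + name, (object,), dict(members))
    return cls()


Color = make_enum_instance("Color", red=1, green=2)
print(type(Color).__name__)
print(Color.red, Color.green)
```

Since the class object is never handed to the caller, nobody is in a position to instantiate it a second time without going out of their way, which is what lets the instance behave like a singleton without any singleton machinery.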
Cheers,
Chris
