Enumerated types that encapsulate no behaviour at all are sometimes, very rarely, useful. Usually, they are simply not taught how to behave and are therefore lazy and anti-social.
Over the past decade, I have grown to hate enumerated types1. ‘What have the poor little categoricals done to deserve your scorn?’ you might ask. Read on and I’ll tell you.
Actually, it’s more what they do not do that upsets me. More often than not, I find them lazy, expecting everybody else to satisfy their needs. In addition, wherever they go, they leave a cyclomatic detritus that encourages infestations of bugs. In short, they are usually not taught how to behave. This suggests that it’s actually the fault of their parents, so my dislike should probably be redirected. The good news is that their anti-social behaviour can be reformed. All that is required is some love and attention, and probably the banning of their progenitors from polite programming society.
The problem comes in two forms, one of which is totally degenerate and the other slightly less so.
The Totally Degenerate Unconstrained Enumeration
This form of the problem occurs when a programmer realises that instances of a class need to be configured with varied behaviour (either dynamic or static) in a way that does not suggest subclassing the class in question. They look in their arsenal of antiquated programming weaponry and decide that the blunderbuss of choice is either an int or a string. Being professionals though, they define a bunch of constant ints or strings to represent the enumeration, probably write copious comments and then, believing their societal duties to be discharged, proceed to allow their progeny to carpet bomb the code.
There are three problems with this approach:
- Every piece of API that accepts or returns one of those enumerated values simply declares the type as an int or string. This is a significant weakness in a system since the programmer has discarded type safety; the API can statically accept any int or string, not just the ones for which the constants are defined. The compiler will no longer help. Therefore, each use needs to be documented so that colleagues know what values are acceptable. What we have is an enumeration of values but no explicit type that circumscribes them.
- There will no doubt be conditional logic spread throughout the code that analyses variables for their particular current values in terms of the enumeration and then varies behaviour accordingly. This conditional logic will take the same form wherever it appears – a switch statement or cascaded if … else … with exactly the same set of conditions in each instance. This duplication is a cancer that will spread throughout the codebase with consequences predictable from the onset.
- When new values are added to the enumeration, the code must be searched to find out where new conditions and consequences must be coded. This will generally be achieved by searching for the constants. This is a pretty weak approach.
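The three problems above can be made concrete with a small sketch. Everything in it (the `ShippingRates` class, the constants and the prices) is invented for illustration and does not come from any real codebase.

```java
// Hypothetical example: none of these names or numbers come from the article.
final class ShippingRates {
    // The "enumeration": plain ints with no type that circumscribes them.
    static final int STANDARD = 0;
    static final int EXPRESS = 1;
    static final int OVERNIGHT = 2;

    // Problem 1: the parameter is just an int, so any int compiles.
    static int costInCents(int method, int weightKg) {
        switch (method) {
            case STANDARD:  return 500 + 50 * weightKg;
            case EXPRESS:   return 1000 + 80 * weightKg;
            case OVERNIGHT: return 2500 + 120 * weightKg;
            default:
                throw new IllegalArgumentException("Unknown shipping method: " + method);
        }
    }

    // Problem 2: the same set of conditions reappears wherever behaviour varies.
    static int deliveryDays(int method) {
        switch (method) {
            case STANDARD:  return 5;
            case EXPRESS:   return 2;
            case OVERNIGHT: return 1;
            default:
                throw new IllegalArgumentException("Unknown shipping method: " + method);
        }
    }
    // Problem 3: adding a fourth constant means hunting down every such switch.
}
```

A call such as `ShippingRates.costInCents(42, 1)` compiles without complaint and only fails at run time, which is exactly the loss of type safety described in the first bullet.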
The Slightly Less Degenerate Typesafe Enum
A step up from the use of ints and strings is the typesafe enum. I will not describe this idiom/pattern here since Google will return a large number of pages describing, and probably lamentably advocating, its use. Suffice it to say that this ‘solution’ solves the type safety problem described above but does not deal with the conditional logic issue. Type-safe enumerations are therefore well defined but still usually lazy and anti-social.
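For readers who have not met the idiom, here is a minimal sketch of one common hand-rolled form: a final class with a private constructor and a fixed set of instances. The names `ShippingKind` and `DeliveryEstimator` are hypothetical. Note that the compiler now rejects arbitrary values, yet the cascaded conditions are still there.

```java
// Hypothetical example of the hand-rolled typesafe enum idiom.
final class ShippingKind {
    static final ShippingKind STANDARD = new ShippingKind("STANDARD");
    static final ShippingKind EXPRESS = new ShippingKind("EXPRESS");
    static final ShippingKind OVERNIGHT = new ShippingKind("OVERNIGHT");

    private final String name;

    // Private constructor: no instances exist beyond the three above.
    private ShippingKind(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return name;
    }
}

final class DeliveryEstimator {
    // Type safe, but the value still expects its callers to do the work.
    static int deliveryDays(ShippingKind kind) {
        if (kind == ShippingKind.STANDARD) {
            return 5;
        } else if (kind == ShippingKind.EXPRESS) {
            return 2;
        } else if (kind == ShippingKind.OVERNIGHT) {
            return 1;
        }
        throw new IllegalArgumentException("Unhandled shipping kind: " + kind);
    }
}
```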
The solution to the problem is usually very straightforward: polymorphism. First, recognise that behaviour peculiar to each value in the enumeration belongs to that value. Second, conclude that the enumerated type would be better expressed as a class hierarchy2 consisting of an abstraction (class or interface) of the set of behaviours associated with the enumerated type and a set of implementations – one per enumerated value.
In so doing, we have removed the repeated conditional logic and replaced it with polymorphism.
Let me repeat the really important part of that last sentence: you will not have to write conditional logic associated with the enumerated type. Your code will not have as much conditional logic if you follow this procedure. Your code will be simpler if you do this. Can I say this in any other way?
We have also brought together the totality of behaviours associated with each value so that the reason for and meaning of the enumeration is more obvious. This often results in new insights which will further improve the system.
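A minimal sketch of the refactoring, assuming a hypothetical shipping example (the `ShippingMethod` interface, its three implementations, the `Order` class and all the numbers are invented for illustration): the abstraction declares the behaviours, each value owns its own version of them, and the client contains no conditional logic at all.

```java
// Hypothetical example: the abstraction over the enumerated values.
interface ShippingMethod {
    int costInCents(int weightKg);
    int deliveryDays();
}

// One implementation per enumerated value; each keeps its own behaviour.
final class Standard implements ShippingMethod {
    public int costInCents(int weightKg) { return 500 + 50 * weightKg; }
    public int deliveryDays() { return 5; }
}

final class Express implements ShippingMethod {
    public int costInCents(int weightKg) { return 1000 + 80 * weightKg; }
    public int deliveryDays() { return 2; }
}

final class Overnight implements ShippingMethod {
    public int costInCents(int weightKg) { return 2500 + 120 * weightKg; }
    public int deliveryDays() { return 1; }
}

// The client simply sends messages; there is no switch and no if/else.
final class Order {
    private final ShippingMethod shipping;
    private final int weightKg;

    Order(ShippingMethod shipping, int weightKg) {
        this.shipping = shipping;
        this.weightKg = weightKg;
    }

    String summary() {
        return shipping.costInCents(weightKg) + " cents, "
                + shipping.deliveryDays() + " day(s)";
    }
}
```

Adding a fourth shipping method in this sketch means adding one class; the compiler then insists that every behaviour is supplied, and `Order` does not change. In Java 5 and later the same shape can be had with an `enum` whose constants each implement an abstract method.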
Incidentally, when this procedure is followed, frequently a few of the Gang Of Four’s design patterns will emerge, centered on the new classes. Flyweight, State and Strategy are the most common in my experience.
Spaghetti, Ravioli and Other Pasta Based Metaphors
Those who have not embraced the core OO notion of message passing3, with its implied de-emphasis of state oriented metaphors, often prefer the ‘smash and grab’ approach to software development in which no abstraction barrier is worth preserving in the pursuit of a result. The argument is usually that it is easier to see the full logic and that all of that delegation and forwarding is just unnecessary spaghetti4.
The argument about the visibility of the logic is only important if it is believed that logic is only understandable when expressed at the bit-twiddling level. Clearly this is nonsense; logic is most comprehensible when it is expressed at a level of abstraction appropriate to the context. In order to achieve this it is important to maintain abstraction barriers, and therefore the increase in message sends becomes a necessary step. Furthermore, these message sends do not generally prevent understanding as long as the method in which they occur maintains a consistent level of abstraction.
It’s Just a Value. It Shouldn’t Know How to Do These Things
Another frequent misconception is that it is incorrect that these enumerated type categories should have the behaviours with which they are associated. In the real world, the argument goes, the value doesn’t do the behaviour, so neither should it be required to in the ‘domain model5’.
The problems with this reasoning are twofold. First, it confuses simulation with modeling. Second, it is the very essence of object-oriented analysis and design that agents that do work upon objects in the real world are removed in the model and the objects themselves assume the responsibility for doing that work. If this were not the case, then most systems would have one or a few classes that represent the external actors, in which all system behaviour is placed, and the rest of the system would consist of nothing more than data to be updated by those ‘god-classes’. Such a division of responsibility is generally considered an anti-pattern.
It’s Easier To Unit Test
I have been hearing this argument more and more in recent times. It appears to come up when the person saying it lacks sufficient reason for the point of view they are advocating. It is often equivalent to them saying ‘because my dad said so’.
However, let us take the point at face value. There are two possible aspects that might be the subject of the assertion. First, the conditional logic which has been replaced with polymorphism. Second, the behaviour that is applied contingent upon those conditions.
With regard to the first point, I trust the type system to route my message to the correct method, so I do not need to test this. Therefore, there is something in the original code that needs testing that simply does not exist in the reworked code. I would argue that not having something to test is infinitely easier than having something that needs testing.
With regard to the second point, since the contingent behaviour has now been moved to a number of methods in different subclasses, the testing of this can now proceed without setting up the conditions necessary to reach that behaviour. This, too, must therefore be considered easier.
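As a rough illustration of the second point, here is a sketch of such a test written in plain Java so that it needs no test framework. The `Tariff` interface, the `ExpressTariff` class and the prices are hypothetical. The check instantiates the one implementation it cares about and calls it directly; no flag or constant has to be set up to steer execution into the right branch.

```java
// Hypothetical example: one behaviour, one class, one direct test.
interface Tariff {
    int costInCents(int weightKg);
}

final class ExpressTariff implements Tariff {
    public int costInCents(int weightKg) {
        return 1000 + 80 * weightKg;
    }
}

final class ExpressTariffCheck {
    // No conditions to arrange: the type itself selects the behaviour.
    static void costIsBasePlusPerKilogramCharge() {
        Tariff express = new ExpressTariff();
        int actual = express.costInCents(2);
        if (actual != 1160) {
            throw new AssertionError("Expected 1160 but was " + actual);
        }
    }
}
```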
I Don’t See the Difference; Something is Still Having to Make the Decision
Write your system in machine code then. It would be no different to Java/C#/C++/Delphi/etc… Something is still having to produce the machine code.
The point is that the programmer does not have to explicitly write the condition. In a very real sense, the condition does not exist any more.
This kind of argument often points to a lack of modeling experience and expertise.
I started this article by saying that I have come to hate enumerated types. Having hopefully grabbed your attention, I will now moderate that statement a little.
Enumerated types that encapsulate no behaviour at all are sometimes, very rarely, useful. The only time I have ever come across a useful instance of this is at the fringes of systems where they communicate with other systems. I have frequently come to systems that have been built by developers from a non-OO background, been appalled at the amount of unnecessary, flaky conditional logic caused by the use of delinquent enumerations, and been able to make significant reductions in code size while improving comprehensibility and maintainability by giving the enumerations their appropriate responsibilities.
While I accept that the single-dispatch, message passing paradigm has its limits, I find the arguments against polymorphism and in favour of conditional logic deeply flawed, often stemming from a lack of fundamental comprehension of the paradigm.
Using polymorphic mechanisms, rather than the weak enumerated types described above, results in less code. Furthermore, the code that remains is more maintainable. Where’s the downside?
1 This article is generally concerned with the use of enumerated types in single-dispatch, object-oriented languages. An equivalent argument can be advanced for many other kinds of language, however.
2 There may be equivalent alternatives. For example, Java 5’s enum construct achieves the same end.
3 At least as far as single-dispatch OO languages go. The CLOS guys tend to think that this notion is lunacy. I have a great deal of sympathy with their position, but until Lisp or one of the other more powerful languages (Haskell, ML, etc.) becomes acceptable to my clients, I have to work with what I have.
4 ‘Ravioli’ would be a more apt description of an OO system.
5 A poorly understood term.