State of Flow: Domain Specific Language Nonsense::journal

Domain Specific Language Nonsense

Lance Walton - Sunday January 28, 2007

Much of the pop-literature about DSLs has a single concern, which is how to make code look more like natural English, and it is uninterested in those things that make languages more or less useful.

The last few years have seen an increase in interest in Domain-Specific Languages (DSLs) within the Java development community. This appears to have been ignited by the work of Charles Simonyi on Intentional Software, Martin Fowler’s article on DSLs, the development of JMock and a few other articles and initiatives that have made news.

I am in favour of raising the bar in mainstream development, but I find the approach to DSLs as practised by many of those currently enamoured of them somewhat lacking, for the very same reason that the mainstream development bar is low in the first place.

It is sad that, in commercial software development, many developers are burdened with three difficulties. First, they find it hard to sustain interest in a topic whose foundations, results and consequences require non-trivial effort to understand, no matter how germane it may be to their chosen profession and situation. Second, they believe that their ability to construe meaning from the vapour of an introductory paragraph verges on the paranormal. Third, their ability to critically review an idea is often impaired as the blood flows away from the part of the brain that deals with analysis to that responsible for creativity.

As a result, many of those who are happily inflicting ‘DSLs’ on their teams, or blogging about them and thus inflicting their dysfunctional notions on the world, do not know what DSLs are, or even, to get right back to basics, how they might evaluate a language, domain-specific or otherwise. They appear to have built up a plausible but wholly incorrect idea of the meaning of ‘domain-specific language’ based on nothing more than that which can be deduced from the three words ‘domain’, ‘specific’ and ‘language’, often bracing that meaning with more enthusiasm and ill-informed references to Literate Programming than they should be legally licensed for [1].

My main issue with much of the pop-literature about DSLs is that it has a single concern, which is how to make code more ‘readable’, by which is meant more immediately understandable. While comprehensibility is a laudable aspiration, the singular obsession of the would-be DSL writers appears to be that this should be achieved by making the code look more like natural English, using an idiom known as a ‘fluent interface’, and they are uninterested in those things that make languages more or less useful.

This is an extraordinarily narrow view of what makes a body of work understandable; a work can be understood if it is clearly expressed and the reader has the necessary background in the subject matter. The heroic attempt to make Java look like natural English tries to trivialise the second point and actually tends to undermine the first at every level other than the ‘top-most’, rather than support it.

In addition, it ignores the facts that English has evolved over millennia as a general purpose language, is frequently imprecise, often acquires precision only through redundancy, and often conveys any meaning at all only because of a vast shared context [2]. Imprecision, redundancy and the need for a massive shared context are hardly a sound basis for programming. They are, however, a swamp in which code of the most turgid kind can be found.

If you doubt my characterisation of English (or any natural language), look up almost any non-technical, general term in a dictionary and you will usually find a large number of possible definitions, some of which are logically contradictory. Take for example the word ‘general’, used as part of a technical term in the previous paragraph. Amongst its non-technical meanings, it can mean both ‘universal’ and ‘most’. The majority of English speakers would agree that one acceptable meaning of ‘most’ would be something like ‘the majority but not all’, the last clause of which contradicts ‘universal’ as a definition. Yet listeners don’t stare at me in confusion when I use the word ‘general’ in a phrase (they frequently do stare at me, but usually because they understand exactly what I mean). The phrase as a whole yields its meaning by some interactive process that the words engage in upon alighting on the mind of the reader or listener.

There is clearly a contradiction between the informal use of English as practised by natural speakers and the needs of a language that can be used for precisely specifying processes and structures.

In order to build useful domain-specific languages, we must answer at least two questions:

The first point seems obvious and should need no further discussion. Sadly, this is frequently not the case. Even so, I am not going to talk about it any more here. Instead I would like to focus on the second point.

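In the excellent book Structure and Interpretation of Computer Programs (commonly referred to as SICP), Abelson, Sussman and Sussman state [3] that three mechanisms are present in every powerful language [4]: primitive expressions, which represent the simplest entities the language is concerned with; means of combination, by which compound elements are built from simpler ones; and means of abstraction, by which compound elements can be named and manipulated as units.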

Furthermore, they admonish us to ‘pay particular attention to the means that the language provides for combining simple ideas to form more complex ideas’ [5].
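A minimal sketch of those three mechanisms (primitive expressions, means of combination, means of abstraction) in Java may help fix the idea. The trade-rule domain and every name in it (`TradeRules`, `symbolIs`, `quantityAtLeast`, `priceBelow`, `largeOrder`, `bargainBlock`) are assumptions invented for illustration; only `java.util.function.Predicate` and its `and`, `or` and `negate` methods come from the standard library.

```java
import java.util.function.Predicate;

// Hypothetical rule language for trades; every name below is invented for illustration.
final class TradeRules {

    static final class Trade {
        final String symbol;
        final int quantity;
        final double price;

        Trade(String symbol, int quantity, double price) {
            this.symbol = symbol;
            this.quantity = quantity;
            this.price = price;
        }
    }

    // 1. Primitive expressions: the simplest rules the language is concerned with.
    static Predicate<Trade> symbolIs(String symbol) {
        return t -> t.symbol.equals(symbol);
    }

    static Predicate<Trade> quantityAtLeast(int minimum) {
        return t -> t.quantity >= minimum;
    }

    static Predicate<Trade> priceBelow(double limit) {
        return t -> t.price < limit;
    }

    // 3. Means of abstraction: a rule is given a name and can then be
    //    used exactly as if it were a primitive.
    static Predicate<Trade> largeOrder() {
        return quantityAtLeast(1000);
    }

    static Predicate<Trade> bargainBlock(String symbol, double limit) {
        // 2. Means of combination: and/or/negate build compound rules
        //    from simpler ones, and the result is itself a rule.
        return symbolIs(symbol).and(largeOrder()).and(priceBelow(limit));
    }

    public static void main(String[] args) {
        Trade trade = new Trade("ACME", 2000, 20.0);
        // Named compounds combine further without any new supporting classes.
        Predicate<Trade> rule =
                bargainBlock("ACME", 25.0).or(symbolIs("INIT").and(largeOrder().negate()));
        System.out.println(rule.test(trade)); // prints "true"
    }
}
```

The point of the sketch is closure: anything built by combination is a rule in its own right, so it can be named and combined again without the language's author anticipating that particular combination.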

It is plain that this way of thinking allows us to understand a wide variety of recognised languages. For example:

If we evaluate the ‘fluent interface’ reduction of DSLs (i.e. the subject of this article) according to the criteria described in SICP, we find the reason for my distaste for them: in general they offer nothing more than a collection of primitive expressions, no abstractions above those available in the implementation language, and a single means of combination. That combination usually relies on auxiliary classes whose methods are meaningless outside of the context of the caller, and which must be implemented for each required combination.

The result of this is a thin shell of fluent DSL-ness, with an underlying mess of accidental complexity, implemented in code that is several times more voluminous than the simple, direct, natural OO implementation would have required. How much more degenerate could it be?
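To make the complaint concrete, here is a minimal sketch of the kind of fluent interface I have in mind, set beside the direct form. The order domain and every name in it (`FluentOrderSketch`, `Order`, `buy`, `sharesOf`, `atLimit`, `QuantityStep`, `SymbolStep`) are hypothetical, invented purely for illustration and not taken from any real library.

```java
// Hypothetical example; every class and method name here is invented for illustration.
final class FluentOrderSketch {

    // The simple, direct OO form: one small class and a constructor.
    static final class Order {
        final String symbol;
        final int quantity;
        final double limit;

        Order(String symbol, int quantity, double limit) {
            this.symbol = symbol;
            this.quantity = quantity;
            this.limit = limit;
        }
    }

    // Entry point of the 'sentence': buy(100).sharesOf("ACME").atLimit(25.0)
    static QuantityStep buy(int quantity) {
        return new QuantityStep(quantity);
    }

    // Auxiliary class: sharesOf() means nothing except immediately after buy().
    static final class QuantityStep {
        private final int quantity;

        QuantityStep(int quantity) {
            this.quantity = quantity;
        }

        SymbolStep sharesOf(String symbol) {
            return new SymbolStep(quantity, symbol);
        }
    }

    // Another auxiliary class: atLimit() means nothing except after sharesOf().
    static final class SymbolStep {
        private final int quantity;
        private final String symbol;

        SymbolStep(int quantity, String symbol) {
            this.quantity = quantity;
            this.symbol = symbol;
        }

        Order atLimit(double limit) {
            return new Order(symbol, quantity, limit);
        }
    }

    public static void main(String[] args) {
        Order fluent = buy(100).sharesOf("ACME").atLimit(25.0);
        Order direct = new Order("ACME", 100, 25.0);
        // Both routes build the same order; only the amount of scaffolding differs.
        System.out.println(fluent.symbol.equals(direct.symbol)
                && fluent.quantity == direct.quantity
                && fluent.limit == direct.limit); // prints "true"
    }
}
```

The chain admits exactly one ordering of its words, and a new kind of sentence (a sell order, say, or an order without a limit) would need further step classes written by hand. Nothing here can be combined or named by the client beyond what Java already allows.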

Note that the API presented by JMock, for example, does not fit into this category. It goes about as far as is possible, given Java’s constraints, to give clients power to produce combinations through the use of InvocationMatcher, Stub, etc.

In addition, JMock combines the concept of DSL with that of fluent interface quite beautifully, so that to the client, it truly does provide a more powerful mechanism than would exist if either was removed.

Finally, JMock is a highly-geared API; it took Nat Pryce and Steve Freeman some time to build something that tens of thousands can use without worrying about the internals [6]. This description cannot be applied to bespoke software development.

So, for those developers who find themselves enamoured of the notion of domain-specific languages, I beg you to consider what you are hoping to achieve other than a superficial sense of ‘easy readability’ for a non-technical audience, and to critically evaluate your language using appropriate criteria; criteria that would have been familiar to John Locke in 1690 [7]:

The acts of the mind, wherein it exerts its power over simple ideas, are chiefly these three: 1. Combining several simple ideas into one compound one, and thus all complex ideas are made. 2. The second is bringing two ideas, whether simple or complex, together, and setting them by one another so as to take a view of them at once, without uniting them into one, by which it gets all its ideas of relations. 3. The third is separating them from all other ideas that accompany them in their real existence: this is called abstraction, and thus all its general ideas are made.

John Locke, An Essay Concerning Human Understanding (1690)

[1] Donald Knuth’s ‘Literate Programming’ is a fine paper, often cited in this context, but not well read, it seems.

[2] This context is given not only by shared definition of terms but also by similarity of experience.

[3] Section 1.1, ‘The Elements of Programming’, of SICP.

[4] The remainder of this article depends on the congruency of meanings of the word ‘language’ in the context of SICP and in the term ‘Domain-Specific Language’. I assume this.

[5] Robert Floyd presents a broadly similar view in his classic paper ‘The Paradigms of Programming’.

[6] Actually, the internals look pretty good.

[7] This passage was quoted at the beginning of Chapter 1 in SICP, indicating, unsurprisingly, that the authors connect the power of languages with their ability to map onto mental processes involved in understanding.